Socialism

Socialism is both an economic system and an ideology (in the non-pejorative sense of that term). A socialist economy features social rather than private ownership of the means of production. It also typically organizes economic activity through planning rather than market forces, and gears production towards needs satisfaction rather than profit accumulation. Socialist ideology asserts the moral and economic superiority of an economy with these features, especially as compared with capitalism. More specifically, socialists typically argue that capitalism undermines democracy, facilitates exploitation, distributes opportunities and resources unfairly, and vitiates community, stunting self-realization and human development. Socialism, by democratizing, humanizing, and rationalizing economic relations, largely eliminates these problems.

Socialist ideology thus has both critical and constructive aspects. Critically, it provides an account of what’s wrong with capitalism; constructively, it provides a theory of how to transcend capitalism’s flaws, namely, by transcending capitalism itself, replacing capitalism’s central features (private property, markets, profits) with socialist alternatives (at a minimum social property, but typically planning and production for use as well).

How, precisely, socialist concepts like social ownership and planning should be realized in practice is a matter of dispute among socialists. One major split concerns the proper role of markets in a socialist economy. Some socialists argue that extensive reliance on markets is perfectly compatible with core socialist values. Others disagree, arguing that to be a socialist is (among other things) to reject the ‘anarchy of the market’ in favor of a planned economy. But what form of planning should socialists advocate? This is a second major area of dispute, with some socialists endorsing central planning and others proposing a radically decentralized, participatory alternative.

This article explores all of these themes. It starts with definitions, then presents normative arguments for preferring socialism to capitalism, and concludes by discussing three broad socialist institutional proposals: central planning, participatory planning, and market socialism.

Two limitations should be noted at the outset. First, the article focuses on moral and political-philosophical issues rather than purely economic ones, discussing the latter only briefly. Second, little is said here about socialism’s rich and complicated history. The article emphasizes the philosophical content of socialist ideas rather than their historical development or political instantiation.

1. Socialism and Capitalism: Basic Institutional Contrasts
2. Socialism vs. Communism in Marxist Thought
3. Why Socialism? Economic Considerations
4. Why Socialism? Democracy
a. Scope
b. Influence
5. Why Socialism? Exploitation
a. Exploitation as Forced, Unpaid Labor
b. Eliminating Exploitation
6. Why Socialism? Freedom and Human Development
a. Formal Freedom
b. Effective Freedom
7. Why Socialism? Community and Equality
a. Why Produce? Communal vs. Market Reciprocity
b. Justice, Inequality, Community
8. Institutional Models of Socialism for the 21st Century
9. References and Further Reading

1. Socialism and Capitalism: Basic Institutional Contrasts

Considered as an economic system, socialism is best understood in contrast with capitalism.

Capitalism designates an economic system with all of the following features:

1. The means of production are, for the most part, privately owned;
2. People own their labor power, and are legally free to sell it to (or withhold it from) others;
3. Production is generally oriented towards profit rather than use: firms produce not in the first instance to satisfy human needs, but rather to make money; and
4. Markets play a major role in allocating inputs to commodity production and determining the amount and direction of investment.

An economic system is socialist only if it rejects feature 1, private ownership of the means of production, in favor of public or social ownership. But must an economic system reject any of features 2-4 to count as socialist, or is rejection of private property sufficient as well as necessary? Here, socialists disagree. Some, often called “market socialists”, hold that socialism is compatible, in principle, with wage labor, profit-seeking firms, and extensive use of markets to organize and coordinate production and investment. Others, sometimes called “orthodox” or “classical” socialists, contend that an economic system with these features is scarcely distinguishable from capitalism; true socialism, on this view, requires not merely social ownership of the means of production but also planned production for use, as opposed to “anarchic”, market-driven production for profit.

This section explores the core socialist commitment to social ownership of the means of production. Other important aspects of socialism—for instance, its stance towards markets and planning—are discussed in later sections (especially section 8).

a. Ownership: Some Preliminaries

Consider a society’s instruments of production, its land, buildings, factories, tools, and machinery; consider also its raw materials, its oil and timber and minerals and so on. Together, these instruments and these materials comprise society’s means of production. To whom should these means of production belong: to society as a whole, or to private individuals or groups of individuals? This is the central question dividing capitalists and socialists, with capitalists advocating extensive rights of private ownership of the means of production and socialists advocating extensive social or public ownership of these means.

Notice that the capitalist/socialist dispute does not concern the desirability of private property in items unrelated to production. The issue between socialists and capitalists is not whether individuals should be able to own “personal property” (for example, toothbrushes, houses, clothing, and other articles of everyday use) but whether they should be able to own “productive property” (for example, stores, factories, raw materials, and other productive assets).

But what does it mean to own something? Standardly, to own something is to enjoy a bundle of legally enforceable rights and powers over that thing. These rights and powers typically include the right to use, to control, to transfer, to alter (at the limit, even to destroy), and to generate income from the thing owned, as well as the right to exclude non-owners from interacting with the owned thing in these ways. Because these rights admit of gradations, so too does ownership, which is scalar—a matter of degree—rather than dichotomous. In general, the wider one’s rights of use, control, and so on over an object, and the fewer restrictions one faces in exercising these rights, the fuller one’s ownership of that object. Ownership, notice, may be narrowed and restricted without ceasing to be ownership. Limited ownership is not an oxymoron.

Another important distinction here is that between legal and effective ownership. These can go together, as when a person owns her car both in law and in fact: she not only has the title, but also possesses actual powers of use, control, and so on over the vehicle. But so too can they come apart. “The means of production belong to all the people,” proclaimed the Soviet Union’s constitution, but these were just words, for in reality democratically unaccountable bureaucrats and party officials grasped all the important economic levers. Something similar could be said of the relationship between shareholders in large capitalist corporations, on the one hand, and management and executives on the other: the former have “paper” ownership, but it is the latter that really exercise control. In general, it is effective rather than merely legal or formal ownership that is of interest in the present context. Capitalists and socialists alike want to realize their preferred patterns of ownership not just on paper, but also in reality.

b. Private, State, and Social Ownership

To understand socialism, one must distinguish between three forms of ownership. Under private ownership, individuals or groups of individuals (for example, corporations) are the primary agents of ownership; it is they who enjoy the various rights of use, control, transfer, income generation, and so on discussed above. Under state ownership, the state retains for itself these rights, and is thus the primary agent of ownership. Both of these forms of ownership should be familiar to anyone who has frequented a business or driven on an interstate highway.

Much less familiar is the key socialist idea of social ownership. Social ownership of an asset means that “the people have control over the disposition of that asset and its product” (Roemer, A Future for Socialism 18). Social ownership of the means of production, then, obtains to the degree that the people themselves have control over these means: over their use and over the products that eventuate from that use. This is a conceptually simple idea, but it can be difficult to grasp its practical implications. How, in concrete terms, could social control over the means of production be realized?

Historically, socialists have struggled to answer this question, and for good reason: it is not at all obvious how meaningful control over something as massive and complex as a modern economy might be shared across tens or even hundreds of millions of people. Broadly speaking, socialists have identified two main strategies of socialization. The first seeks to socialize the economy by nationalizing it. The second seeks the same end by radically decentralizing and democratizing economic power. These strategies will be investigated in greater detail below (see section 8), but for now a few orienting remarks are in order.

First, regarding nationalization: state ownership functions as a vehicle for socialization only to the extent that the people are themselves in control of the state. Otherwise nationalization amounts to little more than statism, not socialism; it constitutes economic rule by state officials rather than by society as a whole. Any genuinely socialist program of nationalization, then, must adhere to a two-part recipe: nationalize the economy, but also democratize the state, thereby putting the people in control of the economy at one remove.

This second step has proven rather elusive in practice. It was not accomplished—indeed, it was not even really attempted—by the so-called “socialist” authoritarianisms of the 20th century such as the Soviet Union and China. And certainly considerable barriers to genuine democratization exist even in countries with longstanding liberal democratic traditions, such as the United States. These barriers include the awesome influence of special interests and concentrated wealth on the political process, corporate domination of political media, voter ignorance and apathy, and so on. Democracy—popular control over the state—is, in short, an ideal easier praised than implemented, even under favorable conditions. However, these considerable practical problems aside, there seems to be nothing incoherent in principle with the idea of a genuinely socialist—because genuinely democratic—program of nationalization.

Or is there? Many socialists argue that state ownership can never fully realize socialism’s promise, no matter how democratic the relationship between the people and the state. This is because real social ownership involves not only control-at-a-remove, so to speak, but also active involvement and participation. On this conception, it is not enough for democratically accountable politicians and bureaucrats to steer the economy in your name; rather, you must do (or at least have the real opportunity to do) some of the steering yourself. The core idea here is well expressed by Michael Harrington:

Socialization means the democratization of decision making in the everyday economy, of micro as well as macro choices. It looks primarily but not exclusively to the decentralized, face-to-face participation of the direct producers and their communities in determining the matters that shape their social lives (197).

In a socialist society, average, everyday people must be active rather than passive, empowered rather than subordinated, involved rather than excluded. But if this is what genuine socialization requires, then socialism is

not a formula or a specific legal mode of ownership, but a principle of empowering people at the base, which can animate a whole range of measures, some of which we do not yet even imagine (Harrington, 197).

The point is not that nationalization can never play a role in making socialism real, but that it cannot play the outsized role often assigned to it.

But if socialists should not rely exclusively on nationalization, to what else should they appeal instead? Different socialists will answer this question in different ways, as we will see in section 8. But most would recommend leavening democratically controlled state ownership with sizable helpings of workplace democracy (as found, for instance, in the Mondragon and La Lega cooperatives in Spain and Italy, respectively), social control over investment, and various other measures to economically empower local communities and individuals (for instance, the “participatory budgeting” process found in Porto Alegre, Brazil, through which citizens meet in popular assemblies to decide how the city’s resources should be spent). By knitting together nationalization of major industries with these and other programs and initiatives, socialists hope to bring to fruition the “truly audacious project of empowering people to take command of their everyday lives” (Harrington, 197).

c. Economic Systems as Hybrids

In principle, an economy could be wholly capitalist, statist, or socialist. An economy would be wholly capitalist just in case all its productive assets were privately controlled; wholly statist, provided all such assets were state-controlled; and wholly socialist, provided all such assets were socially controlled. While these are coherent theoretical possibilities, they have not been implemented in practice. In reality, all economies are hybrids that blend together private, social, and state ownership. It is better, then, to think of capitalism, statism, and socialism “not simply as all-or-nothing ideal types of economic structures, but also as variables” (Wright, 124). According to this analysis, an economy can be more or less capitalist, socialist, or statist, depending on the particular balance it strikes between the three forms of ownership.

For example, even in the United States—widely seen as a bastion of capitalism—the state plays a considerable role in controlling economic activity and in distributing the proceeds thereof. Does this mean it is a statist or perhaps even a socialist economy? No. Economies should be individuated with reference to their dominant mode of ownership. Since capitalist ownership dominates the United States’ economy—most of its productive assets being privately owned—it should be thought of as capitalist, albeit with some non-capitalist aspects. Similarly, an economy within which most productive assets are socially controlled should count as socialist, even if (as would almost certainly be the case) it also included statist or capitalist elements.

2. Socialism vs. Communism in Marxist Thought

Although this article focuses on socialism rather than Marxism per se, there is an important distinction within Marxist thought that warrants mention here. This is the distinction between socialism and communism.

Both socialism and communism are forms of post-capitalism. Both feature social rather than private ownership of the means of production. Both, within Marxist orthodoxy, reject market production for profit in favor of planned production for use. But beyond these important similarities lie significant differences. In the Critique of the Gotha Program, Marx’s fullest discussion of these matters, he divides post-capitalism into two parts, a “lower phase” (later called “socialism” by followers of Marx) and a “higher phase” (communism). The lower phase follows immediately on the heels of capitalism, and so resembles it in certain ways. As Marx memorably puts this point, socialism is “in every respect, economically, morally and intellectually still stamped with the birth marks of the old society from whose womb it emerges” (Critique of the Gotha Program 614). These capitalist “birth marks” include:

Material scarcity. Like capitalism, socialism does not overcome scarcity. Under socialism, the social surplus increases, but it is not yet sufficiently large to cover all competing claims.

The state. Socialism transforms the state but does not do away with it. What was a “dictatorship of the bourgeoisie” under capitalism becomes a “dictatorship of the proletariat” under socialism: a state controlled by and for the working class. (Since workers make up the vast majority, this is less authoritarian than it sounds.) Workers must seize the state and use it to implement, deepen, and secure the socialist transformation of society. Only after this transformation is complete can the state “wither away”, and the “government of people” be replaced by the “administration of things”.

The division of labor. Socialism, like capitalism, will feature occupational specialization. Having developed under capitalist educational and cultural institutions, most people were socialized to fit narrow, undemanding productive roles. They are not, therefore, “all-round individuals” capable of performing a wide variety of complex productive tasks. Accordingly, socialism must feature a broadly inegalitarian occupational structure. As under capitalism, there will be janitors and engineers, nurses’ aides and surgeons, factory workers and planners.

Finally, under socialism (many) people will retain certain capitalist attitudes about production and distribution. For example, they expect compensation to vary with contribution. Since contributions will differ, so too will rewards, leading to unequal standards of living. Turning from distribution to production, many socialist producers will, like their capitalist predecessors, regard work as merely a “means to life” rather than “life’s prime want”. They work, in short, to get paid, rather than to develop and apply their capacities or to benefit their comrades.

So in all of these ways, the “lower phase” of post-capitalism resembles its capitalist predecessor. Over time, however, these capitalist “birth marks” fade, all traces of bourgeois attitudes and institutions vanish, and humanity finally achieves the “higher phase” of post-capitalist society, full communism.

What would “full communism” be like? Marx never answered this question in detail—and indeed, he disparaged as utopian those socialists who focused excessively on “drawing up recipes for the kitchens of the future”—but from his brief remarks about communist society certain broad outlines can be discerned. Perhaps his most famous description of communism comes in the following passage from the Critique of the Gotha Program:

In a higher phase of communist society, after the enslaving subordination of the individual to the division of labor, and therewith also the antithesis between mental and physical labor, has vanished; after labor has become not only a means of life but also life’s prime want; after the productive forces have also increased with the all-round development of the individual, and all the springs of cooperative wealth flow more abundantly—only then can the narrow horizon of bourgeois right be crossed in its entirety and society inscribe on its banner: from each according to his ability, to each according to his needs (615).

Unpacking this passage, we see that Marx makes all of the following claims about communism:

It has done away with the division of labor, especially that between mental and physical labor;

Attitudes towards work have changed (perhaps in part because work itself has changed). Communist producers regard work as both instrumentally and intrinsically valuable: they see work not merely as a means to life, but also as “life’s prime want”;

Human beings themselves have changed under communism, becoming “all-round”, highly developed individuals (rather than the stunted, one-sided creatures that so many of them were under capitalism and perhaps even under socialism);

Material scarcity has been eliminated or at least greatly attenuated, as “the productive forces have increased” and thus “all the springs of cooperative wealth flow more abundantly”;

As a result of all these changes, communist society is able to conform to the famous principle: from each according to his ability, to each according to his needs—thus severing the link (found in communism’s “lower phase”) between contribution and reward.

Not only will communism (unlike socialism) do away with class, material scarcity, and occupational specialization, it will also do away with the state. As noted above, the state begins to wither away under socialism. But this process is not completed until the “higher phase” of full communism, for it is only in that phase that lingering class antagonisms are finally eradicated. With these antagonisms cleared away, the state has nothing to do—no class conflict to manage, no further function to perform—and so, like a vestigial limb, it gradually atrophies from disuse. Or, as Engels famously puts this point in Socialism: Utopian and Scientific,

State interference in social relations becomes, in one domain after another, superfluous, and then dies out of itself; the government of persons is replaced by the administration of things, and by the conduct of processes of production. The state is not “abolished”. It withers away (91).

In sum, within Marxist theory socialism and communism are very different indeed. Although both eradicate private property and profits, only the latter also eliminates the division of labor, the state, material scarcity, and perhaps even conflict itself. It is only under communism that mankind completes its ascendance from the “kingdom of necessity” into the “kingdom of freedom” (Engels 95).

3. Why Socialism? Economic Considerations

Is socialism worthy of allegiance, and if so, why?

The standard normative argument for socialism is comparative. Socialists typically single out certain moral and political values, argue that these values are poorly served under capitalism, and then support socialism by contending that these values would fare better—not necessarily perfectly, but better—under socialism. Values drawn upon by socialists vary, but usually include democracy, non-exploitation, freedom (both formal and effective), community, and equality. Sections 4–7 discuss these values and their alleged connections with socialism.

But before turning to these explicitly normative arguments, a word should be said about the purely economic case for socialism. (Since this article’s focus is normative rather than economic, this section will be brief.) Capitalism, many socialists hold, is wild and wasteful, prone to great booms and tremendously destructive busts. The argument goes like this: capitalist competition greatly augments society’s forces of production. Each firm, merely to stay in business, must innovate. As a result, productivity soars. Ever more output can be produced for ever fewer inputs, labor included. Abundance looms.

But this very abundance, paradoxically, is an economic problem. Gluts drive down prices as supply overwhelms demand. Profits decline. Firms, forced to cut costs, sack workers and slash wages. As unemployment and economic insecurity mount, demand plummets still further: people simply don’t have much money to spend. With reduced demand comes reduced opportunities for profits, hence, reduced production. What was a boom has turned into a bust, and society faces the absurd spectacle of idle farms next to hungry people; empty shoe factories beside shoeless workers; foreclosed houses alongside the homeless.

Capitalism, then, makes possible universal abundance. But its central features—market competition, the pursuit of profits, and private property—ensure that this possibility will never be realized. In Marxist language, there is a deep “contradiction” between capitalism’s “forces of production” and its “relations of production”, a contradiction that nothing short of socialist revolution can solve. Society must overthrow capitalist productive relations, replacing anarchic market production for profit with planned production for use. Only then will humanity eliminate the ridiculous concatenation of vast productive potential alongside vast unmet needs. Or so the socialist argument goes.

Socialists find further economic faults with capitalism. Capitalism misallocates resources towards producing what is profitable rather than what is needed. True, what is needed can sometimes be profitable. But often the two categories come apart. Think, the socialist will say, of the vast resources spent producing luxuries for the rich, while the needy go without. Or consider the underproduction of critical, but unprofitable, antibiotics, even as “lifestyle drugs” (like Propecia, for baldness) roll off the production line.

Capitalism is also inefficient in its use of human labor power. Capitalism functions best when there exists a “reserve army of the unemployed,” in Marx’s phrase. The credible threat of unemployment reduces workers’ salary demands and increases their work effort. But unemployment means idle workers: able-bodied people, willing to work, who cannot find an outlet for their productivity. This is a waste, and it would not exist under socialism (or so it is claimed).

Further, capitalism allows an entire segment of the (able-bodied) population to live without working: namely, the independently wealthy, who can simply live off investment income. This, again, is wasteful; were these people recruited into the labor process, labor time for the rest could decline. Finally, capitalism misdirects the labor of many of those it does employ. Just think, the socialist will say, of the legions of lawyers, advertisers, marketers, and financial workers. Such workers (and others besides) perform no real productive function. Their jobs are necessary only within the framework of capitalism itself. In a socialist economy, there is no need for marketing, financial speculation, or lawyers specializing in mergers and acquisitions. Socialism would free people currently doing these tasks to apply their talents in a more useful way. Marketers could become teachers; financiers, farmers. And we would all be the better for it.

In sum, socialists seek to upend the common sense view of capitalism. Most people take it for granted that whatever its normative flaws, at the very least capitalism ‘delivers the goods,’ so to speak. Not so, replies the socialist. Because it is prone to economic crises, and is wasteful and inefficient in its use of the means of production (including human labor), capitalism’s economic bona fides must be questioned.

4. Why Socialism? Democracy

The article turns now to the normative case against capitalism and in favor of socialism, starting with democracy.

Democracy means rule by the people, as opposed to rule by the rich, or rule by the excellent, or, more generally, rule by any part of the people over the rest. Systems plausibly claiming to be democratic can vary along at least three dimensions. They can bring a broader or a narrower range of issues under democratic jurisdiction; their members can be more or less directly involved in the exercise of political power; and they can insist upon greater or lesser equality of influence (or perhaps opportunity for influence) over political processes. Call these the scope, involvement, and influence dimensions, respectively.

Other things being equal, as involvement, scope, and equality of influence increase, so too does democracy. Thus it can make sense to say that one democratic system is “more democratic” than another. So too, it is possible to compare different democratic ideals in terms of their “democratic-ness”. A principle or ideal that insists upon maximal equality of influence, for instance, is (other things equal) more democratic than a principle or ideal that does not.

Socialists are radical democrats. They do not merely profess rule by the people; they also interpret that ideal in a highly democratic way, opting for maximalist or near-maximalist positions along all three of the just-mentioned dimensions. They want democracy to have very broad scope; they want citizens to be highly involved in democratic processes; and they want citizens to have roughly equal opportunities to influence these processes. And they typically argue, further, that the democratic ideal, understood in this rich and demanding way, militates against capitalism and in favor of socialism. This article will focus on the scope and influence dimensions.

a. Scope

To see this argument, consider first the scope dimension of democracy, which concerns the question: where should the boundary between public and private, between politics and civil society, be drawn? Which issues should be subject to democratic choice? Many socialists endorse something like the following principle:

All Affected Principle: People affected by a decision should enjoy a say over that decision, proportional to the degree to which they are affected.

However, it is a rather short step—or so say socialists—from this intuitively plausible principle to the radical conclusion that economics should be subordinated to democracy, that large swathes of economic life should be politicized and brought under popular control. All that is required to make that leap is the seemingly incontrovertible premise that many economic issues affect the public. When a local business fires 20% of its workers, this affects the public. When financiers withdraw support for a new shopping center, this affects the public. When society’s productive assets are deployed to make yachts for millionaires rather than affordable housing, this affects the public. When corporations pull up roots and relocate production to greener pastures, this affects the public.

In all of these cases (and many others besides), people’s lives are affected—indeed, often profoundly affected—by economic decisions. Do they get a say in these decisions, as required by the All Affected Principle? Not under capitalism, which grants extensive control over such matters to holders of private property rights. Where private property reigns, owners rather than affected parties decide, for example, whether to hire or fire, to invest, to relocate, and so on. From the socialist point of view, this is a serious offense against democracy. Capitalism, socialists claim, depoliticizes what should remain political; it cedes far too much control over common affairs to private parties. It is, in this way, insufficiently democratic.

But if the root cause of this democratic deficit is private control over productive assets, then the solution, or so socialists argue, must be social control over the same. Social property brings into the democratic domain what private property improperly removes. What touches all must be decided by all; economic matters touch all; therefore economic matters must be decided by all. This is the simple but powerful democratic syllogism at the heart of one major argument for socialism, for social rather than private control of the economy. What might social control over the economy look like in practice? Section 8 explores competing answers to this question.

b. Influence

Socialists find further grounds for rejecting capitalism in democracy’s influence dimension. Standardly, democracy is held to require not merely that all citizens have a say, but that they have an equal say. But what does this really mean? To clarify, suppose that A and B have equal voting rights, but A, being rich, educated, and leisured, has a greater chance to influence the political process than B, who is poor, uneducated, and short on free time (he must work long hours to make ends meet). Do A and B have an “equal say”, in the sense required by democracy?

Nearly all socialists, and indeed, many non-socialists, would say “no”; they would detect a democratic deficit in this scenario, for they typically see democracy as requiring not merely formal equality of opportunity for political influence but also substantive or fair equality of opportunity for political influence. On this view, it is not enough for A and B to enjoy identical legal protections to vote, to run for office, to engage in political speech, and so on. Instead, genuine democracy requires (over and above this merely legal equality) that equally talented and motivated citizens have roughly equal prospects for winning office and/or influencing policy, regardless of their economic and social circumstances—or something along these lines.

Now, capitalism clearly can implement formal political equality. Many capitalist societies grant their citizens equal rights to vote, to run for office, and so on. But can capitalism implement substantive political equality?

Many socialists think not. Capitalism, they point out, generates steep economic inequalities, dividing society into rich and poor. But in a variety of ways, the rich can translate their economic advantages into political ones. This translation can occur relatively directly, as when the rich buy political influence through campaign contributions, or when they hire lobbyists to steer legislative priorities (sometimes going so far as to draft laws themselves). Or it can occur relatively indirectly, as when the wealthy use their ownership of media to shape public opinion (and thus the political process), or when capitalists threaten to take their money out of the country in response to disliked (usually leftist) policies, thereby limiting what government can do.

But whether moneyed interests affect politics directly or indirectly, the net result is the same: capitalism amplifies the voices of the rich, enabling their concerns to dominate the political process. Indeed, some socialists, pressing this objection to its logical conclusion, contend that “democracy” under capitalism is really little more than oligarchy—rule by the rich—covered by a democratic fig leaf. Or, as Vladimir Lenin put this point: “Democracy for an insignificant minority, democracy for the rich—that is the democracy of capitalist society” (79).

Sophisticated defenders of capitalism respond by arguing that capitalism’s democratic deficits can be repaired within a fundamentally capitalist framework. Campaign finance reform, regulation of lobbying, restrictions on corporate domination of media, even limitations on the movement of capital across borders would, together, do much to restore or preserve political equality amidst capitalist economic inequality, and yet none of them are incompatible with capitalism per se. It follows (capitalists argue) that there is no need to throw out the baby of capitalism with the bathwater of political inequality. Sufficiently reformed, capitalism can indeed realize not just formal political equality but also substantive political equality.

The question, socialists would reply, is whether these reforms would ever be chosen by political elites under capitalism. Will capitalist oligarchs willingly undercut the very basis of their rule by socializing control over mass media, installing real campaign finance reform, limiting capital flows, and so on?

Would socialism perform any better than capitalism on this “influence” dimension of democracy? Would it enable equally talented and motivated citizens to have roughly equal prospects for influencing politics? Socialists argue that it would. Because it eliminates class, socialism eliminates the major threat to substantive political equality. (Of course, other forms of exclusion, such as racism and sexism, must also be overcome.) Wealthy property owners will not dominate the political process at the expense of the poor and unpropertied because the latter will be an empty set. Everyone will be a wealthy property owner, in the sense that everyone will share control over the means of production and will have access to a dignified standard of living. Everyone will therefore have roughly equal economic resources to bring to bear on the political process.

Put differently, whereas capitalism attempts to secure political equality despite massive economic inequalities, socialism attempts to secure political equality in large part by eliminating these inequalities.

5. Why Socialism? Exploitation

According to many socialists, one of capitalism’s central moral failings is that it is exploitative. Socialism, by contrast, would not be exploitative—or so these socialists allege—and this is one of the main reasons for preferring it to capitalism.

But what is exploitation? Is capitalism truly exploitative? And would socialism really eliminate exploitation? This section explores socialist answers to these questions.

a. Exploitation as Forced, Unpaid Labor

Although there is no universally accepted account of exploitation, Jeffrey Reiman’s Marx-inspired suggestion that exploitation is “a kind of coercive prying loose of unpaid labor” provides a good framework for discussion (3). On this account, a person is exploited if and only if she is forced to work for free. Feudal serfs, for example, were exploited because they were legally and physically compelled, at sword-point if necessary, to spend part of their working time toiling in the lord’s fields for nothing in return. This was forced, unpaid labor of the most obvious sort, and it constituted a serious form of exploitation.

But are capitalist employees exploited? At first glance, it would appear not. Workers get paid wages, so it doesn’t seem as if they are working for free. Nor does it appear that workers are forced to work. Capitalism, being a system of “free labor”, grants workers ownership over their labor power and entitles them to sell it—or not—as they please. So where is the force supposedly inherent in the capital/worker relationship?

Take the issue of force first. In general, a person is forced to do something X whenever she has no reasonable alternative to doing X. Workers, then, are forced to sell their labor power to capitalists just in case they have no reasonable alternative to doing so. But of course they don’t have a reasonable alternative, or so some socialists contend. Their argument is simple. Everyone must make a living. There are, under capitalist property relations, only two main ways to do this: one can live off of investment or property income, or one can live off of wages. By definition, workers cannot pick this first option; they don’t own means of production, so they can’t live off of income generated by such ownership. This leaves wage labor as the only acceptable option. True, workers are formally free to decline capitalist employment, but this does not represent a reasonable option since its consequences are so dire: starvation or, in more enlightened circumstances, life on the dole. Workers therefore have no minimally reasonable choice but to sell their labor power to owners of means of production.

It follows that workers are forced to work for capitalists, even if they are not so forced by capitalists (or indeed, by anyone else). The forcing in question is structural rather than agential; as Reiman explains, it is “an indirect force built into the very fact that capitalists own the means of production and laborers do not.” Or, as Marx puts this point, it is “the dull compulsion of economic relations” rather than “direct force” that “completes the subjection of the laborer to the capitalist” (Capital Vol. I, 737).

Not all socialists accept this argument. G.A. Cohen, for example, suggests that individual workers do have a reasonable alternative to selling their labor power: they can become capitalists themselves (“The Structure of Proletarian Unfreedom”). Not overnight, perhaps, but with enough scrimping and saving, is it not possible for an individual worker to start a business of her own? Cohen concludes that individual workers are not forced to sell their labor power. (He also argues that workers are “collectively unfree”—unfree as a class—since not all, or even many, workers can escape their class at the same time; the economy can absorb only so many small business owners at any given moment. But this alleged collective unfreedom of workers, though interesting and important, is peripheral to our present topic and so must be set aside.)

In response, some socialists question whether opening a small business really represents a reasonable option for most workers. For one thing, many workers simply can’t save enough to open such a business: their wages are just too small relative to the cost of living. For another, even if a worker is able, through years of thrift, to open her own business, most businesses fail, often leaving the owner much worse off financially than she would have been had she simply remained a wage laborer. Pulling together these ideas, one critic of Cohen concludes that “escaping into the petty bourgeoisie…is a reasonable alternative only for a tiny minority of workers. Thus the vast majority of working-class individuals are forced to sell their labor power to earn a living” (Peffer, 152).

But even if Cohen is wrong, and individual workers are forced to sell their labor power, notice that it does not yet follow that workers are exploited. For forced labor alone does not exploitation make. Exploitation, as described above, involves forced, unpaid labor. Let us turn, then, to the issue of compensation, and in particular, to the question of whether workers toil (at least in part) for free.

Again, surface appearances cut against the socialist position. Wage laborers standardly receive an hourly wage. If they work, say, eight hours, they get eight hours’ pay. It certainly seems, then, that workers receive full compensation for their toil. Perhaps this compensation is unfairly low, but that is a different issue: the exploitation charge, standardly construed, is that workers are forced to work for no pay, not that they are forced to work for low pay.

But probe more deeply, some socialists contend, and the unpaid nature of much work under capitalism becomes clear. To see their argument, it helps to start with an easier case: feudal production. Under feudalism, serfs spent part of their working time working in their own fields and the rest working in their lord’s fields. They kept whatever they could grow on their own plots, and surrendered whatever they grew on the lord’s. Put differently, serfs received compensation for part of their working time, but no compensation at all for the rest of it. A great deal of their work, then, was wholly unpaid: a fact that was very obvious to all involved, given the physical separation between paid work (on the serf’s fields) and unpaid work (on the lord’s).

Marxists argue that precisely the same division between paid and unpaid work exists under capitalism. Workers spend the first part of their working day working, in effect, for themselves. This is the part of the day during which they produce the equivalent of their wages. Marx calls this “necessary labor time”. But the working day does not stop there. Indeed, it cannot stop there, for if it did, there would be no “surplus product” for the capitalist to appropriate, and thus no reason for the capitalist to hire the worker in the first place. So the capitalist requires the worker to perform “surplus labor”, which is just labor beyond “necessary labor”: labor beyond what is required to produce value equivalent to the worker’s wage. The value produced during surplus labor time, Marx calls “surplus value”. Crucially, this surplus value belongs to the capitalist rather than the worker, and is the source of all profits.

To illustrate, consider a worker who produces one widget per hour over the course of an eight-hour shift, thus yielding eight widgets in total. Her boss takes these widgets, sells them, and then returns part of the proceeds to the worker in the form of a wage. But this wage must be less than what the capitalist reaped by selling the widgets. Otherwise the capitalist would have nothing left over as profit. To fix ideas, suppose that the worker’s daily wage is equivalent to the value of two widgets. To produce this value, she had to toil for two hours (at one widget per hour). Yet her shift lasts eight hours. It follows that she spent two hours working for herself, and six hours working for her boss: which is to say, six hours working for free.
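
The arithmetic of this illustration can be set out as a short worked calculation. The symbols below ($r$ for the production rate, $T$ for the length of the shift, $w$ for the daily wage measured in widgets, and $t_n$ and $t_s$ for necessary and surplus labor time) are labels introduced here for convenience, not notation drawn from Marx. The sketch also assumes, as the example does, that every widget sells for the same value.

```latex
\begin{align*}
r &= 1 \text{ widget per hour}, \qquad T = 8 \text{ hours}, \qquad w = 2 \text{ widgets} \\
\text{total product} &= r \cdot T = 8 \text{ widgets} \\
t_n &= \frac{w}{r} = 2 \text{ hours (necessary labor time)} \\
t_s &= T - t_n = 6 \text{ hours (surplus labor time)} \\
\text{surplus product} &= r \cdot t_s = 6 \text{ widgets}
\end{align*}
```

On these figures, three quarters of the shift ($t_s / T = 6/8$) counts as unpaid on the Marxist account.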

We can now appreciate Marx’s remark that “the secret of the self-expansion of capital [that is, the secret of profit] resolves itself into having the disposal of a definite quantity of other people’s unpaid labor” (Capital Vol. I, 534). Profits, on Marxist analysis, are possible only through the extraction of unpaid surplus labor from workers. Wage workers toil gratis no less than serfs. That the division between paid and unpaid labor under capitalism is temporal rather than physical or spatial (as under serfdom) makes this division harder to see, but it does not in any way diminish its reality—or so the socialist argument goes.

b. Eliminating Exploitation

How exactly is socialism supposed to eliminate exploitation? Notice that it would not eliminate work itself. As Marx writes, “Just as the savage must wrestle with Nature to satisfy his wants, to maintain and reproduce life, so must civilized man, and he must do so under all social formations and under all modes of production” (Capital Vol. III, Ch. 48). So even under socialism, work must be done.

However, it does not follow that people must be forced to do it. Society could eliminate the compulsion to labor by partly decoupling income (or access to basic resources more broadly) from work. Philippe van Parijs’s “unconditional basic income” represents one way to achieve this decoupling. On his proposal, which has attracted significant support from socialist quarters, each citizen, no matter how rich or how poor, would be paid a monthly income, set as high as possible, and in any case sufficient to live with dignity. This income would come without any strings attached. In particular, it would not be conditional on working, seeking work, or training for future work. It would go to all members of the political community: leisured surfers off of Malibu no less than industrious steelworkers in Pittsburgh.

The economic feasibility of such a proposal may be questioned. But for present purposes, the important thing to appreciate is the way in which a UBI (as it is known) gives each person the “real freedom” to drop out of the paid labor force, thereby eliminating both the compulsion to work and (therefore) exploitation.

From a socialist perspective, there are at least two potential problems with this way of eliminating exploitation.

First, a UBI enables people to live off the hard work of others—no reciprocation required. Again, surfers get the check no less than people with paid employment. But socialists complain when capitalists live off the work of others; shouldn’t they complain when surfers (and so forth) behave similarly?

Second, there is nothing uniquely socialist about a UBI. Capitalist no less than socialist societies can implement a UBI, thereby enabling everyone to live decently without working. A defender of capitalism might therefore insist that when it comes to exploitation, capitalism and socialism are on a par: both are equally susceptible to exploitation and equally able to enact the policies needed to eliminate it.

In response, socialists might point to the second necessary feature of exploitation, non-compensation. Notice that compensation takes many forms. Acquiring exclusive control over a sum of money, or over a bundle of resources, is one of them. But so too is acquiring a share of control over resources. Say that you and I work to build a tree house which we then jointly control. Neither of us has exclusive say over the tree house. And yet it would be wrong to conclude that our labors have gone uncompensated. We have been compensated; it’s just that our compensation comes in the form of common rather than private property.

This is precisely the sense in which all labor is compensated under socialism. Workers own the means of production together; they (therefore) own the surplus generated by these means. True, they do not own this surplus privately. They share control over its disposition and use. But shared control can be a form of compensation no less than private control.

Under capitalism, workers have private ownership over their wages (and the things these wages buy) but no ownership at all over most of what they produce. This is the sense in which most of their laboring activity goes uncompensated. Workers produce a surplus, hand it over to capitalists, and are then cut out of the picture; their bosses are free to do with the surplus whatever they like: consume it, invest it, burn it, and so forth. Under socialism, by contrast, workers have private ownership over their wages (or, in a money-less economy, over resources for personal use) and collective ownership over the social surplus they produce. They both make the surplus and share control over how to use this surplus. At no point, then, are socialist producers toiling ‘for free’, since their labors go towards building an economy that is shared and controlled by all. It’s as if everyone made a gigantic tree house that everyone is then free to use and to help govern.

So, contrary to the capitalist objection raised above, it seems that socialism is uniquely well positioned to eliminate exploitation. Both socialism and capitalism could, in principle, eliminate forced labor by attenuating the link between income and work. But only socialism can ensure that all work is compensated through common ownership of the social surplus. Thus socialism expunges exploitation from economic life even absent something like a UBI, whereas the same cannot be said of capitalism.

Against this argument, critics might reply that the kind of ‘compensation’ for surplus labor promised by socialism is wholly inadequate. Under capitalism, the worker’s surplus is appropriated by the capitalist; under socialism, the worker’s surplus is appropriated by society. From the worker’s point of view, this may seem a distinction without a difference. Both appropriations rob the worker of effective control over the fruits of her labor. True, under socialism the worker is a member of the group doing the appropriating, but, as merely one of millions of such members, her individual influence over that group is infinitesimal. Is it plausible to regard her tiny sliver of decision-making power over the surplus as ‘compensation’ for her surplus labor? Arguably not, in which case socialism does not actually eliminate exploitation.

6. Why Socialism? Freedom and Human Development

Many socialists point to considerations of freedom, broadly understood, to support socialism over capitalism.

Freedom comes in many varieties. This article will discuss two. Formal freedom involves the absence of interference. Effective freedom involves the presence of capability. A person who is unable to walk has the formal freedom to ascend a steep flight of steps—assuming that no one will interfere with her attempt—but lacks the effective freedom to do so.

a. Formal Freedom

It is sometimes suggested that socialism fares poorly with respect to formal freedom. There are two main grounds for this contention, one historical, the other conceptual.

Historically, many countries claiming to be socialist trampled basic liberties such as freedom of expression and religion. They imprisoned and killed political dissidents and other ‘enemies of the people’. Far from being free societies, they were deeply oppressive ones.

Some critics of socialism suggest that this historical correlation between socialism and oppression was no accident. Rather, it reflects a deep flaw in socialism’s design. Socialism concentrates economic and political power in the hands of the state. Abuse is inevitable under such conditions. Milton Friedman, building off of this insight, famously posited a necessary connection between capitalism (which, unlike socialism, disperses economic power rather than concentrating it) and freedom: not all capitalist societies are free, but all durably free societies must be capitalist.

Socialists concede the heart of Friedman’s point, but argue that it does not undermine their position. Friedman, they say, was right to warn against excessive centralization of power. But he was wrong to suggest that socialism necessarily requires said centralization. The contemporary socialist ideal is profoundly democratic and decentralized; it seeks to disperse economic power, not concentrate it. It aspires to an economy and a society controlled from the broad bottom, not the narrow top. So the kind of socialism that contemporary socialists embrace is simply different than the kind of ‘socialism’ targeted by Friedman’s critique. Put differently, Friedman’s worry attacks a view held by very few socialists today—or so it might be argued.

Turning to a different objection, it is sometimes suggested that on purely conceptual grounds socialism is a more restrictive society than capitalism. The argument for this claim is simple. Capitalism permits private ownership of productive assets; socialism does not. Socialism therefore provides less formal freedom than capitalism. It interferes with various economic activities that capitalism allows. Thus, if what you value is formal freedom, then you should prefer capitalism to socialism.

The trouble with this argument, as pointed out by G.A. Cohen, is that it “see[s] the freedom which is intrinsic to capitalism [but not] the unfreedom which necessarily accompanies capitalist freedom” (“Capitalism, Freedom, and the Proletariat” 150). Capitalism does indeed allow some things that socialism forbids: for example, opening a business. But the converse is also true. To use Cohen’s example: I am free to pitch a tent on common land. I am not free to pitch a tent on land that you own privately. Should I try, the state will interfere, thereby reducing my formal freedom. Private property’s effects on formal freedom, then, are not uniformly positive, but mixed. Private property extends formal freedom to owners even as it withdraws it from non-owners. As Cohen writes, “To think of capitalism as a realm of freedom is to overlook half its nature” (“Capitalism, Freedom, and the Proletariat” 152).

Of course, precisely the same can and indeed must be said of socialism. All systems of property, whether capitalist or socialist, exert complex effects on formal freedom; all such systems necessarily distribute both freedom and unfreedom. But in light of this complexity, our guiding question here—which system, capitalism or socialism, provides more formal freedom?—is probably unanswerable. All we can say with confidence is that these systems provide differently shaped zones of formal freedom; each extends formal freedom in some ways while restricting it in others. However, it seems extremely difficult, perhaps impossible, to determine which of these zones is ‘larger’ overall. At the very least, defenders of capitalism must say a great deal more to establish that capitalism is, a priori, a freer society than socialism.

Socialists score this particular fight a draw.

b. Effective Freedom

Whereas socialists tend to play defense regarding formal freedom, they go on offense when discussing effective freedom.

Effective freedom, again, involves the capacity to accomplish one’s ends. This implies but goes beyond formal freedom. Say that my goal is to complete a marathon. One way I can fail to accomplish this goal is by meeting with agential interference. If you physically restrain me from participating in the race, you undermine my effective freedom by undermining my formal freedom. However, effective freedom usually requires much more than the mere absence of interference. I can actually complete a marathon, for example, only if a host of further conditions are in place. Some are broadly social: I must live in a society in which marathons occur. Others are broadly economic: I must be able to afford all the costs associated with training for the race, traveling to the race, entering the race, and so forth. And there are physical or “internal” factors as well. I can’t finish the marathon unless I have sufficient mobility and endurance. All of which is to say that effective freedom depends upon a wide range of factors, many of which have nothing to do with human interference per se.

Now, in a good and just society, which effective freedoms—which “capabilities,” as they are sometimes called—would people have? The typical socialist response runs as follows. At a minimum, everyone must have the effective freedom to meet their basic needs for food, shelter, health care, and so on. With these capabilities in place, people are able to survive. This is a crucial accomplishment, and one demanded by minimal standards of justice and decency. However, a truly good society must set its sights higher; it must enable people not merely to survive, but also to flourish.

And what is human flourishing? Socialists standardly accept a broadly Marxist/Aristotelian account according to which the good life centrally involves not just the passive pleasures of consumption (watching TV, eating tasty food, and so on) but also the more active and enduring satisfactions of “self-realization”, which can be defined as the “development and application of a person’s talents in a way that gives meaning to his or her life” (Roemer, A Future for Socialism 3). Mastering an instrument, playing a sport, solving a physics problem, writing an article, building a shed: these are all examples of potentially self-realizing activities.

Such activities typically have a steep “learning curve” that makes them frustrating at first, but deeply gratifying over the long haul. In this, they contrast sharply with consumption activities, which have the opposite hedonic profile: watching TV is immediately gratifying, but its charms wane with repetition. This contrast is one reason why self-realization is (according to many socialists) more important to human flourishing than consumption. A life replete with consumption but lacking in self-realization becomes stale, cramped, unsatisfying. Indeed, at the limit, it veers towards meaninglessness. It is only through the development and exercise of one’s higher powers and talents that one leads a truly human existence—or so the socialist view has it.

A genuinely good and just society, then, is one in which “the free development of each is the condition of the free development of all,” as Marx and Engels declare in the Communist Manifesto: it is one in which each person has real access to the material, social, and political preconditions for human development and self-realization. But how, precisely, does any of this amount to an argument for socialism? The answer is that socialists typically see capitalism as a serious barrier to self-realization, a barrier that nothing short of socialism can remove. As Jon Elster puts it, capitalism “offers [the opportunity for self-realization] to a few but denies it to the vast majority” (Introduction to Karl Marx 43). Socialism, by contrast, would democratize self-realization, putting it within reach of average, everyday people for the first time in human history, or so it is claimed.

To fill in these claims, consider the material and social preconditions for self-realization. To develop and apply one’s talents in a way that gives meaning to life, one must, at a minimum, have one’s basic needs met. People who are sick, hungry, or homeless are simply not in a good position to develop and exercise their higher talents. However, capitalism reliably leads to poverty via frequent busts, structural unemployment, downward pressure on wages, and so on (or so socialists will claim). It therefore reliably depresses access to self-realization for a significant portion of the population. Socialism, by contrast, would eliminate poverty and thus would eliminate this potent material barrier to self-realization.

Suppose, however, that basic needs are met: what else is required for self-realization? Time. Now, under capitalism, most people are forced, through lack of private property, to perform wage-labor for a living (see section 5.a). Their days are thus divided into two parts: working time and leisure time. But time spent in a capitalist workplace is, for the vast majority of people, hardly time for self-realization. Capitalist jobs are oriented around the demands of profit, not self-realization. And quite often, the most profitable way to organize work is to “deskill” it: to make it simple, routine, even stultifying (Braverman).

Granted, there are exceptions. Some workers, such as doctors, engineers, college professors, and carpenters, have challenging, complex, autonomous, engaging jobs that help bring self-realization and meaning to their lives. But these are the lucky few. More typical is the experience of, say, assemblers, fast food workers, cashiers, poultry-plant operators, secretaries, human resource clerks, and so on and so forth. Saddled with “alienating” jobs like these, workers work merely to live; as Marx writes, they “feel themselves at home only when they are not working, and when they are working they do not feel at home” (Economic and Philosophic Manuscripts). This is not to demean the people occupying these roles, nor is it necessarily to deny the social importance of these jobs. Rather, it is only to point out that these jobs offer little opportunity to develop and exercise complex talents in a way that brings meaning to life. If capitalism’s armies of cashiers, clerks, and so on are to experience self-realization, then, it will have to be off the clock, during their free time.

Yet here we hit upon a further capitalist barrier to self-realization: its unwillingness to expand what Marx called the “realm of freedom,” leisure, by shrinking the “realm of necessity,” work. Despite ever-rising productivity—more output per unit of labor input—working time rarely declines under capitalism. This is, on its face, rather puzzling. After all, there are, in principle, two ways an enterprise could respond to an increase in productivity. It could keep working hours constant while increasing output, or it could keep output constant while cutting working time. Yet capitalist firms consistently choose the first option over the second; they choose to produce more stuff rather than reduce the working day.
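
The two options can be illustrated with hypothetical figures. Suppose, purely for the sake of example, that output $Q$ equals productivity $p$ (output per hour) times working hours $h$, and that productivity doubles. The symbols and numbers are assumptions chosen for illustration, not data from the text.

```latex
\[ Q = p \cdot h \]
\begin{align*}
\text{Before:} \quad & p = 1,\ h = 8 \ \Rightarrow\ Q = 8 \\
\text{Option 1 (hours held constant):} \quad & p = 2,\ h = 8 \ \Rightarrow\ Q = 16 \\
\text{Option 2 (output held constant):} \quad & p = 2,\ h = 4 \ \Rightarrow\ Q = 8
\end{align*}
```

The claim in the text is that capitalist firms consistently take the first route rather than the second.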

What explains this “output bias”? Profits (Cohen, Karl Marx’s Theory of History Ch. XI). Firms do not make more money by reducing working time; they make more money by increasing output. And so we get, under capitalism, a society chronically short on leisure but drowning in consumer goods; we get the familiar harried rat race, albeit with iPhones.

Now, this mountain of stuff must be sold. This requires spending enormous resources on the “sales effort”—marketing, advertising, and so on—so as to stimulate demand. The result is a highly consumerist society in which many people identify the good life with the life of consumption rather than self-realization. This widespread consumerism may be further promoted by a “sour grapes effect”. Denied self-realization by the alienating nature of work and insufficient free time, the capitalist worker decides self-realization isn’t worth having to begin with. Like the fox in Aesop’s fable, he rejects the tasty fruit he cannot have (self-realization) for the blander fruit within reach (consumption).

In sum, for a variety of interconnected reasons, having to do with its tendency to produce poverty, deskill work, provide inadequate free time, and promote a consumerist orientation, capitalism undermines self-realization and therefore human flourishing. Not, admittedly, for all. But for the vast majority, capitalism renders a rich and meaningful life difficult if not impossible to achieve. Or so the socialist argument goes.

How would things differ in a socialist economy? We have already seen that socialism, by (allegedly) eliminating poverty, would eliminate that particular material barrier to self-realization. Regarding work and leisure, socialists argue that because their system places human beings rather than anarchic market forces in control of the economy, it empowers us to prioritize self-realization and expanded leisure in the design and organization of work. Since we control production, we can tailor it to suit our preferences. If we want better, non-alienating work and more free time, we can get it. Admittedly, this would probably result in lower output. With reduced hours and more engaging labor processes, less stuff would be produced. But from the socialist point of view, this is no great tragedy. Past a certain point, more stuff contributes very little to human flourishing. Once a decent standard of living has been secured, self-realization hinges mainly on access to meaningful work and adequate free time. If the price of securing these things is less stuff, so be it. Fewer iPhones in exchange for more meaningful jobs and no rat race: this is a tradeoff that socialists heartily recommend.

7. Why Socialism? Community and Equality

Capitalism is competitive and cut-throat; socialism is cooperative and harmonious. Capitalism divides; socialism unites, or so many socialists have argued. The crucial value in play in these arguments is “community”.

The concept of “community” admits of at least two different interpretations. The first concerns producers’ motivations: what drives people to wake up each day and go to work? Is it fear and greed, or a desire to serve one’s fellows? The latter is the motivation consistent with community, yet it is relentlessly undermined by capitalism (or so socialists claim). The second sense of community concerns limitations on material inequality. When inequalities in living conditions grow too steep, mutual incomprehension results. People dwell in different worlds. This undermines community (in this second sense), or so it may be argued. These two senses of community, and their fates under capitalism and socialism, will be explored more deeply in what follows.

a. Why Produce? Communal vs. Market Reciprocity

As Adam Smith observed, under capitalism “it is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest” (Book 1, Ch 2). The baker hands over a loaf only because you pay him. Remove the payment, and he removes the bread. So it goes in a market society, for as G.A. Cohen argues, in such a society productive activity is motivated “not on the basis of commitment to one’s fellow human beings and a desire to serve them while being served by them, but on the basis of cash reward” (Why Not Socialism? 39). Market participants operate on the maxim “serve-to-be-served”; they serve only in order to receive service in return. They strive to give as little as possible while getting as much as they can—“buy low and sell high,” as the saying goes. Indeed, the best-case scenario (by market logic’s lights) is to give nothing and get everything.

Market logic thus locks us into deeply anti-social relations. The marketeer, looking at humanity, sees not comrades or brothers and sisters, but customers and competitors. Predator-like, he sees mere “sources of enrichment” and “threats to success”. The former are to be fleeced, the latter crushed. Yet these are horrible ways to relate to other people. Market society may deliver the goods, but it does so only by bringing out some of the worst aspects of human nature. Or so some socialists argue.

But is there an alternative? Cohen asks us to consider how people behave on a camping trip. If A needs help setting up her tent, does B use her need strategically as a means to self-enrichment? Does he ‘drive a hard bargain’, making his support contingent on receiving something in return? No; that’s not how decent people behave in such a context. Rather, in the standard case, B helps A simply because A needs help. Service in response to need: this is what motivates productive activity on a camping trip.

Now, this is not to say that B’s assistance comes entirely without strings attached. B needn’t be a sucker, so to speak; he needn’t continue to help A if A consistently fails to return the favor. On a camping trip, one reasonably expects some degree of reciprocity. Campers thus occupy a sweet spot between anti-social market predation on the one hand and self-denying altruism on the other. Cohen labels this sweet spot “communal reciprocity,” and describes its content as “serve-and-be-served”. Campers acting on this motivation value both sides of the conjunction. They regard it as intrinsically desirable to serve each other, yet they also do expect some degree of reciprocation. As Cohen explains, “the relationship between us under communal reciprocity is not the market-instrumental one in which I give because I get, but the non-instrumental one in which I give because you need, or want, and in which I expect a comparable generosity from you” (Why Not Socialism? 43). Cohen recognizes an important caveat here: the responsibility to reciprocate is conditional upon ability. Thus, there’s no problem with disabled people receiving support without making a contribution in return.

Such behavior is entirely normal and functional on a camping trip. Communal reciprocity clearly “works” in such a context. But can it work on a massive, society-wide scale? Can millions or billions of strangers serve each other, with tolerable economic results, out of fraternity and benevolence rather than greed and fear?

Skeptics cite two main grounds for doubt. The first is human nature: surely people are simply too selfish, greedy and tribal for communal reciprocity to work on a massive scale. Treating your actual brothers as brothers is one thing; treating total strangers as brothers is quite another.

Socialists reply that human nature is complex. We are indeed greedy and competitive, but so too are we generous and cooperative. Economic context powerfully influences which of these traits predominates. Edward Bellamy, an influential 19th-century American socialist novelist and thinker, compares human nature to a rosebush (Ch. 26). Put a rosebush in a swamp, and it will appear sickly and ugly. One might conclude that rosebushes are, by ‘nature’, noxious little shrubs. But this would be a mistake. We know that rosebushes are capable of great beauty, given the right developmental conditions. Yet surely, argues Bellamy, the same goes for human beings. Shaped by capitalism, people appear greedy, cramped, and fearful. But this is only because we’re mired in a metaphorical swamp. Put us in the more hospitable soil of socialism and we, like the rosebush, would blossom; we would display all the fellow-feeling, generosity, and cooperative instincts socialism requires.

Human nature, in short, poses no serious obstacle to socialism. ‘Socialist man’ dwells within all of us, waiting only for the right social conditions to emerge.

But these social conditions simply cannot emerge, for they are infeasible: this is the second skeptical objection. Without markets, economies simply do not function tolerably well—witness the failure of Soviet-style planning. In response, Cohen argues that this is just one data point. It would be overly hasty to write off all non-market alternatives simply on the basis of one failed experiment. He admits that socialists face a “design” problem. They do not now know how to power an economy on generosity and fraternity rather than greed or fear. But design problems often turn out to be solvable with enough ingenuity and attention. Non-market socialists do not currently have the answers. But in the fullness of time, they might—or so Cohen argues.

Before turning to a different community-based critique of capitalism, a powerful challenge to Cohen’s argument should be noted. Jason Brennan points out that socialism cannot lay claim to communal reciprocity by definitional fiat. Socialism is just communal ownership of the means of production. Whether this particular way of structuring property fosters communal reciprocity, leading to generosity and a “serve-and-be-served” mentality on a wide scale, is an entirely empirical question that cannot be answered from the ‘philosopher’s armchair,’ as it were. Yet when we turn to the empirical record, we find little support for Cohen’s position.

If Cohen were right, then we should expect to see an inverse relationship between markets and various pro-social attitudes and behaviors. We should expect to see greater levels of greed, mistrust, and so on as markets expand and deepen. The most marketized societies should also be the most anti-social. But this is not at all what we find. In fact, we find precisely the opposite. Studies cited by Brennan suggest that market exchange promotes various pro-social attitudes such as trust, fairness, and reciprocity. Brennan concludes that Cohen has it backwards: if we wish to spread camping trip values across society, we should embrace markets, not reject them.

Notice that Brennan’s critique (if sound) damages only non-market varieties of socialism. It does not undermine (and indeed actually provides some support for) market versions of the same. (Market socialism is discussed further in 8.c.)

b. Justice, Inequality, Community

This article has not said very much about equality as a socialist ideal. This may surprise some readers. Isn’t equality of condition one of socialism’s central aims? Indeed, socialism’s allegedly uncompromising egalitarianism is sometimes used as the basis for a reductio ad absurdum of the socialist position. The reductio runs like this: according to socialism everyone must be equal; one way to do this is to ‘level down’ the better off; but this is morally repugnant; so socialism must be rejected. One thinks here of Kurt Vonnegut’s famous anti-egalitarian tale “Harrison Bergeron”, in which an equality-obsessed government knocks the noggins of the more intelligent, bringing them into alignment with their IQ-disadvantaged peers.

The reductio fails because socialists do not advocate equality of condition, at least not in any straightforward sense. Much light has been shed on this issue by the now-voluminous philosophical literature on egalitarianism. Of particular import is the work by philosophers like Richard Arneson and G.A. Cohen on the question: “equality of what”? Insofar as leftists seek equality, what is it that they wish to equalize? Standard options include 1) resources, 2) welfare, 3) opportunities for resources, and 4) opportunities for welfare.

Most philosophers agree that the first two options are non-starters. Equalizing outcomes (as 1 and 2 would do) improperly ignores personal choice and responsibility. The point is nicely illustrated by Aesop’s fable of the grasshopper and the ant. Stipulate that both bugs know that winter is coming, and that both have the capability, that is, the effective freedom, to build a house and to gather adequate supplies. That is to say, both have equal opportunity to provision themselves. Yet only industrious ant chooses to use this opportunity; carefree grasshopper decides to dance and play instead. Fast forward to winter: there sits ant in his house, warm and well-fed, while grasshopper shivers hungrily outside. Now, no matter which metric we use—resources or welfare—ant is clearly much better off than grasshopper. Between the two bugs, a very significant inequality of condition obtains. But does this inequality constitute an injustice?

Interestingly, many socialists would answer ‘no’ to this question. These socialists hold a “responsibility-sensitive” form of egalitarianism sometimes called “luck egalitarianism”. On this view, outcome inequalities (whether measured in resources, welfare, or some other metric) are just if and only if they “reflect the genuine choices of parties who are initially equally placed and who may therefore reasonably be held responsible for the consequences of those choices” (Cohen, Why Not Socialism? 26). By luck egalitarian lights, then, the grasshopper/ant inequality is perfectly just since it reflects divergent choices rather than differences in unchosen circumstances. (Circumstantially, the bugs were identically placed. Both could have prepared for winter. But only ant chose to do so.)

Notice that the luck egalitarian would reach a different verdict if grasshopper and ant were unequal because (say) grasshopper was born without legs, and thus couldn’t provision himself. Then it would be unjust for him to go without food or shelter. For that outcome would reflect factors beyond his control, namely, his unchosen disability, in violation of the luck egalitarian standard.

In sum, contemporary socialist “luck egalitarians” have a nuanced view on equality and justice. Opportunities (for resources, welfare, or whatever) must be equal. But outcomes may be unequal provided that these inequalities are due to choices rather than circumstances. In a socialist society, then, grasshopper’s shivering does not necessarily signal injustice.

It might, however, signal a different moral defect: namely, a breach of community or compassion. Socialists aspire to a social world within which people care about and (when necessary) care for one another. Dramatically different living conditions put this regime of mutual comprehension, concern, and caring in jeopardy. Condemned to the wintry cold, grasshopper would face trials beyond ant’s understanding. The two bugs would come to dwell in different worlds. Whatever fellow-feeling or mutual concern previously marked their relations would vanish, leaving only a gulf of indifference and estrangement. This is no way for socialist comrades to live: not because it would be unjust (by hypothesis, it would not) but because it would be insufficiently fraternal and compassionate. Cohen concludes that “certain inequalities that cannot be forbidden in the name of [justice] should nevertheless be forbidden, in the name of community” (Why Not Socialism? 37). On this line, it would be just, but not justified, for ant to bar his door.

Are we back to the Harrison Bergeron reductio, then? Does socialism implausibly require absolute equality of condition after all? No, for two reasons.

First, not all inequalities undermine community. Perhaps ant must, in the name of community, provide grasshopper with some of his food and shelter. But does community require him to split his possessions down the middle? Surely not. The point is that while extreme inequalities may place community under strain, more modest ones might not.

Second, Cohen declares, without much argument, that the demands of community trump those of justice. But this ranking may be contested. Why shouldn’t justice trump community, at least occasionally? Perhaps just inequalities should sometimes be allowed to stand even if they undermine community.

8. Institutional Models of Socialism for the 21st Century

What, in practice, would a socialist society actually look like? What concrete institutions and policies—political, economic, and social—would it use to organize, motivate, and direct economic activity? It is difficult to assess the desirability of socialism without answering these questions. The normative case for socialism depends, at least in part, on the attractiveness and feasibility of its institutional vision. More prosaically: even if one is convinced of the abstract philosophical arguments canvassed in section 4, one still has to know what socialism would really be like in order to tell whether one wants it.

This section discusses three broad institutional models of socialism for the 21st century: central planning, participatory/democratic planning, and market socialism. All three models, being socialist, reject private ownership of the means of production in favor of social ownership. But beyond this important point of commonality, many significant differences emerge, especially concerning a) whether planning should be centralized or decentralized, and b) the appropriate role of markets in a socialist economy.

a. Central Planning

Throughout the 20th century, the standard socialist answer to the question “if not capitalism, then what?” was centrally-planned socialism.

Under central planning, “production is organized and coordinated within an administrative hierarchy, with decisions being made at the center and passed down through intermediate levels of the hierarchy to the production units” (Devine 55). Political authorities at the top of this hierarchy decide on broad economic objectives—build up heavy industry, satisfy consumer preferences, develop a backward region, and so on. Central planners then generate a concrete plan to achieve these objectives. To this end, they first gather a massive amount of information. Tens of thousands of enterprises inform planners of their productive capabilities and input requirements; millions of consumers communicate their consumption preferences. With this information in hand, planners, through a complex, multi-stage, “iterative” process, arrive at an overall plan for the economy that sets specific production targets for each enterprise. (Factory A, produce X shoes; factory B, produce Y amount of steel, and so on.) The center sends these orders to enterprise managers, who then devise more specific labor processes through which their workers produce the ordered goods in the right way at the right time. To the extent that the overall plan is fulfilled, sufficient resources are produced to meet whatever broad objectives political authorities have chosen, and the economy ‘works’.

What is to be said in favor of central planning? In theory, quite a bit. Central planning replaces capitalism’s anarchic market production for profit with planned production for use. It therefore promises to eliminate all those problems that socialists associate with private property, markets, and the pursuit of profits—problems like economic instability and poverty, class conflict and exploitation, various barriers to “real freedom” and self-realization, such as alienating labor and inadequate free time, lack of community and solidarity, and unjust economic inequalities. Freed of these capitalist pathologies, a centrally planned society would be classless, prosperous, and harmonious; it would be a society in which “the free development of each is the condition of the free development of all” (Marx and Engels Ch. 2).

Or so the story goes. However, critics allege that in practice central planning performs poorly. There are two problems worth pulling apart here.

The first is economic. Although centrally planned economies eliminate the worst forms of poverty, they do not produce generalized affluence. Under central planning, innovation is sluggish. Product quality is low. Shortages and hoarding are common. Work effort is lacking. These defects stem from underlying information, calculation, and incentive problems. Central planners, critics argue, cannot know what people want, or what producers are able to produce, with sufficient accuracy; nor, even if they could, would they be able to use this massive quantity of information to calculate a coherent overall plan; nor, even if they could calculate such a plan, would they be able to incentivize managers and workers to follow it faithfully.

The second problem with central planning is normative. Central planning, critics say, does not lead to an egalitarian, classless utopia, but to an authoritarian, undemocratic society dominated by a “coordinator class” of political elites, planners, and enterprise managers. Indeed, the basic logic of the system guarantees that central planning is a “road to serfdom” (in Hayek’s famous phrase) rather than a route to democratic empowerment. As one critic explains, “Central planners gather information, calculate a plan, and issue ‘marching orders’ to production units. The relationship between the central agency and the production units is authoritative rather than democratic, and exclusive rather than participatory” (Albert 52). Information flows up the hierarchy; orders flow down. Central planners decide; everyone else obeys. This seems rather far from the “radical empowerment” envisioned by many socialists.

Indeed, central planning’s economic and normative failings are related; the latter compound the former. It is partly because central planning alienates and disempowers workers that it performs so poorly qua economic system. Workers, so treated, expend little effort at work, ignore orders, under-report their productive capabilities, over-report their output, and so on.

Persuaded by these objections, most socialists today reject central planning, holding that it simply doesn’t work sufficiently well and that it comes at too steep a cost to democratic empowerment and freedom. But if they reject central planning, what do they propose to put in its place? There would seem to be only two options: either socialists rehabilitate planning by decentralizing and democratizing it, or they make peace with the market. The first route leads to some form of “participatory planning”; the second, to “market socialism”. The next two subsections explore these models in greater detail.

b. Participatory Planning

Perhaps the problem with central planning has to do with centralization rather than planning: this is the core thought behind “participatory” or “democratic” planning. Advocates of this approach include Pat Devine, Michael Albert, and Robin Hahnel. Because Albert and Hahnel’s model, called “participatory economics,” or “parecon” for short, is especially well developed, this article shall take it as representative of the broader participatory planning approach.

i. Parecon: Basic Features

Parecon rests on five main institutional proposals:

1) Social ownership of the means of production
2) Democratic workplaces
3) Balanced “job complexes”
4) Remuneration according to effort, sacrifice, and need
5) Economic coordination based on comprehensive participatory planning, using a complex system of nested “worker” and “consumer” councils

We may move quickly through the first and fourth of these proposals. Social ownership means simply that nobody in particular owns the means of production; rather, “we all own [them] equally, so that ownership has no bearing on the distribution of income, wealth or power” (Albert 9). What does bear on the distribution of income in a parecon is effort and sacrifice (Albert 112-117). The underlying rationale here is luck egalitarian (see 7.b). People, Albert and Hahnel argue, should be rewarded or penalized only for those things under their control. But the only thing that people control is their level of effort and sacrifice. Therefore, they should be rewarded and penalized only for their level of effort and sacrifice. Those who work harder or longer should enjoy greater consumption opportunities than those who work less hard or less long. There is an exception here: people who are unable to work will be provided with an average income.

Proposals 2, 3, and 5 require more extensive discussion.

Democratic workplaces. Parecon takes as one of its core values the idea of “democratic self-management,” which implies that “each actor in the economy should influence economic outcomes in proportion to how those outcomes affect him or her”. In other words, “Our say in decisions should reflect how much they affect us” (Albert 40). This norm implies that decisions affecting only a given individual should be left entirely to that individual. But other decisions have broader consequences, and are therefore appropriate objects of democratic choice. Many workplace decisions fall into this “other-affecting” category. Albert and Hahnel propose a complex system of nested “councils” for handling such decisions. Some workplace decisions will be entirely up to individual workers; others, assigned to work-teams; still others, to the enterprise as a whole. Indeed, since some workplace decisions affect people beyond the workplace’s four walls, such as consumers, some method for granting an appropriate degree of influence to these affected external actors must be found. Albert and Hahnel propose democratic “consumer councils” and “industry councils”. More will be said about this system of democratic council coordination below.

Balanced job complexes. Parecon proposes to radically remake the division of labor by creating “balanced job complexes” in which “the combination of tasks and responsibilities each worker has would accord them the same empowerment and quality of life benefits as the combination every other worker has” (Albert 10). This is, of course, very far from how occupations are structured currently. Under capitalism, considerations of profit and class power largely determine the way in which different productive tasks are bundled into jobs. The result is a division of labor that assigns routine, boring, disempowering (but profitable) work to the many, while reserving varied, complex, empowering work for the privileged few.

Parecon rejects this inequitable division. It does so on grounds of fairness: why should interesting and enjoyable work be hoarded by some rather than shared by all? It also objects on democratic grounds: unequal division of empowering work “inexorably destroys participatory potentials and creates class differences” (Albert 104). Any workplace with, say, janitors and managers will be a de facto hierarchy, even if it is, on paper, democratically organized. In the name of fairness and democracy, then, we must transform the division of labor so that “every individual [will] be regularly involved in both conception and execution tasks, with comparable empowerment and quality of life circumstances for all” (Albert 111).

ii. Allocation in Parecon: Economic Coordination Through Councils

This brings us to feature 5: economic coordination through councils. Every economy must decide what gets produced and consumed, and in what quantities. This is the problem of allocation. Market systems solve this problem through decentralized, voluntary, self-interested competition and exchange between buyers and sellers. Recall Adam Smith’s baker, who makes bread not because someone tells him to, but because by making bread, he can make money through trade. Centrally-planned economies solve the allocative problem by handing it over to a small group of economic elites, who craft a comprehensive plan for the entire economy and issue binding instructions for realizing it. Moscow decides that X amount of shoes will be produced, Y amount of steel, and so on, and enforces these demands on lower levels in the economic hierarchy.

Parecon rejects both approaches. In place of markets and central planning, it proposes a system of nested worker and consumer councils, through which individuals cooperatively generate an overall plan for production and consumption. Albert and Hahnel call this system “decentralized participatory planning.”

Simplifying greatly, its basic gist is this. To figure out what people want to consume, we ask them. To figure out what they are willing to produce, we ask them. We then aggregate all these responses and compare proposed supply with proposed demand. If they don’t match, we close the gap through democratic negotiation conducted on a footing of equality. Through such negotiation, we eventually reach, say, five feasible plans. We put them to a vote and implement the winner. Decentralized participatory planning thus promises to solve the allocative problem without hierarchy or markets.

The system features several key participants: first, worker councils (and federations thereof, such as a “software industry council,” a “farming council,” and so forth); second, consumer councils (and federations thereof, such as neighborhood councils, city-level councils, state-level councils, and so on); and third, the “iteration facilitation board,” or IFB. The IFB initiates the planning process by announcing provisional prices for all inputs and outputs. Importantly, these prices should reflect the “full social costs and benefits” associated with these inputs and outputs, including opportunity costs and externalities, whether positive or negative. Albert and Hahnel see this as a key difference with market systems. In a parecon, prices will accurately track the true social costs of production. Prices rarely do this in market societies. Think, for instance, of the absurdly low price of gasoline in the United States, despite the ecological and economic costs of its widespread production and use.

With these provisional prices in hand, each economic actor—individuals and councils and federations of councils—proposes both a) a consumption plan and b) a production plan. The former specifies what the actor would like to consume during the period being planned (the upcoming year, say); the latter specifies which outputs the actor proposes to produce, and the inputs they will require to do this. Plans will go to appropriate councils for approval. Thus, a family might submit their consumption plan to the neighborhood consumption council, while a worker might submit her plan to her work-team or to the larger workplace council.

On what basis are proposals approved or rejected? Individual consumption proposals should be approved if the person’s income is equal to or greater than the total cost of the goods requested. Income, remember, is a function of one’s effort and sacrifice at work. Higher-level consumption proposals (for a neighborhood, say) should be approved if the group’s income (minus the costs of members’ personal consumption) suffices to cover the costs of the requested items. The underlying thought here is that a community’s overall consumption should correspond to the amount of effort its members expend producing goods and services for society: the harder the community works, the more it is entitled to consume. Turning to production proposals, these are evaluated by comparing the estimated social benefits of the goods and services produced with the estimated social costs of producing them. If the benefits exceed the costs (that is, if the benefit-to-cost ratio is greater than one), then the production proposal should be accepted; if they fall short, then the proposal does not represent a responsible use of societal resources, and it is sent back for revision.
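The approval rules just described can be written down compactly. What follows is only a minimal sketch in Python: the class and function names are hypothetical, chosen for illustration, and do not come from Albert and Hahnel. It also reads the production test as "estimated benefits exceed estimated costs"; how an exact break-even proposal should be treated is an assumption made here, since the text does not say.

```python
from dataclasses import dataclass


@dataclass
class ConsumptionProposal:
    income: float          # income earned through effort and sacrifice
    requested_cost: float  # total cost of the requested goods at current prices


@dataclass
class ProductionProposal:
    social_benefit: float  # estimated social benefit of the outputs
    social_cost: float     # estimated social cost of producing them


def approve_individual_consumption(p: ConsumptionProposal) -> bool:
    # Approved when income is equal to or greater than the cost of the request.
    return p.income >= p.requested_cost


def approve_group_consumption(group_income: float,
                              personal_consumption: float,
                              requested_cost: float) -> bool:
    # Group income, net of members' personal consumption, must cover the request.
    return group_income - personal_consumption >= requested_cost


def approve_production(p: ProductionProposal) -> bool:
    # Assumption: a proposal passes only when benefits strictly exceed costs;
    # otherwise it is sent back for revision.
    return p.social_benefit > p.social_cost
```

For example, `approve_group_consumption(1000, 700, 250)` passes because the neighborhood has 300 left after personal consumption, while a request costing 350 would be sent back.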

The IFB aggregates all approved proposals and determines whether projected supply matches projected demand. Barring a miracle, it won’t, not at this stage. So the IFB recalculates prices in light of the mismatch between supply and demand, raising prices for goods with excess demand, lowering prices for those with excess supply, and sends the plans back to their originators for revision. Using the new prices, consumers and producers tweak their proposals, perhaps shifting demand to lower-priced goods and increasing supply of goods with high prices. They then send these revised proposals to the relevant councils, which evaluate them as before. Eventually, all approved proposals make their way to the IFB, which recalculates overall supply and demand to see if they match. If they do, then the process is over; if they don’t, then another round of revisions is required. If the process ends with multiple feasible plans, then society votes to determine the winner.
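The IFB's role in this loop resembles a simple price-adjustment procedure, and a toy version may help fix ideas. The Python sketch below is illustrative only: the function names, the linear "response" functions standing in for how councils revise their proposals, the step size, and the tolerance are all assumptions made here, not details specified by Albert and Hahnel. It also omits the approval stage and the final vote among feasible plans.

```python
from typing import Callable, Dict, Tuple

Prices = Dict[str, float]
Quantities = Dict[str, float]


def ifb_iterate(prices: Prices,
                demand_fn: Callable[[Prices], Quantities],
                supply_fn: Callable[[Prices], Quantities],
                step: float = 0.02,
                tol: float = 1e-6,
                max_rounds: int = 10_000) -> Tuple[Prices, int]:
    """Revise prices round by round until proposed supply matches demand."""
    prices = dict(prices)  # work on a copy of the provisional prices
    for round_no in range(1, max_rounds + 1):
        demand = demand_fn(prices)  # aggregated consumption proposals
        supply = supply_fn(prices)  # aggregated production proposals
        excess = {g: demand[g] - supply[g] for g in prices}
        if all(abs(e) <= tol for e in excess.values()):
            return prices, round_no  # supply matches demand: planning ends
        for g, e in excess.items():
            # Raise the price under excess demand, lower it under excess supply.
            prices[g] = max(prices[g] + step * e, 0.01)
    raise RuntimeError("no feasible plan found within max_rounds")


# Hypothetical responses: consumers request less, and producers offer more,
# as a good's price rises.
def toy_demand(prices: Prices) -> Quantities:
    return {"bread": 100 - 10 * prices["bread"],
            "shoes": 60 - 5 * prices["shoes"]}


def toy_supply(prices: Prices) -> Quantities:
    return {"bread": 20 + 10 * prices["bread"],
            "shoes": 5 * prices["shoes"]}


final_prices, rounds = ifb_iterate({"bread": 2.0, "shoes": 1.0},
                                   toy_demand, toy_supply)
print(final_prices, rounds)
```

In this toy economy the procedure settles near a price of 4 for bread and 6 for shoes. Whether a real participatory process would converge as tidily is, of course, exactly what critics of the model dispute.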

iii. Evaluating Parecon

This is, to be sure, an incredibly complex procedure, indeed, much more complex than this brief sketch indicates. Even Albert and Hahnel admit that it will take multiple rounds of negotiation, and no small amount of paperwork, to arrive at a feasible plan. But the hope is that “as every individual or collective worker or consumer participant negotiates through successive rounds of back and forth exchange of their proposals with all other participants, they alter their proposals to accord with the messages they receive, and the process converges” (Albert 128). And this, notice, without markets or central planning:

There is no center or top. There is no competition. Each actor fulfills responsibilities that bring them into greater rather than reduced solidarity with other producers and consumers. Everyone is remunerated appropriately for effort and sacrifice. And everyone has proportionate influence on their personal choices as well as those of larger collectives and the whole society (Albert 128-129).

The absence of hierarchy is worth emphasizing. Although there is an IFB, and although one’s consumption and production proposals must be approved by others, the overall distribution of power is web-like rather than hierarchical. No one occupies any special position of authority. Economic decisions are not dominated by the wealthy (as under capitalism) or the politically connected (as under central planning). Instead, all economic decision-making is radically democratic and open to negotiation: each person has a say over decisions that affect him or her, proportional to the degree to which he or she is affected. Parecon may have important flaws, but inadequate respect for democratic values would not seem to be among them. Indeed, it is hard to imagine a system more faithful to the core socialist commitment to bottom-up, democratic control over the economy. This, surely, is parecon’s chief virtue from the socialist perspective.

But would it work? Some commentators are skeptical. Erik Olin Wright writes:

The information complexity of the iterated planning process described in Parecon might in the end simply overwhelm the planning process. Albert is confident that with appropriate computers…this would not be a problem…Perhaps he is right. But he may also be horribly wrong. As described…the information process seems hugely burdensome (264).

Defenders of parecon reply to such worries in several ways. First, they argue that critics overestimate the amount of planning that parecon requires. Setting up one’s initial consumption proposal may well take lots of time and energy. But with that initial investment made, planning in subsequent years should be much quicker. One can simply base future plans off of the original one, tweaking here and there as necessary.

Nor need one specify in great detail what one wants to consume. “For planning purposes,” writes Albert, “we need only request types of goods, even though later everyone will pick an exact size, style, and color to actually consume” (130). One’s consumption proposal must “express preferences for socks, but not for colors or type of socks; for soda, books, and bicycles, but not for flavors, titles, or styles of each” (217). (Of course, critics may find new grounds for concern here. Because consumption preferences tend to be rather fine grained—a person wants to read a dystopian science fiction novel, not some generic book; she wants wild-caught salmon, not “food” or even “fish”—parecon seemingly faces a dilemma. If people do not request specific items, then many desired items will be in short supply, and consumer satisfaction will be low. If, on the other hand, people do request specific items, we’re thrown back on the original worry: isn’t this planning process unworkably cumbersome?)

Second reply to the infeasibility worry: we must remember that other economic systems require paperwork and planning, too. Under market socialism (and capitalism) consumers must make budgets, do their taxes, pay bills, go shopping, and so on. Enterprises must decide what they will make and in what quantities. They must also make various personnel decisions, deciding who will work with whom, for how long, on which projects, and so on. Added up over the course of the year, the amount of time spent on such activities is far from trivial. Indeed, one might argue that total planning time will be roughly constant across market- and participatory-planning-based systems.

Finally, suppose that parecon does require a substantial amount of time and energy, or perhaps, more time and energy than alternative systems. Still, these costs must be judged against the potential benefits. Parecon promises a more equal, fraternal, just, democratic society. Is it even remotely reasonable to reject such a society on the grounds that it requires too much paperwork?

In sum, defenders of parecon argue 1) that their proposal won’t prove nearly as burdensome in practice as critics fear; indeed, 2) one may reasonably doubt that it is any more burdensome than other systems; and 3) even if parecon does prove burdensome both absolutely and comparatively, surely the sacrifice is worth the result.

c. Market Socialism

Suppose one rejects central planning, but also doubts the feasibility (or perhaps even the desirability) of parecon-style participatory planning. Must one therefore reject socialism? No, not according to “market socialists” such as John Roemer, David Schweickart, David Miller, Erik Olin Wright, Tom Malleson, and others.

On the traditional view, socialists must, by definition, be opposed not simply to private property, but also to markets. Market socialists disagree. On their view, socialism requires only a certain form of ownership, namely, social rather than private ownership. About markets, socialists should be open-minded. Markets are just tools for communicating information and motivating economic activity. Like any tool, they should be evaluated instrumentally. Do they work better than alternatives? If so, then socialists should embrace them.

And indeed, market socialists characteristically argue that markets do work better than the alternatives: just look at the economic record. This is not to say that markets are perfect, nor is it to say that they should be allowed to operate ‘freely,’ without any constraints. Market regulations are integral to the market socialist vision. Market socialists are no kind of market fundamentalists. Rather, they view themselves as pragmatists. They see the evils of capitalism, but they also see the problems with planning-based socialist alternatives. The way forward, they argue, is to take the good parts of capitalism and combine them with the good parts of socialism. This will displease fundamentalists on both sides, but what alternative is there? Capitalism is a moral disaster. Central planning was worse. Participatory planning is a pipe dream. 21st century leftists must fuse socialism with markets. There is no other way. Or so market socialists argue.

i. Schweickart’s “Economic Democracy”

To further explain the market socialist position, this article will present David Schweickart’s market socialist model, “Economic Democracy” (ED for short). (For a recent proposal very similar in spirit to Schweickart’s, see Malleson 2015. For other important developments of market socialism, see Roemer 1994, Miller 1989, and Carens 1981.) Boiled down to essentials, ED has three main features: worker self-management, the market, and social control over investment.

Worker self-management: “Each productive enterprise is controlled by those who work there” (After Capitalism 49). Workers together decide all aspects of production: what to make, how to make it, workplace policies, compensation, and so on. This does not preclude the use of managers or experts. In large firms especially, some delegation of authority will almost certainly prove necessary. Schweickart suggests that workers might elect a workers’ council which will then appoint executives, managers, and so on.

The market. In stark contrast with planning-based forms of socialism, ED solves the problem of allocation using market competition between profit-seeking enterprises. ED’s enterprises start with a sum of money (M). They use this money to buy productive inputs on the market, which they transform into commodities. They then compete with other enterprises to sell these commodities to consumers or other enterprises, thereby ending up with a new amount of money (M’). (Prices are determined mainly by market forces of supply and demand, although price regulations may sometimes be appropriate: again, Schweickart is no market fundamentalist.) In the normal case M’ > M, indicating that the enterprise has turned a profit. Indeed, turning a profit is the immediate aim of production in ED: enterprises produce to make money, not (primarily) to satisfy human needs. As Schweickart says, “profit is not a dirty word in this form of socialism” (After Capitalism 51).

This may sound rather close to capitalism, but in fact there is an important difference here. Under capitalism, profits go to owners, not to workers, who receive wages instead. Under ED, by contrast, there are no wages; rather, “workers get all that remains once nonlabor costs…have been paid” (After Capitalism 51). Precisely how workers divvy up the enterprise’s surplus is up to them. In theory they could split it equally. But given the need to outcompete other enterprises (and hence to attract and retain skilled labor), some degree of inequality is likely to be chosen. More productive workers, or workers with skills in higher demand, will almost certainly earn more than their fellows. Notice the difference here with parecon, which links income to effort rather than contribution or other “morally arbitrary” factors beyond the agent’s control. Empirical evidence suggests that self-managed firms (like those in the Mondragon cooperative in Spain) opt for a 4 or 5:1 ratio between the incomes of the highest- and lowest-paid employees: quite a dramatic difference from the 300:1 spread typical in large capitalist corporations.
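A small worked example may help fix the idea that workers’ income in ED is a residual rather than a wage. The figures and the share weights below are invented for illustration, and the function names are hypothetical; Schweickart leaves the distribution rule to each enterprise’s workers.

```python
from typing import List


def worker_incomes(revenue: float, nonlabor_costs: float,
                   shares: List[float]) -> List[float]:
    """Split what remains after nonlabor costs according to agreed shares."""
    surplus = revenue - nonlabor_costs
    total_shares = sum(shares)
    return [surplus * share / total_shares for share in shares]


def pay_ratio(incomes: List[float]) -> float:
    """Ratio between the highest and lowest incomes in the enterprise."""
    return max(incomes) / min(incomes)


# Hypothetical enterprise: four workers who have voted for unequal shares.
incomes = worker_incomes(revenue=1_000_000, nonlabor_costs=600_000,
                         shares=[4, 2, 1, 1])
print(incomes)
print(pay_ratio(incomes))
```

Here the surplus of 400,000 is divided in proportion to the chosen shares, giving a 4:1 spread between the highest- and lowest-paid workers. An equal split would simply use identical shares.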

Social control over investment. This is the most clearly “socialist” piece of Schweickart’s model. In an ED, the means of production belong to all members of society, not to the enterprises that happen to deploy them. To reflect this social rather than sectional or private ownership, all enterprises must pay a capital assets tax. “This tax,” writes Schweickart, “may be regarded as a leasing fee paid by the workers of the enterprise for use of social property that belongs to all” (After Capitalism 52). Revenues from this tax constitute the national investment fund, which is the sole source of investment money in ED. By tweaking the tax rate, society can determine the size of the national investment fund—hence, the amount of money available for investment, and thus the overall level of economic growth and development.

Note the contrast with capitalism: under capitalism, most investment comes from private rather than public sources. Both the amount and direction of economic development therefore depend on the whims and abilities of private investors. This leads to the boom and bust cycle discussed in section 3, as well as other pathologies such as excessive growth, ecological devastation, underdeveloped regions alongside overdeveloped ones, unemployment, poverty, and all the rest. Under ED, by contrast, investment is democratically controlled by all members of society. In theory, this should enable “more rational, equitable, and democratic development than can be expected under capitalism” (After Capitalism 52)—a point that will be explained further below.

How, specifically, should social control over investment be institutionalized? There are many options. At one extreme, society might opt for a planning-heavy system in which a democratically accountable planning board draws up a plan for all new investment, which Schweickart estimates would constitute about 15% of GDP, and allocates funds accordingly. For example, the planning board might decide to prioritize renewable energy, or consumer goods, or whatever. At the other extreme, society might prefer a laissez-faire model in which funds are channeled through public banks to enterprises using essentially the same criteria that capitalist banks use: namely, profitability. In this version of ED, market forces would largely determine the pattern of investment.

Schweickart himself proposes something in the middle of these two options. Funds should go to regions (for example, Texas) and communities (for example, Fort Worth) on a per capita basis. If the Fort Worth region has the same population as Silicon Valley, then it will receive the same amount of investment. In ED, then, there will be no economic backwaters, no regions or communities left behind. Nor will regions or communities be locked into a destructive neoliberal “race to the bottom” to attract investment dollars. Each community receives its “fair share” no matter how business-friendly (or unfriendly) its policies.
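The per capita rule is simple enough to state as a short calculation. The sketch below is illustrative only: the region names, populations, and fund size are made up, and `allocate_per_capita` is a hypothetical helper rather than anything specified by Schweickart.

```python
from typing import Dict


def allocate_per_capita(fund: float, populations: Dict[str, int]) -> Dict[str, float]:
    """Give each region a share of the investment fund proportional to its population."""
    total_population = sum(populations.values())
    return {region: fund * population / total_population
            for region, population in populations.items()}


# Hypothetical regions; two of them have the same population.
populations = {"Region A": 3_000_000, "Region B": 3_000_000, "Region C": 1_500_000}
allocation = allocate_per_capita(fund=750.0, populations=populations)
print(allocation)
```

Regions A and B receive identical amounts because they have identical populations, whatever their current level of development or business climate. That is the sense in which each community gets its “fair share.”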

Once distributed to regions and communities on a per capita basis, investment funds are channeled to regional and community enterprises by public banks. Enterprises in need of investment (say, to expand production) apply to area banks for funds. Banks assess applications on the basis of a) profitability, b) job creation, and c) any other democratically chosen criteria, such as ecological impact. This mixed standard implies that while profitability matters, it is not all that matters. Projects that further socially chosen goals may be chosen over more profitable, but less socially desirable alternatives.

Summing up, Schweickart’s model strategically transplants certain core elements of capitalism into a broadly socialist framework. We get markets and profit-seeking enterprises, but also workplace democracy and social control over investment. The result, Schweickart argues, is an economy that outperforms all rivals—whether socialist or capitalist—in terms of values dear to socialist hearts, such as equality, economic stability, human development, democracy, and environmentalism. ED thus promises to deliver “socialism that would really work,” to quote the title of one of Schweickart’s early articles on the topic.

ii. Evaluating Economic Democracy

Perhaps it would really work, but would it be socialism? This, in essence, is the main criticism of Schweickart’s model (and of market socialism more generally).

That his proposal would work—that it is feasible—seems relatively uncontroversial. Markets work. Self-managed enterprises work, as illustrated by decades of empirical evidence. Social control over investment is the only truly novel piece of Schweickart’s model, but its basic mechanisms—the capital assets tax, the national investment fund, the system of public banks allocating funds on the basis of profitability as well as other socially chosen considerations—raise no obvious feasibility worries. Granted, neoclassical economists will complain that because ED regulates and interferes with markets in various ways, it sacrifices efficiency. But “less than perfectly efficient” does not mean “infeasible”. And efficiency isn’t the only thing we want from an economy anyway. Better to sacrifice some efficiency, Schweickart would argue, for gains in employment, more equitable development across regions, greater democratic empowerment at work, and so on. So all things considered, market socialism seems eminently feasible. This is perhaps its greatest selling point.

But is it desirable? Critics right and left will argue that it is not. Those on the right will complain that ED limits basic economic freedoms, such as the formal freedom to own the means of production, to hire wage labor, and to run a business in an undemocratic fashion. Market socialists will reply that not all formal freedoms are worth protecting. They will further suggest that ED will enhance effective economic freedom for the vast majority, even if this means diminishing economic freedom, both formal and effective, for those elites who would, absent ED, enjoy greater workplace control and authority. Under capitalism, most workers control no productive property and enjoy no real say over their work. Economic power is monopolized by a tiny class of owners. Under ED, by contrast, economic hierarchies are flattened. Economic power within the enterprise is distributed equally to all workers on a one worker, one vote basis. Consequently, everyone has the effective freedom to shape workplace decisions. Seen from this angle, ED enhances rather than reduces economic freedom.

Market socialism attracts critical fire from the left as well as the right. It is a strange form of socialism indeed, leftist critics will argue, that features anarchic market production for profit rather than planned production for use. With markets and profits come competition, greed, fear, and the diminution of community; with markets and profits come consumerism, ever-expanding hours of work, and the ecologically insane desire for never-ending economic growth.

Schweickart replies that these worries are overblown. Yes, ED features competition; yes, there will be advertising and some degree of consumerism; yes, enterprises may, under certain circumstances, seek to grow. But the details make a difference. Competition, consumerism, and economic growth are all held in check in ED by countervailing forces. Social control over investment means that we can democratically determine the overall rate and direction of economic growth. We can prioritize environmental aims, for instance, over the rapacious quest, so characteristic of capitalism, for additional output at whatever cost. Workplace democracy means that we can choose shorter working hours in exchange for reduced consumption opportunities. Moreover, because democratic firms seek to maximize profit per worker (rather than total profit, as do capitalist firms), they will not expand as aggressively as their capitalist counterparts. But reduced expansion means less output that needs to be sold, which, in turn, reduces demand for advertising and marketing. In short, for all of these reasons ED is absolutely compatible with the socialist vision of a less-consumerist, more leisurely, ecologically sane world, or so defenders of market socialism would argue.
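The point about profit per worker can be made concrete with a stylized comparison. The numbers are invented, the function names are hypothetical, and the example simplifies by using a single “profit” figure for both kinds of firm, even though a capitalist firm’s profit is net of wages while an ED enterprise’s is not.

```python
def capitalist_firm_expands(profit_now: float, profit_after: float) -> bool:
    """A firm maximizing total profit expands whenever total profit would rise."""
    return profit_after > profit_now


def democratic_firm_expands(profit_now: float, workers_now: int,
                            profit_after: float, workers_after: int) -> bool:
    """A firm maximizing profit per worker expands only if each member's share would rise."""
    return profit_after / workers_after > profit_now / workers_now


# Hypothetical expansion: 10 workers earning 100 in total become 15 earning 135.
print(capitalist_firm_expands(100, 135))
print(democratic_firm_expands(100, 10, 135, 15))
```

Total profit rises from 100 to 135, so the profit-maximizing firm expands. Profit per worker falls from 10 to 9, so the democratic firm does not. This is the mechanism behind the claim that democratic firms will grow less aggressively.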

Indeed, market socialists would draw a more general lesson here. From the fact that markets in a capitalist context lead to undesirable effects X, Y, or Z, we cannot automatically infer that they would lead to X, Y, or Z in the dramatically different political-economic framework of market socialism. Maybe they would, but maybe they wouldn’t. The only way to tell, insist market socialists, is to work carefully through the details.

9. References and Further Reading

Albert, Michael. Parecon: Life After Capitalism. London: Verso, 2003.
- Presents Albert’s (and Hahnel’s) participatory planning model of socialism.
Albert, Michael, and Robin Hahnel. Looking Forward: Participatory Economics for the Twenty First Century. South End Press, 1991.
- An early statement of Albert and Hahnel’s participatory planning model of socialism.
Arneson, Richard. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1), 77-93, 1989.
- A canonical statement of the “luck egalitarian” position.
Bellamy, Edward. Looking Backward. Dover, 1996 [1888].
- A utopian novel, widely acclaimed in its day, depicting political, economic and social arrangements in socialist Boston, some 100 years after a successful revolution.
Braverman, Harry. Labor and Monopoly Capital: The Degradation of Work in the Twentieth Century. 25th Anniversary Edition. New York: Monthly Review Press, 1998 [1974].
- Important Marxist analysis of work, according to which the imperatives of profit-maximization force capitalists to simplify and routinize labor processes, thereby degrading work.
Brennan, Jason. Why Not Capitalism? New York: Routledge, 2014.
- A sharp parody of and rejoinder to G.A. Cohen’s Why Not Socialism? that defends capitalism on moral (rather than pragmatic) grounds.
Carens, Joseph. Equality, Incentives, and the Market: An Essay in Utopian Politico-Economic Theory. Chicago: University of Chicago Press, 1981.
- Describes a market socialist economic system that—unlike capitalist and non-market socialist alternatives—fully realizes the values of equality, freedom, and economic efficiency.
Cohen, G.A. “The Structure of Proletarian Unfreedom.” Philosophy and Public Affairs, Vol. 12, No. 1, 3–33, 1983.
- Argues that workers are individually free (since they are not forced to work for capitalists) but not collectively free (since few workers can escape proletarian status at any given time).
Cohen, G.A. History, Labour, and Freedom: Themes From Marx. Oxford: Clarendon Press, 1988.
- Collection of Cohen’s essays on Marxist themes.
Cohen, G.A. “On the Currency of Egalitarian Justice.” Ethics 99 (4), 906-944, 1989.
- Important statement of luck egalitarianism.
Cohen, G.A. Karl Marx’s Theory of History: A Defence. Expanded edition. Princeton, NJ: Princeton University Press, 2000.
- Cohen’s classic reconstruction and qualified defense of Marx’s theory of history, “historical materialism”. Widely regarded as a founding text of the so-called “Analytical Marxism” movement.
Cohen, G.A. Why Not Socialism? Princeton: Princeton University Press, 2009.
- Argues that—bracketing issues of feasibility—socialism is morally desirable, but concedes that socialists do not know whether socialism is feasible.
Cohen, G.A. “Capitalism, Freedom, and the Proletariat.” In G.A. Cohen, On The Currency of Egalitarian Justice and Other Essays. Princeton: Princeton University Press, 2011.
- Analyzes freedom under capitalism, arguing that private property restricts formal freedom in underappreciated ways.
Devine, Pat. Democracy and Economic Planning. Cambridge: Polity Press, 1988.
- Rich, detailed, economically sophisticated statement of a democratic alternative to central planning, with especially interesting ideas about the division of labor.
Elster, Jon. An Introduction to Karl Marx. Cambridge: Cambridge University Press, 1986.
- An often-critical reconstruction of central Marxist themes by one of the central figures in the Analytical Marxism movement.
Elster, Jon. “Self-Realization in Work and Politics: The Marxist Conception of the Good Life.” Social Philosophy and Policy, Vol. 3, No. 2, 1986.
- Analytically crisp discussion of self-realization and the prospects for achieving it under capitalism and socialism.
Engels, Frederick. Socialism: Utopian and Scientific. Pathfinder Press, 2008 [1880].
- Important overview of historical materialism and the socialist critique of capitalism by Marx’s intellectual partner; arguably more accessible to beginners than anything by Marx himself.
Friedman, Milton. Capitalism and Freedom. 40th Anniversary Edition. Chicago: University of Chicago Press, 2002 [1962].
- Friedman’s classic defense of libertarian capitalism on moral grounds.
Gilabert, Pablo. “The Socialist Principle ‘From Each According to Their Abilities, To Each According to Their Needs’.” Journal of Social Philosophy, Vol. 46, No. 2, 197-225, 2015.
- Interesting recent paper that brings the needs/abilities principle into dialogue with other positions in distributive justice.
Harrington, Michael. Socialism: Past and Future. New York: Little, Brown & Co, 1989.
- Historically learned, empirically informed overview of socialism’s development and future trajectory by an important figure in American socialist politics.
Hayek, Friedrich. The Road to Serfdom: Text and Documents—The Definitive Edition. Chicago: University of Chicago Press, 2007.
- Hayek’s celebrated broadside against socialist planning and the creeping threat to freedom that it represents.
Holmstrom, Nancy. “Exploitation.” Canadian Journal of Philosophy, Vol. 7, No. 2, 353-369, 1977.
- Early, analytically sharp defense of the view that exploitation is forced, uncompensated labor, the products of which producers do not control.
Lenin, Vladimir. The State and Revolution. New York: Penguin, 2009 [1918].
- Argues, to give one example, that genuine democracy is impossible under capitalism.
Levine, Andrew. Arguing for Socialism. London: Verso, 1988.
- Rigorous, subtle work that mounts a qualified case for socialism using tools of contemporary moral and political philosophy.
Malleson, Tom. After Occupy: Economic Democracy for the 21st Century. New York: Oxford University Press, 2015.
- Empirically and philosophically rich development of a broadly market-socialist position with an especially interesting defense of workplace democracy.
Marx, Karl. Capital: A Critique of Political Economy, Vol. 1. New York: Vintage Books, 1977 [1867].
- Marx’s masterwork lays bare capitalism’s “laws of motion”, but says little about alternatives.
Marx, Karl. Critique of the Gotha Program. In David McLellan (Ed.), Karl Marx: Selected Writings, second edition. Oxford: Oxford University Press, 2000 [1875].
Marx, Karl, and Frederick Engels. The Communist Manifesto. London: Verso, 1998 [1848].
- An enormously influential political pamphlet outlining core elements of the Marxist theory of history, critique of capitalism, and program for a socialist future.
Miller, David. Market, State, and Community: Theoretical Foundations of Market Socialism. Oxford: Oxford University Press, 1989.
- Important, philosophically sophisticated statement of market socialist ideas.
Ollman, Bertell, ed. Market Socialism: The Debate among Socialists. New York: Routledge, 1998.
- Brings together leftist critiques and defenses of market socialism.
Peffer, Rodney. Marxism, Morality, and Social Justice. Princeton: Princeton University Press, 1991.
- Accessible reconstruction of Marxist themes, using techniques of analytic philosophy, that brings Marxism into dialogue with liberal egalitarians like John Rawls.
Reiman, Jeffrey. “Exploitation, force, and the moral assessment of capitalism: Thoughts on Roemer and Cohen.” Philosophy and Public Affairs, Vol. 16, No. 1, 3-41, 1987.
- Argues that exploitation is forced, unpaid labor, and further contends—contrary to Cohen—that individual workers are indeed forced to work for capitalists.
Roemer, John. “Should Marxists Be Interested in Exploitation?” Philosophy and Public Affairs Vol. 14, No. 1, 30-65, 1985.
- His answer is no: Marxists should focus on distributive justice rather than exploitation.
Roemer, John. A Future for Socialism. Cambridge, MA: Harvard University Press, 1994.
- Important statement of market socialism by a leading figure in the Analytical Marxist movement.
Schweickart, David. “Economic Democracy: A Worthy Socialism That Would Really Work.” Science & Society, Vol. 56, No. 1 (Spring), 9-38, 1992.
- A capsule presentation of Schweickart’s market socialist model, “economic democracy”.
Schweickart, David. “Nonsense on Stilts: Michael Albert’s Parecon.” Schweickart’s website. Posted January, 2006.
- Argues that Albert and Hahnel’s “participatory economics” can’t work, and wouldn’t be desirable even if it did.
Schweickart, David. After Capitalism. Second edition. Lanham, MD: Rowman & Littlefield, 2011.
- Argues for a heterodox form of socialism that blends profits and markets with workplace democracy and social control over investment.
Smith, Adam. The Wealth of Nations: Books 1-3. New York: Penguin, 1982 [1776].
- Smith’s classic discussion of early capitalism.
Van Parijs, Philippe. Real Freedom For All. Oxford: Oxford University Press, 1997.
- Defends a “basic income” on “real libertarian” grounds.
Wright, Erik Olin. Envisioning Real Utopias. London: Verso, 2010.
- Drawing on a vast fund of research from social science and philosophy, reimagines socialism for the 21st century.

Author Information

Samuel Arnold
Email: s.arnold@tcu.edu
Texas Christian University
U. S. A.

Egalitarianism

Are all persons of equal moral worth? Is variation in income and wealth just? Does it matter that the allocation of income and wealth is shaped by undeserved luck? No one deserves the family into which they are born, their innate abilities, or their starting place in society, yet these have a dramatic impact on life outcomes.

Keeping in mind the extreme inequality in many countries, is there some obligation to pursue greater equality of income and wealth? Is inequality inherently unjust? Is equality a baseline from which we judge other distributions of goods? Do inequalities have to be justified by people somehow deserving what they have, or by inequality somehow improving society?

As a view within political philosophy, egalitarianism has to do both with how people are treated and with distributive justice. Civil rights movements reject certain types of social and political discrimination and demand that people be treated equally. Distributive justice is another form of egalitarianism that addresses life outcomes and the allocation of valuable things such as income, wealth, and other goods.

The proper metric of equality is a contentious issue. Is egalitarianism about subjective feelings of well-being, about wealth and income, about a broader conception of resources, or some other alternative? This leads us to the question of whether an equal distribution of the preferred metric deals with the starting gate of each person’s life (giving everyone a fair and equal opportunity to compete and succeed) or with equality of life outcomes. Egalitarianism also raises a question of scope. If there is an obligation to pursue distributive equality, does it apply only within particular states or globally?

1. What is Egalitarianism?

Consider three different claims about equality:

(1) All persons have equal moral and legal standing.
(2) In some contexts, it is unjust for people to be treated unequally on the basis of irrelevant traits.
(3) When persons’ opportunities or life outcomes are unequal in some important respect, we have a reason to lessen that inequality. (This reason is not necessarily decisive.)

All of these claims express a commitment to equality, and each is more egalitarian than the one before it. Understanding the differences between these claims, their normative implications, and the various ways the content of the third claim can be further specified is crucial to understanding the disparate collection of philosophical views that compose egalitarianism.

Claim (1) entails claim (2), and therefore captures part of contemporary egalitarianism. If all persons are equal, then there are political constraints on how they can be treated unequally. Disenfranchisement and differential rights violate the equality affirmed in (1). (3) is even stronger than (2), because it is committed not only to treating people equally but also to ensuring that people have equal amounts of some important good. There is controversy over whether (1) entails, is merely compatible with, or is incompatible with (3).

The descriptive thesis found in claim (1) affirms the equality of all persons. This must not be the plainly false assertion that for any given trait, all persons are equal. We differ in our abilities, resources, opportunities, preferences, and temperaments. The claim must be about something more specific. All persons have equal moral worth or equal standing. The United States Declaration of Independence famously states that “all men are created equal.” Jeremy Bentham’s dictum “each to count for one, none to count for more than one,” is another expression of the descriptive thesis. While the conditions in which people live, their wealth and income, their abilities, their satisfaction, and their life prospects may radically differ, they are all morally equal. In moral and political deliberation, each person deserves equal concern. All should have equal moral and legal standing.

If all persons are equal in this way, then some forms of unequal treatment must be unjust. The descriptive thesis, applied within a particular state, at least entails equal rights and equal standing. Therefore (1) constrains how a just political society can be structured because it entails some degree of support for claim (2). The degree is debatable in terms of which contexts require equal treatment, what types of institutions must treat people equally, and so on. At least in terms of basic political rights, discrimination on the basis of gender, ethnicity, and caste is prohibited. Many would also extend these to commerce and the wider public sphere: businesses should not be able to refuse service on the basis of race, gender, or sexual orientation. The descriptive thesis must entail some commitment to equal treatment, but the scope of that commitment is disputed.

Claim (3) (let us call it the egalitarian thesis) is closely related to the descriptive thesis. (1) is taken by some as ground for affirming (3). Denying (1) is grounds for rejecting the imperative in (3). Yet the two theses are distinct. A commitment to (1) does not obviously entail a commitment to (3), because (3) is more robust and has wider scope. (1) may entail (3), but establishing this requires a substantive argument. The descriptive thesis’ extension into the social standing, well-being, wealth, income, and life outcomes of citizens is controversial. Unlike (3), (1) is not on its face opposed to radical inequalities in income, wealth, capabilities, welfare, life prospects, or social standing. If those inequalities arise within legitimate political institutions that respect the equal standing of all persons, they may be just.

The egalitarian thesis addresses more than the moral worth of persons. It expresses an obligation to pursue distributive equality. Deviations from equality are prima facie unjust. But along which dimension ought we pursue greater equality? Candidate metrics include resources, income, wealth, welfare, or capabilities to perform certain functions. The obligation to pursue equality along some such dimension makes (3) fully egalitarian in the contemporary sense of the term. (1) does not necessarily prohibit dramatic inequalities, whether they are deserved or undeserved, due to hard work or luck, recent or hereditary. Absent further argument, the content of (1) is only concerned with such inequalities conditionally, when they violate the equal moral status of persons. Of course, if social exclusion, caste discrimination, and unequal rights are prohibited in light of the fact that (1) entails some level of commitment to (2), this will influence the distribution of those metrics. This is not the same as a direct obligation to pursue distributive equality of one of those metrics.

(1) is descriptive in content but has normative implications. Egalitarianism is essentially prescriptive and normative. (3) directly states what ought to be done with regard to the inequalities among persons. It is an imperative to reduce distributive inequality along some dimension. The normative commitments that follow from (1) set minimal standards: states must not violate the equal standing of persons. The normative commitments of (3) are stronger and more aspirational: we continually pursue equality by reducing inequality. This is a pursuit of substantive distributive justice—equality of some sort of condition or opportunities. It is not mere formal equality of rights, or of economic notions such as considering everyone equal as long as their income is determined by their marginal product.

Egalitarians are thus committed to distributive justice in a way that (1) need not be. (1) may entail a certain conception of distributive justice having to do with equality of opportunity and individual rights, especially property rights. For example, John Locke argued that all persons are equal and have the same rights. The equal standing and equal rights of all persons, even in the pre-civilized state of nature, is a crucial component of his theory of just government. This is a commitment to equality, but it is not egalitarian in the contemporary sense. It does deal with distributive justice, but only in terms of respecting property rights and the right to free exchange of property. A commitment to equality is not yet a commitment to substantive distributive justice (a commitment to have a fair and equitable distribution of goods), and is compatible with merely formal or historical distributive justice (defining a just distribution as one that respects standing property rights and the right of people to trade without theft or coercion).

What is an egalitarian commitment to substantive distributive justice? In the most literal sense, it requires equalizing the distribution of some quantifiable thing among persons, such as income or wealth. An egalitarian may see distributive justice as an end in itself. This would mean it is constitutive of a just society. It can also mean that we choose a metric of equality that is intrinsically good, such as welfare or well-being. Those things are desirable in themselves, not because they are instrumental in acquiring other goods. Alternatively, egalitarianism can be seen as merely instrumental. For example, distributive justice can be seen as a means to achieving some other social end, such as creating social relationships among citizens that are equal and non-oppressive, and allowing them to flourish and function as citizens. An example of an instrumental metric of equality is resources, because resources can be used to generate welfare.

Strictly speaking, all non-equalizing views of substantive distributive justice are alternatives to egalitarianism. This would exclude Rawls’ difference principle, which allows for inequalities when they are required to raise the absolute condition of the worst off. It would also exclude views that prioritize aid to the worst off or argue in favor of redistribution to guarantee a sufficient minimum for all. The contemporary usage of the term is not restricted to equalizing views. While there are contemporary debates between egalitarianism narrowly defined and non-equalizing views such as Rawls’, the most illuminating contemporary definition of the term is that it is a commitment to substantive distributive justice as opposed to merely formal or historical distributive justice.

Egalitarianism therefore comprises divergent views about equality that go beyond the merely descriptive thesis and affirm at least one of the following theses: first, some important type of thing should be distributed equally among persons; second, distributive inequality (along some relevant dimension) is prima facie unjust and should be reduced.

Both principles further specify the normativity contained in (3), yet still give little concrete guidance. Consider a different normative principle with similar form: the current level of infant mortality is unjust and should be reduced. While this thesis does not tell us how to achieve our end, it clearly specifies the end. We know what counts as success because we know what infant mortality is and how to measure it. These two distributive principles, while clearly egalitarian, do not articulate any specific end. They give no guidance on what quantifiable thing matters to distributive justice. What form must a just distribution take? Is it about wealth? Income? Well-being? Preference satisfaction? Something else?

The remainder of this article focuses on the following topics:

What is the proper egalitarian metric? Well-being? Resources? Income? Capabilities?

Once we settle on a metric, are we then concerned with ex ante or outcome equality? In other words, is egalitarianism concerned with a fair allocation of holdings among persons at the starting gate of each life, so that the ensuing competition is fair, or is it concerned with equal life outcomes? Do choice and responsibility matter to this question? What if a given inequality is due to informed and avoidable choices made by the relevant persons? Can such inequalities be just? Should our shares be determined by our choices and actions? If so, then what is genuine equality—a pattern of distribution in which each person is maximally responsible for their holdings, with the role of luck minimized?

Anti-egalitarianism. Many deny the fundamental equality of persons. Some think men are superior to women, certain races are superior to others, and certain castes should dominate others. If so, there is no general moral imperative to lessen inequality among persons. Anti-egalitarianism of this sort rejects both (1) and (2). This article will not address such views. The more philosophically compelling anti-egalitarianism stems not from a rejection of (1) but rather from one of the following readings of it:

(3) does not follow from (1).
Pursuit of (3) is counterproductive or has bad consequences. This includes political objections about incentives and productivity, an objection that if equality is desirable then it is desirable to lower the condition of those who have more even when this does not objectively aid those who have less, and objections that egalitarianism is motivated by envy.
Engaging in redistribution to pursue the aim of (3) is incompatible with (1). For example, pursuing (3) violates rights that follow from (1).

The relationship between egalitarianism and global justice. Does egalitarianism apply to the global community of humanity, or only within particular states? If it does not apply globally, is this a justified deference to the moral value of specific political attachments, a temporary compromise on the way to a more defensible form of egalitarianism, or is it simply unjustifiable favoritism?

2. Equality of What?

Egalitarianism requires a commitment to equalizing our holdings or at least reducing distributive inequality. Neither of these aims can solely be about equal standing or equal moral worth, if equal moral worth can be respected in a society that exhibits inequality along one of the specified dimensions. Respect for (1) puts some constraints on either inequality or the acceptable material minimum (say, by respect for equal rights entailing the minimum holdings to make those rights effective). That has to do with distributive justice, but in an attenuated sense that falls short of egalitarianism. Similarly, a society with radical inequality may make a rational calculus that some minimal redistribution is required for social stability, but this is prudential and conditional, not genuinely egalitarian.

What, other than equal standing or moral worth, is egalitarianism about? We examine five of the most influential candidates: welfare, resources, capabilities, democratic/social equality, and primary goods.

a. Welfare

Welfare is well-being or one’s quality of life. There are two main variants of welfare. The first is hedonic: welfare is pleasure or happiness. Your welfare increases as you experience more pleasures and fewer pains. The second is desire or preference satisfaction. Your welfare increases the more your desires, goals, and preferences are satisfied.

According to hedonic welfare egalitarianism, the experience of pleasure or happiness is what fundamentally matters in life. Welfare is the purpose of our actions. This view is common in ethics generally and is not restricted to political egalitarianism. Jeremy Bentham argued that humans seek pleasure and avoid pain, and that this is both a descriptive truth about human psychology and a normative truth about what we morally ought to do. Welfare is an intrinsic good. Other goods are useful only in an instrumental sense: they can be used to obtain welfare.

If the use of material resources generates welfare, then equalizing welfare will attain substantive outcome equality even among people who exhibit different levels of efficiency in welfare generation. An able-bodied person may require fewer resources than a disabled person to achieve a given level of well-being. Suppose a disabled person needs a wheelchair. If she holds the same amount of resources as a non-disabled person, then the able-bodied person is better off than the disabled person. The disabled person must exchange resources for a wheelchair. So either she is not mobile or she is mobile but has fewer remaining resources than the able-bodied, and in either case she is worse off. Welfare equality accounts for variation in talents, abilities, and opportunities. Equality of welfare attempts to neutralize the impact of these variations on the distribution of welfare.

From a welfare egalitarian perspective, a just distribution of material resources is merely instrumental to achieving what really matters. We cannot redistribute welfare directly; we can only redistribute the resources that persons can use to generate welfare. Since equality of welfare accounts for variations in how efficiently a person can convert resources into welfare, it is markedly different from equality of resources. An egalitarian welfare distribution will not distribute resources equally.
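
The contrast between the two metrics can be made concrete with a small worked example. The sketch below is purely illustrative and rests on assumptions that are not part of any theory discussed here: welfare is treated as a single number, each person converts resources into welfare linearly at a hypothetical `rate`, and some people must first spend a hypothetical `fixed_cost` (for instance, on a wheelchair) before any resources yield welfare.

```python
from fractions import Fraction


def welfare(resources, person):
    """Welfare under the toy model: rate * (resources - fixed_cost)."""
    return Fraction(person["rate"]) * (Fraction(resources) - Fraction(person["fixed_cost"]))


def equal_welfare_allocation(total, people):
    """Split `total` resources so that everyone reaches the same welfare level.

    Assumes the linear toy model above; `rate` and `fixed_cost` are
    hypothetical parameters introduced only for illustration.
    """
    total = Fraction(total)
    fixed = sum(Fraction(p["fixed_cost"]) for p in people.values())
    inverse_rates = sum(1 / Fraction(p["rate"]) for p in people.values())
    # Solve sum(fixed_cost_i + W / rate_i) == total for the common welfare W.
    common_welfare = (total - fixed) / inverse_rates
    shares = {
        name: Fraction(p["fixed_cost"]) + common_welfare / Fraction(p["rate"])
        for name, p in people.items()
    }
    return common_welfare, shares


people = {
    "walker": {"rate": 1, "fixed_cost": 0},
    "wheelchair_user": {"rate": 1, "fixed_cost": 20},
}

# Equality of resources: 50 units each, but welfare levels differ.
equal_split = {name: welfare(50, p) for name, p in people.items()}

# Equality of welfare: the same 100 units, divided unequally.
common, shares = equal_welfare_allocation(100, people)
```

With these numbers, an equal split of 100 units leaves the two people at welfare levels of 50 and 30, whereas equalizing welfare at 40 requires shares of 40 and 60. Real conversion of resources into welfare is of course not linear or directly measurable; the point is only that the two metrics recommend different distributions.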

A problem facing this approach is that preferences adapt to one’s living conditions. Therefore, if preferences help determine one’s level of welfare, unjust inequalities in living conditions might not be rectified by welfare egalitarianism. Nussbaum gives examples of women deprived of resources and opportunities adapting their preferences. This leads to them reporting similar satisfaction levels to women who are objectively less deprived. The adaptive preferences worry is that when there are unjust inequalities, those at the bottom will adapt their preferences to this injustice. A preference can adapt such that you no longer desire that which you are denied. Someone for whom college is an impossible goal may adapt their preferences so that they do not desire to attend college. “Sour grapes” is an even stronger negative preference or aversion to the thing denied. Empirical studies support the thesis that preferences adapt to environmental factors and expectations. Thus someone with fewer opportunities than another may eventually report equivalent welfare levels to those with more opportunities, merely because their preferences, expectations, and standards have lowered. Welfare egalitarianism might therefore convert inequality to equality via subjugated persons internalizing and accepting their inferior status, thereby increasing their satisfaction and reported welfare. (For more on preferences see Harsanyi 1982; and Nussbaum 1999 Ch.5, 2001a.)

However, adaptive preferences are also a benefit for welfare egalitarianism. If persons did not adapt their preferences and ends in response to what they can reasonably expect to attain, aggregate life outcomes would be worse. If goals and preferences were completely non-adaptive, our collective welfare levels would suffer. Adapting one’s ends and preferences is part of forming a rational plan of life. Consider someone who pursues a goal of being a professional athlete at the expense of other professional and personal options. If that person lacks the relevant physical ability, this goal is harmful to their welfare.

Another question facing welfare egalitarianism is whether we should adopt an objective or subjective conception of welfare. Thus far, the description of welfare has been subjective. But what if someone derives high levels of welfare from objects or activities that have low or negative social worth? What if the person experiences a higher level of welfare in pursuit of an idiosyncratic end rather than securing the objective necessities for survival? What if there are higher and lower forms of welfare?

Scanlon (1975) gives an example of someone who prefers to have resources to build a temple rather than to provide for his own health and physical well-being. If he would experience greater subjective welfare under the former scenario, is that the ideal outcome? Or should we take an objective view, specify welfare in terms of the most objectively urgent needs, and guarantee that those are met? Suppose a person will have a below average level of subjective welfare if they have their basic necessities but not the temple, and a very high level of subjective welfare if they have the temple but not the basic necessities. What would welfare egalitarianism have us do? This is a dispute over whether any objective welfare standards are sovereign over individual preferences.

Two other problems for welfare egalitarianism deal with psychological variations among persons. Consider variation in disposition. The cheerful and the gloomy will differ in welfare levels even when their shares of resources are held constant. Do the gloomy deserve compensation? If resources are the raw material for generating welfare, this would lead to subsidizing the gloomy merely for being gloomy. The opposing view is that the gloomy should adapt rather than be subsidized, and if they do not adapt this is a personal matter, not an unjust inequality.

Expensive and inexpensive tastes are further problems for equality of welfare. Someone might have tastes and preferences that require a large number of resources, or particularly scarce resources, to satisfy. Those with expensive tastes require more resources to achieve a given level of welfare than those with less expensive tastes. While both disabilities and expensive tastes are inefficiencies in the conversion of resources to welfare, it seems a mistake to lump them together. Tastes can change over time. They are subject to their bearer’s agency in ways disabilities are not. People can cultivate, modify, and abandon their tastes and preferences. Also, being deprived of the goods made possible by, say, being ambulatory is not clearly equivalent to the deprivation suffered by someone with an unsatisfied preference for exotic food and wine. There seems to be a difference between using society’s resources to subsidize those with disabilities and using them to subsidize those with expensive tastes. Proponents of resource egalitarianism find welfare egalitarianism inadequately sensitive to this difference.

Some of the objections to welfare egalitarianism just outlined can be answered by moving to equality of opportunity for welfare. Equality of opportunity for welfare accommodates the luck egalitarian principle that it is bad for someone to be worse off than others through no fault of their own. Equality of opportunity for welfare does not commit itself to subsidizing the imprudent or those who cultivate expensive tastes. For an example of equality of opportunity for welfare, see Arneson (1989, 1990). For an equality of opportunity view with a wider metric that includes aspects of both welfare and resources, see Cohen (1989). Equality of opportunity is addressed in greater detail in Section 3.

b. Resources

Resources are things one can possess or use. Think of the various things you can use to generate welfare: wealth, income, land, food, consumer goods. Wider conceptions of resources include one’s own talents and abilities. Resources can also be social: social capital, respect, and opportunities.

Welfare is an intrinsic good; resources are instrumental goods. Resources are good because they can be used to generate welfare, or to guarantee that people are fully capable of functioning and thriving, or able to pursue some specific conception of the good life. Why focus on an instrumental good rather than the intrinsic good? Recall that different persons may require different amounts of resources to achieve equivalent levels of welfare. For example, we can understand disability as inefficiency in welfare generation. Equality of welfare counteracts disabilities, variations in talent and ability, and so on. From the welfare egalitarian perspective, focusing on resources misses the point.

On the other hand, equality of resources gives an attractive answer to other forms of resource-to-welfare inefficiencies that are not obviously matters of justice. What if my tastes are simply more expensive than yours? If you can achieve a specific welfare level with low-grade hamburger, but I need wagyu beef to reach the same level, then equality of welfare, at least in principle, requires subsidizing my share of resources above yours. You get fewer resources than I do only because your tastes are less expensive. Is this just? Many find it to be implausible in principle and inapplicable in practice. Consider the problems of implementing a scheme of distributive justice that would subsidize expensive tastes. This would generate resentment and reduce the commitment to distributive justice in society. There is also a problem of knowledge and trust—how do I know you have expensive tastes? Everyone has an incentive to report having expensive tastes when they are subsidized.

The bad sort of adaptive preferences amplifies this problem. Suppose your tastes are less expensive than mine because you were raised in a less privileged environment with fewer resources and opportunities. Subsidizing my more expensive tastes would then institutionalize prior inequalities and further subsidize those who were already better off. If that seems unjust, then it is attractive to shift focus from the intrinsic good to the instrumental good. If we equalize resources, we can give everyone a fair opportunity to generate welfare and leave variations in tastes as a private concern.

One welfare-egalitarian response to these problems is to distinguish between tastes that are under the control of the person and those for which the person is not responsible. If my taste is out of my control, then its impact on my welfare levels is a matter of justice. If I intentionally cultivated the taste, or refuse to expend effort attempting to revise it, then it is a private concern. But this distinction raises perplexing empirical questions. How could we ascertain whether or not a taste is under one’s control? This is a counterfactual claim about what would happen if the person tried to change it, or a historical question about what happened when in fact they tried to change it.

Equality of resources provides a compelling answer to these problems. If we all have an equivalent bundle of resources, and have control over how we expend them, then whatever tastes an individual has is a private concern. It is not a matter of justice. But the advantage gained in terms of expensive tastes generates a cost: we may no longer have a sufficiently egalitarian response to unjust inefficiencies such as disabilities. Even if expensive tastes and gloominess should not be concerns of distributive justice, inefficiencies involving disability should be. If you and I have the same bundle of resources, but you need a wheelchair to be mobile and I do not, then you are disadvantaged. Our positions are not equal. If equal shares of resources define distributive justice, the disabled are at a disadvantage.

Dworkin (1981b, 2002) takes this as one reason to treat some features of the self as resources. This allows resource egalitarianism to differentiate expensive tastes and disabilities. Dworkin sees both as inefficiencies in welfare generation, but only disability is also a resource deprivation. Someone who can walk has more bodily resources than someone who cannot. This wide conception of resource egalitarianism sees disability as a resource deprivation and therefore a matter of distributive justice. Equal shares of resources now account for disabilities. In the example from the previous paragraph, you will receive the same bundle as me plus a wheelchair. Our total bundles are comparable, because mine includes an ambulatory body while yours includes a non-ambulatory body plus a wheelchair. This approach also applies to innate talents. Someone with abilities or talents that are in high demand already has more resources than someone without such innate talents.

Dworkin’s strategy immediately raises the question of how to determine the value of specific traits and abilities. If we want to implement such a scheme of redistributive justice, how would we specify the value of all these resources? It is a trivial matter to specify equality of wealth or income, but not to quantify the resource variation among persons with various abilities, disabilities, and talents. Dworkin attempts to solve such problems by abstracting away from particular cases and looking at decisions that rational people would make in a hypothetical insurance market. Rational agents, unaware of their own actual talents, abilities, and disabilities, purchase coverage against having disabilities or a lack of valued skills. For example, one considers what sort of policy would be attractive to insure against blindness, lack of in-demand talents, and so on. Then the actual redistributive scheme in society should redistribute resources to actual persons in accord with the insurance coverage that it would have been rational to purchase. Think of it along the lines of medical insurance or unemployment insurance. The hypothetical insurance market provides a rough guide for determining the value of specific resources, giving a baseline of compensation for those who lack such resources.
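
The mechanics of such a scheme can be sketched with a deliberately simple toy model. It is not Dworkin's own formalization, and every parameter is an assumption made for illustration: there is a single insurable condition, everyone is assumed to have chosen the same hypothetical `coverage` level, and the premium is actuarially fair (the assumed `risk` multiplied by the coverage). The hard philosophical question of which coverage level it would be rational to buy is simply taken as an input.

```python
from fractions import Fraction


def insurance_transfers(population, risk, coverage):
    """Net transfer per person under a toy hypothetical-insurance scheme.

    Illustrative assumptions: one condition, a uniform `coverage` level,
    and an actuarially fair premium equal to risk * coverage.
    """
    premium = Fraction(risk) * coverage
    transfers = {}
    for name, has_condition in population.items():
        payout = coverage if has_condition else 0
        # Positive values are net receipts; negative values are net payments.
        transfers[name] = payout - premium
    return premium, transfers


# Hypothetical population: True marks a person who actually has the condition.
population = {"ana": False, "ben": False, "chi": True, "dev": False}

# Assumed chance of having the condition, judged before anyone knows their own traits.
risk = Fraction(1, 4)

premium, transfers = insurance_transfers(population, risk, coverage=40)
```

Here each person notionally pays a premium of 10, the one person with the condition receives a net transfer of 30, and the transfers sum to zero because the assumed risk matches the actual share of the population affected. When the two diverge, the scheme would run a surplus or deficit, which is one reason the model is only a rough guide.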

Resource egalitarianism aims to secure for everyone an equal set of resources and an equal opportunity to convert those resources into welfare. How well people do this, and resulting inequalities stemming from their choices, are not core concerns of this conception of distributive justice.

c. Capabilities

Capabilities are potential functionings, such as walking to work, reading a book, travelling, or being safe and secure in one’s home. If you have the capability to do a specific thing then you have both the abilities and resources required to do it, whether or not you actually choose to do it. A person has the capability to participate in a town hall discussion when they have the physical ability to move into that space (their body, or lack of assistive devices, or the infrastructure does not prevent this motion), the safety to do so without being assaulted, the ability to become informed about the issues (literacy, access to information), and so on. Whatever material and social conditions are required for a specific functioning are possessed by whoever has the relevant capability.

Capabilities-based approaches to distributive justice are sufficientarian rather than equalizing. What is unjust is not the number of capabilities possessed by those on the top compared to others, but the objective inadequacy of the capabilities of those on the bottom. While not equalizing, this is egalitarian. It is concerned with substantive distributive justice. These theories are meant to provide a minimal component of justice that can be combined with further normative principles. When coupled with egalitarian principles, the view is no longer sufficientarian. In terms of its minimal core, though, just as with resource egalitarianism, its commitment to distributive justice is instrumental: a more egalitarian distribution of resources can bring more persons up to the threshold capability level.

The capabilities-based approach’s distinction between capability and function accounts for responsibility and autonomy. What the theory attempts to secure is a sufficient level of capabilities for all. Whether an individual functions is up to his or her own choice. The capabilities approach is therefore not subject to the adaptive preferences objection. No matter how much one adapts their tastes, preferences, and expectations downward, it is unjust whenever they lack the essential capabilities. They may, through free choice or conditioning, choose not to function in certain ways—but they must have the relevant capabilities. In this case, the agent is not making a judgment that something is not worth doing when they currently cannot do it; they are making a judgment that they do not want to do something that they are capable of doing. They have the abilities and resources required to do so. Thus, adaptive preferences can still lead to inequalities in functioning, but this does not impact distributive justice. A sufficient level of capabilities for all requires a certain pattern of the distribution of resources. That pattern is not impacted by the choice of some persons not to function in certain ways.

This approach raises an obvious and crucial question: which capabilities matter to distributive justice? Not every capability should matter, such as the capability to pollute the environment. It also seems that capabilities must be specified in a coarse rather than fine-grained way. The theory would be intractable if every discrete form of functioning were correlated with a discrete capability. For the theory to be illuminating and useful the list must be manageable.

Some capabilities theorists, such as Sen, avoid enumerating an official list. Nussbaum argues that the following list enables one to live a full life with dignity. She does not treat it as timeless or the final word:

Life—capable of living a normal lifespan.
Bodily Health—health, nutrition, shelter.
Bodily Integrity—movement, security against violence, choice in reproduction, sexual satisfaction.
Senses, imagination, thought—the exercise of these capacities in a fully human sense, facilitated by education and protected by rights (of expression, religion, and so forth).
Emotions—emotional development allowing one to form attachments.
Practical reason—development, critical reflection upon, and pursuit of a conception of a good human life.
Affiliation—social interaction, the social bases of self-respect.
Other species—living with and showing concern for the natural world.
Play—recreation.
Control over one’s environment—political activity, political guarantees of security and noninterference, property holdings, full participation in the economic and civic spheres.

Nussbaum’s capabilities list gives a general picture of human flourishing. It reaches every domain of human life. (For more on the capabilities approach, see “Sen’s Capability Approach.”)

d. Democratic/Social Equality

Democratic or social equality is a narrower-scope form of the capabilities approach. Elizabeth Anderson (1999, 2010) developed the most prominent version. Her theory stems from a critique of the individualistic nature of both resource and welfare egalitarianism. Those theories of distributive justice address equality among the holdings of different individuals. Anderson objects to the focus on individual holdings of resources or welfare levels. The point of egalitarianism is social, dealing with relations among persons, not atomistic, dealing with individual allocations of some metric. Anderson rejects the individual compensation model entirely. We cannot do away with unjust inequality by allocating more resources or welfare to those at the bottom. Anderson focuses on the capabilities of citizens and the social relationships between them. Unjust inequalities are caused by oppression, which is social.

Let us again consider disability. Anderson argues that disability is as much a social as a biological fact. The impact on one’s life of having a particular disability varies according to the way social space and infrastructure are constituted and on the social practices of fellow citizens. For example, someone in a wheelchair has less of a handicap when social spaces are physically accessible to them. Equality and inequality are essentially social—the impact of many disabilities depends on social attitudes and political policies. What accommodations do the majority enact through democratic policy? Do the non-disabled treat the disabled as equal and fully capable? The proper response to disability cannot be individual compensation. The resources and redistribution that should be used to counteract such handicaps must deal with social practices and infrastructure. Individualistic models could account for why a disabled person requires extra medical resources, but they do not reach the level of infrastructure and social practice. Wheelchair accessible social spaces are not part of any individual’s holdings of resources. They are not her property. Yet they are fundamental to understanding disability and inequality.

Unjust inequalities are not mere individual deprivations of welfare or resources compared to others, but socially imposed oppression and exploitation. The paradigm of unjust distribution is not one in which some have much more than others, but in which some oppress and exploit others. Inequality is constituted by certain sorts of social relations. The ideal distribution is not one in which everyone is equalized in terms of resources or welfare, but in which everyone can fully function as a citizen. This is a narrow-scope capabilities approach in two ways: first, the capabilities list is not all-encompassing; second, this is all within a particular political state. Indeed, Anderson’s conception is specifically democratic equality.

This approach is committed to substantive distributive justice as instrumental in guaranteeing that all citizens have a sufficient set of capabilities. Whether a citizen possesses a given capability is jointly determined by the individual, their resources, their environment (natural and built), and the social practices and attitudes of their fellow citizens. Hence the focus is more on institutional changes to make the infrastructure navigable with disabilities, and changes to social norms and behavior, rather than seeing disabilities as an inefficiency for which the individual has a claim to a greater resource share.

The list of capabilities is narrowed to those required to function as a citizen, but nonetheless must be rather coarse and general. The capabilities list must include what is needed to fully function as a citizen and to avoid oppressive social relationships. However, fully functioning as a citizen includes more than political life. It also includes the ability to function in the civil and economic spheres. The point of egalitarianism is not to impose a pattern of distribution but to eradicate oppression, which is socially imposed.

Not only is this theory narrower than the theories of Sen and Nussbaum, it is more constrained than any other option we have considered. Welfare, resources, preferences, primary goods, Nussbaum’s capabilities—each of these reaches into every domain of human life. This conception of equality only touches our lives as citizens. Now, to be sure, since capabilities must be specified in a rather coarse-grained way, the relevant capabilities to citizenship can be put to use in other domains of life. Nonetheless, the scope is relatively narrow.

One objection facing this approach is that it may be possible to guarantee that everyone can fully function as citizens and avoid oppression while at the same time having radical inequality of resources or welfare. If so, perhaps this view is unacceptably narrow because guaranteeing the threshold capability level is compatible with unjust inequalities in life outcomes. Another worry is that this view might be less able to address global justice than other alternatives. That is a disadvantage if one thinks that a unified theory should cover both domestic and global justice.

e. Primary Goods

We now turn to an influential variation on resource egalitarianism. It is not strictly equalizing, and it employs a wide and diverse conception of resources. John Rawls argued that primary goods are what citizens have reason to care about, regardless of whatever else they care about. Primary goods include health, physical and mental abilities, income, wealth, rights, liberties, opportunities, and the social bases of self-respect. No matter what particular conception of the good a citizen may have, and whatever their life plans, goals, and deepest commitments are, they have reason to want more rather than fewer primary goods. Primary goods are what must be expended or employed in pursuit of your conception of the good. (This could mean recreation, education, artistic output, religious missionary work, and so on.) Non-material goods such as liberties and opportunities are what make one’s freedom effective. The social bases of self-respect make for a rewarding life.

All of these primary goods are valuable to you regardless of your religion, values, and life goals. No matter what comprehensive conception of the good you affirm, it is rational to want more rather than fewer primary goods. However, given our differing conceptions of the good, we will not all agree on the best way to use the additional goods created by our social cooperation. Principles of justice are required to fairly allocate resources. For Rawls, the right is prior to the good. Just principles for allocating primary goods trump the pursuit of our various individual conceptions of the good. This is one thing meant by the title of his book Justice as Fairness.

Rawls’ theory is egalitarian but not necessarily equalizing. It focuses on substantive distributive justice but does not always aim for an equal distribution of all primary goods. Basic rights and liberties must be distributed equally. Fair equality of opportunity requires that opportunities are distributed equally across persons of equal talent and motivation. However, considering all the various primary goods including wealth and income, equality is merely the baseline from which other distributions are judged. Other distributions can be preferable to equality. Inequalities can be justified instrumentally when they are necessary to raise the absolute condition of the worst off. This is accomplished when inequality is a necessary causal mechanism for increasing total productivity. Greater incentives may be required to motivate the talented to be more productive. The worst off would prefer to live in a society in which they get a larger slice of a larger economic pie than to live in a purely equal society in which they get a smaller slice of a smaller pie. Rawls’ strategy is to answer the problem of distributive justice via a social contract. We consider an idealized choice scenario in which free and equal persons come to an agreement about the nature of the society they wish to enter. If our society matches principles that those persons would have chosen, our society is just. A society meeting this standard is as close as we can get to a voluntary agreement to be bound by a particular state.

Rawls argued that the distribution of benefits and burdens in society should not be fundamentally determined by that which is arbitrary from the moral point of view. This rejection of the morally arbitrary explains Rawls’ choice of the veil of ignorance as part of the preferred choice scenario for picking principles of justice. Rawls argued that we should choose principles of justice by imagining persons behind a veil of ignorance that prevents them from basing their choice on what is morally arbitrary. It is not possible to choose principles tailored to serve one’s own peculiar self-interest. The choice of principles is still made out of self-interest, but it is the interest of an abstract model of the person, not of a specific person who is aware of their particular, contingent situation in the actual world. The veil occludes knowledge of much that is due to chance, but also much that is due to choice, including the choosers’ various conceptions of the good. This scenario attempts to value choice by creating the conditions under which people can all pursue their own conceptions of the good. They reason about how to secure primary goods, which can be expended in pursuit of any conception of the good. The original position creates a model of the Kantian notion of the self, and the veil of ignorance forces the choosers to make decisions that are categorical. They lack the knowledge required to make hypothetical choices based in their own particular conception of the good and their peculiar desires.

Rawls argues that under these conditions, rational actors would choose a maximin strategy. Each individual’s goal is to make the worst possible outcome for themselves as good as it can be. They would not take an avoidable gamble on entering into a society with persons suffering at the bottom of the socioeconomic ladder because they would not want to risk living their entire lives under such conditions. Nor would they object to inequality when it raises the absolute level of the worst off, since they are more concerned with the objective quality of their own lives than with envy of those with more primary goods. Rational, self-interested persons situated in a fair procedure for making decisions about their society will affirm the difference principle. According to the difference principle, if incentives that generate inequality are required to increase productivity, then the resulting inequality can be just. If such incentives are required to motivate higher productivity, then they should be allowed as long as they can be harnessed to assist the worst off. By using inequality to motivate productivity, the economic pie grows, and redistribution can improve the lives of the worst off. Note, however, that the difference principle cannot justify violations of the descriptive thesis affirming the equal worth of all persons. A liberty principle takes priority over the difference principle. We may not create a system of unequal rights and liberties even if doing so would allow us to raise the absolute condition of the worst off.
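
The maximin reasoning described above can be illustrated with a small sketch. The candidate societies and the numbers attached to them are invented for illustration only; each number stands for a hypothetical index of primary goods held by one social position. The sketch also sets aside the priority of the liberty principle and fair equality of opportunity, so it shows only the comparison that the difference principle itself makes.

```python
from typing import Dict, List

def worst_off(distribution: List[float]) -> float:
    """Return the primary-goods index of the worst-off position."""
    return min(distribution)

def maximin_choice(candidates: Dict[str, List[float]]) -> str:
    """Pick the candidate whose worst-off position fares best (maximin)."""
    return max(candidates, key=lambda name: worst_off(candidates[name]))

# Hypothetical candidate societies (all figures are assumptions).
candidates = {
    "strict_equality": [10, 10, 10],
    "incentives_with_redistribution": [14, 20, 35],
    "unregulated_inequality": [6, 25, 60],
}

chosen = maximin_choice(candidates)
print(chosen)  # incentives_with_redistribution
```

On these assumed figures, the unequal society with redistribution is chosen over strict equality because its worst-off position holds 14 rather than 10, while the most unequal society is rejected because its worst-off position holds only 6. Choosers who are not moved by envy have no reason to prefer the equal distribution here.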

Gerald Cohen objects to the demand for greater incentives that the difference principle allows. The people who require greater incentives to work productively are blameworthy. Why, knowing that if they work to their full ability this will benefit the worst off, do they not do so without demanding a greater share of primary goods? Cohen argues that this demand for incentives is exploitative. If the talented changed their outlook, we would have greater equality and improvement of the lives of the worst off. Rawls’ theory deals with principles governing political institutions and the basic structure of society, not with private actions and motivations. Cohen thinks egalitarianism should be internalized. In Rawls’ theory, persons in the original position are conceived of as self-interested, and a fair procedure for choosing principles of justice ensures a commitment to distributive justice. But that is a product of the fairness of the choice scenario and the self-interest of the participants. Cohen thinks that egalitarianism as a moral and political imperative should motivate individual choices and actions, not only shape the basic structure of society and its institutions. Still, as a matter of public policy, Cohen deems Rawls’ view a radical improvement on contemporary society. His objection is that the difference principle is subordinate to unjust motivations and attitudes. Justice requires that we have egalitarian motivations, and therefore the talented should never demand the incentives allowed by the difference principle. Egalitarianism is a normative ideal, and talented persons ought to work productively and support redistributive policies to pursue equality without demanding a greater share of primary goods. Rawls thinks that in actual societies people will have a variety of motivations. The problem for Cohen is that the original position models persons as self-interested rather than egalitarian. He concludes that the difference principle is not just.

f. Luck Egalitarianism

We now turn to a view that combines egalitarianism, Rawls’ rejection of the influence of morally arbitrary factors, and an emphasis on the values of choice and responsibility. Rawls’ social contract view holds that the morally arbitrary should not fundamentally determine the distribution of primary goods or people’s life prospects. So one’s family, one’s innate talents, and one’s starting place in society should not shape one’s life prospects or distributive share unless this benefits the worst off. These factors are undeserved and should not alone determine the distribution of benefits and burdens in society. Luck egalitarianism distills this thought into a complete theory of distributive justice. The ideal distribution is sensitive to people’s choices and informed gambles, but not to brute luck in the distribution of talents and opportunities. For the luck egalitarians, our capacities for free deliberation, choice, and action are pre-institutional. Therefore, they should inform and determine the principles of distributive justice, and the institutional expectations for entitlement and deservingness. (Hurley argues that this is a crucial feature of luck egalitarianism.) These features of the self are not ignored in Rawls’ view, but they do not fundamentally shape the institutions.

Luck egalitarianism is a responsibility-sensitive conception of equality and a system for distributing goods and aid under conditions of scarcity. It prioritizes aid to those who suffer through no fault of their own. It is a non-equalizing commitment to substantive distributive justice. Equality provides a baseline, though in a quite different way from Rawls. The role of equality here is what we can call ex ante equality. At the starting gate of life, we should be equal in some sense. Depending on the favored metric, we should begin with an equal amount of resources or opportunity for welfare. The luck egalitarian ideal is that we start on an equal footing, and then the outcomes of our life choices and freely taken gambles should determine our future holdings. Inequality therefore can be just. It is not just because it brings about some further social good, as the difference principle allowed for inequalities that improve the objective condition of the worst off. Rather, inequalities are justified by being brought about in the right way, by having the right sort of causal origin.

Indeed, luck egalitarianism is an alternative way to develop the emphasis on choice, responsibility, and individual sovereignty that leads some to reject egalitarianism entirely. Cohen argues that the view co-opts these values from the anti-egalitarians. Luck egalitarianism is not opposed to inequality per se; it is opposed to inequalities that have the wrong sort of origins. Brute luck, that is, the type of morally arbitrary factors cited by Rawls (innate talents, parentage, starting place in society), generates unjust inequalities. But option luck, that is, luck in the outcomes of freely taken risks or gambles, leads to just inequalities. As with the capabilities approach, luck egalitarianism may be combined with other principles of justice. (See Cohen on community.)

One objection to luck egalitarianism is based in skepticism about free will and moral responsibility. The theory hinges on the moral importance of choice and responsibility. If there is no robust conception of free will and moral responsibility, why think that inequalities caused by our choices are just?

Another worry about the theory is abandonment. Does luck egalitarianism offer no aid to those who suffer because of choices with poor outcomes? If inequalities are just whenever they are caused by choice, then is there no minimum level of well-being guaranteed for all? One sort of response to this worry is combining luck egalitarianism with other political values. Cohen argues that a commitment to community prohibits inequalities that would be allowed in a purely luck egalitarian system. Kymlicka argues that luck egalitarianism can be combined with social egalitarian views that likewise prohibit some inequalities that might be allowed by luck egalitarianism.

Anderson develops a social egalitarian view and is a strong critic of luck egalitarianism. Her conception of democratic equality is not only a development of the capabilities theory but also an explicit rejection of luck egalitarianism. She thinks that the luck egalitarian focus on brute luck means the theory completely misses the social nature of inequality. She objects that luck egalitarianism ends up trying to correct the “cosmic injustice” of brute luck in an attempt to ensure that people get what they deserve, and that this blinds its proponents to the social oppression and exploitation that constitute inequality. Unjust inequality has to do with social relationships.

Another question facing those who support luck egalitarianism is how to define equal starting places. This leads us into the larger issue of what constitutes equality of opportunity.

3. Equality of Opportunity

What if there are dramatic inequalities in the opportunities for choice, education, and careers? This is a problem for luck egalitarians, because they need to specify a starting gate conception of equality. It is also a pressing issue for the other conceptions of equality.

Dworkin argues that inequalities can be historically justified when persons made their choices from an equivalent set of options. This commits luck egalitarianism to robust equality of opportunity. However, his standard is difficult to interpret, since citizens can never have a strictly equivalent set of options, unless that set is so restricted that the society is dystopian. There must be some standard to define when their options are fungible or equivalent enough. However, this is a massive problem for egalitarian theory, and it seems luck egalitarianism’s values of choice and responsibility alone cannot solve it. Answering that problem requires some other standard of value. When do persons have equal opportunities?

Equality of opportunity is a natural extension of the descriptive thesis that affirmed the equality of all persons. The descriptive thesis is incompatible with forms of oppression that rule out classes of people from competing for certain positions within society. A denial of the descriptive thesis entails a denial of a commitment to equality of opportunity. But what exactly does equality of opportunity require? It can be understood as ranging from merely formal equality of opportunity to substantive equality of opportunity. The more one approaches the latter, the more one becomes committed to substantive distributive justice.

Formal equality of opportunity requires that desirable positions and resources in society be allocated by open and meritocratic competition. Firms, government agencies, and universities are appropriate candidates for such equality of opportunity. This requires little or no substantive distributive justice. It does require that all citizens can participate in the competition, and that the winners are chosen on the basis of purely meritocratic concerns. Meritocracy requires that the traits that determine who wins the competition actually predict success in the position. Formal equality of opportunity prohibits allocating positions on the basis of gender, ethnicity, and so on. This deals only with opportunities, not outcomes. It does not address systemic inequalities in who wins the meritocratic competitions.

Substantive equality of opportunity addresses both the procedures for allocating positions and the preparation of the candidates that determine their chances of success. It deals with both fair procedures and the actual outcomes of those procedures. For example, if positions are open on the basis of purely meritocratic competition, but the advantages conferred by wealthy parentage are so overwhelming that only the children of the wealthy win the desirable positions, this is merely formal equality of opportunity. Those who support substantive equality of opportunity argue that the merely formal is morally inadequate.

Consider Bernard Williams’ example of a hypothetical warrior society. In the past, this was a caste society in which warriors had high prestige and the majority of wealth. The society transitions to a system of formal equality of opportunity. Under the old order, only the sons of wealthy families were eligible to be chosen as warriors. All others were consigned to poverty and subjugation. Now, warrior positions are allocated under a system that exhibits formal equality of opportunity. Under the new order, there is a meritocratic allocation of the desirable warrior positions. These desirable positions are distributed according to the results of an open, meritocratic, and fair tryout. Rich and poor alike may enter the competition. There is no bias in judging the winners and losers. Stipulate that women may now obtain these positions. Success in the examination is predictive of success as a warrior, so the system is meritocratic.

However, this is all compatible with only the offspring of warriors having adequate nutrition and training to succeed in the competition. Although careers are open to talents, the poor have no chance to cultivate the relevant talents. Even those with the luck to be born with innate ability have their prospects defined by their parentage. Those who were not born to a warrior family cannot succeed. Therefore, the old social hierarchy will persist, even though a strict caste system has been replaced by open, meritocratic procedures that satisfy formal equality of opportunity.

A formal equality of opportunity defender might point out that the long-term outlook for this social hierarchy is made much more tenuous by the implementation of formal equality of opportunity. Other changes to the society could impact the levels of inequality. The dominant positions in society are subject to change over time in a way that they were not under the original caste system.

Still, from the egalitarian perspective, this meritocratic society is unjust. That destabilizing forces can change things under formal equality of opportunity does not redeem the status quo. The current situation is unjust, and destabilizing change would not entail that the next distribution will be just, only that the individuals occupying the dominant and subordinate positions will change. The transition might be to a society in which different non-meritocratic attributes correlate with having any chance for success: say, a shift from warrior families to merchant families, or to an arrangement in which only the offspring of a small set of occupations have a genuine opportunity to succeed.

A perfectionist, someone who thinks that society should maximize the pursuit of some particular conception of the good, could argue that formal equality of opportunity is adequate because the concentration of wealth, which in turn prepares people to flourish as warriors, creates the best set of warriors overall. One can object to this on perfectionist terms (that generating the best warriors is not the proper overriding good, or that this system does not generate the best set of warriors) or on Rawlsian terms of liberal justice (no one conception of the good should be made sovereign in a free society, and no one would agree to this arrangement in the original position).

Suppose the example is shifted slightly. Rather than only the sons of wealthy high caste families having any opportunity to succeed, there is a small amount of social mobility. Some not born into a privileged position win the meritocratic competition. There is not substantive equality of opportunity, but there is both formal equality of opportunity and actual mobility. A supporter of substantive equality of opportunity will still object that the strength of the correlation between family background, the resources provided by that background, and obtaining a warrior position is itself adequate evidence of the inadequacy of formal equality of opportunity. These concerns push one to rely on another metric, such as resources, to attain a substantive, material form of equality of opportunity.

Of course, examples need not be so rigid as Williams’ caste society. A collection of informal social attitudes and practices may also violate equality of opportunity. If women are not seen as capable of being good pilots, then hiring and promotion procedures will lack genuine formal equality of opportunity, even if this is not inscribed in company policy, in law, or in a caste system. These impediments to equality of opportunity are endemic in contemporary society. There are more strategies for answering these problems than can possibly be described in this brief article, so we will mention only two that expand upon views already covered. Rawls (2009) developed a conception of fair equality of opportunity that undermines the role of class, race, gender, or caste in determining life prospects. Fair equality of opportunity requires that persons of equivalent talent who expend equivalent effort have equivalent prospects of success. Roemer (2009) provides a sophisticated luck egalitarian account of equality of opportunity that separates people into different types. The competitions that allocate desirable resources and positions should be designed so that effort is rewarded. The details of this scheme are beyond the scope of this article, but these two views are good starting places for readers who want to research the issue in greater depth.

4. Anti-Egalitarianism

An obvious form of anti-egalitarianism rejects the descriptive thesis. If persons are not equal, then there is no moral imperative to pursue substantive distributive justice. Sexism, racism, caste discrimination, and so on are obviously not views that lead into egalitarianism. These objections are beyond the scope of this article.

A common political objection to egalitarianism is that it is based in envy. None of the theories canvassed in this article are explicitly based in envy, so this objection has more to do with the alleged psychological motivations for becoming an egalitarian than with criticism of egalitarian arguments themselves. Of course, Rawls’ theory explicitly rejects envy. Persons in the original position want to secure the greatest number of primary goods for themselves. Their choice is not impacted by envy of those who may end up with an even greater share of primary goods.

A second political objection is that egalitarianism undermines productivity. If the state redistributes income or other resources, then there is less incentive to be productive. Egalitarians can deny this on empirical grounds, object that total productivity is not the most important criterion, or attempt to harness the way that incentives motivate productivity (as with Rawls’ difference principle).

A practical objection is that a commitment to distributive equality would lead us to “level down” the allocations of those who have more for no real benefit. Suppose all the members of a population have x units of your preferred metric of distributive justice, except for one person who has 2x. Now consider whether it is desirable to transition from that distribution to one in which everyone holds x units. This makes one person worse off and no person better off. The distribution is now equal, but is it preferable? Is it more just? A strict egalitarian can respond that if equality is intrinsically valuable then the distribution is improved in that respect. They are not strictly committed to concluding that this makes the new distribution preferable overall. That only follows if equality is the overriding or sole value. If equality must be balanced against other values, then egalitarians have an answer to the leveling down objection. A strict egalitarian who thinks equality is instrumental already accepts other values, so they can argue that in these cases equality is not instrumental in bringing about the desired consequences.
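
The structure of the leveling down case can be made concrete with a short sketch. The value chosen for x and the population size are arbitrary assumptions; the point is only that the move to equality makes one person worse off and nobody better off.

```python
from typing import List

def is_strictly_equal(distribution: List[float]) -> bool:
    """True when every person holds the same number of units."""
    return max(distribution) == min(distribution)

def nobody_gains_someone_loses(new: List[float], old: List[float]) -> bool:
    """True if no one is better off in `new` and at least one person is worse off."""
    no_one_gains = all(n <= o for n, o in zip(new, old))
    someone_loses = any(n < o for n, o in zip(new, old))
    return no_one_gains and someone_loses

x = 5  # hypothetical number of units of the preferred metric
before = [x, x, x, 2 * x]   # one person holds 2x
after = [x, x, x, x]        # everyone leveled down to x

print(is_strictly_equal(before))                  # False
print(is_strictly_equal(after))                   # True
print(nobody_gains_someone_loses(after, before))  # True
```

The sketch records only the two facts that drive the objection: the second distribution is equal, and reaching it benefits no one. Whether the gain in equality counts for anything is the philosophical question, and nothing in the code settles it.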

The leveling down objection is a threat to views that pursue strict equality. Non-equalizing conceptions of substantive distributive justice avoid the problem. What most theories aim to do is improve the condition of the worst off and thereby lessen inequality, not pursue strict equality unconditionally. Views that prioritize aid to the worst off or support a sufficient minimum floor are not obviously subject to this objection. Even if one thinks it is morally obligatory to redistribute resources to improve the condition of those who are worse off than others, it does not follow that it is obligatory to destroy resources when that is the only way to achieve distributive equality.

Perhaps the most philosophically interesting objections to egalitarianism are themselves based in the descriptive thesis that all persons are in fact equal. One objection is that egalitarian distributive justice is insufficiently sensitive to both deservingness and human agency. A second is that there is no just way to implement a redistributive scheme that aims towards equality, because doing so violates freedoms and rights that follow from our equality.

Welfare egalitarianism, resource egalitarianism, the capabilities approach, and Rawls’ difference principle are patterned conceptions of distributive justice as opposed to historical conceptions. Strict egalitarianism defines a pattern of equal shares, the various capabilities approaches define patterns involving a sufficient minimum below which persons cannot fall, and the difference principle states that the level of permissible deviation from the baseline of equality is defined by what is necessary to raise the absolute condition of the worst off.

Nozick (1974) argues against all patterned conceptions of distributive justice. He claims that according to patterned conceptions of justice, if a given pattern is just, it makes no difference which persons occupy which places in the distribution. Justice is defined in terms of structural features of the pattern, not the identity of those occupying specific places in the pattern. Yet that seems counterintuitive. Those at the top might deserve their place on the basis of working hard. Inequalities might be generated by the voluntary transfer of goods that took place in a distribution that was already just. Nozick concludes that, rather than favoring a patterned conception of distributive justice, we ought to understand distributive justice in terms of historical entitlements and voluntary transactions. He agrees with Rawls that the distribution of natural talents is not a basis for deservingness, but denies that this means the distribution of those talents (and the varying wealth and income derivable from them) is arbitrary from the moral point of view. It is not arbitrary because natural talents are implicated in the normative relationship of self-ownership. Persons own themselves. That includes their native abilities. This means that, by extension, they hold strong entitlements to the property they can obtain by exercising those (undeserved) talents.

Since Nozick was primarily responding to Rawls’ Theory of Justice, it is worth looking at this objection and the extent to which it threatens patterned conceptions in general and Rawls’ conception in particular. Rawls’ view can be defended against Nozick’s objection that according to patterned conceptions of distributive justice, it should not matter which individual occupies which place in the pattern. Consider the role given to institutional expectations and institutional desert. Rawls’ theory allows for people to deserve property so long as the state’s institutions have created the reasonable expectation of such property rights. In other words, entitlement to property is generated by the basic structure of the state. Institutional expectations ground such entitlements. Therefore, his view is compatible with a conception of private property that is not indifferent to which persons occupy which positions in the distribution. Of course, Nozick, following Locke, thinks individuals can have preinstitutional entitlements, so his view of property rights is much stronger.

Still, in Rawls’ patterned view of distributive justice, it must matter which particular individuals occupy which places in the patterned distribution, because the point of the difference principle is that scarce talents are harnessed for the benefit of all. There is a causal relationship between which persons occupy which positions and the pattern of the total distribution. The size of the economic pie is defined by which people occupy which places. Switching places would change total productivity and harm the absolute condition of the worst off. Rawls argues that a given society’s distribution of goods is just if it matches the difference principle. The specific pattern depends on myriad factors, and those factors cannot be held constant while you switch the persons occupying the different positions in the pattern. For example, if in a given state greater incentives are required to motivate some of the highly talented to be more productive, you cannot switch their place in the pattern without changing the productivity level. In such cases, Nozick’s discussion of switching persons within the pattern would necessarily modify the pattern itself. The hypothetical place switching across identical patterns cannot be implemented. So what Nozick means by “patterned” does not capture everything that matters in substantive distributive justice. This response also applies to luck egalitarian accounts of distributive justice. Luck egalitarianism is committed to having shares allocated in accordance with the individual’s choices and option luck. (For a much stronger desert-based alternative, see Kagan 2012.)

Nozick’s second objection has to do with individual liberty to make voluntary transactions. Suppose an actual distribution meets your definition of a just pattern, whatever that may be. So long as persons can make voluntary transactions (purchases, gifts, trades, bequests), the original pattern will be lost. This all happens without exploitation or coercion. The only way to regain the pattern is for the state to interfere with these voluntary transactions and coercively redistribute the resources. But that is objectionable for two reasons. First, since the deviation from the initial pattern was entirely voluntary, nobody has a valid objection to the second pattern. It wrongs no one, since every transaction that changed the pattern was consensual. Second, coercive redistribution to retain the original pattern must violate property rights. In the initial distribution, which we stipulate was just, each had a right to their holdings. Through voluntary transfers, the new pattern was generated. But if the transactions were voluntary, the new owners of these resources are as entitled to them as the original owners were. The original pattern was just and therefore it is neither required nor permissible for the state to redistribute anything. Egalitarian redistribution enforced by the state must violate property rights. No program can pursue substantive distributive justice through redistribution, because such redistribution is unjust.

This anti-egalitarianism is crucial for Nozick’s understanding of the descriptive thesis: individual rights, including the right to own and transfer property, constitute our equality. Those rights preclude systems of imposing, retaining, or regaining a specific distributive pattern. His understanding of equality is incompatible with egalitarianism. Nozick concludes that we should understand distributive justice in formal and historical terms, not in terms of patterning. He then argues for a set of historical principles governing the original acquisition and subsequent transfer of property. Nozick affirms that persons are equal, but this means that each person has equally strong property rights. The descriptive thesis on this view entails a denial of egalitarianism. Egalitarianism can only be pursued by violating the property rights that follow from our equality.

a. Sufficiency vs. Equality

There is also a sufficiency objection to strictly equalizing views. Frankfurt objects that

The mistaken belief that economic equality is important in itself leads people to detach the problem of formulating their economic ambitions from the problem of understanding what is most fundamentally significant to them. It influences them to take too seriously, as though it were a matter of great moral concern, a question that is inherently rather insignificant and not directly to the point, namely, how their economic status compares with the economic status of others. In this way the doctrine of equality contributes to the moral disorientation and shallowness of our time. (Frankfurt 1987)

A person focused on strict egalitarianism evaluates their own life and holdings based on something impersonal and independent of the particular features of their own life and personal needs. On Frankfurt’s view, egalitarianism is therefore harmful.

However, the egalitarian impulse is really based in something that is of moral importance—the principle that all persons should have a sufficient level of well-being. On Frankfurt’s view, people become egalitarians on the basis of compelling reasons, but those reasons have to do solely with sufficiency, not equality.

It seems clear that egalitarianism and the doctrine of sufficiency are logically independent: considerations that support the one cannot be presumed to provide support also for the other. Yet proponents of egalitarianism frequently suppose that they have offered grounds for their position when in fact what they have offered is pertinent as support only for the doctrine of sufficiency. Thus they often, in attempting to gain acceptance for egalitarianism, call attention to disparities between the conditions of life characteristic of the rich and those characteristic of the poor. (Frankfurt 1987)

The case for egalitarianism is usually only a case against poverty.

The fundamental error of egalitarianism lies in supposing that it is morally important whether one person has less than another regardless of how much either of them has. […] The economic comparison implies nothing concerning whether either of the people compared has any morally important unsatisfied needs at all nor concerning whether either is content with what he has. (Frankfurt 1987)

Defenders of equality must show that substantive distributive justice is not captured by concerns over sufficiency alone. We will use Scanlon as a representative example. (See also Parfit and O’Neill for discussions of equality as opposed to sufficiency.) Scanlon offers five sorts of reasons to be concerned with equality and not merely sufficiency.

1. Some inequalities create humiliating differences in status. One could object that sufficiency is whatever level required to avoid humiliation and shame. However, the level is sensitive to differences between the better off and worse off rather than being determined by objective or unchanging standards. This means that, contra Frankfurt, we are intrinsically concerned with differences between people, not just that everyone meets some sufficient benchmark.
2. Inequalities can give those who have more an unjust amount of power over others.
3. Social institutions are only fair if there is equality of starting places in society. Inequality can undermine procedural fairness. We can see this in economic competition, inequality of opportunity, and political influence.
4. Inequalities can be objectionable when they involve failure to treat equally those who have a claim to equal benefit. Just because everyone has a sufficient level of some service or resource provided by the state does not mean that unequal allocation is just.
5. Inequality can violate the claims of citizens to benefit from the fruits of social cooperation. This is how Scanlon reads Rawls as egalitarian. The participants in the original position are equal participants. The presumption is that they have an equal claim to the benefits of social cooperation. This is why equality is the benchmark from which inequalities are judged, and only those that benefit everyone are permissible. The primary goods are produced by social cooperation and, contra Nozick, the baseline or benchmark is that every equal citizen has an equal claim to those benefits.

(For more on the debate between equality, sufficiency, and giving priority to the worst off, see the references for Nagel, Parfit, and Scanlon. For elucidating commentary on Scanlon, see Wolff 2013.)

5. Domestic or Global?

Many egalitarians hold a stronger domestic than global view. Redistributive priority is given to fellow citizens over persons in other nations. This, on its face, seems inconsistent or unwarranted. If one is committed to equality, what difference could national borders make? Is it just for a state to prioritize domestic distributive justice over global distributive justice? As a pure matter of luck egalitarianism, the state into which one is born is a paradigm example of brute luck. Having one’s life prospects be determined by nation of origin seems as morally arbitrary as having one’s life prospects determined by parentage. The arbitrariness of nationality combined with the universality of the descriptive thesis (all persons are equal) creates tension with domestic prioritization. On the other hand, if redistributive justice deals with the allocation of goods produced by the cooperation of citizens, then perhaps there is a justification for prioritizing domestic over international redistribution. The amount of redistribution required to address global inequality may depend on the nature of the goods to be allocated as well as the degree of entanglement among the world’s various states.

Consider an efficiency argument against global egalitarianism. One may be an egalitarian yet argue for domestic priority based on increased costs of sending aid to distant locations, difficulty with managing the efficient distribution on the other end, or epistemic advantages of dealing with local rather than remote issues. Peter Singer argues against the efficiency rationale. Changes in modern transportation, financial systems, and information technology have lessened most of the inefficiencies in aiding far away persons. Singer’s argument is not about egalitarianism per se, but about preventing what all reasonable people can agree are objectively bad states of affairs: famine, starvation, epidemics, and so forth. So on the one hand, it is not egalitarian in the sense of an equal distribution of some metric, but rather egalitarian in the sense of doing away with suffering at the bottom rungs of the global society.

Singer’s view is a useful example of moral obligations being global. If moral obligations can be global, then perhaps so too can egalitarianism. Proximity is arbitrary in his analysis: someone suffering nearby is no more morally relevant than someone suffering far away. Given the magnitude of global suffering, there is an egalitarian element to his utilitarian calculus. So long as these objectively bad states of affairs are occurring, first world people are obligated to work to prevent them. This will flatten global inequality. His view can be taken in two ways: a strict reading requires sacrifice to the point of marginal utility with the globe’s worst off, while a weaker (though still radical) reading requires significant sacrifice. However, Singer constrains both readings with a utilitarian productivity argument: the first world may need some excess consumer culture (that in the short term contradicts our obligations to the worst off) to keep the economy at a level where it can make the maximum contribution to the plight of the globe’s worst off.

Singer therefore takes the descriptive thesis to require radical, obligatory sacrifice on the part of citizens in first world countries. Given the amount of objectively bad states of affairs in the world, those who are comparatively well off are obligated to reallocate resources to the worst off.

Onora O’Neill gives another example of global moral obligation. She argues for a right not to be killed unjustly. Global resource inequalities amount to de facto killings. They are unjust, since they can be avoided at reasonable cost. There is no obligation to equalize anything globally, but there is an obligation to avoid violating the global poor’s right not to be killed unjustly. She gives an argument by analogy that highlights the tension between property entitlements and distributive justice. In a lifeboat scenario, one who has excess water and food but withholds it from others, who will die without it, violates their right not to be killed unjustly. Property entitlements vary in strength in different contexts. She then argues that the planet is no different from a lifeboat, so that those dying from poverty and famine have their right not to be killed unjustly violated. This argument hinges on the contextual variability of property rights and the relative strength of the right not to be killed over property rights. The right not to be killed trumps property rights, so the redistribution required to avoid these killings is obligatory. Unlike Singer’s argument, this does not generalize to an obligation to prevent all objectively bad happenings globally. First world citizens are only obligated to do what is required to secure everyone’s right not to be killed unjustly. Yet this is radical, too: her conception of agency means that those complicit in first world economies are killing the globe’s worst off. Redistribution is the means to avoid these killings.

Given these types of arguments for global moral obligations, what can be said in favor of domestic priority in egalitarian redistribution? If distributive equality is a matter of justice, should redistribution be global? As in the discussion of anti-egalitarianism, one obvious objection is to deny that the descriptive thesis holds globally. Denying the equality of all the globe’s people is not philosophically interesting. A stronger argument is that the demands of egalitarian justice are tied up with institutions and practices that are not global. If matters of distributive justice have to do with coercive redistribution, then perhaps only persons living within the same state fall under egalitarian requirements. If so, global distributive justice would only apply if there were genuinely powerful and coercive global institutions. Egalitarian obligations only arise within a coercive political structure. That the state holds coercive power over the citizens means that they should each be treated equally and, perhaps, that the state should engage in redistribution to pursue equality of holdings. Various forms of this view appeal to different features of the state. A similar argument is that redistributive justice has to do with allocating the resources made possible through social cooperation. If so, then the bonds of citizenship matter to distributive justice, and we should treat domestic and international inequality differently.

Another domestic-priority view is that egalitarian norms arise among people who share political bonds and obligations, and those attachments are local rather than global. These sorts of objections are not unconditionally opposed to global egalitarianism; they rather object that egalitarianism is tied to certain relationships and institutions that currently are not global. Some egalitarians counter that the amount of global engagement, cooperation, and institutional entanglement does generate global egalitarian obligations. (For example, see Pogge 1989.)

Richard Miller gives a consequentialist argument for domestic prioritization. Too much redistribution directed outside of a particular state can have a destabilizing impact. Even if that state is well off compared to others, as long as it has inequality in its own economy, then those on the internal bottom rungs may become alienated if resources are taken out of their economic system and sent to another country whose most deprived citizens are even worse off. The worst-off citizens within the relatively wealthier state are participating in a scheme of social cooperation that benefits the well off, their state engages in egalitarian redistribution, but the redistributive scheme prioritizes the needs of the worst off in other countries. It seems as though this scheme provides benefits to all but the domestic worst off. This can undermine their commitment both to productive labor and to respect for the rule of law. This in turn harms the state, makes it less stable and productive, and therefore makes it less able to generate external aid.

Miller also attempts to transcend the patriotism-cosmopolitanism dispute by universalizing patriotic priority. For the vast majority of people, certain universal human goods are only satisfied in local political communities. (The exceptions are a minuscule number of global elites.) Our need for social interaction and political community is satisfied locally, as we do not share rich attachments with persons across the globe. This changes the inherently arbitrary nature of the state into which one was born into something morally relevant. This is not a rejection of all global redistribution, but an attempt to break from the view that patriotic priority and helping the globe’s worst off are polar opposites. Combined with the previous consequentialist argument, this means that in order to secure these universal human goods, we need individual states, and within each state we need patriotic priority in redistributive justice. Each person needs these goods categorically, they can only be provided locally, and they are threatened when the redistributive scheme within a given state does not exhibit patriotic priority. This is all compatible with the descriptive thesis applying globally. On this view the descriptive thesis only requires that we are not insensitive to the suffering of others. We do have global obligations to assist others, but this does not mean that the demands of distributive justice are all global.

A commitment to global equality requires radical, perhaps unrealistic sacrifice. That can be taken as reason to reject global egalitarianism: persons cannot reasonably be expected to bring about global equality. However, normative principles specify what we ought to do, not what we are comfortable doing. What we ought to do might require a complete change to our way of life.

6. References and Further Reading

Anderson, Elizabeth S. 1999. “What Is the Point of Equality?” Ethics 109 (2): 287–337.
- (An attack on contemporary egalitarian theory in general and luck egalitarianism in particular. Provides a defense of democratic equality.)
Anderson, Elizabeth S. 2010. “The Fundamental Disagreement between Luck Egalitarians and Relational Egalitarians.” Canadian Journal of Philosophy 40, no. sup1: 1–23.
Arneson, Richard J. 1989. “Equality and Equal Opportunity for Welfare.” Philosophical Studies 56 (1): 77–93.
Arneson, Richard J. 1990. “Liberalism, Distributive Subjectivism, and Equal Opportunity for Welfare.” Philosophy & Public Affairs, 19: 158–94.
Arneson, Richard J. 2000. “Luck Egalitarianism and Prioritarianism.” Ethics 110 (2): 339–49.
Arneson, Richard J. 2004. “Luck Egalitarianism Interpreted and Defended.” Philosophical Topics 32: 1–20.
- (Important defense of luck egalitarianism.)
Barry, Nicholas. 2006. “Defending Luck Egalitarianism.” Journal of Applied Philosophy 23 (1): 89–107.
Blake, Michael. 2001. “Distributive Justice, State Coercion, and Autonomy.” Philosophy & Public Affairs 30 (3): 257–96.
- (Egalitarian obligations hold within particular states, not globally.)
Cavanagh, Matt. 2002. Against Equality of Opportunity. Oxford: Clarendon Press.
Cohen, Gerald A. 1989. “On the Currency of Egalitarian Justice.” Ethics 99 (4): 906–44.
Cohen, Gerald A. 2009. Rescuing Justice and Equality. Cambridge: Harvard University Press.
Cohen, Joshua. 1989. “Democratic Equality.” Ethics 99 (4): 727–51.
Dworkin, Ronald. 1981a. “What is Equality? Part 1: Equality of Welfare.” Philosophy & Public Affairs 10 (3): 185–246.
Dworkin, Ronald. 1981b. “What is Equality? Part 2: Equality of Resources.” Philosophy & Public Affairs 10 (4): 283–345.
- (Useful discussion of different metrics of equality.)
Dworkin, Ronald. 2002. Sovereign Virtue: The Theory and Practice of Equality. Cambridge: Harvard University Press.
Elster, Jon. 1985. Sour Grapes: Studies in the Subversion of Rationality. Cambridge: Cambridge University Press.
Feinberg, Joel. 1974. “Non-Comparative Justice.” Philosophical Review 83 (3): 297–358.
Fleurbaey, Marc. 1995. “Equal Opportunity or Equal Social Outcome?” Economics and Philosophy 11 (1): 25–55.
Frankfurt, Harry. 1987. “Equality As a Moral Ideal.” Ethics 98 (1): 21–43.
Freeman, Samuel. 2006. “Distributive Justice and the Law of Peoples.” In Rawls’s Law of Peoples: A Realistic Utopia?, edited by Rex Martin and David Reidy, 243–60. Oxford: Blackwell Publishing.
Harsanyi, John C. 1982. “Morality and the Theory of Rational Behavior.” In Utilitarianism and Beyond, edited by Amartya Sen and Bernard Williams, 39–62. Cambridge: Cambridge University Press.
Hurley, Susan. 2003. Justice, Luck, and Knowledge. Oxford: Oxford University Press.
Kagan, Shelly. 1999. “Equality and Desert.” In What Do We Deserve: A Reader on Justice and Desert, edited by Louis P. Pojman and Owen McLeod, 298–314. New York.
Kagan, Shelly. 2012. The Geometry of Desert. New York: Oxford University Press.
- (Desert-based conception of distributive justice.)
Knight, Carl. 2009. Luck Egalitarianism: Equality, Responsibility, and Justice. Edinburgh: Edinburgh University Press.
Knight, Carl, and Zofia Stemplowska, eds. 2011. Responsibility and Distributive Justice. New York: Oxford University Press.
Knight, Carl. 2013. “Luck Egalitarianism.” Philosophy Compass 8 (10): 924–34.
Kymlicka, Will. 1990. Contemporary Political Philosophy. Oxford: Clarendon Press.
Lake, Christopher. 2001. Equality and Responsibility. New York: Oxford University Press.
Miller, Richard W. 1998. “Cosmopolitan Respect and Patriotic Concern.” Philosophy & Public Affairs 27 (3): 202–24.
- (A defense of domestic prioritization in redistribution.)
Nagel, Thomas. 1991. Equality and Partiality. Oxford: Oxford University Press.
Nagel, Thomas. 2005. “The Problem of Global Justice.” Philosophy & Public Affairs 33 (2): 113–47.
- (Nagel argues that obligations of egalitarian justice only extend as far as a scheme of enforcement, which typically extends only throughout a particular state.)
Nagel, Thomas. 2012. “Equality.” In Mortal Questions, 106–27. New York: Cambridge University Press.
Nozick, Robert. 1974. Anarchy, State, and Utopia. New York: Basic Books.
- (A libertarian affirmation of the equality of all persons and rejection of redistribution aiming at greater equality.)
Nussbaum, Martha C. 1999. Sex and Social Justice. New York: Oxford University Press.
Nussbaum, Martha C. 2001a. “Symposium on Amartya Sen’s Philosophy: 5 Adaptive Preferences and Women’s Options.” Economics and Philosophy 17 (1): 67–88.
Nussbaum, Martha C. 2001b. Women and Human Development: The Capabilities Approach. New York: Cambridge University Press.
- (An influential development of the capabilities approach.)
O’Neill, Onora. 1975. “Lifeboat Earth.” Philosophy & Public Affairs 4 (3): 273–92.
Otsuka, Michael. 1998. “Self‐Ownership and Equality: A Lockean Reconciliation.” Philosophy & Public Affairs 27 (1): 65–92.
Otsuka, Michael. 2003. Libertarianism Without Inequality. Oxford: Oxford University Press.
- (A reconciliation of libertarianism and substantive distributive justice.)
Parfit, Derek. 1995. Equality or Priority. First presented at the Lindley Lectures, November 21 1991. Lawrence: University of Kansas.
Parfit, Derek. 1997. “Equality and Priority.” Ratio 10 (3): 202–21.
Piketty, Thomas. 2014. Capital in the Twenty-First Century. Translated by Arthur Goldhammer. New York: Belknap Press.
Pogge, Thomas W. 1989. Realizing Rawls. Ithaca, New York: Cornell University Press.
Pogge, Thomas W. 1994. “An Egalitarian Law of Peoples.” Philosophy & Public Affairs 23 (3): 195–224.
Rakowski, Eric. 1991. Equal Justice. New York: Oxford University Press.
Rawls, John. 2009. A Theory of Justice. Cambridge: Harvard University Press.
Raz, Joseph. 1986. The Morality of Freedom. Oxford: Oxford University Press.
Roemer, John E. 2009. Equality of Opportunity. Cambridge: Harvard University Press.
- (Sophisticated account of luck egalitarian equality of opportunity that focuses on effort.)
Sangiovanni, Andrea. 2007. “Global Justice, Reciprocity, and the State.” Philosophy & Public Affairs 35 (1): 3–39.
- (Argues that egalitarian obligations only arise within particular political communities.)
Scanlon, Thomas M. 1975. “Preference and Urgency.” The Journal of Philosophy 72 (19): 655–69.
Scanlon, Thomas. 1996. “The Diversity of Objections to Inequality.” In The Difficulty of Tolerance: Essays in Political Philosophy, 202–18. Cambridge: Cambridge University Press.
Scanlon, Thomas. 1998. What We Owe to Each Other. Cambridge: Harvard University Press.
Scheffler, Samuel. 2003. “What is Egalitarianism?” Philosophy & Public Affairs 31 (1): 5–39.
Scheffler, Samuel. 2005. “Choice, Circumstance, and the Value of Equality.” Politics, Philosophy & Economics 4 (1): 5–28.
Segall, Shlomi. 2007. “In Solidarity with the Imprudent: A Defense of Luck Egalitarianism.” Social Theory and Practice 33 (2): 177–98.
Sen, Amartya, and Bernard Williams, eds. 1982. Utilitarianism and Beyond. Cambridge: Cambridge University Press.
Sen, Amartya. 1980. “Equality of What?” In The Tanner Lectures on Human Values, vol. 1, edited by S. McMurrin, 195–220. Salt Lake City: University of Utah Press.
Sen, Amartya. 1992. Inequality Reexamined. Oxford: Oxford University Press.
Sher, George. 1979. “Effort, Ability, and Personal Desert.” Philosophy & Public Affairs 8 (4): 361–76.
Tan, Kok-Chor. 2008. “A Defense of Luck Egalitarianism.” The Journal of Philosophy 105 (11): 665–90.
Temkin, Larry S. 1993. Inequality. New York: Oxford University Press.
Van Parijs, Philippe. 1995. Real Freedom for All: What (If Anything) Can Justify Capitalism? Oxford: Oxford University Press.
Walzer, Michael. 1983. Spheres of Justice: A Defense of Pluralism and Equality. New York: Basic Books.
Williams, Bernard. 1962. “The Idea of Equality.” In Philosophy, Politics and Society: Second Series, edited by P. Laslett and W.G. Runciman. Oxford: Blackwell.
Wolff, Jonathan. 1998. “Fairness, Respect, and the Egalitarian Ethos.” Philosophy & Public Affairs 27 (2): 97–122.
Wolff, Jonathan. 2013. “Scanlon on Social and Material Inequality.” Journal of Moral Philosophy 10 (4): 406–25.

Author Information

Ryan Long
Email: longr@philau.edu
Philadelphia University
U. S. A.

Jürgen Habermas (1929—)

Jürgen Habermas produced a large body of work over more than five decades. His early work was devoted to the public sphere, to modernization, and to critiques of trends in philosophy and politics. He then slowly began to articulate theories of rationality, meaning, and truth. His two-volume Theory of Communicative Action in 1981 revised and systematized many of these ideas, and inaugurated his mature thought. Afterward, he turned his attention to ethics and democratic theory. He linked theory and practice by engaging work in other disciplines and speaking as a public intellectual. Given the wide scope of his work, it is useful to identify a few enduring themes.

Habermas represents the second generation of Frankfurt School Critical Theory. His mature work started a “communicative turn” in Critical Theory. This turn contrasted with the approaches of his mentors, Max Horkheimer and Theodor W. Adorno, who were among the founders of Critical Theory. Habermas sees this turn as a paradigm shift away from many assumptions within traditional ontological approaches of ancient philosophy as well as what he calls the “philosophy of the subject” that characterized the early modern period. He has instead tried to build a “post-metaphysical” and linguistically oriented approach to philosophical research.

Another contrast with early Critical Theory is that Habermas defends the “unfinished” emancipatory project of the Enlightenment against various critiques. One such critique arose when the moral catastrophe of WWII shattered hopes that modernity’s increasing rationalization and technological innovation would yield human emancipation. Habermas argued that a picture of Enlightenment rationality wedded to domination only arises if we conflate instrumental rationality with rationality as such—if technical control is mistaken for the entirety of communication. He subsequently developed an account of “communicative rationality” oriented around achieving mutual understandings rather than simply success or authenticity.

Another enduring theme in Habermas’ work is his defense of “post-national” structures of political self-determination and transnational governance against more traditional models of the nation-state. He sees traditional notions of national identity as declining in importance, and he sees the world as facing problems of interdependency that can no longer be addressed at the national level. Instead of national identity centered on shared historical traditions, ethnic belonging, or national culture, he advocates a “constitutional patriotism” where political commitment, collective identity, and allegiance coalesce around the shared principles and procedures of a liberal democratic constitutionalism facilitating public discourse and self-determination. Habermas also claims that emerging structures of international law and transnational governance represent generally positive achievements moving the global political order in a cosmopolitan direction that better protects human rights and fosters the spread of democratic norms. He sees the emergence of the European Union as paradigmatic in this regard. However, his cosmopolitanism should not be overstated. He does not advocate global democracy in any strong sense, and he is committed to the idea that democratic self-determination requires a measure of localized mutual identification in the form of civic solidarity: a legally mediated solidarity built around shared history and institutions and rooted in some shared “ethical” pattern of life (see Sittlichkeit discussion below) that fosters mutual understandings.

Biography: Early Life to Structural Transformation
Enduring Themes in Formative and Transitional Work
1. Public Deliberation Over Positivist Decisionism and Technocracy
2. From Philosophical Anthropology to a Theory of Social Evolution
The Linguistic Turn into the Theory of Communicative Action
Discourse Ethics
Political and Legal Theory
References and Further Reading

1. Biography: Early Life to Structural Transformation

Habermas was born in 1929 in Düsseldorf, Germany. He has noted that early corrective surgeries for a cleft palate sensitized him to human vulnerability and interdependence, and that subsequent childhood struggles with fluid verbal communication may partly explain his theoretical interest in communication and mutual recognition. He has also cited the end of WWII and frustrations over postwar Germany’s uneven willingness to fully break with its past as key personal experiences that inform his political theory.

Habermas belongs to what historians call the “Flakhelfer generation” or the “forty-fivers.” Flakhelfer means antiaircraft-assistant. At the end of the war, people born between 1926 and 1929 were drafted and sent to help man antiaircraft artillery defenses. Over a million youth served as such personnel. The second “forty-fiver” label captures how this generation came of age with the 1945 Nazi defeat. These experiences fostered a political skepticism and vigilance born out of having been exploited, and an affinity for the nascent liberal democratic principles of postwar Germany. Both labels capture formative features of Habermas’ biography (Specter 2010, Matustik 2001).

Reflecting on his upbringing during the war, Habermas describes his family as having passively adapted to the Nazi regime, neither identifying with nor opposing it. He was recruited into the Hitler Youth in 1944 and sent to man defenses on the western front shortly before the war ended. Soon thereafter he learned of the Nazi atrocities through radio broadcasts of the Nuremberg trials and concentration camp documentaries at local theaters. Such experiences left a deep impact: “all at once we saw that we had been living in a politically criminal system” (AS 77, 43, 231).

After the war, he studied philosophy at the universities of Göttingen (1949-50), Zurich (1950-51), and Bonn (1951-54). He wrote his thesis on Schelling under the direction of Erich Rothacker and Oskar Becker. He was increasingly frustrated with the unwillingness of German politicians and academics to own up to their role in the war. He was disappointed in the postwar government’s failure to make a fresh political start and distressed by continuities with the past. In interviews, he has recalled leaving a campaign rally in 1949 after being disgusted by the far-right connotations of the flags and songs used. He was similarly disappointed by German academics. At university he studied the work of Arnold Gehlen and Martin Heidegger extensively, but their prior Nazi ties were not discussed openly. In 1953 Heidegger reissued his 1935 Lectures on Metaphysics in a largely unedited form that included reference to the “inner truth and greatness of the Nazi movement.” Habermas published an op-ed challenging Heidegger, and the lack of response seemed to confirm his suspicions (NC, 140-172). He wrote a piece critiquing Gehlen a few years later (1956). Around the same time he was distressed to learn that Rothacker and Becker had also been active Nazi party members.

Near the end of his studies Habermas worked as a freelance journalist and published essays in the intellectual journal Merkur. He took an interest in the interdisciplinary Institute for Social Research affiliated with the University of Frankfurt. The Institute had returned from wartime exile in 1950, and Adorno became director in 1955. Adorno was familiar with Habermas’ essays and took him on as a research assistant. While at the Institute Habermas studied philosophy and sociology, worked on research projects, and continued to publish op-ed pieces. One such piece, Marx and Marxism, struck Horkheimer as too radical. Horkheimer wrote to Adorno suggesting he dismiss Habermas from the Institute. The following year Horkheimer rejected Habermas’ Habilitationsschrift proposal on the public sphere. Habermas did not want to alter his project, so he completed it at the University of Marburg under the Marxist political scientist Wolfgang Abendroth.

His Habilitationsschrift, The Structural Transformation of the Public Sphere (German 1962, English 1989), was well received in Germany. It chronicled the rise of the bourgeois public sphere in 18th and 19th century Europe, as well as its decline amidst the mass consumer capitalism of the 20th century. Habermas gave an account of the way in which newspapers, coffee shops, literary journals, pubs, public meetings, parliament and other public forums facilitated the emergence of powerful new social norms of discourse and debate that mediated between private interests and the public good. These forums functioned as mechanisms to disseminate information and help freely form the public political will needed for collective self-determination. These norms also partly embodied important principles like equality, solidarity, and liberty. By the late 19th century, however, capitalism was increasingly monopolistic. Large corporations easily influenced the state and society. Economic elites could use ownership of the media and other (previously public) forums to manipulate or manufacture public opinion and buy off politicians. Citizens deliberating about the common good were transformed into atomized consumers pursuing private interests. Habermas describes this as the “re-feudalization” of the public sphere. While his narrative was pessimistic, the end of Structural Transformation seems to hold out hope that the truncated normative potential of the public sphere may yet be revived. The work solidified Habermas’ place in the German academy. After a short stint in Heidelberg, he returned to the University of Frankfurt in 1964 as a professor of philosophy and sociology, taking over the chair vacated by Horkheimer’s retirement.

In the spirit of his early call for renewed public sphere debate, Habermas has consistently engaged political movements as a public intellectual and taken part in various scholarly debates. This has not always been easy. After returning to Frankfurt he had been a mentor for the German student movement, but had a falling out with student radicals in 1967. In June of that year a variety of simmering protests—over the restructuring of German universities, proposed “emergency laws,” the Vietnam War, and other issues—boiled over. The breaking point was when a student at a protest against the Shah of Iran was shot and fatally beaten by plainclothes police, who then tried to cover up the incident. This stoked the flames of student protests. Sit-ins and protests crippled everyday life. Under the leadership of Rudi Dutschke students occupied the Free University of Berlin.

Habermas worried that protest leaders seemed to be advocating an unsophisticated and extra-legal opposition to any and all authority that could easily lead to violence. At a conference in Hannover shortly after the shooting he publicly reproached Dutschke by calling his model of extra-legal direct action “left fascism.” That charge alienated Habermas from the leftist student movement and inspired an essay collection Die Linke antwortet Jürgen Habermas (The Left Answers Habermas, German 1969). Rapprochement would only come a decade later when, in the aftermath of a series of killings by the radical left-wing Red Army Faction, politicians on the right tried to garner political capital by suggesting that such terrorism was rooted in the ideas of Frankfurt School Critical Theory. Habermas and Dutschke published pieces repudiating the accusation. A decade later, the editor of the essay collection apologized for how the book made it seem like Habermas’ falling out with the student movement marked a conservative turn that meant he was no longer part of the left.

As a public intellectual, Habermas has engaged a variety of topics: the anti-nuclear movement of the late fifties, the “Euromissile” debate of the early eighties and, in the early two-thousands, both the terrorism of 9/11 and the second Iraq War. In the second half of the eighties he was also a key voice in the Historikerstreit debate between historians, philosophers, and other academics about the proper way for Germany to situate and remember the Holocaust amidst the history of other atrocities. In 1989 he made important contributions to public debate about the reunification of Germany. While Habermas was not against reunification, he was critical of the speed and manner in which reunification was carried out. More recently, he has approached public debate on the European Union along the broadly similar lines of a cautious optimism that is also on guard against a forced, rushed, or duped false unity that would lack legitimacy and stability over the long term.

In a more academic vein, he has had numerous exchanges with thinkers like Jacques Derrida, Richard Rorty, Hans-Georg Gadamer, Niklas Luhmann, John Rawls, Robert Brandom, Hilary Putnam, and Cardinal Joseph Ratzinger (before he was Pope Benedict XVI). His ongoing debate with postmodernism is arguably the most enduring line of debate. Broadly speaking, thinkers like Michel Foucault, Jacques Derrida, and Richard Rorty have leveled criticisms to the effect that reason is little more than a historically and culturally contingent social form, that notions of universally valid morality and truth are ethnocentric projections of power, that interests shaped by radically different ways of life are irreconcilable, and that our belief in the emancipatory moral progress of humankind is a myth. Habermas has tried to meet such challenges in much the same way as he responded to Horkheimer and Adorno’s Dialectic of Enlightenment: by relying on his account of communicative rationality in Theory of Communicative Action. However, before turning to that more mature theory, we must survey a few major phases of his formative and transitional work.

2. Enduring Themes in Formative and Transitional Work

a. Public Deliberation Over Positivist Decisionism and Technocracy

The essays in Toward a Rational Society (German 1968 and 1969, English 1970) and Theory and Practice (German 1971, English 1973b) were written on the heels of Structural Transformation. They were written amidst the “positivism dispute” in Germany about the relation between the natural and social sciences. The (somewhat inaccurately labeled) “positivist” side of this debate took scientific inquiry as the sole paradigm of knowledge and generally thought of the social sciences as analogous to the natural sciences. Following Adorno, Habermas argued against a positivistic understanding of the social sciences.

For Habermas, positivism consists of three claims: (1) knowledge consists of causal explanations cast in terms of basic laws or principles (for example, laws of nature), (2) knowledge passively reflects or mirrors independently existing natural facts, (3) knowledge is about what is, not what ought to be. He calls these claims scientism, objectivism, and value-neutrality. He argues that each can be pernicious, especially in the social scientific realm. Scientism fosters the view that only causal and empirically verifiable hypotheses can count as true knowledge. Objectivism seems to falsely naturalize the world by ignoring how lived experiences, human subjectivity, and interests can structure the object domain that gets identified as relevant or worthy of study. Lastly, value-neutrality misleads us into thinking that the role of knowledge is purely descriptive and technical. Values or preferences are seen as separate from knowledge and, as such, wholly subjective “givens” lying beyond rational justification. In turn, knowledge is seen as a tool for efficiently controlling the environment so as to realize whatever values an agent happens to hold. Ironically, this fails to see the tacit value commitments already inscribed in this general paradigm of knowledge.

Habermas’ critique makes sense given his place in Frankfurt Critical Theory. Despite differences with the first generation, he shares the decidedly non-neutral commitments to human emancipation, interdisciplinarity, and self-reflexive theory. Like Horkheimer and Adorno, Habermas worried the prior ascendancy of positivism had left influences on our conceptualizations of knowledge and social inquiry that were hard for even reflective positivists to leave behind. Indeed, he critiques Karl R. Popper’s account of inquiry and knowledge even though it rejects what Habermas calls objectivism. In opposition to a positivist picture of knowledge merely mirroring the world, Habermas holds the Frankfurt School’s Hegelian-Marxist-inspired conception of a dialectical relation between knowledge and world. Finally, like his Frankfurt School contemporaries, Habermas was concerned that positivism had left subtle yet pernicious impacts on politics.

In early writings Habermas is especially critical of two related trends, decisionism and technocracy, that stem from a positivistic understanding of political science and practice. Decisionism starts from the assumption that there is no such thing as the public interest, but rather a clash of inherently subjective values that do not (even in principle) admit of rational persuasion or agreement. It follows that political elites must either simply decide between competing values or base policy on their aggregation. Either way, political value preferences are taken as brute or static facts; there is no sense in which reasoned argumentation and persuasion could genuinely transform such preferences or lead people to a new understanding of their values. Technocracy builds from this point by emphasizing the “objective necessities” (Sachzwänge) supposedly involved in a political system—economic growth, social stability, national security—and highlighting the increasing ability of policy experts to advise political leaders about strategies for optimally realizing these goals. The worry with this approach is that questions about what specific type of growth, stability, and security we seek (and why) are removed from debate by definitional fiat. In decisionism, political legitimacy flows from periodic expressions of acclamation or disapproval at the way leaders have manifested predefined values. In technocracy, legitimacy supposedly flows from the ability of politicians to find and follow expert advice so as to attain fixed outcomes pre-defined by “objective necessities.” Both models render the potentially transformative effects of public deliberation superfluous. Legitimacy is seen as flowing from either certain outcomes or periodic expressions of aggregate preference.

Habermas thinks both models are extremely problematic accounts of democratic political practice and legitimacy. While Structural Transformation only gestured at how the normative potential of the public sphere could be reinvigorated in contemporary circumstances, this theme received increasing attention in works such as Legitimation Crisis (German 1973, English 1975), Theory of Communicative Action, and Between Facts and Norms (German 1992, English 1996). An account of democratic legitimacy that combats decisionism and technocracy is an enduring concern. Indeed, despite championing the European Union, he has continued to critique technocracy by criticizing the way in which the Union has arisen and is currently structured (2008, 2009, 2012, 2014).

b. From Philosophical Anthropology to a Theory of Social Evolution

Knowledge and Human Interests (German 1968, English 1971) and Communication and the Evolution of Society (German 1976, English 1979) are two early attempts at a new systematic framework for Critical Theory. The approaches he uses are akin to the tradition of “philosophical anthropology” in the German social theory of the early 1900s that grew out of phenomenology—a tradition that is quite different from contemporary anthropology. Knowledge and Human Interests sought to overcome positivist epistemology that saw knowledge as simply discerning static facts, and to give a plausible account of the dialectical relation between knowledge (theory) and world (practice). Habermas’ main claim was that the knowledge of scientific and social progress is tacitly guided by three types of “knowledge constitutive interests”—technical, practical, and emancipatory—that are “anthropologically deep-seated” in the human species.

Knowledge and Human Interests tries to recover and develop alternative models of the relation between theory and practice. The approach is historical and reconstructive in that it interprets the attempts of prior theorists as part of a trajectory that Habermas wants to extend. He reviews prior reformulations of Kant’s “transcendental synthesis” (the form-legislating activity making objective experience possible) and his “transcendental unity of apperception” (the unity of the subject having such experience). He also tries to articulate the way in which Hegel relocated such synthesis in the historical development of human subjectivity (absolute spirit) and how Marx relocated it in the material use of tools and techniques (embodied labor). Habermas wants to add to such a trajectory by rehabilitating their shared insight that the constitution of experience is not generated by transcendental operations but by the worldly natural activities of the human species. Yet he wants to do this in a way that avoids the mistakes of Marx and Hegel as well. He tries to do this by building on his interpretation of Hegel, which was already concisely captured by his essay Science and Technology as Ideology (German 1968, included in English 1970).

In that essay he responded to Herbert Marcuse’s claim that the technical reason of science inherently embodies domination. According to Marcuse, under late capitalism the technical reason of science functions ideologically to collapse intersubjective practical questions about how we want to live together into technical questions about how to control the world to get what we want. Habermas shares Marcuse’s concerns, as his criticism of technocracy makes clear. Yet he thinks this dynamic is contingent because, taken as an emergent collective project, humankind constitutes how the world shows up in experience through its worldly activity. More specifically, Habermas identifies two irreducibly distinct and dialectically related modes of human self-formation, “labor” and “interaction.” Whereas labor is an action type that aims at technical control to achieve success, interaction is an action type that aims at mutual understandings embodied in consensual norms. Marcuse’s claim (and his remedy of a “new science”) would only stand if the “interaction” of intersubjective collective political choice—including the question of how we use technology—was somehow subsumed or rendered superfluous by the “labor” of technological progress in controlling the external world. But, given Habermas’ views in this period, this is impossible. Interaction and labor seem to be pitched as irreducible and invariant categories of human experience. Neither can be dropped nor can one be subsumed in the other—even if their relation becomes unbalanced.

In Knowledge and Human Interests, this division between labor and interaction is recast as the technical and practical interests of humankind. The technical interest is in the material reproduction of the species through labor on nature. Humans use tools and technologies to manage nature for material accommodation. The practical interest is in the social reproduction of human communities through intersubjective norms of culture and communication. Human social life requires members who can understand each other, share expectations, and achieve cooperation. In a sense, these interests are the “most fundamental.” Moreover, the knowledge that flows from them is supposed to slowly accrue over time in the enduring institutions of society: theoretical knowledge driven by the technical interest in controlling nature accrues in the “empirical-analytic” sciences, and normative knowledge driven by the practical interest in mutual understandings accrues in the interpretive “historical-hermeneutic” sciences.

But, going beyond Science and Technology as Ideology, in Knowledge and Human Interests Habermas adds a third “emancipatory” human interest in freedom and autonomy. The labor of material reproduction and the interaction norms of social reproduction require, in a weak sense, psychosocial mechanisms to repress or deny basic drives and impulses that would destroy material and social reproduction. For instance, labor requires delayed gratification and social interaction requires internalized notions of obligation, reciprocity, shame, guilt, and so forth. Unfortunately, psychosocial mechanisms of control are often used far more than they need to be to secure material and social reproduction. Indeed, perverse incentives to rely on such mechanisms may even arise: if the burdens and benefits of material and social reproduction processes become unfairly distributed across groups and solidified over time, then those in power may find psychosocial mechanisms useful. If women are falsely taught there are natural laws of gender relations such that the dominant patterns of marriage and domestic work that consistently disadvantage them are the best they can hope for, this is an ideological mechanism of social control. It is the limitation of freedom and autonomy for no purpose other than domination, and it “functions” through systematically distorted communication.

Habermas posits a human interest in using self-reflection and insight to combat ideologically veiled, superfluous social domination so as to realize freedom and autonomy. While there is no clearly institutionalized set of sciences where the knowledge spurred on by such an interest would accrue, Habermas points to Marx’s critique of ideology and Freud’s psychoanalytic dissolution of repression as demonstrating a cognitive viewpoint that focuses on neither (efficient) work nor (legitimate) interaction but (free) identity formation liberated from internalized systematically distorted communication. Here Habermas takes his lead from Kant’s idea that reason aims to emancipate itself from “self-incurred tutelage,” and tries to forge a link between theory (reason) and practice (in the sense of self-realization) through using critical reflection on self and society to unveil and dissolve internalized oppressive power structures that betray one’s own true interests.

Knowledge and Human Interests was envisioned as a preface for two other books that would jointly challenge the separation of theory and practice. However, the project was never finished. On the one hand, Habermas felt that vibrant critiques of positivism in the philosophy of science made the rest of the project superfluous. On the other, the work encountered heavy criticism. For starters, Habermas seems to pitch work and interaction as real action types. But, if we account for how work is communicatively structured, interaction is teleologically ordered, and how historical notions of work and interaction structure one’s sense of freedom, then it is clear these can be at best idealizations. Moreover, as even sympathetic interpreters noted, his account of an emancipatory interest seemed to blur together reflection on “general presuppositions and conditions of valid knowledge and action” with “reflection on the specific formative history of a particular individual or group” (Giddens, McCarthy, 95). Lastly, his stipulation of knowledge-constitutive interests seemed to reproduce the sort of foundationalism he wished to avoid.

Given such criticism, it may seem surprising that Communication and the Evolution of Society reconstructs Marx’s historical materialism as a theory of social evolution. This sounds foundationalist and deterministically teleological. These impressions are misleading. Around this time Habermas began presenting his work as a “research program” with tentative and fallible claims evaluable by theoretical discourses. Moreover, while he speaks of evolution, he uses the term differently than 19th-century philosophies of history (Hegel, Marx, Spencer) or later Darwinian accounts. His “social evolution” is neither a merely path-dependent accumulative directionality nor a progressive, strongly teleological realization of an ideal goal. Instead, he envisions a society’s latent potentials as tending to unfold according to an immanent developmental logic similar to the developmental logic cognitive-developmental psychologists claim maturing people normally follow. Lastly, Habermas’ theory of social evolution avoids worries about determinism by distinguishing between the logic and the mechanisms of development such that evolution is neither inevitable, linear, irreversible, nor continuous. A brief sketch of his theory follows.

Habermas characterizes human society as a system that integrates material production (work) and normative socialization (interaction) processes through linguistically coordinated action. This is qualitatively different from the static and transitive status hierarchy systems of even other “social” animals. In various human epochs the linguistic coordination of these processes crystalizes around different “organizational principles” that are the “institutional nucleus” of social integration. In the most basic societies kinship structures play this role by (to take just one possible configuration) dividing labor and specifying socialization responsibilities through sex-based roles and norms. Habermas claims this organizational principle was replaced by political order in traditional societies and the economy in liberal capitalist societies. Social evolution in general and the particular movements from one “nucleus” to the next stem from learning in material and social reproduction.

Understood as ideal types, work and interaction mark out different ways of relating to the world. On the one hand, in material production one mainly adopts an instrumental perspective that tries to control an object in conformity to one’s will. In this orientation, learning is gauged by success in controlling the world and the resultant knowledge is cognitive-technical. On the other, in social reproduction one mainly adopts a communicative perspective that tries to coordinate actions and expectations through consensually agreed upon normative standards. In this orientation learning is gauged by mutual understanding and the resultant knowledge is moral-practical. Each learning process follows its own logic. But, since the processes are integrated in the same social system, advances in either type of knowledge can yield internal tensions or incongruities. These cannot be suppressed by force or ideology for long, and eventually need to be solved by more learning or innovation. If these internal tensions are too great, they induce a crisis requiring an entirely new “institutional nucleus.”

For Habermas, the slow social learning in history is the sedimentation of iterated processes of individual learning that accumulates in social institutions. While there is no unified macro-subject that learns, social evolution is also not mere happenstance plus inertia. It is the indirect outcome of individual learning processes, and such processes unfold with a developmental logic or deep structure of learning: “the fundamental mechanism for social evolution in general is to be found in an automatic inability not to learn. Not learning but not-learning is the phenomenon that calls for explanation” (LC, 15; also see Rapic 2014, 68). Habermas posits a universal developmental logic that tends to guide individual learning and maturation in technical-instrumental and moral-practical knowledge. He discerns this logic in the complementary research of Jean Piaget in cognitive development and Lawrence Kohlberg in the development of moral judgment. As social and individual learning are linked, such underlying logic has slowly created homologies—similarities in sequence and form—between: (i.) individual ego-development and group identity, (ii.) individual ego-development and world-perspectives, and (iii.) the individual ego-development of moral judgment and the structures of law and morality (Owen 2002, 132). Habermas pays more attention to the last homology and later writings focus on Kohlberg, so it is instructive to focus there (1990b).

Kohlberg’s research on how children typically develop moral judgment yielded a schema of three levels (pre-conventional, conventional, and post-conventional) and six stages (punishment-obedience, instrumental-hedonism/relativism, “good-boy-nice-girl,” law-and-order, legalistic social-contract, universal ethical principles). Two stages correspond to each level. Habermas follows Kohlberg’s three levels in claiming we can retrospectively discern pre-conventional, conventional, and post-conventional phases through which societies have historically developed. Just as normal individuals who progress from child to adult pass through levels where different types of reasons are taken to be acceptable for action and judgment, so too we can retrospectively look at the development of social integration mechanisms in societies as having been achieved in progressive phases where legal and moral institutions were structured by underlying organizational principles.

Habermas slightly diverges from the six stages of Kohlberg’s schema by proposing a schema of neolithic societies, archaic civilizations, developed civilizations, and early modern societies. Neolithic societies organized interaction via kinship and mythical worldviews. They also resolved conflicts via feuds appealing to an authority to mediate disputes in a pre-conventional way to restore the status quo. Archaic civilizations organized interaction via hierarchies beyond kinship and tailored mythical worldviews backing such hierarchies. Conflicts started to be resolved via mediation appealing to an authority relying on more abstract ideas of justice—punishment instead of retaliation, assessment of intentions, and so forth. Developed civilizations still organized interaction conventionally, but adopted a rationalized worldview with post-conventional moral elements. This allowed conflicts to be mediated by a type of law that, while rooted in a community’s (conventional) moral framework, was separable from the authority administering it. Finally, with early modern societies, we find certain domains of interaction are post-conventionally structured. Moreover, a sharper divide between morality and legality emerges such that conflicts can be legally regulated without presupposing shared morality or needing to rely on the cohering force of mythical worldviews backing hierarchies (McCarthy 1978, 252).

Obviously, this sketch is rather vague and needs further elaboration. This is especially true in light of the ways a superficial reading (that takes social evolution as strictly parallel rather than homologous to individual development) lends itself to unsavory developmentalist narratives. Yet, apart from a few later writings, Habermas has not returned to his theory of social evolution in a systematic way. Several secondary authors have tried to fill in the details (Rockmore 1989, Owen 2002, Brunkhorst 2014, Rapic 2014). Nevertheless, Habermas still endorses the contours of his theory of social evolution: these ideas show up in Theory of Communicative Action and his later writings on the nature and development of legality and democratic legitimacy bear a loose connection to this early work (especially the final homology above) insofar as they are tailored for specifically post-conventional societies. Yet, before turning to his democratic theory, we must tackle the hugely important intervening body of work concerning his communicative turn and its articulation in his Theory of Communicative Action.

3. The Linguistic Turn into the Theory of Communicative Action

Habermas’ engagement with speech act theory and hermeneutics in the late 1960s and 70s started a linguistic turn that came to full fruition in Theory of Communicative Action. This turn makes sense after both Knowledge and Human Interests and Communication and the Evolution of Society. He came to see the knowledge-constitutive interests of the former as illicitly relying on assumptions in the philosophy of consciousness and Kantian transcendentalism, while the reconstructed phases of social learning and evolution in the latter can seem far too naturalistic or foundationalist. In contrast, a focus on communicative structures let him form his own pragmatic theory of meaning, rationality, and social integration based in reconstructions of the competencies and normative presuppositions underlying communication. This approach is transcendental and naturalistic but only weakly so. Far from an account of ultimate foundations, his approach takes itself to be a post-metaphysical methodology for philosophical and social scientific research into practical reason. From the start of his linguistic turn until well after Theory of Communicative Action this approach underwent revisions. In what follows, only a broad outline of this trajectory is given.

Habermas has cited his 1971 Gauss lectures at Princeton (German publication 1984b, English publication 2001) as the first clear expression of the linguistic turn, but it was also evident in On the Logic of the Social Sciences (German 1967, English 1988a). His first truly systematic foray in Anglo-American philosophy of language came with What is Universal Pragmatics? (German 1976b, included in English 1979). His ideas were then revised further in Theory of Communicative Action. While the development of his ideas throughout this period is an important exegetical task, for present purposes the broad way he takes up speech act theory is what is important: he accepts the division in linguistics between syntax, semantics, and pragmatics. He considers each division to be reconstructing the tacit system of rules used by competent speakers to recognize the well-formed-ness (syntax), meaningfulness (semantics), and success (pragmatics) of speech. His main interpretive twist is that the theories of truth-conditional propositional meaning often associated with philosophical projects regarding language only locate part of the meaning of speech. Thus, he moves away from meaning based on the correspondence theory of truth and gives an account of the unique pragmatic validity behind the meaning of speech.

While his linguistic turn is sometimes cast as a break with prior theory, his interpretive approach actually coheres quite well with his early critique of positivism. He has always rejected the idea that language simply states things about the world. Instead of merely analyzing propositions that either do (true) or do not (false) obtain in the world, he is interested in the full range of ways people use language. He claims that, instead of focusing on sentences, a complete theory of language would focus on contextual utterances as the most basic unit of meaning. Thus, he developed a formal pragmatics (called “universal pragmatics” in early work). Building on the work of Karl Bühler, he conceives of the pragmatic use of language in context as embedding sentences in relations between speaker, hearer, and the world. This embedding helps to intersubjectively stabilize such relations. Habermas claims that, in uttering a speech act speakers mean something (express subjective intentions), do something (interact with or appeal to a hearer) and say something (cognitively represent the world). While truth-conditional theories of meaning focus on cognitive representations of the world, Habermas prioritizes the pragmatics of speech acts over the semantic or syntactical analysis of sentences. What is done through speech is taken to be what is most basic for meaning.

During his linguistic turn Habermas appropriated several ideas from John Searle. Even though Searle has not always fully agreed with such appropriations, two of them are useful points of orientation (Searle 2010, 62). Habermas adopts Searle’s idea of the constitutive rules underlying language: just as the rules of a game define what counts as a legitimate move or status, so too there is an implicit rule-governed structure to the use of language by competent speakers. He also adopts Searle’s view, built on J. L. Austin’s work, that speech has a double structure of both propositional content and illocutionary force. For instance, the propositional content of “it is snowy in Chicago” is a representation of the world. But the same content can be used in different illocutionary modes: as a warning to drive carefully, as a plea to delay travel, as a question or answer in a larger conversation, and so on. Moreover, beyond such illocutionary force, all speech acts also have derivative perlocutionary effects that, unlike illocution, are not internally connected to the meaning of what is said. A warning about snow may elicit annoyance or gratitude, but such responses are contextually inferred and not necessarily connected to either the propositional content or the warning itself.

These ideas about the structure of speech highlight a few key points. First, Habermas takes perlocutionary success (for example, eliciting gratitude) to be parasitic on illocutionary force (for example, the speech is perceived as a warning, not a plea). Attaining success with others by realizing one’s intention in the world is secondary to achieving an understanding with them. For example, even when lying, the lie only works by first coming to a false understanding that what is being said is true. Second, he identifies three modes of communication—cognitive, interactive, and expressive—that depend on whether a speaker’s main illocutionary intention is to raise a truth claim of propositional content, a claim of rightness for an act, or a claim of sincerity about psychological states. Third, he identifies corresponding speech act types—constatives, regulatives, and expressives—that, seen from the perspective of a competent language user, contain immanent obligations to redeem the aforementioned claims by respectively providing grounds, articulating justifications, or proving sincerity and trustworthiness.

In short, Habermas thinks there are general presuppositions of communicative competence and possible understanding that underlie speech and which require speakers to take responsibility for the “fit” between an utterance and inner, outer, and social worlds. For any speech act oriented towards mutual understanding, there is a presumed fit of sincerity to the speaker’s inner world, truth to the outer world, and rightness to what is inter-subjectively done in the social world. Naturally, these presumptions are defeasible. Yet, the point is that speakers who want to reach an agreement have to presuppose sincerity, truth and rightness so as to be able to mutually accept something as a fact, valid norm, or subjectively held experience.

For Habermas these elements form the “validity basis of speech.” He claims that, by uttering a speech act, a speaker is seen as also potentially raising three “validity claims”: sincerity for what is expressed, rightness for what is done, and truth for what is said or presupposed. Depending on the speech act type, one claim often predominates (for example, constatives raise a validity claim of truth) and, more often than not, speech rests on undisturbed background agreements about facts, norms, and experiences. Moreover, minor disagreement can be quickly resolved through clarifying meaning, reminding others of facts, asking about preexisting commitments, highlighting situational features, and so on. Habermas sometimes refers to such minor communicative repairs as “everyday speech.” But when disagreement persists we may need to transition to what Habermas calls “discourse”: a particular mode of communication in which a hearer asks for reasons that would back up a speaker’s validity claim. In discourse the validity claims that are always immanent within speech become explicit.

Clearly, Habermas uses “validity” in an odd way. The notion of validity is most often used in formal logic where it refers to the preservation of truth when inferentially moving from one proposition to another in an argument. This is not how Habermas uses the term. What then does he mean by validity? It is instructive to look at the assumptions behind his theory of meaning. When his model of meaning emphasizes what language does over what it merely says or means the operative assumption is that the primary function of speech is to arrive at mutual understandings enabling conflict-free interaction. Moreover, at least with respect to claims of truth and rightness, he assumes genuine and stable understandings arise out of the give and take of reasons. Claims of truth and rightness are paradigmatically cognitive in that they admit of justification through reasons offered in discourse. What Habermas means by validity then is a close structural relationship between the give and take of reasons and either achieving an understanding or (more strongly) a consensus that allows for conflict-free interaction. This yields an “acceptability theory” of meaning where the acceptance of norms is always open to further debate and refinement through better reasons.

As we cannot know in advance what reasons will bear on a given issue, only robust and open discourses license us to take the (provisional) consensuses we do achieve as valid. Habermas therefore formulates formal and counterfactual conditions—the “pragmatic presuppositions” of speech and the “ideal speech situation”—that describe and set standards for the type of reason-giving that mutual understandings must pass through before we can regard them as valid (on these formal conditions and how understanding and consensus may differ see below and section 4). At the same time, we never start this give-and-take of reasons from scratch. People are born into cultures operating on background understandings that are embodied in inherited norms of action. Borrowing from Husserl and others, Habermas calls this stock of understandings the “lifeworld.”

The lifeworld is an important if somewhat slippery idea in Habermas’ work. One way to understand his particular interpretation of it is through the lens of his debate with Gadamer. Broadly speaking, Habermas agrees with the view of language held by Gadamer and hermeneutics generally: language is not simply a tool to convey information, its most basic form is dialogic use in context, and it has an inbuilt aim of understanding. On such a view, objectivity is not just correspondence to an independent world but instead something that is ascribed to mutual understandings (about the world, relations to others, and oneself) intersubjectively achieved in communication. Moreover, communication has an underlying structure that makes understandings possible in the first place. Meaning is therefore in some sense parasitic on this background structure.

On this much Gadamer and Habermas agree. But Gadamer takes all this to mean that explicit understanding and misunderstanding are only possible due to a taken-for-granted understanding of cultural belonging and socialization into a natural language. Habermas agrees that culture and socialization are important, but is worried that Gadamer’s take on the background structures that form the “conditions of possibility” for meaning yields a relativistic “absolutization of tradition.” On Habermas’ interpretation the lifeworld encompasses the sort of belonging and socialization referred to by Gadamer, but it works with and is underpinned by certain deep structures of communication itself. For Habermas, the complementarity between the lifeworld and a particular manifestation of these deep structures in discourse and “communicative action” (below) is what lets one interrogate and progressively revise parts of the background stock of inherited understandings and validity claims, thereby avoiding either relativism or the dogmatic veneration of tradition.

For Habermas the lifeworld is a reservoir of taken-for-granted practices, roles, social meanings, and norms that constitutes a shared horizon of understanding and possible interactions. The lifeworld is a largely implicit “know-how” that is holistically structured and unavailable (in its entirety) to conscious reflective control. We pick it up by being socialized into the shared meaning patterns and personality structures made available by the social institutions of our culture: kinship, education, religion, civil society, and so on. The lifeworld sets out norms that structure our daily interactions. We don’t usually talk about the norms we use to regulate our behavior. We simply assume they stand on good reasons and deploy them intuitively.

But what if someone willfully breaks or explicitly rejects a norm? This calls for discourse to explain and repair the breach or alter the norm. As a micro-level example: if someone breaks a promise then they will be asked to justify their behavior with good reasons or apologize. Such communication is also called for when norms suffer more serious breakdowns: one may question the reasons behind norms and whether they remain valid, or run into a new and complex situation where it is unclear which norms apply, how, and to what extent. Regardless of how serious the norm breach or breakdown is, we need to engage in discourse to repair, refine, and replenish shared norms that let us avoid conflict, stabilize expectations, and harmonize interests. Discourse is the legitimate modern mechanism to repair the lifeworld; it embodies what Habermas calls “communicative action.”

Communicative action can be seen as a practical attitude or way of engaging others that is highly consensual and that fully embodies the inbuilt aim of speech: reaching a mutual understanding. In later writings Habermas distinguishes weak and strong communicative action. The weak form is an exchange of reasons aimed at mutual understanding. The strong form is a practical attitude of engagement seeking fairly robust cooperation based in consensus about the substantive content of a shared enterprise. This allows solidarity to flourish. In either form, communicative action is distinct from “strategic action,” wherein socially interacting people aim to realize their own individual goals by using others like tools or instruments (indeed, he calls this type of action “instrumental” when it is solitary or non-social). A key difference between strategic and communicative action is that strategic actors have a fixed, non-negotiable objective in mind when entering dialogue. The point of their engagement is to appeal, induce, cajole, or compel others into complying with what they think it takes to bring their objective about. In contrast, communicatively acting parties seek a mutual understanding that can serve as the basis for cooperation. In principle, this involves openness to an altered understanding of one’s interests and aims in the face of better reasons and arguments.

The contrast between communicative and strategic action is tightly linked to the distinction between communicative and purposive rationality. Purposive rationality is when an actor adopts an orientation to the world focused on cognitive knowledge about it, and uses that knowledge to realize goals in the world. As noted, it has social (strategic) and non-social (instrumental) variants. Communicative rationality is when actors also account for their relation to one another within the norm-guided social world they inhabit, and try to coordinate action in a conflict-free manner. On this model of rationality, actors care not only about their own goals or about following the relevant norms that others follow, but also about challenging and revising them on the basis of new and better reasons.

Approaching rationality after action orientations is not merely stylistic. Habermas notes that while many theorists start with rationality and then analyze action, the view of action that such an order of analysis primes us to accept can tacitly smuggle in quasi-ontological connotations about the possible relations actors can have amongst themselves and to the world. Indeed, this mistake figures into Habermas’ critique of Weber’s account of the progressive social rationalization ushered in by modernity. Weber framed Western rationalism in terms of “mastery of the world” and then naturally assumed the rationalization of society simply meant increased purposive rationality. As is apparent from Habermas’ account of social learning, this is not the only way to understand the “evolution” of societies or the species as a whole throughout history. By expanding rationality beyond purposive rationality Habermas is able to resist the Weberian conclusion that had been attractive to Horkheimer and Adorno: that modernity’s increasing “rationalization” yielded a world devoid of meaning and people focused on control for their own individual ends, and that the spread of enlightenment rationality went conceptually hand in glove with domination. Habermas feels the notion of rationality in his Theory of Communicative Action resists such critiques.

The contrast between communicative and strategic action mainly concerns how an action is pursued. Indeed, while these action orientations are mutually exclusive when seen from an actor’s perspective, the same goal can often be approached in either communicative or strategic ways. For instance, in my rural town I may have a discussion with neighbors whereby we determine we share an interest in having snow cleared from our road, and that the best way to do this is by taking turns clearing it. This could count as an instance of communicative action. But, imagine a wealthy and powerful recluse who is indifferent to his neighbors. He could just pay a snowplow to clear the road up until his driveway. He could also use his power to manipulate or threaten others to clear the snow for him (for example, he could call the mayor and hint he may withhold a campaign donation if the snow is not cleared). Strategic action is about eliciting, inducing, or compelling behavior by others to realize one’s individual goals. This differs from communicative action, which is rooted in the give-and-take of reasons and the “unforced force” of the best argument justifying an action norm.

Strategic action and purposive rationality are not always undesirable. There are many social domains where they are useful and expected. Indeed, they are often needed because communicative action is very demanding and modern societies are so complex that meeting these demands all the time is impossible. Speakers engaged in communicative action must offer justifications to achieve a sincerely held agreement that their goals and the cooperation to achieve them are seen as good, right, and true (see section 4). But, in complex and pluralistic modern societies, such demands are often unrealistic. Modern social contexts often lack opportunities for highly consensual discussion. This is why Habermas thinks weak communicative action is likely sufficient for low stakes domains where not all three types of validity claims predominate, and why strategic interaction is well-suited for other domains. For Habermas, modern societies require systematically structured social domains that relax communicative demands yet still achieve a modicum of societal integration.

Habermas takes the institutional apparatus of the administrative state and the capitalist market to be paradigmatic examples of social integration via “systems” rather than through the lifeworld. For example, if a state bureaucracy administers a benefit or service it takes itself to be enacting prior decisions of the political realm. As such, open-ended dialogue with a claimant makes no sense: someone either does or does not qualify; a law either does or does not apply. Similarly, in a clearly defined and regulated market actors know where market boundaries lie and that everyone within the market is strategically engaged. Each market actor seeks individual benefit. It makes little sense to attempt an open-ended dialogue in a context where one supposes all others are acting strategically for profit. Both domains coordinate action, but not through robustly cooperative and consensual communication that yields solidarity. Certainly, not all large-scale and institutionalized interaction is strategic. Some social domains like scientific collaboration or democratic politics institutionalize reflexive processes of communicative action (see section 5 on democratic theory). In such fora cooperation may yield solidarity across the enterprise. Even so, systems integration like that found in bureaucracies or markets sharply differs from integration through communicative action.

It should be stressed that these are simply paradigmatic examples, and that the same social domain can be institutionalized differently across societies. It is therefore more useful to look at the coordinative media that are typically used to interact with and steer any given institutionalized system rather than positing a fictive typology of clear social domains wherein it is assumed that either strategic or communicative action takes place. Habermas identifies three such media: speech, money, and power. Speech is the medium by which understanding is achieved in communicative action, while money and power are non-communicative media that coordinate action in realms like state bureaucracies or markets. A medium may largely be used in one social domain but that doesn’t mean it has no role in others. While speech is certainly the main medium of healthy democratic politics, this doesn’t mean money and power never play a role.

This all might seem to imply that there is no single correct way for system and lifeworld to jointly achieve social integration. Indeed, the complementarity between system and lifeworld laid out in Theory of Communicative Action is broad enough to accommodate a wide range of institutional pluralism with respect to the structure of markets, bureaucracies, politics, scientific collaboration, and so on. But the claim that there is no “one size fits all template” for social integration should not be taken as the claim that system and lifeworld have no proper relationship. Socialization into a lifeworld precedes social integration via systems. This is true historically and at the individual level.

Moreover, Habermas claims the lifeworld has conceptual priority with respect to systems integration. His thinking runs as follows: the lifeworld is the codified (yet revisable) stock of mutual normative understandings available to any person for consensually regulating social interaction; it is the reservoir of communicative action. Systems integration represents carefully circumscribed realms of instrumental and strategic action wherein we are released from the full demands of communicative action. Yet the very definition and limitation of these realms always depends on communicative action regarding, for example, the types of markets or state administration a community wants to have and why. Without being rooted in the mutual understandings of the lifeworld, we would get untrammeled systems of money and power disconnected from the intersubjectively vouchsafed practical reason that Habermas thinks underpins all meaning. The organizing principles of systems themselves would stop being coherent. For instance, market competition makes sense against a backdrop of normative principles like fairness, equal opportunity to compete, rules against capitalizing on secret information, and so on. But if markets were so “no-holds-barred” that these principles no longer applied, then engaging in market activity would cease to make sense. Similarly, if markets were so regulated that there was no genuine risk or opportunity they would also start to lose coherence as an enterprise. In both these skewed hypothetical scenarios the system is rigged and thus, if there are functional alternatives, it is not worth participating in. This is a variant of his early anti-technocracy argument. Positing “objective necessities” like economic growth, social stability, and national security and then circumventing communicative action veils disagreement on what type of growth, stability, and security is important for a given community and why. As such, systems designed to achieve these ends are primed to lose coherence and legitimacy based in widely accepted structuring principles.

Habermas thinks the lifeworld self-replenishes through communicative action: if we come to reject inherited mutual understandings embedded in our normative practices, we can use communicative action to revise those norms or make new ones. Mechanisms of systems integration depend on this lifeworld backdrop for their coherence as enterprises achieving a modicum of social integration. The trouble is that systems have their own self-perpetuating logic that, if unchecked, will “colonize” and destroy the lifeworld. This is a main thesis in Theory of Communicative Action: strategic action embodied in domains of systems integration must be balanced by communicative action embodied in reflexive institutions of communicative action such as democratic politics. If a society fails to strike this balance, then systems integration will slowly encroach on the lifeworld, absorb its functions, and paint itself as necessary, immutable, and beyond human control. Current market and state structures will take on a veneer of being natural or inevitable, and those they govern will no longer have the shared normative resources with which they could arrive at mutual understandings about what they collectively want their institutions to look like. According to Habermas, this will lead to a variety of “social pathologies” at the micro level: anomie, alienation, lack of social bonds, an inability to take responsibility, and social instability.

In Theory of Communicative Action Habermas pins his hopes for resisting the colonization of the lifeworld on appeals to invigorate and support new social movements at the grassroots level, as they can directly draw upon the normative resources of the lifeworld. This model of democratic politics essentially urges groups of engaged democratic citizens to shore up the boundaries of the public sphere and civil society against encroaching domains of systems integration such as the market and administrative state. This is why his early political theory is often called a “siege model” of democratic politics. As section 5 will show, this model was heavily revised in Between Facts and Norms. Before turning to that work, we must flesh out discourse ethics, an idea that figured into Theory of Communicative Action but which was only fully developed later.

4. Discourse Ethics

Habermas’ moral theory is called discourse ethics. It is designed for contemporary societies where moral agents encounter pluralistic notions of the good and try to act on the basis of publicly justifiable principles. This theory first received explicit and independent articulation in Moral Consciousness and Communicative Action (German 1983, English 1990a) and Justification and Application (German 1991a, English 1993), but it was anticipated by and depends on ideas in Theory of Communicative Action. The overview that follows draws upon these works. Much like the prior section, it only traces the broad outline of discourse ethics.

Discourse ethics applies the framework of a pragmatic theory of meaning and communicative rationality to the moral realm in order to show how moral norms are justified in contemporary societies. It could be seen as a theory that uncovers what we pragmatically do when we make and defend the moral validity claims underlying and manifested in our norms. Yet we need to be careful with this characterization. Because of its cognitive commitments to moral learning and knowledge, discourse ethics cannot simply be a reconstructive description of how we practically avoid conflicts and stabilize expectations in post-conventional social contexts. It is also an attempt to provide a formal procedure for determining which norms are in fact morally right, wrong, and permissible. Discourse ethics is squarely situated in the tradition of Neo-Kantian deontology in that it takes the rightness and wrongness of obligations and actions to be universal and absolute. On such a view, the same moral norms apply to all agents equally. They strictly bind one to performing certain actions, prohibit others, and define the boundaries of permissibility. There is no “relative” validity of genuinely moral norms even though, as we shall see, they can be embedded in social contexts that have consequences for their application. As long as these caveats are kept in mind we can understand discourse ethics by analyzing the practice of making and defending validity claims and how there are certain conditions of possibility tacitly underpinning and enabling this practice.

What are the conditions that enable this practice? As touched on above, Habermas posits certain unavoidable pragmatic presuppositions of speech which, when realized in discourse, can approximate a counterfactual ideal speech situation to greater or lesser degrees (1971; MCCA, 86). Discourse participants need to presuppose these conditions in order for the practice of discursive justification to make sense and for arguments to be truly persuasive. Four of these presuppositions are identified as the most important: (i.) no one who could make a relevant contribution is excluded, (ii.) participants have equal chances to make a contribution, (iii.) participants sincerely mean what they say, and (iv.) assent or dissent is motivated by the strength of reasons and their ability to persuade through discursive argumentation rather than through coercion, inducement, and so on (BNR, 82; TIO, 44). The point is not that actual discourses ever realize these conditions—this is why the ideal speech situation is best understood as a counterfactual regulative ideal. Rather, the point is that the outcomes of any discourses are only reasonably taken to be “valid” (empirically true, morally right, and so forth) under the presumption that these conditions have been sufficiently met. As soon as a violation is discovered this casts doubt upon the validity of the discursive outcome.

In addition to these pragmatic presuppositions Habermas proposes his discourse principle (D). This principle is supposed to capture the type of impartial, discursive justification of practical norms required in post-conventional societies: “only those action norms are valid to which all possibly affected persons could agree as participants in rational discourse” (BFN 107; TIO 41). While (D) was initially framed as a principle for moral discourses it was soon revised to the more general form above, as there are many practical norms concerning interpersonal interaction that are not directly moral even if they must be compatible with morality. Yet even in its broadened form it is crucial to note that (D) only applies to discourses concerning practical norms about interpersonal behavioral expectations, not all discourses about theoretical, aesthetic, or therapeutic concerns (which may or may not involve interpersonal social interaction). The guiding thought is that if discourses about an action norm are carried out in a sufficiently ideal manner and they yield consensus then this is a good indication the norm is valid. The principle does not hold that consensus reached through discourse constitutes validity, nor that whatever norm people coalesce around after discourse that looks sufficiently ideal is assured to be valid. Rather, (D) simply holds that consensus about a norm can be a good test of validity if it has been achieved in the right type of discursive way. It is important to note that, because of its very broad scope, (D) mainly functions by pointing out invalid norms. By itself the discourse principle cannot tell us which norms are valid. It can only help us identify norms that are good candidates for validity.

Moreover, before the validity of an action norm can be assessed, we need more details on the types of discourse and validity claims at issue (TIO 42). Within his project of discourse ethics Habermas identifies moral, ethical, and pragmatic discourses (JA 1-17; MCCA 98). Each type deploys practical reason differently, framing and analyzing questions under the rubrics of the purposive (pragmatic), the good (ethical), or the just (moral). The language of differing discourse “types” should not be taken to mean that norms come prepackaged in distinct kinds. Instead, any norm can be discursively thematized in any of these ways and should not be arbitrarily limited to a given type. With that caution in mind, we can begin to understand discourse types and the norms they produce.

Ethical discourses are a good place to start. For, while they are constrained by the outcomes of moral discourses and therefore not foundational, our prior discussion of the lifeworld provides an apt segue. Ethical discourses are paradigmatically about clarifying, consciously appropriating, and realizing the identity, history, and self-understanding of a group or individual. They make validity claims to authenticity rather than truth or rightness. They also involve value judgments about a particular social form or practice concerning the good life in a community. This is one reason why the outcomes of ethical discourses will have relative validity: they are meant to redeem validity claims for actors in some community or another. Another reason is that values differ from the types of generalizable or universalizable interests embodied in moral norms. While moral norms are supposed to strictly oblige agents to either do or not do some action, values admit of degree. While moral norms express principles backed by reasons, values are affective components of meaning acquired in virtue of living in a given social context. They are connected to reasons but not reducible to them. Values can orient us to goals, aid motivation, and help successfully navigate the lifeworld but cannot ground moral obligations by themselves. Values attract or repel but do not persuade; they can provide motivation to “do the right thing”—to have the will to follow a moral insight—but they do not constitute or even always help us discern what “the right thing” is (BFN 255).

Ethical discourses are rooted in ethicality (Sittlichkeit), which is distinct from morality (Moralität). Like many philosophers, Habermas separates the realm of the right from the realm of the good. Following a loosely Hegelian terminology, he parses this as the difference between morality and ethicality. Ethicality is a way of life composed of both cognitive and affective elements as well as more structural elements that reproduce this way of life: laws, institutions, conventions, social roles, and so forth. It is particularistic in that it defines goals in terms of what is good for a group as a whole and its members. As Habermas believes in George Herbert Mead’s model of “individuation through socialization,” ethicality is deeply engrained and connected to the lifeworld. No one can simply drop their internalized ethical perspective just as no one can simply step out of the lifeworld they have inherited. Individuals are always in some sense bound up with the identity, practices, and values of their upbringing and traditions even if they come to largely reject them. But, as was clear from Habermas’ critique of Gadamer, ethical perspectives do not determine us. Ethical discourses explain how this is by mediating between inheritance and transcendence. While we inherit and internalize an ethical perspective as individuals, we can always question parts of it that we wish to challenge, refashion, or reject for lack of sufficient reasons underwriting certain norms.

This dialectic between the ethicality we internalize through socialization and the way in which we wish to consciously reappropriate and (dis)own portions of such ethicality helps to explain why, in contrast to other discourse types, Habermas pays a great deal of attention to ethical discourses at both the individual and group levels. Ethical discourses at the individual level are called ethical-existential discourses, while those at the group level are called ethical-political discourses. For instance, an individual considering a certain profession would engage in an ethical-existential discourse (for example, is this profession right for me given my character and goals?), while a polity considering whether certain policies express its collective interest, identity, and values would engage in an ethical-political discourse (for example, does this policy align with the collective identity and commitments we have had and how we want to appropriate them moving forward?).

There are two key points about these levels. First, the outcomes of such discourses are constrained by morality irrespective of what would be authentic at individual or group levels: an individual cannot simply decide to become a serial killer just as a country cannot simply enact a policy that has patently immoral consequences (for example, for those outside it). While Habermas thinks it is important to account for the way in which morality is embedded in social contexts through ethical discourses, he is staunchly opposed to postmodern or communitarian takes on morality and justice. Second, there will often be a reflexive interplay between these two levels of ethical discourse. Discourses about what it means to genuinely inhabit a collective identity can impact the ordering and strength of the values held by individuals, and discourses about who one fundamentally is and wishes to be can, through resistance to dominant interpretations of traditions and highlighting unacknowledged injustices, impact how others in a collectivity appropriate their identity and normative practices moving forward. This interplay is bookended by broader moral discourses at both levels, thereby helping the outcomes of such discourses stay in the realm of permissibility.

Pragmatic discourses are similar to ethical discourses in that they start from the teleological perspective of an agent who already has a goal. But in contrast to the reflexive, clarifying, and potentially transformative self-realization and collective self-determination of ethical discourses, pragmatic discourses simply start with a goal of presumed value and set about realizing it. This goal may involve identity and values but it could also refer to more pedestrian concerns and interests. Because the goal is presumed to be worthwhile, the values, interests, or goals at issue show up as relatively static. Pragmatic discourses simply focus on the most efficient way to realize or bring about a goal, and their claim to validity concerns whether or not certain strategies or interventions in the world are likely to produce a desired result. As Habermas puts it, pragmatic discourses correlate “causes to effects in accordance with value preferences and prior goal determinations” so as to generate a “relative ought” that expresses “what one ‘ought’ or ‘must’ do when faced with a particular problem if one wants to realize certain values or goals” (JA 3). The “ought” is relative because it is something akin to a rule of prudence that depends on whether an agent happens to have a certain interest or find a goal worth pursuing.

Finally, we turn to what might be seen as the most important type of discourse: moral discourses. Moral discourses are broader in scope and establish stronger validity claims than either ethical or pragmatic discourses. They seek to discern and justify norms that bind universally rather than simply in the confines of a specific community or because an agent happens to find a goal valuable. These norms have binarily coded, unconditional validity instead of the gradated, relative validity of the outcomes produced by pragmatic and ethical discourses.

In order to discursively discern this non-relative sense of moral validity Habermas proposes a separate principle, his principle of universalization (U), for discourses about moral norms: “A norm is valid when the foreseeable consequences and side effects of its general observance for the interests and value orientations of each individual could be jointly accepted by all concerned without coercion” (TIO 42). While (U) has gone through several different formulations, the basic idea is that for whatever valid moral norms there are, such norms can be accepted by all affected persons in a sufficiently ideal discourse wherein they assert their own interests and values. (U) checks if the norms we take to be moral actually are in virtue of whether or not they are universalizable. If they are not universalizable, they cannot be moral norms. Beyond this basic characterization there are some interpretive issues with (U). Three are worth brief focus: its apparent reference to consequences, where (U) comes from, and the role of interests.

First, in the version of (U) above, it is easy to mistake the “foreseeable consequences and side effects” clause for the addition of a mild consequentialist constraint. Given Habermas’ deontological commitments this would be odd. Instead, the clause builds in a “time and knowledge index” so that (U) does not make impossible demands on moral agents. Fully satisfying (U) would require discourse participants who had unlimited time, complete knowledge, and no illusions about their own interests and values; it would require participants who transcended their human condition. As (U) must be usable in the real world, it can only ask that moral discourse participants attempt to account for the “anticipated typical situations” to which a norm would apply when they attempt to justify any moral norm (JA 37). The circumscribed task of (U) is key: it is only supposed to justify moral norms in the abstract. While this justification may point towards “typical” cases of application, it does not predetermine all applications. What about novel, atypical, or completely unforeseen situations to which the norm might unexpectedly apply?

Following Klaus Günther, Habermas claims that moral (and legal) decisions in specific cases require a logic of appropriateness found in discourses of application (Günther 1993; JA 35-37). Discourses of application look at a concrete case and survey all potentially applicable norms, relevant facts, and circumstances. They try to offer exhaustive or “complete” descriptions of a situation so as to decide among multiple, sometimes competing or only partly applicable norms that might regulate a situation. There is a division of labor between the two types of recursively related discourse: whereas discourses of justification lay out the reasons why we should endorse a norm as a general rule with reference to typical situations, discourses of application seek to apply norms to concrete cases which may be wholly new or defy expectations. As fallible agents we can make a variety of different errors in our discursive justification of a norm or fail to anticipate new situations or altered understandings of facts, values, and interests, a failure that would be revealed in application. Habermas calls this the “dual fallibilist proviso,” and it instills an awareness that moral justification is an ongoing project (TJ 259). The recursive interplay of justification and application is supposed to progressively address prior errors and oversights. New insights gleaned from application discourses or novel situations can lead us to revisit norms whose justification was taken for granted, and this refinement of our understanding regarding how and why norms are justified will help us apply them better. If we had providential foreknowledge we would not need application discourses. But since we are fallible the “foreseeable consequences and side effects” should be seen as referring to an in-built “time and knowledge” index for the outcomes of justificatory discourses, which are then supplemented by application discourses that may impact the formulation of the initial norm.

The second interpretive issue is where (U) comes from. Habermas initially claimed that (U) could be formally deduced from a combination of the pragmatic presuppositions of discourse and (D), but weakened this claim shortly thereafter (JA 32 n17). Instead of deriving (U) from a formal deduction or informal inferences he now claims—using a term coined by Peirce—we arrive at (U) “abductively” (TIO 42). To arrive at something abductively is to suggest that we first observe a phenomenon (moral norms) and adopt a “best guess” hypothesis to explain it (the moral principle), which can then be subjected to further inductive testing (Ingram 2010, 47; Finlayson 2000a, 19). In short, (U) is now proposed as the best candidate principle for helping to explain moral normativity. To buttress the plausibility of this claim Habermas has also fallen back on his theory of social evolution and the “weak…notion of normative justification” in post-conventional contexts (TIO 45). Indeed, he now often speaks about (U) as following from the type of impartial justificatory procedure appropriate to a post-conventional condition that seeks to discern norms that are “equally in everyone’s interest,” “generalizable,” or “universalizable” (RPT 367; BFN 108, 460; TJ 265). The reference to interests leads us to the third interpretive issue with (U).

Early formulations of (U) only refer to interests (MCCA 65, 120). The inclusion of value orientations is potentially confusing. As noted above values are not necessarily cognitively grounded. As Habermas has always presented his moral theory as cognitivist it would be odd to give values such a central role. It seemed to make sense that initial formulations of (U) only included interests, as Habermas has defined interests in a cognitive fashion (on interests as “reasons to want” see Finlayson, 2000b). Bolstering an interpretation of (U) that puts priority on (cognitive) interests he has stated that “(U) works like a rule that eliminates as non-generalizable content all those concrete value orientations with which particular biographies or forms of life are permeated” (MCCA 121), and that the specific part of (U) referring to “uncoerced joint acceptance” means that any reasons put forth in moral discourse must “cast off their agent-relative meaning and take on an epistemic meaning from the standpoint of symmetrical considerations” (TIO 43). Moreover, the interpretive secondary literature has often emphasized the centrality of interests over values and focused on how Habermas often talks about “generalizable” or “universalizable” interests as the distinctive feature that moral norms secure (Heath 2003; Finlayson 2000b; Lafont 1999). How then should the inclusion of value orientations be understood?

Habermas has said he included value orientations in (U) so as to “prevent the marginalization of the self-understanding and worldviews of particular individuals and groups” (TIO 42). This does not mean that values are on a par with interests. Instead, his point is that interests and values are always bound together. Value orientations exert at least some indirect influence on moral discourses insofar as they subtly influence the very interpretation of our own interests (JA 90). Proceeding as if value orientations can be expunged from moral discourses may in fact introduce discursive blind spots. Indeed, candor about one’s own value-orientations may be crucial since the impartiality of (U) involves “generalized reciprocal perspective-taking” that cuts both ways: it orients participants towards “empathy for the self-understandings” of others as well as towards “interpretive interventions into the self-understanding of participants who must be willing to revise their descriptions of themselves and others” (TIO 43). The essential point is that even though “some of our needs are deeply rooted in our anthropology” and can be seen as basic generalizable interests shared by all, we must nevertheless avoid “ontologizing generalizable interests” into “some kind of given” because even “the interpretation of needs and wants must take place in terms of a public language” wherein our own self-understandings are open to revision (TJ 268; JA 90).

A final interpretive issue that merits attention is the precise status of moral rightness. Habermas has always held that morality and truth are analogous in that both are cognitive, binarily coded, and subject to learning processes. Moreover, he has always been sharply critical of approaches that would reduce morality to a purely subjective or relativized affair. Yet, given that rightness is not reducible to truth and that Habermas has repeatedly disclaimed a moral realist reading of his theory, it is unclear precisely how far this analogy is supposed to extend. This is not only because there are a variety of differences between empirical and moral knowledge but also because Habermas has changed his theory of truth over the years—moving from a consensus theory that identified truth with ideal warranted assertability to a “pragmatic epistemological realism that follows in the path of linguistic Kantianism” (TJ 7). Early articulations of discourse ethics seemed to admit of interpretations wherein rightness was a justification-transcendent concept that couldn’t be captured by ideal warranted assertability. This led some interpreters to interpret Habermas’ moral theory as at least tacitly committed to some variant of internal moral realism (Davis 1994, Kitchen 1997, Lafont 1999 and 2012, Smith 2006, Peterson 2010 ms.). But, in the course of resisting this reading, Habermas has explicitly claimed that, “ideally warranted assertability is what we mean by moral validity” (TJ 258, 248). He now wishes to articulate a notion of moral rightness that can be cashed out in terms of a pragmatist constructivism that also avoids the perils of relativism and skepticism—that is, which maintains an anti-realist account of moral rightness that still resists collapsing into a form of moral consensus theory. Whether he succeeds in this endeavor is a hotly debated topic.

5. Political and Legal Theory

In post-conventional, pluralistic societies ever fewer norms can be underwritten by a shared ethos embodied in a community’s ethicality or collective identity. Moral norms cannot pick up the slack to achieve social integration and cohesion by themselves. Because moral discourse is demanding and aims at what is equally in everyone’s interest, few moral norms will be seen as justified across the world or even in a given society (JA 91, TJ 265). And, as Habermas noted in Theory of Communicative Action, while systems like the bureaucratic state and economy can achieve stability and coordinate expectations through money and power, this can erode mutual understandings and social solidarity; markets and bureaucracies tend to displace and colonize the lifeworld. Indeed, his political essays from this period cast democratically created law as holding the line against system encroachments in a siege mentality (BFN 486-89, Habermas 1992b 444). This may leave us asking: What other resources exist for legitimate social integration?

In Habermas’ clearest statement of political theory, Between Facts and Norms, modern law shows up as precisely the resource we are looking for. If law is linked to democratic political structures in the right way it confers legitimacy on legal norms, thereby fostering social integration and stability. Broadly speaking, the relation between legal legitimacy, procedural-democratic popular sovereignty, and public discourse is nested and reflexive: legitimate law must be rooted in democracy, which itself depends upon a robust public sphere. A vibrant democratic public sphere is what allows for the revision and questioning of prior law. Conceived of in this way modern law is a “transformer” that preserves the normative achievements and mutual understandings that issue from the collective self-determination of the public sphere by translating them into legitimate, binding decisions that can “counter-steer” against the logics of the state and market. As long as legal decisions are arrived at in the right type of procedural, discursive fashion there is a presumption in favor of their rationality and legitimacy. And, as long as the public sphere continues to be a robust and open forum of contestation, any prior decisions are revisable such that there is a circulation between the informal public sphere and more formal institutions of the state. This focus on the transformative, mediating nature of law revises the prior “siege” model of democratic law into a procedural “sluice” model (Habermas 2002, 243). While the prior model saw democratically generated law as a defensive dam or shield against the demands of systems, the new model sees a certain type of lawmaking as mediating the circulation between lifeworld and system in a way that produces legitimate and binding legal norms. Modern law works with systems and alongside post-conventional morality to stabilize social expectations and resolve conflicts.

We can start to understand the relation between law, democracy, and the public sphere by focusing on legal legitimacy and democracy. Between Facts and Norms posits a tension within law itself, as well as an internal relation between modern law and democracy. To function, all law must demand compliance, threaten coercion, and (however tacitly) appeal to an underlying normative justification. Law is therefore characterized by a tension between “facticity” and “validity” insofar as it must be recognized as factually efficacious and normatively justified. This tension helps explain the relation between law and democracy in contemporary contexts. Pre-modern law appealed to God, nature, human reason, or shared culture for its justificatory backing. In post-conventional societies the fact that law is coercible and changeable yet merely rooted in fallible humans is laid bare. For Habermas, the underlying normative justification can now only be understood as “a mode of lawmaking that engenders legitimacy” (IO 254). The thought is that democracy is the only mode of lawmaking that is up to this legitimacy-engendering task. In light of these connections it is fruitful for present purposes to focus on “deliberations that end in legislative decision making” rather than treating political and legal legitimacy separately (BFN 171; Bohman and Rehg 1999, 36).

The democracy Habermas has in mind differs from overly populist varieties. He is clear that the legitimacy underwriting lawmaking must be twofold: law must not only express the democratic will of the community but must also be non-subordinately “harmonized” with morality (BFN 99, 106). This non-subordinate concordance of legality and discourse theoretic morality is the hardest sense of legitimacy to explain and the easiest to overlook, so it is fruitful to start there. For Habermas, “legal and moral rules…appear side by side as two different but mutually complementary kinds of action norms” in post-conventional societies. In order to also account for “the idea of self-legislation by citizens” we must avoid a “subordination of law to morality” along the lines of classical natural law theory (BFN 105-6, 120; IO 257). Yet it seems puzzling to hold that democratically determined law should be compatible with but not subordinate to discourse-theoretic morality. What about cases where law and morality seem to conflict? There are a few answers that highlight unique features in Habermas’ theory. At a general level these answers take the same shape: while there are many ways that legal systems can square with moral permissibility, there are nevertheless structural and conceptual features endogenous to processes of modern procedural-democratic popular sovereignty that, at least at an abstract level, tend to harmonize legal norms with moral permissibility. This avoids concerns with morality trumping legality in an exogenous manner.

One reason to expect that democratically legitimate law and moral permissibility will be at least in principle commensurable is that they are both rooted in (D). We saw above how the moral principle (U) expresses the way (D) is specified for moral discourses. Habermas also proposes a principle of democratic legitimacy (L) that expresses the way (D) is specified for political discourses producing law. This principle is rooted in (D) in virtue of what Habermas calls the “legal form.” When (D) is deployed in discourses aimed at producing legal norms for regulating common life together it is understood these norms will be cloaked in the legal form: the set of formal and functional features characterizing modern positive law. Modern positive law is enacted and conventional, enforceable and coercive, rooted in institutions with some reflexivity, tailored to protect individuals through rights, and limited in scope (BFN 111-118, IO 256). If law is to function as a tool for the consensual regulation of social conflicts and the integration of society, then it needs to take on this form.

The principle of democratic legitimacy (L) is part of the normative backing that is supposed to emerge, albeit in nuce and very abstractly, from the historical interpenetration of (D) and the legal form that has culminated in the structures of the modern democratic state. It states that “only those statutes may claim legitimacy that can meet with the assent of all citizens in a discursive process of legislation that in turn has been legally constituted” (BFN 110; constituted is sometimes translated as organized). This principle captures how (D) is specified for political discourses so that democratic procedures underwrite the legitimacy of legal norms. Legitimacy does not arise out of formal legality alone; it needs the added normative backing of democracy. The idea of (L) is that compliance with the law must be rational and rooted in the law’s perceived legitimacy. To achieve this, political discourses must be structured so that formal legislative institutions accurately represent and address deliberations going on in the informal public sphere, and so that institutionalized procedural mechanisms help screen out weak arguments (BFN 340). The details of this structuring will be clarified below, particularly in relation to the process model and the relationship between democracy and the public sphere.

However, the mere fact that (U) and (L) are rooted in (D) does little to ensure the commensurability of law and discourse-theoretic morality. Fortunately, there are additional reasons why we might expect such a harmonization. Habermas thinks the combination of (D) and the legal form in (L) also supplies us with the resources to discern the conceptual kernels of an abstract “system of rights” that will be inscribed in the core structures of any legitimate self-determining political community. The basic argument is that in order for (L) to be realized it must make reference to a concrete community engaged in self-determination through modern law. In such communities equal legal personhood takes on the role of a “protective mask,” a formal identity mainly defined by rights instead of duties, that crystalizes around individual moral persons (BFN 531, 112). This legal identity is constituted by a core of rights that secure the status and private autonomy of individuals such that they can not only live their individual lives but also genuinely deliberate (on equal footing, free from coercion, and so forth) about the terms of shared life together. Yet, these individual rights cannot be effective unless they presuppose other rights to participation and basic material provision—rights that secure public autonomy. The claim is that the legal manifestations of private and public autonomy, often expressed in the idioms of human rights and popular sovereignty, mutually presuppose one another. What results is an abstract system of rights made up of five core types. What are these right types?

First, in order to discursively engage one another people need to be reasonably secure. Therefore, rights that guarantee the status of individual persons are required. Three types of rights jointly achieve such protection: (i.) the right to equal liberties compatible with those of others, (ii.) rights of membership that determine the extent of the community, and (iii.) rights of due process that assure each person is treated the same and equally protected under the law (BFN 133-134). These rights secure the individual private autonomy prioritized by classical liberalism. But any community engaged in specifically democratic self-determination must also safeguard the ability to actively use the freedom afforded by this secure status to deliberate, disagree, and come to mutual understandings in concert with others. If individual rights are to be effectively used, (iv.) rights of communication and political participation that formally secure equal opportunity and access to the political process are required. These rights secure the collective public autonomy prioritized by classical republicanism. They enable discourses in the public sphere as well as equal access to channels of political say and influence; they enable democratic popular sovereignty by making sure that everyone can participate on fair and equal terms, and that information, innovative ideas, and arguments about how to structure common life are kept freely circulating and scrutinized. Lastly, these four right types are insufficient if basic needs are threatened or go unmet. Formal guarantees of freedom and participation mean little if they amount to the freedom to starve. So, as a final step, Habermas proposes some measure of (v.) social, technological, and ecological rights securing the basic conditions of a minimally decent life.

Democratic states have often done a poor job fully realizing these rights, but the claim is simply that these general right types are conceptually required if self-determination through law is to achieve the dual sense of legitimacy noted above. In this same spirit of clarification, it is also important to note that the abstract system only identifies certain right types, not some list of concrete rights. Communities have incredibly wide interpretive latitude when it comes to how these rights show up. Habermas often refers to rights as “unsaturated placeholders”; it is largely up to communities to “fill in” their content.

The expectation of a non-hierarchical harmonization of morality and legality may now seem less puzzling. Ideally, lawmaking discourses approximate (L) against the backdrop of an abstract system of rights inscribed in the political structures of a democratic community. This places some broad constraints on how deliberations unfold and the type of norms they can produce. Moreover, apart from these structural background constraints, political discourses are also themselves unique. In contrast to moral discourses focused on “the interest of all” or ethical discourses focused on authentic self-realization, political discourses aimed at self-determination through law reference a plethora of different concerns, and do so in an internally structured way aimed at carving out a space (defined by rights) where moral personhood and ethical authenticity can flourish (BFN 531).

While deliberations about “political questions are normally so complex that they require the simultaneous treatment of pragmatic, ethical, and moral aspects” of issues, they ideally unfold along a “process model” where there is a structured interplay between pragmatic, ethical, and moral concerns as well as procedurally regulated bargaining (BFN 565, 168). The basic idea is that for any provisional policy conclusion there is an obligation to respond to objections stemming from more abstract aspects of an issue or levels of discourse; discursive processes cannot be arbitrarily limited. For instance, participants in a discourse on immigration policy cannot simply consult ethical concerns regarding their community’s authentic identity yet refuse to listen to moral discourses that bear on such policies. Any moral aspects need to be explicitly discussed, and they filter or check more particularistic issue-aspects and discourses (cf. BFN 169 and the emendation at 565 on whether to refer to the structured interplay as between discourses or aspects of a case). The abstract system of rights and the process model mean that, within political deliberations about how to structure common life together, it will in principle always be possible for more abstract moral discourses to weakly check pragmatic and ethical-political discourses. And this checking will be endogenous to structures of democratic self-determination.

So far the focus has been on the relation between law and democracy without much reference to the public sphere. However, it is hard to overstate the importance Habermas places on democratic deliberation rooted in the public sphere. None of the formal or structural mechanisms mentioned so far guarantee that public political discourses or laws will be specified in a given way. There is no assurance that the abstract system of rights or (L) will be meaningfully realized, nor that the interplay of various types of concerns in political discourses will unfold along the process model. Everything hangs on the quality and institutional structuring of deliberation in the public sphere. Indeed, the primary reason why democracy confers legitimacy upon legislative outcomes is that it is rooted in a model of distinctly procedural popular sovereignty that simultaneously expresses the will of the community and leads to more rational outcomes. An analysis of the specific way in which democracy and the public sphere are related on Habermas’ model is the best way to understand how the democratic mode of lawmaking underwrites the legitimacy of legal norms.

In Between Facts and Norms Habermas proposes a “two-track” model of democratic politics outlining a circulation of political power that engenders legitimacy. He divides the political public sphere into informal and formal parts. The informal public sphere includes all the various voluntary associations of civil society: religious and charitable organizations, political associations, the media, and public interest advocacy groups of all varieties (BFN 355). In this sphere public political deliberation is free and unorganized. Through this open clash of views and arguments individuals and collectivities can both persuade and be persuaded, thereby contributing to the emergence of considered public opinions. In contrast, the formal public sphere includes institutionalized forums of discourse and deliberation like congress, parliament, and the judiciary as well as more peripheral administrative and bureaucratic agencies associated with state structures. This sphere is supposed to be organized in such a way that it renders decisions reflecting the considered public opinions of the informal public sphere. Formal institutionalized decision-making bodies must be porous to the results of the informal public sphere.

The informal public sphere is the key forum for generating a type of normative power that can integrate society through mutual understandings and solidarity rather than through money or administrative-bureaucratic power. When discourse participants in the informal public sphere freely reach mutual understandings about how to regulate the terms of shared life together, “communicative power” emerges (Flynn 2004 discusses communicative power’s precise locus). Communicative power arises from jointly authored norm expectations that are cognitively grounded in the force of better reasons and motivationally grounded (albeit weakly) in mutual recognition and collective ethical discourses. Cognitively speaking, free communication in the public sphere can foster “rational opinion and will-formation” because “the free processing of information and reasons, of relevant topics and contributions is meant to ground the presumption that results reached in accordance with correct [discursive] procedure are rational” (BFN 147). This acceptance also provides weak motivation: in accepting a norm’s validity claim one accepts the background understandings and reasoning underlying it, which can motivate action in relevant circumstances. Moreover, because this mutual understanding was presumably reached through persuasive discourse where reasoned dissent was (and remains) a real possibility, norm acceptance can also motivate in a spirit of anti-paternalistic empowerment: parties recognize each other as accountable and responsible for their actions in accord with a norm until new counter-reasons are discovered. While they may be aware of counter-inclinations and motives that are not backed by good reasons, they take one another to be competent, responsible agents who can choose to act on rationally backed norms (Günther 1998). Yet, because the motivation accompanying cognitive insight is fragile and weak, communicative power must also be rooted in a community with a shared ethical-political identity and legitimate law so that motivational deficits can be met with supplemental resources of a shared life and law.

Communicative power can only arise if the informal public sphere has certain characteristics. First and foremost, it must be relatively free of distortions, coercion, and silencing social pressures so that communication can work as a filter for fostering more rational individual and collective will formation (BFN 360). The public sphere also needs to accurately function as a “context of discovery” wherein problems that affect large segments of the public are identified and taken up for discussion and resolution in discourse. Moreover, civil society must be animated by a political culture so that members actively participate in voluntary associations and public discourse about the terms of common life together (BFN 371). Normative power potentials cannot be generated if members largely retreat into private concerns or a society is internally segmented and riven with special interests (Flynn (2004) 439-444; Bohman and Rehg (1999) 41-42). Clearly, if the public sphere is to remain healthy then the media’s role in fostering accurate information and timely mass communication will also be crucial (EFP, 138-183).

The political institutions of the formal public sphere are arranged so as to be porous to the inputs of the informal public sphere, to further refine and focus public opinion, and to make decisions. Building on the work of Bernhard Peters, Habermas maintains that modern constitutional democracies are set up so communication and decision-making flow from the “periphery” of the informal public sphere into the “center” constituted by those formal political institutions that create, enforce, clarify, or implement the law (BFN 354). In a well-functioning democratic regime there will be structural “sluices” or “floodgates” embedded in the institutions of the administrative state (legislature, judiciary, and so forth) so that the circulatory flow of power proceeds in the right direction, from the periphery to the center.

The thought is that the political community should “program” and direct the institutions of the administrative complex, not the other way around (BFN 356). If the state or other powerful actors reverse this flow by simply positing new laws or rules and either demanding compliance or inducing it in some other way, then this exercise of non-communicative administrative-bureaucratic power would be neither legitimate nor stable. Habermas claims the “integrative capacity of democratic citizenship” erodes to the extent that the circulation of political power is interrupted or reversed. Only communicative power has the legitimating force needed so that a community can both author and rationally abide by the law. Democratic lawmaking is the key institution that “represents…the medium for transforming communicative power into administrative power” while preserving its normative potential (BFN 169, 81, 299). Democratically generated law ensures normative power potentials flow in the right direction and that they are maintained when implemented by institutions of the administrative state.

This account of procedural-democratic collective self-determination should not be confused with traditional national self-determination. Habermas rejects models of sovereign collective self-determination that presuppose a nation or people with a homogeneous identity and interests, as well as models where “a network of associations” stands in for this (imaginary) collective self (BFN 185, 486). Instead, in modern constitutional democracies the “idea of popular sovereignty is… desubstantialized [and]…not even embodied in the heads of the associated members.” Popular sovereignty “is found in those subjectless forms of communication that regulate the flow of discursive opinion- and will-formation in such a way that their fallible outcomes have the presumption of practical reason on their side” (BFN 486). Insofar as we can speak about the will of a community, it is an anonymous and subjectless public opinion emerging out of the discursive structures of communication themselves (BFN 136, 171, 184-186, 299, 301). This unique interpretation of popular sovereignty helps explain some final aspects of Habermas’ political theory: his views on religion and the public sphere, his constitutional patriotism, and his vision of politics beyond the nation-state.

In his early writing Habermas claimed that, as the rationality and pluralism of Enlightenment ideals slowly took hold in modern societies, the mythic explanations of religion would become less important. But he gradually came to revise his view of religion in modern societies. What matters for present purposes is how he sees religion fitting into the public sphere of a liberal democracy. In liberal democracies, untrammeled populism is held in check not only by individual rights but also by the very nature of public debate: citizens collectively self-determine through persuasion and rational argumentation. To do this amidst the pluralism of modernity, the laws they make must be grounded in public reasons accessible to all. The question is what this means for religious citizens.

There have been a variety of answers. For instance, in Political Liberalism John Rawls held that liberal democratic citizens should ultimately only endorse policies that they can support on the basis of secular reasons. While these citizens may have religious reasons that favor a law or policy, when engaged in political debate they must eventually “translate” these reasons into terms that non-believers could accept. Habermas is sympathetic to the vision of liberal democracy animating this view of how religious citizens should act. Indeed, he criticizes thinkers like Wolterstorff who insist that religious citizens ought to be allowed to try to base coercive law on their own particularistic values and conception of the good. Nevertheless, he feels that placing the burden of “translation” onto religious citizens alone is somewhat misguided. Such an approach underestimates the ethical-existential importance of religion in some people’s lives—especially if it is bound up with the structure of their lifeworld and identity. As an alternative, Habermas proposes both religious and non-religious citizens be allowed to invoke any reasons for or against policies at the level of the informal public sphere, provided they take one another’s claims seriously and do not dismiss them from the outset. But when it comes to the institutions of the formal public sphere concerning coercive lawmaking, justifications should only be based in reasons that all can accept.

This view is somewhat unsatisfying for several reasons: it simply moves the asymmetrical burden of translation “up a level,” it may run into concerns of a metaphorical split in identity, and it could even saddle non-religious citizens with undue burdens (Yates 2007, Lafont 2009). For present purposes, the most charitable reading is that Habermas assumes all democratic citizens have an obligation to adopt a thoroughly self-reflective attitude. Religious citizens must “self-modernize” insofar as they are expected to be open to things like the authority of science, the need for non-religious reasons backing coercive law, and the possible validity of claims made by other religions. But, this also means non-religious citizens must move beyond a dogmatic secularist understanding wherein it is impossible for religious claims to have any cognitive value whatsoever. Indeed, given that some fundamental moral notions—such as equal human dignity—have been inextricably tied to the history of world religions, he claims it is not always clear where the boundaries of the religious and secular are. Determining these boundaries (and what can count as publicly acceptable) may at times be a cooperative task wherein each side takes the claims of the other with some degree of seriousness (2006b, 45 and 2003b, 109).

Habermas’ reinterpretation of popular sovereignty also explains why he has adopted the theory of constitutional patriotism pioneered by Dolf Sternberger. Constitutional patriotism maintains that, in contrast to national identities of the past, modern political communities can base their collective identities around the unique ways they appropriate and embed the abstract, universalistic principles of democratic self-determination within their unique histories and traditions. On such a model, political allegiance can coalesce around “a particularist anchoring of…the universalist meaning of [principles such as] popular sovereignty and human rights” (BFN 500; L’i 308; BNR 106). This particularist anchoring would presumably include the way in which a community takes up the abstract system of rights, the process model, and (L). The claim is that the specific way a political community instantiates the “abstract procedures and principles” of the modern democratic state fosters the development of a “liberal political culture” that “crystalizes” around that country’s constitutional traditions, structures, and discursive fora (IO 118; DW 78). The integrative force that emerges against this backdrop is called civic solidarity, which Habermas characterizes as “an abstract, legally mediated solidarity among citizens… a political form of solidarity among strangers” (DW 79; BNR 22). This is essentially the integrative potential of democratic citizenship when it is actively used.

One assumption here is that “culture and national politics have become…differentiated” from one another; citizens can see themselves as part of a shared political culture precisely because they no longer see the state as a vehicle for realizing a homogeneous, pre-political nation. While this is a far cry from empirical realities in many parts of the world, Habermas sees the European Union as illustrative in this regard. Even in a context that was once characterized by strong national identities (where the chances for such an identity might seem slimmer than in more multicultural contexts) we can start to see how “a common political culture could differentiate itself off from the various national cultures” and how “identifications with one’s own forms of life and traditions [could be] overlaid with a patriotism that has become more abstract, that now relates… to abstract procedures and principles” (NC 261; BFN 507, 465; IO 118; BNR 327; DW 78).

Finally, Habermas sees constitutional patriotism as a normative resource that could help to expand civic solidarity across political borders and uncouple legal structures from the nation-state so they could be scaled up into new institutions of international law. Such developments would allow new forms of democratic self-governance above the nation-state at regional and global levels (DW 79). These post-national implications follow naturally from Habermas’ core theoretical commitments. Deliberative democracy is committed to institutionalized discourse that in some way makes it possible for law to be justified to the persons who are affected by or subjected to it. Given increasing global interdependence, this pushes in cosmopolitan directions. At the same time, it is important to remember that communicative power must be rooted in a community with a shared ethical-political identity, and that constitutional patriotism is parasitic upon a particular political culture. This rootedness means that civic solidarity and new forms of self-governance can stretch, but only so far.

This anchored cosmopolitanism yields a multi-level constitutionalization of international law that aims at some measure of global governance without government. While Habermas’ account of such a multi-level system is only a sketch and many details need filling in, the broad outline is clear. He proposes a system comprising “supranational” (global), “transnational” (regional), and national-level political institutions with different roles. A supranational organization akin to a reformed United Nations is envisioned as securing international peace, security, and core human rights. At the mid-level, transnational authorities like the EU would tackle technical issues through coordinative efforts and political issues through negotiated bargaining among sufficiently representative regional regimes of commensurate stature. Finally, nation-states would retain their status as the locus of democratic legitimation. This would require the spread of democratic structures to each nation-state so that laws can reflect the will of the community and can be reliably in line with the basic human rights secured by a supranational organization.

This vision of a multi-level political system for the constitutionalization of international law can be criticized as demanding both too much and too little. Habermas’ version of cosmopolitan deliberative democracy locates the touchstone of legitimacy in the fact that “citizens are subject only to those laws which they have given themselves in accordance with democratic procedure” (CEU 14). From this perspective of democratically legitimate law, the proposed system may demand too little. Despite Habermas’ insistence that negotiation between regional regimes could take place in a way that would not “impair deliberation and inclusion,” it is hard to see how such bargaining could really constitute a process where citizens give themselves the law through democratic procedures (CEU 19). From the perspective of rootedness in political culture, the multi-level system may also demand too much with the extension of civic solidarity to transnational regimes. Habermas clearly thinks there are limits to such an extension, as “the transnational extension of civic solidarity…comes to nothing…when it is supposed to assume a global format.” However, apart from the fact that neighboring countries might be supposed to have some minimal level of shared history and culture born out of territorial proximity and an interdependency of interests, it is unclear why this extension of solidarity would reach the levels needed to underwrite the democratic legitimacy of laws within transnational units of regional governance (CEU 62).

While Habermas is certainly aware of these criticisms, he is largely focused on defending his political theory in broad, systematic terms. If the broad normative outlines are correct then the overall theory will stand regardless of how the empirical details are filled in. Indeed, Habermas is rather unique among contemporary philosophers both in his systematic approach to large areas of theory and in his willingness to allow others to fill in the details of how particular claims might work. He has always insisted that philosophers do not speak from a privileged place of knowledge. The best that they can hope for is to articulate a theory that can be convincingly and rigorously tested and debated in the public sphere. We can perhaps understand not only his political theory, but several other theoretical projects in this spirit of a public intellectual putting forth a theory for testing and debate that requires further articulation by those who come after.

6. References and Further Reading

a. General Introductions to Habermas

This article presents a general and reasonably complete introduction to Habermas. However, given the breadth of his work and space constraints, the following should also be consulted:

1978. McCarthy, Thomas. The Critical Theory of Jürgen Habermas. The MIT Press.
1988. White, Stephen K. The Recent Work of Jürgen Habermas. Cambridge University Press.
2005. Finlayson, James Gordon. Habermas: A Very Short Introduction. Oxford University Press.
2011. Fultner, Barbara (ed.) Jürgen Habermas: Key Concepts. Acumen Press.
2014. Bohman, James and Rehg, William. Jürgen Habermas. Stanford Encyclopedia of Philosophy.
Thomas Gregersen maintains an online bibliography at the Habermas Forum.

The following printed bibliographies are also useful:

2013. Corchia, Luca. Jürgen Habermas. A Bibliography: Works and Studies (1952-2013). Pisa (IT): Arnus University Books.
2014. Müller-Doohm, Stefan. Jürgen Habermas – Eine Biographie. Berlin: Suhrkamp.

b. Introductory Books and Articles on Specific Themes

i. Biography

2001. Matustik, Martin Beck. Jürgen Habermas: A Philosophical-Political Profile. Rowman and Littlefield.
2010. Specter, Matthew G. Habermas: An Intellectual Biography. Cambridge University Press.
2004. Wiggershaus, Rolf. Jürgen Habermas. Reinbek Bei Hamburg: Rowohlt.

ii. Linguistic Turn

1994. Cooke, Maeve. Language and Reason. The MIT Press.
1999. Lafont, Cristina. The Linguistic Turn in Hermeneutic Philosophy. Jose Medina (trans.). Cambridge University Press.
2016. Lafont, Cristina. Jürgen Habermas in The Blackwell Companion to Hermeneutics, 440-445.

iii. Discourse Ethics

1994. Davis, Felmon John. Discourse Ethics and Ethical Realism: A Realist Realignment of Discourse Ethics. European Journal of Philosophy, 125-142.
1997. Rehg, William. Insight and Solidarity: The Discourse Ethics of Jürgen Habermas. University of California Press.
2000a. Finlayson, James Gordon. Modernity and Morality in Habermas’s Discourse Ethics. Inquiry. 43, 319-40.

iv. Political Theory

1994. Bohman, James. Review: Complexity, Pluralism, and the Constitutional State: On Habermas’s Faktizität und Geltung. Law and Society Review, 897-930.
2002. Discourse and Democracy: Essays on Habermas’s Between Facts and Norms, ed. Rene von Schomberg and Kenneth Baynes. SUNY Press.
2010. Hedrick, Todd. Rawls and Habermas: Reason, Pluralism, and the Claims of Political Philosophy. Stanford University Press.

c. Works Cited

Most of Habermas’s work can be found in German and English. After the original year of publication and title, see the square brackets for the English translation. For some texts a translation does not exist, only exists in part, or is divided between texts. This is denoted by an asterisk (*).

1953. Mit Heidegger gegen Heidegger denken. Zur Veröffentlichung von Vorlesungen aus dem Jahre 1935. Frankfurter Allgemeine Zeitung. July 25, 1953. [English: 1977]
1956. Der Zerfall der Institutionen (Arnold Gehlen). Frankfurter Allgemeine Zeitung. July 4, 1956.*
1958. Philosophische Anthropologie: Ein Lexikonartikel. In: A. Diemer, I. Frenzel (eds.) Fischer-Lexikon Philosophie. Frankfurt am Main: Fischer. Pp. 18-35.*
1962. Strukturwandel der Öffentlichkeit. Darmstadt: Luchterhand. [English: 1989]
1967. Probleme einer philosophischen Anthropologie. Lecture transcript from the 1966/7 Winter semester at the University of Frankfurt (unauthorized edition).*
1967. Zur Logik der Sozialwissenschaften. Tübingen: J.C.B. Mohr. [English: 1988a]
1968a. Technik und Wissenschaft als Ideologie. Frankfurt am Main: Suhrkamp. [English: 1970, 1973b*]
1968b. Erkenntnis und Interesse. Frankfurt am Main: Suhrkamp. [English: 1971b]
1969. Protestbewegung und Hochschulreform. Frankfurt am Main: Suhrkamp. [English: 1970]
1970. Toward a Rational Society, J.J. Shapiro (trans.) Boston: Beacon.
1971a. Theorie und Praxis. Frankfurt am Main: Suhrkamp. [English: 1973b]
1971b. Knowledge and Human Interests. J. J. Shapiro (trans.). Boston: Beacon.
1971c. The Christian Gauss Lectures: Reflections on the linguistic foundations of sociology [For published versions, see chapter 1 of German 1984b and pp.1-103 of English 2001* original translation for lecture purposes by Jeremy Shapiro; re-translated for publication by Barbara Fultner]
1973a. Wahrheitstheorien. In H. Fahrenbach (ed.), Wirklichkeit und Reflexion. Pfullingen: Neske. 211-265. Reprinted as chapter 2 in 1984b.*
1973b. Theory and Practice, J. Viertel (trans.). Boston: Beacon.
1973c. Nachwort / Postscript to Knowledge and Human Interests: Philosophy of the Social Sciences [included in all subsequent printings of Knowledge and Human Interests].
1973d. Legitimationsprobleme im Spätkapitalismus. Frankfurt am Main: Suhrkamp. [English: 1975]
1975. Legitimation Crisis, T. McCarthy (trans.). Boston: Beacon.
1976a. Zur Rekonstruktion des Historischen Materialismus. Frankfurt am Main: Suhrkamp. [English: 1979*]
1976b. Was heißt Universalpragmatik? In K.-O. Apel (ed.), Sprachpragmatik und Philosophie. Frankfurt am Main: Suhrkamp. 174-272. [English: 1979, chap. 1]
1977. Martin Heidegger, on the publication of lectures from the year 1935. Graduate Faculty Philosophy Journal 6, no. 2: 155-180.
1979. Communication and the Evolution of Society, T. McCarthy (trans.) Boston: Beacon.
1981. Theorie des kommunikativen Handelns. Band I: Handlungsrationalität und gesellschaftliche Rationalisierung. Band II: Zur Kritik der funktionalistischen Vernunft. Frankfurt am Main: Suhrkamp. [English: 1984a and 1987]
1983. Moralbewußtsein und kommunikatives Handeln. Frankfurt am Main: Suhrkamp. [English: 1990a]
1984a. The Theory of Communicative Action, Volume I: Reason and the Rationalization of Society, T. McCarthy (trans.). Boston: Beacon.
1984b. Vorstudien und Ergänzungen zur Theorie des kommunikativen Handelns. Frankfurt am Main: Suhrkamp. [English: 2001* does not include the Wahrheitstheorien essay]
1985. Die Neue Unübersichtlichkeit: Kleine Politische Schriften V. Frankfurt am Main: Suhrkamp. [English: 1991]
1986a. Gerechtigkeit und Solidarität: Eine Stellungnahme zur Diskussion über Stufe 6. In W. Edelstein and G. Nunner-Winkler (eds), Zur Bestimmung der Moral. Frankfurt am Main: Suhrkamp. 291-318. [English: 1990b]
1986b. Entgegnung. In A. Honneth and H. Joas (eds), Kommunikatives Handeln. Frankfurt am Main: Suhrkamp. 327-405. [English: 1991b]
1987. The Theory of Communicative Action. Vol. II: Lifeworld and System, T. McCarthy (trans.). Boston: Beacon.
1988a. On the Logic of the Social Sciences, S. W. Nicholsen and J. A. Stark (trans.). Cambridge, MA: MIT Press.
1988b. Nachmetaphysisches Denken. Frankfurt am Main: Suhrkamp. [English: 1992a]
1989. The Structural Transformation of the Public Sphere, T. Burger and F. Lawrence (trans). Cambridge, MA: MIT Press.
1990a. Moral Consciousness and Communicative Action, C. Lenhardt and S. W. Nicholsen (trans). Cambridge, MA: MIT Press.
1990b. Justice and solidarity: On the discussion concerning stage 6. In: T. E. Wren (ed.), The Moral Domain. Cambridge, MA: MIT Press. 224-251, S. W. Nicholsen (trans.).
1991a. Erläuterungen zur Diskursethik. Frankfurt am Main: Suhrkamp. [English: 1993]
1991b. The New Conservatism: Cultural Criticism and the Historians’ Debate. S. W. Nicholsen (trans.).
1992a. Postmetaphysical Thinking, W. M. Hohengarten (trans.). Cambridge, MA: MIT Press.
1992b. Faktizität und Geltung. Beiträge zur Diskurstheorie des Rechts und des demokratischen Rechtsstaats. Frankfurt am Main: Suhrkamp. [English: 1996b]
1993. Justification and Application, C. P. Cronin (trans.). Cambridge, MA: MIT Press.
1994. The Past as Future. M. Pensky and P. Hohendahl (trans.). University of Nebraska Press.
1996a. Die Einbeziehung des Anderen. Studien zur politischen Theorie. Frankfurt am Main: Suhrkamp. [English: 1998a]
1996b. Between Facts and Norms: Contributions to a Discourse Theory of Law and Democracy, W. Rehg (trans.). Cambridge, MA: MIT Press. [German: 1992b]
1998a. Inclusion of the Other: Studies in Political Theory, C. Cronin and P. DeGreiff (eds). Cambridge, MA: MIT Press.
1998b. Die postnationale Konstellation. Frankfurt am Main: Suhrkamp. [English: 2001a]
1999a. Wahrheit und Rechtfertigung. Frankfurt am Main: Suhrkamp. [English: 2003a]
2001a. The Postnational Constellation, M. Pensky (trans., ed.). Cambridge, MA: MIT Press.
2001b. Die Zukunft der menschlichen Natur. Auf dem Weg zu einer liberalen Eugenik? Frankfurt am Main: Suhrkamp. [English: 2003b]
2001c. Zeit der Übergänge. Kleine politische Schriften IX. Frankfurt am Main: Suhrkamp. [English: 2004b]
2003a. Truth and Justification, B. Fultner (trans.). Cambridge, MA: MIT Press.
2003b. The Future of Human Nature, W. Rehg, M. Pensky, and H. Beister (trans.). Cambridge: Polity.
2004a. Der gespaltene Westen. Kleine politische Schriften X. Frankfurt am Main: Suhrkamp. [English: 2006a]
2004b. Time of Transitions. C. Cronin (trans.). Cambridge: Polity Press.
2006a. The Divided West. C. Cronin (trans.). Cambridge: Polity Press.
2006b. Pre-political foundations of the democratic constitutional state. In J. Habermas and J. Ratzinger The Dialectics of Secularization: On Reasons and Religion. B. McNeil (trans.). San Francisco: Ignatius. 19-52.
2007. Kommunikative Vernunft und grenzüberschreitende Politik. Eine Replik. In: Anarchie der kommunikativen Freiheit Jürgen Habermas und die Theorie der internationalen Politik. Peter Niesen and Benjamin Herborth (eds.). Frankfurt am Main: Suhrkamp Press.
2008. Between Naturalism and Religion. C. Cronin (trans.). Cambridge: Polity Press.
2008. Ach Europa. Kleine politische Schriften XI. Frankfurt am Main: Suhrkamp. [English: 2009]
2009. Europe: the Faltering Project. C. Cronin (trans.). Cambridge: Polity Press.
2012. The Crisis of the European Union: A Response. C. Cronin (trans.). Cambridge: Polity Press.
2014. The Lure of Technocracy. C. Cronin (trans.). Cambridge: Polity Press.
2014. Entgegnung (x13) and Schlusswort. In: Habermas und der Historische Materialismus. Smail Rapic (ed.). München: Karl Alber Press.

d. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

1990. Apel, Karl-Otto. Diskurs und Verantwortung. Frankfurt am Main: Suhrkamp.
2002. Apel, Karl-Otto. Regarding the relationship of morality, law and democracy: on Habermas’s Philosophy of Law (1992) from a transcendental-pragmatic point of view. In: Habermas and Pragmatism. Aboulafia, Mitchell, Myra Bookman and Catherine Kemp (eds.). 17-30.
2014. Brunkhorst, Hauke. Critical Theory of Legal Revolutions: Evolutionary Perspectives. Bloomsbury.
2009. Brunkhorst, Hauke, Regina Kreide and Cristina Lafont (eds.) Habermas-Handbuch. Stuttgart: JB Metzler.
2000b. Finlayson, James Gordon. What Are ‘Universalizable Interests’? Journal of Political Philosophy. vol. 8, no. 4, 456-469.
2004. Flynn, Jeffrey. Communicative Power in Habermas’s Theory of Democracy. European Journal of Political Theory, 433-454.
1993. Günther, Klaus. The Sense of Appropriateness. J. Farrell (trans.). Albany: SUNY Press.
1991. Honneth, Axel and Hans Joas. Communicative Action: Essays on Jürgen Habermas’s Theory of Communicative Action. Gaines, Jeremy and Doris L. Jones (trans.). Cambridge, MA: MIT Press.
2003. Horkheimer, Max and Theodor Adorno. Dialectic of Enlightenment. G. Schmid Noerr (ed.), E. Jephcott (trans.). Stanford: Stanford University Press.
2001. Joas, Hans. Values versus Norms: a pragmatic account of moral objectivity. In: The Hedgehog Review 3, 42-56.
2012. Lafont, Cristina. Agreement and Consent in Kant and Habermas: Can Kantian Constructivism be fruitful for democratic theory? In: The Philosophical Forum. 43/3, 277-95.
2009. Lafont, Cristina. Religion and the Public Sphere: What are the deliberative obligations of democratic citizenship? Philosophy and Social Criticism 35, 127-150.
1991. McCarthy, Thomas. Ideals and Illusions, Cambridge, MA: MIT Press.
2007. Müller, Jan-Werner. Constitutional Patriotism. Princeton University Press.
2000. Müller-Doohm, Stefan (ed.). Das Interesse der Vernunft: Rückblicke auf das Werk von Jürgen Habermas seit “Erkenntnis und Interesse”. Frankfurt am Main: Suhrkamp.
2007. Niesen, Peter and Benjamin Herborth (eds.). Anarchie der kommunikativen Freiheit-Jürgen Habermas und die Theorie der internationalen Politik. Frankfurt am Main: Suhrkamp Press.
2002. Owen, David. Between Reason and History: Habermas and the Idea of Progress. Albany: SUNY Press.
2014. Rapic, Smail (ed.). Habermas und der Historische Materialismus. München: Karl Alber Press.
1989. Rockmore, Tom. Habermas and Historical Materialism. Bloomington and Indianapolis: Indiana University Press.
1998. Rosenfeld, Michel, and Andrew Arato (eds). Habermas on Law and Democracy, Berkeley: University of California Press.
1982. Thompson, John B. and David Held. Habermas: Critical Debates. Cambridge, MA: MIT Press.
1991. Wellmer, Albrecht. Ethics and dialogue: Elements of moral judgment in Kant and discourse ethics. In: A. Wellmer, The Persistence of Modernity, D. Midgley (trans.). Cambridge, MA: MIT Press. 113-231.
1995. White, Stephen K. (ed.). The Cambridge Companion to Habermas. Cambridge: Cambridge University Press.
2007. Yates, Melissa. Rawls and Habermas on religion in the public sphere. Philosophy and Social Criticism. 33, 880-891.

Author Information

Max Cherem
Email: Max.Cherem@kzoo.edu
Kalamazoo College
U. S. A.

Veṅkaṭanātha (Vedānta Deśika) (c. 1269—c. 1370)

Veṅkaṭanātha (also known as Vedānta Deśika “teacher of Vedānta”) was an Indian polymath who wrote philosophical as well as religious and poetical works in several languages, including Sanskrit, Maṇipravāḷa—a Sanskritised form of literary Tamil—and Tamil. He is traditionally dated to 1269-1370, but as explained by Neevel “the lifespans of the earliest teachers of Viśiṣṭādvaita Vedānta have been prolonged in order to connect them with each other” (1977, pp. 14-16). He constitutes a turning point in the history of Viśiṣṭādvaita Vedānta philosophy, and his intellectual work shaped both this current and the tradition of Śrī Vaiṣṇavism (the religious counterpart of Viśiṣṭādvaita Vedānta) in general. The number of works he produced (more than 100 are attributed to him) and their depth make any assessment of his contribution preliminary, and the present one is no exception. He is a major figure in the history of Indian philosophy who wrote on a variety of philosophical topics and in a variety of genres. Unlike other authors, it appears that Veṅkaṭanātha was able to convey his theological ideas in different genres and different styles, so that his poems express in a mystical and condensed way what his essays explain in systematic language. Veṅkaṭanātha’s philosophical works are composed mostly in Sanskrit but in some cases also in Maṇipravāḷa. Veṅkaṭanātha wrote both independent treatises and commentaries on other works.

Background
Veṅkaṭanātha’s Life, Works, and Formation
Veṅkaṭanātha’s Role within the History of Indian Philosophy
Veṅkaṭanātha’s Epistemology, Ontology, and Theology
1. Epistemological Issues
2. Cosmology and Metaphysics
State of the Art of Research on Veṅkaṭanātha
Abbreviations
References and Further Reading

1. Background

The Viśiṣṭādvaita Vedānta is a philosophical and theological school, chiefly active in South India from the last centuries of the first millennium until today, that holds that the Ultimate is a personal God who is the only existing entity and of whom everything else (from matter to human and other living beings) is a characteristic. This God is usually called Viṣṇu, hence the adjective Vaiṣṇava for His believers. As its name declares, Viśiṣṭādvaita Vedānta sees itself as a Vedānta school. The designation “Vedānta” (or “Uttara Mīmāṃsā”) is the name adopted by various concurring philosophical schools recognizing the Vedāntasūtras “Aphorisms on Vedānta” (also called Brahmasūtras “aphorisms on the brahman”) as one of their foundational texts focusing on the exegesis of the Upaniṣads. The latter are a collection of texts recognized by Vedāntins as the culmination of the Vedas, the Indian sacred texts, that elaborate on the brahman, the absolute, with a theistic or a more or less monistic approach. By contrast, the philosophical school of Mīmāṃsā (or Pūrva Mīmāṃsā) is based on the Mīmāṃsāsūtra (or Pūrva Mīmāṃsā Sūtra) and focuses on the exegesis of the prescriptive portion of the Veda, namely the Brāhmaṇas. Three further schools will be mentioned below, namely Advaita Vedānta (or monistic Vedānta), Sāṅkhya, and Nyāya. The first offers a monistic interpretation of the Upaniṣads, against which the Viśiṣṭādvaita Vedānta reacts vehemently. The Sāṅkhya school is a dualist school focusing on the opposition between a dynamic natura naturans, from which everything in the world originates, and an inert Spirit (see Sāṅkhya). The Nyāya school is the school of Indian philosophy focusing on logic, dialectics and epistemology, and it recognizes the Nyāyasūtra as its foundational text (see Nyāya).

The beginnings of Viśiṣṭādvaita Vedānta as an independent school are usually connected to Rāmānuja (traditional dates 1017-1137, see Rāmānuja), both in India and in Western scholarship. Rāmānuja was indeed the first to write a new commentary on the Vedāntasūtras, thus firmly situating his school within Vedānta. Ex post and principally due to the work of Veṅkaṭanātha, further predecessors of Rāmānuja have been linked to the prehistory of Viśiṣṭādvaita Vedānta, especially Nāthamuni, of whom no work is extant, and Yāmuna, of whom a complete work and various fragmentary ones are preserved.

The religious counterpart of Viśiṣṭādvaita Vedānta is usually called Śrī Vaiṣṇavism (on the introduction of this term, see Colas 2003, p. 247) since it focuses on the devotion to Viṣṇu and His wife Śrī. Other Vaiṣṇava texts worth mentioning are the constellation of texts called Pāñcarātras, a heterogeneous group of texts prescribing tantric rituals, and the devotional poetry in Tamil of the Āḻvārs. The heterogeneity of the extant Pāñcarātra texts makes it difficult to reconstruct a Pāñcarātra theology, but all Pāñcarātrins agree on the doctrine of God’s manifestations (called vyūhas), a belief which appears to clash with the absolute monism of Advaita Vedānta.

2. Veṅkaṭanātha’s Life, Works, and Formation

On Veṅkaṭanātha, we have an unusual wealth of biographical material, although much of it is based on later hagiographies and is, therefore, not necessarily reliable as a source of historical data. Nonetheless, it appears to be clear that Veṅkaṭanātha was born in a family closely linked to Śrī Vaiṣṇavism and that he was born and raised in the current state of Tamil Nadu. It seems that Veṅkaṭanātha was born in Tūppul, close to Kañcipūram, was active in various key locations of Śrī Vaiṣṇavism, and eventually settled in Śrīraṅgam, the temple town that is the chief centre of Śrī Vaiṣṇavism even in the 21st century. More than one hundred works (some of which are very short religious hymns) attributed to Veṅkaṭanātha have been preserved, and for most of them the attribution seems to be genuine. As noted earlier, Veṅkaṭanātha was able to convey his theological ideas in different genres and styles. One example is Veṅkaṭanātha’s hymn to Hayagrīva (Viṣṇu’s form as a horse and as God of knowledge), which glorifies the figure of the god Hayagrīva as the only God, thus unifying, in a single symbol, Veṅkaṭanātha’s emphasis on a personal relation to God, his stress on Vedic learning, and his conception of a God who is not (only) transcendent.

This ability to master different approaches earned Veṅkaṭanātha the epithet of “lion [that is, foremost] among the poets and the philosophers,” which is found in most maṅgalas “auspicious verses” (found at the beginning of a work) dedicated to him. Through a comparison of his devotional, theological, and philosophical works, one can get a detailed idea of his intellectual profile and contribution.

Veṅkaṭanātha’s philosophical works are composed mostly in Sanskrit but in some cases also in a highly Sanskritised form of Tamil called “Maṇipravāḷa.” For reasons that will be discussed below, Veṅkaṭanātha wrote both independent treatises and commentaries on other works. The former category includes such works as the Tattvamuktākalāpa, with its auto-commentary called Sarvārthasiddhi; the Nyāyasiddhāñjana and the Śatadūṣaṇī, all dealing with various philosophical topics in a relatively small number of pages for each topic; and works dealing with a single topic such as the Pāñcarātrarakṣā (Defense of Pāñcarātra), dedicated to the epistemic justification of the validity of the Pāñcarātra sacred texts. Among the commentaries, and within the subcategory of the closer commentaries, one can count Veṅkaṭanātha’s subcommentary (called Tātparyacandrikā) on Rāmānuja’s commentary on the Bhagavadgītā, and his subcommentary (called Tattvaṭīkā) on Rāmānuja’s Śrībhāṣya (henceforth ŚrīBh) on the Brahmasūtra. Less close are his Adhikaraṇasārāvalī, which comments on the topics of the Brahmasūtra according to their own sequence, and the Seśvaramīmāṃsā (“Theistic Mīmāṃsā,” a new and independent commentary on the Mīmāṃsāsūtra). Akin to the latter work are works dealing with a specific tradition of thought, for example the Mīmāṃsāpādukā on Mīmāṃsā or the Nyāyapariśuddhi “Purification of Nyāya” on Nyāya and, more generally, the doxographic Paramatabhaṅga “Refutation of the other systems of thought,” written in Maṇipravāḷa. The Nyāyapariśuddhi and the Nyāyasiddhāñjana are further linked insofar as the former deals with epistemology and the latter with ontology, thus mirroring the fundamental opposition of pramāṇa (“instrument of knowledge”) and prameya (“object of knowledge”) found in the Nyāya school since the Nyāyasūtra and paradigmatically in Jayanta’s masterwork, the Nyāyamañjarī, also divided into two major sections of six books each.

Veṅkaṭanātha’s maternal uncle and preceptor, Ātreya Rāmānuja, had a profound impact on his intellectual formation. Ātreya Rāmānuja is traditionally believed to have been born in Kañcipūram in the year 1220, the fourth in a lineage of disciples started by Rāmānuja himself (Ramanujachari and Srinivasacharya 1938, p. v). Several elements of Veṅkaṭanātha’s systematisation of Viśiṣṭādvaita Vedānta can indeed be found in Ātreya Rāmānuja’s Nyāyakuliśa (henceforth NK). Like other texts by Veṅkaṭanātha, the NK focuses on various philosophical topics rather than commenting on a root text (like Rāmānuja’s ŚrīBh) or on a single topic (like Yāmuna’s ĀP or, for instance, Maṇḍana Miśra’s treatises). Further, the NK is in a constant dialogue with the other schools of Vedānta, primarily with Advaita Vedānta, but, as in the case of Veṅkaṭanātha, other schools of Indian philosophy also play a major role in it, namely Nyāya and Mīmāṃsā.

3. Veṅkaṭanātha’s Role within the History of Indian Philosophy

Veṅkaṭanātha is an important figure in the history of Indian philosophy. Since he is a historical figure, the explication of his thought is facilitated by the contextual knowledge available about the times, the cultural and geographical milieu, and the religious tradition related to him. Conversely, the study of Veṅkaṭanātha and of his sources allows one to undertake a study of Indian philosophy as known to him and of the changes he implemented in its interpretation.

Veṅkaṭanātha’s works are also an invaluable mine of information concerning his predecessors within what was later labelled Viśiṣṭādvaita Vedānta, since Veṅkaṭanātha’s intellectual openness led him to frequent confrontations with previous authors and, thus, to quotes from them. In this way, thanks to Veṅkaṭanātha, numerous quotes from otherwise lost works have come down to us: from Nāthamuni’s Nyāyatattva (quoted largely in the Nyāyasiddhāñjana, for example, Vīrarāghavācārya 1976, p. 519) to Nārāyaṇārya (quoted, for example, in the Nyāyasiddhāñjana ad 8, Vīrarāghavācārya 1976, p. 39, about whom see Trikha 1999); from Parāśara Bhaṭṭa’s Tattvaratnākara (quoted in the Nyāyasiddhāñjana, for example, Vīrarāghavācārya 1976, pp. 111, 408) to the Vṛttikāra (quoted, for example, in the introduction of the SM); from Rāmamiśra’s Ṣaḍarthasaṅkṣepa (quoted in the Nyāyasiddhāñjana) to Dramiḍācārya (quoted in the Nyāyasiddhāñjana, Vīrarāghavācārya 1976, p. 437) and to Yādavaprakāśa (about whom see Oberhammer 1997 and about whose presence in Veṅkaṭanātha’s works see Freschi 2016b).

The Vedāntic school of Indian philosophy we refer to as Viśiṣṭādvaita Vedānta has been largely influenced by the shape Veṅkaṭanātha gave to it. For instance, the traditional lineage of teachers of Viśiṣṭādvaita Vedānta includes the Āḻvārs, an elusive Vṛttikāra, Nāthamuni, Yāmunācārya, Rāmānuja, and Parāśara Bhaṭṭa. All bear just some vague family resemblance to each other, but all are directly linkable to Veṅkaṭanātha’s view of Viśiṣṭādvaita Vedānta, which includes the Āḻvārs’ devotionalism and the Pāñcarātra’s ritualism together with a robust commitment to the scholarly tradition of Indian philosophy in general (as already found in Nāthamuni and, from a different point of view, Yāmuna), and Vedānta in particular (as explicit in Rāmānuja).

Veṅkaṭanātha’s contributions reconfigured his predecessors’ contributions by giving them a new meaning. This is true for Veṅkaṭanātha’s closer engagement with the philosophical debate (even with the opponents of Vedānta), for Veṅkaṭanātha’s reformulation of what is included in “Vedānta,” for Veṅkaṭanātha’s reexamination of some key issues in Rāmānuja’s thought, for the fixation of Rāmānuja as the key reference point for Viśiṣṭādvaita Vedānta, and for Veṅkaṭanātha’s redefinition of the supreme Deity (see Freschi 2015a, 2015c and 2016a and b).

The main philosophical outlines of Veṅkaṭanātha’s Viśiṣṭādvaita Vedānta are:

1. the Vedāntic viewpoint
2. the emphasis on Pūrva Mīmāṃsā and Uttara Mīmāṃsā (or Vedānta) as a single system
3. the incorporation of Pāñcarātra
4. the incorporation of the Āḻvārs’ theology

All these points are shared with one or another of the predecessors of Veṅkaṭanātha: 1 and, to a much lesser extent, 2 with Rāmānuja (see Freschi 2016a), 3 with Yāmuna, and 4 with the Āḻvārs, and perhaps also with some other predecessors. What remains distinctive of Veṅkaṭanātha is the smooth synthesis of these various elements and the usage of Mīmāṃsā as a synthesizing key. For instance, the aikaśāstrya “unity of the teaching” which Veṅkaṭanātha shows to hold between Pūrva and Uttara Mīmāṃsā offers the model for further extensions of the Viśiṣṭādvaita Vedānta system. Similarly, Veṅkaṭanātha uses the model of the Mīmāṃsā’s approach to the various Vedic śākhās “recensions” in order to deal with the different views highlighted in the various Pāñcarātra Saṃhitās.

The development of Viśiṣṭādvaita Vedānta as a Vedāntic school becomes clear as one looks back at Veṅkaṭanātha’s predecessors, but it is important to note that what seems a posteriori to be a Vedāntic school would probably not have appeared such to the school’s contemporaries. In fact, Yāmuna’s relation to Vedānta is complex. He quoted from the Upaniṣads in his Ātmasiddhi “Establishment of the Self” (henceforth ĀtS), where he started by listing the Vedānta teachers he wanted to refute (including Bhartṛhari and Śaṅkara), so that one might think that he is keener to “purify” Vedānta than he is to “purify” Nyāya. At the same time, at least in the ĀtS (see Mesquita 1971, pp. 4–13), Yāmuna accepted an anti-Vedāntic proof for the existence of God, namely inference, whereas Pūrva and Uttara Mīmāṃsā would rather agree that God can only be known through the Sacred Texts. Moreover, Yāmuna’s Āgamaprāmāṇya “On the Validity of the [Vaiṣṇava] Sacred Texts” (henceforth ĀP) seems to have a completely different focus as it stresses the epistemic validity of the Pāñcarātra texts as works of a reliable author, namely God Himself. Rāmānuja is more straightforwardly part of a Vedāntic approach and this is probably also the reason why he has often been considered the “founder” of Viśiṣṭādvaita Vedānta, a term which was still unknown to him and to Yāmuna. The term seems to have indeed been used first by Rāmānuja’s commentator Sudarśanasūri, and it did not become established as the name of a school until much later, namely in “the latter half of the 16th century” (Varadachari 1962). As for Nāthamuni, his relation to Vedānta can only be conjectured from what we know of him through his successors, given that his works have been lost. Their titles focus, however, on Nyāya and Yoga (and not on Vedānta).

We know nothing about Nāthamuni’s relation to Pūrva Mīmāṃsā (Mīmāṃsā for short), but we do know that at least one trend within Vedānta (as testified by Śaṅkara’s commentary on the Brahmasūtra, see Freschi 2016a) claimed that the study of Pūrva Mīmāṃsā was not necessary and this trend possibly remained influential for a long time, given the energy Veṅkaṭanātha still needed to dedicate to the issue well after Rāmānuja’s nominal acceptance of Mīmāṃsā. In fact, as will be shown immediately below, Rāmānuja accepted Mīmāṃsā as a preliminary discipline of Vedānta, possibly in order to situate Viśiṣṭādvaita Vedānta well within the Vedic orthodoxy (readers will remember that the Mīmāṃsā focuses on the exegesis of a portion of the Veda) and as part of his polemics against Śaṅkara’s Advaita Vedānta. Nonetheless, it is only with Veṅkaṭanātha that this acceptance becomes an inclusion of Mīmāṃsā on the same level as Viśiṣṭādvaita Vedānta.

Before Rāmānuja, Yāmuna’s relation to Pūrva Mīmāṃsā is two-faced, insofar as Pūrva Mīmāṃsā authors seem to be his targeted objectors, whom he wanted to convince of the legitimacy of the Pāñcarātra transmission, but often by resorting to their own arguments; additionally, he by and large adopted Nyāya strategies, such as the reference to God as the authoritative source of the epistemological validity of the Pāñcarātra or the use of inference to establish God’s existence (a procedure which is criticized by Veṅkaṭanātha in the chapter on God of the Nyāyasiddhāñjana, see Clooney 2000, pp. 108–109). The situation changes perhaps during Yāmuna’s own life, certainly with Rāmānuja, who steered in the direction of Vedānta and, thus, came closer to the Pūrva Mīmāṃsā. He came so close that he programmatically stated at the beginning of his commentary on the Brahmasūtra not only that the Brāhmaṇa part of the Veda (the one Mīmāṃsā authors focus on) needs to be studied, but also that its study is part of the same teaching as the Vedānta. Veṅkaṭanātha took advantage of this remark and spelt out its deep implications: he could thus state that the whole of Pūrva Mīmāṃsā and the whole of Uttara Mīmāṃsā constitute a single teaching (ekaśāstra). In addition, he assigned a distinctive role to the Saṅkarṣakāṇḍa, a less well-known Mīmāṃsā text which had little fortune within Pūrva Mīmāṃsā but which rose with Veṅkaṭanātha (or perhaps already in the lost predecessors who inspired him) to the role of a bridge between the Pūrva and Uttara Mīmāṃsā, since it is interpreted as focusing on veneration and on God.

As for the incorporation of Pāñcarātra and of the Āḻvārs (see Nos. 3 and 4 above), Veṅkaṭanātha used the same model he elaborated to establish point 2 in order to incorporate these further elements into the system. He reached back to the Pāñcarātra, which had been defended by Yāmuna but rather neglected by Rāmānuja, and, more strikingly, to the hymns of the Āḻvārs. It is in this sense more than telling that Veṅkaṭanātha (as the first among the early teachers of Viśiṣṭādvaita Vedānta) decided to write in Tamil and to write theology in poetical form, as the Āḻvārs had done.

As is evident from the previous sections, Veṅkaṭanātha attempted to bring different voices together in a synthesis without excluding any of them. He examined and selected texts and ideas coming from different backgrounds and reshaped Viśiṣṭādvaita Vedānta in a distinctive way, one which the 21st-century reader recognizes at first sight but only because he or she is following Veṅkaṭanātha’s interpretation of the school.

One of the guiding elements of Veṅkaṭanātha’s synthesis was his pro-Vedic attitude, which translated into a pro-Pūrva Mīmāṃsā bias and the attempt to moderate the anti-Vedic tendencies within Śrī Vaiṣṇavism. For instance, Veṅkaṭanātha (at least in his Sanskrit works) did not espouse the Ekāntin current within Śrī Vaiṣṇavism (which contended that a Pāñcarātra text, the Ekāyanaveda, was the Ur-Veda, from which all Vedas originated), and his defense of the Pāñcarātra texts rested predominantly on the claim that they are based on the Vedas, like any other smṛti (a group of texts deriving their validity from the fact that they allegedly merely re-elaborate Vedic contents) (see Seśvaramīmāṃsā ad PMS 1.1.2), apart from their independent value as autonomous revelation of God.

This pro-Vedic attitude is probably the reason for Veṅkaṭanātha’s personal devotion to Hayagrīva. Before Veṅkaṭanātha, Hayagrīva was considered only a minor avatāra of Viṣṇu, and Veṅkaṭanātha was the first to dedicate to him a stotra “eulogy.” The reason for his choice of Hayagrīva as his favourite form of God is probably Hayagrīva’s link with intellectuality and with the Vedas, as well as his being a form of Viṣṇu. In this sense, Hayagrīva was a perfect symbol of the union of Pūrva Mīmāṃsā Vedism and devotionalism.

Veṅkaṭanātha’s figure also played a major role in the so-called split between the Vaṭakalai and Teṅkalai currents within Śrī Vaiṣṇavism, even though these two currents became known by these names only much later than Veṅkaṭanātha. In the 21st century, the two currents are distinguished by sociological as well as doctrinal elements, for example, the doctrine of an Ekāyanaveda and the emphasis on God’s Grace vs. free will in the Teṅkalai and vice versa in the Vaṭakalai. Although, as will be shown below, Veṅkaṭanātha tried to find a synthesis within the latter tension, the Vaṭakalai current retrospectively identified Veṅkaṭanātha as its founder and adopted en bloc his theology and philosophy. This also means that the Vaṭakalai identified itself through elements that were typical of Veṅkaṭanātha’s philosophical and religious attitude, even those that had not necessarily been pursued after him. The desire to follow Veṅkaṭanātha in each possible aspect is probably the key reason why Vaṭakalai authors added a short praise to Hayagrīva at the beginning of their works instead of the customary praise to Gaṇeśa (the elephant-God of learning praised at the beginning of each work). For the same reasons, in 1676 a temple was dedicated to Hayagrīva and is currently run by Vaṭakalais.

4. Veṅkaṭanātha’s Epistemology, Ontology, and Theology

a. Epistemological Issues

Veṅkaṭanātha more or less adopted the Pūrva Mīmāṃsā epistemology but with some differences which will be highlighted below. By contrast, his attitude towards the antagonist of Mīmāṃsā (including also the Uttara Mīmāṃsā, that is, Vedānta), namely Nyāya, is much more critical:

Therefore, according to the rule of the lion hidden in the forest, we will follow the Veda supported by the logical rules (nyāya) according to reality, and the Nyāya when it agrees with the Veda. By contrast, we will not follow the pure Nyāya.

ataḥ siṃhavanaguptinyāyena yathāvasthitanyāyānugṛhītaṃ vedaṃ vedānumataṃ ca nyāyam anusarāmaḥ, na punar nyāyamātram (Nyāyapariśuddhi 1.1.2)

Concerning sense perception, one sensitive topic is that of yogipratyakṣa ‘intellectual intuition,’ a specific kind of direct perception in which the intellect acts as direct access to knowledge, as if it were a sense faculty. Philosophers of the Nyāya and of the Buddhist Pramāṇavāda schools uphold this kind of direct perception as belonging to exceptional individuals. Through intellectual intuition, exceptional individuals (like the Buddha, a ṛṣi “mythical seer,” or God) could have direct access to dharma, thus relativizing the uniqueness of the Vedic Sacred Texts. To support the possibility of yogipratyakṣa is, thus, consistent with the Nyāya’s discussion of the validity of the Veda as dependent on its author and with the Pramāṇavāda view that the Buddha could access dharma. Veṅkaṭanātha, conversely, is in a difficult situation, insofar as he wants to defend both the uniqueness of the Vedas (see above, section 3, for his Vedism) and the authority of God. In order to reconcile these two positions, Veṅkaṭanātha in his SM ad 1.1.4 (the PMS aphorism on direct perception) firmly denies the possibility for human beings to attain intellectual intuition, so that no single human being could ever question the validity of the Vedas. Nonetheless, consistently with his theism, Veṅkaṭanātha does not deny the possibility for God alone to see the dharma. This entails that it would be theoretically possible for God to reveal a new Sacred Text. He will, nonetheless, not do it because the Veda is already His will made into words, and God does not change whimsically what He wants.

As for the Pāñcarātras, they enjoy the same validity as smṛti texts insofar as they derive their validity from the fact of stating what is already present in the Veda, either in a branch of it that is available, or in a currently lost one.

b. Cosmology and Metaphysics

What, then, is God’s relationship to the Veda and to the world? The Veda is said to be the direct manifestation of His will but not His revelation. The latter option is rejected because it would make the Veda subordinate to God’s will and liable to be overcome by a later and possibly more complete revelation. Thus, the Veda is a crystallization of God’s permanent free will. Similarly, the world, including all human beings inhabiting it, is described as God’s body in the sense that God can experience through it, insofar as each body pertains to something conscious (cetana) (Nyāyasiddhāñjana, Vīrarāghavācārya 1976, pp. 162–163).

The claim that the world is the body of God also puts Veṅkaṭanātha’s interest in ontology in the right perspective, which is always subordinate to his interest in theology. Ontology is not conceived as the study of what exists independently from God nor as the study of inert matter—since such inert matter is inconceivable in the Viśiṣṭādvaita Vedānta worldview.

A further topic that is closely linked to this one regards the definition of body in general in the work of Veṅkaṭanātha. Classical Indian philosophers tend to define śarīra “body” as a tool for experience (bhogasādhana). Thus, most philosophers state that plants only seem to have bodies because of our anthropomorphic tendencies, which make us believe that they function like us, whereas in fact plants cannot experience (Freschi 2015b). By contrast, Veṅkaṭanātha in the Nyāyasiddhāñjana defines śarīra in the following way:

Therefore, this śarīra is of two types: permanent and impermanent. Among them, permanent are God’s body—consisting of auspicious substrates, namely substances with the three qualities, time and [individual] souls—and the intrinsic form of Garuḍa, the snake (Ananta), etc. belonging to permanent [deities]. The impermanent [body] is of two types: not made of karman ‘past actions’ and made of karman. The first one has the form of the primordial natura naturans, etc. of God. In the same way, [an impermanent body not made of karman] assumes this or that form according to the wish of the liberated souls, such as Ananta and Garuḍa. Also the [body] made of karman is of two types: made out of one’s decision and karman and made out of karman alone. The first type belongs to great [souls] like the Muni Saubhari. The other one belongs to the other low [souls] (i.e., all normal human beings and the other conscious living beings). Moreover, the body in general is of two types, movable and unmovable. Wood (i.e., trees) and other [plants] and rocks and other [minerals] are unmovable. […] That there are souls also in rock-bodies is established through stories such as that of Ahalyā. (Nyāyasiddhāñjana, Vīrarāghavācārya 1976, p. 174–176, my translation)

Note that the inclusion of plants and rocks within what is a body could have to do with the fact that the whole world is the body of God and that consequently everything is a śarīra. The first commentary explains the “etc.” after natura naturans as referring to God’s manifestations, called vyūhas, which are a typical mark of Pāñcarātra theology throughout its history (see Schmücker 2007 for their relevance in Veṅkaṭanātha). The various forms assumed by God(s) are the impermanent ones He can assume on top of His permanent one at wish, not depending on karman. Last, the hint at Ahalyā refers to a story told, for example, in the Sanskrit epic Rāmāyaṇa: Ahalyā was transformed into a stone and then back into a woman, a fact which proves that a soul was present also while she was a stone.

The definition of God’s body entails that God is conscious and has as His body, apart from other substances and time, also the conscious souls of living beings. How can free will be possible under these circumstances? According to Veṅkaṭanātha’s Sanskrit works, human free will is granted only insofar as God Himself actively wants humans to be free (see Freschi 2015c). The possibility of human free will is ontologically plausible insofar as Veṅkaṭanātha does not consider karman an all-encompassing causal force, such that its suspension would violate normal causation. Rather, karman only binds low-level souls, whereas God and liberated souls are not subject to the laws of karman and their free will is unrestrained.

In the Nyāyapariśuddhi, Veṅkaṭanātha discussed some fundamental ontological topics in order to distinguish his positions from the Nyāya-Vaiśeṣika position. The Nyāyasūtra proposes a fundamental division of realities into dravya “substances,” guṇa “qualities,” and karman “actions” (for a full list, including further categories included later, see Franco and Preisendanz 1998), with the first as the substrate of the latter two. This leads to two difficulties for Veṅkaṭanātha’s agenda. On the one hand, the radical distinction between substance and attribute means that Nyāya authors imagine liberation to be the end of the connection of the ātman “self” (of each individual being) to all attributes, from suffering to consciousness. By contrast, Veṅkaṭanātha would never accept consciousness to be separated from the individual soul and even less from God, who, being a substance, would also (from the point of view of Nyāya) be at least in principle separable from His attributes, including consciousness. The other difficulty regards the theology of Viśiṣṭādvaita Vedānta. Since the beginnings of Pāñcarātra, one of its chief doctrines has been that of the manifestations (vibhūti) of Viṣṇu, which are dependent on Him but co-eternal with Him, and, in this sense, are unexplainable according to the division of substances into eternal and transient.

In response, Veṅkaṭanātha proposed more than one classification, which makes it clear that his main aim was to address the above-mentioned problems with the Nyāya ontology rather than to establish a distinct ontology in full detail.

The first proposal found in the Nyāyapariśuddhi (1.1.2) is based on the idea of a two-fold division into dravya “substance” and adravya “non-substance,” where substance is defined as the “substrate of possible accidental characteristics” (āgantukadharmāśraya). The attribute “accidental” hints at the fact that characteristics are ephemeral, whereas substances are permanent (nitya). The sub-classifications go further as depicted in Figure 1, with substance being divided into inert and alive. The former category includes the prakṛti “natura naturans” of Sāṅkhya and time, which is thus no longer a quality (guṇa) as in Nyāya. The latter category (alive substance) includes separate and heteronomous substances. Within the former are individual souls and God, distinguished insofar as individual souls are conscious but dependent on someone else (namely God), whereas God is autonomous (paratantracetano jīvaḥ, svatantra īśvaraḥ). The heteronomous category includes cognition as essential to God (see the paragraph immediately above) and the permanent manifestations of God, which are logically dependent on Him but have not been created by Him. Thus, there are ultimately six types of substances: the prakṛti, time, God, individual souls, God’s knowledge, and God’s manifestations.

Figure 1

The final result is an ontology that shares several elements with the other Vedāntic schools, such as the embedment of the Sāṅkhya structure, whereas the genealogy of the ajaḍa (“non-inert,” that is, alive) part of the classification appears to be distinctive. It is opposed to the Vedāntic pariṇāmavāda “theory of the evolution [of the Absolute (brahman) into the world]” (according to which the brahman is the material cause of the world) and also to the māyāvāda “theory of the illusory [evolution of the brahman in the world]” (according to which only the brahman exists and everything else is just an illusion). Veṅkaṭanātha’s ontology, in this sense, is not monistic insofar as it has God as its pivotal point but not as its only component:

For, the other things rely only on the brahman, which is self-established (itareṣāṃ svaniṣṭhabrahmaikaniṣṭhatvāt, Nyāyapariśuddhi 1.1.2).

The ajaḍa part of the classification is, instead, directly connected to Yāmuna’s stress on dharmabhūtajñāna “knowledge as an inseparable characteristic [of God]” (see Mesquita 1971 and Neevel 1977) and to Veṅkaṭanātha’s dissociation of the personal aspect of God from any material ontology. Noteworthy in this connection is the fact that God’s vibhūtis “manifestations” are substances but devoid of any materiality. It is in this light that the concept of God’s body (see above) assumes its significance.

Furthermore, it is noteworthy that God’s essential relation to the world should not be understood as that of a substance and its qualities, since if it were so, the souls and so forth, as qualities of the Lord, could not be bearers of further qualities (since Veṅkaṭanātha shares the notion common to all classical Indian philosophers that there are no qualities of qualities). Instead, God is linked to the material world through the prakṛti and to the individual souls insofar as He is their inner ruler.

Qualities are of two kinds. Ordinary qualities (avasthā) are defined by Veṅkaṭanātha as āgantuko ’pṛthaksiddho dharmo ’vasthā “A quality is an accidental characteristic which cannot be established separately [of the substance in which it inheres]” (NSi 3 ad v. 77, Vīrarāghavācārya 1976, p. 357). In the same text he lists the ordinary qualities, starting with the three guṇas of the prakṛti (sattva, rajas, tamas) [see Prakṛti and the three guṇa-s], then the five sensibilia (rūpa, rasa, gandha, sparśa, śabda), saṃyoga “contact,” śakti “power,” and then gurutva “heaviness,” dravatva “fluidity,” snehatva “viscidity,” saṃskāra “(mnestic) trace,” saṅkhyā “number,” parimāṇa “measure,” pṛthaktva “distinction,” vibhāga “separation,” aparatva “distance,” paratva “proximity,” karman “action,” sāmānya “universal,” sādṛśya “similarity,” viśeṣa “individuality,” samavāya “inherence,” abhāva “absence,” and vaiśiṣṭya “qualified-ness” (NSi, beginning of 3, Vīrarāghavācārya 1976, p. 443). Noteworthy here is that out of these twenty-seven qualities, sixteen are shared with the Nyāya-Vaiśeṣika school (the five sensibilia, saṅkhyā, parimāṇa, pṛthaktva, paratva, aparatva, saṃyoga, vibhāga, gurutva, snehatva, dravatva, saṃskāra), and the three guṇas derive from the Sāṅkhya school. Further, four qualities offer a new way to systematize distinct categories of the Nyāya-Vaiśeṣika (karman, sāmānya, viśeṣa, samavāya). Two are of Mīmāṃsā origin (śakti and abhāva). Last, sādṛśya and, even more clearly, vaiśiṣṭya are most probably connected with specific theologemes of Viśiṣṭādvaita Vedānta, which indeed presents itself as upholding a qualified monism, that is, the possibility that a single God really exists exactly insofar as He is ultimately qualified by all His qualifications.

Apart from the accidental qualities, there are the svarūpanirūpakadharmas “qualities which define the own nature [of a given substance].” These allow Veṅkaṭanātha to explain that the six substances mentioned above (prakṛti, time, God, individual souls, God’s knowledge, and God’s manifestations) are nitya “permanent” although they can only be grasped through a quality (which would amount to a paradox if the qualities were only accidental and ephemeral).

Further discussions about ontologically relevant topics are not altogether absent in Veṅkaṭanātha’s texts. On the contrary, they are largely dealt with whenever the topic has some impact on his theology.

Veṅkaṭanātha was fiercely opposed to the Buddhist theory of momentariness (“Whatever exists is momentary,” compare Ratnakīrti, Kṣaṇabhaṅgasiddhi, third sentence), countering it with the Mīmāṃsā-developed argument from recognition (pratyabhijñā): we do recognize things, which means that they have not passed away (Nyāyasiddhāñjana ad 6, Vīrarāghavācārya 1976, pp. 16–37). The topic is of central theological significance because if everything were momentary, then there would be neither a lasting self nor a lasting God.

A topic at the centre of centuries of Nyāya-Mīmāṃsā controversies is the status of śabda “word.” Nyāya authors understand it as synonymous with the audible sounds composing the phones and thus declare it impermanent. Mīmāṃsā authors, by contrast, understand śabda as the linguistic unit capable of communicating a meaning, which is only manifested by phones but exists independently of them. In this sense, śabda is nitya “permanent.” Veṅkaṭanātha dealt with the issue in slightly modified terms according to whether he was speaking from a Nyāya or a Mīmāṃsā perspective. Still, even in the former case (Nyāyapariśuddhi, 1.1.3) he looked for a solution to the controversy by saying that the phones composing the Veda recur (āvṛt-) at each new creation of the world and that in this sense they are permanent. In the background is the Mīmāṃsā distinction between two types of permanence, kūṭastha- and pravāhanityatva. The first is unchanging permanence (for example, that of ether), whereas the second is permanence-in-flux (for example, that of the water of a stream, if this can be assumed to be permanent), the single elements of which are always renewed.

This same doctrine of recurrence is present also in Veṅkaṭanātha’s account of cosmology. According to him, God did not create the world ex nihilo. Rather, since the world is God’s body, it is necessarily co-eternal with Him. The idea of recurrent destructions (vilaya) and recreations is accommodated by saying that even in the state of destruction everything is present in subtle (sūkṣma) form, and neither the karman of the individuals nor the Veda nor any other aspect of the world is lost. Given that creation ex nihilo appears to be chiefly a Judaeo-Christian concept, Veṅkaṭanātha did not need to address the possible objection that a body co-eternal with God might be a limitation to His omnipotence (on Time and God see also Schmücker’s work).

5. State of the Art of Research on Veṅkaṭanātha

Among the volumes dedicated to Śrī Vaiṣṇavism, Raman (2007) focuses on the Vaṭakalai-Teṅkalai debate (see above, section 3) and dedicates several pages to Veṅkaṭanātha. Mumme (1988) focuses on the same debate from the point of view of its two champions, that is, Maṇavāḷamāmuni and Veṅkaṭanātha.

Among the works focusing on Veṅkaṭanātha, after the pioneering study of Tātāchārya (1911), Satyavrata Singh’s 1958 study is noteworthy. Notwithstanding its faults (for example, partisanship), this study is of particular significance since it discusses Veṅkaṭanātha’s syncretism of Nyāya and Vedānta and, perhaps even more importantly, analyzes in depth (pp. 106–136) the sources of Veṅkaṭanātha, especially within Vaiṣṇavism.

As part of the same generation of scholars, K.C. Varadachari (respectfully mentioned in Singh 1958, p. xvii) has been of fundamental importance to the studies on Śrīvaiṣṇavism. He edited, translated and studied several works by Veṅkaṭanātha (among other studies, one might mention Varadachari 1943 and Varadachari 1969). Varadachari (1940) focuses on Veṅkaṭanātha’s only commentary on an Upaniṣad, the Īśāvāsyopaniṣad. His work tries to combine respect for Veṅkaṭanātha’s thought with a philosophical insight that makes the arguments clear also for a contemporary reader. V. Varadachari further worked on the relationship between Pāñcarātras and Vaiṣṇavism and on Veṅkaṭanātha (see Varadachari 1982 and Varadachari 1983). A “friend” of K.C. Varadachari (as he calls himself in his preface to Varadachari 1943), P.N. Srinivasachari is the author of Srinivasachari (1946), which has set some of the fundamental parameters of interpretation for Viśiṣṭādvaita Vedānta (defined as “a philosophy of religion”). The text very often refers to Veṅkaṭanātha.

Born in 1918, S.M. Srinivasa Chari has written (apart from several volumes on South Indian Vaiṣṇavism) several insightful volumes, each on a distinct work of Veṅkaṭanātha, examined from a philosophical and theological perspective (Srinivasa Chari 1961; Srinivasa Chari 1988; Srinivasa Chari 2007; and Srinivasa Chari 2011). Unlike Carman, Hardy, and Hopkins (about whom see a few lines below), Srinivasa Chari programmatically focuses on philosophical topics (see Srinivasa Chari 1988, pp. ix-x):

The emphasis placed by Rāmānuja on the acceptance of saviśeṣa Brahman or the personal Supreme Being endowed with attributes […] has led some scholars to feel that Rāmānuja’s system is essentially theological. […] But the Viśiṣṭādvaita system has both a philosophical as well as theological aspect, and the former is of greater importance for the reason that it gives meaning and value to the latter. […] The main objective of this task [i.e., Srinivasa Chari 1988, E.F.] is to remove a prevalent impression that Viśiṣṭādvaita is primarily theology and establish that it is essentially a system of philosophy. It is a system which has been developed, apart from an appeal to scriptural authority, on the basis of well-formulated epistemological, ontological, cosmological and religious doctrines. (Srinivasa Chari 1988, pp. ix-x)

Srinivasa Chari also explains why key works of Viśiṣṭādvaita have hardly been studied or translated, even though the system is well known, widely studied, and of major significance in India:

The metaphysical doctrines, developed by the Viśiṣṭādvaitin on the basis of which the system is founded, cannot be understood easily unless one has made a deep study of ancient treatises in the original. Next to the Śrī-bhāṣya of Rāmānuja, there are two outstanding philosophical classics, Tattva-muktā-kalāpa and Śatadūṣaṇī, written by Veṅkaṭanātha. A study of these texts is an essential prerequisite for getting a deeper insight into Viśiṣṭādvaita tenets. But […] these are highly technical works written in terse Sanskrit and presented in the classical style replete with subtleties of dialectical arguments. The […] texts have therefore remained beyond the approach of ordinary scholars and modern students of philosophy. […] Even among the existing scholars, brought up strictly in the traditional disciplines of scholarship, there are very few who can claim to have studied them fully. (Srinivasa Chari 1988, pp. ix-x)

The same argument could be repeated in regard to the epistemology of Viśiṣṭādvaita Vedānta and the SM would play a major role in this case, possibly highlighting the fact that (against the first quote above) also the appeal to the authority of Sacred Texts can become part of one’s philosophical enterprise and be epistemologically well-grounded.

Three scholars working in close connection or linked as teacher and student, John Carman, Friedhelm Hardy, and Steven Paul Hopkins, have dedicated insightful and thought-provoking essays to Veṅkaṭanātha and to Viśiṣṭādvaita Vedānta. Most of these studies are characterized by an unusual attention to the contemporary Śrīvaiṣṇava tradition and try to be sensitive to an insider view of their subject (this attitude is particularly evident in Hardy 1977). Hardy and Hopkins focus on the artistic production of Veṅkaṭanātha and show how it is of intrinsic aesthetic value and of deep philosophical significance. They also show how the two aspects are deeply connected in Veṅkaṭanātha’s theology. Selected examples of the production of these scholars are Carman (1981), on Rāmānuja but explicitly relying on Veṅkaṭanātha’s commentaries; Hardy (1979), on Veṅkaṭanātha’s Dehalīśastuti; and Hopkins (2007), on Veṅkaṭanātha’s devotional songs.

Also dedicated to the philosophical works of Veṅkaṭanātha is Narayanan (2008), which focuses on the Nyāyapariśuddhi, one of Veṅkaṭanātha’s most important works dedicated to epistemology and with Nyāya and Mīmāṃsā authors as chief discussants. Narayanan (2008) closely follows the order of Veṅkaṭanātha’s work and is in this sense precious for closer studies of this text.

Toshihiro Mikami (Mikami n.y.) translated Nyāyasiddhāñjana, but unfortunately this scholar died before being able to write a complete study on it. Clooney (2000) analyzes a chapter of the same text, the one on rational theology (Īśvarapariccheda).

Furthermore, several studies have been dedicated to Veṅkaṭanātha in particular, such as the short and hagiographical Narasimhachary (2004) and Clooney (2008), on his theology of devotion. Other studies focus on Veṅkaṭanātha’s Tamil works and will not be analysed here (apart from Hopkins’ studies mentioned above, Colas 2002 is also noteworthy).

Last, the IKGA in Vienna has hosted and still hosts a group of scholars who started working intensively on Veṅkaṭanātha after having focused on Viśiṣṭādvaita Vedānta and on the Pāñcarātra tradition: Gerhard Oberhammer, Sylvia Stark, Marion Rastelli, and Marcus Schmücker. Particularly noteworthy in this connection is the series dedicated to the “Rāmānuja school” and edited by Oberhammer together with Rastelli and Schmücker, which constitutes a regular platform for scholars working on this topic. Gerhard Oberhammer’s list of works on this topic is impressive (1971, 1998, 2000, 2002, 2004, 2006, 2007, 2008), as is the zeal with which he created an international network of scholars working on Viśiṣṭādvaita Vedānta and enabled them to exchange views with scholars researching similar topics within parallel religious traditions (Oberhammer and Rastelli 2002; Oberhammer and Schmücker 2003; Oberhammer and Rastelli 2007; Oberhammer and Schmücker 2008). Schmücker (2009), dedicated to the epistemology of sense and super-sensuous perception (yogipratyakṣa), discusses Veṅkaṭanātha’s views on epistemology. Schmücker (2011) focuses on subjectivity in Veṅkaṭanātha and shows how Veṅkaṭanātha equates the ātman with the aham “I,” most probably (and this is not Schmücker’s conclusion) following Kumārila’s idea that one might prove the existence of an ātman through one’s cognition of an “I” (ahampratyaya) (on ahampratyaya in Kumārila, see Watson 2006, chapter 3 and more briefly Freschi 2014).

6. Abbreviations

ĀtS Ātmasiddhi by Yāmuna

ĀP Āgamaprāmāṇya by Yāmuna

BṬ Bṛhaṭṭīkā by Kumārila Bhaṭṭa

NK Nyāyakuliśa by Veṅkaṭanātha

NP Nyāyapariśuddhi by Veṅkaṭanātha

NSi Nyāyasiddhāñjana by Veṅkaṭanātha

PMS Pūrva Mīmāṃsā Sūtra

ŚrīBh Śrībhāṣya by Rāmānuja

ŚV Ślokavārttika by Kumārila Bhaṭṭa

SM Seśvaramīmāṃsā by Veṅkaṭanātha

7. References and Further Reading

Ayyaṅgār, Tirunārāyaṇapuram Kṛṣna (2008). Swamy Śrī Vedānta Deśikan (life span 1268 AD to 1369 AD, 101 years), as seen through his own writings, Stothrangal and Paduka Sahasram. Bangalore: D.K. Agencies.
Carman, John Braisted (1981). The theology of Ramanuja: an essay in interreligious understanding. 2nd (1st New Haven 1974). Ananthacharya Indological Research Institute series 9. Bombay: Ananthacharya Indological Research Institute.
Clooney, Francis Xavier (2000). “Vedānta Deśika’s “Definition of the Lord” (Īśvarapariccheda) and the Hindū Argument about Ultimate Reality”. In: Ultimate Realities. Ed. by Robert Neville. New York: SUNY, pp. 95–123.
Clooney, Francis Xavier (2008). Beyond Compare. St. Francis de Sales and Śrī Vedānta Deśika on Loving Surrender to God. Washington D.C.: Georgetown University Press.
Colas, Gérard (2002). “Variations sur la pâmoison dévote. A propos d’un poème de Vedāntadeśika et du théâtre des araiyar”. In: Images du corps dans le monde hindou. Ed. by Véronique Bouillier and Gilles Tarabout. Paris: CNRS Editions, pp. 275–314.
Colas, Gérard (2003). “History of Vaiṣṇava Traditions: An Esquisse”. In: The Blackwell Companion to Hinduism. Ed. by Gavin Flood. Oxford: Blackwell. Chap. 11, pp. 229–270.
Franco, Eli and Karin Preisendanz (1998). “Nyāya-Vaiśeṣika”. In: Routledge Encyclopedia of Philosophy. Ed. by Edward Craig. London: Routledge, pp. 57–67.
Freschi, Elisa (2014). “Does the subject have desires? The Ātman in Prābhākara Mīmāṃsā”. In: Puṣpikā: Tracing Ancient India Through Texts and Traditions. Contributions to Current Research in Indology. Number 2. Ed. by Giovanni Ciotti, Paolo Visigalli, and Alastair Gornall. Oxford: Oxbow Books Press, pp. 55–86.
Freschi, Elisa (2015a). “Between Theism and Atheism: a journey through Viśiṣṭādvaita Vedānta and Mīmāṃsā”. In: Puṣpikā: Tracing Ancient India Through Texts and Traditions. Contributions to Current Research in Indology. Number 3. Ed. by Robert Leach and Jessie Pons. Oxford: Oxbow Books Press, pp. 24–47.
Freschi, Elisa (2015b). “Systematizing an absent category: discourses on “nature” in Prābhākara Mīmāṃsā”. In: The Human Person and Nature in Classical and Modern India. Ed. by Raffaele Torella and Giorgio Milanetti. Supplemento della Rivista di Studi Orientali LXXXVIII, pp. 45–54.
Freschi, Elisa (2015c). “Free will in Viśiṣṭādvaita Vedānta: Rāmānuja, Sudarśana Sūri and Veṅkaṭanātha”. In: Religion Compass 9.9, pp. 287–296.
Freschi, Elisa (2016[a]). “Reusing, Adapting, Distorting. Veṅkaṭanātha’s reuse of Rāmānuja, Yāmuna and the Vṛttikāra in his commentary ad PMS 1.1.1”. In: Adaptive Reuse of Texts, Ideas and Images in Classical India. Ed. by Elisa Freschi and Philipp A. Maas. Abhandlungen für die Kunde des Morgenlandes. Wiesbaden: Harrassowitz. (a pre-print version is available on Academia.edu)
Freschi, Elisa (2016[b]). “Veṅkaṭanātha’s engagement with Buddhist opponents in the Buddhist texts he reused”. In: The reuse of texts in Buddhist literature. Ed. by Catherine Cantwell, Elisa Freschi, and Jowita Kramer. Buddhist Studies Review 33.1–2. (a pre-print version is available on Academia.edu)
Hardy, Friedhelm (1977). “Ideology and Cultural Contexts of the Śrīvaiṣṇava Temple”. In: The Indian Economic and Social History Review XIV.1, pp. 119–151.
Hardy, Friedhelm (1979). “The Philosopher as Poet — A Study of Vedāntadeśika’s Dehalīśastuti”. In: Journal of Indian Philosophy 7.3, pp. 277–325.
Hopkins, Steven Paul, ed. (2007). An Ornament for Jewels. Love Poems for the Lord of Gods by Vedāntadeśika. New York: Oxford University Press.
McCann, Erin (forthcoming). Maṇipravāḷa in Śrīvaiṣṇavism. PhD thesis. Montreal, Québec: McGill University, Faculty of Religious Studies.
Mesquita, Roque (1971). “Das Problem der Gotteserkenntnis bei Yāmunamuṇi”. PhD thesis. Wien: Universität Wien.
Mikami, Toshihiro (n.y.). “Nyāyasiddhāñjana of Vedānta Deśika. An annotated Translation”. PhD thesis. Tokyo: University of Tokyo. Available through the website of the University of Tokyo as pdf.
Mumme, Patricia Y. (1988). The Śrī Vaiṣṇava Theological Dispute: Maṇavāḷamāmuni and Vedānta Deśika. Madras: New Era Publications.
Narasimhachary, Mudumby (2004). Sri Vedanta Desika. New Delhi: Sahitya Akademi.
Narayanan, Vedavalli (2008). The Epistemology of Viśiṣṭādvaita. A Study Based on the Nyāyapariśuddhi of Vedānta Deśika. New Delhi: Munshiram Manoharlal.
Neevel, Walter G. Jr. (1977). Yāmuna’s Vedānta and Pāñcarātra: Integrating the Classical and the Popular. Harvard Dissertations in Religion. Missoula, Montana: Scholars Press, Harvard Theological Review.
Oberhammer, Gerhard (1997). Materialien zur Geschichte der Rāmānuja-Schule III. Yādavaprakāśa, der vergessene Lehrer Rāmānujas. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (1998). Materialien zur Geschichte der Rāmānuja-Schule IV. Der “Innere Lenker” (Antaryāmī), Geschichte eines Theologems. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (2000). Materialien zur Geschichte der Rāmānuja-Schule V. Zur Lehre von der ewigen vibhūti Gottes. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (2002). Materialien zur Geschichte der Rāmānuja-Schule VI. Die Lehre von der Göttin vor Veṅkaṭanātha. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (2004). Materialien zur Geschichte der Rāmānuja-Schule VII. Zur spirituellen Praxis des Zufluchtnehmens bei Gott (śaraṇāgatiḥ) vor Veṅkaṭanātha. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (2006). Materialien zur Geschichte der Rāmānuja-Schule VIII. Zur Eschatologie der Rāmānuja-Schule vor Veṅkaṭanātha. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard (2007). Ausgewählte kleine Schriften. Ed. by Utz Podzeit and Velizard Sadovski. Wien: Sammlung De Nobili.
Oberhammer, Gerhard (2008). Materialien zur Geschichte der Rāmānuja-Schule IX. Der Ātmā als Subjekt in der Theologie Rāmānujas. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard and Marion Rastelli, eds. (2002). Studies in Hinduism III. Pāñcarātra and Viśiṣṭādvaitavedānta. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard and Marion Rastelli, eds. (2007). Studies in Hinduism IV. On the Mutual Influences and Relationship of Viśiṣṭādvaita Vedānta and Pāñcarātra. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard and Marcus Schmücker, eds. (2003). Mythisierung der Transzendenz als Entwurf ihrer Erfahrung. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Oberhammer, Gerhard and Marcus Schmücker, eds. (2008). Glaubensgewissheit und Wahrheit in religiöser Tradition. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Raman, Srilata (2007). Self-Surrender (prapatti) to God in Śrīvaiṣṇavism: Tamil cats and Sanskrit monkeys. RoutledgeCurzon Hindū Studies Series. London: Routledge.
Ramanujachari, R. and K. Srinivasacharya, eds. (1938). Nyāyakuliśa or The Lightning-Shaft of Reason by Ātreya Rāmānuja. Edited with Introduction and Notes. Annamalai University Philosophy Series 1.
Schmücker, Marcus (2007). “The vyūha as the ‘State of the Lord’ (bhagavadavasthā). Vedāntic Interpretation of Pāñcarātra Doctrines according to Veṅkaṭanātha”. In: Studies in Hinduism IV. On the Mutual Influences and Relationship of Viśiṣṭādvaita Vedānta and Pāñcarātra. Ed. by Gerhard Oberhammer and Marion Rastelli. Wien: Verlag der Österreichischen Akademie der Wissenschaften, pp. 89–106.
Schmücker, Marcus (2009). “Yogic Perception According to the Later Tradition of the Viśiṣṭādvaita Vedānta”. In: Yogic Perception, Meditation and Altered States of Consciousness. Ed. by Eli Franco and Dagmar Eigner (in collaboration with). Wien: Verlag der Österreichischen Akademie der Wissenschaften, pp. 283–298.
Schmücker, Marcus (2011). “Zur Bedeutung des Wortes Ich (aham) bei Veṅkaṭanātha”. In: Die Relationalität des Subjektes im Kontext der Religionshermeneutik. Arbeitsdokumentation eines Symposiums. Ed. by Gerhard Oberhammer and Marcus Schmücker. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Schmücker, Marcus (forthcoming). Die Bedeutung der Zeit (kāla) als Substanz (dravya) in der Gotteslehre Veṅkaṭanāthas. Habilitation thesis. Wien: Universität Wien.
Schreiner, Peter (1991). “Review of: Fundamentals of Visistadvaita Vedanta: A Study Based on Vedanta Desika’s Tattva-mukta-kalapa”. In: Bulletin of the School of Oriental and African Studies, University of London 54.2, p. 391.
Seshadri Acharya, V. N (1993). Sarvārtha siddhi of Śrī Vedāntadeśika: a study. Madras: Sri Visishtadvaita Research Centre.
Singh, Satyavrata (1958). Vedānta Deśika: His Life, Works, and Philosophy. Chowkhamba Sanskrit Series. Varanasi: Chowkhamba.
Srinivasa Chari, S.M. (1961). Advaita and Viśiṣṭādvaita; a study based on Vedānta Deśika’s Śatadūṣaṇī. London and Madras: Asia Publishing House.
Srinivasa Chari, S.M. (1988). Fundamentals of Viśiṣṭādvaita Vedānta: a study based on Vedānta Deśika’s Tattva-muktā-kalāpa. Delhi: Motilal Banarsidass.
Srinivasa Chari, S.M. (2007). The philosophy of Viśiṣṭādvaita Vedānta: a study based on Vedānta Deśika’s Adhikaraṇa-sārāvalī. Delhi: Motilal Banarsidass.
Srinivasa Chari, S.M. (2011). Indian philosophical systems: a critical review based on Vedānta Deśika’s Paramata-bhaṅga. New Delhi: Munshiram Manoharlal.
Srinivasachari, P.N. (1946). The Philosophy of Viśiṣṭādvaita. Adyar: Adyar Library.
Tātāchārya, M.K. (1911). Vedānta Deśika, his life and literary writings. Madras: T. S. Ramaswami Aiyangar.
Trikha, Himal (1999). “Nārāyaṇārya’s Vidhisvarūpanirṇayaḥ. Das achte Kapitel in der Nītimālā — Ein Beitrag zur personalen Konzeption der Vorschrift im Viśiṣṭādvaita-vedānta”. PhD thesis. Wien: Universität Wien.
Varadachari, K. C. (1940). A Clue into the nature of the relationship into the mystical and religious consciousness as seen in the interpretation of the Isavasyopanisad by Sri Vedanta Desika. [S. l. ?].
Varadachari, Karumbur Chakravarthy (1943). Śrī Rāmānuja’s theory of knowledge, a study. Tirupati: Tirumalai-Tirupati Devasthanams Press.
Varadachari, Karumbur Chakravarthy (1969). Viśiṣṭādvaita and its development. Tirupati: Chakravarthy Publications.
Varadachari, V. (1962). “Antiquity of the term Viśiṣṭādvaita”. In: The Adyar Library Bulletin XXVI, pp. 177–181.
Varadachari, V. (1982). Āgamas and South Indian Vaiṣṇavism. Triplicane, Madras: M. Rangacharya Memorial Trust.
Varadachari, V. (1983). Two great acharyas, Vedanta Desika and Manavala Mamuni. Madras: Prof. M. Rangacharya Memorial Trust.
Vīrarāghavācārya, T., ed. (1976). Nyāya Siddhāñjana by Vedānta Deśika with two old commentaries [Saralaviśadavyākhyā by Śrīraṅgarāmānujasvāmi and Ratnapeṭikā by Śrīkāñcī Kṛṣṇatātayārya, including a Ṭippaṇa by the editor]. Ubhayavedāntagranthamālā. [Madras].
Watson, Alex (2006). The Self’s Awareness of Itself: Bhaṭṭa Rāmakaṇṭha’s Arguments against the Buddhist Doctrine of No-Self. Wien: De Nobili.

Author Information

Elisa Freschi
Email: elisa.freschi@gmail.com
Austrian Academy of Sciences
Austria

Ancient Aesthetics

It could be argued that ‘ancient aesthetics’ is an anachronistic term, since aesthetics as a discipline originated in 18th-century Germany. Nevertheless, there is considerable evidence that ancient Greek and Roman philosophers discussed and theorised about the nature and value of aesthetic properties. They also undoubtedly contributed to the development of the later tradition because many classical theories were inspired by ancient thought; and, therefore, ancient philosophers’ contributions to the discussions on art and beauty are part of the traditions of aesthetics.

The ancient Greek philosophical tradition starts with the pre-Socratic philosophers. In most cases, there is little evidence of their engagement with art and beauty, with the one notable exception of the Pythagoreans. In the Classical period, two prominent philosophers, Plato and Aristotle, emerged. They represent an important stage in the history of aesthetics. The problems they raised and the concepts they introduced are well known and discussed even today.

The three major philosophical schools in the Hellenistic period (the Epicureans, the Stoics and the Sceptics) inherited a certain philosophical agenda from Plato and Aristotle while at the same time presenting counterarguments and developing distinct stances. Their contributions to aesthetics are not as famous and, in some cases, are significantly smaller than those of their predecessors, yet in certain respects, they are just as important. In late antiquity, the emergence of Neoplatonism marks another prominent point in the aesthetic tradition. Neoplatonists were self-proclaimed followers of Plato, yet starting with the founder of the school, Plotinus, Neoplatonists advocated many distinctly original views, some of them in aesthetics, that proved to be enduringly influential.

The history of ancient Greek aesthetics covers centuries, and during this time numerous nuanced arguments and positions were developed. In terms of theories of beauty, however, it is possible to classify the theories into three distinct groups: those that attribute the origin of beauty to proportion, those that attribute it to functionality, and those that identify the Form as the cause of beauty. This classification ought not to be understood as a hard-and-fast distinction among philosophical schools, but as a way of pinpointing some major theoretical trends. Oftentimes, philosophers use a combination of these positions, and many original innovations are due to the convergence and interaction among them.

Ancient philosophers were also the authors of some of the more notable concepts in the philosophy of art. The notions of catharsis, sublimity and mimesis originated in antiquity and have played a role in aesthetics ever since then.

1. Ancient Aesthetics: Methodological Issues
  a. Aesthetics in Antiquity
  b. To Kalon
2. Three Types of Theories about the Origin of Beauty
  a. Proportion
  b. Functionality
  c. Form
    i. Plato
    ii. Plotinus
3. Philosophy of Art
  a. Mimesis
    i. Plato
    ii. Aristotle
  b. Criticism of Arts
    i. Plato
    ii. Epicureans
  c. Catharsis
  d. Sublime
4. References and Further Reading
  a. Primary Sources
  b. Secondary Sources

1. Ancient Aesthetics: Methodological Issues

a. Aesthetics in Antiquity

One of the most important foundational issues about ancient aesthetics is the question of whether the very concept of ‘ancient aesthetics’ is possible. It is generally considered that aesthetics as a discipline emerged in the 18th century. To speak of ancient Greek and Roman aesthetics, therefore, would be an anachronism. Furthermore, there are certain differences between ancient and modern approaches to the philosophical study of beauty and art that make them distinct projects. These differences were outlined and discussed by Paul Oskar Kristeller, an influential critic of ancient aesthetics, who suggested that the ancients’ interest in moral, religious, and practical aspects of works of art, combined with their failure to group the fine arts into a single category and to present philosophical interpretations on that basis, means that aesthetics was not a philosophical discipline in antiquity (Kristeller 1951: 506).

Kristeller’s critique is still often quoted and discussed in works that deal with the ancients’ ideas on arts and beauty. The question of how compatible ancient and modern methodologies are remains a relevant issue. At the same time, Kristeller’s view has been challenged by a number of compelling arguments in 20th- and early-21st-century scholarship.

The arguments raised against Kristeller’s interpretation of the aesthetic tradition also pinpoint some of the central concepts that ancient philosophers used. Stephen Halliwell criticised Kristeller’s argument by pointing out that, first, the notion of mimesis was a much more unified concept of art than Kristeller allows (see below for a more detailed explanation of mimesis). Second, the 18th-century category of fine art, established in such works as Batteux’s Les beaux arts réduits à un même principe (1746), relied on the mimetic tradition, although later the focus shifted towards different conceptions of art (Halliwell 2002: 7–8). Peponi later challenged Kristeller’s claims by pointing out that ancient Greek thinkers grouped activities we call fine arts and, moreover, were interested in the effects produced by the beautiful properties of, for instance, poetry (Peponi 2012: 2–6).

James Porter has also criticised Kristeller’s premises and conclusions on three different grounds: Kristeller’s historical account is not the only one possible; “the modern system of arts” is not as clear-cut a category as Kristeller makes it out to be; and it does not follow that the existence of the concept of fine arts indicates the emergence of aesthetic theory (Porter 2009). In addition to this, it has been argued that the ideas of Plato and Aristotle are not only relevant to the preoccupations of modern philosophers but also address the foundational questions of aesthetics and philosophy of art (Halliwell 1991).

b. To Kalon

Another methodological issue concerning ancient aesthetics is a linguistic one, namely the translation and conceptualisation of the term to kalon (honestum in Latin) whose meaning contains some ambiguity. The issue at stake is the question of when this term can and cannot be read and translated as an aesthetic one. The Greek language has a rich vocabulary of terms that are uncontroversially aesthetic, but to kalon, a fairly popular term in philosophical texts, has a range of meanings from ‘beauty’ to ‘being appropriate.’ The problem arises especially in ethical discussions, when the context does not make it clear whether the usage of the term to kalon ought to be understood as aesthetic or not.

It has been customary to translate to kalon in ethical contexts as ‘fine’ or something similar. Early 21st-century thinkers have argued, however, that to kalon and similar Greek and Latin terms (to prepon in Greek; honestum and decorum in Latin) ought to be read as aesthetic concepts. The translations that ignore the aesthetic aspect of these terms may not capture their meaning accurately (Bychkov 2010: 176). Or, more specifically, the use of to kalon in Aristotle’s works often has aesthetic meaning and, therefore, can be translated as ‘beautiful’ (Kraut 2013). At the same time, some studies of Aristotle’s use of to kalon have argued that the conceptualisation and translation of the term depend on the context in which it is found. In the context of ethical discussions, more neutral or ethical translations ought to be preferred over aesthetic ones (Irwin 2010: 389–396).

2. Three Types of Theories about the Origin of Beauty

a. Proportion

The idea that beauty in any given object originates from the proportion of the parts of that object is one of the most straightforward ways of accounting for beauty. The most standard term for denoting this theory is summetria, meaning not bilateral symmetry, but good, appropriate or fitting proportionality.

The idea that beauty derives from summetria is usually attributed to the sculptor Polycleitus (5th c. B.C.E.), who wrote a treatise entitled Canon containing a discussion of the exact proportions that generate beauty and then made a statue, also entitled Canon, exemplifying his theory. Little is known of Polycleitus’ work and ideas, but when the famous Roman architect Vitruvius used this notion in his De Architectura, he explained it in terms of specific numerical ratios. For instance, in the human body, the distance from the chin to the crown of the head is an eighth part of the whole height; the length of the foot is a sixth part of the height of the body, while the forearm is a fourth part. Then Vitruvius adds that ancient painters and sculptors achieved their renown by following these principles (Book 3.1.2). It is likely that Polycleitus’ treatise had similar contents, such as a discussion of specific ratios that produce beauty in a human body, and was therefore useful for making sculptures of idealised human forms.

i. Pythagoreans

Equally, if not more, significant for the philosophical tradition are Pythagorean ideas about the fundamentality of numbers. Of course, Pythagoreanism was far from a unified school of thought; diverse philosophers were given that name during antiquity. The Pythagoreans referred to here are the philosophers active during the 5th and 4th centuries B.C.E., such as Philolaus and Archytas.

Numbers, according to this strand of Pythagoreanism, underlie the basic ontological and epistemological structure of the world and, as a result, everything in the world can be explained in terms of numbers and the relationship between them, namely, proportion. Beauty is one of the properties that the Pythagorean philosophers used to support their doctrine, because they claimed its presence can be fully explained in terms of numbers or, to be more precise, the proportion and harmony that is expressed in numerical relationships.

Sextus Empiricus recorded the Pythagorean argument that sculpture and painting achieve their ends by means of numbers, and thus art cannot exist without proportion and number. Art, the argument continues, is a system of perceptions and the system is reducible to a number (Sextus Empiricus Against the Logicians Book 1.108–9).

The Pythagoreans had a well-known interest in music. The evidence on this topic is wide-ranging: from the reputation of Pythagoras as the first to pinpoint the mathematics underlying the Greek musical scale to Socrates’ remark in the Republic attributing to Pythagoras the claim that music and astronomy were sister sciences (Rep. 530D). Music is also said to have a positive influence on a person’s soul. According to a testimonial from Aristoxenus, music had an effect on a person’s soul comparable to the effect that medicine has on a person’s body (Diels, II. 283, 44). Arguably this role was attributed to music due to its being an expression of the harmonizing influence of numbers.

ii. Plato and Aristotle

Although generally speaking, Plato is best classified as a Form Theorist, a small number of passages in the Platonic corpus suggest a viewpoint derived from summetria, that is, a good proportion or ratio of parts.

In the Timaeus, lacking summetria is associated with lacking beauty (87D). Similarly, both in the Republic and the Sophist, beauty is said to derive from arrangements (Rep. 529D–530B and Sph. 235D–236A respectively). Plato’s use of summetria raises the question of how this theory was supposed to function alongside the idea that beauty derives from the form of beauty. Most likely, however, there was no contradiction for Plato. Summetria is one of the properties that beautiful things have, rather than the cause of beauty, which is its form. Summetria, as well as such properties as colour and shape, is one of the aspects that an object gains by partaking in the form.

The case is similar in the Aristotelian corpus. Aristotle named summetria one of the chief forms of beauty, alongside order and definiteness (M 3.1078a30–b6). The context for this definition is the refutation of the view put forth by the sophist Aristippus who argued that mathematics has nothing to say about the good and the beautiful (M 3.996a). Since the causes of aesthetic properties are describable in mathematical terms, mathematics does, in fact, have something to say about these things. Similarly, in Physics, bodily beauty (kallos) is named as one of the excellences that depend on particular relations (Ph. 246b3–246b19), and in Topics, it is said to be a kind of summetria of limbs (Topics 116b21). The beautiful (to kalon) is also identified with being well arranged in On the Universe (397a6).

At the same time, Aristotle did not think that summetria was a sufficient condition for beauty. He claimed that size was also necessary for beauty. In Nicomachean Ethics 4.3, beauty is said to imply a good-sized body, so that little people might be well-proportioned, but not beautiful. The city as well is required to be of a certain size before it can be called beautiful (Politics 7.4).

iii. The Stoics

Summetria assumed a much more significant role in Stoicism. The Stoics defined beauty as originating from the summetria of parts with each other and with the whole. Galen (On the Doctrines of Hippocrates and Plato 5.3.17) attributes this definition to Chrysippus, the third head of the school, but all other testimonials describe it simply as the Stoic definition. This definition is meant to apply to both the beauty of the body and the beauty of the soul (Arius Didymus Epitome of Stoic Ethics 5b4–5b5 (Pomeroy); Stobaeus Ecl. 2.62, 15). Some sources suggest that there are additional conditions: for the former, colour, and for the latter, the stability or consistency of beliefs (Plotinus Ennead 1.6.1; Cicero Tusculan Disputations 4.13.30). In many respects, the Stoics inherit this understanding of beauty from their predecessors, but it is worth noting that they also often invoked the notion of functional beauty. Stoic aesthetics, therefore, was likely a combination of functional and proportion theories.

b. Functionality

The theory of functional beauty is the idea that beauty originates in an object when that object performs its functions, achieves its end or fits its purpose, especially when it is done particularly well, that is, excelling at the task of achieving that end. In an ancient philosophical context, this idea is also often associated with the notion of dependent beauty, which means an object is beautiful if it excels at functioning as the kind of object it is. It is also noteworthy that the Greek term to kalon, often—but not always—used as an aesthetic term, can be used to denote being fitting or well-executed. The functionalist theory of beauty might have been more linguistically intuitive to ancient Greeks than it is possible to convey in English.

i. Xenophon

It is hard to attribute this theory to one particular philosopher, since functionalist arguments are fairly common in ancient philosophical texts. An example of the functional theory can be found in Xenophon’s Memorabilia. Socrates first makes a point about dependent beauty by saying “a beautiful wrestler is unlike a beautiful runner, a shield beautiful for defence is utterly unlike a javelin beautiful for swift and powerful hurling” (3.8.4). Then he further develops this point by adding that “it is in relation to the same things that men’s bodies look beautiful and good and that all other things men use are thought beautiful and good, namely, in relation to those things for which they are useful” (3.8.5).

It is not obvious that the term to kalon employed here is used in an aesthetic sense, but a few lines down, it is said that “the house in which the owner can find a pleasant retreat at all seasons and can store his belongings safely is presumably at once the pleasantest and the most beautiful. As for paintings and decorations, they rob one of more delights than they give” (3.8.10). This remark highlights that the issue at stake is aesthetic phenomena, and that a much greater pleasure is to be gained from perceiving functionality rather than perceiving pleasing, yet artificial, colours (paintings) and structures.

ii. Hippias Major

A functional definition of beauty is also found in Plato’s dialogue Hippias Major. In this dialogue, Socrates engages in a discussion with Hippias, a sophist, in order to discover the definition of beauty. They each give a number of possible options, and one of them, proposed by Socrates, is a functional definition.

It is argued that stone is more beautiful than ivory as material for the eye pupils in Phidias’ statue and that a fig wood ladle is much better suited to making soup, and therefore more beautiful, than a gold one. Socrates proposes these two cases as objections to Hippias’ proposal that beauty is gold. By presenting two cases in which the beauty-making property is not some inherent property of an object, but that object’s functionality, Socrates rejects Hippias’ suggestion. This move also leads to examining the possibility that all beauty is to be defined as deriving from functionality, but this option is ultimately rejected as well on the grounds that it appears to rely on a kind of deception, because it prioritizes how things appear over how things truly are (290D–294E).

iii. Aristotle

In Aristotle’s work, there are many instances of excellence in functionality described by the term to kalon. In fact, Aristotle states outright that fitting a function and to kalon are the same (Top. 135a12–14). Since this term can be used both aesthetically and non-aesthetically, it is a matter of contention whether in some specific cases the reference for this term is meant to be an aesthetic phenomenon or not.

If to kalon is read aesthetically, some of the most pertinent passages for the functionalist understanding of aesthetic properties would come from Aristotle’s descriptions of natural phenomena. For instance, according to Generation of Animals, the generation of bees reveals a kalon arrangement of nature; the generations succeed one another even though drones do not reproduce (760a30–b3). In Nicomachean Ethics, Aristotle states that dogs do not enjoy the scent of rabbits as such, but the prospect of eating them; similarly, the lion appears to delight in the lowing of an ox, but only because it perceives a sign of potential food (1118a18–23).

iv. The Stoics

A certain kind of functionality and aesthetic language also appear in certain Stoic arguments, most notably in the works of Panaetius who used the term to prepon (‘fitting’, ‘becoming’, ‘appropriate’ in English) in certain ethical arguments. Probably the most elaborate discussion of to prepon (or decorum in Latin) is recorded in Cicero’s On Duties, which represents Panaetius’ views.

Here, an analogy between poetry and human behaviour is drawn as follows. The poets “observe propriety, when every word or action is in accord with each individual character.” The poets depict each character in a way which is appropriate regardless of the moral value of the character’s actions, so that a poet would be applauded even when he skilfully depicts an immoral person saying immoral things. To human beings, meanwhile, nature also assigned a kind of role, namely that of manifesting virtues like steadfastness, temperance, self-control, and so forth. This claim reflects one of the essential tenets of Stoic ethics, eudaimonia, which is living in accordance with nature and pursuing virtue (Diogenes Laertius 7.87–9; Long and Sedley 63C). Human beings, therefore, are functional entities as well in the sense that they have a certain function and end. The idea that achieving that end produces beauty is made clear when it is said that just as physical beauty consisting of the harmonious proportion of limbs delights the eye, so too does to prepon in behaviour earn the approval of fellow humans through the order, consistency and self-control imposed on speech and acts (1.97–98).

c. Form

i. Plato

Plato’s best-known doctrine, the theory of forms, bears on his aesthetics in a number of ways. The theory posits that incorporeal, unchanging, ideal paradigms, the forms, are universals and play an important causal role in the generation of the world. Arguably the most important way in which the theory of forms bears on aesthetics is its account of the origin of aesthetic properties. Beauty, just like many other properties, is generated by its respective form. An object becomes beautiful by partaking in the form of Beauty. The form of Beauty is mentioned as the cause of beauty throughout the Platonic corpus; see, for instance, Cratylus 439C–440B; Phaedrus 254B; Phaedo 65D–66A and 100B–E; Parmenides 130B; Republic 476B–C, 493E, 507B. In this respect, the form of Beauty is just like all the other forms. Plato does, however, say that the form of Beauty has a special connection with the form of Good, even if they are not, ultimately, identical (Hippias Major 296D–297D).

The form of Beauty is shown as having a pedagogical aspect in the Symposium. In Diotima’s speech, the acquisition of knowledge (that is, the knowledge of the forms) is represented as the so-called Ladder of Love. A lover is said first to fall in love with an individual body, then to notice that there are commonalities among all beautiful bodies and thus to become an admirer of human form in general. Then the lover starts appreciating the beauty of the mind, followed by the beauty of institutions and laws. The love of sciences is the next step on the ladder, after which the lover perceives the form of Beauty. The form is said to be everlasting, not increasing or diminishing, not beautiful at one point and ugly at another, not beautiful only in relation to any specific condition, not in the shape of any specific thing, such as a limb, a piece of knowledge or an animal. Instead, it is absolute, everlasting, unchanging beauty itself (210A–211D).

ii. Plotinus

Plotinus, a self-proclaimed follower of Plato, was also committed to the view that beauty originated from the form of Beauty, adding some further elaborations of his own. Plotinus presents this account as a rival to the summetria theory. His treatise On Beauty (Ennead 1.6) starts with an elaborate critique of the rival theory. Plotinus claims that accounting for beauty by means of summetria has a number of drawbacks. For instance, it cannot explain the beauty in unified objects that do not have parts, such as a piece of gold.

According to Plotinus’ own theory, an object becomes beautiful by virtue of its participating in the form. He also adds that the Intellect (nous) is the cause of beauty. To be precise, it is the Intellect that imposes the forms onto passive matter thus producing beauty. Those entities that do not participate in the form, and thus reason, are ugly (1.6.2). The form is therefore capable of producing beauty by virtue of its being an instrument of the Intellect that creates order and structure out of chaotic matter in the universe, and beauty is an expression of its designing powers.

Apart from being expressions of the Intellect, forms have another aspect that makes them the cause of beauty; namely, they unify disarrayed and chaotic elements into harmony. When form approaches formless matter, it introduces a certain intrinsic agreement, so that many parts are brought into unity and harmony with each other. The form has intrinsic unity and is one, and therefore, it turns the matter it shapes into one as well, as far as it is possible. The unity produces beauty which ‘communicates itself’ to both the parts and the whole (1.6.2).

Plotinian metaphysics and aesthetics converge in the analogy between Intellect shaping the universe and a sculptor shaping a piece of stone into a statue. At the beginning of his Ennead 5.8 (On the Intelligible Beauty), Plotinus asks his readers to envision two pieces of stone placed next to each other, one plain and another one sculpted into the shape of an especially beautiful human or some god. Then he argues that the latter will appear immediately beautiful, not because of the material it is made out of but because it possesses the form. The beauty is caused by the intellect of the sculptor, which transmits the form onto the stone. The visible form that the sculptor imposes onto the stone is an inferior version of the actual form that can only be contemplated. The actual forms are purely intellectual, ‘seen’ with the mind’s eye. The intellectual beauty of reason, argues Plotinus, is a much greater and also truer beauty (En. 5.8.1).

Plotinus follows Plato in arguing that visible beauty is inferior as it is only a copy of the true beauty of forms. There is, however, a significant difference between them in terms of their attitude towards the value of artistic beauty. Plotinus warns against devaluing artistic activities and, in an argument very much unlike those found in Plato, states that (i) nature itself imitates some things; (ii) the arts do not simply imitate what is seen by the eye but refer back to the principles of nature; (iii) the arts produce many things not by means of copying, but from themselves, adding what is lacking in order to create a perfect whole, because the arts contain beauty themselves; and (iv) Phidias (one of the most famous Greek sculptors) designed a statue of Zeus not by imitation, but by conceiving a form that a god would take if he were willing to show himself to humans (5.8.1).

3. Philosophy of Art

a. Mimesis

In older scholarship, it is common to find the claim that the Greek term for art was techne, and as this is a much narrower term than the contemporary concept of fine art, it is claimed that ancient Greeks did not have a concept of fine art. This interpretation, however, has been challenged. It has been argued that, if there were a concept of fine art in Greek thought, it would be mimesis. In the most literal meaning of the term, mimesis refers to imitation in a very broad sense, including such acts as following an example of someone’s behaviour or adopting a certain custom. This word is widely used when discussing art and artistic activities, and it can be roughly defined as an imitative representation, where ‘representation’ is understood as involving not just copy-making, but also creative interpretation. Aristotle grouped poetry with “the other mimetic arts” (8.1451a30) in the Poetics, in a remark that suggests the conceptualisation of a distinct group of artistic activities resembling the notion of fine arts. A similar grouping of “imitators” (mimetai), including poets, rhapsodists, actors, and chorus-dancers can be found in Plato’s Republic as well (2.373B).

i. Plato

Books 2 and 3 of Plato’s Republic contain an extensive analysis of mimesis in the context of the education of the guardian class in the ideal city-state. In Book 2, Socrates starts developing his account of the ideal city-state. The class of guardians plays an especially important role in its maintenance, and therefore, the question of how the guardians ought to be educated is raised. Apart from physical education, education based on storytelling is quite important, as it starts early in childhood and precedes physical education (2.376E).

First, Socrates and the interlocutors agree to ban from the guardians’ education and the ideal city-state more generally certain stories based on their content, particularly stories depicting the gods committing evil deeds (2.377D–E). At the start of Book 3, there is a longer list of the kind of stories that are undesirable in the ideal city, including ones with negative portrayals of the afterlife, lamentations, gods committing unseemly acts and portrayals of bad people as happy (386A–392C).

Then there follows a discussion of the style (Gr. lexis) of narration. Socrates distinguishes direct speech, when a poet speaks in his own voice, from imitative speech, when a poet imitates the speech of the characters in the story, and suggests that if a poem is written in the former style, it contains no mimesis (3.393D). Poetry can be of three kinds: dithyrambs (in the poet’s own voice, no mimesis), tragedy and comedy (pure mimesis) and epic poetry (a combination of the two) (3.394C).

The discussion turns towards the question of whether mimetic poets ought to be allowed into the city-state and whether guardians themselves could be mimetai. The answer to this question turns out to be negative. The main argument against mimesis in the ideal city goes as follows. The guardians preserve the well-being of the city, and thus the only things they ought to imitate are the properties of virtue, not shameful or slavish acts. The reason for this is that enjoying the imitation of these things might lead them to actually pursuing them, as imitation is habit-forming (3.395D–E). It is ultimately concluded that only a pure imitator of a good person ought to be allowed into the city-state (3.397D–398B).

ii. Aristotle

Aristotle argues that poetry originates from two causes. Both of these causes are grounded in human nature, particularly the natural proneness of human beings to mimesis. Mimesis is said to be (i) the natural method of learning from childhood and (ii) a source of delight for human beings.

In order to support the latter point, Aristotle notes that although such objects as dead bodies and low animals might be painful to see in real life, we delight in artistic depictions of them, and the reason for this is the pleasure humans derive from learning. People delight in seeing a picture, either because they recognise the person depicted and ‘gather the meaning of things’ or—if they do not recognise the subject—they admire the execution, colour, and so on. (The distinction between mimesis and colour/composition is reiterated in Politics, where colour and figures are said to be not imitations but signs with little connection to morality, and therefore, young men ought to be taught to look at those paintings which depict character (1340a32–39).) This principle applies not only to visual arts. The natural inclination to mimesis combined with the sense of harmony and rhythm is the reason why humans are drawn to poetry as well (Poetics 1448b5–1448b24).

Aristotle’s conceptual analysis of poetry contains a revealing discussion of the differences between poetry and history. Poets differ from historians by virtue of describing not what happened, but what might happen, either because it is probable or necessary. They do not, however, differ because one is set in prose and another one in verse, as the works of Herodotus could be set in verse and remain history. The fundamental difference between history and poetry lies in the fact that the former is concerned with statements about particulars, whereas the latter is concerned with universal statements. Some tragedies do use historical characters, but this, according to Aristotle, is because “what is possible is credible,” which presumably means that plots involving historical characters are more moving because they might have actually happened. Another notable conclusion is that the poet is a poet because of the plot rather than the verse, as the defining characteristic of such activity is the imitation of action (1451a37–b31).

b. Criticism of Arts

i. Plato

Unlike Aristotle, Plato saw potential dangers associated with mimetic activities. In Republic 5, “lovers of beautiful sights and sounds,” people addicted to music, drama and so on, are contrasted with true philosophers. The lovers of sights and sounds pursue only opinions, whereas philosophers are the pursuers of knowledge and, ultimately, beauty in itself (5.475D–480A).

But perhaps the best-known argument criticising art comes from Book 10 of the Republic. Here, the products of artistic activities are criticised for being twice removed from what is actually the case. Socrates uses the example of a symposium couch to argue that the painting of a couch is just a copy of reality, the actual couch. Yet the actual couch made by the craftsman is also just a copy of the true reality, the forms. The painters, according to this argument, portray only a small portion of what is actually the case. For the most part, they are concerned with appearances. There are, thus, three kinds of couches: one produced by god, another one produced by a carpenter and the third one by a painter. God and the craftsman are called makers or producers of their kinds of couches, but the painter is only an imitator, a producer of a product that is thrice removed from nature. This category is also said to include tragedians and all the other imitators (596D–597E).

These and other passages have earned Plato a reputation for being hostile to art. Plato’s theory of art, however, is much more complex, and criticism is only one aspect of his treatment of artistic mimesis. An example of a more constructive understanding of artistic imitation can be found in the same work where he famously criticises it, the Republic.

For instance, Socrates suggests that there is an analogy between the ideal political state they are discussing and an idealised portrait, arguing that no one would think the latter is flawed because the painter cannot produce an ideal person in reality and, therefore, there is no need to worry that their ideal state does not actually exist (5.472D–E). Socrates’ remark indicates that there is much more to painting than the copying of appearances. Ideas like these can be found throughout the Republic (see also 6.500E–501C; 3.400E–401A).

In fact, after banishing poetry from the ideal city earlier, Socrates praises Homer, who is said to be the best of the tragedians, and a concession is made for hymns to the gods and eulogies to good people. Socrates also adds that even imitative poetry could be welcomed in the city, provided there is an argument showing it ought to belong to such well-governed places (10.606E–607C). The ancient quarrel between poets and philosophers, as Plato called it, was neither unambiguous nor a settled matter.

ii. Epicureans

The Epicureans, members of the Hellenistic philosophical school notorious for its atomist physics and hedonist ethics, were also critics of poetry. The Epicurean ethical views, especially the claim that death is not an evil, played a major role in shaping their perspective on poetry. The extant works of the founder of the school, Epicurus, show him criticising muthos, stories told by poets. Epicurus was concerned with the dangerous influence that these stories could have on those who hear them. The stories of poets are based on beliefs that produce the feeling of anxiety in listeners (for instance, the belief that life is full of pain and it is best not to be born at all). The opposite of these are beliefs gained by studying nature and engaging in philosophical investigations. Such studies lead to the discovery that the greatest pleasure in life is ataraxia (the state of tranquillity) and abolishing the fear of pain and death (Letter to Menoeceus 126–7; Principal Doctrines 12). Epicurus also notoriously argued against receiving the traditional education (paideia) that includes an education in poetry (Letter to Pythocles 10.6; Plutarch 1087A).

It is noteworthy, however, that Epicurus was not unequivocally opposed to poetry and arts. Some evidence suggests that he maintained that only an Epicurean would discuss music and poetry in the right way, although the Epicureans would not take up writing poetry themselves (Diogenes Laertius, Lives 10.120; Plutarch 1095C). It appears that for Epicurus, like Plato, arts were problematic because of their power to impart incorrect beliefs and emotions that pose risks to one’s ataraxia.

Lucretius, the author of the Epicurean epic poem De Rerum Natura, espouses a somewhat different attitude toward poetry. Written in the 1st century B.C.E. in Latin, the poem is an exposition of Epicurean views including atomism, hedonistic ethics and epistemic dogmatism (especially against attacks from the Sceptics). As a whole, the poem engages very little with aesthetic issues, with the exception of the often-quoted passage from Book 1, in which Lucretius talks about the effects of poetry. He compares himself to a physician who, administering unpleasant-tasting wormwood, covers the brim of the glass with honey, not to deceive his patients, but to help them take the medicine and become better. In the same way, Lucretius himself sweetens doctrines that otherwise might seem woeful to those who are new to Epicureanism (1.931–50).

c. Catharsis

Catharsis is a psychological phenomenon, often associated with the effects of art on humans, famously described by Aristotle. There is, however, no explicit definition of catharsis in the extant Aristotelian corpus. Instead, we have a number of references to such a phenomenon. The one most pertinent to aesthetics is found in Poetics, where one of the defining features of tragedy is a catharsis of such emotions as fear and pity (1449b22–28). Another reference to catharsis can be found in Politics. Here Aristotle writes that music ought to be used for education, catharsis and other benefits (1341b37–1342a1). The lack of Aristotle’s own definition combined with the long and rich history of later interpretations of catharsis (see Halliwell 1998: app. 5) makes it hard to reconstruct a precise Aristotelian account of this term. It is arguably related to the influence that arts have on a person’s emotions and judgements that derive from those emotions (Politics 1340a1–1340b18).

It has been argued that the concept of catharsis has both religious and medical connotations, although more recent interpretations favour the view that it is primarily a psychological phenomenon that has certain ethical aspects (though it is not a means to learn ethics per se).

d. Sublime

Another aesthetic term that originated in antiquity, but was made famous by subsequent adaptations, especially by Kant and Burke, is that of the sublime. The main source for the theory of the sublime is the handbook on oratory titled Peri Hupsous (De Sublimitate in Latin), although it is also noteworthy that a notion of the sublime was known and used much more widely in antiquity (Porter 2016). The authorship of Peri Hupsous is disputed. The work has been attributed both to Cassius Longinus, a Greek rhetorician of the 3rd century C.E., and to an anonymous author of the 1st century C.E. referred to as pseudo-Longinus.

Fundamentally, the sublime as described by Longinus is a property of style, a “certain loftiness and excellence of language.” It does have some more striking aspects, however. For instance, Longinus states that:

A lofty passage does not convince the reason of the reader, but takes him out of himself . . . Skill in invention, lucid arrangement and disposition of facts, are appreciated not by one passage, or by two, but gradually manifest themselves in the general structure of a work; but a sublime thought, if happily timed, illumines an entire subject with the vividness of a lightning-flash, and exhibits the whole power of the orator in a moment of time (1).

Longinus suggests that sublimity originates from five different sources: (i) the greatness of thought; (ii) a vigorous treatment of passions; (iii) skill in employing figures of thought and figures of speech; (iv) dignified expressions, including the appropriate choice of words and metaphors; and (v) majesty and elevation of structure. The last cause of sublimity is said to embrace all the preceding ones as well (8.1).

4. References and Further Reading

a. Primary Sources

Armstrong, A. 1966–88. Plotinus: Enneads. 7 vols. Cambridge, MA: Harvard University Press.
Arnim, H. F. A. von. 1903–1924. Stoicorum Veterum Fragmenta. 3 vols. Leipzig: Teubner.
Bychkov, O. and A. Sheppard, eds. 2010. Greek and Roman Aesthetics. Cambridge: Cambridge University Press.
Cooper, J. and D. Hutchinson, eds. 1997. Plato: Complete Works. Indianapolis; Cambridge: Hackett.
Diels, H. and W. Kranz, eds. 1951–1952. Die Fragmente der Vorsokratiker, griechisch und deutsch. 3 vols. Berlin: Weidmannsche Buchhandlung.
Dyck, A. R. 1996. A Commentary on Cicero De Officiis. Ann Arbor: University of Michigan Press.
Goodwin, W. 1874. Plutarch’s Morals. Cambridge: John Wilson and Son.
Hicks, R. D. 1925. Diogenes Laertius: Lives of Eminent Philosophers. London: W. Heinemann; New York: G.P. Putnam’s Sons.
King, J. 1945. Cicero: Tusculan Disputations. Cambridge, MA: Harvard University Press.
Leonard, W. E. 1916. Lucretius: De Rerum Natura. London: Dent; New York: Dutton.
Long, A. and D. Sedley, eds. 1987. The Hellenistic Philosophers. 2 vols. Cambridge: Cambridge University Press.
O’Connor, E. M. 1993. The Essential Epicurus: Letters, Principal Doctrines, Vatican Sayings, and Fragments. Buffalo, N.Y.: Prometheus Books.
Roberts, W. R. 2011. Longinus on the Sublime: The Greek Text Edited after the Paris Manuscript. 2nd edn. Cambridge: Cambridge University Press.

b. Secondary Sources

Asmis, E. 1991. “Epicurean Poetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy 7, pp. 63–93. Reprinted in Philodemus and Poetry: Poetic Theory and Practice in Lucretius, Philodemus and Horace, ed. by D. Obbink, Oxford University Press 1995, pp. 15–34; and in Ancient Literary Criticism, ed. Andrew Laird, Oxford University Press 2006, pp. 238–66.
- (A discussion of the evidence concerning the views on poetry found in the works of Epicurus, Lucretius and Philodemus.)
Barney, R. 2010. “Notes on Plato on The Kalon and The Good.” Classical Philology 105(4): 363–377.
- (A discussion of functionality and its relationship to beauty in Plato’s works.)
Beardsley, Monroe C. 1966. Aesthetics from Classical Greece to the Present. New York: Macmillan.
- (Relevant sections of this book contain a classic interpretation of ancient aesthetics.)
Bernays, J. 1979. “Aristotle on the Effect of Tragedy.” In Articles on Aristotle, edited by J. Barnes, M. Schofield, and R. Sorabji. Vol. 4: Psychology and Aesthetics, 154–165. London. (Originally in Abhandlungen der historisch‐philosophischen Gesellschaft in Breslau, vol. 1, 1857: 135–202; and Sonderausgabe, Breslau 1857.)
- (A seminal paper for the study of Aristotle’s concept of catharsis; it argues that catharsis is the ‘purgation’ of emotions.)
Bett, R. 2010. “Beauty and its Relation to Goodness in Stoicism.” In Ancient Models of Mind, ed. A. Nightingale and D. Sedley, 130–152. Cambridge: Cambridge University Press.
- (In this paper, the evidence for the Stoic definition of beauty as summetria is collected and interpreted.)
Boudouris, K. ed. 2000. Greek Philosophy and the Fine Arts, Volume 2. Athens: International Centre for Greek Philosophy and Culture.
- (A large collection of papers on various aspects of ancient Greek aesthetics.)
Bychkov, O. 2010. Aesthetic Revelation: Reading Ancient and Medieval Texts after Hans Urs von Balthasar. Washington, D.C.: Catholic University of America Press.
- (A wide-scope monograph; the central argument concerns the notion of the revelatory aesthetics and its presence in ancient (and later) philosophical texts.)
Close, A. J. 1971. “Philosophical Theories of Art and Nature in Classical Antiquity.” Journal of the History of Ideas 32(2): 163–184.
- (A study of the notion of creator/designer in antiquity.)
Demand, N. 1975. “Plato and the Painters.” Phoenix 29(1): 1–20.
- (An article discussing Plato’s attitude to painting and the relationship between his views and contemporary painting traditions.)
Denham, A. ed. 2012. Plato on Art and Beauty. New York: Palgrave Macmillan.
- (A collection of papers on Plato’s philosophy of art.)
Destrée, P. and P. Murray, eds. 2015. A Companion to Ancient Aesthetics. Hoboken, NJ: Wiley-Blackwell.
- (A wide-ranging collection of extended entries, including such topics as mimesis, beauty, sublime, art and morality, tragic emotions and others.)
Ford, A. 1995. “Katharsis: The Ancient Problem.” In Performativity and Performance, edited by A. Parker and E. K. Sedgwick, 109–32. New York and London.
- (An interpretation of Aristotle’s concept of catharsis with an argument that the relevant passages from Politics help to shed light on the sparse description in Poetics.)
Gál, O. 2011. “Unitas Multiplex as the Basis of Plotinus’ Conception of Beauty: An Interpretation of Ennead V.8.” Estetika: The Central European Journal of Aesthetics 48(2): 172–198.
- (A paper arguing that, for Plotinus, beauty derives from Intellect and unity in diversity.)
Golden, L. 1973. “The Purgation Theory of Catharsis.” The Journal of Aesthetics and Art Criticism 31(4): 473–479.
- (An in-depth argument against Bernays’ interpretation of catharsis as purgation; it contains a suggestion that catharsis is better understood as intellectual clarification.)
Halliwell, S. 1991. “The Importance of Plato and Aristotle for Aesthetics.” Proceedings of the Boston Area Colloquium in Ancient Philosophy, vol. 7, pp. 321–48. New York: Routledge.
- (A paper arguing that Plato and Aristotle address issues that are pertinent to contemporary aesthetics.)
Halliwell, Stephen. 1998. Aristotle’s Poetics. 2nd edn. London: Duckworth.
- (An extensive study of Poetics, including a number of concepts central to Aristotle’s aesthetics; also includes appendices on the history of interpreting catharsis after Aristotle, dating of Poetics and others.)
Halliwell, Stephen. 2002. The Aesthetics of Mimesis: Ancient Texts and Modern Problems. Princeton: Princeton University Press.
- (A seminal study of the concept of mimesis in Greek philosophy and literature.)
Horn, H. -J. 1989. “Stoische Symmetrie und Theorie des Schönen in der Kaiserzeit.” Aufstieg und Niedergang der römischen Welt 36.3: 454–472.
- (A study of the Stoic definition of beauty as summetria.)
Hyland, D. 2008. Plato and the Question of Beauty. Bloomington & Indianapolis: Indiana University Press.
- (An interpretation of Plato’s notion of beauty in Symposium, Hippias Major and Phaedrus influenced by continental philosophy.)
Irwin, T. 2010. “The Sense and Reference of Kalon in Aristotle.” Classical Philology 105(4): 381–396.
- (An argument for avoiding an aesthetic translation of the term to kalon in Aristotle’s works on ethics.)
Kraut, R. 2013. “An aesthetic reading of Aristotle’s Ethics.” In Politeia in Greek and Roman Philosophy, ed. M. Lane and V. Harte, pp. 231–250. Cambridge: Cambridge University Press.
- (An argument for translating to kalon in Aristotle’s work as an aesthetic term.)
Kristeller, P. O. 1951. “The Modern System of the Arts: A Study in the History of Aesthetics Part I.” Journal of the History of Ideas 12(4): 496–527.
- (An article containing arguably the most significant critique of the notion of ancient aesthetics.)
Konstan, D. 2015. Beauty: The Fortunes of an Ancient Greek Idea. Oxford: Oxford University Press.
- (A wide-ranging study of the ancient Greek conception of beauty; includes a discussion of translating problematic aesthetic terms.)
Laird, A. ed. 2006. Ancient Literary Criticism. Oxford: Oxford University Press.
- (A collection of papers covering a wide range of topics including Aristotle’s catharsis, the views of the Hellenistic schools on poetry and Plato’s treatment of tragedy.)
Lear, J. 1988. “Katharsis.” Phronesis 33: 297–326.
- (An argument against the interpretation of catharsis as ‘purgation’ of emotions; and the suggestion that it is, instead, a psychological one with certain ethical connotations.)
Lear, G. R. 2006. “Aristotle on Moral Virtue and the Fine.” In The Blackwell Guide to Aristotle’s Nicomachean Ethics, ed. R. Kraut, pp. 116–136. Malden, MA; Oxford: Blackwell.
- (A study of Aristotle’s use of to kalon with the argument that Aristotle used this term (with its aesthetic undertones) to put an emphasis on certain properties of goodness, namely, intelligibility and pleasantness to contemplate.)
Lobsien, V. and C. Olk, eds. 2007. Neuplatonismus und Ästhetik: zur Transformationsgeschichte des Schönen. Berlin/New York: De Gruyter.
- (A collection of papers on Neoplatonist aesthetics.)
Lombardo, G. 2002. L’Estetica Antica. Bologna: Il Mulino.
- (A short monograph in Italian containing a discussion of views on aesthetics espoused by both major and lesser-known philosophical figures in antiquity.)
Nehamas, A. 2007. “‘Only in the Contemplation of Beauty is Human Life Worth Living’ Plato, Symposium 211d.” European Journal of Philosophy 15 (1): 1–18.
- (A discussion of the role that beauty plays in Plato’s Symposium.)
Nussbaum, M. 1990. Love’s Knowledge: Essays on Philosophy and Literature. Oxford: Oxford University Press.
- (The relevant sections of this book analyze the complex relationship between philosophy and literature in Plato’s works.)
Pappas, N. 2012. “Plato on Poetry: Imitation or Inspiration?” Philosophy Compass 7 (10): 669–678.
- (An argument that in Republic and Sophist, poetry is treated as imitation, whereas in Ion and Phaedrus, it is treated as inspiration. The relationship between the two views is explained by employing Plato’s concept of drama in Laws.)
Peponi, A. -E. 2012. Frontiers of Pleasure: Models of Aesthetic Response in Archaic and Classical Greek Thought. Oxford: Oxford University Press.
- (A study of the representations of aesthetic properties of artworks and other objects in ancient Greek texts, including philosophical ones.)
Pollitt, J. J. 1974. The Ancient View of Greek Art: Criticism, History, and Terminology. New Haven, CT: Yale University Press.
- (A seminal work on ancient Greek philosophy of art, it deals with not only philosophical but also literary, rhetorical and other kinds of texts.)
Porter, J. 2009. “Is Art Modern? Kristeller’s ‘Modern System of the Arts’ Reconsidered.” British Journal of Aesthetics 49: 1–24.
- (An article containing a critique of Kristeller’s dismissal of the possibility of ancient aesthetics.)
Porter, J. 2010. The Origins of Aesthetic Thought in Ancient Greece: Matter, Sensation and Experience. Cambridge: Cambridge University Press.
- (The central argument claims that Plato and Aristotle established formalist aesthetics, which dominated the tradition and silenced alternative, materialist aesthetics.)
Porter, J. 2016. The Sublime in Antiquity. Cambridge: Cambridge University Press.
- (A study of the notion of sublime outside Longinus’ treatise.)
Rogers, K. 1993. “Aristotle’s Conception of τὸ καλόν.” Ancient Philosophy 13:355–71. Reprinted in L. P. Gerson (ed.) 1999. Aristotle: Critical Assessments, iv. London: Routledge: 337–55.
- (An analysis and interpretation of Aristotle’s use of the term to kalon, especially his claim that virtues are undertaken for the sake of to kalon.)
Sheffield, F. 2006. Plato’s ‘Symposium’: The Ethics of Desire. Oxford: Oxford University Press.
- (A monograph on Plato’s Symposium; the central argument interprets the dialogue as concerned with moral education, but in a distinct way, that is, by means of the analysis of desire.)
Tatarkiewicz, W. 1974. The History of Aesthetics. Vol. 1. The Hague: Mouton.
- (A collection of ancient Greek philosophical texts on various topics in aesthetics accompanied by a commentary.)
Zagdoun, M. -A. 2000. La Philosophie Stoïcienne de l’art. Paris: CNRS Editions.
- (An extensive study of the notions of beauty and art in Stoic philosophy.)

Author Information

Aiste Celkyte
Email: aiste.celkyte@googlemail.com
Yonsei University
South Korea

Quantum Logic in Historical and Philosophical Perspective

Quantum Logic (QL) was developed as an attempt to construct a propositional structure that would allow for describing the events of interest in Quantum Mechanics (QM). QL replaced the Boolean structure, which, although suitable for the discourse of classical physics, was inadequate for representing the atomic realm. The mathematical structure of the propositional language about classical systems is a power set, partially ordered by set inclusion, with a pair of operations that represent conjunction and disjunction. This algebra is consistent with the discourse about both classical and relativistic phenomena, but inconsistent in a theory that prohibits, for example, giving simultaneous truth values to the following propositions: “The system possesses this velocity” and “The system is in this place.” The proposal of the founding fathers of QL was to replace the Boolean structure of classical logic by a weaker structure which relaxed the distributive properties of conjunction and disjunction.

During its development, QL started to refer not only to a logic, but also to the multiple lines of research that attempted to understand QM from a logical perspective. This article provides a map of these multiple approaches in order to introduce the very different strategies and problems discussed in the QL literature. When possible, unnecessary formulas are avoided in order to give an intuitive grasp of the concepts before deriving or introducing the associated mathematics. However, for those readers who wish to engage more profoundly with the subject of QL, the article provides an extensive bibliography.

1. Logic and Physics
2. The Logical Structure of Quantum Mechanics
3. The Origin of Quantum Logic
4. Quantum Logic in Historical and Philosophical Perspective
5. Ongoing Developments and Debates
6. Final Remarks
7. References and Further Reading

1. Logic and Physics

QL relates the two seemingly different disciplines of physics and logic. These disciplines have been intimately related since their origin. It was Aristotle who created classical logic and used it in order to develop his own physical and metaphysical scheme, providing an answer to the problem of movement and knowledge set down by the Heraclitean and Eleatic schools of thought. Movement was then regarded by Aristotle in terms of his hylomorphic scheme, as the path from a potential (undetermined, contradictory and non-identical) realm to an actual (determined, non-contradictory and identical) realm of existence. The notion of entity was then characterized by three main logical and ontological principles: The Principle of Existence (PE), which allowed Aristotle to claim existence about that which is predicated, the Principle of Non-Contradiction (PNC), which permitted him to argue that which exists possesses non-contradictory properties, and the Principle of Identity (PI), which allowed him to claim that the predicated existent is “the same,” or remains identical to itself, through time. Aristotle’s architectonic determined the fate of both classical and medieval physics, as well as metaphysics. The transformation from medieval to modern science coincides with the abolition of the Aristotelian hylomorphic metaphysical scheme as the foundation of knowledge. However, the basic structure of his metaphysical scheme and his logic still remained the basis for correct reasoning. As noted by Karin Verelst and Bob Coecke:

Dropping Aristotelian metaphysics, while at the same time continuing to use Aristotelian logic as an empty ‘reasoning apparatus’ implies therefore losing the possibility to account for change and motion in whatever description of the world that is based on it. The fact that Aristotelian logic transformed during the twentieth century into different formal, axiomatic logical systems used in today’s philosophy and science doesn’t really matter, because the fundamental principle, and therefore the fundamental ontology, remained the same ([40], p. xix). This ‘emptied’ logic actually contains an Eleatic ontology, that allows only for static descriptions of the world. [231, p. 173]

It was Isaac Newton who was able to translate into a closed mathematical formalism both the ontological presuppositions present in Aristotelian (Eleatic) logic, and the materialistic ideal of ‘res extensa’ together with actuality as its mode of existence. The term ‘actual’ refers here to preexistence (within the transcendent representation) and not to the observation hic et nunc. Every physical system may be described exclusively by means of its actual properties. The change of the system may be accounted for by the change of its actual properties. Potential or possible properties are then only considered as the points to which the system might arrive in a future instant of time. As Dennis Dieks states: “In classical physics the most fundamental description of a physical system (a point in phase space) reflects only the actual, and nothing that is merely possible. It is true that sometimes states involving probabilities occur in classical physics: think of the probability distributions ρ in statistical mechanics. But the occurrence of possibilities in such cases merely reflects our ignorance about what is actual. The statistical states do not correspond to features of the actual system, but quantify our lack of knowledge of those actual features.” [98, p. 124-125] In QM however, the different structure of the physical properties of the system determines a change of nature regarding the meaning of possibility and potentiality. Indeed, QM has been related to modality since 1926 when Max Born interpreted the quantum wave function Ψ in terms of a density of probability. However, it was clear from the very beginning that this new quantum possibility was something completely different from that considered in classical theories.

[The] concept of the probability wave [in quantum mechanics] was something entirely new in theoretical physics since Newton. Probability in mathematics or in statistical mechanics means a statement about our degree of knowledge of the actual situation. In throwing dice we do not know the fine details of the motion of our hands which determine the fall of the dice and therefore we say that the probability for throwing a special number is just one in six. The probability wave function, however, meant more than that; it meant a tendency for something. [152, p. 42]

According to Werner Heisenberg, the concept of the probability wave “was a quantitative version of the old concept of ‘potentia’ in Aristotelian philosophy. It introduced something standing in the middle between the idea of an event and the actual event, a strange kind of physical reality just in the middle between possibility and reality.” [152, p. 42] Indeed, contrary to classical possibility, which refers only to our incomplete knowledge of an actual state of affairs, quantum possibilities interact with each other. This fact, completely foreign to classical theories, is exploited by present technological developments in quantum information processing, for example, quantum computation, quantum cryptography and quantum teleportation. However, apart from this very fundamental question regarding the realm of existence which the logical structure of QM forces us to consider, there are many other aspects which have been a subject of discussion in the literature since the origin of QM. As a matter of fact, the interpretation of Planck’s quantum postulate, the superposition principle, the non-commutativity of observables or the identity of quantum particles, just to mention a few, pose important problems whose analysis helps us to consider coherently what QM is talking about. QL has been an important tool for discussing all these fascinating subjects.

2. The Logical Structure of Quantum Mechanics

In logical terms, Newtonian mechanics may be described through “the logic of an omniscient mind in a deterministic universe” [54] because in such a universe any assertion is semantically decided. That is, either proposition p or its negation ¬p is true (excluded middle principle), both assertions p and ¬p cannot be simultaneously true (PNC), meanings are sharp and unambiguous, and the meaning of a compound expression is determined by the meanings of its parts. From a mathematical perspective, both the syntactic and the semantic aspects of classical propositional logic can be described completely in terms of Boolean algebra. However, the structure of QM does not fit these features. The main reason for this is that in physical theories the information about the state of affairs is encoded in what is called “the physical state.” Both in classical mechanics and in QM there are states of maximal knowledge, but the logical implications that may be drawn from each situation are not the same. While in classical mechanics maximal information about a situation implies logical completeness, meaning that every assertion about the situation represented by the state is either true or false, in QM a state cannot decide the truth or falsity of all propositions about events. This is because there are states, called “superposition states,” that are related to both a property and its negation.

In classical physics every system can be described by specifying its actual properties. Mathematically, this happens by representing the state of a system of mass m by a point (p, q) in its corresponding phase space Γ of positions q and momenta p. Newton’s law tells us how this point moves along the path determined by the initial conditions. Physical magnitudes are represented by real functions over Γ. These functions commute with each other and can be interpreted as all possessing definite values at any time, independently of physical observations. Physical events are represented by subsets of Γ. The power set ℘(Γ), endowed with the set-theoretical operations of intersection (∩), union (∪) and set-complement, gives rise to a Boolean algebra. Interpreted as logical connectives, these operations represent and (∧), or (∨) and not (¬). The link between the algebraic structure of classical mechanics and classical logic is obvious. When dealing with many degrees of freedom, a statistical description is useful. The logical-algebraic structure associated with classical mechanics admits the definition of a probability measure over it with its elements considered as events. The resulting probability is a classical Kolmogorovian probability.
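The Boolean character of the classical event structure can be checked by brute force on a toy case. The following is only a minimal sketch: it assumes a hypothetical three-point "phase space" (a real Γ is a continuum), and the names GAMMA, power_set, is_distributive and satisfies_complement_laws are introduced purely for illustration.

```python
from itertools import combinations

# Hypothetical toy "phase space" with three points (an assumption for
# illustration; an actual phase space is continuous).
GAMMA = frozenset({"a", "b", "c"})

def power_set(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(x):
    """Set-complement relative to GAMMA, playing the role of 'not'."""
    return GAMMA - x

EVENTS = power_set(GAMMA)

def is_distributive(events):
    """Check both distributive laws for intersection (and) and union (or)."""
    return all(
        (x & (y | z)) == ((x & y) | (x & z))
        and (x | (y & z)) == ((x | y) & (x | z))
        for x in events for y in events for z in events
    )

def satisfies_complement_laws(events):
    """Check x or (not x) = top and x and (not x) = bottom for every event."""
    return all(
        (x | complement(x)) == GAMMA and (x & complement(x)) == frozenset()
        for x in events
    )

print(len(EVENTS), is_distributive(EVENTS), satisfies_complement_laws(EVENTS))
# prints: 8 True True
```

Every one of the eight events is either contained in a given state's singleton or disjoint from it, which is the finite counterpart of the logical completeness of classical states.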

According to John von Neumann’s axiomatization of QM, the mathematical representation of a physical system is a complex separable Hilbert space H, and a pure state is represented by a ray in H. In contrast to the classical scheme, physical magnitudes are represented by self-adjoint operators on H that, in general, do not commute under multiplication. The values that any magnitude may take are the eigenvalues of the corresponding operator, each one of which comes with its associated eigenstate. The non-commutativity of operators has problematic interpretational consequences, for it is then difficult to affirm that the quantum magnitudes thus represented are simultaneously pre-existent to observation. The evolution of the state is given by the Schrödinger equation that, due to its linearity, implies the formal existence of quantum superpositions of states. The fact that states may be linearly combined forbids the use of mere subsets as representatives of propositions; they are instead represented by closed subspaces of H.
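Non-commutativity can be seen in the smallest possible case. The sketch below is an illustration only: it assumes the two-dimensional space of a spin 1/2 system and uses two of the standard Pauli matrices as example observables; the helper names matmul and is_self_adjoint are hypothetical.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_self_adjoint(a):
    """Check that a equals its own conjugate transpose."""
    return all(a[i][j] == complex(a[j][i]).conjugate()
               for i in range(2) for j in range(2))

# Pauli matrices for spin along x and along z (illustrative choice of observables).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

XZ = matmul(SIGMA_X, SIGMA_Z)
ZX = matmul(SIGMA_Z, SIGMA_X)

print(is_self_adjoint(SIGMA_X), is_self_adjoint(SIGMA_Z))  # prints: True True
print(XZ)  # prints: [[0, -1], [1, 0]]
print(ZX)  # prints: [[0, 1], [-1, 0]]
```

Both matrices are self-adjoint, yet the two products differ, so the order in which the magnitudes are applied matters. Real functions on a classical phase space, by contrast, always commute under multiplication.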

Historically, the first approach to an idea of QL is in Chapter 3 of von Neumann’s book on the mathematical formulation of QM [234] where he relates linear operators, namely the projections on state space H, with the representatives of “experimental propositions” affiliated with the system: “[…] the relation between the properties of a physical system on the one hand, and the projections on the other, makes possible a sort of logical calculus with these.” In fact, closed subspaces are in one-to-one correspondence with the projectors over them: “If we introduce, along with the projections E, the closed linear manifold R belonging to them (E = P_R), then the closed linear manifolds correspond equally to the properties of S [S is the system].” [234, p. 250] The set of closed subspaces of H, ordered by inclusion and equipped with adequate definitions of algebraic operations, gives rise to a lattice [180], namely a partially ordered set (L,∨,∧) in which every pair of elements has a supremum called join (∨) and an infimum called meet (∧) that satisfy:

commutative laws for the meet and join operations: x ∨ y = y ∨ x, x ∧ y = y ∧ x
absorption laws: x ∨ (x ∧ y) = x, x ∧ (x ∨ y) = x
associative laws: x ∨ (y ∨ z) = (x ∨ y) ∨ z, x ∧ (y ∧ z) = (x ∧ y) ∧ z

The lattice may have a maximum (or top) 1, which is the identity for the ∧ operation, and a minimum (or bottom) 0, the identity for the ∨ operation. A lattice (L,∨,∧,1,0) is said to be modular when for all elements x, y and z, if x ≤ z, then

x ∨ (y ∧ z) = (x ∨ y) ∧ z

An orthocomplement x^⊥ of the element x is defined in such a way that it satisfies:

the complement law: x^⊥ ∨ x = 1 and x^⊥ ∧ x = 0
the involution law: x^⊥⊥ = x
the order-reversing law: if x ≤ y then y^⊥ ≤ x^⊥.

A lattice equipped with an orthocomplementation is called an ortholattice, and a modular ortholattice is in particular orthomodular (the converse does not hold in general). The lattice of closed subspaces of H, denoted by L(H), is called the Hilbert lattice associated to H and motivates the standard QL [41].

This is the proposal of Garrett Birkhoff and J. von Neumann for the algebraic structure that organizes the propositions of the language of QM. This structure is quite different from the classical one. In fact, as mentioned above, in classical logic the propositions organize themselves in the power set with operations ∧, ∨ and ¬ representing the classical language connectives and, or and not. This structure constitutes a Boolean algebra that satisfies the distributive laws of and and or:

(x ∧ y) ∨ z = (x ∨ z) ∧ (y ∨ z)

(x ∨ y) ∧ z = (x ∧ z) ∨ (y ∧ z)

Closed subspaces of a Hilbert space H form an algebra called a Hilbert lattice, denoted by L(H). In any Hilbert lattice the meet operation ∧ corresponds to set theoretical intersection between subspaces, and the join operation ∨ corresponds to the smallest closed subspace of H containing the set theoretical union of subspaces. In this way, the ordering relation ≤ associated to the lattice corresponds to the set-theoretical inclusion of subspaces. Note that L(H) is a bounded lattice where H is the maximum, denoted by 1, and the zero subspace {0} is the minimum, denoted by 0. Equipped with the operation of orthogonal complement ^⊥, this lattice can be described as an ortholattice [162].
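The meet, join and orthocomplement just described can be made concrete in a very small model. The sketch below is a simplification under stated assumptions: it replaces the complex Hilbert space by the real plane, admits only lines with integer direction vectors so that comparisons are exact, and uses hypothetical names (ZERO, WHOLE, line, meet, join, perp). It is meant only to show how a lattice of subspaces differs from a Boolean algebra of subsets.

```python
from math import gcd

ZERO = ("zero",)    # the zero subspace {0}, bottom element 0
WHOLE = ("whole",)  # the whole plane, top element 1

def line(a, b):
    """One-dimensional subspace spanned by the nonzero integer vector (a, b)."""
    g = gcd(a, b)
    a, b = a // g, b // g
    if a < 0 or (a == 0 and b < 0):
        a, b = -a, -b  # fix the sign so each line has a single representation
    return ("line", a, b)

def meet(x, y):
    """Intersection of two subspaces."""
    if x == y:
        return x
    if x == WHOLE:
        return y
    if y == WHOLE:
        return x
    return ZERO  # zero with anything, or two distinct lines

def join(x, y):
    """Smallest subspace containing both x and y (their span)."""
    if x == y:
        return x
    if x == ZERO:
        return y
    if y == ZERO:
        return x
    return WHOLE  # whole with anything, or two distinct lines

def perp(x):
    """Orthogonal complement."""
    if x == ZERO:
        return WHOLE
    if x == WHOLE:
        return ZERO
    _, a, b = x
    return line(-b, a)

up, down, diag = line(1, 0), line(0, 1), line(1, 1)

# The complement law holds for every element tried here.
samples = [ZERO, WHOLE, up, down, diag, line(1, -1)]
print(all(join(x, perp(x)) == WHOLE and meet(x, perp(x)) == ZERO
          for x in samples))  # prints: True

# Distributivity fails: diag ∧ (up ∨ down) differs from (diag ∧ up) ∨ (diag ∧ down).
lhs = meet(diag, join(up, down))
rhs = join(meet(diag, up), meet(diag, down))
print(lhs == diag, rhs == ZERO)  # prints: True True
```

The join of two distinct lines is the whole plane rather than their set-theoretical union, which is why a vector on the diagonal lies in up ∨ down without lying in either up or down.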

3. The Origin of Quantum Logic

The official birth of QL came with the 1936 seminal paper “The logic of quantum mechanics,” in which Birkhoff and von Neumann proposed a non-classical logic for the theory, arguing that the question of whether the Hilbert space formalism displayed a logical structure could prove useful to the understanding of QM. In the introduction to the paper they make the point:

One of the aspects of quantum theory which has attracted the most general attention is the novelty of the logical notions it presupposes. It asserts that even a complete mathematical description of a physical system S does not in general enable one to predict with certainty the result of an experiment on S, and that in particular one can never predict with certainty both the position and the momentum of S (Heisenberg’s uncertainty principle). It further asserts that most pairs of observations cannot be made on S simultaneously (Principle of Non-commutativity of Observations). […] The object of the present paper is to discover what logical structure one may hope to find in physical theories which, like quantum mechanics, do not conform to classical logic. [41]

As said above, the propositional structure that gave rise to QL was the ortholattice <L(H), ∨, ∧, ^⊥, 1, 0>. The different character of the representatives of the logical connectives completely changes the meaning of these connectives. A relevant feature of ∨ is that, unlike in classical semantics, a quantum disjunction may be true even if neither of its members is true. This reflects, for example, the case of a spin 1/2 system whose state is a linear combination of the states up and down. Both propositions, “the state is up” and “the state is down,” may have no definite truth value (the excluded middle principle is violated), but the disjunction “the state is up or the state is down” is a tautology. The distinguishing character of the structure is the failure of the distributive law, a law that holds in classical logic. This means that, for propositions x, y and z, in general

x ∧ (y ∨ z) ≠ (x ∧ y) ∨ (x ∧ z)

Birkhoff and von Neumann remarked on this fact in their paper: “[…] whereas logicians have usually assumed that properties of negation were the ones least able to withstand a critical analysis, the study of mechanics points to the distributive identities as the weakest link in the algebra of logic.” They concluded that “the propositional calculus of quantum mechanics has the same structure as an abstract projective geometry.” However, L(H) satisfies a kind of weak distributivity. In the case of a finite-dimensional Hilbert space H, the ortholattice L(H) is modular, that is, it satisfies the following condition known as the modular law:

x ≤ y ⟹ x ∨ (y ∧ z) = y ∧ (x ∨ z)

The modular law is equivalent to the identity (x ∧ y) ∨ (y ∧ z) = y ∧ ((x ∧ y) ∨ z). In the case of an infinite-dimensional Hilbert space the modular law is not satisfied. In 1937, Kodi Husimi [156] showed that a weaker law, the so-called orthomodular law, is satisfied in the ortholattice L(H). The orthomodular law says:

x ≤ y ⟹ x ∨ (x^⊥ ∧ y) = y

and it is equivalent to the identity x ∨ y = ((x ∨ y) ∧ y^⊥) ∨ y [180]. Modularity is an important point for the purpose of defining a probability measure that could be interpreted in terms of relative frequencies (see for example [210, ch. 7]). But when the elements of a non-modular lattice such as L(H) are taken as events, this is not possible. Josef Maria Jauch remarked that:

Birkhoff and von Neumann […] have tried to justify modularity by pointing out that on finite modular lattices one can define a dimension function […] Such a function has the characteristic properties of a probability measure, and d(a) would represent the a priori probability for finding the system with property a when nothing is specified as to its preparation. It is known that there are systems for which such a finite a priori probability does not exist. [159, p. 83]

Since the lattice of subspaces (or projection operators) L(H) was not in general a modular one, precluding a nice definition of probability (see for example [210, Ch. 7] and [212]), von Neumann abandoned the Hilbert space structure for the formulation of QM and turned to the study of rings of operators, which in turn gave rise to von Neumann algebras [235].

Before discussing the historical development it has to be said that the name “quantum logic” is somewhat misleading. As Dalla Chiara et al. remark: “by standard quantum logic one usually means the complete orthomodular lattice based on the closed subspaces in a Hilbert space. Needless to observe, such a terminology that identifies a logic with a particular example of an algebraic structure turns out to be somewhat misleading from the strict logical point of view.” [78] Different forms of QL may be constructed by building algebraic or Kripkean semantics over the algebraic structure of the Hilbert space (see for example [81]).

4. Quantum Logic in Historical and Philosophical Perspective

QL has been a field of debate in philosophy as well as in quantum physics. Within QL many different philosophical approaches and lines of research have been developed, discussed and addressed. From neo-Kantianism to empiricism and Aristotelian realism, quantum logical research has opened the door to one of the most interesting debates in both physics and philosophy of physics in the second half of the last century. However, even though there are many different perspectives regarding QL, one might characterize its most general interpretational characteristic in terms of a strategically subversive attitude towards classical logic and the very foundations of metaphysical understanding. In this respect, in order to clarify the vast map of interpretations about QM and to discover the physical meaning of the theory, one can consider the strategies that different interpreters have taken. While a first group started from a set of (classical) metaphysical presuppositions and intended to change the formalism in order to fit QM into their desired metaphysical picture (for example, Bohmian mechanics and the Ghirardi-Rimini-Weber theory), a second group took the orthodox formalism as a standpoint and concentrated their efforts on trying to understand the symmetries and characteristics of the formalism in order to derive a suitable interpretation of the theory. Much more open to an original metaphysical development that would allow us to understand what the world is like according to QM, QL is, apart from some minor exceptions, clearly part of the latter group.

a. The Neo-Kantian Logical Path

As recalled by Heisenberg in Physics and Philosophy [152], the concern about objectivity and the use of ordinary language for quantum concepts was an important focus of discussion during the development of the theory:

The most difficult problem, however, concerning the use of language arises in quantum theory. Here we have at first no simple guide for correlating the mathematical symbols with concepts of ordinary language: and the only thing we know from the start is the fact that our common concepts cannot be applied to the structure of the atoms. […] The analysis can now be carried further in two entirely different ways. We can either ask which language concerning the atoms has actually developed among physicists in the thirty years that have elapsed since the formulation of QM. Or we can describe the attempts for defining a precise scientific language that corresponds to the mathematical scheme. In answer to the first question one may say that the concept of complementarity introduced by Bohr into the interpretation of quantum theory has encouraged physicists to use an ambiguous language rather than an unambiguous language. [152, p. 153]

Criticizing this use, Heisenberg argues that “it seems rather doubtful whether an expectation [referring to the use of classical concepts] should be called objective.” A different approach, initiated by Birkhoff and von Neumann and continued by Carl Friedrich von Weizsäcker in the fifties, would be “to define a different precise language which follows definite logical patterns in conformity with the mathematical scheme.” Von Weizsäcker, as well as Hans Reichenbach [208], did so by modifying the principle of excluded middle. As this principle is used in everyday conversation, von Weizsäcker proposed to distinguish different levels of language: one level referring to objects, a second level to statements about objects, a third level to statements about statements about objects and so on. The modification of classical logic has to refer, first of all, to the level of objects. As the state of a system allows us to predict with some probability the different properties it could possess, von Weizsäcker introduced the concept of “degrees of truth.” For each pair of properties, the question about its truth is not decided. But ‘not decided’ is by no means equivalent to ‘not known’. This kind of many-valued logic may be extended to the successive levels of language.

As Heisenberg remarks, it is not clear at first sight which kind of ontology would underpin these modified logical patterns; the main concern of the project is to find a logical system associated with the algebraic structure of the theory. Von Weizsäcker advanced this approach from the idea of reconstructing physics in terms of yes-no alternatives, called ur-alternatives (from the German prefix ‘Ur’: original), and of establishing a connection between quantum structures and the structure of space-time. These ur-alternatives are considered the fundamental objects in physics from which, in principle, any physical object can be built. Thus, from a notion related to information a turn is made to the notion of physical object: objects are reduced to, or even “made out of,” information [178]. Later, Holger Lyre would also argue in favor of this possibility:

In quantum theory in particular, this view has a lot of plausibility. Quantum objects are represented in terms of their Hilbert state spaces, their quantum states correspond to empirically decidable alternatives. Any quantum object may further be de-composed or embedded into the tensor product of two objects, nowadays called quantum bits or qubits. Urs, therefore, are in fact nothing but qubits. [178]

In the seventies, Peter Mittelstaedt, a student of Heisenberg and von Weizsäcker, continued QL research framed within the neo-Kantian tradition [48]. Contrary to classical physics, where all propositions about a system can be predicated together, quantum properties may be assigned values only in a contextual manner [167], thus forbidding an interpretation in terms of substance. According to this view, the category of substance can be applied only to compatible observables; that is, in the case in which the state of the system is such that these observables may be assigned definite values. Classical logic, in turn, allows truth values for all propositions and thus it is not adequate for propositions about a quantum system, where the empirical content of propositions is relevant when applying the rules of logic. With the assumption that the laws of logic ought to be universally valid, Mittelstaedt turned to search for a different foundation of logic that could allow proofs to be independent of the empirical content of statements. First, he called attention to the fact that commensurability between any two propositions is implicit in classical logic. Then, starting from elementary propositions that assert that a system has a certain property, which can be valued by testing the property in an experiment, the concept of dialog-game was introduced. Several kinds of compound propositions may be defined by specifying the dialog-game. By adding a commensurability relation to the Hilbert lattice before constructing a formal propositional logic, Mittelstaedt was able to complete a calculus that is a model of L. By means of this concept of commensurability, the dialog-game gives a complete frame for argumentation [185, Ch. 4].
Then he introduced modalities and probability as metalinguistic concepts [186, 187, 188] as well as establishing that only by employing adequate notions of ‘temporal identity’ and ‘transworld identity’ might a Kripke-like semantics be formulated in QL [189, 190].

During the eighties and nineties, in line with the neo-Kantian QL line of research, the French philosopher Michel Bitbol analyzed the different alternatives of the language of physical properties and their role in objectivity. Although he admitted that Kant’s reasoning had to be greatly altered to become applicable to QM, he nevertheless outlined a derivation of QL from transcendental arguments [43]. First, contextuality is pointed to as the main characteristic that has to be focused on when applying the program. In the classical case:

[…] a phenomenon is usually (or even always) relative to a certain context which defines the range of possible phenomena to which it belongs. […] As long as the contexts can be combined, or at least as long as the phenomena can be made indifferent to the order and chronology of use of the contexts, nothing prevents one from merging the distinct ranges of possible phenomena relative to each context into a single range of possible conjunctions of phenomena. This being done, one may consider that the new range of possible compound phenomena is relative to a single ubiquitous context which is not even worth mentioning. [43]

In classical physics, the rules of classical logic hold in every context, but they also hold when merging the contexts. This is not the case in QM. Although Boolean algebra and the corresponding laws of classical logic may be used to deal with propositions about qualities in each context, when considering them all together the structure is that of L(H). To show explicitly how the different languages link together, classical languages using classical connectives are implemented in each context, then a meta-language is constructed using a relation of implication, that is, one language implies another one if and only if every sentence in the first is also a sentence in the other. This implication is broader than the mere ‘union’ of both languages because it contains not only the propositions of each contextual language, their conjunctions and disjunctions but also new ones. The combination of contexts has more consequences than the ones that occur when they are used separately. This construction is shown to be nothing but an orthocomplemented non-distributive lattice [42, Annexe I]. Thus, Bitbol [43] concludes that “the specific structure of QL is unavoidable when unification of contextual languages at a meta-linguistic level is demanded. In this sense, one can say that QL has been derived by means of a transcendental argument: it is a condition of possibility of a meta-language able to unify context-dependent experimental languages.” For a complete review of the neo-Kantian line of research within QM we refer to [163].

b. Quantum Logical Operationalism

The Birkhoff-von Neumann paper initiated the search for an axiomatic theory in which the Hilbert space structure, which lacks a direct physical justification, would be derived from a set of physically motivated axioms, giving particular importance to the concept of experimental propositions. Following this line of thought, George Mackey published in 1963 a monograph [179] in which he recovered von Neumann’s idea of “projections as propositions” [234, p. 247]. As projections have only two eigenvalues, 0 and 1, one may think of the proposition associated with a projection as the answer “yes” or “no” to the corresponding question. Thus, Mackey referred to the propositions affiliated with a physical system as questions [179, p. 64] and, under a reasonable axiomatization, showed that the questions form an orthomodular lattice. In this frame, the question of “which measures on questions are to be regarded as states?” [179, p. 85] was answered by Mackey’s student Andrew Gleason:

A measure on the closed subspaces means a function µ which assigns to every closed subspace a nonnegative real number, such that if {A_i} is a countable collection of mutually orthogonal subspaces having closed linear span B, then µ(B) = Σµ(A_i). It is easy to see that such a measure can be obtained by selecting a vector v and, for each closed subspace A, taking µ(A) as the square of the norm of the projection of v on A. Positive linear combinations of such measures lead to more examples and, passing to the limit, one finds that, for every positive semi-definite self-adjoint operator T of the trace class µ(A) = tr(TP_A), where P_A denotes the orthogonal projection on A, defines a measure on the closed subspaces. It is the purpose of this paper to show that, in any separable Hilbert space of dimension at least three, whether real or complex, every measure on the closed subspaces is derived in this fashion. [144]
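The trace formula in the passage above can be checked on a small case. The following Python sketch is only an illustration: the operator T, the chosen subspaces of a three-dimensional real space, and the helper names (outer, add, matmul, trace, mu) are assumptions made here and do not come from the cited paper. Exact rational arithmetic is used so that additivity over orthogonal subspaces can be compared without rounding.

```python
from fractions import Fraction as F

def outer(v, scale):
    """Return the matrix scale * v v^T for a real vector v."""
    return [[scale * a * b for b in v] for a in v]

def add(X, Y):
    """Entrywise sum of two square matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    """Product of two square matrices of the same size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def mu(T, P):
    """Measure of the subspace with orthogonal projection P: tr(T P)."""
    return trace(matmul(T, P))

# Hypothetical trace-one operator: an equal mixture of the unit vectors
# (1,1,1)/sqrt(3) and (1,0,0).
T = add(outer([1, 1, 1], F(1, 6)), outer([1, 0, 0], F(1, 2)))

# Projections onto two orthogonal lines and onto the plane they span.
P_a = outer([1, 1, 0], F(1, 2))    # line through (1, 1, 0)
P_b = outer([1, -1, 0], F(1, 2))   # line through (1, -1, 0)
P_plane = add(P_a, P_b)            # plane spanned by both lines
P_c = outer([0, 0, 1], F(1))       # line orthogonal to that plane

print(mu(T, P_a), mu(T, P_b), mu(T, P_plane))  # 7/12 1/4 5/6
print(mu(T, add(P_plane, P_c)))                # 1
```

The measure of the plane equals the sum of the measures of the two orthogonal lines, and the whole space receives measure one, as the definition of a measure on closed subspaces requires.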

In some sense, Mackey’s program is a reconstruction of QM as non-classical probability calculus. Mackey’s investigations on the foundations of QM renewed interest in the somewhat forgotten subject of QL, and also in its connection with the study of orthomodular lattices. Varadarajan’s and Jauch’s books [230, 159] follow from this. For example, some mathematical aspects of the notion of probability involved by the density operator have been studied by Veeravalli Varadarajan [229]. But it was the representation theorem of Constantin Piron [194] which clarified the field. The theorem states that if L is a complete orthocomplemented atomic lattice which is weakly modular and satisfies the covering law, then each irreducible component of the lattice L can be represented as the lattice of all biorthogonal subspaces of a vector space V over a division ring K. The Solèr theorem then proves that an infinite dimensional orthomodular space over a division ring which is the real or complex numbers or the quaternions, is a Hilbert space [219].

In the sixties, Jauch and Piron [194, 159] also aimed at reconstructing the formalism of QM from first principles with special interest in the relation between concepts and real physical operations that can be performed in the laboratory. For example, states are defined as “the result of a series of physical manipulations on the system which constitute the preparation of the state.” And it is emphasized that “[t]wo states are identical if the relevant conditions in the preparation of the state are identical. (The distinction between the system and its states cannot be maintained under all circumstances with the precision implied by this definition. The reason is that systems which we regard under normal circumstances as different may be considered as two different states of the same system. An example is a positronium and a system of two photons.)” [159, p. 92] The same prescriptions follow for propositions: “the composed proposition a ∧ b denotes the measurement of a and b.” [159, Sect. 5.3] Due to the prescription that every notion should be defined in terms of operations, this line of research is called operationalism. Operational QL involves the fact that the yes-no answers to the elementary questions, or the “experimental propositions” of Birkhoff and von Neumann, may be regarded as the propositions of a non-classical logic. Moreover, its purpose is to attempt to give an independent motivation to the general program to understand QM [58]. According to [12], the main operationalist lines of research are the following: the Geneva school, led by Jauch and Piron [159, 195, 197] and continued by Piron’s student Diederik Aerts [7, 9, 10, 11] in Brussels; the Amherst approach, which in the words of David Foulis and Charles Randall should be called “empirical logic” [122, 123, 124, 127]; and finally the Marburg approach, directed by Günter Ludwig [176, 177].

One of the main results of the operational line of research is due to Aerts in 1981. Orthodox QL faces a deep problem in treating composite systems. In fact, when considering two classical systems, it is meaningful to organize the whole set of propositions about them in the corresponding Boolean lattice built up as the Cartesian product of the individual lattices. Informally one may say that each factor lattice corresponds to the properties of each physical system. But the quantum case is completely different. When two or more systems are considered together, the state space of their pure states is taken to be the tensor product of their Hilbert spaces. Given the Hilbert state spaces H₁ and H₂ as representatives of two systems, the pure states of the compound system are given by rays in the tensor product space H = H₁ ⊗ H₂. But it is not true, as a naive classical analogy would suggest, that any pure state of the compound system factorizes after the interaction into pure states of the subsystems, and that they evolve with their own Hamiltonian operators. It was shown, in a non-separability theorem by Aerts [7], that when trying to repeat the classical procedure of taking the tensor product of the lattices of the properties of two systems, to obtain the lattice of the properties of the composite, the procedure fails [5, 6, 8, 57, 125, 126]. Attempts to vary the conditions that define the product of lattices have been made, but in all cases the result is that the Hilbert lattice factorizes only in the case in which one of the factors is a Boolean lattice, or when the systems have never interacted. Using the operationalist approach, two Belgian students of Aerts, Bob Coecke and Sonja Smets, outlined a research program on dynamic QL [62, 60, 218] (see Section 5.2).
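The failure of factorization for pure states of a compound system can be seen already for two two-level systems. The sketch below is an illustration under stated assumptions, not part of Aerts’s theorem: the function name is_product_state and the two example states are chosen here. It uses the elementary fact that a pure state with amplitudes c[i][j] on H₁ ⊗ H₂ (both two-dimensional) is a product state exactly when the 2×2 amplitude matrix has rank one, that is, when its determinant vanishes.

```python
import math

def is_product_state(c, tol=1e-12):
    """c[i][j] is the amplitude of |i>|j> in a two-qubit pure state.

    The state factorizes as a tensor product exactly when the 2x2
    amplitude matrix has rank one, i.e. when its determinant vanishes.
    """
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)
product = [[s, s], [0.0, 0.0]]   # |0> tensor (|0> + |1>)/sqrt(2)
bell = [[s, 0.0], [0.0, s]]      # (|00> + |11>)/sqrt(2)

print(is_product_state(product))  # True
print(is_product_state(bell))     # False
```

The second state is a ray in the tensor product space that cannot be written as a pair of pure states of the subsystems, which is the situation the classical Cartesian-product picture cannot accommodate.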

c. Is Quantum Logic Empirical?

During the late sixties and beginning of the seventies there was a radical philosophical view initiated by David Finkelstein [120, 121] and Hilary Putnam [202, 203] arguing that logic is in a certain sense empirical.

Finkelstein highlighted the abstractions we make in passing from mechanics to geometry to logic, and suggested that the dynamical processes of fracture and flow already observed at the first two levels should also arise at the third. Putnam, on the other hand, argued that the metaphysical pathologies of superposition and complementarity are nothing more than artifacts of logical contradictions generated by an indiscriminate use of the distributive law. [58]

According to Putnam’s famous paper [202]: “Logic is as empirical as geometry. We live in a world with a non-classical logic.” For Putnam in that specific period, the elements of L(H) represent categorical properties that an object does or does not possess, independently of whether or not we look. Inasmuch as this picture of physical properties is confirmed by the empirical success of QM, this view means we must accept that the way in which physical properties actually hang together is not Boolean. Since logic is, for Putnam, very much the study of how physical properties actually hang together, he concludes that classical logic is simply mistaken: the distributive law is not universally valid.
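A minimal sketch of why the distributive law fails in L(H) can be given in the real plane, whose subspaces are the zero subspace, the lines through the origin, and the whole plane. The encoding below (the constants ZERO and ONE, and the functions line, same, meet, join) is an assumption made for illustration only; meet is intersection and join is linear span.

```python
ZERO, ONE = "0", "1"   # the zero subspace and the whole plane

def line(x, y):
    """The one-dimensional subspace of R^2 spanned by (x, y) != (0, 0)."""
    return ("line", x, y)

def same(a, b):
    """True if a and b denote the same subspace."""
    if a in (ZERO, ONE) or b in (ZERO, ONE):
        return a == b
    # Two lines coincide iff their spanning vectors are parallel.
    return a[1] * b[2] - a[2] * b[1] == 0

def meet(a, b):
    """Intersection of two subspaces."""
    if same(a, b):
        return a
    if a == ONE:
        return b
    if b == ONE:
        return a
    return ZERO  # two distinct lines, or one argument is ZERO

def join(a, b):
    """Linear span of two subspaces."""
    if same(a, b):
        return a
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return ONE   # two distinct lines, or one argument is ONE

a, b, c = line(1, 0), line(0, 1), line(1, 1)
lhs = meet(a, join(b, c))           # a AND (b OR c)
rhs = join(meet(a, b), meet(a, c))  # (a AND b) OR (a AND c)
print(lhs == a, rhs == ZERO)        # True True
```

For three distinct lines, a ∧ (b ∨ c) is the line a itself, whereas (a ∧ b) ∨ (a ∧ c) is the zero subspace, so the two sides of the distributive law differ.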

d. Modal Interpretations

The study of the modal character of QM was explicitly formalized in the seventies and eighties by a group of physicists and philosophers of science. Bas van Fraassen was the first to formally include the reasoning of modal logic in QM. He presented a modal interpretation (MI) of QL in terms of its semantical analysis [224, 225, 226, 227]. Its purpose was to clarify which properties, among those of the complete set structured in the lattice of subspaces of Hilbert space, pertain to the system. Van Fraassen’s position remains close to the tradition introduced by Niels Bohr and his interpretation of QM. Indeed, the relation of van Fraassen’s interpretation to the orthodox view can be seen as a consequence of maintaining a “conservative” position regarding the values of definite properties [228, p. 280].

In 1985, Simon Kochen presented his own modal version [166] at one of the famous conferences on the foundations of QM organized by Kalervo Laurikainen in Finland. This interpretation of QM also has a direct link to the discussions between the founding fathers of the theory. Von Weizsäcker and Thomas Görnitz referred specifically to it in a paper entitled “Remarks on S. Kochen’s Interpretation of Quantum Mechanics”:

We consider it is an illuminating clarification of the mathematical structure of the theory, especially apt to describe the measuring process. We would, however feel that it means not an alternative but a continuation to the Copenhagen interpretation (Bohr and, to some extent, Heisenberg). [236, p. 357]

Dennis Dieks’ interpretation can be considered as a continuation and a formal account of Bohr’s ideas on complementarity and measurement. Taking as a standpoint the work done by van Fraassen, Dieks went further in relation to the metaphysical presuppositions involved, making explicit the idea that MIs [94, 95, 96, 97] could be also considered from a realist stance as describing systems with properties. If considered from this perspective, MIs face the problem of finding an objective reading of the accepted mathematical formalism of the theory, a reading “in terms of properties possessed by physical systems, independently of consciousness and measurements (in the sense of human interventions).” [97] Thus the main problem they must face is the determination of the set of definite valued properties possessed by a physical system, avoiding the constraints imposed by the Kochen-Specker (KS) theorem [167] (for a discussion see [220]). Of course, the way in which MIs attack the problem rests on the distinction between the realms of possibility and actuality.

As noted by Dirac in the first chapter of his famous book [99], the existence of superpositions is responsible for the striking difference between quantum and classical behavior. Superpositions are also central when dealing with the measurement process, where the various terms associated with the possible outcomes of a measurement must be assumed to be present together in the description. This fact leads van Fraassen to the distinction between value-attributing propositions and state-attributing propositions, between value-states and dynamic-states:

[…] a state, which is in the scope of quantum mechanics, gives us only probabilities for actual occurrence of events which are outside that scope. They can’t be entirely outside the scope, since the events are surely described if they are assigned probabilities; but at least they are not the same things as the states which assign the probability.

In other words, the state delimits what can and cannot occur, and how likely it is—it delimits possibility, impossibility, and probability of occurrence—but does not say what actually occurs. [228, p. 279]

So, van Fraassen distinguishes propositions about events and propositions about states. Propositions about events are value-attributing propositions < A,σ >, they say that ‘observable A has a certain value belonging to a set σ.’ Propositions about states are of the form ‘the system is in a state of this or that type (in a pure state, in some mixture of pure states, in a state such that…).’ A state-attribution proposition [A,σ] gives a probability of the value-attribution proposition, it states that A will have a value in σ, with a certain probability. Value-states are specified by stating which observables have values and what these values are. Dynamic-states state how the system will develop. This is endowed with the following interpretation:

The interpretation says that, if a system X has dynamic state ρ at t, then the state-attributions [A,σ] which are true are those that Tr(ρP_σ^A) = 1 [have probability equal to one]. [P_σ^A is the projector over the corresponding subspace.] About the value-attributions, it says that they cannot be deduced from the dynamic state, but are constrained in three ways:

1. If [A,σ] is true then so is the value-attribution < A,σ >: observable A has value in σ.
2. All the true value-attributions should have Born probability 1 together.
3. The set of true value-attributions is maximal with respect to the feature (2.) [228, p. 281]

This interpretation informs the consideration of possibility in the realm of QL [228, chapter 9]. In fact, the probabilities are of events, each describable as ‘an observable having a certain value’, corresponding to value states. If w is a physical situation in which system X exists, then X has both a dynamic state ϕ and a value state λ, that is, w = < ϕ,λ >. A value state λ is a map of observable A into non-empty Borel sets σ such that it assigns {1} to 1_σ A. Here 1_σ is the characteristic function of the set σ of values. So, if the observable 1_σ A has value 1, then it is impossible that A has a value outside σ. The proposition < A,σ > = {w : λ(w)(A) ⊆ σ} assigns values to physical magnitudes. This is a value-attribution proposition and is read as ‘A (actually) has value in σ’. V is called the set of value attributions: V = {< A,σ > : A an observable and σ a Borel set}. The logic operations among value-attribution propositions are defined as:

< A,σ >^⊥ = < A, −σ >, < A,σ > ∧ < A,θ > = < A,σ∩θ >, < A,σ > ∨ < A,θ > = < A,σ∪θ > and ∧{< A,σ_i > : i ∈ N} = < A, ∩{σ_i : i ∈ N} >.

With all this, V is the union of a family of Boolean sigma algebras < A > with common unit and zero equal to < A,S(A) > and < A,∧ > respectively. The Law of Excluded Middle is satisfied: every situation w belongs to q ∨ q^⊥, but not the Law of Bivalence: situation w may belong neither to q nor to q^⊥.
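The difference between the Law of Excluded Middle and the Law of Bivalence in this setting can be illustrated with a toy model. The sketch below assumes a single observable A with a hypothetical three-element spectrum and identifies a situation w with the set λ(w)(A) of values its value state allows; the names SPECTRUM, belongs, complement, join and meet are introduced here for illustration and are not van Fraassen’s notation.

```python
SPECTRUM = frozenset({1, 2, 3})  # hypothetical finite spectrum S(A)

def belongs(value_set, sigma):
    """w belongs to < A,sigma > iff lambda(w)(A) is contained in sigma."""
    return value_set <= sigma

def complement(sigma):
    """< A,sigma >^perp = < A, -sigma >."""
    return SPECTRUM - sigma

def join(sigma, theta):
    """< A,sigma > v < A,theta > = < A, sigma U theta >."""
    return sigma | theta

def meet(sigma, theta):
    """< A,sigma > ^ < A,theta > = < A, sigma n theta >."""
    return sigma & theta

w = frozenset({1, 2})  # value state that leaves the value of A open
q = frozenset({1})     # the proposition 'A has value in {1}'

print(belongs(w, join(q, complement(q))))        # True
print(belongs(w, q), belongs(w, complement(q)))  # False False
```

The situation w belongs to q ∨ q^⊥, because that join is the unit < A,S(A) >, yet it belongs neither to q nor to q^⊥, since its value state is contained in neither {1} nor {2, 3}.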

A dynamic state ϕ is a function from V into [0, 1], whose restriction to each Boolean sigma algebra < A > is a probability measure. The relation between dynamic and value states is the following: ϕ and λ are a dynamic state and a value state respectively, only if there exist possible situations w and w′ such that ϕ = ϕ(w), λ = λ(w′). Here, ϕ is an eigenstate of A, with corresponding eigenvalue a, exactly if ϕ(< A,{a} >) = 1. The state-attribution proposition [A,σ] is defined as: [A,σ] = {w : ϕ(w)(< A,σ >) = 1} and means ‘A must have value in σ’. P denotes the set of state-attribution propositions: P = {[A,σ] : A an observable, σ a Borel set}. Partial order between them is given by [A,σ] ⊆ [A′,σ′] only if, for all dynamical states ϕ, ϕ(< A,σ >) ≤ ϕ(< A′,σ′ >), and the logic operations are (well) defined as: [A,σ]^⊥ = [A, −σ], [A,σ] ⊎ [A,θ] = [A,σ∪θ] and [A,σ] ∩ [A,θ] = [A,σ∩θ]. With all this, < P,⊆,^⊥ > is an orthoposet. The orthoposet is formed by ‘pasting together’ a family of Boolean algebras whose operations coincide in overlapping areas. It may be enriched to approach the lattice of subspaces of Hilbert space.

One may recognize a modal relation between both kinds of propositions. For example, one starts by denying the collapse in the measurement process and recognizing that the observable has one of the possible eigenvalues. Then it may be asked what may be inferred with respect to those values when one knows the dynamic state. The answer van Fraassen gives is that, in the case that ϕ(w) is an eigenstate of the observable A with eigenvalue a, then A actually does have value a. This means that in this case, the measurement ‘reveals’ the value the observable already had. He generalizes this idea and postulates that [A,σ] implies < A,σ >. With this assumption and the rejection of an ignorance interpretation of the uncertainty principle, he is able to prove that [A,σ] = □< A,σ >. The necessity operator □ is defined by □Q = {w : for all w′, if wRw′ then w′ ∈ Q}, where Q is any proposition and R is the relative possibility relation: w′ is possible relative to w exactly if, for all Q in V, if w is in Q then w′ is in Q. So, [A,σ] may be read as ‘necessarily, < A,σ >’. This says that the dynamic state assigns 1 to < A,σ > if and only if the value state that accompanies any relatively possible dynamic state makes < A,σ > true. Instead of the transitive possibility relation R, one may use an equivalence relation to define □′, the negated necessity operator. In this case, van Fraassen maintains that the map [A,σ] → < A,σ > is an isomorphism of posets < P,⊆ > and < V,⊆ > and, when orthocomplementation is defined, it becomes an isomorphism between the orthoposets. Thus, the logic of V is that of P, that is, QL. Endowed with these tools, van Fraassen gives an interpretation of the probabilities of the measurement outcomes which is in agreement with the Born rule.
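The definitions of the relative possibility relation R and of the necessity operator □ can be made concrete on a finite toy frame. In the following sketch the three worlds and the family V of propositions are hypothetical, and the function names relative_possibility and box are chosen here; only the two defining clauses are taken from the text above.

```python
def relative_possibility(worlds, V):
    """Pairs (w, w2) such that every Q in V containing w also contains w2."""
    return {(w, w2) for w in worlds for w2 in worlds
            if all(w2 in Q for Q in V if w in Q)}

def box(Q, worlds, R):
    """Necessity: the worlds all of whose R-successors lie in Q."""
    return {w for w in worlds
            if all(w2 in Q for (w1, w2) in R if w1 == w)}

worlds = {0, 1, 2}
V = [frozenset({0, 1}), frozenset({1}), frozenset({0, 1, 2})]  # hypothetical

R = relative_possibility(worlds, V)
print(sorted(R))                       # [(0, 0), (0, 1), (1, 1), (2, 0), (2, 1), (2, 2)]
print(sorted(box({0, 1}, worlds, R)))  # [0, 1]
print(sorted(box({1, 2}, worlds, R)))  # [1]
```

By construction R is reflexive and transitive, which matches the description of R as a transitive possibility relation; a proposition is necessary at a world only if it holds at every world that is possible relative to it.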

The MI proposed by Kochen and Dieks (K-D, for short) uses the so-called biorthogonal decomposition theorem (also called the Schmidt theorem) in order to describe the correlations between the quantum system and the apparatus in the measurement process. From a realistic perspective, an interpretational issue which MIs need to take into account is the assignment of definite values to properties. But if we try to interpret eigenvalues which pertain to different sets of observables as the actual (pre-existent) values of the physical properties of a system, we are faced with all kinds of no-go theorems that preclude this possibility. Regarding the specific scheme of the MI, Bacciagaluppi and Clifton were able to derive KS-type contradictions in the K-D interpretation which showed that one cannot extend the set of definite valued properties to non-disjoint sub-systems [26, 56]. In order to escape KS-type contradictions, Jeffrey Bub’s modal version recalls David Bohm’s interpretation and proposes to take some observable, R, as always possessing a definite value. In this way one can avoid KS contradictions and maintain a consistent discourse about statements which pertain to the sublattice determined by the preferred observable R. As with van Fraassen’s and Vermaas and Dieks’ interpretations, Bub’s proposal distinguishes between dynamical states and property or value states, in his case with the purpose of interpreting the wave function as defining a Kolmogorovian probability measure over a restricted sub-algebra of the lattice L(H) of projection operators (corresponding to yes-no experiments) over the state space. It is this distinction between property states and dynamical states which according to Bub provides the modal character to the interpretation:

The idea behind a ‘modal’ interpretation of quantum mechanics is that quantum states, unlike classical states, constrain possibilities rather than actualities—which leaves open the question of whether one can introduce property states […] that attribute values to (some) observables of the theory, or equivalently, truth values to the corresponding propositions. [47, p. 173]

In precise terms, as L(H) does not admit a global family of compatible valuations, and thus not all propositions about the system are determinately true or false, probabilities defined by the (pure) state cannot be interpreted epistemically [47, p. 119]. But, if one chooses, for a given state |e>, a preferred observable R, these properties can be taken as determinate since the propositions associated with R, that is, with the projectors in which R decomposes, generate a Boolean algebra. Bub constructs the maximal sublattices D(|e>, R) ⊆ L(H) to which truth values can be assigned via a 2-valued homomorphism and demonstrates a uniqueness theorem that allows the construction of the preferred observable.

In Bub’s proposal, a property state is a maximal specification of the properties of the system at a particular time, defined by a Boolean homomorphism from the determined sublattice to the Boolean algebra of two elements. On the other hand, a dynamical state is an atom of L(H) that evolves unitarily in time following the Schrödinger equation. So, dynamical states do not coincide with property states. Given a dynamical state represented by the atom |e> ∈ L(H), one constructs the sublattice D(|e>, R) with Kolmogorovian probabilities defined over alternative subsets of properties in the sublattice. They are the properties of the system, and the probabilities defined by |e> evolve (via the evolution of |e>) in time. If the preferred observable is the identity operator I, the atoms in D(|e>, I) may be pictured as a ‘fan’ of its projectors generated by the ‘handle’ |e> [46, p. 751] or an ‘umbrella’ with state |e> again as the handle and the rays in (|e>)^⊥ as the spines. When observable R ≠ I, there is a set of handles {|e_ri>,i = 1…k} given by the nonzero projections of |e> onto the eigenspaces of R and the spines represented by all the rays in the orthogonal complement of the subspace generated by the handles. When dim(H) > 2, there are k 2-valued homomorphisms which map each of the handles onto 1 and the remaining atoms onto 0. The determinate sublattice, which changes with the dynamics of the system, is a partial Boolean algebra, that is, the union of a family of Boolean algebras pasted together in such a way that the maximum and minimum elements of each one, and eventually other elements, are identified and, for every n-tuple of pair-wise compatible elements, there exists a Boolean algebra in the family containing the n elements. 
The possibility of constructing a probability space with respect to which the Born probabilities generated by |e> can be thought of as measures over subsets of property states, depends on the existence of sufficiently many property states defined as 2-valued homomorphisms over D(|e>, R). This is guaranteed by a uniqueness theorem that characterizes D(|e>, R) [47, p. 126]. Thus constructed, the structure avoids KS-type theorems. Then, given a system S and a measuring apparatus M,
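The ‘handles’ described above, that is, the nonzero projections of |e> onto the eigenspaces of the preferred observable R, can be computed directly in a small example. The sketch below is an illustration only: the real three-dimensional state e, the choice of coordinate axes as eigenspaces of R, and the function names project and handles are assumptions made here. The squared norm of each handle is the Born weight of the corresponding eigenspace, and these weights form an ordinary probability distribution.

```python
from fractions import Fraction as F

def project(e, basis):
    """Orthogonal projection of e onto the span of an orthonormal basis."""
    out = [F(0)] * len(e)
    for b in basis:
        coeff = sum(x * y for x, y in zip(b, e))  # real inner product <b, e>
        out = [o + coeff * x for o, x in zip(out, b)]
    return out

def handles(e, eigenspaces):
    """Nonzero projections of e onto the eigenspaces, with Born weights."""
    result = []
    for basis in eigenspaces:
        p = project(e, basis)
        weight = sum(x * x for x in p)  # squared norm of the projection
        if weight != 0:
            result.append((p, weight))
    return result

# Hypothetical unit vector e and a nondegenerate R whose eigenspaces are
# the three coordinate axes.
e = [F(3, 5), F(4, 5), F(0)]
eigenspaces = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]

hs = handles(e, eigenspaces)
print(len(hs))                    # 2
print([str(w) for _, w in hs])    # ['9/25', '16/25']
print(sum(w for _, w in hs))      # 1
```

Here k = 2: the third eigenspace is orthogonal to e, contributes no handle, and belongs to the ‘spines’ of the construction.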

[…] if some quantity R of M is designated as always determinate, and M interacts with S via an interaction that sets up a correlation between the values of R and the values of some quantity A of S, then A becomes determinate in the interaction. Moreover, the quantum state can be interpreted as assigning probabilities to the different possible ways in which the set of determinate quantities can have values, where one particular set of values represents the actual but unknown values of these quantities. [46, p. 750]

The problem with this interpretation is that, in the case of an isolated system, there is no single element in the formalism of QM that allows us to choose one observable R rather than another. This is why the move seems flagrantly ad hoc. Were we dealing with an apparatus, there would be a preferred observable, namely the pointer position, but the quantum wave function contains in itself mutually incompatible representations (choices of apparatuses) each of which provides non-trivial information about the state of affairs. The Bohmian proposal of Bub has been extended by Guido Bacciagaluppi and Michael Dickson in their atomic version of the MI [27].

The authors of this work have also contributed to the understanding of modality in the context of orthodox QL [102, 103, 104, 105]. Several conclusions can be drawn from our investigation. We started our analysis with a question regarding the contextual aspect of possibility. As is well known, the KS theorem does not talk about probabilities, but rather about the constraints that the formalism imposes on actual definite valued properties considered from multiple contexts. What we found via the analysis of possible families of valuations is that a theorem which we called, for obvious reasons, the Modal KS (MKS) theorem can be derived which proves that quantum possibility, contrary to classical possibility, is also contextually constrained [102]. This means that, regardless of its use in the literature, quantum possibility is not classical possibility. In a paper written in 2014 [88], we concentrated on the analysis of actualization within the orthodox frame and interpreted, following the structure, the logical realm of possibility in terms of ontological potentiality.

e. The Czech-Slovakian and Italian Schools

The study of the structure of tensor products [57, 199, 112, 113, 114] motivated a fruitful development of different algebraic structures that could represent quantum propositions, which in turn became a line of investigation by itself. Beginning with the proposal of test spaces by Foulis and Randall [122, 123, 124, 204, 205, 206, 207], which are related to orthoalgebras, the theory of structures such as orthomodular lattices, partial Boolean algebras, orthomodular posets, effect algebras, quantum MV-algebras and the like became widely discussed. The Czech school led by Pavel Pták, the Slovak school initiated by Anatolij Dvurečenskij and Sylvia Pulmannová and the Italian school organized by Enrico Beltrametti and Maria Luisa Dalla Chiara and continued by Roberto Giuntini were pioneers in the subject, see for example [32, 33, 36, 37, 52, 51, 54, 78, 79, 81, 115, 113, 130, 128, 129, 145, 141, 142, 148, 147, 168, 162, 169, 198, 200, 237, 238]. The weakened structures allow consideration of unsharp propositions related, not to projections, but to the elements of the more general set of linear bounded operators, called effects, over which the probability measure given by the Born rule may be defined. This in turn gave rise to the consideration of paraconsistent QL, partial QL and Łukasiewicz QL [79].

An important line of research in the subject of quantum structures is the application of QL methods to languages of information processing and, more specifically, to quantum computational logic (QCL) [53, 80, 101, 82, 135, 136, 138, 143, 149, 193, 192]. In this way several logical systems associated to quantum computation were developed. They provide a new form of quantum logic strongly connected with the fuzzy logic of continuous t-norms [151]. The groups in Firenze directed by Dalla Chiara, and Cagliari directed by Giuntini, have also developed different languages for quantum computation. A sentence in QL may be interpreted as a closed subspace of H. Instead, the meaning of an elementary sentence in QCL is a quantum information quantity encoded in a collection of qbits—unit vectors pertaining to the tensorial product of two dimensional complex Hilbert spaces—or qmixes—positive semi-definite Hermitian operators of trace one over Hilbert space. Conjunction and disjunction are not associated to the join and meet lattice operations. Instead, the number of conjunctions and disjunctions involved in a sentence determines the dimension of the space of its ‘meanings’, the dimension varying with the number and nature of the logical connectives, thus the ‘meaning’ of the sentence reflects the logical form of the sentence itself (for a complete discussion see [80]).
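
The two kinds of 'meaning' mentioned above can be made concrete with a small sketch. The function names `is_qubit_register` and `is_qmix_2x2` are hypothetical, introduced here only for illustration; the second function is restricted to a single two-dimensional space, where a Hermitian matrix of trace one is positive semi-definite exactly when its determinant is non-negative.

```python
import math

def is_qubit_register(amplitudes, tol=1e-9):
    """Pure state of n qubits: 2**n complex amplitudes forming a unit vector."""
    n = len(amplitudes)
    power_of_two = n >= 2 and (n & (n - 1)) == 0
    norm_sq = sum(abs(a) ** 2 for a in amplitudes)
    return power_of_two and abs(norm_sq - 1.0) < tol

def is_qmix_2x2(rho, tol=1e-9):
    """2x2 density operator: Hermitian, trace one, positive semi-definite."""
    (a, b), (c, d) = [[complex(x) for x in row] for row in rho]
    hermitian = (abs(b - c.conjugate()) < tol
                 and abs(a.imag) < tol and abs(d.imag) < tol)
    trace = (a + d).real
    det = (a * d - b * c).real
    # For a Hermitian 2x2 matrix with trace 1, det >= 0 means both
    # eigenvalues are non-negative.
    return hermitian and abs(trace - 1.0) < tol and det >= -tol

s = 1 / math.sqrt(2)
plus = [s, s]                        # one qubit, equal superposition
bell = [s, 0, 0, s]                  # two qubits (tensor product space)
mixed = [[0.75, 0.25], [0.25, 0.25]] # equal mixture of |0><0| and |+><+|
not_a_state = [[1.5, 0], [0, -0.5]]  # trace one but a negative eigenvalue
```

Here `plus` and `bell` pass the first check and `mixed` passes the second, while `not_a_state` is rejected because one of its eigenvalues is negative.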

f. The Brazilian School

Newton da Costa and Décio Krause at Florianópolis have begun investigations on Non-Reflexive Logics (NRL) and Paraconsistent Logics (PL) related to several foundational issues regarding QM. On the one hand, NRL is, in a wide sense, a logic in which the relation of identity (or equality) is restricted, eliminated, replaced, at least in part, by a weaker relation, or employed together with a new non-reflexive implication or equivalence relation. In classical logic, one of the basic principles is the Principle of Identity (PI), expressing the reflexive property of identity, whose usual formulation is x = x or ∀ x (x = x), where x is a first order variable. There are other versions in higher-order logic, in which higher order variables appear. There are also propositional formulations of the principle: p → p (p implies p) or p ↔ p (p is equivalent to p), where p is a propositional variable. If propositional quantification is allowed, then we have other forms of the principle: ∀p (p → p) as well as ∀p (p ↔ p). Some of the above principles are not in general valid in non-reflexive logics. They are totally or partially eliminated, restricted, or not applied to the relation that is employed instead of identity. Several of these principles are the motivations for the development of non-reflexive logics. The application of the PI is controversial in the quantum domain not only due to the so-called “indistinguishability of quantum particles” but, more deeply, when applying it to “something” that does not respect the classical definition of object. In particular, the search for a set theory that could be adequate to QM goes as far back as the 1974 Congress of the American Mathematical Society, which was devoted to the evaluation of the status of Hilbert’s problems for the century, posed in Paris in 1900. In the 1974 Congress, Manin proposed as one of the new set of problems for the next century:

[…] we should consider possibilities of developing a totally new language to speak about infinity. […] I would like to point out that this [the concept of set] is rather an extrapolation of common-place physics, where we can distinguish things, count them, put them in order, etc. New quantum physics has shown us models of entities with quite different behaviour. Even ‘sets’ of photons in a looking-glass box, or electrons in a nickel piece are much less Cantorian than the ‘set’ of grains of sand. [181]

The “new language to speak about infinity” is obviously a new ‘set’ theory, since set theory is usually known as “the theory of the (actual) infinite.” For a discussion about the necessity of a new set theory see for example [171, 134, 170, 77]. Within this context, the weakening of the concept of identity—substituted by that of indiscernibility—allows the development of non-reflexive logics which, in a wide sense, are logics in which the relation of identity (or equality) is restricted, eliminated, replaced, at least in part, by a weaker relation, or employed together with a new non-reflexive implication or equivalence relation [68, 73, 172, 75]. There are also different approaches to the logic related to quantum set theories. Gaisi Takeuti proposed a quantum set theory developed in the lattice of projections-valued universe [221, 222] and Satoko Titani formulated a lattice valued logic corresponding to general complete lattices developed in the classical set theory based on the classical logic [223].

On the other hand, PL are the logics of inconsistent but non-trivial theories. The origins of PL go back to the first systematic studies dealing with the possibility of rejecting the PNC. PL was elaborated, independently, by Stanislaw Jaskowski in Poland, and by Newton da Costa in Brazil, around the middle of the last century (on PL, see, for example: [72]). A theory T founded on the logic L, which contains a symbol for negation, is called inconsistent if it has among its theorems a sentence A and its negation ¬A; otherwise, it is said to be consistent. T is called trivial if any sentence of its language is also a theorem of T; otherwise, T is said to be non-trivial. In classical logics and in most usual logics, a theory is inconsistent if, and only if, it is trivial. L is paraconsistent when it can be the underlying logic of inconsistent but non-trivial theories. Clearly, no classical logic is paraconsistent. In the context of QM, da Costa and Krause have put forward [71] a PL in order to provide a suitable formal scheme to consider the notion of complementarity introduced in 1927 by Niels Bohr during his famous ‘Como Lecture’. The notion of complementarity was developed by Bohr in order to consider the contradictory wave and corpuscular representations found in the double-slit experiment (see for example [174]). According to Bohr: “We must, in general, be prepared to accept the fact that a complete elucidation of one and the same object may require diverse points of view which defy a unique description.” The proposal of da Costa and Krause has been further analyzed by Jean-Yves Béziau [39, 40] taking into account the Square of Opposition (see Section 5.d below).
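
The difference between inconsistency and triviality can be illustrated with a minimal sketch. As a stand-in, it uses the three-valued Logic of Paradox (LP) associated with Graham Priest rather than the system of da Costa and Krause, since LP has a very small truth-table semantics. The function name `explosion_valid` is hypothetical. The code checks whether the inference from A and ¬A to an arbitrary B preserves designated values.

```python
from itertools import product

def explosion_valid(values, designated):
    """Check A, not-A |= B by brute force over all valuations.

    The inference is valid when every valuation that gives both premises
    a designated value also gives B a designated value.
    Negation is modelled as v -> 1 - v (an assumption of this sketch).
    """
    for a, b in product(values, repeat=2):
        neg_a = 1 - a
        if a in designated and neg_a in designated and b not in designated:
            return False  # counterexample: contradiction holds, B does not
    return True

# Classical logic: two values, only 1 (true) is designated.
classical = explosion_valid(values=(0, 1), designated={1})

# LP: a third value 0.5 ("both true and false") is also designated.
lp = explosion_valid(values=(0, 0.5, 1), designated={0.5, 1})
```

In the classical case no valuation designates both A and ¬A, so the inference holds vacuously and any inconsistent theory is trivial. In LP the valuation A = 0.5, B = 0 is a counterexample, so a theory may contain A and ¬A without containing every sentence.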

5. Ongoing Developments and Debates

There is a great amount of work in progress in QL from new quantum structures, to the use of non-reflexive logics, paraconsistent logics, dynamical logics, etc. In the following section we shall review some of these advancements that have taken place in relation to QM.

a. New Quantum Structures

The importance of quantum structures as a field of research gave rise to its own association: The International Quantum Structures Association (IQSA). As Dvurečenskij relates in the Foreword to the Handbook of Quantum Logic and Quantum Structures:

[…] in the early nineties, a new organization called International Quantum Structures Association (IQSA) was founded. IQSA gathers experts on quantum logic and quantum structures from all over the world under its umbrella. It organizes regular biannual meetings: Castiglioncello 1992, Prague 1994, Berlin 1996, Liptovsky Mikulas 1998, Cesenatico 2001, Vienna 2002, Denver 2004, Malta 2006. In spring 2005, Dov Gabbay, Kurt Engesser, Daniel Lehmann and Jane Spurr had an excellent idea—to ask experts on quantum logic and quantum structures to write long chapters for the Handbook of Quantum Logic and Quantum Structures. [117, p. viii]

In fact, in the subject of quantum structures, MV-algebras, effect algebras, pseudo-effect algebras and related structures are being developed in relation to their use in QM. See [55, 116, 131, 132, 133, 184, 201], just to cite a few examples.

b. Dynamical Logics, Category Theory and Quantum Computation

As mentioned above (see Section 4.2), Smets and Coecke initiated a line of research that considers the possibility of regarding QL in a dynamical manner. This research is connected to the tradition of computer science, interested in the semantic notion of process, and thinks about the quantum realm in terms of change, instead of taking concepts like ‘particle’, ‘system’, ‘property’ and so on as fundamental. The standpoint of this approach is the observation that QL is essentially a dynamical logic, that it is about actions rather than propositions [30]. It is also connected to the interpretation of the ‘Sasaki hook’—namely, the quantum implication that is the closest to the classical one—which may be understood in terms of a dynamic modality instead of in terms of deduction—in fact, it does not satisfy the deduction theorem [60]. Smets together with Alexandru Baltag have proposed two axiomatizations of the logic of quantum actions [218]. One of them takes the notion of action as fundamental and axiomatizes the underlying algebra, giving a quantale [22, 59]. The other takes the notion of state as fundamental and represents actions as relations between states. Contrary to orthomodular QL [78], these axiomatizations fulfill completeness with respect to infinite dimensional Hilbert spaces and have applications in computational science [28]. In fact, the application to computational science and more broadly to information processing needs to manage composite systems, one of the profound difficulties that faces orthodox QL.

Also, the relation between category theory and QL is being explored from different perspectives. On the one side, there is a line of investigation initiated by Chris Isham and continued by Andreas Döring with Chris Heunen, Klaas Landsman and Bas Spitters among others, whose main interest is to link the construction of a physical theory and its representation in a topos [146] of the formal language attached to the theory [107, 108, 109, 110, 157, 111, 154, 50]. They make claims about the necessity of reviewing the basic suppositions that are taken for granted, for example, the nature of space-time, the use of real numbers as values of physical quantities and the meaning of probability. From a logical point of view, contrary to the intractable QL, any topos in which the physical theory is represented comes with an intrinsic intuitionistic logic that is obviously more tractable. Moreover, compound systems also find their place in the topos approach [111]. Classical theories are included in this new formalization and for all of them the corresponding topos is that of sets endowed with classical logic as a trivial intuitionistic one. Elias Zafiris and Vassilios Karakostas are also conducting new research in categorial semantics [239].

On the other side, the line of investigation initiated at Oxford by Samson Abramsky and continued by Coecke among others proposed an axiomatization which may be useful for managing the formal language of physical processes involved in new quantum technologies such as quantum computation and teleportation. Quantum computers exploit the existence of superpositions to drastically decrease the time and resources required to deal with certain problems such as triangle-finding, integer factorization or the searching of an entry in an unordered list [164, 191]. Teleportation uses non-separability to safely transmit information from one place to another by means of an entangled state and a classical communication channel [45]. The categorical approach of the Oxford group uses monoidal categories [1, 2, 3, 4, 67, 165] and simple diagrams to view quantum processes and composite systems in a consistent manner. They apply these tools to research in the subject of computing semantics [63, 64, 65], in particular in the subject of linear logic [140] which is essential for computing science. Cristina and Amilcar Sernadas in Lisbon are also working on the connection of category theory and linear logic [182, 183, 49].

Research on computational semantics is being developed in connection with epistemic logics by members of the Italian group. For example, they model operators such as “to understand” or “to know” by irreversible quantum operations, thus allowing us to reflect on characteristic limitations in the process of acquiring information [34, 35]. The relation between quantum structures and epistemic logics is also being studied by a group in Amsterdam. They are applying a modal dynamic-epistemic QL for reasoning about quantum algorithms and, in general, for considering quantum systems as codifying actions of information production and processing [29, 30, 31].

Dynamics of concepts as studied by cognitive science are also being considered with the aid of quantum structures. In fact, D. Aerts and co-workers have applied the formalism of QM for modeling the combination of concepts, showing the indeterministic and holistic characters of this process [13, 16, 17, 20, 21]. This approach has technological applications in connection with quantum computation and robotics [18, 19].

c. Paraconsistency and Quantum Superpositions

As remarked by Coecke, the meaning of the superposition principle might be the key to understanding QM:

Birkhoff and von Neumann crafted quantum logic in order to emphasize the notion of quantum superposition. In terms of states of a physical system and properties of that system, superposition means that the strongest property which is true for two distinct states is also true for states other than the two given ones. In order-theoretic terms this means, representing states by the atoms of a lattice of properties, that the join p ∨ q of two atoms p and q is also above other atoms. From this it easily follows that the distributive law breaks down: given atom r ≠ p,q with r < p ∨ q we have r ∧(p ∨ q) = r while (r ∧ p)∨(r ∧ q) = 0∨0 = 0. Birkhoff and von Neumann as well as many others believed that understanding the deep structure of superposition is the key to obtaining a better understanding of quantum theory as a whole. [66]
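
The failure of distributivity described in this passage can be checked in the smallest possible case. The sketch below is an illustration under simplifying assumptions: it works with the subspaces of the real plane rather than a complex Hilbert space, represents a line through the origin by an integer direction vector, and uses the hypothetical names `ZERO`, `PLANE`, `join` and `meet`.

```python
ZERO, PLANE = "zero", "plane"   # bottom and top of the lattice of subspaces

def same_line(u, v):
    """Two nonzero vectors span the same line iff their 2D cross product is 0."""
    return u[0] * v[1] - u[1] * v[0] == 0

def join(s, t):
    """Smallest subspace containing both s and t (their linear span)."""
    if s == ZERO:
        return t
    if t == ZERO:
        return s
    if s == PLANE or t == PLANE:
        return PLANE
    return s if same_line(s, t) else PLANE

def meet(s, t):
    """Largest subspace contained in both s and t (their intersection)."""
    if s == PLANE:
        return t
    if t == PLANE:
        return s
    if s == ZERO or t == ZERO:
        return ZERO
    return s if same_line(s, t) else ZERO

# Three distinct atoms (lines): r lies below p v q, which is the whole plane.
p, q, r = (1, 0), (0, 1), (1, 1)

lhs = meet(r, join(p, q))            # r ^ (p v q)
rhs = join(meet(r, p), meet(r, q))   # (r ^ p) v (r ^ q)
```

With these atoms `lhs` is the line `r` itself, while `rhs` is the zero subspace, which matches the computation in the quoted passage.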

In line with this intuition, in [74], one of the authors of this paper together with N. da Costa argued in favor of the possibility of considering quantum superpositions in terms of a PL approach. It was claimed that, even though most interpretations of QM attempt to escape contradictions, there are many hints—coming mainly from present technical and experimental developments in QM—that indicate it could be worthwhile to engage in a research of this kind. Arenhart and Krause [23, 24, 25] have raised several arguments against the paraconsistent approach to quantum superpositions which have been further analyzed in [86]. Recently, some new proposals to consider quantum superpositions from a logical perspective have been put forward [76, 173].

d. Contradiction and Modality in the Square of Opposition

In Aristotelian classical logic, categorical propositions are divided into Universal Affirmative, Universal Negative, Particular Affirmative and Particular Negative. The possible relations between any two of these types of propositions are encoded in the square of opposition. The square has been considered, in relation to QL, as a useful tool to identify paraconsistent negations [38, 40]. It also expresses the essential properties of the monadic first order quantifiers ∃ and ∀ which, in an algebraic approach, can be represented within the frame of monadic Boolean algebras by considering quantifiers as modal operators acting on a Boolean algebra [150]. This representation is called the modal square of opposition. An extension of the square to the case in which the underlying structure is replaced by the algebra of QL has been provided in [137], and it may be useful to identify paraconsistent negations in the structure of QM (for discussion see also [89]).
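
The relations encoded in the classical square can be verified by brute force over small domains. The sketch below is only an illustration of the classical (Boolean) case, not of the QL extension in [137]; the function names are hypothetical, and a predicate P is represented simply by the tuple of its truth values on the individuals of the domain.

```python
from itertools import product

def corners(extension):
    """Truth values of the four corners for a predicate P over a domain."""
    a = all(extension)                  # Universal Affirmative: every x is P
    e = all(not v for v in extension)   # Universal Negative: no x is P
    i = any(extension)                  # Particular Affirmative: some x is P
    o = any(not v for v in extension)   # Particular Negative: some x is not P
    return a, e, i, o

def square_holds(extension):
    """Evaluate each relation of the square for one predicate extension."""
    a, e, i, o = corners(extension)
    return {
        "contradictories": (a != o) and (e != i),   # never same truth value
        "contraries": not (a and e),                # never both true
        "subcontraries": i or o,                    # never both false
        "subalternation": ((not a) or i) and ((not e) or o),
    }

def check_all(max_size=4):
    """Check every predicate on every nonempty domain up to max_size."""
    return all(
        all(square_holds(ext).values())
        for n in range(1, max_size + 1)
        for ext in product([False, True], repeat=n)
    )

all_nonempty_ok = check_all()
empty_domain = square_holds(())
```

All four relations hold on every nonempty domain tested. On the empty domain only the contradictory pairs survive, which reflects the familiar point that the traditional square presupposes a nonempty subject class.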

The square of opposition has also recently been considered in relation to the meaning of quantum superpositions and the interpretation of the terms that compose them (Section 5.c). On the one hand, it has been argued in [74] that one might consider some of the terms that compose the superposition as contradictory. On the other hand, Arenhart and Krause [23, 24, 25] have defended the idea that, taking into account the square of opposition, contrariety is a more suitable notion to describe the physical meaning of superpositions (see also [85, 86, 87]).

e. Quantum Probability

The subject of probability in QM appears in the early discussions and analysis provided by the founding fathers of the theory. On the one hand, there is the question about its interpretation, already stressed by Schrödinger in a letter to Einstein: “It seems to me that the concept of probability is terribly mishandled these days. Probability surely has as its substance a statement as to whether something is or is not the case—of an uncertain statement, to be sure. But nevertheless it has meaning only if one is indeed convinced that the something in question quite definitely is or is not the case. A probabilistic assertion presupposes the full reality of its subject.” [47, p. 115]. On the other hand, one faces the problem of its very definition: The Born rule was incorporated in the axiomatization of QM as a noncommutative measure over the lattice of events by von Neumann in the early thirties, but this measure needs a modular lattice to be well posed, while L(H) is an orthomodular one. As Miklos Rédei states:

To see why von Neumann insisted on the modularity of quantum logic, one has to understand that he wanted quantum logic to be not only the propositional calculus of a quantum mechanical system but also wanted it to serve as the event structure in the sense of probability theory. In other words, what von Neumann aimed at was establishing the quantum analogue of the classical situation, where a Boolean algebra can be interpreted both as the Tarski-Lindenbaum algebra of a classical propositional logic and as the algebraic structure representing the random events of a classical probability theory, with probability being an additive normalized measure on the Boolean algebra. [212, p. 157]

In fact, the difficulties with a rigorous definition of probability were well known to von Neumann [212]. When he was invited to the 1954 Congress of Mathematicians held in Amsterdam, dedicated to unsolved problems in mathematics (in a similar spirit to the 1900 Paris meeting at which Hilbert gave his famous lecture), von Neumann sketched his (ungiven) conference on the role of continuous rings of operators for a better understanding of QM, QL and quantum probability [211]. The difficulties with the definition of a “good measure” over the Hilbert lattice made von Neumann abandon the orthodox formalism of QM in Hilbert space, to which he himself had contributed a great deal, and face the classification of the factors and their dimension functions, which led to the subject of von Neumann algebras.

Nowadays, the definition of probability still faces various challenges and the subject is under debate. On the one hand, the type II₁ factor (the one whose projection lattice is a continuous geometry, and thus an orthomodular modular lattice as required by the definition of a probability measure) is not an adequate structure to represent quantum events. On the other hand, there exist different candidates for defining conditional probability and there is no unique criterion for choosing among them [81, 209]. With respect to interpretation, the frequency interpretation is untenable for all non-commutative probabilities [213]. As Rédei remarks, “yet, a satisfactory interpretation of non-commutative measure as probability and the relation of this non-commutative (quantum) probability to (quantum) logic is still lacking.” [211]

f. Potentiality and Actuality

As we have discussed above, QL has been related to actuality since its origin. The operationalist perspective of Birkhoff and von Neumann was implicitly related to the measurement problem (MP). In QM “a complete mathematical description of a physical system S does not in general enable one to predict with certainty the result of an experiment.” [41] As a matter of fact, QM describes the state mathematically in terms of a superposition, and thus the question arises: why do we observe a single result (that corresponds to a single eigenstate) instead of something related to a superposition of them? Although the MP accepts the fact that there is something very weird about quantum superpositions, it leaves aside their problematic meaning and focuses on the justification of the actualization process. Taking the single outcome as a standpoint, it asks how we get to the actual result from the multiplicity of possible states. The MP is thus an attempt to justify why, regardless of QM, we only observe actuality. The problem places the result at the origin; what needs to be justified is the already known answer.

QL distinguishes in general between ‘actual’ properties and ‘possible’ or ‘potential’ ones, opening the door to discuss a realm of existence beyond actuality. The notion of potentiality was introduced by Heisenberg in QM, and later developed and related through the operationalist approach to QL by Piron in [196] and more recently by Aerts in [14, 15] (for discussion see also [217]). Within such interpretations the collapse is accepted, and potentialities are defined in terms of their “becoming actual.” A different notion of potentiality which attempts to escape the limits of actuality has also been developed in [83, 84]. According to this approach one should turn things upside-down; we do not need to explain the actual via the potential but rather, we need to use the actual in order to develop the potential.

From different perspectives, the development of the notion of potentiality in QM is related to an attempt to provide a realistic physical representation of the theory going beyond the discourse about mere “actual results.” Such proposals are in line with trying to understand what a quantum superposition is, since it is the main theoretical tool that has opened the door to the most outstanding technological developments and experiments in early 21st century physics.

6. Final Remarks

Quantum logic has deeply influenced our understanding of the formal structure of QM. It has also played an important role within the foundational debates about the theory. In the early 21st century, the rise of a new technological era grounded on the processing of quantum information is posing original questions and challenges to all researchers close to the field. In this respect, the ongoing research in QL (Section 5) can prove to be an important guide in advancing our comprehension of the phenomena implied by these technologies.

7. References and Further Reading

[1] Abramsky, S., 1996, “Retracing some paths in process algebra”, in Proceedings of CONCUR 96, Lecture Notes in Computer Science, vol. 1119, 1-17, Springer-Verlag, Berlin.
[2] Abramsky, S. and Coecke, B., 2008, “Categorical Quantum Mechanics”, in Handbook of Quantum Logic and Quantum Structures, vol. II, K. Engesser, D. M. Gabbay and D. Lehmann (Eds.) Elsevier, Amsterdam.
[3] Abramsky, S. and Coecke, B., 2004, “A Categorical Semantics of Quantum Protocols”, in LICS Proceedings, 415-425.
[4] Abramsky, S. and Duncan, R.W., 2006, “A Categorical Quantum Logic”, Mathematical Structures in Computer Science, 16, 469-489.
[5] Aerts, D. and Daubechies, I., 1979, “A characterization of subsystems in physics”, Letters in Mathematical Physics, 3, 11-17.
[6] Aerts, D. and Daubechies, I., 1979, “A mathematical condition for a sublattice of a propositional system to represent a physical subsystem, with a physical interpretation”, Letters in Mathematical Physics, 3, 19-27.
[7] Aerts, D., 1981, The One and the Many: Towards a Unification of the Quantum and Classical Description of One and Many Physical Entities, Doctoral dissertation, Brussels Free University, Belgium.
[8] Aerts, D., 1981, “Description of compound physical systems and logical interaction of physical systems”, in Current Issues on Quantum Logic, E.G. Beltrametti and B.C. van Fraassen (Eds.), pp. 381-405, Kluwer Academic Publishers, Dordrecht.
[9] Aerts, D., 1982, “Description of many physical entities without the paradoxes encountered in quantum mechanics”, Foundations of Physics, 12, 1131-1170.
[10] Aerts, D., 1983, “Classical theories and non-classical theories as a special case of a more general theory”, Journal of Mathematical Physics, 24, 2441-2453.
[11] Aerts, D., 1984, “Construction of a structure which makes it possible to describe the joint system of a classical and a quantum system”, Reports in Mathematical Physics, 20, 421-428.
[12] Aerts, D., 1999, “Foundations of Quantum physics: a general realistic and operational approach”, International Journal of Theoretical Physics, 38, 289-358.
[13] Aerts, D., 2009, “Quantum structure in cognition”, Journal of Mathematical Psychology, 53, 314-348.
[14] Aerts, D., 2009, “Quantum particles as conceptual entities: a possible explanatory framework for quantum theory”, Foundations of Science, 14, 361-411.
[15] Aerts, D., 2010, “A Potentiality and conceptuality interpretation of quantum mechanics”, Philosophica, 83, 15-52.
[16] Aerts, D., 2011, “Quantum interference and superposition in cognition. Development of a theory for the disjunction of concepts”, in Worldviews, Science and Us: Bridging Knowledge and its Implications for Our Perspectives of the World, D. Aerts, J. Broekaert, B. D’Hooghe and N. Note (Eds.), pp. 169-211, World Scientific, Singapore.
[17] Aerts, D., Broekaert, J. and Gabora, L., 2011, “A case for applying an abstracted quantum formalism to cognition”, New Ideas in Psychology, 29, 136-146.
[18] Aerts, D., Czachor, M. and Sozzo, S., 2011, “Quantum interaction approach in cognition, artificial intelligence and robotics”, in Proceedings of the Fifth International Conference on Quantum, Nano and Micro Technologies, V. Privman and V. Ovchinnikov (Eds.), pp. 35-40.
[19] Aerts, D. and Sozzo, S., 2011, “Quantum structures in cognition: why and how concepts are entangled”, in Quantum Interaction, Lecture Notes in Computer Science, 7052, 116-127.
[20] Aerts, D., Gabora, L. and Sozzo, S., 2013, “Concepts and their dynamics: a quantum-theoretic modeling of human thought”, Topics in Cognitive Science, 5, 737-772.
[21] Aerts, D. and Sozzo, S., 2014, “Quantum entanglement in concept combination”, International Journal of Theoretical Physics, 53, 3587-3603.
[22] Amira, H., Coecke, B. and Stubbe, I., 1998, “How quantales emerge by introducing induction within the operational approach”, Helvetica Physica Acta, 71, 554-572.
[23] Arenhart, J. R. and Krause, D., 2014, “Oppositions in Quantum Mechanics”, in New Dimensions of the Square of Opposition, J.-Y. Béziau and K. Gan-Krzywoszynska (Eds.), pp. 337-356, Philosophia Verlag, Munich.
[24] Arenhart, J. R. and Krause, D., 2014, “Contradiction, Quantum Mechanics, and the Square of Opposition”, Logique et Analyse.
[25] Arenhart, J. R. and Krause, D., 2015, “Potentiality and Contradiction in Quantum Mechanics”, in The Road to Universal Logic (volume II), A. Koslow and A. Buchsbaum (Eds.), pp. 201-211, Springer, Berlin.
[26] Bacciagaluppi, G., 1995, “A Kochen Specker Theorem in the Modal Interpretation of Quantum Mechanics”, International Journal of Theoretical Physics, 34, 1205-1216.
[27] Bacciagaluppi, G. and Dickson, W. M., 1997, “Dynamics for Density Operator Interpretations of Quantum Theory”, Preprint. (quant-ph/arXiv:9711048)
[28] Baltag, A. and Smets, S., 2004, “The logic of quantum programs”, in Proceedings of the 2nd International Workshop on Quantum Programming Languages, P. Selinger (Ed.), pp. 39-56, TUCS General Publication.
[29] Baltag, A. and Smets, S., 2010, “Correlated knowledge: an epistemic-logic view on quantum entanglement”, International Journal of Theoretical Physics, 49, 3005-3021.
[30] Baltag, A. and Smets, S., 2012, “The dynamic turn in quantum logic”, Synthese, 186, 753-773.
[31] Baltag, A., Bergfeld, J., Kishida, K., Sack, J., Smets, S. and Zhong, S., 2014, “PLQP and company: decidable logics for quantum algorithms”, International Journal of Theoretical Physics, 53, 3628-3647.
[32] Beltrametti, E.G. and Cassinelli, J., 1981, The Logic of Quantum Mechanics, Addison-Wesley, Reading, New York.
[33] Beltrametti, E.G. and van Fraassen, B.C. (Eds.), 1981, Current Issues in Quantum Logic, Plenum, New York.
[34] Beltrametti, E., Dalla Chiara, M.L., Giuntini, R., Leporini, R. and Sergioli, G., 2014, “A quantum computational semantics for epistemic logical operators. Part I: epistemic structures”, International Journal of Theoretical Physics, 53, 3279-3292.
[35] Beltrametti, E., Dalla Chiara, M.L., Giuntini, R., Leporini, R. and Sergioli, G., 2014, “A quantum computational semantics for epistemic logical operators. Part II: semantics”, International Journal of Theoretical Physics, 53, 3293-3307.
[36] Bennett, M.K. and Foulis, D.J., 1995, “Phi-symmetric effect algebras”, Foundations of Physics, 25, 1699-1722.
[37] Bennett, M.K. and Foulis, D.J., 1997, “Interval algebras and unsharp quantum logics”, Advances in Mathematics, 19, 200-215.
[38] Béziau, J.-Y., 2003, “New light on the square of opposition and its nameless corner”, Logical Investigations, 10, 218-232.
[39] Béziau, J.-Y., 2012, “The Power of the Hexagon”, Logica Universalis, 6, 1-43.
[40] Béziau, J.-Y., 2014, “Paraconsistent logic and contradictory viewpoint”, Revista Brasileira de Filosofia.
[41] Birkhoff, G. and von Neumann, J., 1936, “The logic of quantum mechanics”, Annals of Mathematics 37, 823-843.
[42] Bitbol, M., 1996, Mécanique Quantique, Flammarion, Paris.
[43] Bitbol, M., 1998, “Some steps towards a transcendental deduction of quantum mechanics”, Philosophia Naturalis, 35, 253-280.
[44] Bohr, N., 1985, Collected Works, vol. 6, J. Kalckar (Ed.), North-Holland, Amsterdam.
[45] Bouwmeester, D., Ekert, A.K. and Zeilinger, A., 2001, The Physics of Quantum Information: Quantum Cryptography, Quantum Teleportation, Quantum Computation, Springer, Berlin.
[46] Bub, J., 1992, “Quantum Mechanics Without the Projection Postulate”, Foundations of Physics, 22, 737-754.
[47] Bub, J., 1997, Interpreting the Quantum World, Cambridge University Press, Cambridge.
[48] Busch, P., Pfarr, J., Ristig, M. and Stachow, E.-W., 2010, “Quantum-Matter-Spacetime: Peter Mittelstaedt’s Contributions to Physics and Its Foundations”, Foundations of Physics, 40, 1163-1170.
[49] Caleiro, C., Mateus, P., Sernadas, A. and Sernadas, C., 2006, “Quantum Institutions”, in Algebra, Meaning, and Computation: Essays Dedicated to Joseph A. Goguen on the Occasion of His 65th Birthday, K. Futatsugi, J.-P. Jouannaud, and J. Meseguer (Eds.), 50-64, Springer-Verlag, Berlin.
[50] Caspers, M., Heunen, C., Landsman, N. and Spitters, B., 2009, “Intuitionistic quantum logic of an n-level system”, Foundations of Physics, 39, 731-759.
[51] Cattaneo, G. and Nisticò, G., 1986, “Brouwer-Zadeh posets and three-valued Łukasiewicz posets”, Fuzzy Sets and Systems, 33, 165-190.
[52] Cattaneo, G. and Laudisa, F., 1994, “Axiomatic unsharp quantum theory (from Mackey to Ludwig and Piron)”, Foundations of Physics, 24, 631-683.
[53] Cattaneo, G., Dalla Chiara, M.L., Giuntini, R. and Leporini, R., 2004, “An unsharp logic from quantum computation”, International Journal of Theoretical Physics, 43, 1803-1817.
[54] Cattaneo, G., Dalla Chiara, M.L., Giuntini, R. and Paoli, F., 2009, “Quantum Logic and Nonclassical Logics”, in Handbook of Quantum Logic and Quantum Structures, K. Engesser, D. Gabbay and D. Lehmann (Eds), pp. 127-226, Elsevier, Amsterdam.
[55] Chajda, I. and Kühr, J., 2012, “A generalization of effect algebras and ortholattices”, Mathematica Slovaca, 62, 1045-1062.
[56] Clifton, R.K., 1996, “The Properties of Modal Interpretations of Quantum Mechanics”, British Journal for the Philosophy of Science, 47, 371-398.
[57] Coecke, B., 2000, “Structural characterization of compoundness”, International Journal of Theoretical Physics, 39, 581-590.
[58] Coecke, B., Moore, D.J. and Wilce, A., 2000, “Operational Quantum Logic: An Overview”, in Current Research in Operational Quantum Logic: Algebras, Categories, Languages, B. Coecke, D.J. Moore and A. Wilce (Eds.), pp. 1-36, Kluwer Academic Publishers, Dordrecht.
[59] Coecke, B., Moore, D.J. and Stubbe, I., 2001, “Quantaloids describing causation and propagation of physical properties”, Foundation of Physics Letters, 14, 357-367.
[60] Coecke, B. and Smets, S., 2004, “The Sasaki-hook is not a [static] implicative connective but induces a backward [in time] dynamic one that assigns causes”, International Journal of Theoretical Physics, 43, 1705-1736.
[61] Coecke, B., Moore D.J. and Wilce A. (Eds.), 2000, Current Research in Operational Quantum Logic: Algebras Categories, Languages, Kluwer Academic Publishers, Dordrecht.
[62] Coecke, B., Moore, D.J. and Smets, S., 2004, “Logic of dynamics & dynamics of logic”, in Logic, Epistemology and the Unit of Science, S. Rahman, J. Symons (Eds.), pp. 527-555, Kluwer Academic Publisher, Dordrecht.
[63] Coecke, B., 2005, “Kindergarten Quantum Mechanics”, in Proceedings of QTRF-III, G. Adenier, A.Yu Khrennikov and T.M. Nieuwenhuizen (Eds.), pp. 81-98, AIP Proceedings, New York.
[64] Coecke, B., 2010, “Quantum Picturalism”, Contemporary Physics, 51, 5983.
[65] Coecke, B., Duncan, R., Kissinger, A. and Wang, Q., 2012, “Strong Complementarity and Non-locality in Categorical Quantum Mechanics”, in Proceedings of the 27th Annual IEEE Symposium on Logic in Computer Science LiCS 2012, pp. 245-254, IEEE Publisher.
[66] Coecke, B., 2012, “The Logic of Quantum Mechanics – Take II”, Preprint. (quant-ph/arXiv:1204.3458)
[67] Coecke, B., Heunen, C. and Kissinger, A., 2013, “Compositional Quantum Logic”, Computation, Logic, Games, and Quantum Foundations, 21-36.
[68] da Costa, N.C.A., 1997, Logique Classique et Non-Classique, Masson, Paris.
[69] da Costa, N. C. A. and French, S., 2003, Partial Truth: A Unitary Approach to Models and Scientific Reasoning, Oxford University Press, Oxford.
[70] da Costa, N.C.A., Krause, D. and Bueno, O., 2006, “Paraconsistent Logics and Paraconsistence”, in Philosophy of logic, D.M. Gabbay, P. Thagard and J. Woods (Eds.), pp. 655-781, Elsevier, Amsterdam.
[71] da Costa, N.C.A. and Krause, D., 2006, “The Logic of Complementarity”, in The Age of Alternative Logics: Assessing Philosophy of Logic and Mathematics Today, J. van Benthem, G. Heinzmann, M. Rebushi and H. Visser (Eds.), pp. 103-120, Springer, Berlin.
[72] da Costa, N. C. A., Krause, D., and Bueno, O., 2007, “Paraconsistent Logics and Paraconsistency”, in Handbook of the Philosophy of Science (Philosophy of Logic), D. Jacquette (Ed.), pp. 791-911, Elsevier, Amsterdam.
[73] da Costa, N.C.A. and Bueno, O., 2009, “Non Reflexive Logics”, Revista Brasilera de Filosofía, 232, 181-196.
[74] da Costa, N. and de Ronde, C., 2013, “The Paraconsistent Logic of Quantum Superpositions”, Foundations of Physics, 43, 845-858.
[75] da Costa, N.C.A. and de Ronde, C., 2014, “Non-Reflexive Logical Foundation for Quantum Mechanics”, Foundations of Physics, 44, 1369-1380.
[76] da Costa, N.C.A. and de Ronde, C., 2014, “The Paraconsistent Approach to Quantum Superpositions Reloaded”, Preprint. (quantph/arXiv:1507.02706)
[77] Dalla Chiara, M.L., Giuntini, R. and Krause, D., 1998, “Quasi set theories for microobjects: a comparison”, in Interpreting Bodies: Classical and Quantum Objects in Modern Physics, E. Castellani (Ed.), Princeton University Press, Princeton.
[78] Dalla Chiara, M. and Giuntini, R., 2002, “Quantum Logics”, in Handbook of Philosophical Logic, Vol 6, D. Gabbay and F. Guenthner, (Eds.), Kluwer Academic Publishers, Dordrecht.
[79] Dalla Chiara, M. and Giuntini, R., 2000, “Paraconsistent Ideas in Quantum Logic”, Synthese, 125, 55-68.
[80] Dalla Chiara, M.L., Giuntini, R. and Leporini, R., 2003, “Quantum Computational Logic. A Survey”, Preprint. (quant-ph/arXiv:030529).
[81] Dalla Chiara, M., Giuntini, R. and Greechie, R., 2004, Reasoning in Quantum Theory, Kluwer Academic Publishers, Dordrecht.
[82] Dalla Chiara, M.L., Giuntini, R., Freytes, H., Ledda, A. and Sergioli, G., 2009, “The Algebraic Structure of an Approximately Universal System of Quantum Computational Gates”, Foundation of Physics, 39, 559-572.
[83] de Ronde, C., 2011, The Contextual and Modal Character of Quantum Mechanics: A Formal and Philosophical Analysis in the Foundations of Physics, Doctoral dissertation, Utrecht University, Utrecht.
[84] de Ronde, C., 2013, “Quantum Superpositions and Causality: On the Multiple Paths to the Measurement Result”, Preprint. (quantph/arXiv:1310.4534)
[85] de Ronde, C., 2013, “Representing Quantum Superpositions: Powers, Potentia and Potential Effectuations”, Preprint. (quant-ph/arXiv:1312.7322)
[86] de Ronde, C., 2015, “Modality, Potentiality and Contradiction in Quantum Mechanics”, New Directions in Paraconsistent Logic, pp. 249-265, J.-Y. Beziau, M. Chakraborty and S. Dutta (Eds.), Springer, Berlin.
[87] de Ronde, C., 2016, “Representational Realism, Closed Theories and the Quantum to Classical Limit”, in R. E. Kastner, J. Jeknic-Dugic and G. Jaroszkiewicz (Eds.), Quantum Structural Studies, World Scientific, Singapore.
[88] de Ronde, C., Freytes, H. and Domenech, G., 2014, “Interpreting the Modal Kochen-Specker Theorem: Possibility and Many Worlds in Quantum Mechanics”, Studies in History and Philosophy of Modern Physics, 45, pp. 11-18.
[89] de Ronde, C., Freytes, H. and Domenech, G., 2014, “Quantum Mechanics and the Interpretation of the Orthomodular Square of Opposition”, in New Dimensions of the Square of Opposition, Jean-Yves Béziau and Katarzyna Gan-Krzywoszynska (Eds.), pp. 223-242, Philosophia Verlag, Munich.
[90] DeWitt, B., 1973, “The Many-Universes Interpretation of Quantum Mechanics”, In Foundations of Quantum Mechanics, 167-218, Academic Press, New York.
[91] DeWitt, B. and Graham, N., 1973, The Many-Worlds Interpretation of Quantum Mechanics, Princeton University Press, Princeton.
[92] Dickson, W. M., 2001, “Quantum logic is alive ? (It is true ? It is false)”, Proceedings of the Philosophy of Science Association 2001, 3, S274-S287.
[93] Dickson, W. M., 1998, Quantum Chance and Nonlocality: Probability and Nonlocality in the Interpretations of Quantum Mechanics, Cambridge University Press, Cambridge.
[94] Dickson, M. and Dieks, D., 2002, “Modal Interpretations of Quantum Mechanics”, The Stanford Encyclopedia of Philosophy (Winter 2002 Edition), E. N. Zalta (Ed.), URL: http://plato.stanford.edu/archives/win2002/entries/qm-modal/.
[95] Dieks, D., 1988, “The Formalism of Quantum Theory: An Objective Description of Reality”, Annalen der Physik, 7, 174-190.
[96] Dieks, D., 1989, “Quantum Mechanics Without the Projection Postulate and Its Realistic Interpretation”, Foundations of Physics, 19, 1397-1423.
[97] Dieks, D., 2007, “Probability in the modal Interpretation of quantum mechanics”, Studies in History and Philosophy of Modern Physics, 38, 292310.
[98] Dieks, D., 2010, “Quantum Mechanics, Chance and Modality”, Philosophica, 83, 117-137.
[99] Dirac, P. A. M., 1974, The Principles of Quantum Mechanics, 4th Edition, Oxford University Press, London.
[100] Domenech, G. and Freytes, H., 2005, “Contextual logic for quantum systems”, Journal of Mathematical Physics, 46, 012102-1 – 012102-9.
[101] Domenech, G. and Freytes, H., 2005, “Fuzzy propositional logic associated with quantum computational gates”, International Journal of Theoretical Physics, 45, 228-261.
[102] Domenech, G., Freytes, H. and de Ronde, C., 2006, “Scopes and limits of modality in quantum mechanics”, Annalen der Physik, 15, 853-860.
[103] Domenech, G., Freytes, H. and de Ronde, C., 2008, “A topological study of contextuality and modality in quantum mechanics”, International Journal of Theoretical Physics, 47, 168-174.
[104] Domenech, G., Freytes, H. and de Ronde, C., 2009, “Modal-type orthomodular logic”, Mathematical Logic Quarterly, 3, 307-319.
[105] Domenech, G., Freytes, H. and de Ronde, C., 2009, “Many worlds and modality in the interpretation of quantum mechanics: an algebraic approach”, Journal of Mathematical Physics, 50, 072108.
[106] Domenech, G., Holik, F and Massri, C., 2010, “A quantum logical and geometrical approach to the study of improper mixtures”, Journal of Mathematical Physics, 51, 052108.
[107] Döring, A. and Isham, C. J., 2008, “A topos foundation for theories of physics: I. Formal languages for physics”, Journal of Mathematical Physics, 49, 053515.
[108] Döring, A. and Isham, C. J., 2008, “A topos foundation for theories of physics: II. Daseinisation and the liberation of quantum theory”, Journal of Mathematical Physics, 49, 053516.
[109] Döring, A. and Isham, C. J., 2008, “A topos foundation for theories of physics: III. The representation of physical quantities with arrows”, Journal of Mathematical Physics, 49, 053517.
[110] Döring, A. and Isham, C. J., 2008, “A topos foundation for theories of physics: VI. Categories of systems”, Journal of Mathematical Physics, 49, 053518.
[111] Döring, A. and Isham, C. J., 2011, “What is a thing’ Topos theory in the foundations of physics”, in New structures in physics, B. Coecke (Ed.), 753-940, Springer, Berlin.
[112] Dvure?enskij, A. and Pulmannová, S., 1994, “Difference posets, effects, and quantum measurements”, International Journal of Theoretical Physics, 33, 819-850.
[113] Dvure?enskij, A. and Pulmannová, S., 1994, “Tensor products of D-posets and D-test spaces”, Reports in Mathematical Physics, 34, 251-275.
[114] Dvure?enskij, A., 1995, “Tensor product of difference posets and effect algebras”, International Journal of Theoretical Physics, 34, 1337-1348.
[115] Dvure?enskij A. and Pulmannova, S., 2000, New Trends in Quantum Structures, Kluwer Academic Publishers, Dordrecht.
[116] Dvure?enskij, A. and Xie, Y., 2014, “N-Perfect and Q-Perfect Pseudo Effect Algebras”, International Journal of Theoretical Physics, 53, 33803390.
[117] Engesser, K., Gabbay, D.M. and Lehman, D. (Eds.), 2009, Handbook of Quantum Logic and Quantum Structures, Elsevier, Amsterdam.
[118] Everett, H., 1957, “‘Relative State’ Formulation of Quantum Mechanics”, Reviews of Modern Physics, 29, 454-462.
[119] Everett, H., 1973, “The Theory of the Universal Wave Function”, In The Many-Worlds Interpretation of Quantum Mechanics, DeWitt and Graham (Eds.), Princeton University Press, Princeton.
[120] Finkelstein, D., “Matter, space and logic”, in Boston Studies in the Philosophy of Science V, R.S. Cohen and M.W. Wartofsky (Eds.), D. Reidel, Dordrecht.
[121] Finkelstein, D., “The physics of logic”, in Paradigms and Paradoxes: The Philosophical Challenge of the Quantum Domain, R.G. Colodny (Ed.), University of Pittsburg Press, Pittsburg.
[122] Foulis, D.J. and Randall, C.H., 1972, “Operational statistics, I. Basic concepts”,Journal of Mathematical Physics, 13, 1667-1675.
[123] Foulis, D.J. and Randall, C.H., 1974, “Empirical logic and quantum mechanics”, Synthese, 29, 81-111.
[124] Foulis, D.J. and Randall, C.H., 1978, “Manuals, morphisms and quantum mechanics”, in Mathematical Foundations of Quantum Theory, A. Marlow (Ed.), Academic Press, New York.
[125] Foulis, D.J. and Randall, C.H., 1979, “Tensor products of quantum logics do not exist”, Noticies of the American Mathematical Society, 26, 557.
[126] Foulis, D.J. and Randall, C.H., 1981, “Empirical logic and tensor products”, in Interpretations and Foundations of Quantum Theory, H. Neumann (ed.), B. I. Wissenschaft, Mannheim.
[127] Foulis, D.J., Piron, C. and Randall, C.H., 1983, “Realism, Operationalism, and Quantum Mechanics”, Foundations of Physics, 13, 813-841.
[128] Foulis, D.J. and Bennett, M.K., 1994, “Effect algebras and unsharp quantum logics”, Foundations of Physics, 24, 1331-1352.
[129] Foulis, D.J., Bennett, M.K. and Greechie, R.J., 1996, “Test groups and effect algebras”, International Journal of Theoretical Physics, 35, 11171140.
[130] Foulis, D.J., 2000, “Representations on unigroups”, in Current Research in Operational Quantum Logic: Algebras Categories, Languages, B. Coecke, D.J. Moore and A. Wilce (Eds.), Kluwer Academic Publishers, Dordrecht.
[131] Foulis, D.J., Pulmannová, S. and Vinceková, E., 2011, “Lattice pseudoeffect algebras as double residuated structures”, Soft Computing, 15, 24792488.
[132] Foulis, D.J. and Pulmannová, S., 2013, “Dimension theory for generalized effect algebras”, Algebra Universalis, 69, 357-386.
[133] Foulis, D.J. and Pulmannová, S., 2014, “Symmetries in synaptic algebras”, Mathematica Slovaca, 64, 751-776.
[134] French, S. and Krause, D., 2006, Identity in Physics: A Historical, Philosophical and Formal Analysis, Oxford University Press, Oxford.
[135] Freytes, H. and Ledda, A., 2009, “Categories of semigroups in quantum computational structures”, Mathematica Slovaca, 59, 413-432
[136] Freytes, H., 2010, “Quantum computational structures: categorical equivalence for square roots QMV-algebras”, Studia Logica, 95, 63-80.
[137] Freytes, H., de Ronde, C. and Domenech, G., 2012, “The Square of Opposition in Orthomodular Logic”, in Around and Beyond the Square of Opposition: Studies in Universal Logic, J.-Y. Béziau and D. Jacquette (Eds.), pp. 193-201, Springer, Basel.
[138] Freytes, H. and Domenech, G., 2013, “Quantum computational logic with mixed states”, Mathematical Logic Quarterly, 59, 27-50.
[139] Friedman, M. and Putnam, H., 1978, “Quantum Logic, Conditional Probability and Inference”, Dialectica, 32, 305-315.
[140] Girard, J-Y, “Linear logic”, 1987, Theoretical Computational Science, 50, 1-102.
[141] Giuntini, R. and Greuling, H., 1989, “Toward a formal language for unsharp properties”, Foundations of Physics, 19, 931-945.
[142] Giuntini, R., 1996, “Quantum MV-Algebras”, Studia Logica, 56, 393-417.
[143] Giuntini, R., Freytes, H,. Ledda, A. and Paoli, F., 2009, “A discriminator variety of Gödel algebras with operators arising from quantum computation”, Fuzzy Sets and Systems, 160, 1082-1098.
[144] Gleason, A.M., 1957, “Measures on the closed subspaces of a Hilbert space”, Journal Mathematical Mechanics, 6, 885-893.
[145] Goldblatt, R.I., 1974, “Semantic analysis of orthologic”, Journal of Philosophical Logic, 3, 19-35.
[146] Goldblatt, R.I., 1984, Topoi: The Categorial Analysis of Logic, Elsevier, Amsterdam.
[147] Greechie, R.J., Foulis, D.J. and Pulmannová, S., 1995, “The center of an effect algebra”, Order, 12, 910-106.
[148] Gudder, S.P., “Effect test spaces”, 1997, International Journal of Theoretical Physics, 36, 2681-2705.
[149] Gudder, S., 2002, “Quantum Computational Logic”, International Journal of Theoretical Physics, 42, 39-47.
[150] Halmos, P., 1995, “Algebraic Logic I, Monadic Boolean algebras”, Compositio Mathematica, 12, 217-249.
[151] Hajek, P., 1998, Metamathematics of Fuzzy Logic, Kluwer Academic Publishers, Dordrecht.
[152] Heisenberg, W., 1958, Physics and Philosophy, Ruskin House, London.
[153] Heisenberg, W., 1973, “Development of Concepts in the History of Quantum Theory”, in The Physicist’s Conception of Nature, 264-275, J. Mehra (Ed.), Reidel, Dordrecht.
[154] Heunen, C. and Spitters, B., 2009, “A Topos for Algebraic Quantum Theory”, Communications in Mathematical Physics, 291, 63-110.
[155] Hooker, C.A. (Ed.), 1979, The Logico-Algebraic Approach to Quantum Mechanics II, D. Reidel, Dordrecht.
[156] Husimi, K., 1937, “Studies on the foundations of quantum mechanics I”, Proceedings of the Physico-Mathematical Society Japan, 9, 766-778.
[157] Isham, C. J., 2011, “Topos Methods in the Foundations of Physics”, in Deep Beauty, H. Halvorson (Ed.), 187-206, Cambridge University Press.
[158] Jammer, M., 1974, Philosophy of Quantum Mechanics, Wiley, New York.
[159] Jauch, J.M., 1968, Foundations of Quantum Mechanics, Addison-Wesley, Reading.
[160] Jauch, J.M. and Piron, C., 1969, “On the structure of quantal proposition systems”, Helvetica Physica Acta, 42, 842-848.
[161] Kalman, J. A., 1958, “Lattices with involution”, Transactions of the American Mathematical Society, 87, 485-491.
[162] Kalmbach, G., 1983, Ortomodular Lattices, Academic Press, London.
[163] Kauark-Leite, P., 2004, The Transcendental Approach and the Problem of Language and Reality in Quantum Mechanics, Doctoral dissertation, Centre de Recherche en Epistémologie Appliquée – École Polytechnique, Paris.
[164] Kaye, P., Laflamme, R. and Mosca, M., 2007, An Introduction to Quantum Computing, Oxford University Press, New York.
[165] Kissinger, A., 2014, “Abstract Tensor Systems as Monoidal Categories”, in Categories and Types in Logic, Language, and Physics, 235-252.
[166] Kochen, S., 1985, “A New Interpretation of Quantum Mechanics”, In Symposium on the foundations of Modern Physics 1985, P. Lathi and P. Mittelslaedt (Eds.), pp. 151-169, World Scientific, Johensuu.
[167] Kochen, S. and Specker, E., 1967, “On the problem of Hidden Variables in Quantum Mechanics”, Journal of Mathematics and Mechanics, 17, 59-87. Reprinted in Hooker, 1975, 293-328.
[168] Köhler, P., 1981, “Brouwerian semilattices”, Transactions of the American Mathematical Society, 268, 103-126.
[169] Kôpka, F., 1992, “D-posets of fuzzy sets”, Tatra Mountains Mathematical Publications, 1, 83-87.
[170] Krause, K., 1992, “On a quasi-set theory”, Notre Dame Journal of Formal Logic, 33, 402-411.
[171] Krause, D., 2002, “Why quasi-sets?”, Boletim da Sociedade Paranaense de Matemática, 20, 73-92.
[172] Krause, D., 2014, “The problem of identity and a justification for a nonreflexive quantum mechanics,” Logic Journal of the IGLP, 22, 186-205.
[173] Krause, D. and Arenhart, J., 2016, “A Logic of Quantum Superpositions”, in Probing the Meaning of Quantum Mechanics, D. Aerts, C. de Ronde, H. Freytes and R. Giuntini (Eds.), World Scientific, Singapore.
[174] Lahti, P., 1980, “Uncertainty and Complementarity in Axiomatic Quantum Mechanics”, International Journal of Theoretical Physics, 19, 789-842.
[175] Lewis, D., 1986, On the Plurality of Worlds, Blackwell Publishers, Harvard.
[176] Ludwig, G., An Axiomatic Basis of Quantum Mechanics 1. Derivation of Hilbert Space, Springer-Verlag, Berlin, 1985
[177] Ludwig, G., 1987, An Axiomatic Foundation of Quantum Mechanics 2. Quantum Mechanics and Macrosystems, Springer-Verlag, Berlin.
[178] Lyre, H., 2003, “C. F. von Weizsäcker’s Reconstruction of Physics: Yesterday, Today, Tomorrow”, in Time, Quantum and Information (Essays in Honor of C. F. von Weizsäcker), L. Castell and O. Ischebeck (Eds.), Springer, Berlin.
[179] Mackey, G.W., 1963, The Mathematical Foundations of Quantum Mechanics, Benjamin, Amsterdam.
[180] Maeda, F. and Maeda, S., 1970, Theory of Symmetric Lattices, SpringerVerlag, Berlin.
[181] Manin, Yu.I., 1976, “Mathematical Problems I: Foundations”, in Mathematical Problems Arising from Hilbert Problems, F.E. Browder (Ed.), p. 36, American Mathematical Society, Providence.
[182] Mateus P. and Sernadas, A., 2004, “Reasoning About Quantum Systems”, in Logics in Artificial Intelligence, Ninth European Conference, JELIA-04, 239-251, J. Alferes and J. Leite (Eds), Springer-Verlag.
[183] Mateus P. and Sernadas, A., 2006, “Weakly complete axiomatization of exogenous quantum propositional logic”, Information and Computation, 204, 771-794.
[184] Matoušek, M. and Pták, P., 2014, “Orthomodular posets related to Zvalued states”, International Journal of Theoretical Physics, 53, 3323-3332 [185] Mittelstäedt; P., 1978, Quantum logic, Reidel, Dordrecht.
[186] Mittelstäedt; P., 1979, “The modal logic of quantum logic”, Journal Philosophical Logic, 8, 479-504.
[187] Mittelstäedt; P., 1981, “The concepts of truth, possibility an probability in the language of quantum physics”, in Interpretations and Foundations of Quantum Theory, H. Neumann (Ed.), pp. 70-94, Bibliographisches Institut, Mannheim.
[188] Mittelstäedt; P., 1981, “The dialogic approach to modalities in the language of quantum physics”, in Current Issues in Quantum Logic, E. Beltrametti and B.C. van Fraassen (Eds.), pp. 259-281, Plenum Publication Co, New York.
[189] Mittelstäedt; P., 1985, “Constituting, Naming and Identity in Quantum Logic”, in Recent Developments in Quantum Logic, P. Mittelstäedt and E.-W. Statchow (Eds.), pp. 215-234, BI-Wissenschaftsverlag, Mannheim.
[190] Mittelstäedt; P., 1986, “Naming and Identity in Quantum Logic”, in Foundations of physics, P. Weingartner and G. Dorn (Eds.), pp. 139-161, Vienna.
[191] Nielsen, M.A. and Chuang, I.L., 2010, Quantum Computation and Quantum Information: 10th Anniversary Edition, Cambridge University Press, Cambridge.
[192] Paoli, F., Ledda, A., Giuntini, R. and Freytes, H., 2009, “On some properties of quasi-MV algebras and sqrt quasi-MV algebras”, Reports on Mathematical Logic, 44, 31-63.
[193] Paoli, F., Ledda, A., Spinks, Freytes, H. and Giuntini, R., 2011, “Logics from sqrt MV-algebras”, International Journal of Theoretical Physics, 50, 3882-3902.
[194] Piron, C., 1964, “Axiomatique Quantique”, Helvetica Physica Acta, 37, 439-468.
[195] Piron, C., 1976, Foundations of Quantum Mechanics, W.A. Benjamin, Inc., Reading.
[196] Piron, C., 1983, “Le realisme en physique quantique: une approche selon Aristote”, In The Concept of Physical Reality. Proceedings of a Conference Organized by the Interdisciplinary Research Group, University of Athens, Athens.
[197] Piron, C., 1989, “Recent Developments in Quantum Mechanics”, Helvetica Physica Acta, 62, 82-90.
[198] Pták, P. and Pulmannová, S., 1991, Orthomodular Structures as Quantum Logics, Kluwer Academic Publishers, Dordrecht.
[199] Pulmannová, S., 1985, “Tensor Product of Quantum Logics”, Journal of Mathematical Physics, 26, 1-5.
[200] Pulmannová, S. and Wilce, A., 1995, “Representations of D-posets”, International Journal of Theoretical Physics, 34, 1689-1696.
[201] Pulmannová, S. and Vincenková, E., 2007, “Remarks on the order for quantum observables”, Mathematica Slovaca, 57, 589-600.
[202] Putnam, H., 1968, “Is Logic Empirical?”, Boston Studies in the Philosophy of Science V, 5, 199-215.
[203] Putnam, H., 1974, “How To Think Quantum Logically”, Synthese, 29, 55-61.
[204] Randall C.H. and Foulis, D.J., 1970, “An approach to empirical logic”, American Mathematicl Monthly, 77, 363-374.
[205] Randall C.H. and Foulis, D.J., 1973, “Operational statistics II. Manuals of operations and their logics”, Journal Mathematical Physics, 14, 1472-1480.
[206] Randall C.H. and Foulis, D.J., 1983, “A mathematical language for quantum physics”, in Les fondements de la mécanique quantique, C. Gruber, C. piron, T.M. Tâm and R. Weil (Eds), AVCP, Lausanne.
[207] Randall, C.H. and Foulis, D.J., 1983, “Properties and operational propositions in quantum mechanics”, Foundations of Physics, 13, 843-857.
[208] Reichenbach, H., 1975, “Three valued logic and the interpretation of quantum mechanics”, in The Logico-Algebraic Approach to Quantum Mechanics – Vol I, C.A. Hooker (Ed.), Reidel, Dordrecht.
[209] Rédei, M., 1989, “Quantum conditional probabilities are not probabilities of quantum conditional”, Physics Letters, A 139, 287-290.
[210] Rédei, M., 1998, Quantum Logic in Algebraic Approach, Kluwer Academic Publishers, Dordrecht.
[211] Rédei, M., 1999, “ ‘Unsolved Problems of Mathematics’ J. von Neumann’s address to the International Congress of Mathematicians, Amsterdam, September 2-9, 1954”, The Mathematical Intelligencer, 21, 7-12.
[212] Rédei, M., 2001, “Von Neumann’s concept of quantum logic and quantum probability”, in John von Neumann and the Foundations of Quantum Physics, M. Rédei and M. Stötzner (Eds.), pp. 153-172, Kluwer Academic Publishers, Dordrecht.
[213] Rédei, M. and Summers, S.J., 2007, “Quantum probability theory”, Studies in the History and Philosophy of Modern Physics, 38, 390-417.
[214] Sakurai, J. J. and Napolitano, J., 2010, Modern Quantum Mechanics, Addison-Wesley, London.
[215] Smets, S., 2000, “In Defense of Operational Quantum Logic”, Logic and Logical Philosophy.
[216] Smets, S., 2001, The Logic of Physical Properties in Static and Dynamic Perspective, Doctoral dissertation, Brussels Free University, Brussels.
[217] Smets, S., 2005, “The Modes of Physical Properties in the Logical Foundations of Physics”, Logic and Logical Philosophy, 14, 37-53.
[218] Smets, S., 2010, “Logic and quantum physics”, Journal of the Indian Council of Philosophical Research Spetial Issue XXVIII, N2.
[219] Solèr, M.P., 1995, “Characterization of Hilbert spaces by orthomodular spaces”, Communcations in Algebra, 23, 219-243.
[220] Svozil, K., 1998, Quantum Logic, Springer, Singapore.
[221] Takeuti, G., 1978, Two Applications of Logic to Mathematics, Iwanami and Princeton University Press, Tokyo and Princeton.
[222] Takeuti, G., 1981, “Quantum Set Theory”, in Current Issues in Quantum Logic, E. Beltrametti and B.C. van Frassen (Eds.), pp. 303-322, Plenum, New York.
[223] Titani, S., 1999, “Lattice Valued Set Theory”, Archive for Mathematical Logic, 38, 395-421.
[224] Van Fraassen, B.C., 1972, “A formal approach to the philosophy of science”, in Paradigms and Paradoxes: The Philosophical Challenge of the Quantum Domain, R. Colodny (Ed.), pp. 303-366, University of Pittsburgh Press, Pittsburgh.
[225] Van Fraassen, B.C., 1973, “Semantic Analysis of Quantum Logic”, In Contemporary Research in the Foundations and Philosophy of Quantum Theory, C. A. Hooker (Ed.), pp. 80-113, Reidel, Dordrecht.
[226] Van Fraassen, B.C., 1974, “The Einstein-Podolsky-Rosen paradox”, Synthese, 29, 291-309.
[227] Van Fraassen, B.C., 1981, “A modal Interpretation of Quantum Mechanics”, in Current Issues in Quantum Logic, 229-258, E. G. Beltrametti and B. C. van Fraassen (Eds.), Plenum, New York.
[228] Van Fraassen, B.C., 1991, Quantum Mechanics: An Empiricist View, Clarendon, Oxford.
[229] Varadarajan, V.S., 1962, “Probability in Physics and a Theorem on Simultaneous Observability”, Communication of Pure and Applied Mathematics, XV, 189-217.
[230] Varadarajan, V.S., 1985, Geometry of Quantum Theory, Springer, Berlin.
[231] Verelst, K. and Coecke, B., 1999, “Early Greek Thought and Perspectives for the Interpretation of Quantum Mechanics: Preliminaries to an Ontological Approach”, in The Blue Book of Einstein Meets Magritte, Gustaaf C. Cornelis, Sonja Smets, Jean-Paul van Bendegem (Eds.), pp. 163-196, Kluwer Academic Publishers, Dordrecht.
[232] Vermaas, P.E., 1999, A Philosophers Understanding of Quantum Mechanics, Cambridge University Press, Cambridge.
[233] Vermaas, P.E. and Dieks, D., 1995, “The Modal Interpretation of Quantum Mechanics and Its Generalization to Density Operators”, Foundations of Physics, 25, 145-158.
[234] Von Neumann, J., 1996, Mathematical Foundations of Quantum Mechanics, Princeton University Press (12th. edition), Princeton.
[235] Von Neumann, J., 1961, Collected Works Vol III: Rings of Operators, A. H. Taub (Ed.), Pergamon Press.
[236] Von Weizsäcker, C. F., 1985, “Heisenberg’s Philosophy”, In Symposium on the Foundations of Modern Physics 1985, P. Lathi and P. Mittelslaedt (Eds.), pp. 277-293, World Scientific, Singapore.
[237] Wilce, A., 1995, “Partial Abelian Semigroups”, International Journal of Theoretical Physics, 34, 1807-1812.
[238] Wilce, A., 1998, “Perspectivity and Congruence in Partial Abelian Semigroups”, Mathematica Slovaca, 48, 117-135.
[239] Zafiris, E. and Karakostas, V., 2013, “A categorial semantics representation of quantum events”, Foundations of Physics, 43, 1090-1123.

Author Information

C. de Ronde
Email: cderonde@gmail.com
University of Buenos Aires
Argentina

and

G. Domenech
Email: gradomenech@gmail.com
Vrije Universiteit Brussel
Belgium

and

H. Freytes
Email: hfreytes@gmail.com
Cagliari University
Italy
University of Rosario
Argentina

Metaepistemology

Metaepistemology is, roughly, the branch of epistemology that asks questions about first-order epistemological questions. It inquires into fundamental aspects of epistemic theorizing like metaphysics, epistemology, semantics, agency, psychology, responsibility, reasons for belief, and beyond. So if, as traditionally conceived, epistemology is the theory of knowledge, then metaepistemology is the theory of the theory of knowledge. It is an emerging and quickly developing branch of epistemology, partly because of the success of its more advanced ‘twin’ metanormative subject, metaethics. The success of metaethics and the structural similarities between metaethics and metaepistemology have inspired parallel conceptual forays in metaepistemology, with far-reaching implications for both subjects.

This article offers a concise survey of basic themes and problems in metaepistemology. The survey aims neither at being exhaustive nor at presenting these themes and problems in their full sophistication and complexity. Rather, given the very broad span of topics that fall under the label of metaepistemology, the aim is to introduce the basic themes and problems and to survey some of the cutting-edge research currently undertaken in metaepistemological debates.

In what follows, “(meta)”epistemology contains brackets to indicate the epistemology of epistemology. This is to be distinguished from non-bracketed “metaepistemology,” which is meant to refer to the whole domain of metaepistemological theorizing (metaphysics, epistemology, semantics, agency and so forth).

1. Situating Metaepistemology within Epistemology and Metanormativity
2. Normativity
3. Metaphysics
4. Semantics
5. (Meta)Epistemology
6. Reasons for Belief and Epistemic Psychology
7. Agency and Responsibility
8. New Directions in Metaepistemology
9. References and Further Reading

1. Situating Metaepistemology within Epistemology and Metanormativity

Following the example of ethics (for example, Fisher 2011; see also Fumerton 1995), we can distinguish three basic branches of epistemology: normative epistemology, applied epistemology, and metaepistemology. Normative epistemology mostly deals with first-order theorizing: how we should form justified beliefs and gain understanding, truth, and knowledge, what the basic sources of knowledge are (like memory, perception, and testimony), and so forth. It does not pursue higher-order questions about these matters or pressing applied epistemic matters; to the extent that it does, it embroils itself, respectively, in metaepistemology and applied epistemology. Applied epistemology draws on normative epistemological theorizing in order to respond to epistemic matters of pressing practical value, like climate change skepticism, jury decision-making, gender or race issues in epistemology, and so forth.

The following example illustrates how the trichotomy of the epistemic domain is meant to divide epistemological labor. As is well known, epistemologists are intrigued by the perennial question “What is knowledge?” and, accordingly, try to come up with plausible reductive analyses. This much is first-order normative epistemological theorizing at its best. If, however, we dig conceptually deeper, move a level down, and ask whether there is any “real” (or robust) knowledge, or whether the project of reductively analyzing knowledge is at all plausible, then we ask second-order, metaepistemological questions. That is, we ask questions about first-order epistemological questions, like the question “What is knowledge?” Moreover, if we ask epistemic questions of pressing practical value, like whether factors of gender, race, and ethnic origin affect ordinary knowledge attributions, then we are pressing applied questions (for example, Fricker 2010) and have moved into the field of applied epistemology.

Opinions diverge about the exact interrelation of the three branches of epistemology and about the exact interrelation of metaepistemology and its twin metanormative subject, metaethics. In regard to the former issue, there are two broad possible positions about the relation among the three branches. The first is what we may call the autonomy thesis. According to the autonomy thesis, which is also sometimes propounded in ethical theory (compare Enoch 2013 for discussion), metaepistemology is an independent branch of epistemological inquiry that does not depend on the results of the other two branches of epistemology. Conversely, neither applied nor normative epistemology depends on the results of metaepistemology. The autonomy thesis has some prima facie plausibility because it seems intuitive that one may be, say, a coherentist, foundationalist, or reliabilist about normative epistemology but an expressivist, error theorist, or relativist about metaepistemology.

The other position on the matter is what we may call the interdependency thesis. It suggests that there are important theoretical interdependencies between the three branches (pace some prima facie appearances of autonomy). If, for example, we could reductively analyze epistemic justification in terms of informative necessary and sufficient conditions, it seems that we would have a theory to invoke in normative justificatory matters and apply to pressing questions of epistemic justification like, say, climate change skepticism. However, the fact that such analyses do not seem readily available indicates that nothing is very obvious in metaepistemological matters.

In regard to the latter issue, namely, how to situate metaepistemology not merely within epistemology but within the broadly metanormative domain, there are again two broad, divergent positions. First, many metanormativists hold “the parity thesis” (sometimes called “the unity thesis”), according to which the epistemic and the moral/practical are intertwined normative subjects that are theoretically on a par and should therefore share the same metanormative fate, whatever that may be (realist, antirealist, Kantian constructivist, or even other) (compare Kim 1988; Cuneo 2007). Other metanormativists deny this and argue that there are important discontinuities between metaepistemology and metaethics and, hence, that we should instead hold “the disparity thesis” (compare Lenman 2008; Heathwood 2009).

For example, Cuneo (2007) has argued that the moral and the epistemic domain share core structural similarities (reasons, supervenience, motivation, and so forth) and that this bolsters the parity thesis. In response to Cuneo’s (2007) arguments for the parity thesis, it has been suggested by Lenman (2008) and Heathwood (2009) that while moral facts and truths may be irreducible, epistemic facts and truths may be reducible to facts and truths about evidence and probability (where these are ultimately to be understood in descriptive terms) and, therefore, there is a fundamental disparity between the two metanormative subjects. Again, Cuneo and Kyriacou (2017) have come up with a rejoinder to the Heathwood/Lenman case for the moral/epistemic disparity and argued that the parity seems to go through in the end. Of course, the dialectic is currently developing and the jury is still out.

So far, we have said a few basic things about the possible positions in situating metaepistemology within epistemology proper and within metanormativity. We now turn to the basic question of what it is that makes epistemology a distinctively normative subject and how from epistemic normativity we arrive at perplexing metaepistemological questions. The next section unpacks the various aspects of the metaepistemological domain that will be presented as we proceed.

2. Normativity

One of the most remarkable characteristics of human primates is their evolved, often linguistically mediated, capacity for cognizing and, moreover, the intrinsic normativity of this cognizing. Cognizing is intrinsically normative because our wide array of cognitive endeavors seems to be inherently “fraught with ought” and evaluable in terms of (in)correctness. Intuitively, to the extent that we are rational and responsible agents, there are propositions we ought to believe and propositions we ought not to believe, and there are cognitive practices, methods, processes, habits, and so forth that are epistemically correct to employ and others that are epistemically incorrect to employ. That is, (in)correct from the epistemic point of view.

Indeed, generations of epistemologists from the early moderns like Descartes (1641), Locke (1690) and Hume (1739), to Clifford (1877), Chisholm (1966), Alston (1988), Fumerton (1995), Feldman (2002) and beyond have attested to the normativity of cognizing and have talked about corresponding epistemic duties, oughts, obligations, requirements, and so forth (terms that for current purposes are used interchangeably) that rational agents have.

For example, intuitively, we ought to believe on the basis of the relevant evidence or the relevant reliable cognitive process and ought not to believe what is merely bequeathed by tradition, dictated by fiat of authority or simply feels good. It is also epistemically correct to collect evidence meticulously and open-mindedly, and it is epistemically incorrect to cook your lab research to fit the conclusions that a generous research sponsor would favor (for example, that extensive consumption of red meat incurs no side-effects on health and the environment).

It is precisely this intrinsic normativity of our cognitive endeavors (practices, methods, processes, habits, beliefs, theories) that gives rise to metaepistemological questions because as rational, responsible agents we seem bound by epistemic duties and obligations that are rationally non-optional and inescapable. To the extent that we are rational agents, we seem constrained by epistemic oughts and duties regardless of whether we like it or not, or whether we submit to these or not. This fact is reflected in ordinary locutions like “p is the right thing to believe,” “You should trust what Paul says because he is an expert on the matter,” or “They should have known this much; there is no excuse,” and so forth. Call this fundamental appearance of ordinary epistemic discourse the deontic appearance.

Of course, the deontic appearance is the prima facie appearance of ordinary epistemic discourse, and appearances, even deeply entrenched ones, may, as we know very well, be deceptive. Secunda facie, we may have no epistemic duties or obligations and epistemic normativity may not be explainable in deontological terms. But at least prima facie we often talk and think in terms of propositions that one should or should not believe and in terms of practices, processes, methods, habits, and so forth that one should and should not employ. This much of epistemic appearance seems unequivocal and whether we should debunk the deontic appearance or not is a further question down the road.

It should also be underscored that the deontic appearance of ordinary epistemic discourse seems to have a distinctively categorical flavor: the phenomenology of our everyday talk and thought about duties, obligations, and oughts seems to imply the existence of categorical duties and obligations, that is, duties that are in some sense unconditional, independent of our psychology (desires, dispositions, beliefs), and that constrain what we ought to believe insofar as we are rational. For example, if a speaker utters, “You should believe that p” in an ordinary conversational context her statement would, typically, conversationally implicate that it is an (epistemic) fact of sorts that “You should believe that p.” A fortiori, the conversational implication is that anyone epistemically rational would be obliged to believe that p because it constitutes a categorical epistemic obligation (derivative of a corresponding epistemic fact).

In line with the deontic appearance, the broadly internalist view that takes it that we are bound by reflectively accessible epistemic duties is called epistemic deontologism (compare Clifford 1877; Alston 1988; Feldman 2002). It asserts that we have reflectively accessible, epistemic duties and that they should regulate rational doxastic behavior, namely, endorsing, maintaining, and revising a belief. Epistemic deontologism can be construed in a number of ways depending on how we understand epistemic goals of inquiry. Accordingly, we can have different proposals about how to construe epistemic duties.

However, the standard way to understand epistemic deontologism has been in terms of epistemic justification (for discussion see Feldman 2002). Roughly, an epistemic duty for S to believe p exists iff S has sufficient justification for p. Sufficient justification may in turn be understood in various ways, perhaps, along broadly evidentialist lines, that is, in terms of a relatively high ratio of evidential probability (for example, Heathwood 2009) or even along reliabilist lines, that is, in terms of a high ratio of truth output by a process (or ability) in an externalist framework (for example, Goldman 1979).

This, of course, is not the only way epistemic deontologism may be construed because it can also be construed in terms of alternative epistemic goals/values like truth, knowledge, and even understanding or wisdom (compare for the latter two goals Kyriacou 2016). That would mean that, roughly, an epistemic duty for S to believe p exists iff p is true or an instance of knowledge or even promotes understanding or wisdom. However, the best construal of epistemic deontologism is a question we need not further dwell on here. The important thing for current explicating purposes is that no matter how epistemic duties are to be construed, the deontic appearance stirs a whole host of perplexing and far-reaching metaepistemological questions, like the following:

Metaphysical: Are there epistemic properties/goals, norms, and facts in virtue of which categorical epistemic oughts, duties, and obligations for rational agents follow? If yes, what is their exact nature? If no, where does this leave us in terms of the intrinsic normativity of our cognitive practices? May the nonexistence of epistemic properties/goals, norms, and facts cripple the normative dimension of our epistemic lives? How is the constraint of epistemic supervenience to be understood and explained?

Semantic: What is the meaning of epistemic statements? Is it descriptive, expressivist, or even other? Are epistemic statements truth-apt? If yes, can truth-aptness be rescued in an expressivist metasemantic framework? Can deflationism do the trick? If there are robust epistemic facts, how do they ground truth, if at all? If there are no robust epistemic facts, then what does ground epistemic truth, if at all? Is the meaning of epistemic statements invariant or context-sensitive? Do practical interests, stakes, and so forth have a semantic contribution to the meaning of epistemic statements?

(Meta)epistemological: If there are categorical epistemic oughts and duties, how do we get to know them, if at all? Do we merely construct such duties and obligations or do we somehow discover them? If we discover them, how can this happen with minimal reliability, given that such properties and duties do not seem at first instance natural? How is this cognitive reliability to be accounted for, given our evolutionary history and the fact that the evolutionary process has been a blind, nonintentional process largely pushing towards adaptation, survival and reproduction by means of natural selection? Are intuitions credible evidence, especially in view of the evolutionary-cultural origins of cognition? Is talk of epistemic duties and oughts misconceived in light of epistemic externalism? Does epistemic externalism comport with the intrinsic normativity of cognitive endeavors?

Reasons for Belief and Epistemic Psychology: Are there categorical reasons for belief or are all reasons for belief hypothetical and dependent on our contingent, subjective desires and dispositions? Is epistemic rationality merely instrumental or categorical? Is epistemic judgment motivating? If yes, motivating in what way?

Agency and Responsibility: Can we directly choose what to believe and, if not, what about the fundamental and deontic notion of epistemic responsibility? Is there such a thing as character or is it merely fictional? If there is, can it play an integral part in our epistemic lives? If there is not, where does this leave our epistemic lives?

The following sections introduce and discuss, concisely but in some depth, many of these metaepistemological questions.

3. Metaphysics

A core component of epistemic metaphysics concerns ontology. Epistemic ontology explores questions about the existence and nature of epistemic properties like epistemic justification, warrant, rationality, entitlement, understanding, truth, wisdom, knowledge, epistemic duties, and norms like, “You ought to trust your senses, unless you have reason to doubt their overall reliability” and particular epistemic facts like, “The theory of evolution is well-justified, given the abundant empirical evidence.”

Here, focus is restricted to the epistemic properties of justification and knowledge and the respective justificatory and knowledge duties/norms/facts for at least three basic reasons: first, because of the more prominent position they have traditionally held in the history of epistemology; second, because research on them is relatively more advanced; and third, because considerations of simplicity and economy inevitably constrain the thematic boundaries of the article.

Justification and knowledge are treated in turn and not jointly, in spite of the fact that some positions about the two properties are strictly analogous, for two reasons: first, because this analogy can easily come apart (for example, in principle someone could be an antirealist about knowledge but not about justification); second, because the debates about justification and knowledge often develop independently of one another, and it would perhaps oversimplify the state of the debates if we ran the two together.

Now, beginning with epistemic justification, a traditional distinction that helps map the theoretical landscape of justification debates is that between epistemic realism and antirealism (though to see how hard it can be to distinguish the two see Dreier 2004). On the one hand, realists take epistemic justification to be a real, mind-independent property whose existence does not depend in any way on human cognizing. Thus, if a belief is justified, then this should be the object of discovery and not of invention (or construction). Accordingly, we should be able to understand that a justified belief instantiates the property of justification and that it is in virtue of this property that it is justified.

On the other hand, epistemic antirealists deny that epistemic justification is anything like a real and mind-independent property. Epistemic justification is considered a property (if at all) that is constructed out of the workings of evolved human cognizing and nothing over and above this. This is not to imply the rather naïve view that justification is made up “out of thin air” by the antirealist. Justification is still constrained by certain epistemic norms, facts or framework, although these are mind-dependent and are of mere local validity. For the antirealist, if a belief is justified, then it is justified in virtue of certain epistemic norms, facts, or framework, but this should not be overstated. Epistemic norms, facts, or framework are invented by cognizers and therefore epistemic justification is also invented. It is not the case that if a belief is justified, then we can somehow understand that it is justified because it instantiates a real property of justification.

As in metaethics, realists can be divided into reductionists and antireductionists. Reductionists can further be divided into analytic and synthetic reductionists. Analytic reductionists believe in the capacity of traditional a priori conceptual analysis to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would take epistemic justification to be, in principle, reductively analyzable to a more basic property like coherence, reliability, foundations, virtues, responsibility, evidence, probability, and so forth (compare Bonjour 1985, 1998; Goldman 1979; Zagzebski 1996; Conee and Feldman 2004; Vahid 2005; Sosa 2007; Heathwood 2009).

Synthetic reductionists are not so sanguine about traditional a priori conceptual analysis and its purported capacity to deliver illuminating, descriptive analyses of philosophically interesting concepts. Accordingly, they would deny that epistemic justification is, in principle, reductively analyzable to a more basic and informative property, but they would still cling to realism about epistemic justification because they would take it to be a natural kind property, somehow discoverable only by the a posteriori means of empirical science and not by the a priori conceptual analysis of traditional philosophical methodology (compare Jenkins 2007). Just as, for instance, we can discover by empirical means that “water is H2O molecules” or that “gold is the element with atomic number 79,” we can presumably discover the natural kind property that constitutes the essence of epistemic justification, or so the thought goes.

So far, we have seen analytic and synthetic reductionism. Both typically adhere to methodological naturalism, roughly, the view that naturalistic scruples constrain the right kind of philosophical methodology (compare Pollock and Cruz 1999). In other words, philosophical methodology should be empirically informed and cohere with our best naturalistic picture about ontology, epistemology, and so forth. Be that as it may, analytic and synthetic reductionists disagree about the exact content of methodological naturalism. Analytic reductionists are still optimistic about the method of a priori conceptual analysis while synthetic reductionists counter that conceptual analysis is rendered obsolete by the progress of the a posteriori methods of empirical science and that philosophy should be sensitive to this progress.

However, there are also antireductionist realists who are usually reluctant to embrace methodological naturalism, at least any form of chauvinistic methodological naturalism that would exclude the possibility of antireductionism from the outset. Antireductionists usually disagree with their fellow reductionist realists about the capacity of methodological naturalism to deliver illuminating philosophical results and, in particular, results about justification and, in principle, other normative properties (compare Moore 1903; Boghossian 2007; Cuneo 2007). They take epistemic justification to be a property that is real and mind-independent but not reducible to any more basic, natural property, that is, to a property that can be the object of study of the natural sciences and of empirical psychology, neuroscience, anthropology, sociology, and so forth.

Behind their suspicion of methodological naturalism may hover the intuition that normative properties do not seem natural. This suspicious attitude also helps explain their pessimism about the employment of methodological naturalism in metanormativity puzzles, for if normative properties do not seem in any profound way to be natural, then, perhaps, we should not insist on the employment of the restrictive philosophical methodology of methodological naturalism. The thought is that if normative properties are, indeed, non-natural in any profound sense then by insisting on a naturalistic methodology we will not be making any progress. We will only engage in a subtle begging of the question, or so the thought goes.

On the opposite theoretical side stand epistemic antirealists about justification. Antirealists deny that the property of epistemic justification is anything over and above what human cognizers construct and thereby invent. This is not to deny that there are, in some sense, justified beliefs in virtue of certain epistemic norms and facts or agents that justifiably believe that p or even corresponding epistemic oughts and duties. It is only to reject the distinctively realist idea that epistemic justification is a robust property somehow “out there” and our beliefs are justified to the extent that they instantiate that “out there” property. Rather, justification is something that emerges out of our evolved cognitive attitudes that are mental states and out of our culturally evolved epistemic practices and interactions that are social activities. The same goes for corresponding epistemic oughts and duties. They may exist, in some sense, but definitely not in the Archimedean “out there” sense that realists like to envisage.

Like realists, epistemic antirealists are a heterogeneous lot. Some may be subjectivists, others expressivists, error theorists, or relativists. Let us very briefly preview the rudiments of these families of theories. Subjectivists typically hold that judgments of justification report the agent’s noncognitive attitudes, valuations, pro-attitudes. For subjectivism, justification assertions like “p is justified, given my evidence” or attributions like “S justifiably believes that p” report the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth, for the belief that p or for S’s believing that p. Worthy of notice is that the speaker’s attitudes are reported and not expressed. The speaker is supposed, so to speak, to step back from his own attitudes, introspect and simply report these attitudes but not directly express them. So, if I say, “S justifiably believes that p,” according to a simple subjectivist theory I may be reporting my approval for the belief that p, but not directly expressing that approval.

The fine-grained distinction between reporting/expressing the speaker’s attitudes might seem like an insignificant detail but it is a distinctive feature of the theory that helps distinguish it from expressivist theories (compare Schroeder 2008a). Besides, subjectivism is usually understood to be a cognitivist theory while expressivism is usually understood to be a noncognitivist theory. To stipulate, a cognitivist theory is a theory that takes normative judgment to express descriptive mental states like beliefs while a noncognitivist theory is a theory that takes normative judgment (or at least some species of it) to express nondescriptive mental states like desires, proattitudes and sentiments. Obviously, although both subjectivism and expressivism may be labeled as broadly sentimentalist theories because they involve noncognitive attitudes, subjectivism is a cognitivist theory while expressivism is a noncognitivist theory.

Subjectivism is usually considered an implausible theory (see Schroeder 2008a) and, in fact, although it has some metaethical proponents (compare Wiggins 1987), to the best of my knowledge there are no obvious subjectivists in metaepistemology. But things are very different with regard to expressivism, which has had quite a few proponents recently. Expressivists take justification judgments to express (and not report) the speaker’s noncognitive attitudes like approval, endorsement, recommendation, assurance, reliance, plans, trust, desires and intentions. Justification assertions like “p is justified” or attributions like “S justifiably believes that p” express the speaker’s attitudes of approval, endorsement, recommendation, trust, and so forth for p or for S’s believing that p (compare Kyriacou 2012). They express (or “voice”) directly the speaker’s states of mind.

The third antirealist theory of epistemic justification is that of error theory (or, sometimes, fictionalism). Unlike subjectivism and expressivism, error theory does not invoke, one way or another, noncognitive attitudes. Error theorists take their justification judgments to purport to describe respective justification facts but deny that such facts really exist (compare Olson 2011a, 2011b). Given the absence of such facts (typically considered to be truthmakers), we end up with an error theory, namely, a theory that suggests that justification judgments are uniformly false; at least all first-order justification judgments are false (compare Olson 2011a, 2011b).

The fourth antirealist theory of justification is that of relativism. Relativism denies the existence of “real” justificatory epistemic norms and facts and stipulates that justificatory norms and facts are only relative to some indexical factor of mere local validity—usually the agent, or his society, culture, and so forth (compare MacFarlane 2005). Often, relativists are cultural relativists that think that there are no mind-independent norms and facts and that the only norms and facts that really exist are some culturally constructed and embedded norms and facts (compare Stich 1990; for a recent defense of cultural moral relativism see Velleman 2013; and for criticism see Kyriacou 2015). These culturally constructed norms and facts allow for justified beliefs but the justification is of only local validity.

There are many subdivisions under the banner of each of these families of theories that we cannot really dwell on here. There are, for example, many different expressivist theories, and it is really doubtful whether any two of these theories are identical. For instance, Allan Gibbard, one of the most prominent and influential expressivists and one of the first to extend expressivism from metaethics to metaepistemology, has held at least two expressivist theories, the early norm-expressivism (1990) and the later plan-expressivism (2003). There are other versions of expressivism in the literature too—habits-expressivism (compare Kyriacou 2012) and hybrid versions of expressivism such as Ridge’s (2007) ecumenical expressivism (for some discussion of epistemic expressivism see Chrisman 2012).

We have now introduced the rudiments of the major realist and antirealist theories of justification, but there are also other theoretical options that are somewhat harder to classify as realist or antirealist that deserve at least a short mention. For example, as in metaethics (for example, Korsgaard 1996), a Kantian constructivist might claim that norms and facts of justification are constructed out of a priori constitutive norms of rationality (for example, the universalizability or autonomy formulas) and obviate the distinction between realism/antirealism. She could claim that her theory cannot be properly classified as realist or antirealist but only as deontological. Categorical epistemic duties follow from the application of these constitutive norms of rationality but these duties are, ontologically speaking, neither realist nor antirealist.

Be that as it may, so far we have discussed the question of the ontology of justification and the various sorts of approaches to it, but there are also two important metaphysical challenges for a plausible metaepistemological theory that deserve some attention: the evolutionary challenge and the supervenience challenge.

The evolutionary challenge is more of a challenge to normative realism and realist understandings of justification/knowledge (moral and epistemic). As Sharon Street (2006, 2009) has argued, our evolutionary history prima facie conflicts with normative realism because it is very implausible to think that we evolved and our moral and epistemic attitudes were somehow mysteriously and finely attuned to track corresponding moral and epistemic facts. Such a realist “tracking account” seems implausible on a number of counts (ontological parsimony, clarity, mysterious causal connections, and so forth), especially if we think of a competing metaphysically lighter, mere Darwinian account that explains our normative attitudes and their content as largely shaped by the main mechanism of evolutionary change, namely, natural selection. There is no need to postulate moral and epistemic facts, Street argued, in order to have the best Darwinian explanation of how we came to have the normative attitudes we tend to have.

Thus, the evolutionary challenge for realists is to explain, in the best theoretical way, how our normative attitudes have largely been shaped by natural selection in consonance with robust normative facts. The problem for the realist is that it seems that Ockham’s razor should apply and redundant robust normative facts (and realism) should drop out of the picture of that theoretical explanation because a mere Darwinian, antirealist account can do all the explaining we need.

Street’s evolutionary challenge has stirred some fascinating discussion and some interesting realist rejoinders that we unfortunately have to skip here—for example, Setiya (2012), Enoch (2013), FitzPatrick (2014), Vavova (2014). Nevertheless, it is widely acknowledged to be an important challenge for metanormative realism. If realism is to be plausible, it has to explain in a plausible way how it can comport with our evolutionary history.

The supervenience challenge asks us to explain how epistemic properties (if any) relate to the natural world. This challenge is usually interpreted in terms of the widely accepted metaphysical constraint of the epistemic supervenience thesis (compare Kim 1988; Conee and Feldman 2004; Vahid 2005; Cuneo 2007). It is a metaphysical constraint because it suggests that any theory needs to explain how epistemic properties like justification supervene on more basic, natural properties in such a way that if two situations are naturalistically identical (and thereby indistinguishable) and the first realizes, say, epistemic justification then the second situation must also realize epistemic justification.

Metaphysically speaking, it cannot be the case that two naturalistically identical situations (at least in the epistemically relevant aspects) realize inconsistent epistemic properties. There must be some naturalistic difference at the base level that grounds some difference at the normative-supervening level, otherwise there is no good reason why the supervenient-normative properties should be inconsistent. To illustrate, suppose that there are two naturalistically indistinguishable cases where a dead body has been found. It would be unreasonable to think that the one case justifies the belief in a homicide while the second justifies the belief in a suicide—unless of course there is at least some relevant naturalistic difference in the two cases.
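The constraint can be put schematically. The following is only a minimal sketch and is not a formulation taken from the works cited above: the symbols $N$ (the set of epistemically relevant natural properties), $J$ (the property of realizing epistemic justification for a given belief), and the situation variables $x$ and $y$ are introduced here purely for illustration.

```latex
\[
\forall x\,\forall y\;
\Bigl[\,\bigl(\forall F \in N\;(Fx \leftrightarrow Fy)\bigr)
\;\rightarrow\;(Jx \leftrightarrow Jy)\Bigr]
\]
```

Read: if two situations agree on every epistemically relevant natural property, then they agree on justificatory status. Contrapositively, any difference in justificatory status requires some difference in the natural base, which is what the dead-body illustration demands. The sketch deliberately leaves the modal strength of the conditional unspecified.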

The epistemic supervenience thesis is a rather technical way to formalize the strong mundane intuition that ‘no double standards’ should be allowed in normative matters (epistemic, practical/moral, aesthetic, or other). Some theories can deal with the constraint rather easily while others seem to have difficulties with it. For example, reductionist realists have an easy explanation of the constraint. Intuitively, if two naturalistic situations are identical (at least in the epistemically relevant respects) and the first situation realizes justification as, say, coherence, then there should be no surprise that the other situation realizes coherence and thereby justificatory status.

On the other hand, antireductionists cannot offer the same account of supervenience as the reductionist because, crucially, they deny that epistemic justification is a reducible property. To see this, conceive for a moment of epistemic justification as a non-natural property, an irreducible property such that, whatever the natural facts are, they do not entail that a certain belief p is justified for S. Conceive now of two naturalistically identical situations and suppose that one of the two situations does realize justification. It seems that it remains at least an open question whether the second situation also realizes epistemic justification, exactly because the property is not reducible to a more basic property. No doubt, this is not to deny that antireductionists can somehow explain epistemic supervenience. It is only to note a distinctive challenge they face.

Expressivists usually attempt to explain epistemic supervenience as a merely conceptual and not as a metaphysical constraint (for discussion compare Hare 1952; Blackburn 1993; and Ridge 2014). For antirealists such as expressivists, epistemic supervenience is only an a priori norm of rationality that constrains the appropriate application of the concept of epistemic justification. If we have two naturalistically identical situations (at least in the epistemically relevant respects), and we judge that the one justifies a target belief p, then on pain of irrationality, we should also judge that the next situation justifies the belief that p.

One worry for antirealist explications of supervenience is that they reduce the constraint from a metaphysical principle to a conceptual one, and this seems to change the subject; in turn, expressivists might object that to insist that the constraint should be addressed in its metaphysical guise is to beg the question in favor of realism. At any rate, supervenience is a tricky but valuable philosophical concept, and there are questions about how to best interpret it (local or global, for instance) and how to best account for its intuitive character, but the discussion will end here. Enough has been said to showcase why epistemic supervenience seems to be a challenge for realists and antirealists and a desideratum for a plausible metaepistemic theory of justification.

Let us now turn to the ontology of knowledge. The metaepistemology of knowledge seems even less well-defined than the metaepistemology of epistemic justification. This is reflected in the fact that knowledge theorists do not usually speak in the metaphysically-loaded terms of realism/antirealism about knowledge. They often set up their discussion in terms of challenges and problems for a theory of knowledge like radical skepticism, the Gettier problem, the lottery problem, the dogmatism paradox, linguistic evidence, and so forth. To keep the distinctively metaepistemological tone of the article, let us follow Michael Williams (2001) and speak of realism/antirealism about knowledge. According to Williams (2001), and this is the understanding of knowledge realism/antirealism that we endorse in the ensuing discussion, realists think of knowledge as something real, invariant, and mind-independent. Antirealists simply deny that there is such knowledge.

So, for realists, whether “S knows that p” is a question to be decided by the corresponding, independent knowledge facts, whatever these may be (evidential, reliabilist, virtue-theoretic, and so forth). Knowledge status is not a construct or invention of human cognition of sorts. In contrast, antirealists deny that there is such a thing as robust knowledge and corresponding robust knowledge norms and facts. There is knowledge, of course, but not in the metaphysically-inflated sense that realists tend to envisage. Knowledge is true belief in accordance with certain knowledge norms and facts that are not mind-independent and not of universal validity. How to comment on antirealist knowledge norms and facts depends on the contours of the particular theory one favors (relativist, expressivist, error theory and so forth). At any rate, there is nothing over and above this sort of knowledge.

Realists divide again into reductionists and antireductionists, and reductionists further divide into analytic and synthetic reductionists. On the one hand, analytic reductionists think that the analytical project about knowledge is still viable in spite of the failures and pessimism that traditional conceptual analysis may occasionally inspire. The Gettier problem has, for example, inspired pessimism in many epistemologists about the prospects for an analysis of knowledge (compare Kirkham 1984; Fogelin 1994; Williamson 2000; Floridi 2004), but others remain unmoved and argue for sophisticated analyses of knowledge, like Pritchard’s (2012) anti-luck virtue-theoretic account. On the other hand, synthetic reductionists again tend to treat knowledge as a natural kind property that in principle should be discoverable by means of empirical inquiry. In this spirit Kornblith (2002) and Neta (2008) have argued that knowledge is just another natural kind.

In their turn, antireductionists deny that knowledge is reducible, analytically or synthetically. They tend to think that knowledge is irreducible to anything more basic and should be taken to be a primitive and sui generis concept, “the unexplained explainer” that is the most fundamental building block for an epistemological theory. Other epistemic concepts/phenomena like evidence, justification, probability, assertion, and skepticism should be explained in the light of knowledge and not the other way round. The most notable proponent of this “knowledge first” approach to epistemological theorizing is, of course, Williamson (2000).

Worthy of note is that Williamson (2000) puts forth his antireductionist theory not as a metaphysically-loaded theory, so it would be a mistake to hasten to infer that he is a non-naturalist about knowledge just because he is an antireductionist about knowledge. He is more interested in showing that the analytical reductionist project about knowledge is a “degenerate research programme.” In consequence, the safe thing to say is that although he is an avowed antireductionist, he shows no particular interest in the metaphysics of knowledge as such.

Antirealists about knowledge include expressivists and relativists as well as what we could call skeptics-as-error theorists. Expressivists like Gibbard (2003) and Chrisman (2007) claim that there are no “real” knowledge facts in virtue of which knowledge truths obtain. Building on Gibbard’s (1990) norm-expressivism about rationality, Chrisman (2007) suggests that knowledge is a normative concept we use to evaluate epistemic positions and, accordingly, to express approval for the norms in virtue of which true belief is formed. There are also relativists about knowledge, like Stich (1990) and MacFarlane (2005), who suggest that there are no independent or absolute knowledge facts but only constructed knowledge facts of mere local validity.

Skeptics about knowledge are typically also antirealists and one natural way to integrate their position in metaepistemological classification would be as error theorists (compare Unger 1971; Fogelin 1994; Kyriacou 2017). Skeptics deny that there are any real knowledge facts (at least empirical knowledge facts) and therefore imply that at least most of knowledge discourse is implicitly in a state of constant error (for discussion compare Hawthorne 2004). Knowledge assertions and attributions are almost uniformly false. Of course, we speak of knowing such-and-such but we do not know in reality because there is no such thing as knowledge. As a result, ordinary speakers are afflicted by semantic blindness about the concept of knowledge. They unwittingly speak as if they know, but philosophical reflection can eventually indicate that there is not much of real knowledge.

Thus far we have introduced realist and antirealist approaches to knowledge. Interestingly, however, some epistemologists argue that there is “real” knowledge but not of the demanding invariant sort that traditional realism (as stipulated above) presupposes. There is proper, “real” knowledge that fully merits the name but is context-sensitive. That is, it is knowledge where the demandingness of the standards of justification (in virtue of which we arrive at true belief) varies with context because of factors like the intentions, needs, stakes, goals, and so forth of the attributor. In essence, the standards of knowledge may shift from context to context.

Contextualists, though, suggest that at least some important portion of our knowledge talk comes out true because the standards of knowledge need not be so high that we could never satisfy them. Of note is that contextualism is a semantic view and not a metaphysical view, and as such it could easily be wed to an antirealist theory. For example, some cultural relativists might be understood as contextualists of a kind about knowledge. On such a view, the concept of knowledge picks out incommensurable, culturally constructed knowledge facts from cultural context to cultural context (for critical discussion of such views see Boghossian 2007). At any rate, discussion of contextualism about knowledge takes us into the field of semantics, which is discussed in section four.

A final note on how the supervenience constraint applies to knowledge. As the epistemic supervenience thesis applies to epistemic justification, it seems to apply to knowledge as well. That is, if two epistemic situations are naturalistically indistinguishable (at least in the epistemically relevant respects) and the first situation realizes knowledge, then in the absence of at least some relevant difference between the two situations, it would be irrational to deny that knowledge is realized in the second situation. No double standards are allowed in epistemology. Again, the supervenience constraint on knowledge has stirred some fascinating discussion but space restrictions oblige us to leave the topic here (for critical discussion of the mentalist supervenience thesis of Conee and Feldman 2004, see Greco 2010).

We have now talked a bit about the metaphysics of epistemic justification and knowledge. As has probably become obvious from the discussion, metaphysics is inextricably linked with semantics (and there is also a good methodological question of explanatory priority); semantics, therefore, makes for a natural follow-up section.

4. Semantics

Epistemic semantics typically deals with the meaning of epistemic declarative statements and often focuses on the more traditional concepts of justification and knowledge. Let us first present a famous semantic challenge for the meaning of normative predicates that has lately been applied to the subset of epistemic predicates as well (Jenkins 2007; Heathwood 2009; Greco 2015; Cuneo and Kyriacou 2017). This is Moore’s (1903) famous open question argument, which he applied to “goodness” and which led him to the conclusion that goodness is an indefinable predicate that picks out a simple and sui generis non-natural property. It can be formulated in this style of question: “Is (super)natural property N (for example, pleasure, desire, divine will, and so forth) goodness?” Or, in terms of more colloquial discursive contexts, “I can see that this is N (pleasurable, desirable, socially accepted, and so forth), but can’t see why this is good.”

Moore thought that competent speakers of English (and competent users of “goodness”) will find questions of this style wide open and that this intuition of semantic openness is evidence that goodness is irreducible. He thought so because he assumed that property identities can be discovered solely by a priori conceptual analysis and that they should be directly transparent to competent speakers of the target language. As applied to epistemic predicates like justification, the open question argument would go like this: “Is (super)natural property N epistemic justification?”; or “I can see that this belief (or system of beliefs) is coherent, intuitive, socially approved, reliably produced and so forth but is this really justified? I don’t see that!”; or “I see that this belief is coherent, self-presenting, and so forth, but so what? It does not make it justified.”

The open question argument has been transposed and applied to epistemic justification and epistemic rationality with the same antireductionist verdict as in the case of goodness (and other moral predicates). Interestingly, there are also echoes of the Moorean semantic openness idea in Williamson’s (2000) famous antireductionist argument about knowledge. Williamson (2000:31) says, for example, that “even if some sufficiently complex analysis never succumbed to counterexamples, that would not entail the identity of the analyzing concept with the concept knows. Indeed, the equation of the concepts might well lead to more puzzlement rather than less.” The Moorean semantic intuition that lies behind Williamson’s words is that any purported reduction of knowledge, even if it avoids counterexamples, will remain semantically open, and this indicates that the predicate is irreducible to a more basic property. It is a conceptual primitive that is not derivative of anything conceptually more basic.

As quickly became obvious, however, the open question argument is dialectically vulnerable because it relies on unwarranted assumptions about meaning and analysis. First, intuitions of semantic openness are inconclusive evidence for property non-identity. We might not have discovered the correct analysis yet and, besides, intuitions are often bad counselors (compare Frankena 1939).

Second, not all property identities should be immediately transparent to a competent speaker (compare Smith 1994). Some may of course be almost trivial like “vixen is a female fox,” but some others may require reflection and practice even for the relatively simple “circle is the figure with a circumference that is equidistant from the center.” Arguably, it might take some reflection (and drawing) for a competent speaker to grasp the property identity. Third, some property identities may not be discoverable by a priori conceptual analysis no matter how hard or ingeniously we try (compare Brink 1989; Kornblith 2002; Jenkins 2007; and Neta 2008). Some seem discoverable by the a posteriori means of empirical science like the natural kinds of water, gold, silver, salt, and so forth.

But in spite of its dialectical weaknesses, the open question argument has exerted significant influence on metaethics (compare Darwall and others 1992) and now seems to extend this influence to metaepistemology. Many think that there is something strongly intuitive to this argument that is hard to shake off. Perhaps what the argument captures is what Enoch (2013) calls the strong “just-too-different intuition,” namely, the intuition that normative properties are just too different to be reduced to anything more basic, like natural properties. Others, no doubt, think that the argument can be blunted and we can still argue for reductionist accounts of goodness, justification, rationality, knowledge, and so forth. To be sure, all parties to the discussion seem to consider it a serious semantic challenge that should be addressed, one way or another.

Those who accept the antireductionist result of the open question argument may interpret it (and actually have) in ways amenable to their overall views, realist or antirealist. Synthetic reductionists have seen it as evidence that an a priori conceptual reduction of epistemic properties is not possible but this constitutes no reason for thinking that an a posteriori reduction could not apply. So they proposed that perhaps epistemic properties are irreducible by conceptual analysis and, hence, this is why we have “open feel” semantic intuitions in Moore-style open question arguments.

This optimism, though, seems to be premature as there are good reasons to question whether sophisticated synthetic reductionism is all that promising. In the case of moral properties, Timmons and Horgan (1991) have devised a sophisticated Moore-style open question argument, the so-called “moral twin earth argument,” with the intent to thwart the optimism of synthetic reductionists. Inspired by Putnam’s (1975) seminal “twin earth argument” for semantic externalism, Timmons and Horgan have devised a thought experiment that allows us to test our semantic intuitions for the prospects of such a synthetic reductionism.

Timmons and Horgan contrasted their moral version of the twin earth argument with Putnam’s natural kinds version. They argued that, while in Putnam’s original thought experiment our intuitions suggest that the meaning of natural kind terms is not merely fixed internally (by other “meanings in the head,” so to speak) but externally by the natural world, in the moral version of the story our intuitions differ significantly. We tend to think that differences in the extension of, say, “right” merely reflect differences in internal normative theory, not external natural facts about rightness. This, obviously, contrasts with Putnam’s natural kinds version because these intuitive differences in extension in the twin earth experiments seem to reflect differences in what external natural facts we tend to think there are. We tend to think that there are no moral natural facts or kinds.

The exact details need not detain us here, but what is important is that, as in the case of Moore’s classic open question argument, in the moral twin earth argument our semantic intuitions seem to suggest that such a synthetic reduction is not in the offing. Even if there were a synthetic reduction of normative properties, we would tend to find such a reduction semantically open. To illustrate this, suppose for the sake of argument that somehow epistemic justification is reduced to some externalist property X (reliabilist, subjunctive tracking property, and so forth). Would this close the question “Is justification the externalist property X?” It seems that the question would still strike us as wide open.

Some others have seen the open question argument as evidence for an antireductionist but realist theory while others, more sympathetic to both antireductionism and methodological naturalism, have seen it as evidence for antireductionist antirealist theories like error theory or expressivism (compare Ayer 1936). They suggested that we have entrenched “open feel” semantic intuitions because no reduction (analytic or synthetic) of normative properties (moral or epistemic) is forthcoming. Since there are no such properties, the quest for reduction is therefore quixotic.

In particular, expressivism is a very interesting approach to moral and epistemic discourse because it breaks completely with the traditional and mainstream truth-conditional metasemantic framework. Unlike error theorists, relativists, and so forth, expressivists question and reject both factualism and cognitivism. While antirealists by definition reject factualism—namely, the idea that propositions are rendered true by corresponding robust facts regardless of the kind of discourse (descriptive or normative)—with the sole exception of audacious expressivists, they respect cognitivism. That is, they respect the thesis that normative propositions express descriptive mental states like beliefs that purport to pick out corresponding normative properties and facts.

The result of the joint rejection of factualism and cognitivism is a novel metasemantic framework that does not understand meaning on the basis of truth and reference (and truth-conditions) but on the basis of the notion of expression of states of mind (though, for a recent non-content-centric exposition of expressivism see Charlow 2014). Truth is not built in the rudiments of the semantic theory at first instance. Sophisticated, non-classical expressivists (compare Blackburn 1993, 1998) like to appeal to a deflationary account of normative truth, but even then truth is not a primitive component of their metasemantic theory, but only a derivative of the mental states expressed.

This allows expressivists to reap some important explanatory fruit but at the same time also incurs important theoretical costs. On the benefit side, expressivists can explain the operation of the open question argument; they are in accord with a naturalistic picture of ontology and epistemology consonant with our evolutionary history (compare Gibbard 1990, 2003); they can explain epistemic motivation (compare Kappel and Moeller 2014); they can explain normative disagreement as a mere conflict in noncognitive attitudes (compare Stevenson 1963); they are in line with linguistic evidence for the expression of noncognitive attitudes in normative discourse; and there is more besides.

However, the theoretical price they are called to pay also seems pretty high. For a start, at first sight truth and objectivity fall out of the picture, and the appeal to deflationism about truth seems only to push the problem a step back (compare Cuneo 2007). Normative disagreement also becomes a psychological matter of conflicting noncognitive attitudes and not a logical matter of truth/falsity. Consonance with empirical linguistics may also quickly become a problem rather than an attraction because sometimes we may express normative thoughts without expressing noncognitive states like approval, as the theory suggests that we should; at least this much of empirical linguistics cannot and should not be denied a priori by expressivism (for similar points compare Huemer 2008 and Yalcin 2012).

There are even more problems for expressivism, but a very serious problem that deserves at least a brief note is widely considered to be “the Frege-Geach problem” (compare Schroeder 2008a, 2008b; Charlow 2014). Drawing from Frege’s (1918/1997) discussion of negation, Geach (1960, 1965) first broached the problem for the early expressivist theory of emotivism, and ever since the problem has been at the forefront of expressivism debates. The problem consists in the fact that while expressivism works reasonably well in the context of asserted atomic normative sentences, in non-asserted logically complex contexts, the noncognitive content of the sentence seems absent.

This seems to have serious repercussions because in the case of deductive inference contexts it implies a fallacy of equivocation in regard to the normative predicate. This is so because the meaning of the predicate seems different from premise to premise and as sameness of meaning is a prerequisite for truth-preservation and validity, an obviously valid argument is left invalid. This means that expressivism does not account for a key semantic fact, namely, logical validity (for a round discussion of the problem see Schroeder 2008b).
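The difficulty can be made concrete with the standard schematic illustration of the problem, offered here only as a sketch. The predicate W (read: “is wrong”) and the placeholders a and b are illustrative assumptions and do not come from the text above.

```latex
% Requires amsmath for \text. W(x): "x is wrong"; a, b: arbitrary acts (illustrative).
\[
\begin{array}{ll}
\text{(P1)} & W(a) \\
\text{(P2)} & W(a) \rightarrow W(b) \\
\hline
\text{(C)}  & W(b)
\end{array}
\]
```

In (P1) the sentence W(a) is asserted, so on the expressivist account it expresses disapproval of a. In (P2) the same sentence occurs unasserted, in the antecedent of a conditional, and someone who accepts (P2) need not disapprove of a at all. If the meaning of W(a) is given by the attitude it expresses, its meaning seems to differ between (P1) and (P2), and this plainly valid instance of modus ponens would then rest on an equivocation.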

The Frege-Geach problem has provoked heated discussions about the plausibility of expressivism, and expressivists keep trying to address the challenge. Some have tried to build a so-called “logic of attitudes” while others have tried to build structure into the involved noncognitive attitudes that would help explain the syntactical features of logic and address the problem (compare Blackburn 1993, 1998; Gibbard 1990, 2003; and Schroeder 2008a). More recently, some have developed novel non-content-centric understandings of expressivism (compare Charlow 2014). Whether expressivism can be developed into a plausible, full-blown metasemantic framework is currently an ongoing research project for many philosophers (some pessimists, some optimists).

Perhaps unsurprisingly, expressivists have an additional problem with truth. For an expressivist, nondescriptive theory of meaning, the notion of truth need not come up in the explication of the meaning of sentences. What matters are the states of mind expressed and not truth-conditions, but truth is so valuable a notion that we cannot just give it up so lightly. For instance, we want to hold onto such truths as that “the theory of general relativity is well-justified, given the evidence” or that “murder is wrong.” But of course, expressivists are baffled about such truth-talk. Early emotivists like Ayer (1936) were happy to concede that moral statements are not truth-apt at all, but misgivings about giving up truth never subsided.

More sanguine and sophisticated non-classical expressivists observed that expressivism is a metasemantic framework that need not directly involve truth-conditions, but this need not imply that we cannot wed such a framework with a derivative notion of truth. Perhaps it is incoherent to assume that expressivism can be wed to certain traditional theories of truth like a correspondence theory, given the antirealism of expressivism, but we could still appeal to metaphysically light theories of truth like deflationism/minimalism. This is what Blackburn (1993, 1998) has proposed, for example. Blackburn suggested that we could be expressivists and antirealists but appeal to a deflationary theory of quasi-truth and quasi-facts in order to rescue normative quasi-truth. He has dubbed this project quasi-realism for obvious reasons: you can be an antirealist but mimic all the realist appearances, like the truth appearance (though, for the so-called “problem of creeping minimalism” about how to distinguish realism from quasi-realism see Dreier 2004).

Deflationism about truth understands truth-talk in a merely disquotational, ontologically deflated way. No commitment to a truth-property is involved and typically truth-talk is understood as a linguistic device that serves certain conversational and social functions. For instance, to say that “p is true” is semantically equivalent to saying that “p” (and vice versa). The predicate “…is true” (and cognates) may recommend that p, show approval of p, confidence about p, and so forth, and facilitate other conversational and social functions, but does not thereby commit to a robust truth property. Similarly, saying that “It is true that p is justified” merely shows approval, recommendation, and so forth, of the claim that p is justified but without any ontological commitment to a truth property (see Dowden & Swartz’s Truth for more discussion of deflationism).
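The equivalence just described is commonly summarized by what is called the equivalence schema. The following is a minimal sketch in standard notation; the angle brackets, which form a name of the proposition that p, and the sample belief b are notational assumptions not used elsewhere in this article.

```latex
% Requires amsmath for \text. Angle brackets name the proposition that p (illustrative notation).
\[
\langle p \rangle \text{ is true} \;\leftrightarrow\; p
\]
% An instance with an epistemic sentence, where b is an arbitrary belief:
\[
\langle b \text{ is justified} \rangle \text{ is true} \;\leftrightarrow\; b \text{ is justified}
\]
```

Roughly, the deflationist holds that instances of this schema tell us all there is to know about truth, which is why no robust truth property needs to be invoked.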

Of course, deflationism as a theory of truth faces a number of important independent objections that carry over to the expressivist project. Here are a couple of objections. First, it is very unclear what grounds normative truth in the absence of normative facts (compare Cuneo 2007). Truth seems to presuppose a grounding relation that confers truthmaking, but it is not clear what grounding and truthmaking antirealist expressivism can offer (for discussion of grounding see Schaffer 2013). So the question remains: If a normative sentence is true, it is true in virtue of what? Second, we often ascribe truth in second-order sentences like “It is true for expressivism that realism is false and antirealism true,” and there is a question about how the expressivist can explain such truth-talk if there is no corresponding robust truth about the matter.

To conclude our discussion of expressivism and return to mainstream truth-conditional semantics, there are various truth-conditional theories of justification/knowledge with different takes on semantics. There is of course traditional invariantism and contextualism about justification and knowledge. Traditional invariantism takes justification/knowledge to be absolute/univocal concepts whose meaning remains invariant from context to context (compare Unger 1971; Fogelin 1994; and Kyriacou 2017). Invariantism can then be glossed either in more moderate terms or more demanding terms. Moderate invariantism about knowledge can, for instance, be explicated in terms of a safety principle, that is, a principle that roughly suggests that the true belief involved in knowledge could not easily have been false. Such safety-based, neo-Moorean approaches can be found in Sosa (1999), Williamson (2000), and Pritchard (2007, 2012).

More demanding invariantism about knowledge can be explicated in terms of a sensitivity principle, that is, a principle that roughly suggests that the true belief involved in knowledge is sensitive to falsity: if the belief were false, it would not be believed. Interestingly, one strand of demanding invariantism may lead to skeptical invariantism because it seems to indicate that all logical (relevant or irrelevant) possibilities of error should be taken into consideration and ruled out. Given that we almost never can rule out all logical possibilities of error, we inevitably embrace skepticism about knowledge. Of course, sensitivity theorists need not be, and some have not been, skeptics. Nozick (1981), for example, famously accepted a sensitivity condition on knowledge but rejected the intuitive condition of closure under known entailment and so escaped the embrace of skepticism; at least this much he thought (for critical discussion compare Fogelin 1994 and Hawthorne 2004).
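The safety and sensitivity principles glossed above are often stated with subjunctive conditionals. The following is a rough sketch under assumed notation: the belief operator B_S and the arrow for the subjunctive conditional are illustrative conventions, not notation taken from this article, and the formulas state only the bare modal conditions, not any author’s full analysis of knowledge.

```latex
% Requires amsmath and amssymb.
% B_S p       : S believes that p (illustrative notation)
% \boxright   : subjunctive conditional, "if it were that ..., it would be that ..."
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}
\begin{align*}
\text{Safety:}      &\quad B_S\,p \boxright p \\
\text{Sensitivity:} &\quad \neg p \boxright \neg B_S\,p
\end{align*}
```

The two conditions look like contrapositives, but subjunctive conditionals do not in general contrapose, so a belief can be safe without being sensitive. This is part of why safety yields a more moderate invariantism than sensitivity does.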

Some other philosophers simply repudiate the semantic invariance assumption in favor of semantic contextualism. Attributor contextualism takes justification/knowledge to be context-sensitive concepts, that is, concepts whose meaning varies due to contextual factors (stakes, interests, needs, goals, and so forth of the attributor). Such contextual factors induce a conversational shift of epistemic standards between high and low standards depending on the discursive context. Philosophers like Annis (1978), DeRose (1995), Lewis (1996), Cohen (1998), Williams (2001), and Wedgwood (2008) have proposed contextualist accounts of justification/knowledge that analyze meaning as context-sensitive.

More recently, novel understandings of invariantism and contextualism have been proposed. These novel understandings are at least partly motivated by empirical linguistic evidence of how we occasionally use the concepts of justification/knowledge. Subject-sensitive invariantism proposes that the meaning of justification/knowledge is sensitive to the invariant subject’s practical interests, stakes, goals, and so forth (and not the attributor’s) and makes much of the importance of knowledge for assertion and practical reasoning (compare Hawthorne 2004). Contrastivism takes into consideration the contrastive and comparative element in justification/knowledge discourse. For example, if I say “S knows that p,” this might conversationally imply that “S knows that p rather than q” (compare Schaffer 2004).

Questions of epistemic meaning go far beyond what we presented, but we have to pause here. In the next section we turn to (meta)epistemology, namely, the epistemology of epistemology and, in particular, the (meta)epistemology of epistemic justification and knowledge.

5. (Meta)Epistemology

Let us pause for a moment to take stock. We have explained how metaepistemological questions and puzzles arise out of the deontic appearance of our cognizing and its concomitant intrinsic normativity. Accordingly, we explained that human cognizing seems to implicate categorical epistemic duties and obligations; that is, there are propositions we ought to (dis)believe and practices that are (in)correct to employ from the epistemic point of view. Finally, we explained that justification and/or knowledge and corresponding epistemic oughts may be understood realistically or anti-realistically, or even otherwise (for example, in Kantian constructivist style) and delved a bit into the semantic aspect of things.

The obvious (meta)epistemological question now is how we get to know, or at least have justified belief, of such epistemic oughts and obligations (realist, antirealist, or other). That is, how we get to know, or at least have justified belief, about what we ought to believe or what doxastic practices, habits, and methods we should employ. To pursue this metaepistemic question we need first to stipulate a fundamental and very contentious distinction, namely, the distinction between epistemic internalism/externalism. The distinction is so fundamental (indeed, we have not managed to avoid it entirely so far) because the epistemological theory we end up with depends on how we construe it and take sides about it. Given that so much hinges on the distinction, it should come as no surprise that the distinction is so contentious that it is even debatable how best to construe it, and there are a number of ways of doing so (see Poston’s Internalism and Externalism in Epistemology for discussion).

One standard way is to construe it in terms of cognitive accessibility, namely, the reality or not of cognitive accessibility to facts/evidence/reasons that support the target belief p. According to the accessibility reading, epistemic internalism suggests that S justifiably believes that p iff S has cognitive access to epistemic reasons in support of p (compare Chisholm 1966). Epistemic externalism simply consists in the denial of epistemic internalism. S can justifiably believe that p even if S has no cognitive access to epistemic reasons in support of p. If, for example, the belief is produced by a reliable belief-forming process, then S can justifiably believe that p (even without access to supporting reasons). There may of course be cases of justified believing that enjoy cognitive access to reasons in support of p, but for externalists this is not a prerequisite for justification (compare Goldman 1979).

A second reading of the distinction is laid out in terms of mental states rather than accessibility (compare Conee and Feldman 2004). According to the mentalist reading, epistemic internalism suggests that S justifiably believes that p iff S has mental states that count as evidence in support of p. Epistemic externalism again denies this. S can justifiably believe that p even if S has no mental states that count as evidence in support of p (compare Greco 2010). The mere satisfaction of some external conditions (reliability, counterfactual tracking, and so forth) suffices for justification or knowledge.

Of note is that mentalism implies no accessibility to reasons for justification. All that is required for justification are evidential mental states, but the agent need not have access to, or conscious awareness of, such mental states or reasons. In this sense, mentalist internalism concedes to access externalism that cognitive access is not required for justification but still upholds that justification supervenes on and is fixed by mental states. At any rate, for the purposes of this article we stick to the more standard accessibility reading of the epistemic internalism/externalism distinction.

With this partial clarification of the epistemic internalism/externalism distinction in hand, we can now revert to the question of how we know, or at least have justified belief, about epistemic duties and obligations. Recall that ordinary epistemic discourse may give some prima facie support to epistemic deontologism and that the traditional (and more natural) interpretation of this has been in internalist contours. We have epistemic duties and obligations, like the general and a priori “You ought to abide by the relevant evidence and pursue the truth” or the particular and a posteriori “You should believe what Mary says because of her relevant expertise,” and these are reflectively accessible. Careful reflection can, in principle, indicate the epistemic duties and obligations an agent may have.

Of importance for the epistemology of epistemic oughts and duties is also the ontological distinction between epistemic realism and antirealism. Realists would claim that there are real, mind-independent epistemic duties and obligations that vindicate the deontic appearance and would offer different epistemological stories of how we get to know them, or at least have justified belief about them (foundationalist, coherentist, foundherentist, reliabilist, virtue-theoretic, and so forth).

To use a toy theory as an example, a foundationalist analytic reductionist would suggest that categorical epistemic duties are either non-inferentially justified or inferentially justified. Non-inferentially justified beliefs are beliefs that are justified but not in virtue of being based on other beliefs. There are various ways in which beliefs could be justified without recourse to other beliefs: by means of non-doxastic states (for example, Fumerton 1995; Bonjour 2003); or self-presenting doxastic states (compare Chisholm 1966; McDowell 1994); or by means of belief-independent belief-forming processes (compare Goldman 1979); or in the case of a priori basic beliefs, in virtue of conceptual content (compare Bonjour 1998). To take the latter case, for instance, it could be suggested that “the duty to proportion belief to the relevant evidence” seems a priori prima facie justified in virtue of conceptual content for anyone rational with proper understanding of the meaning of the proposition.

By contrast, inferentially justified beliefs are beliefs that are justified in virtue of other beliefs/reasons. If the beliefs/reasons for p are sufficiently strong, then we have an obligation to believe that p. If, for example, I say that “The thing to believe is what Mr Poirot has concluded,” this should be further supported by epistemic reasons, that is, reasons that involve relevant evidence, like, say, Poirot’s character traits of integrity, attentiveness, and investigative dexterity. Such a traditional foundationalist model of epistemic duties and obligations would be broadly internalist and intuitionist. It is internalist because we need access to epistemic reasons for justification, and intuitionist because intuitions should play a distinctive role in identifying at least non-inferentially justified duties to believe.

Unfortunately, both features of such a theory are very controversial. To begin with, externalists would deny that internalism is a plausible constraint on epistemic justification. First, they would suggest that it over-intellectualizes our everyday cognitive practices and processes and thereby distorts them beyond repair. Our cognitive practices and processes need not be conceived as so reflective and intellectual (compare Goldman 1979; Plantinga 1993; Sosa 2003, 2007; and Greco 2010). Second, what seems to really matter in our cognitive endeavors is cognitive success (of sorts) out of cognitive ability, whether the goal is truth, knowledge, or something else. So accessibility to facts/reasons does not really matter for justification. What matters is reliability through cognitive ability for truth and knowledge.

The intuitionist component does not fare any better. First, intuitions are often quite unreliable, and this psychological fact is unsurprising if we consider the evolutionary and cultural origins of our intuitions (compare Haidt 2012, and Kahneman 2011). So there is a clear question of why we should trust our intuitions, especially when they conflict with the intuitions of epistemic peers we esteem and trust as cognizers (for a similar point compare Setiya 2012). Second, it is not clear what sort of epistemic facts about duties we intuit when we invoke our intuitions. Surely such facts do not seem natural, in the sense that they do not seem to be part of the natural world and the corresponding object of study of the empirical sciences. Equally, non-natural (epistemic, moral, or other) facts sound mysterious, or queer if we are to recall Mackie (1971), not to mention the question of how we could reliably (and causally) track such facts if they exist (compare Street 2006, 2009; Olson 2011b).

Third, suppose our intuitions are at least somewhat reliable and sometimes, somehow, do track corresponding epistemic facts about duties. This much we can suppose with some justification because, if our intuitions were completely and universally unreliable, then it would seem that we have no rational ground for the starting point of an inquiry. So, it seems that too much skepticism about intuitions might be self-defeating for any epistemic endeavor and lead to global skepticism, something that almost all epistemologists find unpalatable (pace Unger (1971), compare Plantinga 1993; De Cruz and others 2011; Vavova 2014). In addition, if we also assume so-called factualism, namely, the ontological idea that it is robust facts (of sorts) that stand as truthmakers for propositions, then there must be epistemic facts that render propositions about epistemic duties true.

Of course, internalist deontologism is far from being the only game in town. Externalists typically reject both epistemic deontologism and its internalist underpinnings as implausible. Externalists contend that the idea of internally construed epistemic duties and obligations is rendered obsolete by the advent of the more sophisticated scheme of externalism and, therefore, we should not take the deontic appearance of ordinary epistemic discourse at face value; at least not in the traditional internalist mode of its interpretation (for example, Greco 2010).

Instead, externalists focus on the reliability of cognitive processes (perception, memory, induction, deduction, introspection, and so forth), where reliability is typically understood in terms of a relatively high ratio of truth output. The basic insight is that if a cognitive process systematically and reliably delivers truth, then the belief-output of such a reliable process can be considered justified. No Cartesian-style reflective access to duties and supporting reasons is required for justification (and potentially knowledge). We can have justification and knowledge without further reflective justification or epistemic duties. It is no accident that externalist theories like reliabilism are often understood as a form of epistemic consequentialism (see Dunn’s Epistemic Consequentialism), that is, as a theory that identifies epistemic value with the maximization and promotion of the goal of truth and the minimization and avoidance of error.
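The idea of reliability as a “relatively high ratio of truth output” can be illustrated with a toy computation. The sketch below is only an illustration under stated assumptions: the function names, the sample track records, and the 9/10 threshold are hypothetical and introduced here, and no reliabilist theory discussed in this article fixes a particular threshold.

```python
from fractions import Fraction


def truth_ratio(outputs):
    """Return the fraction of true beliefs among a process's outputs.

    `outputs` is a list of booleans: True for a true belief,
    False for a false one.
    """
    if not outputs:
        raise ValueError("a process with no outputs has no truth ratio")
    return Fraction(sum(outputs), len(outputs))


def is_reliable(outputs, threshold=Fraction(9, 10)):
    """Toy reliability test: the truth ratio meets a stipulated threshold.

    The 9/10 default is a hypothetical choice made for illustration only.
    """
    return truth_ratio(outputs) >= threshold


# Hypothetical track records for two belief-forming processes.
perception = [True] * 19 + [False]            # 19 true beliefs out of 20
wishful_thinking = [True] * 3 + [False] * 7   # 3 true beliefs out of 10

print(truth_ratio(perception), is_reliable(perception))
print(truth_ratio(wishful_thinking), is_reliable(wishful_thinking))
```

On this toy picture, a belief counts as justified simply in virtue of being the output of a process that passes the test; nothing in the computation requires the believer to have reflective access to the track record.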

There are a variety of externalist positions in the marketplace of epistemology: Goldman’s (1979, 1986) process reliabilism; Plantinga’s (1993) proper functionalism; Bergmann’s (2008) Reidian externalism; Sosa’s (1991, 2007) and Greco’s (2010) virtue-theoretic externalisms and more. It is interesting to note that some externalist positions make important concessions to internalism and in essence come up with hybrid theories of justification and knowledge that aspire to a reconciliation of sorts between the two camps.

A good example of this is Sosa (1991, 2007). Sosa distinguishes between animal knowledge and reflective knowledge. Animal knowledge is simply “apt” knowledge that is produced by reliable, virtuous faculties, but human knowledge is the distinctively reflective sort of knowledge where reasons in support of the belief are accessed. Reasons are supposed to spring out of a coherence testing of the belief. That is, for human-reflective knowledge, production by the relevant virtuous faculty is not sufficient. Reasons in support of the belief should be accessed and these should come via a coherence test.

Hybrid accounts like Sosa’s seem better poised to fend off the often heard charge against externalism that it leaves epistemic normativity out of the picture because it overlooks the question of what one ought to believe (compare Fumerton 1995; Brandom 2000). More precisely, the charge is that it allows only for mechanical, reliable belief-forming processes that mostly deliver true beliefs and therefore ignores the further requirement for reflection about epistemic oughts. Sosa (1991, 2007) can alleviate this worry because he allows for a reflective element that can monitor the operation of the processes and even ponder which process one ought to employ and rely on. This seems to be an advantage of Sosa’s (1991, 2007) theory over purely externalist theories.

We have now seen some of the basic issues surrounding the epistemology of epistemology. Let us now turn to reasons for belief and epistemic psychology.

6. Reasons for Belief and Epistemic Psychology

Mainstream externalists dislike the whole idea of talking about (at least reflectively accessible) epistemic duties and reasons. For one thing, they tend to think that it is overly intellectualistic and, therefore, distorts ordinary cognitive practice. We neither usually appeal nor need to appeal to reasons for belief and even when we do we cannot really choose what to believe. Belief is involuntary and usually comes swiftly and effortlessly, plausibly for evolutionary psychological reasons. So, externalists are often also skeptical about the dialectical import of the deontic appearance and sometimes even of its concomitant normativity. The concepts of deontology, internalism and sometimes normativity do not ring very well to their ears.

Admittedly, there is some insight in externalist misgivings about reasons-talk and the deontic appearance, but it is also hard to expel talk of reasons and duties from epistemological theorizing. Think for example of Locke’s (1690/1975: 687) pithy aphorism that “those who believe without reason are in love with their own fancies.” As we have seen, this conflict of externalist and deontological/internalist intuitions seems to leave us in a bind because it is not clear which intuition we should discard (or even how we should attempt to reconcile them). Hence, the resolution of the conflict of internalist/externalist intuitions constitutes one of the most serious challenges for any epistemological theory.

Be that as it may, in this section we set aside externalist misgivings about reasons and introduce basic issues surrounding reasons for belief and epistemic psychology. After all, even externalists give some reasons in support of externalism. First, let us distinguish epistemic from non-epistemic reasons, such as moral, pragmatic, prudential, aesthetic, and political reasons. One way to delineate epistemic reasons from other kinds of reasons is to emphasize that they stand as reasons for belief from the epistemic point of view, that is, from the point of view of commitment to some epistemic goals/values, whatever these may be: justification, truth, knowledge, understanding, wisdom (or even some other). Suffice it to say that, very roughly, epistemic goals are goals that promote our cognitive contact with reality.

Non-epistemic kinds of reasons are not oriented towards distinctively epistemic goals. Reasons that are oriented towards practical utility are often, somewhat loosely, called pragmatic, and they are akin to a consequentialist/instrumental understanding of practical reasons. So, if for instance you are cynical enough to believe in divine existence simply because it is consoling or because you find Pascal’s wager cogent, then you have pragmatic reasons for such belief but not epistemic reasons. Even if you should believe in God because that is what your relevant evidence rationally prescribes, if you in fact believe because it is consoling or because you find Pascal’s wager cogent, then you believe for the wrong kind of reasons.

Epistemic reasons are also closely connected to moral reasons. Indeed, it is sometimes said that epistemic reasons have a moral flavor because they are reasons that should regulate behavior and such reasons cannot fail to be in some sense moral/practical (compare Clifford 1877; Cuneo 2007). Even if this is right, which is a moot thesis, epistemic reasons would make for a very distinctive species of moral reasons, perhaps so distinctive that they merit a distinctive appellation too: epistemic instead of moral/practical. They are reasons that have to do with the proper regulation of doxastic behavior and not just any kind of practical behavior. Typical doxastic behavior seems different from typical practical behavior at least in terms of conscious and direct control of the behavior.

Much more needs to be said about the classification of reasons, but I have said enough for an intuitive grasp of what might make epistemic reasons different from other kinds of reasons. Let me close this topic with two concluding observations. First, epistemic reasons may be sensitive to moral/practical reasons (and other reasons as well). To see this, think of the case of a cosmologist who by trade has a duty to inquire about the origins of the universe but also has to attend to time-consuming filial duties towards her ageing parents. Surely, it is hard to satisfy both conflicting duties because the time dedicated to filial duties is taken away from the work and reflection that her epistemic duties demand. So, there must be a tradeoff between epistemic and moral/practical reasons.

Moral/practical reasons may even affect epistemic reasons in a more dramatic way. You may have good epistemic reasons to believe that your business partner and childhood friend is cheating you, but you might have moral reasons to trust him. So, it seems that epistemic and moral reasons may pull in opposite directions, and the interesting question is what sort of reasons have the upper hand in such situations. In such cases, we face a moral-epistemic obligation dilemma.

Second, a short note on the so-called “wrong kind of reasons problem” is due (compare Rabinowicz and Ronnow-Rasmussen 2004; Olson 2004; Schroeder 2010; Heuer 2010; Kyriacou 2013). It is a problem that concerns all propositional attitudes (not just belief), and it first surfaced as a problem for so-called buck-passing accounts of value. Buck-passing accounts of value contend that something is valuable iff it has a property that elicits pro-attitudes like approval, admiration, desire, and so forth (compare Scanlon 1998). The problem now is that something may have a property that elicits pro-attitudes for the wrong kind of reasons. It is not enough that something is valuable. This is beside the point.

The point is that what we judge as valuable should be so judged for the right kind of reasons. If, for example, I admire a truly remarkable painting just because this will please the painter, I have the right attitude concerning its aesthetic value for the wrong reasons. In the epistemic case, if you endorse a truly justified belief as justified because it is consoling for you, and not because it is evidentially grounded, reliably produced, or what have you as the correct justificatory story, then you have the right kind of doxastic attitude but for the wrong kind of reason. The wrong kind of reasons problem poses a challenge for any metanormative theory and it is also a challenge for a theory of epistemic justification. We need a theory of epistemic justification that would get round the problem and have agents believe justifiably for the right kind of reasons.

A further important distinction about epistemic reasons is the distinction between hypothetical/instrumental and categorical reasons (or rationality). One position holds that epistemic rationality is merely instrumental. We have certain (or opt for some) epistemic goals like truth and knowledge and then we try to satisfy them in the best possible way (to the extent that we are rational). According to instrumental epistemic rationality, there are no reasons for belief going over and above these conditional/hypothetical epistemic goals. For example, if we have the hypothetical goal of truth about p, then epistemic rationality is merely instrumental in the sense that it should find the best means to satisfy the goal. In other words, epistemic rationality is instrumental, and, as such, it should be understood in terms of hypothetical requirements like ‘If you aim for the truth of p, then do such-and-such, say, employ reliable belief-forming processes’ (compare Kornblith 2002).

Their opponents disagree and reject instrumental epistemic rationality in favor of categorical rationality. They contend that we may have reasons for belief, even if we are lacking some conditional epistemic goal (compare Kelly 2003). For example, I may have no desire to find the truth about who killed my boss but I may still have a binding (external/categorical) reason to believe that it was John because there is abundant and accessible evidence at my disposal that confirms this. In a sense, whether I care about the truth of the matter or not, I have sufficient epistemic reason to believe that John is the murderer. Epistemic rationality constrains what I should believe unconditionally, that is, independently of my desires, interests, or adopted goals. In this sense, epistemic rationality should be understood in terms of non-optional, categorical requirements like “Given the evidence, you ought to believe that such-and-such (here, that John is the murderer).”

A related question about reasons for belief that may be inspired by a parallel discussion in metaethics concerns the nature of reasons for belief. As in metaethics there has been considerable discussion about the nature of reasons for action and corresponding internal/external reasons for action, the same fruitful discussion could be opened in metaepistemology about reasons for belief. Internal reasons are reasons that depend on the agent’s subjective motivational set (desires, intentions, dispositions, plans, goals) while external reasons are reasons that are independent of the agent’s subjective motivational set (see Williams 1990; also Turri 2009).

For example, I may have a reason to believe or inquire about p only if I desire or care about the truth about p. According to this internal reading of having a reason, my having a reason depends exclusively on my caring and desiring to know about p. Were I not to care, I wouldn’t have a reason to believe or inquire into p at all, which is to restate the instrumental conception of epistemic rationality. But there is also an externalist reading of the notion of having a reason. For example, I might say “Sophie has a reason to believe that John is from New York” and imply that she has a reason to believe this insofar as she is rational, independently of whether she cares about the truth of the matter. Even if she couldn’t care less, it remains an epistemic fact that she has a reason to believe that John is from New York, which is to restate the categorical conception of epistemic rationality. There are epistemic facts about what (categorical) reasons for belief we have (compare Boghossian 2007; Cuneo 2007).

A different issue that has recently drawn attention, and that also parallels an analogous discussion in metaethics, concerns the relation between epistemic judgment and motivation (compare Kappel and Moeller 2014; Grajner 2015). This is the so-called judgment internalism/externalism controversy. Like moral judgments, epistemic judgments like “p is justified” or “S’s belief that p is true” or “I know that p” seem to implicate certain kinds of motivations. They seem to have an internal, conceptual connection with certain motivations. Judgment internalists hold onto the intuition of an internal connection between judgment and motivation, while judgment externalists, though they accept the systematic character of the connection, deny the internalist intuition and suggest that the connection may be severed in certain circumstances. Hence, on their view, the connection is only external and defeasible.

For example, for judgment internalists, sincerely saying that “p is justified” implicates that I have at least some reason or doxastic motivation to believe that p, rely on p, recommend to others that p, and approve of the norms, habits, and processes in virtue of which p was formed; saying that “S’s belief that p is true” implicates some reason to also believe that p, approve the norms in virtue of which p was formed, give some credit to S, and so forth; and asserting that “I know that p” implicates, all other things equal, termination of inquiry. The inquiry about p is, at least for the time being, settled (compare Kappel and Moeller 2014).

In contrast, externalists would question the conceptual connection and its modal strength (compare Kvanvig 2003; Grajner 2015). They might highlight the conceivability of Spock-like agents who make sincere normative judgments but remain entirely cold and unmotivated by those judgments. In the moral case, such personas are called amoralists, and the typical characters externalists have in mind are psychopaths, sociopaths, social delinquents, and so forth, who seem capable of sincere moral judgment without any corresponding motivation for action. (Perhaps we can call the epistemic counterparts of amoralists “acognitivists.”)

As in the parallel case of moral motivation in metaethics, expressivists seem able to easily explain the systematic (if not strictly necessary) connection between epistemic judgment and motivation for belief, reliance, termination of inquiry, and so forth. This is the case because expressivism, as a noncognitivist theory about the nature of the mental state expressed in epistemic judgment, can appeal to the motivating nature of noncognitive states to explain this systematic connection with motivation. Desires, intentions, and even epistemic sentiments may be called upon for explanatory psychological work.

For example, if I say that “p is justified” the expressivist will suggest that I express some sort of approval for p or the norms that license it or the doxastic habits in virtue of which it was reliably formed (for example, Kyriacou 2012). The bottom line of such a noncognitivist account is that the relevant desire or intention for belief might be expressible in such judgments of justification. Similar stories would be told for other sorts of epistemic judgments. Thus, epistemic motivation is easily explained as a psychological phenomenon in an expressivist framework.

Cognitivists about epistemic judgment, though, cannot directly appeal to the same style of psychological explanation as the expressivist because they suggest that epistemic judgments express beliefs and beliefs do not seem to be intrinsically motivating states. They are representational states and as such they do not seem to be engaging the will at all. Saying that “Paris is a city” or that “It’s a hot day today” may be representational, but it is also “motivationally inert.” As is often said, beliefs and desires have an opposite direction of fit with the world (compare Anscombe 1957; Smith 1994). Beliefs are representational states in the sense that they aspire to represent aspects of the world and get at the respective truth. They are mind-to-world directed states. Desires are nonrepresentational states in the sense that they urge for changing the world in order to be satisfied (compare Smith 1994). They are world-to-mind directed states.

This leaves cognitivists with an obvious puzzle about epistemic motivation. They could insist that epistemic beliefs can motivate, but this seems entirely ad hoc because beliefs seem to be representational and motivationally inert states. Alternatively, they could suggest that beliefs may induce corresponding desires, but then they will have to explain how this systematic inducing works given that beliefs and desires are so-called Humean “distinct existences” (compare Korsgaard 1986; Smith 1994). That is, there is no necessary (logical or psychological) connection between belief and desire, as the two states can come apart. There can be belief without any corresponding desire and vice versa.

They could even appeal to a sophisticated hybrid (or so-called “ecumenical”) picture of normative judgment and suggest that normative judgments express both cognitive and noncognitive states (compare Copp 2001; Ridge 2007; Grajner 2015). In any event, cognitivists have some extra explaining to do that noncognitivists do not. This seems to give noncognitivists the edge with regard to understanding the psychological relation between normative judgment and motivation, and it is considered one of the best arguments in favor of noncognitivism. Nevertheless, one should not underestimate the resourcefulness of the cognitivist tradition and its ability to account for moral and epistemic motivation (for ecumenical cognitivism see Grajner 2015).

With this much about reasons for belief and epistemic psychology, in the next section we briefly visit the notions of agency and responsibility.

7. Agency and Responsibility

Epistemic deontology has seemed especially implausible to many externalists because it does not comport with a plausible picture of epistemic agency. That is, epistemic deontologism assumes that as rational and responsible agents we have epistemic duties about what to believe and to the extent that we do not conform our doxastic behavior to these duties we are responsible and culpable. The problem now is that a plausible account of epistemic agency does not allow for the responsibility that deontologism requires and, since this is a prerequisite for deontologism, deontologism seems implausible.

This is arguably the case because doxastic behavior typically seems beyond the reach of our direct and conscious control. To illustrate this, let someone try to believe something that she considers an obvious falsehood, or at least something she considers implausible, like “1+1=3” or that “Paris is not the French capital.” Let us even offer a powerful motive for the fruition of this cognitive exercise, say, 10,000 pounds for sincerely acquiring the belief. But alas, it seems psychologically impossible because direct belief-fixation is not up to us in any profound manner. Unlike ordinary decisions about what to do (for example, dancing or standing up), we can’t directly decide what to believe.

In light of the fact that an intuitive understanding of responsibility is explicated in terms of minimal control of action, and control in terms of relative freedom to choose from open alternatives (for example, Pink 2004), one may consider epistemic deontologism a lost cause. That is, a lost cause because, on the one hand, it relies on the existence of epistemic responsibility but, on the other hand, it does not allow for the control and freedom of choice that are a necessary prerequisite for such responsibility. This is “the doxastic involuntarism problem” for (internalist) epistemic deontologism (for the classic statement compare Alston 1988).

As one might expect, there have been a number of responses on behalf of deontologism. Some have flatly denied the involuntarism intuition and insisted that we can at least sometimes directly decide what to believe (for example, Ginet 2001). Others have denied that responsibility requires direct control (for example, Feldman 2001). We might have no direct control over the working of our heart, but we might still be to some extent responsible for its proper working because we can take steps to indirectly enhance its functioning. After all, we can have a healthy lifestyle that includes a pro-heart diet, exercising, regular medical exams, and so forth.

Similarly, we can take steps to indirectly enhance our cognitive functioning. We could, for example, cultivate reliance on methods, habits, and processes that are reliable belief-producers, that is, that produce a high ratio of true beliefs. If I am myopic, for example, I could cultivate the habit of relying on my glasses for visual perception because I know they make for reliable perception, or I could rely on certain sources of information that are generally reliable, like a trustworthy newspaper columnist or a trustworthy friend. I could also cultivate epistemic virtues, namely, character traits that are generally truth-conducive like conscientiousness, inquisitiveness, open-mindedness, perseverance, respect, courage, tolerance, fairness, love of truth, and humility.

In this vein Chrisman (2008), for example, has argued that doxastic involuntarism and epistemic deontology can be reconciled. He appeals to Wilfrid Sellars’ distinction between “rules of criticism” and “rules of action” and explains how doxastic oughts might be subject to rules of criticism that require no direct voluntary control but, nevertheless, do not compromise the categoricity of doxastic oughts (and responsibility). The doxastic involuntarism problem remains a live question for epistemic deontologism and agency (for more discussion see Vitz’s Doxastic Voluntarism).

A second topic of some recent debate that relates to (epistemic) agency is the so-called “situationist challenge about character.” Roughly, situationism about character is the thesis that character is a figment of folk psychology (and virtue theory) and does not really exist. Although it was first applied to moral character and thereby virtue ethics, and then transposed to responsibilist virtue-theoretic epistemology that appeals to character (compare Alfano 2012), it is arguably not of merely local responsibilist virtue-theoretic concern. This is the case because if, as Baehr (2012) has cogently argued, even theories that as stipulated have nothing to do with character and virtuous/vicious traits (like evidentialism and reliabilism) inevitably involve such traits, then the debunking of character as a mere figment will have serious repercussions for these theories as well.

Harman (1999) is a classic statement of so-called situationism about moral character traits, namely, the idea that what drives our behavior is the current of the social situation and not fixed character traits, as virtue theories and folk psychology tend to think. In fact, according to Harman (1999), there is no positive empirical evidence in favor of character traits, and there is much negative empirical evidence against them and, therefore, in so thinking we commit the so-called “fundamental attribution error.” We excessively focus on the agent and downplay the importance of the social situation that the agent is found in.

The situationist challenge questions the psychological reality of the theoretical cornerstone of responsibilist virtue theories, namely, character. Drawing from various empirical psychological studies, like Milgram’s (1974) famous obedience experiments, it is suggested that the folk psychological and virtue-theoretical concept of character is a mere fiction. These studies are taken to show that character does not really exist, because in these experiments the agents’ actions are guided not by their character, relevant virtues, or traits but by the situational context they find themselves in.

True enough, we speak about character and character traits (virtuous and vicious), but empirical psychological studies confirm that in a given situation our supposed character traits may fail to manifest themselves in our conduct, at least for a statistically significant portion of us. This indicates that character traits are not the reliable source of dispositions for action that they are widely assumed to be. If they were, most of us could not so easily behave out of character. This corollary would suffice to shake the foundations of folk psychology (which relies extensively on the notion of character) and responsibilist virtue theory, as well as other theories that surreptitiously involve the notion of character (for some discussion of situationism, see Timpe’s Moral Character; for responses to moral situationism see Miller 2003; Kamtekar 2004; and Snow 2010).

More recently Alfano (2011) transposed the situationist challenge from virtue ethics to responsibilist virtue epistemology. He appeals to abundant empirical work from cognitive and social psychology to rest his case. He indicates that intellectual virtues like curiosity, flexibility, creativity, and courage are susceptible to the vagaries of non-intellectual factors of their situation like mood elevators, mood depressors, ambient sounds, ambient smells, and even the weather.

This suggests, according to Alfano (2011), a puzzle that can be framed as an inconsistent triad: (non-skepticism) Most people know quite a bit; (responsibilism) Knowledge is true belief acquired and retained through the exercise of intellectual virtue; and (epistemic situationism) Most people do not possess the epistemic virtues countenanced by responsibilism. Alfano suggests that a plausible way to resolve the puzzle is to give up responsibilism and with it the whole enterprise of responsibilist virtue epistemology. The safe prediction is that situationism about moral and intellectual character poses a sharp challenge to folk psychology, virtue theory (and beyond) and is here to stay as a topic for discussion.

With this much about agency and responsibility, let us now wrap up the overall discussion.

8. New Directions in Metaepistemology

We have covered a lot of ground in large strides and introduced various key aspects of metaepistemological theorizing, ranging from semantics to metaphysics and agency. There is, of course, much more going on in metaepistemological debates (both in terms of depth and breadth), especially as new and fascinating directions of inquiry are constantly emerging. New semantic theories have been suggested, like hybrid semantic theories of epistemic concepts and inferentialist theories. More light is being shed on neglected epistemic properties and their value, like understanding, entitlement, and wisdom, and new and interesting parallels with metaethics are being drawn, for instance concerning reasons and motivation. Finally, exciting experimental work on intuitions is under way, and formal work is being carried out that links metaepistemological themes with probability calculus and decision theory (for example, Pettigrew 2011, 2013). This indicates that metaepistemology is a promising field of epistemological inquiry that is anticipated to blossom in the years to come.

9. References and Further Reading

Alfano, Mark. (2011). ‘Expanding the Situationist Challenge to Responsibilist Virtue Epistemology’. Philosophical Quarterly 62(247): 223-249.
Alston, William. (1988). ‘The Deontological Conception of Epistemic Justification’, Philosophical Perspectives, 2, Epistemology, pp.115-152.
Alston, William. (2005). Beyond Justification. Ithaca, NY: Cornell University Press.
Annis, David. (1978). ‘A Contextualist Theory of Epistemic Justification’. American Philosophical Quarterly, Vol.15, No.3, pp.213-219.
Anscombe, G. E. M. (1957). Intention. Oxford, Blackwell.
Ayer, A. J. (1936). Language, Truth and Logic. London, Penguin Books.
Baehr, Jason. (2012). The Inquiring Mind. Oxford, Oxford University Press.
Bergmann, Michael. (2008). ‘Reidian Externalism’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave MacMillan.
Blackburn, Simon. (1993). Essays in Quasi-Realism. Oxford, Oxford University Press.
Blackburn, Simon. (1998). Ruling Passions. Oxford, Oxford University Press.
Blackburn, Simon. (2006). Truth. London, Penguin Books.
Bonjour, Lawrence. (1985). The Structure of Empirical Knowledge. Cambridge MA: Harvard University Press.
Bonjour, Lawrence. (1998). In Defense of Pure Reason. London, Cambridge University Press.
Bonjour, Lawrence and Sosa, Ernest. (2003). Epistemic Justification. Oxford, Blackwell.
Boghossian, Paul. (2007). Fear of Knowledge. Oxford, Oxford University Press.
Brandom, Robert. (2000). Articulating Reasons. Cambridge MA, Harvard University Press.
Brink, David. (1989). Moral Realism and the Foundations of Ethics. New York, Cambridge University Press.
Charlow, Nate. (2014). ‘The Problem with the Frege-Geach Problem’. Philosophical Studies 167(3): 635-665.
Chisholm, Roderick. (1966). Theory of Knowledge. Englewood Cliffs, NJ, Prentice Hall.
Chrisman, Matthew. (2007). ‘From Epistemic Contextualism to Epistemic Expressivism’. Philosophical Studies 135(2):225-254.
Chrisman, Matthew. (2008). ‘Ought to Believe’. Journal of Philosophy 105 (7): 346-370.
Chrisman, Matthew. (2012). ‘Epistemic Expressivism’. Philosophy Compass 7(2):118-126.
Clifford, William (1877/2008). ‘The Ethics of Belief’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp.101-5.
Cohen, Stewart. (1998). ‘Contextualist Solutions to Epistemological Problems: Scepticism, Gettier and the Lottery’. Australasian Journal of Philosophy 74, 4, pp.549-567.
Conee, Earl and Feldman, Richard. (2004). Evidentialism. Oxford, Oxford University Press.
Copp, David. (2001). Realist-Expressivism: A Neglected Option for Moral Realism. Social Philosophy and Policy 18(02):1-43.
Cuneo, Terence. (2007). The Normative Web. Oxford, Oxford University Press.
Cuneo, Terence and Kyriacou, Christos. (2017) ‘Defending the Moral/Epistemic Parity’ in Metaepistemology. (eds.) C. McHugh, J. Way and D. Whiting.
Darwall, Stephen, Gibbard, Allan and Railton, Peter. (1992). ‘Toward Fin de Siecle Ethics: Some Trends’. Philosophical Review 101(1):115-189.
De Cruz, Helen, Boudry, Maarten, De Smedt, Johan, and Blancke, Stefaan. (2011). ‘Evolutionary Approaches to Epistemic Justification’. Dialectica 65(4):517-535.
DeRose, Keith. (1995). ‘Solving the Skeptical Problem’. The Philosophical Review 104, 1, pp. 1-52.
Descartes, Rene. (1641/2008). Meditations on First Philosophy. Oxford, Oxford University Press. Translated by Michael Moriarty.
Dowden, Bradley and Swartz, Norman. ‘Truth’ in the Internet Encyclopedia of Philosophy.
Dreier, James. (2004). ‘Meta-ethics and the Problem of Creeping Minimalism’. Philosophical Perspectives 18(1):23-44.
Dunn, Jeffrey. ‘Epistemic Consequentialism’. Internet Encyclopedia of Philosophy.
Elgin, Catherine. (2004). ‘True Enough’. Philosophical Issues 14, Epistemology, pp. 113-131.
Enoch, David. (2013). Taking Morality Seriously. Oxford, Oxford University Press.
Feldman, Richard. (2001). ‘Voluntary Belief and Epistemic Evaluation’ in Knowledge, Truth and Duty, (ed.) Matthias Steup. Oxford, Oxford University Press, pp.77-92.
Feldman, Richard. (2002). ‘Epistemological Duties’ in The Oxford Handbook of Epistemology, ed. Paul Moser. Oxford, Oxford University Press. pp.362-384.
Fisher, Andrew. (2011). Metaethics: An Introduction. Acumen.
Floridi, Luciano. (2004). ‘On the Logical Unsolvability of the Gettier Problem’. Synthese 142(1): 61-79.
Fogelin, Robert. (1994). Pyrrhonian Reflections on Knowledge and Justification. Oxford, Oxford University Press.
Frankena, William. (1939). ‘The Naturalistic Fallacy’. Mind XLVIII(192):464-477.
Frege, Gottlob. (1997). ‘On Negation’ in The Frege Reader
Fricker, Miranda. (2010). Epistemic Injustice. Oxford, Oxford University Press.
Fumerton, Richard. (1995). Metaepistemology and Skepticism. London, Rowman and Littlefield.
Geach, Peter. (1960). ‘Ascriptivism’. Philosophical Review 2:221-225.
Geach, Peter. (1965). ‘Assertion’. Philosophical Review 74(4):449-465.
Gibbard, Allan. (1990). Wise Choices, Apt Feelings. Oxford, Oxford University Press.
Ginet, Carl. (2001). ‘Deciding to Believe’ in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford, Oxford University Press. pp.63-76.
Goldman, Alvin. (1979). ‘What is Justified Belief?’ in Epistemology: An Anthology, (eds.) Ernest Sosa and Jaegwon Kim. Blackwell. pp. 340-353.
Grajner, Martin. (2015). ‘Hybrid Expressivism About Epistemic Justification’. Philosophical Studies 172(9):2349-2369.
Greco, Daniel. (2015). ‘Epistemological Open Question Arguments’. Australasian Journal of Philosophy 93(3):509-523.
Greco, John. (2010). Achieving Knowledge. Oxford, Oxford University Press.
Grimm, Stephen. (2011). ‘Understanding’ in The Routledge Companion to Epistemology, (eds.) Sven Bernecker and Duncan Pritchard. New York, Routledge. pp.
Haidt, Jonathan. (2011). The Righteous Mind. New York, Vintage Books.
Harman, Gilbert. (1986). Change in View. Cambridge, MA: MIT Press.
Harman, Gilbert. (1999). ‘Moral Philosophy Meets Social Psychology: Virtue Ethics and the Fundamental Attribution Error’. Proceedings of the Aristotelian Society 99, pp. 315-331.
Hare, Richard. (1952). The Language of Morals. Oxford, Clarendon Press.
Hawthorne, John. (2004). Knowledge and Lotteries. Oxford, Oxford University Press.
Heathwood, Chris. (2009). ‘Moral and Epistemic Open Question Arguments’, Philosophical Books 50: 83-98.
Heuer, Ulrike. (2010). ‘Beyond Wrong Reasons: The Buck-Passing Account of Value’ in New Waves in Metaethics, (ed.) Michael Brady. Palgrave Macmillan. pp. 166-184.
Hume, David. (1739/1985). A Treatise of Human Nature. London, Penguin.
Huemer, Michael. (2008). Ethical Intuitionism. London, Palgrave Macmillan.
James, William. (1896/2008). ‘The Will to Believe’ in Reason and Responsibility, (eds.) Joel Feinberg and Russ Shafer-Landau. Belmont, CA: Thomson, pp. 106-113.
Jenkins, Catherine. (2007). ‘Epistemic Norms and Natural Facts’. American Philosophical Quarterly 44 (3): 259-272.
Kahneman, Daniel. (2012). Thinking, Fast and Slow. New York, Farrar, Straus and Giroux.
Kamtekar, Rachana. (2004). ‘Situationism and Virtue Ethics on the Content of Our Character’. Ethics 114(3):458-491.
Kappel, Klemens and Moeller, Eric. (2014). ‘Epistemic Expressivism and the Argument from Motivation’. Synthese, pp.1-19.
Kelly, Thomas. (2003). ‘Epistemic Rationality as Instrumental Rationality: A Critique’. Philosophy and Phenomenological Research 66, 3, pp.612-40.
Kim, Jaegwon. (1988). ‘What is Naturalized Epistemology?’. Philosophical Perspectives 2: 381-405.
Kirkham, Richard. (1984). ‘Does the Gettier Problem Rest on A Mistake?’. Mind 93(372): 501-513.
Kyriacou, Christos. (2012). ‘Habits-Expressivism About Epistemic Justification’. Philosophical Papers 41(2):209-237.
Kyriacou, Christos. (2013). ‘How Not To Solve the Wrong Kind of Reasons Problem’. Journal of Value Inquiry 47(1-2):101-110.
Kyriacou, Christos. (2015). ‘Critical Discussion of David Velleman’s ‘Foundations for Moral Relativism’’ UK: Open Book Publishers. 2013. Ethical Theory and Moral Practice 18(1),pp. 209-214.
Kyriacou, Christos. (2016a). ‘Ought to Believe, Evidential Understanding and the Pursuit of Wisdom’ in Epistemic Reasons, Epistemic Norms, Epistemic Goals, eds. Martin Grajner and Pedro Schmechtig. Berlin, DeGruyter. Pre-print version.
Kyriacou, Christos. (2016b). ‘Metaepistemology’ in Oxford Bibliographies Online, ed. Duncan Pritchard. New York, Oxford University Press. Pre-print version.
Kyriacou, Christos. (2017). ‘Bifurcated Sceptical Invariantism’ in Journal of Philosophical Research. Pre-print version.
Kornblith, Hilary. (2002). Knowledge and Its Place in Nature. Oxford, Oxford University Press.
Korsgaard, Christine. (1986). ‘Skepticism About Practical Reason’, Journal of Philosophy 83 (1):5-25.
Kvanvig, Jonathan. (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge, Cambridge University Press.
Lenman, James. (2008). ‘Review of Terence Cuneo. The Normative Web.’ Notre Dame Philosophical Reviews 2008(6).
Lewis, David. (1996). ‘Elusive Knowledge’. Australasian Journal of Philosophy 74(4): 549-567.
Locke, John. (1690/1975). An Essay Concerning Human Understanding. Oxford, Oxford University Press. Edited with an Introduction by P.H. Nidditch.
Mackie, John. (1971). Ethics. Inventing Right and Wrong. London, Penguin.
MacFarlane, John. (2005). ‘The Assessment Sensitivity of Knowledge Attributions’. In T.S.Gendler and J.Hawthorne, eds. Oxford Studies in Epistemology, Oxford University Press. pp. 197-234.
Miller, Christian. (2003). ‘Social Psychology and Virtue Ethics’. Journal of Ethics 7(4): 365-392.
Milgram, Stanley. (1974). Obedience to Authority: An Experimental View. New York, Harper and Row.
Moore, G. E. (1903/2000). Principia Ethica. Cambridge, Cambridge University Press. Edited with an introduction by T. Baldwin.
Neta, Ram. (2008). ‘How to Naturalize Epistemology’ in New Waves in Epistemology, eds. Vincent Hendricks and Duncan Pritchard. New York, Palgrave Macmillan. pp. 324-353.
Nozick, Robert. (1981). Philosophical Explanations. Cambridge MA, Harvard University Press.
Olson, Erik. (2007). ‘The Place of Coherence in Epistemology’ in New Waves in Epistemology, (eds.) Vincent Hendricks and Duncan Pritchard. London, Palgrave Macmillan.
Olson, Jonas. (2004). ‘Buck-Passing and the Wrong Kind of Reasons’. Philosophical Quarterly 54(215):295-300.
Olson, Jonas. (2011a). ‘In Defense of Moral Error Theory’ in New Waves in Metaethics. New York, Palgrave Macmillan. pp. 62-84.
Olson, Jonas. (2011b). ‘Error Theory and Reasons for Belief’ in Reasons for Belief, eds. Andrew Reisner and Asbjorn Steglich-Petersen. Cambridge, Cambridge University Press. pp. 75-93.
Papineau, David. (2003). ‘The Evolution of Knowledge’ in The Roots of Reason. Oxford, Oxford University Press, pp. 39-82.
Pettigrew, Richard. (2011). ‘Epistemic Utility Arguments for Probabilism’. Stanford Encyclopedia of Philosophy.
Pettigrew, Richard. (2013). ‘Epistemic Utility and Norms for Credences’. Philosophy Compass 8/10: 897-908.
Pink, Thomas. (2004). Free Will: A Very Short Introduction. Oxford, Oxford University Press.
Plantinga, Alvin. (1993). Warrant: The Current Debate. Oxford, Oxford University Press.
Plantinga, Alvin. (1993). Warrant and Proper Function. Oxford, Oxford University Press.
Plato. (2005). Euthyphro, Apology, Crito, Phaedo, Phaedrus. Cambridge MA : Harvard University Press.
Pollock, John and Cruz, Joseph. (1999). Contemporary Theories of Knowledge. Lanham, Rowman and Littlefield.
Poston, Ted. ‘Internalism and Externalism in Epistemology’ in the Internet Encyclopedia of Philosophy.
Pritchard, Duncan. (2010). The Nature and Value of Knowledge. Oxford, Oxford University Press. Coauthored with Allan Millar and Adrian Haddock.
Putnam, Hilary. (1975/1997). ‘The Meaning of ‘Meaning’’ in Philosophical Papers Vol.2 : Mind, Language and Reality. Cambridge, Cambridge University Press. pp. 215-271.
Quine, W. V. O. (1953). ‘Two Dogmas of Empiricism’ in From A Logical Point of View. Cambridge, MA: Harvard University Press. pp. 20-46.
Quine, W. V. O. (1992). Pursuit of Truth. Cambridge, MA: Harvard University Press.
Rabinowicz, Wlodek and Ronnow-Rasmussen, Toni. (2004). ‘The Strike of the Demon: On Fitting Pro-Attitudes and Value’. Ethics 114(3):391-423.
Ridge, Michael. (2007). ‘Ecumenical Expressivism: The Best of Both Worlds?’ Oxford Studies in Metaethics 2:51-76.
Ridge, Michael. (2014). ‘Moral Non-naturalism’ in Stanford Encyclopedia of Philosophy, (ed.) Edward Zalta.
Scanlon, T. M. (1998). What We Owe To Each Other. Cambridge MA, Harvard University Press.
Schaffer, Jonathan. (2004). ‘From Contextualism to Contrastivism’. Philosophical Studies 119(1-2): 73-104.
Schaffer, Jonathan. (2013). ‘On What Grounds What’ in Metametaphysics, eds. D.Chalmers, D.Manley and R.Wasserman. Oxford, Oxford University Press. pp.347-383.
Schroeder, Mark. (2008a). Being For. Oxford, Oxford University Press.
Schroeder, Mark. (2008b). ‘What is the Frege-Geach Problem?’. Philosophy Compass 3(4):703-720.
Schroeder, Mark. (2010). ‘Value and the Right Kind of Reason’. Oxford Studies in Metaethics 5:25-55.
Setiya, Kieran. (2012). Knowing Right from Wrong. Oxford, Oxford University Press.
Smith, Michael. (1994). The Moral Problem. Oxford, Oxford University Press.
Snow, Nancy. (2010). Virtue as Social Intelligence. New York, Routledge.
Sosa, Ernest. (1991). Knowledge in Perspective. Cambridge, Cambridge University Press.
Sosa, Ernest. (2003). ‘The Place of Truth in Epistemology’ in Intellectual Virtue, (eds.) M.DePaul and L.Zagzebski. Oxford, Oxford University Press, pp.155-79.
Sosa, Ernest. (2007). A Virtue Epistemology. Oxford, Oxford University Press.
Street, Sharon. (2006). ‘A Darwinian Dilemma for Realist Theories of Value’. Philosophical Studies 127 (1), pp. 109-166.
Street, Sharon. (2009). ‘Evolution and the Normativity of Epistemic Reasons’. Canadian Journal of Philosophy 39 (supplement 1): 213-248.
Stich, Stephen. (1990). The Fragmentation of Reason. Cambridge MA, MIT Press.
Stevenson, C. L. (1963). ‘The Nature of Ethical Disagreement’ in Facts and Values. New Haven: Yale University Press. pp.1-9.
Timmons, Mark and Horgan, Terence. (1991). ‘New Wave Moral Realism Meets Moral Twin Earth’. Journal of Philosophical Research 16:447-465.
Timpe, Kevin. ‘Moral Character’ in the Internet Encyclopedia of Philosophy.
Turri, John. (2009). ‘The Ontology of Epistemic Reasons’. Nous 43(3): 490-512.
Unger, Peter. (1971). ‘A Defense of Skepticism’. Philosophical Review, LXXX :198-219.
Vahid, Hamid. (2005). Epistemic Justification and the Skeptical Challenge. London, Palgrave Macmillan.
Vavova, Katia. (2014). ‘Debunking Evolutionary Debunking’. Oxford Studies in Metaethics 9:76-101.
Velleman, David. (2013). Foundations for Moral Relativism. UK: Open Publishers.
Vitz, Rico. ‘Doxastic Voluntarism’ in the Internet Encyclopedia of Philosophy.
Yalcin, Seth. (2012). ‘Bayesian Expressivism’, Proceedings of the Aristotelian Society 112, Vol.2:123-160.
Zagzebski, Linda. (1996). Virtues of the Mind. Cambridge, Cambridge University Press.
Zagzebski, Linda. (2003). ‘The Search for the Source of the Epistemic Good’. Metaphilosophy 34, pp.12-28.
Zagzebski, Linda. (2009). On Epistemology. Belmont, CA : Wadsworth.
Wedgwood, Ralph. (2008). ‘Contextualism About Justified Belief’. Philosopher’s Imprint 8, No.9, pp.1-20.
Wiggins, David. (1987). ‘A Sensible Subjectivism?’ in Needs, Values and Truth. Oxford, Oxford University Press. pp. 185-214.
Williams, Bernard. (1979). ‘Internal and External Reasons’ in Rational Action, ed. Ross Harrison. Cambridge, Cambridge University Press. pp. 101-113.
Williams, Michael. (2001). Problems of Knowledge. Oxford, Oxford University Press.
Williamson, Timothy. (2000). Knowledge and its Limits. Oxford, Oxford University Press.

Author Information

Christos Kyriacou
Email: ckiriakou@gmail.com
University of Cyprus
Cyprus

Interpretations of Quantum Mechanics

Quantum mechanics is a physical theory developed in the 1920s to account for the behavior of matter on the atomic scale. It has subsequently been developed into arguably the most empirically successful theory in the history of physics. However, it is hard to understand quantum mechanics as a description of the physical world, or to understand it as a physical explanation of the experimental outcomes we observe. Attempts to understand quantum mechanics as descriptive and explanatory, to modify it such that it can be so understood, or to argue that no such understanding is necessary, can all be taken as versions of the project of interpreting quantum mechanics.

The problematic nature of quantum mechanics stems from the fact that the theory often represents the state of a system using a sum of several terms, where each term apparently represents a distinct physical state of the system. What’s more, these terms interact with each other, and this interaction is crucial to the theory’s predictions. If one takes this representation literally, it looks as if the system exists in several incompatible physical states at once. And yet when the physicist makes a measurement on the system, only one of these incompatible states is manifest in the result of the measurement. What makes this especially puzzling is that there is nothing in the physical nature of a measurement that could privilege one of the terms over the others.

According to the Copenhagen interpretation of quantum mechanics, the solution to this puzzle is that the quantum state should not be taken as a description of the physical system. Rather, the role of the quantum state is to summarize what we can expect if we make measurements on the system. According to the many-worlds interpretation, the quantum state is to be taken as a description of the system, and the solution to the puzzle is that each term in that description produces a corresponding measurement outcome. That is, for any quantum measurement there are generally multiple measurement results occurring on distinct “branches” of reality. According to hidden variable theories, the quantum state is a partial description of the system, where the rest of the description is given by the values of one or more “hidden” variables. The solution to the puzzle in this case is that the hidden variables pick out one of the physical states described by the quantum state as the actual one. According to spontaneous collapse theories, the quantum state is a complete description of the system, but the dynamical laws of quantum mechanics are incomplete, and need to be supplemented with a “collapse” process that eliminates all but one of the terms in the state during the measurement process.

These interpretations and others present us with very different pictures of the nature of the physical world (or in the Copenhagen case, no picture at all), and they have different strengths and weaknesses. The question of how to decide between them is an open one.

The Development of Quantum Mechanics
The Copenhagen Interpretation
The Many-Worlds Interpretation
Hidden Variable Theories
Spontaneous Collapse Theories
Other Interpretations
Choosing an Interpretation
References and Further Reading

1. The Development of Quantum Mechanics

Quantum mechanics was developed in the early twentieth century in response to several puzzles concerning the predictions of classical (pre-twentieth-century) physics. Classical electrodynamics, while successful at describing a large number of phenomena, yields the absurd conclusion that the electromagnetic energy in a hollow cavity is infinite. It also predicts that the energy of electrons emitted from a metal via the photoelectric effect should be proportional to the intensity of the incident light, whereas in fact the energy of the electrons depends only on the frequency of the incident light. Taken together with the prevailing account of atoms as clouds of positive charge containing tiny negatively charged particles (electrons), classical mechanics entails that alpha particles fired at a thin gold foil should all pass straight through, whereas in fact a small proportion of them are reflected back towards the source.

In response to the first puzzle, Max Planck suggested in 1900 that light can only be emitted or absorbed in integral units of hν, where ν is the frequency of the light and h is a constant. This is the hypothesis that energy is quantized (that it is a discrete rather than continuous quantity), from which quantum mechanics takes its name. This hypothesis can be used to explain the finite quantity of electromagnetic energy in a hollow cavity. In 1905 Albert Einstein proposed that the quantization of energy can solve the second puzzle too; the minimum amount of energy that can be transferred to an electron from the incident light is hν, and hence the energy of the emitted electrons increases with the frequency of the light rather than with its intensity.
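The dependence on frequency rather than intensity can be illustrated with a short numerical sketch. It uses Planck's relation (photon energy equals the constant h times the frequency) together with Einstein's photoelectric relation, in which a fixed amount of energy, the metal's work function, is subtracted from the photon energy. The work function is not mentioned in the text above, and the value of 2.3 electronvolts used here is an illustrative assumption, as are the function names.

```python
# Illustrative sketch only; function names and the work function value are assumptions.

PLANCK_H = 6.62607015e-34        # Planck's constant in joule-seconds (SI defined value)
ELECTRON_VOLT = 1.602176634e-19  # joules per electronvolt (SI defined value)


def photon_energy_ev(frequency_hz):
    """Energy of one quantum of light of the given frequency, in electronvolts."""
    return PLANCK_H * frequency_hz / ELECTRON_VOLT


def max_electron_energy_ev(frequency_hz, work_function_ev):
    """Largest kinetic energy of an emitted electron, in electronvolts.

    Einstein's relation: photon energy minus the work function.
    Returns 0.0 when a single quantum carries too little energy to free an electron.
    Note that the intensity of the light does not appear as a parameter at all.
    """
    return max(0.0, photon_energy_ev(frequency_hz) - work_function_ev)


ASSUMED_WORK_FUNCTION_EV = 2.3  # hypothetical metal, chosen for illustration

for frequency in (4.0e14, 6.0e14, 1.0e15):
    energy = max_electron_energy_ev(frequency, ASSUMED_WORK_FUNCTION_EV)
    print(f"frequency {frequency:.1e} Hz -> max electron energy {energy:.3f} eV")
```

On this sketch, raising the intensity would only increase the number of quanta arriving, and hence the number of electrons emitted, while the energy available to each electron is fixed by the frequency.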

Ernest Rutherford’s solution to the third puzzle in 1911 was to posit that the positive charge in the atom is concentrated in a small nucleus with enough mass to reflect an alpha particle that collides with it. According to Niels Bohr’s 1913 elaboration of this model, the electrons orbit this nucleus, but only certain energies for these orbital electrons are allowed. Again, energy is quantized. The model has the additional benefit of explaining the spectrum of light emitted from excited atoms; since only certain energies are allowed, only certain wavelengths of light are possible when electrons jump between these levels, and this explains why the spectrum of the light consists of discrete wavelengths rather than a continuum of possible wavelengths.
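As a minimal sketch of how quantized energy levels yield a discrete spectrum, the following code assumes the hydrogen atom, whose Bohr-model levels are given by a constant of about 13.6 electronvolts divided by the square of a whole number n. Hydrogen, the numerical constants, and the function names are assumptions introduced for illustration; the text above does not single out any particular atom.

```python
# Illustrative sketch only; assumes hydrogen and the simple Bohr-model level formula.

RYDBERG_ENERGY_EV = 13.605693  # approximate binding energy of the lowest level, in eV
HC_EV_NM = 1239.841984         # Planck's constant times the speed of light, in eV * nm


def level_energy_ev(n):
    """Energy of the n-th allowed level (n = 1, 2, 3, ...), in electronvolts."""
    if n < 1:
        raise ValueError("level number must be a positive integer")
    return -RYDBERG_ENERGY_EV / n**2


def emitted_wavelength_nm(n_upper, n_lower):
    """Wavelength of the light emitted when an electron drops between two levels."""
    if n_upper <= n_lower:
        raise ValueError("emission requires a jump from a higher to a lower level")
    photon_energy = level_energy_ev(n_upper) - level_energy_ev(n_lower)
    return HC_EV_NM / photon_energy


# Jumps ending on level 2 give the visible lines of hydrogen (the Balmer series).
for n_upper in range(3, 7):
    print(f"{n_upper} -> 2: {emitted_wavelength_nm(n_upper, 2):.1f} nm")
```

Because the level numbers are whole numbers, only a discrete list of wavelengths can come out of this calculation; the first two are roughly 656 and 486 nanometres, close to the observed red and blue-green hydrogen lines.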

But the quantization of energy raises as many questions as it answers. Among them: Why are only certain energies allowed? What prevents the electrons in an atom from losing energy continuously and spiraling in towards the nucleus, as classical physics predicts? In 1924 Louis de Broglie suggested that electrons are wave-like rather than particle-like, and that the reason only certain electron energies are allowed is that energy is a function of wavelength, and only certain wavelengths can fit without remainder in the electron orbit for a given energy. By 1926 Erwin Schrödinger had developed an equation governing the dynamical behavior of these matter waves, and quantum mechanics was born.
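De Broglie's fitting condition can be written out in a simple form. The sketch below assumes a circular orbit of radius r and uses the standard de Broglie relation between wavelength and momentum p; the symbols r, p, and n are introduced here for illustration and do not appear in the text above.

```latex
\[
n\lambda = 2\pi r, \qquad \lambda = \frac{h}{p}
\quad\Longrightarrow\quad
p\,r = \frac{n h}{2\pi}, \qquad n = 1, 2, 3, \dots
\]
```

Under these assumptions, requiring a whole number of wavelengths to fit around the orbit restricts the electron's momentum, and hence its energy, to a discrete set of values.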

This theory has been astonishingly successful. Within a year of Schrödinger’s formulation, Clinton Davisson and Lester Germer demonstrated that electrons exhibit interference effects just like light waves—that when electrons are bounced off the regularly-arranged atoms of a crystal, their waves reinforce each other in some directions and cancel out in others, leading to more electrons being detected in some directions than others. This success has continued. Quantum mechanics (in the form of quantum electrodynamics) correctly predicts the magnetic moment of the electron to an accuracy of about one part in a trillion, making it the most accurate theory in the history of science. And so far its predictive track record is perfect: no data contradicts it.

But on a descriptive and explanatory level, the theory of quantum mechanics is less than satisfactory. Typically when a new theory is introduced, its proponents are clear about the physical ontology presupposed—the kind of objects governed by the theory. Superficially, quantum mechanics is no different, since it governs the evolution of waves through space. But there are at least two reasons why taking these waves as genuine physical entities is problematic.

First, although in the case of electron interference the number of electrons arriving at a particular location can be explained in terms of the propagation of waves through the apparatus, each electron is detected as a particle with a precise location, not as a spread-out wave. As Max Born noticed in 1926, the intensity (squared amplitude) of the quantum wave at a location gives the probability that the particle is located there; this is the Born rule for assigning probabilities to measurement outcomes. The second reason to doubt the reality of quantum waves is that the quantum waves do not propagate through ordinary three-dimensional space, but through a space of 3n dimensions, where n is the number of particles in the system concerned. Hence it is not at all clear that the underlying ontology is genuinely of waves propagating through space. Indeed, the standard terminology is to call the quantum mechanical representation of the state of a system a wavefunction rather than a wave, perhaps indicating a lack of metaphysical commitment: the mathematical function that represents a system has the form of a wave, even if it does not actually represent a wave.
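The Born rule and the role of interference can be sketched numerically. The code below treats a quantum state as a list of complex amplitudes, one per possible location, and converts squared amplitudes into probabilities. It then compares the intensity produced when two equal-sized wave components are added before squaring with the value obtained when no interference occurs. The function names and the two-path setup are illustrative assumptions, not part of any particular experiment described above.

```python
# Illustrative sketch only; function names and the two-path setup are assumptions.
import cmath
import math


def born_probabilities(amplitudes):
    """Born rule: probability of each outcome is its squared amplitude, normalized."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one amplitude must be non-zero")
    return [w / total for w in weights]


def two_path_intensity(phase_difference):
    """Relative intensity where two equal wave components meet.

    The components are added first and squared afterwards, so they can
    reinforce each other (phase difference 0) or cancel (phase difference pi).
    """
    first = 1 / math.sqrt(2)
    second = cmath.exp(1j * phase_difference) / math.sqrt(2)
    return abs(first + second) ** 2


# Without interference each component would contribute 0.5, giving 1.0 everywhere.
NO_INTERFERENCE = 0.5 + 0.5

print(born_probabilities([3, 4j]))
for phase in (0.0, math.pi / 2, math.pi):
    print(f"phase {phase:.2f}: {two_path_intensity(phase):.3f} (vs {NO_INTERFERENCE})")
```

The difference between the two results is what the detection statistics reveal: individual electrons arrive at definite locations, but how often they arrive at each location follows the squared sum of the wave components rather than the sum of their squares.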

So quantum mechanics is a phenomenally successful theory, but it is not at all clear what, if anything, it tells us about the underlying nature of the physical world. Quantum mechanics, perhaps uniquely among physical theories, stands in need of an interpretation to tell us what it means. Four kinds of interpretation are described in detail below (and some others more briefly). The first two—the Copenhagen interpretation and the many-worlds interpretation—take standard quantum mechanics as their starting point. The third and fourth—hidden variable theories and spontaneous collapse theories—start by modifying the theory of quantum mechanics, and hence are perhaps better described as proposals for replacing quantum mechanics with a closely related theory.

2. The Copenhagen Interpretation

The earliest consensus concerning the meaning of quantum mechanics formed around the work of Niels Bohr and Werner Heisenberg in Copenhagen during the 1920s, and hence became known as the Copenhagen interpretation. Bohr’s position is that our conception of the world is necessarily classical; we think of the world in terms of objects (for example, waves or particles) moving through three-dimensional space, and this is the only way we can think of it. Quantum mechanics doesn’t permit such a conceptualization, either in terms of waves or particles, and so the quantum world is in principle unknowable by us. Quantum mechanics shouldn’t be taken as a description of the quantum world, and neither should the evolution of the quantum state over time be taken as a causal explanation of the phenomena we observe. Rather, quantum mechanics is an extremely effective tool for predicting measurement results that takes the configuration of the measuring apparatus (described classically) as input, and produces probabilities for the possible measurement outcomes (described classically) as output.

It is sometimes claimed that the Copenhagen interpretation is a product of the logical positivism that flourished in Europe during the same period. The logical positivists held that the meaningful content of a scientific theory is exhausted by its empirical predictions; any further speculation into the nature of the world that produces these measurement outcomes is quite literally meaningless. This certainly has some resonances with the Copenhagen interpretation, particularly as described by Heisenberg. But Bohr’s views are importantly different from Heisenberg’s, and are more Kantian than positivist. Bohr is happy to say that the micro-world exists, and that it can’t be conceived of in causal terms, both of which would be meaningless claims according to positivist scruples. However, Bohr thinks we can say little else about the micro-world. Bohr, like Kant, thinks that we can only conceive of things in certain ways, and that the world as it is in itself is not amenable to such conceptualization. If this is correct, it is inevitable that our fundamental physical theories are unable to describe the world as it is, and the fact that we can make no sense of quantum mechanics as a description of the world should not concern us.

Unless one is convinced of Kant’s position concerning our conceptual access to the world, one may not find Bohr’s pronouncements concerning what we can conceive compelling. However, the motivation for adopting a Copenhagen-style interpretation can be made independent of any overarching philosophical position. Since the intensity of the wavefunction at a location gives the probability of the particle occupying that location, it is natural to regard the wavefunction as a reflection of our knowledge of the system rather than a description of the system itself. This view, held by Einstein, suggests that quantum mechanics is incomplete, since it gives us only an instrumental recipe for calculating the probabilities of outcomes, rather than a description of the underlying state of the system that gives rise to those probabilities. But it was later proved (as we shall see) that given certain plausible assumptions, it is impossible to construct such a description of the underlying state. Bohr did not know at the time that Einstein’s task was impossible, but its evident difficulty provides some motivation for regarding the quantum world as inscrutable.

However, the Copenhagen interpretation has at least two major drawbacks. First, a good deal of the early evidence for quantum mechanics comes from its ability to explain the results of interference experiments involving particles like electrons. Bohr’s insistence that quantum mechanics is not descriptive takes away this explanation (although, of course, viewing the wavefunction as descriptive only of our knowledge does no better). Second, Bohr’s position requires a “cut” between the macroscopic world described by classical concepts and the microscopic world subsumed under (but not described by) quantum mechanics. Since macroscopic objects are made out of microscopic components, it looks like macroscopic objects must obey the laws of quantum mechanics too; there can be no such “cut”, either sharp or vague, delimiting the realm of applicability of quantum mechanics.

3. The Many-Worlds Interpretation

In 1957 Hugh Everett proposed a radically new way of interpreting the quantum state. His proposal was to take quantum mechanics as descriptive and universal; the quantum state is a genuine description of the physical system concerned, and macroscopic systems are just as well described in this way as microscopic ones. This immediately solves both the above problems; there is no “cut” between the micro and macro worlds, and the explanation of particle interference in terms of waves is retained.

An immediate problem facing such a realist interpretation of the quantum state is the provenance of the outcomes of quantum measurements. Recall that in the case of electron interference, what is detected is not a spread-out wave, but a particle with a well-defined location, where the wavefunction intensity at a location gives the probability that the particle is located there.

How does Everett account for these facts? What he suggests is that we model the measurement process itself quantum mechanically. It is by no means uncontroversial that measuring devices and human observers admit of a quantum mechanical description, but given the assumption that quantum mechanics applies to all material objects, such a description ought to be available at least in principle. So consider for simplicity the situation in which the wavefunction intensity for the electron at the end of the experiment is non-zero in only two regions of space, A and B. The detectors at these locations can be modeled using a wavefunction too, with the result that the electron wavefunction component at A triggers a corresponding change in the wavefunction of the A-detector, and similarly at B. In the same way, we can model the experimenter who observes the detectors using a wavefunction, with the result that the change in the wavefunction of the A-detector causes a change in the wavefunction of the observer corresponding to seeing that the A-detector has fired, and the change in the wavefunction of the B-detector causes a change in the wavefunction of the observer corresponding to seeing that the B-detector has fired. The observer’s final state, then, is modeled by two distinct wave structures superposed, much in the way two images are superposed in a double-exposure photograph.

In sum, the wave structure of the electron-detector-observer system consists of two distinct branches, the A-outcome branch and the B-outcome branch. Since these two branches are relatively causally isolated from each other, we can describe them as two distinct worlds, in one of which the electron hits the detector at A and the observer sees the A-detector fire, and in the other of which the electron hits the detector at B and the observer sees the B-detector fire. This talk of worlds needs to be treated carefully, though; there is just one physical world, described by the quantum state, but because observers (along with all other physical objects) exhibit this branching structure, it is as if the world is constantly splitting into multiple copies. It is not clear whether Everett himself endorsed this talk of worlds, but this is the understanding of his work that has become canonical; call it the many-worlds interpretation.

According to the many-worlds interpretation, then, every physically possible outcome of a measurement actually occurs in some branch of the quantum state, but as an inhabitant of a particular branch of the state, a particular observer only sees one outcome. This explains why, in the electron interference experiment, the outcome looks like a discrete particle even though the object that passes through the interference device is a wave; each point in the wave generates its own branch of reality when it hits the detectors, so from within each of the resulting branches it looks like the incoming object was a particle.

The main advantage of the many-worlds interpretation is that it is a realist interpretation that takes the physics of standard quantum mechanics literally. It is often met with incredulity, since it entails that people (along with other objects) are constantly branching into innumerable copies, but this by itself is no argument against it. Still, the branching of people leads to philosophical difficulties concerning identity and probability, and these (particularly the latter) constitute genuine difficulties facing the approach.

The problem of identity is a philosophically familiar one: if a person splits into two copies, then the copies can’t be identical to (that is, the same person as) the original person, or else they would be identical to (the same person as) each other. Various solutions have been developed in the literature. One might follow Derek Parfit and bite the bullet here: what fission cases like this show is that strict identity is not a useful concept for describing the relationship between people and their successors. Or one might follow David Lewis and rescue strict identity by stipulating that a person is a four-dimensional history rather than a three-dimensional object. According to this picture, there are two people (two complete histories) present both before and after the fission event; they initially overlap but later diverge. Identity over time is preserved, since each of the pre-split people is identical with exactly one of the post-split people. Both of these positions have been proposed as potential solutions to the problem of personal identity in a many-worlds universe. A third solution that is sometimes mentioned is to stipulate that a person is the whole of the branching entity, so that the pre-split person is identical to both her successors, and (despite our initial intuition otherwise) the successors are identical to each other.

So the problem of identity admits of a number of possible solutions, and the only question is how one should try to decide between them. Indeed, one might argue that there is no need to decide between them, since the choice is a pragmatic one about the most useful language to use to describe branching persons.

The problem of probability, though, is potentially more serious. As noted above, quantum mechanics makes its predictions in the form of probabilities: the square of the wavefunction amplitude in a region tells us the probability of the particle being located there. The striking agreement of the observed distribution of outcomes with these probabilities is what underwrites our confidence in quantum mechanics. But according to the many-worlds interpretation, every outcome of a measurement actually occurs in some branch of reality, and the well-informed observer knows this. It is hard to see how to square this with the concept of probability; at first glance, it looks like every outcome has probability 1, both objectively and epistemically. In particular, if a measurement results in two branches, one with a large squared amplitude and one with a small squared amplitude, it is hard to see why we should regard the former as more probable than the latter. But unless we can do so, the empirical success of quantum mechanics evaporates.

It is worth noting, however, that the foundations of probability are poorly understood. When we roll two dice, the chance of rolling 7 is higher than the chance of rolling 12. But there is no consensus concerning the meaning of chance claims, or concerning why the higher chance of 7 should constrain our expectations or behavior. So perhaps a quantum branching world is in no worse shape than a classical linear world when it comes to understanding probability. We may not understand how squared wavefunction amplitude could function as chance in guiding our expectations, but perhaps that is no barrier to postulating that it does so function.

A more positive approach has been developed by David Deutsch and David Wallace, arguing that given some plausible constraints on rational behavior, rational individuals should behave as if squared wavefunction amplitudes are chances. If one combines this with a functionalist attitude towards chance—that whatever functions as chance in guiding behavior is chance—then this program promises to underwrite the contention that squared wave amplitudes are chances. However, the assumptions on which the Deutsch-Wallace argument is based can be challenged. In particular, they assume that it is irrational to care about branching per se: having two successors experiencing a given outcome is neither better nor worse than having one successor experiencing that outcome. But it is not clear that this is a matter of rationality any more than the question of whether having several happy children is better than having one happy child.

A further worry about the many-worlds theory that has been largely put to rest concerns the ontological status of the worlds. It has been argued that the postulation of many worlds is ontologically profligate. However, the current consensus is that worlds are emergent entities just like tables and chairs, and talk of worlds is just a convenient way of talking about the features of the quantum state. On this view, the many-worlds interpretation involves no entities over and above those represented by the quantum state, and as such is ontologically parsimonious. There remains the residual worry that the number of branches depends sensitively on mathematical choices about how to represent the quantum state. Wallace, however, embraces this indeterminacy, arguing that even though the many-worlds universe is a branching one, there is no well-defined number of branches that it has. If tenable, this goes some way towards resolving the above concern about the rationality of caring about branching per se: if there is no number of branches, then it is irrational to care about it.

4. Hidden Variable Theories

The many-worlds interpretation would have us believe that we are mistaken when we think that a quantum measurement results in a unique outcome; in fact such a measurement results in multiple outcomes occurring on multiple branches of reality. But perhaps that is too much to swallow, or perhaps the problems concerning identity and probability mentioned above are insuperable. In that case, one is led to the conclusion that quantum mechanics is incomplete, since there is nothing in the quantum state that picks out one of the many possible measurement results as the single actual measurement result. As mentioned above, this was Einstein’s view. If this view is correct, then quantum mechanics stands in need of completion via the addition of extra variables describing the actual state of the world. These additional variables are commonly known as hidden variables.

However, a theorem proved by John Bell in 1964 shows that, subject to certain plausible assumptions, no such hidden-variable completion of quantum mechanics is possible. One version of the proof concerns the properties of a pair of particles. Each particle has a property called spin: when the spin of the particle is measured in some direction, one either gets the result up or down. Suppose that the spin of each particle can be measured along one of three directions 120° apart. What quantum mechanics predicts is that if the spins of the particles are measured along the same direction, they always agree (both up or both down), but if they are measured along different directions they agree 25% of the time and disagree 75% of the time. According to the hidden variable approach, the particles have determinate spin values for each of the three measurement directions prior to measurement. The question is how to ascribe spin values to particles to reproduce the predictions of quantum mechanics. And what Bell proved is that there is no way to do this; the task is impossible.
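The counting behind this impossibility can be checked by brute force. The following Python sketch is an illustration under stated assumptions, not Bell’s own proof. It assumes that both particles carry the same predetermined answer for each direction (which perfect agreement along equal directions forces, given locality), and it averages over the six ordered choices of two different directions. The names DIRECTIONS, min_agreement_rate and QUANTUM_RATE are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

DIRECTIONS = (0, 1, 2)  # the three measurement directions, 120 degrees apart

# Quantum prediction quoted in the text for measurements along different directions.
QUANTUM_RATE = Fraction(1, 4)


def min_agreement_rate():
    """Smallest agreement rate, over different directions, of any fixed assignment."""
    # Ordered choices (direction for particle 1, direction for particle 2), i != j.
    pairs = [(i, j) for i in DIRECTIONS for j in DIRECTIONS if i != j]
    rates = []
    # One triple of predetermined answers describes both particles (see lead-in).
    for triple in product(("up", "down"), repeat=3):
        agree = sum(1 for i, j in pairs if triple[i] == triple[j])
        rates.append(Fraction(agree, len(pairs)))
    return min(rates)


if __name__ == "__main__":
    print("lowest hidden-variable agreement rate:", min_agreement_rate())
    print("quantum mechanical agreement rate:   ", QUANTUM_RATE)
```

Every fixed assignment agrees on at least one third of the different-direction choices, so any statistical mixture of such assignments that is independent of the measurement settings does too, whereas quantum mechanics predicts one quarter.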

Many physicists concluded on the basis of Bell’s theorem that no hidden-variable completion of quantum mechanics is possible. However, this was not Bell’s conclusion. Bell concluded instead that one of the assumptions he relied on in his proof must be false. First, Bell assumed locality—that the result of a measurement performed on one particle cannot influence the properties of the other particle. This seems secure because the measurements on the two particles can be widely separated, so that a signal carrying such an influence would have to travel faster than light. Second, Bell assumed independence—that the properties of the particles are independent of which measurements will be performed on them. This assumption too seems secure, because the choice of measurement can be made using a randomizing device or the free will of the experimenter.

Despite the apparent security of his assumptions, Bell knew when he proved his theorem that a hidden-variable completion of quantum mechanics had been explicitly constructed by David Bohm in 1952. Bohm assumed that in addition to the wave described by the quantum state, there is also a set of particles whose positions are given by the hidden variables. The wave pushes the particles around according to a new dynamical law formulated by Bohm, and the law is such that if the particle positions are initially statistically distributed according to the squared amplitude of the wave, then they are always distributed in this way. In an electron interference experiment, then, the existence of the wave explains the interference effect, the existence of the particles explains why each electron is observed at a precise location, and the new Bohmian law explains why the probability of observing an electron at a given location is given by the squared amplitude of the wave. As Bell often pointed out, to call Bohm’s theory a hidden variable theory is something of a misnomer, since it is the values of the hidden variables—the positions of the particles—that are directly observed on measurement. Nevertheless, the name has stuck.

Bohm’s theory, then, provides a concrete example of a hidden variable theory of quantum mechanics. However, it is not a counterexample to Bell’s theorem, because it violates Bell’s locality assumption. The new law introduced by Bohm is explicitly non-local: the motion of each particle is determined in part by the positions of all the other particles at that instant. In the case of Bell’s spin experiment, a measurement on one particle instantaneously affects the motion of the other particle, even if the particles are widely separated. This is a prima facie violation of special relativity, since according to special relativity simultaneity is dependent on one’s choice of coordinates, making it impossible to define “instantaneous” in any objective way. However, this does not mean that Bohm’s theory is immediately refuted by special relativity, since one can instead take Bohm’s theory to show the need to add a universal standard of simultaneity to special relativity. Bell recognized this possibility. It is worth noting that even though Bohm’s theory requires instantaneous action at a distance, it also prevents these influences from being controlled so as to send a signal; there is no “Bell telephone”.

Bohm chooses positions as the properties described by the hidden variables of his theory. His reason for this is that it is plausible that it is the positions of things that we directly observe, and hence completing quantum mechanics via positions suffices to ensure that measurements have unique outcomes. But it is possible to construct measurements in which the outcome is recorded in some property other than position. As a response to this possibility, one might suggest adding hidden variables describing every property of the particles simultaneously, rather than just their positions. However, a theorem proved by Kochen and Specker in 1967 shows that no such theory can reproduce the predictions of quantum mechanics. A second response is to stick with Bohm’s theory as it is, and argue that while such measurements may initially lack a unique outcome, they will rapidly acquire a unique outcome as the recording device becomes correlated with the positions of the surrounding objects in the environment.

A final way to accommodate such measurements within a hidden variable theory is to make it a contingent matter which properties of a system are ascribed determinate values at a particular time. That is, rather than supplementing the wavefunction with variables describing a fixed property (the positions of things), one can let the wavefunction state itself determine which properties of the system are described by the hidden variables at that time. The idea is that the algorithm for ascribing hidden variables to a system is such that whenever a measurement is performed, the algorithm ascribes a determinate value to the property recording the outcome of the measurement. Such theories are known as modal theories. But while Bohm’s theory provides an explicit dynamical law describing the motion of the particles over time, modal theories generally do not provide a dynamical law governing their hidden variables, and this is regarded as a weakness of the approach.

Modal theories, like Bohm’s theory, evade Bell’s theorem by violating Bell’s locality assumption. In the modal case, the rule for deciding which properties of the system are made determinate depends on the complete wavefunction state at a particular instant, and this allows a measurement on one particle to affect the properties ascribed to another particle, however distant. As mentioned above, one can solve this problem by supplementing special relativity with a preferred standard of simultaneity. But this is widely regarded as an ad hoc and unwarranted addition to an otherwise elegant and well-confirmed physical theory. Indeed, the same charge is often levelled at the hidden variables themselves; they are an ad hoc and unwarranted addition to quantum mechanics. If hidden variable theories turn out to be the only viable interpretations of quantum mechanics, though, the force of this charge is reduced considerably.

Nevertheless, it may be possible to construct a hidden variable theory that does not violate locality. In order to evade Bell’s theorem, then, it will have to violate the independence assumption—the assumption that the properties of the particles are independent of which measurements will be performed on them. Since one can choose the measurements however one likes, it is initially hard to see how this assumption could be violated. But there are a couple of ways it might be done. First, one could simply accept that there are brute, uncaused correlations in the world. There is no causal link (in either direction) between my choice of which measurement to perform on a (currently distant) particle and its properties, but nevertheless there is a correlation between them. This approach requires giving up on the common cause principle—the principle that a correlation between two events indicates either that one causes the other or that they share a cause. However, there is little consensus concerning this principle anyway.

A second approach is to postulate a common cause for the correlation—a past event that causally influences both the choice of measurement and the properties of the particle. But absent some massive unseen conspiracy on the part of the universe, one can frequently ensure that there is no common cause in the past by isolating the measuring device from external influences. However, the measuring device and the particle to be measured will certainly interact in the future, namely when the measurement occurs. It has been proposed that this future event can constitute the causal link explaining the correlation between the particle properties and the measurements to be performed on them. This requires that later events can cause earlier events—that causation can operate backwards in time as well as forwards in time. For this reason, the approach is known as the retrocausal approach.

The retrocausal approach allows correlations between distant events to be explained without instantaneous action at a distance, since a combination of ordinary causal links and retrocausal links can amount to a causal chain that carries an influence between simultaneous distant events. No absolute standard of simultaneity is required by such explanations, and hence retrocausal hidden variable theories are more easily reconciled with special relativity than non-local hidden variable theories.

Bohm’s theory operates with a two-element ontology—a wave steering a set of particles. Retrocausal theories vary in their ontological presuppositions. Some—retrocausal Bohmian theories—incorporate two waves steering a set of particles; one wave carries the “forward-causal” influences on the particles from the initial state of the system, and the other carries the “backward-causal” influences on the particles from the final state of the system. But it may be possible to make do with the particles alone, with the wavefunction representing our knowledge of the particle positions rather than the state of a real object. The idea is that the interaction between the causal influences on the particles from the past and from the future can explain all the quantum phenomena we observe, including interference. However, at present this is just a promising research program; no explicit dynamical laws for such a theory have been formulated.

5. Spontaneous Collapse Theories

Hidden variable theories attempt to complete quantum mechanics by positing extra ontology in addition to (or perhaps instead of) the wavefunction. Spontaneous collapse theories, on the other hand, (at least initially) take the wavefunction to be a complete representation of the state of a system, and posit instead that the dynamical law of standard quantum mechanics—the Schrödinger equation—is not exactly right. The Schrödinger equation is linear; this means that if initial state A leads to final state A’ and initial state B leads to final state B’, then initial state A + B leads to final state A’ + B’. For example, if a measuring device fed a spin-up particle leads to a spin-up reading, and a measuring device fed a spin-down particle leads to a spin-down reading, then a measuring device fed a particle whose state is a sum of spin-up and spin-down states will end up in a state which is a sum of reading spin-up and reading spin-down. This is the multiplicity of measurement outcomes embraced by the many-worlds interpretation.
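The consequence of linearity for measurement can be made concrete with a toy model. The Python sketch below is only an illustration: states are represented as dictionaries from basis labels to real amplitudes, and the rule BASIS_RULE for an ideal spin measurement, along with the labels it uses, is a hypothetical stand-in for the Schrödinger evolution of a real device.

```python
import math

# Hypothetical rule for an ideal measurement, defined on basis states only:
# (particle spin, device state) -> (particle spin, device state).
BASIS_RULE = {
    ("up", "ready"): ("up", "reads-up"),
    ("down", "ready"): ("down", "reads-down"),
}


def evolve(state):
    """Extend the basis rule linearly: evolve each term and add the results."""
    out = {}
    for basis, amplitude in state.items():
        target = BASIS_RULE[basis]
        out[target] = out.get(target, 0.0) + amplitude
    return out


amp = 1 / math.sqrt(2)
# Particle in an equal sum of spin-up and spin-down, device ready.
superposition = {("up", "ready"): amp, ("down", "ready"): amp}
final = evolve(superposition)

if __name__ == "__main__":
    for basis, amplitude in final.items():
        print(basis, round(amplitude, 3))
```

The final state contains both a reads-up term and a reads-down term with equal amplitude; nothing in the linear rule selects one of them.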

To avoid sums of distinct measurement outcomes, one needs to modify the basic dynamical equation of quantum mechanics so that it is non-linear. The first proposal along these lines was made by Gian Carlo Ghirardi, Alberto Rimini, and Tullio Weber in 1986; it has become known as the GRW theory. The GRW theory adds an irreducibly probabilistic “collapse” term to the otherwise deterministic Schrödinger dynamics. In particular, for each particle in a system there is a small chance per unit time of the wavefunction undergoing a process in which it is instantly and discontinuously localized in the coordinates of that particle. The localization process multiplies the wave state by a narrow Gaussian (bell curve), so that if the wave was initially spread out in the coordinates of the particle in question, it ends up concentrated around a particular point. The point on which this collapse process is centered is random, with a probability distribution given by the square of the pre-collapse wave amplitude (averaged over the Gaussian collapse curve).
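A single localization event of this kind can be sketched numerically. The Python example below is a one-dimensional toy under several assumptions that are not part of the GRW theory itself: the wave is sampled on a grid, its amplitudes are real, and the grid spacing, packet positions and collapse width are arbitrary illustrative values rather than the physical GRW constants. The function names are introduced here for illustration.

```python
import math
import random


def gaussian(x, center, width):
    """Unnormalized bell curve used both for the toy wave and for the collapse."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))


def normalize(psi):
    """Rescale amplitudes so that the squared amplitudes sum to one."""
    norm = math.sqrt(sum(a * a for a in psi))
    return [a / norm for a in psi]


def collapse(psi, xs, width, rng):
    """Pick a random center, multiply the wave by a Gaussian there, renormalize."""
    # Weight of each candidate center: squared amplitude averaged over the
    # (squared) collapse Gaussian, i.e. the squared norm of the post-hit wave.
    weights = [
        sum((a * gaussian(x, c, width)) ** 2 for a, x in zip(psi, xs))
        for c in xs
    ]
    center = rng.choices(xs, weights=weights, k=1)[0]
    hit = [a * gaussian(x, center, width) for a, x in zip(psi, xs)]
    return center, normalize(hit)


xs = [i * 0.1 for i in range(-100, 101)]
# Toy pre-collapse wave: equal packets near x = -5 and x = +5.
psi = normalize([gaussian(x, -5.0, 0.5) + gaussian(x, 5.0, 0.5) for x in xs])

center, post = collapse(psi, xs, width=1.0, rng=random.Random(0))
left = sum(a * a for a, x in zip(post, xs) if x < 0)
right = sum(a * a for a, x in zip(post, xs) if x >= 0)

if __name__ == "__main__":
    print("collapse center:", round(center, 1))
    print("weight on left packet: ", left)
    print("weight on right packet:", right)
```

After the hit, almost all of the squared amplitude sits in one packet, while the other packet is strongly suppressed but not exactly zero, because the Gaussian is non-zero everywhere.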

The way this works is as follows. The collapse rate for a single particle is very low—about one collapse per hundred million years. So for individual particles (and systems consisting of small numbers of individual particles), we should expect that they obey the Schrödinger equation. And this is exactly what we observe; there are no known exceptions to the Schrödinger equation at the microscopic level. But macroscopic objects contain on the order of a trillion trillion particles, so we should expect about ten million collapses per second for such an object. Furthermore, in solid objects the positions of those particles are strongly correlated with each other, so a collapse in the coordinates of any particle in the object has the effect of localizing the wavefunction in the coordinates of every particle in the object. This means that if the wavefunction of a macroscopic object is spread over a number of distinct locations, it very quickly collapses to a state in which its wavefunction is highly localized around one location.

In the case of electron interference, then, each electron passes through the apparatus in the form of a spread-out wave. The collapse process is vanishingly unlikely to affect this wave, which is important, as its spread-out nature is essential to the explanation of interference: wave components traveling distinct paths must be able to come together and either reinforce each other or cancel each other out. But when the electron is detected, its position is indicated by something we can directly observe, for example, by the location of a macroscopic pointer. To measure the location of the electron, then, the position of the pointer must become correlated with the position of the electron. Since the wave representing the electron is spread out, the wave representing the pointer will initially be spread out too. But within a fraction of a second, the spontaneous collapse process will localize the pointer (and the electron) to a well-defined position, producing the unique measurement outcome we observe.

The spontaneous collapse approach is related to earlier proposals (for example, by John von Neumann) that the measurement process itself causes the collapse that reduces the multitude of pre-measurement wave branches to the single observed outcome. However, unlike previous proposals, it provides a physical mechanism for the collapse process in the form of a deviation from the standard Schrödinger dynamics. This mechanism is crucial; without it, as we have seen, there is no way for the measurement process to generate a unique outcome.

Note that, unlike in Bohm’s theory, there are no particles at the fundamental level in the GRW theory. In the electron interference case, particle behavior emerges during measurement; the measured system exhibits only wave-like behavior prior to measurement. Strictly speaking, to say that a system contains n particles is just to say that its wave representation has 3n dimensions, and to single out one of those particles is really just to focus attention on the form of the wave in three of those dimensions.

An immediate difficulty that faces the GRW theory is that the localization of the wave induced by collapse is not perfect. The collapse process multiplies the wave by a Gaussian, a function which is strongly peaked around its center but which is non-zero everywhere. No part of the pre-collapse wavefunction is driven to zero by this process; if the wavefunction represents a set of possible measurement results, the wave component corresponding to one result becomes large and the wave components corresponding to the others become small, but they do not disappear. Since one motivation for adopting a spontaneous collapse theory is the perceived failure of the many-worlds interpretation to recover probability claims, it cannot be argued that the small terms are intrinsically improbable. Instead, it looks like the GRW spontaneous collapse process fails to ensure that measurements have unique outcomes.

A second difficulty with the GRW theory is that the wavefunction is not an object in a three-dimensional space, but an object occupying a high-dimensional space with three dimensions for each “particle” in the system concerned. David Albert has argued that this makes the three-dimensional world of experience illusory.

A third difficulty with the GRW theory is that the collapse process acts instantaneously on spatially separated parts of the system; it instantly multiplies the wavefunction everywhere by a Gaussian. Like Bohm’s theory, the GRW theory violates Bell’s locality assumption, since a measurement performed on one particle can instantaneously affect the state of a distant particle (although in the case of the GRW theory talk of “particles” has to be cashed out in terms of the coordinates of the wavefunction). As discussed in relation to Bohm’s theory, this requires an objective conception of simultaneity that is absent from special relativity, and hence it is hard to see how to reconcile the GRW theory with relativity.

One way of responding to these difficulties, advocated by Ghirardi, is to postulate a three-dimensional mass distribution in addition to and determined by the wavefunction, such that our experience is determined directly by the mass distribution rather than the wavefunction. This responds to the second difficulty, since the mass distribution that we directly experience is three-dimensional, and hence our experience of a three-dimensional world is veridical. It may also go some way towards resolving the first difficulty, since the mass density corresponding to non-actual measurement outcomes is likely to be negligible relative to the background mass density surrounding the actual measurement outcome (the mass density of air, for example). Ghirardi’s mass density is not intended to address the third difficulty; this requires modifying the collapse process itself, and several proposals for constructing a relativistic collapse process based on the GRW theory have been developed.

An alternative approach to the difficulties facing the GRW theory is to adapt a suggestion made by John Bell that the center of each collapse event should be regarded as a “flash of determinacy” out of which everyday objects and everyday experience are built. Roderich Tumulka has developed this suggestion into a “flashy” spontaneous collapse theory, in which the wavefunction is regarded instrumentally as that which connects the distribution of flashes at one time with the probability distribution of flashes at a later time. On this proposal, the small wave terms corresponding to non-actual measurement outcomes can be understood in a straightforwardly probabilistic way: there is only a small chance that a flash will be associated with such a term, and so only a small chance that the non-actual measurement outcome will be realized. The flashes are located in three-dimensional space, so there is no worry that three-dimensionality is an illusion. And since the flashes, unlike the wavefunction, are located at space-time points, it is easier to envision a reconciliation between the flashy theory and special relativity.

6. Other Interpretations

There are several other interpretations of quantum mechanics available that don’t fit neatly into one of the categories discussed above. Here are some prominent ones.

The consistent histories (or decoherent histories) interpretation developed by Robert Griffiths, Murray Gell-Mann and James Hartle, and defended by Roland Omnès, is mathematically something of a hybrid between collapse theories and hidden variable theories. Like spontaneous collapse theories, the consistent histories approach incorporates successive localizations of the wavefunction. But unlike spontaneous collapse theories, these localizations are not regarded as physical events, but just as a means of picking out a particular history of the system in question as actual, much as hidden variables pick out a particular history as actual. If the localizations all constrain the position of a particle, then the history picked out resembles a Bohmian trajectory. But the consistent histories approach also allows localizations to constrain properties other than position, resulting in a more general class of possible histories.

Not all such sets of histories can be ascribed consistent probabilities, though: notably, interference effects often prevent the assignment of probabilities obeying the standard axioms to histories. However, for systems that interact strongly with their environment, interference effects are rapidly suppressed; this phenomenon is called decoherence. Decoherent histories can be ascribed consistent probabilities, hence the two alternative names of this approach. It is assumed that only consistent sets of histories can describe the world, but other than this consistency requirement, there is no restriction on the kinds of histories that are allowed. Indeed, Griffiths maintains that there is no unique set of possible histories: there are many ways of constructing sets of possible histories, where one among each set is actual, even if the alternative actualities so produced describe the world in mutually incompatible ways. Absent a many worlds ontology, however, some have worried about how such a plurality of true descriptions of the world could be coherent. Gell-Mann and Hartle respond to such concerns by arguing that organisms evolve to exploit the relative predictability of one among the competing sets of histories.

The transactional interpretation, initially developed by John Cramer, also incorporates elements of both collapse and hidden variable approaches. It starts from the observation that some versions of the dynamical equation of quantum mechanics admit wave-like solutions traveling backward in time as well as forward in time. Typically the former solutions are ignored, but the transactional interpretation retains them. Just as in retrocausal hidden variable theories, the backward-travelling waves can transmit information about the measurements to be performed on a system, and hence allow the transactional interpretation to evade the conclusion of Bell’s theorem.

The transactional interpretation posits rules according to which the backward and forward waves generate “transactions” between preparation events and measurement events, and one of these transactions is taken to represent the actual history of the system in question, where probabilities are assigned to transactions via a version of the Born rule. The formation of a transaction is somewhat reminiscent of the spontaneous collapse of the wavefunction, but due to the retrocausal nature of the theory, one might conclude that the wavefunction never exists in a pre-collapse form, since the completed transaction exists as a timeless element in the history of the universe. Hence some have questioned the extent to which the story involving forwards and backwards waves constitutes a genuine explanation of transaction formation, raising questions about the tenability of the transactional interpretation as a description of the quantum world. Ruth Kastner responds to these challenges by developing a possibilist transactional interpretation, embedding the transactional interpretation in a dynamic picture of time in which multiple future possibilities evolve to a single present actuality.

Relational interpretations, such as those developed by David Mermin and by Carlo Rovelli, take quantum mechanics to be about the relations between systems rather than the properties of the individual systems themselves. According to such an interpretation, there is no need to assign properties to individual particles to explain the correlations exhibited by Bell’s experiment, and hence one can evade Bell’s theorem without violating either locality or independence. Superficially, this approach resembles Everett’s, according to which systems have properties only relative to a given branch of the wavefunction. But whereas Everettians typically say that a relation such as an observer seeing a particular measurement result holds on the basis of the properties of the observer and of the measured system within a branch, Mermin denies that there are such relata; rather, the relation itself is fundamental. Hence this is not a many worlds interpretation, since world-relative properties provide the relata that relational interpretations deny. Without such relata, though, it is hard to understand relational quantum mechanics as a description of a single world either. However, citing analogies with spatiotemporal properties in relativistic theories, Rovelli insists that it is enough that quantum mechanics ascribe properties to a system relative to the state of a second system (for example, an observer).

Informational interpretations, such as those developed by Jeffrey Bub and by Carlton Caves, Christopher Fuchs and Rüdiger Schack, interpret quantum mechanics as describing constraints on our degrees of belief. They develop rules of quantum credence by analogy with the rules of classical information theory, expressing the difference between quantum systems and classical systems in informational terms, for example in terms of an unavoidable loss of information associated with a quantum measurement. Some proponents of an informational interpretation take an explicitly instrumentalist stance: quantum mechanics is just about the beliefs of observers, treated as external to the quantum systems under consideration. Others take their informational interpretation to be a realist one, in the sense that it can in principle be applied to the whole universe, with “information” serving as a new physical primitive. However, the adequacy of the informational approach as realist can be challenged, for example, on the basis that it does not provide a dynamics for the evolution of the actual state of the world over time. Bub responds that an account of the information-theoretic properties of our measurement results may be the deepest explanation we can hope for.

7. Choosing an Interpretation

Setting aside interpretations such as Copenhagen that eschew describing the quantum world, the interpretations discussed above present us with a number of very different ontological pictures. The many-worlds interpretation tells us that the underlying nature of physical objects is wave-like and branching. Bohm’s theory adds particles to this wave, and some hidden variable theories attempt to do away with the wave as a physical entity. The GRW theory, like the many-worlds interpretation, takes waves as fundamental, but rejects the many-worlds picture of a branching universe. Other spontaneous collapse theories add a mass density distribution to the wave, or replace the wave with point-like flashes. The GRW theory is indeterministic, casting quantum mechanical probabilities as genuine objective chances appearing in the fundamental physical laws. Bohm’s theory is deterministic, since the physical laws involve no chances, making quantum probabilities merely epistemic. The many-worlds interpretation involves no objective chances in the laws, but nevertheless (if successful) casts quantum mechanical probabilities as objective chances grounded in the branching process.

It seems, then, that we have a classic case of underdetermination: while the experimental data strongly confirm quantum mechanics, it is unclear whether those data confirm the metaphysical picture of many-worlds, Bohm, GRW or some other alternative. Since it has been doubted that underdetermination is ever actually manifested in the history of science, this is a striking example.

Nevertheless, the nature and even the existence of this underdetermination can be contested. It is worth noting that spontaneous collapse theories differ in their empirical predictions from standard quantum mechanics; the collapse process destroys interference effects, and the larger the object the more quickly one expects these effects to be detectable. At present, the differences between spontaneous collapse theories and standard quantum mechanics are beyond the reach of feasible experiments, since small objects cannot be kept isolated for long enough, and large objects cannot be kept isolated at all. Even so, the empirical underdetermination between spontaneous collapse theories and the other interpretations is not a matter of principle, and may be resolved in favor of one side or the other at some point.

The underdetermination between hidden variable theories and the many-worlds interpretation is of a different character. These two interpretations are empirically equivalent, and hence no experimental evidence could decide between them. It seems that here we have a case of underdetermination in principle. One could try to decide between them on the basis of non-empirical theoretical virtues like simplicity and elegance. On measures like these, the many-worlds interpretation would surely win, since hidden variable theories begin with the mathematical formalism of the many-worlds interpretation and add complicated and arguably ad hoc extra theoretical structure. But judging theories on the basis of non-empirical virtues is a controversial endeavor, particularly if we take the winner to be a guide to the metaphysical nature of the world.

Alternatively, it is not unreasonable to think that either the many-worlds interpretation or hidden variable theories could prove to be untenable. As noted above, it is unclear whether the many-worlds interpretation can account for the truth of probability claims, and if it cannot, then it fails to make contact with the empirical evidence. On the other hand, it is unclear whether any hidden variable theory can be made consistent with special relativity (and generalized to cover quantum field theory), and if not, then the hidden variable approach is arguably inadequate.

Some have argued that there is no underdetermination in the interpretation of quantum mechanics, since the many-worlds interpretation alone follows directly from a literal reading of the standard theory of quantum mechanics. It is true that both hidden variable theories and spontaneous collapse theories supplement or modify standard quantum mechanics, so perhaps only the many-worlds interpretation qualifies as an interpretation of standard quantum mechanics rather than a closely related theory. The many-worlds interpretation may be the only reasonable interpretation of quantum mechanics as it stands, and there may be good methodological reasons against modifying successful scientific theories. However, given the possibility that quantum mechanics according to the many-worlds interpretation is not in fact a successful scientific theory (because of the probability problem), it seems reasonable to consider modifications to the standard theory.

Nevertheless, it is certainly true that there may be no underdetermination in quantum mechanics, since it is possible that only one of the interpretations described here will prove to be tenable. Indeed, it is possible that none of these interpretations will prove to be tenable, since all of them face unresolved difficulties. Hence the interpretation of quantum mechanics is still very much an open question.

8. References and Further Reading

Albert, David Z. Quantum mechanics and experience. Harvard University Press, 1992.
- Non-technical overview of the various interpretations of quantum mechanics and their problems.
Bell, John Stewart. Speakable and unspeakable in quantum mechanics: Collected papers on quantum philosophy. Cambridge University Press, 2004.
- A mix of technical and non-technical papers, including the original 1964 proof of Bell’s theorem and discussions of various interpretations of quantum mechanics, especially hidden variable theories.
Bohm, David. Quantum theory. Prentice-Hall, 1951.
- Classic quantum mechanics textbook, with early chapters covering the historical development of the theory.
Bohm, David, and Basil J. Hiley. The undivided universe: An ontological interpretation of quantum theory. Routledge, 1993.
- A guide to Bohm’s theory and its implications by its originator. Technical in parts.
Bub, Jeffrey. Bananaworld: Quantum mechanics for primates. Oxford University Press, 2016.
- Accessible introduction to the phenomena of entanglement, and an extended argument for an informational interpretation of quantum mechanics.
Cushing, James T. Quantum mechanics: Historical contingency and the Copenhagen hegemony. University of Chicago Press, 1994.
- A comparison of the Copenhagen interpretation and Bohm’s theory, and a defense of the view that the former became canonical largely for social reasons.
Greaves, Hilary. “Probability in the Everett interpretation.” Philosophy Compass 2.1 (2007): 109-128.
- Non-technical overview of the attempts to find a place for probability within Everett’s branching universe.
Kastner, Ruth. The transactional interpretation of quantum mechanics: The reality of possibility. Cambridge University Press, 2013.
- Non-technical introduction to the transactional interpretation, and development of a “possibilist” version as a response to objections.
Maudlin, Tim. Quantum non-locality and relativity. Blackwell, 1994.
- Non-technical guide to the problems of reconciling quantum mechanics with relativity.
Mermin, N. David. “Quantum mysteries for anyone.” The Journal of Philosophy 78 (1981): 397-408.
- Non-technical exposition of Bell’s theorem and discussion of its implications.
Ney, Alyssa, and David Z. Albert, eds. The wavefunction: Essays on the metaphysics of quantum mechanics. Oxford University Press, 2013.
- Essays on the ontological status of the wavefunction, including the issue of whether realism about the wavefunction makes the three-dimensional world of experience illusory.
Omnès, Roland. Understanding quantum mechanics. Princeton University Press, 1999.
- Accessible (but in parts moderately technical) defense of the consistent histories approach.
Price, Huw. Time’s arrow & Archimedes’ point: New directions for the physics of time. Oxford University Press, 1997.
- An extended, non-technical defense of the retrocausal hidden variable interpretation of quantum mechanics.
Rovelli, Carlo. “Relational quantum mechanics.” International Journal of Theoretical Physics 35 (1996): 1637-1678.
- Exposition and defense of relational quantum mechanics. Moderately technical in parts.
Saunders, Simon, Jonathan Barrett, Adrian Kent, and David Wallace, eds. Many Worlds?: Everett, Quantum Theory, & Reality. Oxford University Press, 2010.
- A collection of essays on the many-worlds interpretation, for and against, technical and non-technical. Includes an essay by Peter Byrne on the history of Everett’s interpretation.
Wallace, David. The emergent multiverse: Quantum theory according to the Everett interpretation. Oxford University Press, 2012.
- An exposition and defense of the many-worlds interpretation, focusing especially on the issue of probability. Technical in parts.

Author Information

Peter J. Lewis
Email: plewis@miami.edu
University of Miami
U. S. A.

Human Dignity

The mercurial concept of human dignity features in ethical, legal, and political discourse as a foundational commitment to human value or human status. The source of that value, or the nature of that status, are contested. The normative implications of the concept are also contested, and there are two partially, or even wholly, different deontic conceptions of human dignity implying virtue-based obligations on the one hand, and justice-based rights and principles on the other. Added to this, the different practical and philosophical presuppositions of law, ethics, and politics mean that definitive adjudication between different meanings is frustrated by disciplinary incommensurabilities.

What follows is an analysis of human dignity’s uses in law, ethics, and politics, and a critical description of the functions and tensions generated by human dignity within these fields. Crucial conceptual and methodological questions arise from the outset regarding whether human dignity can be reconstructed as one concept or must be treated as several concepts. It is argued here that a focal concept of human dignity can be reconstructed and that this concept provides the most illuminating perspective from which to view human dignity’s range of conceptions and uses.

Introduction
Conceptual Background
Themes
Conceptual Analysis
Conclusion
References and Further Reading

1. Introduction

There are a number of competing conceptions of human dignity taking their meaning from the cosmological, anthropological, or political context in which human dignity is used. Human dignity can denote the special elevation of the human species, the special potentiality associated with rational humanity, or the basic entitlements of each individual. There are, by extension, dramatically different normative uses to which the concept can be put. It is connected, variously, to ideas of sanctity, autonomy, personhood, flourishing, and self-respect, and human dignity produces, at different times, strict prohibitions and empowerment of the individual. It can also, potentially, be used to express the core commitments of liberal political philosophy as well as precisely those duty-based obligations to self and others that communitarian philosophers consider to be systematically neglected by liberal political philosophy.

As a consequence of these antagonistic currents of thought, philosophical analysis of human dignity cannot be separated from wider debates in moral, political, and legal philosophy. Nor can a certain level of selective reconstruction be avoided. The genealogy of the concept has been traced, tendentiously, through the whole history of Western, and sometimes non-Western, philosophical thought; such genealogies are not always illuminating at a conceptual level. More specifically, it is a desideratum of philosophical analysis of human dignity that the concept can be shown to have sufficient clarity to make a useful contribution to modern philosophical debate. This article therefore locates human dignity within a range of debates and suggests—using one important reconstruction of the concept—that human dignity represents a claim about human status that is intended to have a unifying effect on our ethical, legal and political practices.

We begin with an extended methodological and conceptual exploration, asking what should be taken as primary in examining human dignity. Because there is a particularly close relationship between contemporary uses of human dignity, international law, and human rights, this connection is treated as focal without assuming that it is definitive of the concept (for related but alternative starting points see Debes 2009; Waldron 2013; Donnelly 2015).

2. Conceptual Background

The use of human dignity in public international law is a marker for understanding the moral, legal and political discourse of human dignity. A characteristic expression is found in the Preamble of the International Covenant on Civil and Political Rights (1966) whose rights “derive from the inherent dignity of the human person” and whose animating principle is “recognition of the inherent dignity and of the equal and inalienable rights of all members of the human family [as] the foundation of freedom, justice and peace in the world.” This assertion and others like it form a common reference point in contemporary literature on human dignity. Importantly, this ‘inherent dignity’ represents a potential bridge between a number of different ideas and ideals, namely freedom, justice and peace.

In fact, it is this potential to bridge different fields of regulation—human rights, bioethics, humanitarian law, equality law and others—that we might take to be the most important function of human dignity in international law. We will refer to an interstitial concept of human dignity (IHD). This concept, arising from discourses and practices of international law, has a strong relationship with equality, liberty, and the basic status of the individual. And, crucially, it implies an interstitial or conjunctive function across our normative systems. It is where law, ethics, and politics meet and are practically and critically interrelated. It is where domestic, regional, and international regulation find a common principle. It is where positive law and morality become difficult to distinguish. And it is where specific norms and general principles are linked. By extension, this concept of human dignity is the concept we should treat as the foundation of human rights because any reconstruction of the complex menu of human rights in international law has to take account of their wide-ranging implications for legal, moral and political governance. Put another way, one necessary condition for a defensible, foundational account of human rights is that their foundational principle must have an interstitial function straddling these fields of normative practice.

Note that this does not capture, and is potentially in tension with, many existing linguistic and normative practices related to human dignity. For instance, discussion of ‘dignitarian harms’ relevant to healthcare law, or local prohibitions on degrading work, might well invoke the language of human dignity without intending any implications for other normative systems. They imply nothing about politics or about law more generally. These linguistic and normative manifestations of human dignity should be considered in their own terms and are returned to in what follows. But the question of why there are tensions between these uses and the IHD is a revealing line of enquiry in itself. It concerns genealogical changes in the concept but also, and more importantly, the ways in which norms and principles are shaped and conditioned within the different practices of law, ethics and politics. To be sure, an interstitial concept is treated here as the best vantage point for all the competing claims. But this is not to insist it is the only intelligible concept. What follows is a description of an IHD’s form, content, and normative uses and an initial comparison with competing characterizations.

First, the idea of form allows us to distinguish the IHD from other uses of ‘dignity.’ Human dignity in international law is associated with a cluster of closely related, but distinguishable, formal characteristics. Human dignity connotes universality (ascription to every human person), inalienability (it is a non-contingent implication of one’s status as human), unconditionality (a property requiring no performance or maintenance), and overridingness (having priority in normative disputes). These immediately assist in distinguishing an IHD concept from a behavioral description of dignity which would not be inalienable, a virtue ethical reading which would either not include ascription to every human person or would be contingent, or a healthcare ethics reading which might not insist on the overridingness of human dignity. Note that these formal criteria are not treated as necessary conditions for human dignity but are, rather, claims commonly associated with human dignity in international law. They assist, amongst other things, in distinguishing human dignity from dignity simpliciter with its associations with behavior and comportment. They also situate the IHD close to certain currents of Kantianism and deontology without assuming that Kant’s work is definitive of the concept.

Second, content encompasses the ‘what’ and the ‘who’ of human dignity. Invocation of human dignity invites us to ask what underlying conception of humanity is at work. The discourse of the ‘human person,’ often associated with human dignity in international law, captures the mixture of formal personhood and embodiment or vulnerability. The conjunction of human and person also produces potentially competing conceptual and ontological commitments, and we can draw a distinction between normative and taxonomical humanity in our discourse of human dignity (Donnelly 2015). Further complexity arises from strong species-based claims or discussions of transhumanism that are focused on potential changes in the ontology of humanity. Undoubtedly human dignity is associated with species claims but it is also intelligible to rely upon more formal claims about the characteristics of agents or persons in analysis of human dignity. Related to these questions of ascription, the ontological and normative commitments involved in a human dignity claim (the question of what) are varied. Human dignity could concern capacities, could include the direct requirement to exercise capacities, and might also concern a teleology for humanity (that is, the ontology of human dignity). Human dignity will—at least in the use of concern here—be closely linked to notions of autonomy, personhood and free will (that is, the correlates of human dignity). Related to this is a contrast (concerning what we might call the metaphysics of human dignity) between human dignity considered broadly as a property or as something arising relationally through recognition or respect.

Third, normative use concerns characteristic normative implications and normative functions. This has been usefully expressed as a distinction between empowerment and constraint (Beyleveld and Brownsword 2001). The IHD is commonly associated with empowerment through human rights. This is distinguishable from the constraint function commonly found in bioethics and healthcare ethics, often a peremptory ban on certain kinds of uses of human beings. It is less clear how the IHD functions regarding another common distinction, that between horizontal application (between individuals) and vertical application (between the state and individual). International human rights law predominantly concerns vertical application, but the IHD, particularly given its linking of law, morality and politics, does not preclude (and may imply) horizontal application. We may also note at this point a common distinction between human dignity as status and value. This turns, in part, on what response is required in the light of human dignity: status demands respect but also rights, duties and privileges; the existence of a value potentially requires fostering or enhancement. Only the former rights, duties and privileges are likely to be treated as having systemic application (being justiciable or enforceable), at least within liberal political systems that refuse to enforce moral conduct. As a consequence, the normative use of any IHD concept is undoubtedly conditioned by liberal assumptions concerning the proper scope of legislation. Nonetheless there are many instances of enforcement of more perfectionist or self-regarding conceptions of human dignity (for instance in the prohibition of ‘dwarf tossing’).

The last point reveals the most important tension in the general philosophical study of human dignity, namely the seeming co-existence of the interstitial concept characteristic of international law on the one hand and a perfectionist, virtue or purely self-regarding concept on the other. The assumption made here, that the latter perfectionist claims are non-focal or non-standard, is contentious (for the opposing view see Hennette-Vauchez 2011). Nevertheless this would appear to make the best sense of the majority of post-World War Two literature and thinking. Indeed the important post-war legal instruments themselves represent an interstitial process or moment, and the reconfiguration of the international legal order was the seedbed in which a certain idea of human dignity was given international expression. Far from being an accident of drafting or the contingencies of finding consensus, the (re)assertion of a notion of human dignity can be seen as the intention to transcend the boundaries of the legal, moral and political. Accordingly, while the following analysis does point to some historically contingent aspects of the use of human dignity, this is less important than the fact that the drafting of the Universal Declaration of Human Rights (1948) [UDHR] took place when the foundations of the international legal and political order were undergoing massive upheaval and when the need for a unifying moral principle was acute. We begin with law as the normative system within which the putative interstitial concept arose.

3. Themes

a. Law

There is no doubt that an IHD concept finds its most important expression in post-World War Two international law and constitutional instruments (the Universal Declaration of Human Rights, the Twin Covenants, and others). As such, the nature and function of human dignity in law could be assumed to be clear and well documented. This is the case at the level of doctrinal analysis of human dignity, and there is important jurisprudence arising in particular from the European Court of Human Rights and from constitutions including those of Germany, South Africa and Hungary. The sum of this jurisprudential thought is a mixture of general thinking about the foundation of constitutional rights alongside specific focus on the prohibition of degradation and objectification. This however points to two areas of deeper complexity, one hermeneutical and one concerning the conditioning effects of legal systems. First, different jurisdictions and institutions have given such radically different functions to human dignity that it is not always clear that one concept, the IHD, is at work. Indeed more substantive and perfectionist notions are often in evidence in national legal settings. Second, the IHD seems an ideal candidate for a kind of Grundnorm or secondary rule in law: a norm giving validity to legal systems as a whole or a principle governing the application of all norms within a system. However, this is difficult to defend as anything other than a loose generalization. In principled terms, legal systems treat justice as their foundational norm and this means that consistency, rather than moral defensibility, guides adjudication. And, in practice, it is not at all clear how human dignity can or should function as a ‘higher’ norm. There is, in other words, something of a mismatch between the putative function of the concept and its actual potential.

The nature and content of international law can partially explain such tensions. The prominent place of human dignity in international human rights instruments, as the foundation of those rights, has given human dignity enormous symbolic and heuristic significance. The foundational significance of human dignity is frequently assumed to extend beyond international human rights law to the international legal system as a whole. Where there are tensions between different fields of international law, or emerging practices in international law, human dignity is an important tool for focusing on the normative forces at work, in particular the significance of the individual as transcending the boundaries of state authority and as justifying state authority. It is fair to say that at this level human dignity is of enormous symbolic importance though human dignity is not, in itself, an enforceable norm of international law (the exception to this is in international humanitarian law’s Common Article 3, a prohibition on “outrages upon personal dignity”).

At the regional and domestic levels the normative implications of human dignity become more precise. While the European Court of Human Rights takes from international law the assumption that human dignity is foundational, it has operationalized it within its jurisprudence as an interpretive tool generally, and with particular reference to the idea of “torture, inhuman or degrading treatment.” This association between human dignity and the worst forms of degradation and objectification is shared with international humanitarian law and with German constitutional thinking. It is also the focus of the US constitutional deployment of human dignity as an interpretive tool in Eighth Amendment jurisprudence (concerning “cruel and unusual punishment”). The merit of this association with degradation is to give human dignity a clearer normative implication: the absolute impermissibility of certain kinds of gross mistreatment of the individual. Conversely, it is difficult to reconcile this restrictive, prohibitive reading with the assumption that human dignity is broad and foundational.

This relates, in turn, to a tension between human dignity operationalized as a specific norm (or in some instances a right) and a more general principle in law. Consider, for instance, Article 23 of the Universal Declaration of Human Rights (1948) (“everyone who works has the right to just and favourable remuneration ensuring for himself and his family an existence worthy of human dignity”). Here human dignity is neither a principle nor clearly foundational of the right it is associated with (or any other right); instead, it is a telos or standard. That standard is, potentially, related to material sufficiency or to flourishing and could be seen, to that extent, to have an aspiration to being interstitial. Nevertheless, it is (in fact) rare for human dignity to be enforced as a standard, and it is (in principle) unclear how this would amount to normative or conceptual unification of law, ethics and politics. It is possible that some instances of human dignity as a right or as a telos appear to have clear interstitial implications but nonetheless represent a different concept from the IHD because both their content and their normative implications differ (see Waldron 2013).

The complexities and possibilities that arise from human dignity being, in law, a right, standard or telos as well as a principle, value or status give rise to an underlying uncertainty as to whether law contains a single concept, a number of conceptions or simply a confusion of several ideas. There are a number of proposed normative and conceptual solutions to this tension, though it is not obvious how we might adjudicate between them. First, we can assume that human dignity necessarily has a dual status as norm (a more or less prohibitive norm) and as principle (predominantly symbolic and heuristic) (Alexy 2009). Second, we can assume that law has a number of different conceptions at work, conceptions that are either incommensurable (McCrudden 2008) or loosely linked by family resemblance (Neal 2012). Third, we can assume that law now has two very different concepts at work, one ancient and honor-based and the second closer to the IHD. We give this last option closer attention.

While many domestic or constitutional uses of human dignity are closely related to autonomy, privacy and the protection of agency, there is no doubt that (human) dignity has also been used to impose limitations on acts that can be seen as voluntarily diminishing an individual’s own human dignity or violating duties to themselves. In the broadest terms, then, there is a tension between a permissive reading of human dignity that protects autonomous individual agency from state intrusion, and a conservative reading that allows law to protect individuals from themselves. (This partially resembles Beyleveld and Brownsword’s contrast between the empowerment and constraint conceptions of human dignity.) These kinds of tensions are explored by Stephanie Hennette-Vauchez (2011), who insists on the coexistence of a human dignity principle, which is in essence a principle of equality, and an older (ancient) notion which is closer to a hierarchical notion of honor and permits the enforcement of certain norms related to self-respect. The form, content, and normative implications of these two ideas are clearly very different. While the idea of respect is morally important, it is difficult to reconcile the enforcement of respect with the assumptions we would treat as definitive of liberal legal systems, namely formal equality and division between public and private obligations. As such the honorific manifestations of human dignity are distinct from the liberal concept of human dignity; they are only rarely treated as enforceable (through personality law or public morality provisions) and lack the universal or inalienable characteristics of the IHD. They are nevertheless an irreducible part of contemporary law.

In sum, international law is a source of much of our thinking about human dignity, and in particular it gives credence to the idea of an IHD concept that can link different fields of legislation and different jurisdictions. At the same time, international and domestic legal institutions exercise a conditioning force on the discourse of human dignity. The implications of this are two-fold. First, as argued by James Griffin, human dignity acts as the foundation of human rights and gives rise to a large range of rights related to personhood and agency; nevertheless, the menu of human rights potentially generated by human dignity must be reduced or rationalized given the equal importance of legal institutions in national legal systems as a source of settled norms and practices (Griffin 2008). Second, legal systems require normative precision, and positive law invoking human dignity often appears to fall short of that precision; this has meant that jurists have favored conceptualizing and operationalizing human dignity through an association with degradation (Kaufmann et al, 2011). As Beitz insists, these implications raise related questions:

human dignity seem to apply (differently) at two distinct levels of thought about human rights—as a feature of a public system of norms and as a more specific value that explains why certain ways of treating people are (almost?) always impermissible. If there could be a theory of human dignity, one of its desiderata would be to show what (if anything) these senses of human dignity have in common and how they hang together (if they do). (2013, 283)

Beitz’s own analysis retains a certain kind of bifurcation between prohibitive and empowering conceptions of human dignity (2013, 289–290), suggesting resilient problems in making sense of human dignity’s place in law. Does the overridingness of human dignity have, in legal systems, to be conditioned by the normal institutional limits on legal norms and principles or does it retain its (extra-legal) moral force? And what role does philosophical anthropology play in our ethical and legal thinking, and should this inform what we take to be enforceable in law? This is a question of what we hold to be distinctively human and how, if at all, this should inform our thinking about law. A philosophical anthropology, along with related moral commitments, may demand or prevent perfectionist readings of human dignity which, in turn, has implications for any putative interstitial concept.

b. Ethics

i. General

Those concerns with philosophical anthropology form a point of departure for reflection on ethics. For example, animal ethics raises, sometimes explicitly but always at least implicitly, questions about the value of human beings in contrast to nonhuman animals. Answers to such questions will typically concern whether human beings have standing over animals, or whether human beings have an inner significance that animal beings lack. These two questions are ambiguous and the relation between them is far from clear. Supported by a tradition that has overshadowed much of our understanding of human dignity, the first question can be variously understood in terms of the elevation of the human species, human dominion over nature, humanity as imago dei, or the special worth of humanity relative to all other natural phenomena. In other words, it concerns human dignity as elevation rather than human dignity as human inner significance (compare Sensen, 2011). The second question, by contrast, leaves open the possibility that human beings and nonhuman animals have potentially incommensurable significances (Korsgaard, 2013; Nussbaum, 2006; Balzer, Rippe and Schaber, 2000; Kaldewaij, 2013). Each of these presumptions has a questionable relationship with an IHD.

Starting from the idea that human beings have a distinctive significance, at least two possibilities flow: the existence of duties of dignity that address its bearer, and duties of dignity that address others. Some philosophical theories deny a distinctive significance for human (and nonhuman) beings as such, but emphasize the contractual basis of our norms or argue that what matters morally is sentience (compare Gauthier, 1987; Singer, 2001). By contrast, philosophical views on human dignity emphasize that there is a distinctive significance to human beings and that this entails certain stringent ethical norms. Note that claiming a distinctive significance for human beings does not necessarily amount to prioritizing human beings over animals. (Claiming that human beings should be prioritized over animals would of course entail that human beings have a distinctive significance.) Indeed claims that both human nature and animal nature have their own distinctive significance can be interpreted both in terms of elevation and in terms of inner significance. When animal and human interests clash, one could try to compromise the interests of one to satisfy the same or even a different interest for the other, in line with or even as a matter of respect to their different dignities.

That being said, the claim of human significance has often found expression in philosophies that elevate human beings over animals. It should be noted that the very idea of a relative standing of human beings over nonhuman animals and nature does not entail that human beings should be protected for that dignity (Sensen, 2011). Rather, the relative elevation of a human being is conceived in terms of his distinctive human capacities that, given some teleological or religious background assumptions, entail for him a duty to exercise these. These capacities are, in turn, typically understood to be exercised by acting morally, that is, to act in line with a morality that concerns what one does to oneself, to other humans, or to God. It is these teleological or religious assumptions that generally benefit humans over animals. It has been argued that this view of humanity was central to Western traditional views of dignity including those of the ancients, medieval Christians, Renaissance and early Modern thinkers.

Within these moral schemes the question of what we should do to a human being is not (fully) decided by recognizing their dignity (as elevation), whereas the individual’s own duty to comply with that scheme is the main normative implication of the set of capacities that ground his dignity. He has initial dignity as subject to such a moral scheme, in particular by virtue of his capacity and correlated duty to live up to it. As such, his dignity may not entail any or all duties that others have to him, such as to respect or even support him. What we are to do to him depends on the content of the moral duty that we have as a result of our dignity grounding capacities, duties which are conceptualized in terms of cosmic principles or divine commands. That is to say, we are to respect each other not for our relative standing, our initial dignity, but given that and insofar as non-interference or support for beings that happen to have this standing is required by cosmic or divine principle. This principle specifies what we should value in the individual. As such, it specifies a type of dignity that comes closer to the inner significance view, which in turn may be, but does not necessarily require, an expression in terms of schemas that advance ideas of human elevation.

It is the inner significance view, not the human elevation view, that fits more easily within the formal features of the IHD. The normative significance view has found expressions in at least three ways: as a status (Habermas, 2010; Waldron and Dan-Cohen, 2012), a value (Rosen, 2012; Sulmasy, 2007) or a principle (Düwell, 2014). As a status, human dignity gives human beings a set of duties and rights. A value, by contrast, sets human dignity as something to sustain or promote. As a principle, human dignity sets a fundamental standard for action. These three types of specifications are featured in broader philosophical anthropologies that explain who has it and what should be protected in them—as well as entail implications for policy and law with regard to it. In other words, whether we treat human dignity as a value, status or principle will depend in large measure on the background assumptions—anthropological and/or cosmological—that we take to form the background of a claim about human dignity.

ii. Philosophical Anthropology

All three claims (status, value and principle) can be interpreted in terms of the formal features of the IHD (universal, unconditional, inalienable and overriding). At the same time, some views on the significance of humanity may deny one of these features, and this will affect the content and normative use of such a view of the significance of humanity considerably. In these respects, attempts to reconstruct non-Western traditional views on dignity should be especially sensitive not only to distinctions between status, value and principle, but particularly to the formal as well as substantive specifications of the significance of humanity in these traditions (Donnelly, 2009). It has been argued, for example, that the normatively relevant notion of humanity in the Confucian tradition should be understood in terms of dignity’s achievement through virtuous conduct, rather than in terms that make it independent of one’s character and conduct (Luo, 2014). This would touch on the issues of universality, unconditionality, inalienability and overridingness. In the Confucian tradition, dignity (qua ‘worth’) can be seen as a universal human potential that we may fail to cultivate: it is therefore universal but not unconditional; it can also be self-alienated and overridden.

It has been argued also that in certain Islamic traditions, Man has a God-given status as vicegerent on earth (Mozaffari, no date; Kamali, 2002; Maroth, 2014). This status may demand some respect, but how he is to be treated depends largely on what God has specified by law. If God demands—as some traditions seem to imply—respect for human individuals as a matter of their good deeds, piety or their living by the Book, then this would raise questions about consistency with the unconditionality and inalienability of an IHD. A further significantly different tradition, Hinduism, is sometimes interpreted to operate with a concept of dignity that a human individual shares because and insofar as his soul cannot be distinguished from the universe (Braarvig, 2014). On the one hand, this implies the significance of human individuals. On the other hand, given differentiations in the world of appearances we can distinguish degrees of dignity not only between individuals, but also between classes—which one can enter only through birth—specified by the presence of the universal whole in them. The possibility of rebirth in a higher caste—conditional on loyalty to the caste system or on pure chance—renders consistent this universal notion of dignity with the social one.

On top of these possible alternatives to an IHD at the formal level, it is also crucial to note the possibility of different accounts of the IHD in which these formal features may have different and incompatible contents, if not opposing implications for normative use. The differences concern not only questions about the nature of the subject of human dignity (a species, humanity or the human person) but also what is significant in him. Further differences emerge from answers to other questions: are we to grant him rights and impose duties on him; are we to value him, refrain from interfering with him and support him in perfecting himself; are we to respect him?

iii. Tensions

This mixture of concerns and foci (different background assumptions in terms of cosmology and anthropology, and different assumptions about the normative functioning of human dignity as status, principle, and value) gives rise to an expansive field of enquiry. Even if we were to consider only how the IHD may or may not be present in ethical accounts of human dignity, this would have to encompass the two substantial fields of normative ethics and applied ethics and would require careful analysis of how and why further links between politics, ethics and law are at issue. For present purposes we narrow our concerns to applied ethics.

Applied ethics can be understood by reference to ethical problems that arise from concrete practices. These practices emerge or have their existence in society and as such require attention by politics and law—not only by philosophical ethics. What we typically see is that the ethical issue is addressed in terms of norms or principles accepted in the practice, and that politics or law let this happen and regulate only in their own terms—quite independent of an explicit assessment in terms of IHD, let alone in terms of a coherent integration of philosophical ethics, politics, law, empirical knowledge and practical constraints (compare Düwell, 2012).

‘Dignity’ has different usages in different applied ethical practices, and in some it has none (Beyleveld and Brownsword, 2001; Nordenfelt, 2004; Sulmasy, 2013). For example, in the life sciences dignity is used both to legitimize a patient’s right to informed consent and to set constraints on her choices, for instance by limiting options such as deciding when to die. It is also used to characterize the way a patient deals with and adapts to his condition, the way a patient is treated, and to emphasize the effects of his condition or of the actions of others on his identity. It is used to emphasize the value a person attaches to himself, the extent to which he respects himself (Dillon, 2013). Dignity is the central term in assessing technological developments for their application to human life (Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics, 2008). Dignity is also used to argue against abortion and against pre-natal experimentation on early human life. It has been argued by some that all human life should be protected as a matter of dignity, whereas others emphasize protection of human life only if it will develop a personality. In this context, it is especially interesting to note that in debates on pre-natal enhancement, the notion of dignity is appealed to in defense of respecting the human species as such (Bostrom, 2005; Habermas, 2005). Here human dignity is said to be threatened by attempts to bring to life human beings enhanced in certain ways, such as enhanced to be more competent in certain abilities that are valued by parents or society. The worry concerns not only the dignity of the enhanced individual, whether it is violated or enhanced, but also the dignity of humanity as such: whether humanity is compromised by these interferences. It also concerns the dignity of non-enhanced human beings, whether it is threatened by the increased capacity of enhanced beings.
Not all of these usages express the same concept, let alone an IHD. Those that do may give only partial expression to competing versions of an IHD. Often, however, we see that problems are addressed without explicit recourse to an IHD, let alone via an integral assessment in terms of the philosophical commitments that come with such an IHD. It would make a significant difference if these discourses were orientated towards coherence with an IHD.

c. Politics

Already in discussion of applied ethics certain of the constraining and conservative uses of human dignity are in evidence. A ‘dignitarian alliance’ of conservative thinkers and activists has deployed a notion of dignity close to that of sanctity in order to oppose or constrain reproductive and biotechnological innovations (Brownsword 2003). Political discourse of the twentieth century also, by contrast, witnessed radical and liberation-focused discourses of human dignity. While the division between human dignity as empowerment and as constraint helps to partially map this contrast, this section draws a more general divide between power-focused conceptions of politics as opposed to principle-focused conceptions of politics. Principled accounts can in turn be divided between those who make ethics (and potentially human dignity) central to politics, and those who might accommodate other interstitial principles like justice or the rule of law.

In those accounts that make ethics clearly foundational to politics, human dignity could be conceived as a regulative idea, providing the trajectory of politics but not necessarily central to its practice. Slightly differently, human dignity could be treated as providing a conception of good politics and implying practical side-constraints within political systems. More directly, human dignity might be identified with the good, which would give human dignity a more clearly normative and perhaps perfectionist role (Boylan 2004). Efforts to synthesize aspects of pluralism with such accounts of the good have informed a capabilities approach intended to encompass both a substantial conception of the individual and the protections of agency and individuality characteristic of liberal thought. This itself is often expressed in the language of human dignity (Nussbaum 2006, Claassen 2014). This interpretation of human dignity in terms of capability-based flourishing has been reviewed and critically reinterpreted by reference to a different idea of dignity, that of dignity as a basic principle that demands recognition of the generic features of human agency as a matter of basic rights (Gewirth 1992). Far from being unrelated to the perfectionist notion of dignity, this latter notion of dignity functions as an underlying principle that may help us distinguish relevant from irrelevant human capabilities as well as rank them so as to prevent or settle clashes between them (Düwell 2009, Claassen and Düwell 2012). Such a take on capabilities would imply that possibilities for certain forms of flourishing should be protected as a matter of dignity, indeed the same kind of dignity that demands respect for freedom and well-being as basic features of agency. One further upshot of this approach would be that those things to be secured or provided might, in view of this principle, differ between persons as well as between contexts.
That is to say, to protect a capability for one agent may require different or more resources than protecting it in someone else (Boylan 2004). Also, when possibilities of securing agency are scarce in a community, priority should be given to capabilities at the core of agency. It might be that this represents a manifestation of the IHD concept in that the idea is intended to have application across different systems and also be extended to other, new forms of moral and political challenges.

In contrast, those positions that give the right priority over the good place rights and a plurality of reasonable conceptions of the good at the center of just institutional design. Such a ‘community of rights’ is quite directly committed to an interstitial notion of human dignity cashed-out as both basic human rights and systems for preserving freedom and welfare across all normative systems (Gewirth 1998). Rawls’s position (2009) in contrast faces the challenge of reconciling commitment to human dignity with treating justice as a primary institutional virtue. Rawls’s two principles of justice—while expressed in the language of basic rights and institutional virtues—could intelligibly be taken as an expression of a politics based on human dignity. However, this should give rise to important hermeneutical and conceptual hesitations. First, little is added to our understanding of Rawls’s work by associating it with human dignity, and conversely the distinctive conceptual characteristics of human dignity are immediately lost in more general debates about liberal political theory. Second, in Rawls’s later work where “decent non-liberal” societies are insulated from criticism and intervention from liberal states, we might say that Rawls concedes that non-liberal states—states that would clearly not accept an IHD principle as foundational—are nonetheless morally and politically justified (2001). By extension, the links between liberal political theory and human dignity are enormously complex, and can be conditioned by the demands of realism or non-ideal theory. With that in mind we turn to more practice-based and power-focused links.

The concept of human dignity as it appeared in post-war international law was undoubtedly intended to mark a decisive political, not just legal, turning-point. The concept is closely associated with the commitment “never again”—that never again should there be atrocities of the kind in the Second World War—and we could see human dignity as a predominantly political idea focused on the impermissibility of widespread and systematic attacks on civilian populations and by extension fundamental limitations on states’ sovereignty. In this sense there is credibility to an interstitial reading of human dignity that links international law, politics and morality in supporting a more individual-focused, less state-focused account of international relations. This, in turn, strengthens a link between human dignity and (moral and institutional) cosmopolitanism given that the value of individuals transcends state boundaries.

Conversely, this—interstitial and cosmopolitan—reading of human dignity has important limitations. First, the interstitial understanding of human dignity could be assumed to be, at heart, an ideological reading of human rights discourse: it is the rhetoric of human rights that links international law and politics rather than any systemic or philosophically defensible normative framework related to dignity. Second, the cosmopolitan understanding of human dignity faces the general vulnerability of all cosmopolitan philosophies (the priority of local and natural attachments in our moral thinking) and a specific attack via the problem of statelessness. That is, unless human dignity rests on or implies a ‘right to have rights,’ any political and legal discourse of human dignity will be inadequate in comparison to the systematic and concrete protections offered to citizens by constitutions and constitutional rights. We return to the right to have rights later by way of a more general analysis of social theory.

Certain historical and sociological trends are important for understanding human dignity and its role in politics. The first and most obvious is a shift from hierarchical societies to more democratic societies and with this an emphasis on the equal status and rights of individuals. A clash between the notions of dignity as aristocratic bearing and dignity as fundamental status is a characteristic of debates concerning the French Revolution. The ‘dignity of Man’ as emblematic of political emancipatory projects finds its first major expression during this revolutionary period, and it allowed the articulation of new emancipatory projects as in Wollstonecraft’s appeal to the equal dignity of men and women ([1792]1982). The post-World War Two invocation of human dignity undoubtedly shares basic humanistic, enlightenment, and liberal assumptions with these currents of eighteenth and nineteenth century thought, though by the twentieth century the idea of the ‘dignity of Man’ was being opposed not directly by defenders of the Ancien Régime but by Marxist and communitarian critics of liberalism. What unites these latter positions is concern about the insensitivity of human dignity relative to pressing political problems including colonialism and minority rights, along with more fundamental concerns about the emptiness of the concept relative to collective interests that cannot be disaggregated into individual interests.

Sociological shifts are also crucial in understanding the competing functions of human dignity in political discourse. The characteristics of modernity, as charted by both Weber and Durkheim, involve changes in the conception of the individual (including for Durkheim the creation of an ‘ethic’ or ‘religion’ of humanity), changes in the concept of politics, and changes in the political significance of human dignity. On the one hand, the more technocratic and bureaucratic nature of politics was held to have yielded a demystifying, but also dangerously dehumanizing, relationship between the individual and political power. In the light of that and related concerns, Margalit (2009) and others use human dignity to stress the importance of retaining dignity qua self-respect within political and social practices. By the same token, Honneth’s work on the political conditions of recognition (1996) entwines respect with the basic conditions of individual and group identity. On the other hand, liberal institutions that intended to preserve the basic status of the individual have been held to be inadequate to maintain the conditions of the possibility of ethical life. This has meant direct attacks on ‘liberal’ practices, including human rights, by communitarian theorists.

It is against this background that a different style of political theorizing about human dignity can be found in the second half of the twentieth century. Hannah Arendt’s Aristotle-inspired political theory emphasizes the importance of recognition in a political community and of strong constitutional rights with an equation between human dignity and the right to have rights (Arendt 1958). Arendt offers an influential internal critique of politico-legal understandings of human dignity. Broadly, Arendt is unsympathetic to any potential interstitial concept (given her views on the basic conditions of politics) and to generalizations about the rights of Man (given her writings on the emptiness of this notion, particularly with regard to the status of refugees). In contrast she stresses the basic importance of citizenship as a condition of protecting the basic status of the individual. There are nevertheless resources in Arendt’s work that are clearly sympathetic to human dignity and human rights as more expansive commitments, and human dignity could be seen as the best expression of that view of human dignity as opposition to atrocity and defensive of human status and human plurality (Menke 2014).

In the light of these competing currents of thought, and the complexities of the concept itself, human dignity does not map neatly onto the division between empowerment and constraint or between the priority of the good and the priority of the right. The IHD, to the extent that it is a recognizable component of political thinking, might be assumed to be closer to conceptions of politics focused on the rule of law rather than a substantive conception of the good. Understood as interstitial concepts, human dignity and the rule of law are intended precisely to express the importance of links between politics and law and the co-regulation of the two. The rule of law is important not only as an expression of self-restraint in politics but also as a necessary condition of a permissive politics of human agency, choice and self-creation. This might be otherwise expressed in terms of a defense of the public-private divide. It could be expressed in more sociological terms as a defense of functional differentiation, the coexistence of different social systems that an individual can move between. Or this might be linked to a libertarian defense of minimalism in the power of the state. The unifying idea here is that human dignity is a principle with significance for political, legal and moral systems and which preserves, one way or another, the freedom and self-creation of the individual. It has been the recurrent theme of communitarian critics of liberalism and human rights that such permissiveness undermines the self-constitution of the individual within a polity. Middle ground could, potentially, be found in the capabilities approach or in an Arendtean stress on the right to have rights.

4. Conceptual Analysis

a. The Conceptual Features of Human Dignity

It is desirable, but no simple task, to begin to draw more general conclusions about human dignity as a concept and as a component of normative debate. It is worth briefly contrasting how we might approach the analysis of human dignity with that of human rights. Discussion of human rights features settled debates concerning their moral or political justification, an appropriate theory of rights, and human rights’ tailoring to practice. Analysis of human dignity, in contrast, lacks such clearly defined parameters because it is plausible that there are competing concepts of human dignity and not just competing conceptions. That is, it is not simply that in academic debate different aspects of a single concept can be given special emphasis or that there are competing justificatory strategies for the same, shared, idea. Rather, ‘human dignity’ might encompass historically different, and antagonistic, ideas. For this reason, meta-studies of the uses of human dignity have difficulty yielding definitive analysis of the concept’s presuppositions and functions, or have mapped a number of functions that are difficult to cohere (Nordenfelt 2004; Sulmasy 2013). Bonding the many functions of human dignity may be possible, at best, only through performative analysis (O’Malley 2011) or family resemblance analysis (Neal 2012), but these involve abandoning a single idea of human dignity in favor of describing various local uses.

b. The Credibility of an Interstitial Concept

In contrast, we would argue that the three normative fields of law, morality and politics together offer at least the possibility of a distinctive, focal concept. The idea of the absolute status of every individual can intelligibly be held to frame our normative practices. Indeed, the magnitude of this commitment is such that it would have to be manifest in all of our social practices. Clearly, however, this is not without problems. Any conceivable defense of an IHD concept—one that, by definition, sits between and links different normative practices—faces the immediate problem of the conditioning assumptions of those disciplines and practices (including the local practices and settled dispositions and attitudes of those working within the fields). This can be treated as a three-fold problem. The validity of any legal norm is conditional on political will (the problem of the primacy of the political); the moral justification of the idea still requires further explanation and justification (the problem of the foundations of morality); and the legal notion itself will be conditioned by a legal system so that it can be consistently operationalized within the system (the problem of the demands of justice or the normative closure of law). These three problems are pressing problems for any IHD claim precisely because the concept must claim to transcend these conditioning aspects of our normative practices.

However, it can be argued that the possibility of an interstitial concept nevertheless has support within the fields. For example, the idea of a rule of law is intended to unify different fields of legal and political regulation (through demanding their consonance with good law consistent with human agency), and for that reason a number of theorists closely associate human dignity and the rule of law (Waldron 2008; Fuller 1964). Beyond this, human dignity might well inspire more productive and precise regulatory practices, be they related to global, social or procedural justice. If the rule of law is the minimal demand that there be a good match between regulation and agency, wider ‘projects’ conjoining law, ethics, and politics can be meaningfully expressed in the language of human dignity given its unifying function. Put more modestly, the idea of politics as an anomic practice is difficult to defend—after all, law and politics stand in a relation of productive co-constitution with politics making law and legal systems revising the content of that law and regulating political practices themselves—and our best reconstructions of the foundations of political practices and institutions are likely to involve commitment to the kinds of formal assumptions associated with human dignity (Rawls 2009; Habermas 2010). And moral theories can enforce duties which in turn generate institutional designs and procedural mechanisms intended to protect human dignity and render it immanent in social systems (Gewirth 1998). In sum the three problems associated with an IHD claim are not uniformly accepted and should not be treated as a refutation of interstitial claims in general or an IHD concept specifically.

Above all, a connection between human rights and human dignity gives critical force to human dignity and indicates precisely why the predominant concept of human dignity should be assumed to be an interstitial one. Conceptualizing human dignity as foundational is sometimes construed as bonding the existing body of human rights law with a moral claim that guarantees their force as moral, not just positive, rights. The most plausible explanation of such a guarantee is through deontological theory granting supreme moral importance to the individual and immunizing them from consequentialist determinations of the common good that would potentially sacrifice their rights and their status. Beyond this, the precise account of justification, rights, and practice is open to debate, but human dignity is the foremost expression of the deontological commitments sketched here. Even in this sketch it is clear that the normative fields of law, ethics, and politics are not intended to be absolutely divided but rather guided and judged by their consistency with the protection of human rights. It is this claim that lies at the heart of an interstitial concept of human dignity (and much else besides in international law). It remains to draw out the implications of this.

c. The Implications of an Interstitial Concept

Assuming that an IHD concept—sitting between normative fields, linking these fields, and conditioning them—is intelligible, then its implications are considerable. Let us assume that the commitments contained in such a concept are as follows. Human dignity is treated as having the formal features identified (universality, overridingness, and so forth); it has the characteristic content of human dignity claims (a species claim or a claim about human dignity being relational or a property); and it encompasses commitment to a distinctive normative use (for example, empowerment of the individual, expressed in terms of claim rights, that holds at least between the individual and all political institutions). The sum of this commitment would be as follows. In all interactions between state and individual, claim rights (expressible as human rights) can and should be exercised by all human persons, and the exercise of those rights would not be conditioned by any jurisdictional boundaries. This amounts to having significance in all possible interactions between the collective and the individual. It will imply that there is no interaction between individuals that is not at least potentially normatively governed by human dignity. And it implies that any special demands about normative priorities made by law, ethics or politics would be justified only to the extent that they were consistent with, or directly conditioned by, the overarching commitment to human dignity. This concept is, then, enormously demanding insofar as its fulfillment would not be discharged on the basis of respecting a single norm (be it a Grundnorm or an anti-atrocity norm) but would, rather, demand an ongoing commitment to subject every executive and administrative decision to scrutiny on the basis of its consonance with the content and implications of human dignity particularly as this is expressed through human rights.

What conceptual and practical problems does this imply? The actual enforceability of human dignity itself as a norm or right is potentially unclear here, and the idea of human dignity’s overridingness sits uneasily with many common legal, political and moral assumptions. For related reasons it is not clear if human dignity should be a named, explicit norm within a constitution. It would be impracticable (indeed perhaps senseless) to have a norm that trumped all other norms; human dignity cannot be assumed to function in a normative vacuum. And the function of an interstitial concept is to link and justify different normative fields, not to directly govern them through one explicit Grundnorm. In fact, having concrete implications for these fields demands a more complete explication of the concept in terms of human rights which themselves require clear institutional arrangements. What human dignity amounts to is an expression of the foundations of any and all of our normative practices and the demand that human rights and human dignity have a constitutive and not just regulative role in our social institutions and practices. Nevertheless, this is a demand for a far more substantial explication of human rights, institutions, and good—that is, human dignity preserving—interaction between law, morality and politics in practice.

If, despite such challenges, we accept this IHD reading, we should reject a number of other readings of human dignity as peripheral or incoherent. Common uses of human dignity in healthcare and medical ethics that treat human dignity as one amongst many ‘middle-level principles,’ or bioethical readings that treat human dignity as synonymous with sanctity, would be non-standard readings on these assumptions and intelligible only as idiosyncratic local uses. Common criticisms of human dignity as vacuous or empty (because human dignity apparently collapses into notions of autonomy) would be rejected as incoherent because they fail to distinguish an IHD from either idiosyncratic local uses or from irrelevant non-interstitial uses. There would remain, however, an important but complex line of enquiry concerning how human dignity and self-regarding duties should be thought to interact. On the one hand, the IHD concept has been detached from the perfectionist Stoic tradition invoking species norms which determine whether individuals are ‘fully human.’ On the other hand, the typical form, content, and normative implications of the IHD need not exclude the possibility of self-regarding duties arising from respecting one’s own status as a human person.

5. Conclusion

The foregoing analysis stressed the problems of using human dignity in philosophical and ethical thought. The concept itself is opaque, and one important modern usage faces the problem of aspiring to be interstitial within and between normative fields that are themselves resistant to the very idea of such interstitial concepts. Nevertheless, there are good reasons why such a far-reaching concept should be primary in our thinking, and for this reason human dignity is likely to remain a component of normative discourse despite its problematic characteristics.

6. References and Further Reading

Alexy, R. (2009) A theory of constitutional rights. Oxford University Press.
Arendt, H. (1958) Origins of Totalitarianism, Meridian Books.
Balzer, P., Rippe, K. P. and Schaber, P. (2000) ‘Two Concepts of Dignity for Humans and Non-Human Organisms in the Context of Genetic Engineering’, Journal of Agricultural and Environmental Ethics, 13(1), pp. 7–27. doi: 10.1023/A:1009536230634.
Beitz, C. (2013) ‘Human Dignity in the Theory of Human Rights: Nothing But a Phrase?’, Philosophy and Public Affairs, 41(3), pp. 259–290.
Beyleveld, D. and Brownsword, R. (2001) Human dignity in bioethics and biolaw. Oxford: Oxford University Press.
Bostrom, N. (2005) ‘In Defense of Posthuman Dignity’, Bioethics, 19(3), pp. 202–214. doi: 10.1111/j.1467-8519.2005.00437.x.
Boylan, M. (2004) A Just Society. Rowman & Littlefield Publishers.
Brownsword, R. (2003) ‘Bioethics today, bioethics tomorrow: stem cell research and the dignitarian alliance’, Notre Dame JL Ethics & Pub. Policy, 17, pp. 15–51.
Braarvig, J. (2014) ‘Hinduism: the universal self in a class society’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
Claassen, R. and Düwell, M. (2013) ‘The foundations of capability theory: comparing Nussbaum and Gewirth’, Ethical theory and moral practice, 16(3), pp. 493–510.
Claassen, R. (2014) ‘Human Dignity in the Capability Approach’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
Debes, R. (2009) ‘Dignity’s gauntlet’, Philosophical Perspectives, 23(1), pp. 45–78.
Dillon, R. S. (2013) Dignity, Character and Self-Respect. Routledge.
Donnelly, J. (2009) ‘Human Dignity and Human Rights’, Commissioned by and Prepared for the Geneva Academy of International Humanitarian Law and Human Rights in the framework of the Swiss Initiative to Commemorate the 60th Anniversary of the Universal Declaration of Human Rights. Available at: http://www.udhr60.ch/report/donnelly-HumanDignity_0609.pdf.
Düwell, M. (2009) ‘On the Possibility of a Hierarchy of Moral Goods’, in Morality and Justice: Reading Boylan’s A Just Society, John-Steward Gordon (ed.), Rowman & Littlefield Publishers, Inc: Lanham, MD.
Düwell, M. (2012) Bioethics: Methods, Theories, Domains. Routledge.
Düwell, M. (2014) ‘Human dignity: concepts, discussions, philosophical perspectives’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
Fuller, L.L. (1964) The Morality of Law. Yale University Press.
Gauthier, D. (1987) Morals By Agreement. Oxford University Press, USA.
Gewirth, A. R. (1998) The community of rights. Springer Netherlands.
Habermas, J. (2005) Die Zukunft der menschlichen Natur: auf dem Weg zu einer liberalen Eugenik?. Frankfurt am Main: Suhrkamp.
Habermas, J. (2010) ‘The Concept of Human Dignity and the Realistic Utopia of Human Rights’, Metaphilosophy, 41(4), pp. 464–480. doi: 10.1111/j.1467-9973.2010.01648.x.
Hennette-Vauchez, S. (2011) ‘A human dignitas? Remnants of the ancient legal concept in contemporary dignity jurisprudence’, International journal of constitutional law, 9(1), pp. 32–57.
Honneth, A. (1996) The struggle for recognition: The moral grammar of social conflicts. MIT Press.
Human dignity and bioethics: essays commissioned by the President’s Council on Bioethics. (2008). Washington: [s.n.].
Kaldewaij, F. E. (2013) The animal in morality. Justifying duties to animals in Kantian moral philosophy. Department of Philosophy, Utrecht University. Available at: http://dspace.library.uu.nl/handle/1874/275543.
Kamali, P. M. H. (2002) The Dignity of Man: An Islamic Perspective. 2nd edition. Islamic Texts Society.
Kaufmann, Paulus, et al. (2011) ‘Human dignity violated: a negative approach–introduction’, in Kaufmann, P., Kuch, H., Neuhäuser, C., & Webster, E. (eds) Humiliation, Degradation, Dehumanization. Netherlands: Springer, pp. 1–5.
Korsgaard, C. M. (2013) ‘Kantian Ethics, Animals, and the Law’, Oxford Journal of Legal Studies, 33(4), pp. 629–648. doi: 10.1093/ojls/gqt028.
Luo, A. (2014) ‘Human dignity in traditional Chinese Confucianism’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.021.
Margalit, M. A. (2009) The decent society. Cambridge Mass.: Harvard University Press.
Maroth, M. (2014) ‘Human dignity in the Islamic world’, in The Cambridge Handbook of Human Dignity. Cambridge University Press.
McCrudden, C. (2008) ‘Human Dignity and Judicial Interpretation of Human Rights’, European Journal of International Law, 19(4), pp. 655–724.
Menke, C. (2014) ‘Human Dignity as the Right to Have Rights: Human Dignity in Hannah Arendt’, in The Cambridge Handbook of Human Dignity. Cambridge University Press. Available at: http://dx.doi.org/10.1017/CBO9780511979033.004.
Mozaffari, M. H. (no date) ‘The concept of Human Dignity in the Islamic Thought’, Hekmat: International Journal of Academic Research, (4), pp. 11–28.
Neal, M. (2012) ‘Dignity, law and language-games’, International Journal for the Semiotics of Law-Revue internationale de Sémiotique juridique, 25(1), pp. 107–122.
Nordenfelt, L. (2004) ‘The varieties of dignity’, Health care analysis: HCA: journal of health philosophy and policy, 12(2), pp. 69–81; discussion 83–89. doi: 10.1023/B:HCAN.0000041183.78435.4b.
Nussbaum, M. C. (2006) Frontiers of justice: disability, nationality, species membership. Cambridge, Mass.: The Belknap Press : Harvard University Press.
O’Malley, M. J. (2011) ‘A Performative Definition of Human Dignity’, in Facetten der Menschenwürde, pp. 75–101.
Rawls, J. (2001) The law of peoples: with, the idea of public reason revisited. Cambridge, Mass.: Harvard University Press.
Rawls, J. (2009) A theory of justice. Cambridge, Mass.: Harvard University Press.
Rosen, M. (2012) Dignity: its history and meaning. Cambridge, Mass.: Harvard University Press.
Sensen, O. (2011) ‘Human dignity in historical perspective: The contemporary and traditional paradigms’, European Journal of Political Theory, 10(1), pp. 71–91. doi: 10.1177/1474885110386006.
Singer, P. (2001) Animal Liberation. Ecco Press.
Sulmasy, D. P. (2007) ‘Human dignity and human worth’, in Perspectives on human dignity: A conversation. Springer, pp. 9–18. Available at: http://link.springer.com/content/pdf/10.1007/978-1-4020-6281-0_2.pdf.
Sulmasy, D. P. (2013) ‘The varieties of human dignity: a logical and conceptual analysis’, Medicine, health care, and philosophy, 16(4), pp. 937–944. doi: 10.1007/s11019-012-9400-1.
Waldron, J. (2008) ‘The Concept and the Rule of Law’, Georgia Law Review, 43(1), pp. 1–62.
Waldron, J. and Dan-Cohen, M. (2012) Dignity, rank, and rights. Oxford; New York: Oxford University Press.
Waldron, J. (2013) ‘Is dignity the foundation of human rights?’ NYU School of Law, Public Law Research Paper 12–73. doi: http://dx.doi.org/10.2139/ssrn.2196074.
Wollstonecraft, M. (1982) Vindication of the Rights of Woman. Ontario: Broadview Press.

Author Information

Stephen Riley
Email: stephenriley12@gmail.com
Utrecht University
Netherlands

and

Gerhard Bos
Email: bos.gerhard@gmail.com
Utrecht University
Netherlands

Luck

Winning a lottery, being hit by a stray bullet, or surviving a plane crash, all are instances of a mundane phenomenon: luck. Mundane as it is, the concept of luck nonetheless plays a pivotal role in central areas of philosophy, either because it is the key element of widespread philosophical theses or because it gives rise to challenging puzzles. For example, a common claim in philosophy of action is that acting because of luck prevents free action. A platitude in epistemology is that coming to believe the truth by sheer luck is incompatible with knowing. If two people act in the same way but the consequences of one of their actions are worse due to luck, should we morally assess them in the same way? Is the inequality of a person unjust when it is caused by bad luck? These two complex issues are a matter of controversy in ethics and political philosophy, respectively.

A legitimate question is whether the concept of luck itself is worthy of philosophical investigation. One might think that it is not given (i) how acquainted we are with the phenomenon of luck in everyday life and (ii) the fact that progress has been made in the aforementioned debates on the assumption of a pre-theoretical understanding of the notion.

However, the idea that a rigorous analysis of the general concept of luck might serve to make further progress in areas of philosophy where the notion plays a fundamental role has motivated a recent and growing philosophical literature on the nature of luck itself. Although some might be skeptical that investigating the nature of luck in general can help shed some light on long-standing philosophical debates such as the nature of knowledge—see Ballantyne 2014—it is hardly sustainable that no general account of luck will be able to ground any substantive claim in areas of philosophy where the notion is constantly invoked but left undefined. This article gives an overview of current philosophical theorizing about the concept of luck itself.

Preliminary Remarks
Luck and Significance
Probabilistic Accounts
1. Objective Accounts
2. Subjective Accounts
Modal Accounts
Lack of Control Accounts
Hybrid Accounts
Luck and Related Concepts
1. Accidents
2. Coincidences
3. Fortune
4. Risk
5. Indeterminacy
References and Further Reading

1. Preliminary Remarks

The following preliminary remarks will address three questions: (1) What are the bearers of luck? (2) What is the target of the analysis of current accounts of luck? (3) What general features of luck should an adequate analysis of luck be able to explain?

a. The Bearers of Luck

The best way to find out what the bearers of luck are is to consider the kinds of entities of which we predicate luck-involving terms and expressions such as “lucky,” “a matter of luck,” or “by luck.”

1. Agents. On the one hand, the term “lucky” can be predicated of agents—for example, “Chloe is lucky to win the lottery.” In general, the kinds of beings to which we attribute luck are beings with objective or subjective interests such as self-preservation or desires—see Ballantyne (2012) for further discussion. In this sense, a human or a dog is lucky to survive a fortuitous rockfall, but a stick of wood or a car is not. Still, at least in some contexts, it seems correct to attribute luck to an object without interests, as when one says that one’s beloved car is lucky not to have been damaged by a fortuitous rockfall. However, assertions of this kind are felicitous only insofar as they are parasitic on our interests. No one would say that a stick of wood is lucky not to have been destroyed by a rockfall if its existence bore absolutely no significance to anyone’s interests, and if one did, one would say it only figuratively.

A related question is whether the kind of agents to which we attribute luck are only individuals or whether luck can also be ascribed to collectives. There is certainly a sense in which a group of individuals can be said to be lucky, as when we say that a group of climbers is lucky to have survived an avalanche. Coffman (2007) suggests that there seems to be no reason why group luck cannot be reduced to or explained in terms of individual luck. But if one holds—with many theorists working on collective intentionality—that groups can be the bearers of intentional states, it might turn out that group luck cannot be so easily reduced to individual luck. For example, if it is by bad luck that a manufacturing company fails to achieve its yearly revenue goal—so it is bad luck for the company—it does not necessarily follow that each and every one of its workers—for example, people working on the assembly line—is also unlucky, if, say, they cannot be fired by law and the company is not compromised.

2. Events. On the other hand, the term “lucky” and expressions such as “a matter of luck” or “by luck” can be predicated of events—for example, “Chloe’s lottery win was lucky”—and states of affairs—for example, “It is a matter of luck that Chloe won” or “Chloe’s winning the lottery was by luck”; see Coffman (2014) for further discussion. Plausibly, luck-involving expressions can be also predicated of items belonging to related metaphysical categories such as accomplishments, achievements, actions, activities, developments, eventualities, facts, occurrences, performances, processes, and states. For presentation purposes, luck will be here described as a phenomenon that applies to agents and events, where by “agent” is meant any being with interests and by “event” any member of the previous categories.

b. The Target of the Analysis

1. Relational versus non-relational luck. We say things such as (1) an event E is lucky for an agent S and (2) S is lucky that E. We also say things such as (3) it is a matter of luck that E and (4) E is by luck. Milburn (2014) argues that (1) and (2) are plausibly equivalent: E is lucky for S if and only if S is lucky that E. (3) and (4) also seem equivalent: it is a matter of luck that E if and only if E is by luck. However, (1) and (2) are not equivalent to (3) and (4). Milburn is right in pointing out that this marks an important distinction that anyone in the business of analyzing luck should keep in mind.

The difference between (1) and (2), on the one hand, and (3) and (4), on the other, is that (1) and (2) denote a relation between an agent and an event, whereas (3) and (4) are not indicative of any relation and only apply to events. Call the kind of luck denoted by (1) and (2) relational luck and the kind of luck denoted by (3) and (4) non-relational luck—Milburn uses different terminology: he employs the expression “subject-relative luck” to refer to relational luck and “subject-involving luck” to refer to non-relational luck when the relevant event concerns an agent’s action.

Relational luck can be distinguished from non-relational luck regardless of whether the target event is an agent’s state or action. For instance, when the relevant event is an action by the agent—for example, that S scores a goal—the luck-involving expressions in (3) and (4) apply to the agent—for example, it is a matter of luck that S scores a goal—but fail to establish a relationship between the agent—S—and the event—S’s scoring of a goal. In contrast, if the target event is the agent’s action, (1) and (2) do establish a relationship between the agent and her action—for example, that S scores a goal is lucky for S.

In the literature, most accounts of luck try to explain what it takes for an event to be lucky for an agent. In other words, they focus on relational luck. But it might well be that in order to shed light on the special varieties of luck—for example, epistemic, moral, distributive luck—one might need to shift the focus of the analysis to non-relational luck—see Milburn (2014) for further discussion.

2. Synchronic versus diachronic luck. Most accounts of (relational and non-relational) luck focus on when an event is lucky—for an agent or simpliciter—at one point in time. However, Hales (2014) argues that luck may be predicated not only synchronically—that is, of an event’s occurrence at a certain time—but also diachronically—that is, of a series or streak of events occurring at different times. For example, synchronically, we say things such as “Joe was lucky to hit the baseball at the end of the game.” Diachronically, we say things such as “Joe was lucky to safely hit in 56 consecutive baseball games.” Hales’s point is that we can be lucky diachronically but not synchronically, and the other way around. By contrast, McKinnon (2013; 2014) argues that while we can determine the presence and degree of diachronic luck—for example, luck in a streak of successful performances—we do not have the ability to determine the presence of synchronic luck—that is, whether a concrete performance is by luck.

3. Strokes of luck. An important departure from standard analysis of relational and non-relational luck is Coffman (2014; 2015), who thinks that the notion of an event E being a stroke of luck for an agent S is more fundamental than the notion of E being lucky for S—or more simply, than the notion of lucky event—and that, therefore, the former should be the target of the analysis of any adequate account of luck. Nonetheless, Coffman’s account of strokes of luck features the same kind of conditions that other authors give in their analyses of the notion of lucky event. In view of this, Hales (2015) objects that Coffman’s approach unnecessarily adds an extra layer of complexity to the already complex analysis of luck and casts doubt on how an analysis of the notion of stroke of luck can shed any more light than an analysis of the notion of lucky event in those areas of philosophy where the concept of luck plays a significant role.

c. General Features of Luck

Before entering into further details, it is convenient to highlight three general features of luck that any adequate analysis of the concept should be able to explain.

Goodness and badness. Luck can be good or bad. This is clearly true of relational luck. For instance, we say things such as “Dylan was lucky to survive the car accident” or “Dylan was unlucky to die in the car accident” to mean, respectively, that it is good luck that he survived and bad luck that he died. Moreover, one and the same event can be both good and bad luck for an agent, which plausibly has to do with the fact that two or more interests of the agent are at stake—Ballantyne (2012). For example, losing one’s keys and having to spend the night outdoors is bad luck if one gets a cold as a consequence, but it is also good luck if one thereby avoids an explosion in one’s apartment.

By contrast, attributions of non-relational luck do not so clearly convey good or bad luck—for example, “The discovery of Pluto was a matter of luck.” This is plausibly due to the fact that such attributions do not denote any relationship between a lucky event and an agent or group of agents. To put it differently, if we interpret attributions of that sort as conveying good or bad luck, it is probably because we read them as denoting such a relationship. At any rate, accounting for why luck is good or bad is a desideratum at least for analyses of relational luck.

Finally, although the term “lucky” is ordinarily associated with good luck, in the philosophical literature, it is used to denote events that instantiate good luck as well as events that instantiate bad luck. This is done mainly for the sake of simplicity.

Vagueness. Luck is to some extent a vague notion. Not all instances of luck are as clear-cut as a lottery win. For example, goals from the corner kick in professional soccer matches are considered neither clearly lucky nor clearly produced by skill. Pritchard (2005: 143) gives another example: if someone drops her wallet, keeps walking and after five minutes realizes that she just lost her wallet, returns to the place where she dropped it and finds it, is that person lucky to have found her wallet? The answer is not clear. Accordingly, we should not expect an analysis of luck to remove this vagueness. On the contrary, an adequate account should predict borderline cases, that is, cases that are neither clearly lucky nor clearly non-lucky. This is a desideratum for accounts not only of relational luck but also of non-relational luck.

Gradualness. Luck is a gradual notion. In ordinary parlance, it is common to attribute different degrees of luck to different events. For example, winning one million dollars playing roulette is luckier than winning one dollar, even if the odds are the same. Likewise, winning the prize of an ordinary lottery is luckier than winning the same amount of money on a coin toss, because the odds of winning the lottery are lower. An adequate analysis of luck should also be able to account for these differences in degree. Again, this concerns accounts of relational luck as well as of non-relational luck.

2. Luck and Significance

Several atomic nuclei joining and triggering off an explosion is an event that is neither lucky nor unlucky for anyone if it happens at the other end of the galaxy. But it is bad luck if the explosion takes place nearby. One way to account for the difference in luckiness is that while the former event is not significant to anyone, the latter is significant to whoever is nearby. Cases like this motivate philosophers who theorize about the concept of luck to endorse a significance condition, that is, a requirement to the effect that an event is lucky for an agent only if the event is significant to the agent.

Since the significance condition establishes a relationship between an agent and an event, whether one thinks that such a condition is needed or not depends on what the target of one’s account is. For instance, if one is in the business of analyzing relational luck, one will be willing to include a significance condition in one’s analysis. But if one’s aim is to account for non-relational luck instead—that is, when an event is lucky simpliciter—one will be reluctant to include such a condition in one’s analysis—see Pritchard (2014) for further discussion.

Although there is wide agreement that an adequate analysis of relational luck must include a significance condition, there is significant disagreement about its specific formulation. Pritchard (2005: 132–3) formulates the significance condition as follows:

S1: An event E is lucky for an agent S only if S would ascribe significance to E, were S to be availed of the relevant facts.

S1 requires that lucky agents have the capacity to ascribe significance. But that is problematic insofar as the condition prevents sentient nonhuman beings (Coffman 2007) and human beings with diminished capacities like newborns or comatose adults (Ballantyne 2012) from being lucky.

Coffman (2007) proposes an alternative significance condition in terms of the positive or negative effect of lucky events on the agent:

S2: An event E is lucky for an agent S only if (i) S is sentient and (ii) E has some objective evaluative status for S—that is, E has some objectively good or bad, positive or negative, effect on S.

Ballantyne (2012) gives a counterexample to S2 by arguing, first, that (ii) should be read as follows:

(ii)* E has some objectively positive or negative effect on S’s mental states.

The reason given by Ballantyne is that if the event’s effect is not on the agent’s mental states, it is not obvious why clause (i) is required. With that in place, the counterexample to S2 goes as follows: an unlucky man has no inkling that scientists have randomly selected him to put his brain in a vat to feed his neural connections with real-world experiences. The case is allegedly troublesome for S2 because the event, which is bad luck for the man, has no impact on the man’s mental states and, in particular, on his interior life, which is not altered.

A reply might be that, although the fact that the man’s brain is put in a vat does not affect his interior life, that is, his phenomenal mental states, it certainly affects his representational mental states. In particular, most of them turn out to be false, which seems to be objectively negative for the man, just as S2 requires.

Ballantyne (2012) proposes an alternative formulation of the significance condition in terms of the positive or negative effect of lucky events on the agent’s interests:

S3: An event E is lucky for an agent S only if (i) S has a subjective or objective interest N and (ii) E has some objectively positive or negative effect on N—in the sense that E is good or bad for S.

S3 is more specific than S2 in the kind of attributes that are supposed to be positively or negatively affected by lucky events. While S2 does not say whether these need to be the qualitative states of sentient beings, or their representational states, or their physical condition, S3 is explicit that what lucky events affect are the subjective and objective interests of individuals.

Leaving aside the question of what the correct formulation of the significance condition is, it is interesting to see how a significance condition can help explain the three general features of luck outlined above, that is, goodness and badness, vagueness, and gradualness. Concerning goodness and badness, the explanation is straightforward: luck is good or bad because the significance that lucky events have for people is positive or negative. Concerning vagueness, significance is a vague concept, so including a significance condition in an analysis of luck at least does not remove its inherent vagueness. Concerning gradualness, it can be argued that the degree of luck of an event varies proportionally with its significance or value; see Latus (2003), Levy (2011: 36), and Rescher (1995: 211–12; 2014). Consider the previous example of winning one million dollars playing roulette versus winning one dollar when the probability of winning is the same: it can simply be argued that the former event is luckier than the latter because it is more significant.

3. Probabilistic Accounts

Paradigmatic lucky events—for example, winning a fair lottery—typically occur by chance. Probabilistic accounts of luck explicitly appeal to the probability of an event’s occurrence to explain why it is by luck. In addition, they typically include a significance condition to explain why events are lucky for agents. For discussion purposes, the analyses of luck below will be presented as analyses of significant events, so the relevant significance condition can be omitted.

a. Objective Accounts

Some accounts make use of objective probabilities to define luck, that is, the kind of probabilities that are not determined by an agent’s evidence or degree of belief, but by features of the world:

OP1: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low probability that E would occur at t.

OP1 says that lucky events are events whose occurrence was not objectively likely. A related way to formulate a probabilistic view, suggested by Baumann (2012), is by means of conditional objective probabilities:

OP2: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low objective probability conditional on C that E would occur at t.

C is whatever condition one uses to determine the probability that the event will occur. For example, the unconditional probability that Lionel Messi will score a goal in the soccer match is high, but given C (the fact that he is injured), the probability that he will score is low. Suppose that Messi ends up scoring by luck. The condition helps explain why: he was injured and therefore it was not very likely that he would score.

According to Hales (2014), probabilistic views of luck such as OP1 or OP2 are the most widespread among scientists and mathematicians. But they face at least two problems. First, a dominant, although not undisputed, idea is that necessary truths have probability 1. In view of this, Hales (2014) argues that probabilistic analyses cannot account for lucky necessities, which are maximally probable. For example, he contends that organisms, humans included, are lucky to be alive because the gravitational constant, G, has the value it actually has, but the probability that G made life possible is 1.

Second, although they are rare, there are lucky events whose occurrence is highly probable; see Broncano-Berrocal (2015). Suppose that someone is the most wanted person in the galaxy and that billions of mercenaries are trying to kill her, but also that her combat skills drastically reduce the probability that each independent assassination attempt will succeed. Suppose that one such attempt succeeds for completely fortuitous reasons that have nothing to do with the exercise of her skills. That she is killed is obviously bad luck, but it was also very probable given how many mercenaries were trying to kill her: even if each attempt had a low probability of succeeding, the probability that at least one would succeed was high given the number of independent attempts. In other words, the probability of the disjunction of all attempts was high. This shows, contrary to what OP1 and OP2 say, that luck does not entail low probability of occurrence.
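
The arithmetic behind the mercenary case can be sketched as follows. This is only an illustration: the function name and the figures (a one-in-a-million chance per attempt, up to a billion attempts) are assumptions chosen for the example and do not come from the literature.

```python
def prob_at_least_one(p_single: float, n_attempts: int) -> float:
    """Probability that at least one of n independent attempts succeeds.

    Uses the complement rule: 1 minus the probability that every attempt fails.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if n_attempts < 0:
        raise ValueError("n_attempts must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_attempts


if __name__ == "__main__":
    # Hypothetical figures: each attempt succeeds with probability one in a million.
    p = 1e-6
    for n in (1, 1_000_000, 1_000_000_000):
        print(f"{n:>13} attempts: {prob_at_least_one(p, n):.6f}")
```

Each single attempt remains improbable, yet the probability of the disjunction approaches 1 as the number of attempts grows, which is the feature of the case that troubles OP1 and OP2.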

OP1 and OP2 are analyses of synchronic luck. McKinnon (2013; 2014) proposes a probabilistic account of diachronic luck instead. The view, called the expected outcome view, starts with the observation that we can determine the expected objective ratio of many events, including people’s performances. By way of illustration, the expected ratio of flipping a coin is 50 percent tails and 50 percent heads. On the other hand, the expected ratio of a certain basketball player’s free-throw shots being successful might be 90 percent. However, in real-life series of tosses or free-throw shots, the outcomes typically deviate from those values. In the light of these considerations, McKinnon proposes the following view:

OP3: For any series A of events (E1, E2, …, En) that are significant to an agent S and for any objective expected ratio N of outcomes for events of type E, S is lucky proportionally to how much the actual ratio of outcomes in A deviates from N.

In a nutshell, McKinnon’s view is that we attribute any deviation from the expected ratio of outcomes to luck: to good luck if the deviation is positive, and to bad luck if the deviation is negative. If the actual ratio is as expected, the ratio is fully attributable to skill. One key element of McKinnon’s view, and the reason why she rejects any attempt to give an account of synchronic luck, is that she thinks that, while we can know that the set of outcomes that deviate from the expected ratio are due to luck, we cannot know which one of the outcomes in that set is by luck. In other words, we can know whether we are diachronically lucky, but not whether we are synchronically lucky.
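
A minimal sketch of the comparison that OP3 describes, assuming outcomes are recorded simply as successes and failures. The function name and the sample series are hypothetical; McKinnon does not give a computational formulation.

```python
from typing import Sequence


def luck_deviation(outcomes: Sequence[bool], expected_ratio: float) -> float:
    """Actual success ratio of a series minus the expected success ratio.

    A positive result corresponds to good luck and a negative result to bad
    luck. A result of zero means the series matched expectation.
    """
    if not outcomes:
        raise ValueError("the series of outcomes must not be empty")
    if not 0.0 <= expected_ratio <= 1.0:
        raise ValueError("expected_ratio must be between 0 and 1")
    actual_ratio = sum(1 for success in outcomes if success) / len(outcomes)
    return actual_ratio - expected_ratio


if __name__ == "__main__":
    # Hypothetical series for a player with an expected free-throw ratio of 0.9.
    good_night = [True] * 19 + [False] * 1   # 19 of 20 made, about +0.05
    bad_night = [True] * 16 + [False] * 4    # 16 of 20 made, about -0.10
    print(luck_deviation(good_night, 0.9))
    print(luck_deviation(bad_night, 0.9))
```

Note that the function takes a whole series and returns a single number for it. It has no way of saying which individual shot was the lucky one, which mirrors the claim that only diachronic luck is knowable.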

Before turning to a different type of probabilistic account, let us see how accounts modeling luck in terms of objective probability explain the three general features of luck outlined above. On the one hand, they can explain why luck is a gradual notion in a natural way. For instance, Rescher (1995: 211–12; 2014) thinks that luck varies not only with significance but also with chance. If S is the value or significance of an event E, how lucky E is can be determined, according to Rescher, as follows:

Luck = S × (1 – Prob[E]).

In other words, Rescher thinks that luck increases in proportion to the value or significance that the event has for the agent and decreases as the probability of its occurrence increases.
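
Rescher’s formula can be applied to the roulette and coin examples from the discussion of gradualness. The sketch below assumes, purely for illustration, that significance can be measured in dollars, that the roulette bet is on a single number of a 37-pocket wheel, and that the lottery odds are one in a million. None of these figures comes from Rescher.

```python
def rescher_luck(significance: float, probability: float) -> float:
    """Degree of luck as significance times (1 - probability of the event)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return significance * (1.0 - probability)


if __name__ == "__main__":
    # Same odds, different significance (hypothetical single-number roulette bet).
    p_roulette = 1 / 37
    print(rescher_luck(1_000_000, p_roulette))  # large prize
    print(rescher_luck(1, p_roulette))          # small prize

    # Same significance, different odds (hypothetical lottery versus a coin toss).
    print(rescher_luck(1_000, 1 / 1_000_000))   # lottery
    print(rescher_luck(1_000, 0.5))             # coin toss
```

With equal odds the larger prize scores as luckier, and with equal prizes the less probable win scores as luckier. An event with probability 1 always scores zero, whatever its significance.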

On the other hand, defenders of objective probabilistic views might in principle explain why luck is a vague notion in epistemic terms. They might argue that knowing exactly how lucky someone is with respect to an event entails that the exact probability of the event’s occurrence is known. However, the relevant probabilities are typically unknown or, at best, approximately known, which might in principle help explain why, say, a goal from the corner kick is neither clearly lucky nor clearly produced by skill: prior to its occurrence, the probability that it would occur was unclear.

Finally, as we have seen, McKinnon thinks that her view also helps explain why luck is good or bad: luck is good or bad depending on whether the actual deviation from the expected ratio is positive or negative.

b. Subjective Accounts

A different way to model luck in probabilistic terms is by means of subjective probabilities, that is, the kind of probabilities that are determined by an agent’s evidence or degree of belief. One way to state this kind of view is that whether or not an event counts as lucky for an agent depends on the agent’s degree of belief in the occurrence of the event, that is, on how confident she is or how strongly she believes that the event will occur—see Latus (2003), Rescher (1995: 78–80), and Steglich-Petersen (2010) for relevant discussion. More precisely:

SP1: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S had a low degree of belief that E would occur at t.

A subjective probabilistic account might also be formulated in terms of the agent’s evidence for the occurrence of the event, as in Steglich-Petersen (2010):

SP2: A significant event E is lucky for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, there was low probability that E would occur at t.

SP1 and SP2 characterize luck as a perspectival notion: if for A but not for B it is subjectively improbable that an event E will occur, then, if E occurs, E is lucky for A but not for B—Latus (2003) endorses this thesis. For example, suppose that someone receives a big check from a secret benefactor. From that person’s perspective, it is good luck that she has received the check, but from the perspective of the benefactor, it is not—the example is from Rescher (1995: 35). In addition, those who firmly believe in fate or whose evidence strongly points to its existence are never lucky according to these views, because everything that happens to them is highly probable from their perspective.

Stoutenburg (2015) gives a similar evidential account of degrees of luck. The idea is that an agent is lucky with respect to an event to the extent that her evidence does not guarantee its occurrence, in the sense that if the conditional probability of the occurrence of the event given the agent’s evidence is not maximal, she is lucky to some degree with respect to that event:

SP3: A significant event E is lucky to some degree for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, the probability that E would occur at t is not 1.

A problem for views such as SP1, SP2, and SP3 is that events are no less lucky if we have no evidence or have not thought about them—see Steglich-Petersen (2010). For example, someone would be clearly lucky if, unbeknownst to her, a bullet just missed her head by centimeters. Steglich-Petersen (2010) thinks that one way to fix this problem is to formulate a subjective view in terms of the agent’s total knowledge instead of her degree of belief or evidence for the occurrence of the event:

SP4: A significant event E is lucky for an agent S at time t if and only if, for all S knew just before the occurrence of E at t, there was low probability that E would occur at t.

SP4 is compatible with an event being lucky for the agent when she has no prior evidence or doxastic state about its occurrence. But SP4 might still not yield the right results. Consider a macabre lottery in which all the participants have been poisoned and the only way to survive is to win the prize, which is the antidote. The lottery draw is a fair one, so surviving is a pure matter of chance. Suppose that the only difference in knowledge between two participants, A and B, is that only A knows of herself that she has been poisoned and is a participant in the lottery. For all A knows, there is low probability that she will survive. In contrast, for all B knows, her survival is very likely: she is a healthy person and has no reason to think that she has been poisoned. According to SP4, B would not be lucky if she won the lottery and survived as a result. Intuitively, however, A and B would be equally lucky if they won the lottery.

In general, this and other cases might be taken to illustrate that what is apparently lucky does not always coincide with what is actually lucky—see Rescher (2014) for the distinction between apparent and actual luck. A potential problem for subjective views is then that they might be only capturing intuitions about the former.

Steglich-Petersen (2010) advances a different account, which is not probabilistic in nature, but which is worth considering in this section, not only because it is a natural development of SP4, but also because, like SP2, SP3, and SP4, it characterizes luck as an epistemic notion. In particular, it analyzes luck in terms of the agent’s epistemic position with respect to the future occurrence of the lucky event:

SP5: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S was not in a position to know that E would occur at t.

Steglich-Petersen explains that we are in a position to know that an event will occur if, by taking up the belief that the event will occur, we thereby know that it will occur. SP5 yields the correct result in the macabre lottery case, which was troublesome for SP4. None of the participants is in a position to know that they would win the lottery and survive as a result. For that reason, the winner is lucky.

However, SP5 might not capture the intuitions of other cases correctly. Suppose that someone is the holder of a ticket in a fair lottery. During the lottery draw, a Laplacian demon predicts and tells that person that she will be the winner, so she comes to know in advance—and therefore is in a position to know—that she will be the winner. However, that person is not less lucky to win the lottery because of that knowledge or because of being in that position. After all, it is still a coincidence that she has purchased the ticket that corresponds to the accurate prediction of the demon. In sum, knowing that one will be lucky—and therefore being in a position to know it—does not necessarily prevent one from being lucky.

Before considering an alternative approach to luck, let us see how subjective probabilistic accounts explain the three general features of luck presented at the beginning of the article. On the one hand, they can account for degrees of luck in terms of degrees of subjective probability. As we have seen, SP3 says that an agent is increasingly lucky with respect to an event the less likely the occurrence of the event—conditional on her evidence—is. On the other hand, advocates of the subjective approach might explain borderline cases of luck by appealing to the fact that the relevant subjective probabilities are not always transparent, so if we cannot determine whether an event is lucky or non-lucky, it is plausibly because the relevant subjective probabilities cannot be determined either. Finally, to explain why luck is good or bad defenders of subjective accounts can simply include a significance condition on luck in their analyses.

4. Modal Accounts

A different approach to luck emphasizes the fact that paradigmatic instances of luck such as lottery wins could have easily failed to occur. Modal accounts accordingly explain luck in terms of the notion of easy possibility. As usual in areas of philosophy where the notion of possibility is invoked, advocates of the modal approach use possible worlds terminology to explain that notion and in turn the concept of luck. In this sense, that a lucky event could have easily not occurred means that, although it occurs in the actual world, it would fail to occur in close possible worlds.

Closeness is simply assumed to be a function of how intuitively similar possible worlds are to the actual world. For example, if an event E occurs at time t in the actual world, close possible worlds can be obtained by making a small change to the actual world at t and by seeing what happens to E at t or at times close to t—see Coffman (2007; 2014) for relevant discussion. One should keep in mind that although current modal views are closeness views, it is in principle possible to give a modal account of luck that ranges over distant possible worlds.

Several formulations of modal conditions on luck can be found in the literature, where the main point of disagreement concerns the proportion of close possible worlds in which an event need not occur in order for its actual occurrence to be by luck. For discussion purposes, those conditions will be presented here as if they constituted full-fledged analyses of luck, but it is important to keep in mind that modal conditions are typically considered necessary but not sufficient for a significant event to be by luck. A prominent exception is Pritchard (2005), who is the only author in the literature advocating a pure modal account of luck; in more recent work (2014), he drops the significance condition from his analysis, plausibly because he is mainly interested in giving an account of non-relational luck. Also for discussion purposes, the analyses of luck below will be presented, as before, as analyses of significant events. Without further ado, let us consider the following modal account by Pritchard (2005: 128):

M1: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a wide proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

According to M1, one is lucky to win a fair lottery because in a wide class of close possible worlds one would lose. M1 has two important features. The first one is that it does not consider every close possible world relevant to determining whether an event is lucky or not: only those in which the relevant initial conditions are the same as in the actual world count. According to Pritchard (2014), the relevant initial conditions for an event are specific enough to allow a correct assessment of the luckiness of the target event, but not so specific as to guarantee its occurrence. Nonetheless, Pritchard leaves as a contextual matter what features of the actual world need to be fixed in our evaluation of close possible worlds. For instance, when we assess the modal profile of lottery results, we typically keep fixed features such as the fairness and the odds of the lottery or the fact that one has decided to purchase a specific lottery number.

Riggs (2007) argues that M1 is defective precisely because there is no non-arbitrary way to fix the relevant initial conditions. In reply, Pritchard (2014) argues that an analysis of a concept should not be more precise than the concept that the analysis intends to account for. Given that luck is a vague notion, the somewhat vague clause on initial conditions might be after all doing some explanatory work.

The second important feature of M1 is that it requires that the lucky event fails to occur in a wide proportion of close possible worlds. Pritchard (2005: 130) explains that by “wide” he means at least approaching half the close possible worlds, where events that are clearly lucky would not obtain in most close possible worlds.

However, there are clearly lucky events, such as obtaining heads by flipping a coin, that would still occur in a large proportion of close possible worlds: since the probability of heads is 0.5, we can suppose that in half the close possible worlds the outcome would still be heads. Perhaps the following slightly different formulation is to be preferred; see Coffman (2007):

M2: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in at least half the close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

However, Levy (2011: 17–18) argues that if we accept that an event that does not occur in half the close possible worlds is lucky, we can also accept that an event that does not occur in a little less than half the close possible worlds (for example, in 49 percent of them) is lucky as well. In view of this, Levy thinks that it is better not to commit one’s modal account to a precise view of the issue. Instead, Levy argues that there is no fixed proportion of close possible worlds where an event must not occur to be considered lucky in the actual world. His point is that there might be different “large enough” proportions of close possible worlds in which events need not occur to be considered lucky. According to Levy, what makes the threshold vary from case to case is the significance that the event has for the agent. A modal account in the spirit of Levy’s considerations would then be the following:

M3: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world, where the relevant proportion of close possible worlds is determined by the significance that E has for S.

Lackey (2008) raises two important objections to the modal approach. The first one challenges the idea that the easy possibility of an event not occurring is necessary for luck. She proposes a counterexample involving a modally robust lucky event. Suppose that (i) A buries a treasure at location L and that (ii) B independently places a plant in the ground of L. When digging, B discovers A’s treasure. Lackey’s point is that if we stipulate that A’s and B’s independent actions are sufficiently modally robust, in the sense that there is no chance that they would fail to occur in close possible worlds, B’s discovery, which is undeniably lucky, would occur in most close possible worlds.

Pritchard (2014) and Levy (2009) try to circumvent the objection in two steps. First, they distinguish between the notions of luck and fortune. Then, they propose an error theory according to which most people would be mistaken to say that B’s discovery is by luck: B’s discovery is in reality fortunate, not lucky—see section 7 for the specific way in which Pritchard and Levy distinguish luck from fortune.

Lackey’s second objection targets the idea that the easy possibility of an event not occurring is sufficient for luck. Lackey thinks that whimsical events—that is, events that result from actions that are done on a whim—show exactly this. For instance, suppose that someone decides to catch the next flight to Paris on a whim. That person’s going to Paris is not by luck—since it is the result of her self-conscious decision—but it would nevertheless fail to occur in most close possible worlds—since she has made the decision on a whim.

In reply, Broncano-Berrocal (2015) argues that Lackey’s objection overlooks the clause on initial conditions of modal accounts. If someone decides to go to Paris on a whim and goes, the close possible worlds in which the relevant initial conditions for that trip are the same as in the actual world are worlds in which that person makes the decision to go to Paris. According to modal views, these are the only possible worlds relevant to assessing whether the trip is by luck. But, consistently with what modal accounts say, that person goes to Paris in most of those worlds. In a similar way, when it comes to evaluating whether someone in possession of a specific ticket is lucky to have won the lottery, we only consider close possible worlds in which she has decided to buy that specific ticket. Again, in most of those worlds, that ticket is a loser, just as modal accounts predict.

On the other hand, Hales (2014) thinks that cases of lucky necessities are problematic not only for objective probabilistic accounts but also for modal views. For example, if Jack the Ripper is terrorizing the neighborhood and it is one’s dearest friend Bob knocking on one’s door, one might be lucky that Bob is not Jack the Ripper, but it is metaphysically impossible that Bob is Jack the Ripper because things are self-identical—Hales gives credit to John Hawthorne for the example.

Before turning to lack of control views, let us see how modal accounts explain the three general features of luck. Concerning goodness or badness, modal views can simply include a significance condition—although, as noted, Pritchard (2014), one of the main advocates of the modal approach, thinks that a significance condition is not necessary for luck. In addition, we have seen that the clause on the relevant initial conditions of the event is vague enough to preserve the characteristic vagueness of the concept of luck.

On the other hand, modal views have at least two interesting ways to account for degrees of luck—the terminology below is from Williamson (2009), who applies it to the safety condition for knowledge. M1, M2, and M3 adopt what can be called the proportion view of the gradualness of luck: they cash out the degree of luck of an event in terms of the proportion of close possible worlds in which it would fail to occur—the larger the proportion of such close possible worlds is, the luckier the event is. Church (2013) argues that the proportion view should not be restricted to close possible worlds only: degrees of luck should be modeled in terms of all relevant possible worlds, although he also argues that more weight should be given to close ones.

The idea that more weight should be given to some possible worlds when fixing the degree of luck of an event serves to stipulate a different view of the gradualness of luck. The view, which can be called the distance view, says that the degree of luck of an event varies as a function of the distance to the actual world of possible worlds in which it would fail to occur. In this way, the closer those worlds are, the luckier the event is—Pritchard (2014) endorses the distance view.
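
The contrast between the proportion view and the distance view can be made concrete with a toy model. Everything in the sketch below is an assumption made for illustration: possible worlds are represented as pairs of a numerical distance from the actual world and a flag saying whether the event occurs, the closeness threshold is arbitrary, and the formula 1 / (1 + distance) is just one simple decreasing function. None of the authors cited proposes these specifics.

```python
from typing import List, Tuple

# A hypothetical world: (distance from the actual world, does E occur there?)
World = Tuple[float, bool]


def proportion_view(worlds: List[World], closeness_threshold: float) -> float:
    """Degree of luck as the share of close worlds in which E fails to occur."""
    close = [occurs for distance, occurs in worlds if distance <= closeness_threshold]
    if not close:
        raise ValueError("no worlds fall within the closeness threshold")
    return sum(1 for occurs in close if not occurs) / len(close)


def distance_view(worlds: List[World]) -> float:
    """Degree of luck as a decreasing function of the nearest failure world.

    Returns 0.0 when E occurs in every world considered.
    """
    failure_distances = [distance for distance, occurs in worlds if not occurs]
    if not failure_distances:
        return 0.0
    return 1.0 / (1.0 + min(failure_distances))


if __name__ == "__main__":
    # Hypothetical set of worlds for some event E that occurs in the actual world.
    worlds = [(0.1, False), (0.2, True), (0.3, False), (2.0, False)]
    print(proportion_view(worlds, closeness_threshold=0.5))
    print(distance_view(worlds))
```

The two measures can come apart. An event that fails in only one world, but a very close one, scores low on the proportion view and high on the distance view.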

On a related note, modal theorists can explore the relation between the significance of a lucky event and its modal profile. As we have seen, Levy (2011) thinks that the size of the proportion of close possible worlds in which an event need not occur to count as lucky is sensitive to the significance that the event has for the agent. Although Levy thinks that it is a mistake to seek much clarity about how the latter affects the former, he also believes that there is a relation of inverse proportionality between the two: the more significant an event is for an agent, the smaller the proportion of close possible worlds in which it would not occur needs to be for it to be considered lucky for the agent. Coffman (2014) calls this the inverse proportionality thesis; see Levy (2011: 36).

By way of illustration, compare surviving a round of Russian roulette with one bullet in the chamber of a revolver with a six-shot capacity (approximately 0.16 probability of being shot) with winning one dollar in a poker game after having called an all-in that one knew one had only a 0.16 probability of losing. In both cases, one would succeed (that is, one would survive or win) in most close possible worlds, but only the former case is considered clearly lucky. The inverse proportionality thesis accounts for the difference: surviving is such a significant event that the proportion of close possible worlds in which one dies need not be large for one’s actual survival to be considered lucky. However, Coffman (2015: 40) argues that the thesis is not sustainable precisely because it leads to the result that all extremely significant events count as lucky if there is at least a small non-zero chance that they will not happen. For example, the thesis seems to entail that we are lucky to survive every time we take a flight.

5. Lack of Control Accounts

One of the most widespread intuitions about luck is that lucky events are events beyond our control. For example, one way to explain why we are lucky to win the lottery is that the outcome of the lottery is beyond our control. In the literature, different lack of control views account for luck in those terms.

Some authors give pure lack of control accounts—for example, Broncano-Berrocal (2015), Riggs (2009). Other authors think that lack of control conditions are necessary but not sufficient for significant events to be by luck—for example, Coffman (2007; 2009), Latus (2003), Levy (2009; 2011). As in the case of modal conditions, and mainly for discussion purposes, the latter will be presented as if they constituted full-fledged analyses of luck—also as before, the analyses will be presented as analyses of significant events. That said, the simplest lack of control account has the following form:

LC1: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t.

Many lucky events are beyond our control, so LC1 seems to be on the right track. However, Lackey (2008) argues that the fact that a significant event is beyond our control is neither necessary nor sufficient for the event being lucky. Against the sufficiency claim, Lackey argues that many nomic necessities (for example, sunrises) are not under our control, but that does not mean that they are by luck; see also Latus (2003) for this objection. To show that lack of control is not necessary for luck, Lackey proposes a case in which a demolition worker, A, presses the button of a demolition system that she has designed in order to demolish a warehouse. A mouse has damaged the system by chewing through the connection wires, and the demolition succeeds only because the electrical current is accidentally restored. According to Lackey, the explosion is both under A’s control and by luck.

Coffman (2009) and Levy (2011), who think that lack of control is not sufficient for luck, argue that Lackey’s counterexample to the necessity claim rests on a false thesis, which Coffman calls the luck infection thesis: if luck affects the conditions that enable an exercise of control, then the exercise of control itself is by luck; more generally, if S is lucky to be in a position to ϕ and S ϕ-es, then S ϕ-es by luck. The thesis, according to Coffman, has blatant counterexamples. For example, a lifeguard who accidentally goes to work very early and sees a swimmer drowning is lucky to be in a position to save the swimmer, but if she performs the rescue competently, it is not by luck that she saves him.

To overcome this and other objections, lack of control theorists define the notion of control in different ways. For example, Coffman (2009) thinks that an event is under an agent’s control just in case she is free to do something that would help produce it and something that would help prevent it. Rescher (1969: 329) gives a similar account of control as the capacity to produce the occurrence of an event—what Rescher calls positive control—and the capacity to prevent it—what he calls negative control. While Rescher defends a probabilistic account of luck, Coffman thinks that lack of both negative and positive control—when understood in terms of freedom—is necessary for luck. The following is a lack of control view in the spirit of Coffman’s and Rescher’s respective conceptions of control:

LC2: A significant event E is lucky for an agent S at time t if and only if it is not the case that S is both free (or has the capacity) to do something that would help produce E at t and free (or has the capacity) to do something that would help prevent E at t.

An immediate problem for LC2 is that having control is not the same as exercising it. We might have control over something in the sense that we are free or have the capacity to control it, but that does not mean that we actually exercise that capacity or freedom. For example, a competent pilot who is free or has the capacity to produce and prevent a plane crash, but who refuses to take control of the plane for some reason, is objectively lucky that a passenger manages to land the plane safely and that, as a result, she survives. LC2, however, does not count her survival as lucky, because she retains the freedom or capacity that the account identifies with control.

Levy (2011: chap. 5) understands control in terms similar to Coffman’s and Rescher’s, but he introduces additional epistemic constraints. For Levy, an event is under an agent’s control just in case there is a basic action that she could perform which she knows would bring about the event and how it would do so. This way of understanding control can be supplemented with Rescher’s point that agents can also control an event by inaction, omission or inactivity (Rescher 1969: 369). Taking the latter into account, the following is a pure lack of control view in the spirit of Levy’s conception of control:

LC3: A significant event E is lucky for an agent S at time t if and only if there is no basic action that S is able to perform (or to omit performing) whose occurrence (or non-occurrence) is such that S knows that it would bring about (or prevent) E at t and how it would do so.

According to LC3, if we do not want to be exposed to the whims of luck, we not only have to be able to perform (or omit performing) actions that causally influence the world, but we also need to know that, and how, the world is sensitive to them.

A potential problem for LC3 is that we might be properly described as being in control of something when we act in a way that brings it to a desired state even though we do not know exactly how this happens. For example, a driver might know that by turning the steering wheel to the left she will avoid an obstacle in the road, but she might be completely mistaken about how exactly this works; she might erroneously believe, say, that whenever she turns the steering wheel to the left, it is a magical dwarf who moves the car to the left. So, she knows that her basic action will bring about the desired effect while failing to know how. The problem is that if that person competently avoids the obstacle, the maneuver seems to be under her control, even though she mistakenly thinks that it is under the dwarf’s.

A different lack of control account is due to Riggs (2009), who tries to defend the lack of control approach from Lackey’s objection that the fact that an event is beyond our control does not suffice for the event being lucky. Riggs admits that although it is true that many nomic necessities—for example, sunrises—are beyond our control, we can still exploit them to our advantage. The idea is that if we exploit them for some purpose, they are not lucky for us even if they are not under our control. The following analysis accounts for luck in those terms:

LC4: A significant event E is lucky for an agent S at time t if and only if (i) E is beyond S’s control at t and (ii) S did not successfully exploit E, prior to E’s occurrence at t, for some purpose.

To illustrate how LC4 can distinguish between lucky and non-lucky physical events beyond our control, Riggs proposes a case in which two people, A and B, are about to be executed, but only A knows two important facts: first, that their captors believe that solar eclipses are in reality a message from the gods telling them to stop sacrifices; second, that, unbeknownst to their captors, a solar eclipse will take place at the exact time the execution is planned. Riggs thinks that, while B is lucky to be released, A is not. By being in a position to exploit the eclipse in her favor, A is in control of the situation.

Coffman (2015: 10) argues via counterexample that LC4 does not distinguish correctly between lucky and non-lucky physical events beyond our control. He proposes a case in which someone lives in an underground facility that is, unbeknownst to her, solar-powered. According to Coffman, that person, who has become completely oblivious to sunrises, is not lucky that the sun rises every morning and keeps her facility running, even though the sunrise is both beyond her control and not successfully exploited by her for any purpose.

Broncano-Berrocal (2015) gives a lack of control account in the spirit of Riggs’s, but with significant differences. According to Broncano-Berrocal, there are two ways in which something might be under our control. On the one hand, we exercise effective control over something by competently bringing it to a desired state—for example, by causally influencing it in a certain way. On the other hand, something is under our tracking control when we actively check or monitor that it is currently in a certain desired state, so that we are thereby disposed or in a position either (i) to exercise effective control over it or (ii) to act in a way that would allow us to achieve goals related to the thing controlled—for example, exploiting it to our advantage. By way of illustration, when flying on autopilot mode, a pilot does not exercise effective control over the plane—for example, she does not exert any causal influence on it—but the plane is under her tracking control if she is sufficiently vigilant. A key point of Broncano-Berrocal’s account is that, depending on the practical context, attributions of control such as “Event E is under S’s control” might refer either to effective control, to tracking control, or to both. The corresponding account of luck is the following:

LC5: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t, where E is beyond S’s control at t either if (i) S lacks effective control over E, or (ii) E is not under S’s tracking control, or (iii) both.

Lotteries are typically not under our tracking control, although they might be if a Laplacian demon tells us what the result will be. The reason why winning a fair lottery is a matter of luck is, according to LC5, that we are not able to causally influence the result in the desired way, that is, the fact that we lack effective control. By the same token, LC5 also counts as lucky winning a lottery that, unbeknownst to one, has been rigged in one’s favor.

LC5 allows for a different response to Lackey’s demolition case: Lackey’s intuition that the explosion is under A’s control can be explained in terms of the fact that A exercises effective control over the explosion by pressing the button. But the intuition that A is lucky to demolish the warehouse is parasitic on the fact that the explosion is not under A’s tracking control. In particular, the practical context provided by Lackey is such that A is responsible for the design of the demolition system but fails to check whether the connection wires are damaged; sometimes, tracking control might be very difficult to achieve. In a similar way, LC5 explains that, while we lack effective control over many physical events (for example, sunrises), the reason why they are not lucky is that they are under our tracking control, that is, they are things that we regularly monitor and thereby can exploit to our advantage.
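The way LC5 sorts these cases can be laid out in a minimal Python sketch. It is only an illustration under stated assumptions, not Broncano-Berrocal’s own formalization: the names `Case`, `beyond_control`, and `lucky_on_lc5` are hypothetical, the truth values assigned to each case are a reading of the discussion above, and the sketch covers only contexts in which a single sense of control (effective or tracking) is the relevant one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """Toy description of a significant event E for an agent S at time t."""
    name: str
    effective_control: bool  # S competently brings E to the desired state
    tracking_control: bool   # S monitors E and is positioned to act on it
    relevant_sense: str      # "effective" or "tracking", fixed by context (assumption)


def beyond_control(case: Case) -> bool:
    """E is beyond S's control in the contextually relevant sense."""
    if case.relevant_sense == "effective":
        return not case.effective_control
    if case.relevant_sense == "tracking":
        return not case.tracking_control
    raise ValueError(f"unknown sense of control: {case.relevant_sense!r}")


def lucky_on_lc5(case: Case) -> bool:
    """LC5: a significant event is lucky iff it is beyond S's control."""
    return beyond_control(case)


# Illustrative assignments based on the cases discussed in the text.
CASES = [
    Case("winning a fair lottery", False, False, "effective"),
    Case("Lackey's demolition", True, False, "tracking"),
    Case("sunrise for an ordinary observer", False, True, "tracking"),
]

for case in CASES:
    verdict = "lucky" if lucky_on_lc5(case) else "not lucky"
    print(f"{case.name}: {verdict}")
```

On these assumptions, the lottery win and the demolition come out lucky and the sunrise does not. The verdicts depend entirely on which sense of control the context makes relevant, which is the feature of the account the sketch is meant to bring out.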

Coffman’s solar-powered facility case, the counterexample to LC4, is also a counterexample to LC5. Coffman’s point is that sunrises are not lucky for the person living in the solar-powered underground facility, even though they are not under her control, whether tracking or effective. In reply, defenders of lack of control views might argue that it is not unreasonable to say that such a person is lucky that the sun rises every morning and keeps, unbeknownst to her, her facility running. After all, there are similar attributions of luck in ordinary speech. For example, we say things such as “S is lucky to live in an earthquake-free region” even though S is unaware of this fact, and so is unaware that she is lucky that an earthquake will not make her house collapse.

Finally, Hales (2014) thinks that there are cases of skillful achievements that lack of control accounts are compelled to consider lucky. For instance, he thinks that not even the best batter in history can plausibly be said to have control over whether he hits the ball, since there are many factors over which he cannot exercise any sort of control—for example, distractions, the pitches he receives, and the play of the opposing fielders. In reply, lack of control theorists might argue that Hales is illicitly raising the standards of control. After all, intuitions about whether the result of our actions is under our control go hand in hand with intuitions about whether the result of our actions is because of our skills.

As a final note, let us briefly consider how lack of control accounts explain the three general features of luck presented at the beginning of the article. Concerning goodness or badness, lack of control views can, like other views, simply include a significance condition. Concerning vagueness, the notion of control is not so precise as to remove all vagueness from the analysis of luck. Concerning gradualness, control, like luck, comes in degrees. In particular, lack of control theorists might endorse the view that the degree of luck of an event is inversely proportional to the degree of control that the agent has over it; see Latus (2003) for further discussion.

6. Hybrid Accounts

Some authors opt for giving accounts of luck that mix modal or probabilistic conditions with lack of control conditions. The rationale behind this move is, as Latus (2003) puts it, that although lack of control over an event often goes hand in hand with the event having a low chance of happening (or with the event being modally fragile), there are non-lucky events that are either beyond our control (for example, sunrises) or have a low chance of occurring (for example, rare significant events brought about by ability). Latus’s hybrid view features a lack of control condition and a subjective probabilistic condition:

H1: A significant event E is lucky for an agent S at time t if and only if (i) just before the occurrence of E at t, S had a low degree of belief that E would occur at t, and (ii) E is beyond S’s control at t.

By contrast, Coffman (2007) and Levy (2011) opt for conjoining a lack of control condition with modal conditions. Coffman’s analysis is roughly the following—he includes several further refinements to handle specific cases of competing significant events:

H2: A significant event E is lucky for an agent S at time t if and only if (i) E does not occur around t in at least half the possible worlds obtainable by making no more than a small change to the actual world at t, and (ii) E is beyond S’s control at t.

Levy’s hybrid analysis (2011) features a different modal condition:

H3: A significant event E is lucky for an agent S at time t if and only if (i) E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds, where the relevant proportion of close possible worlds is inverse to the significance of E for S, and (ii) E is beyond S’s control at t.

Levy calls this kind of luck chancy luck, but argues that there also exists a non-chancy variety of luck, which is the kind of luck that affects one’s psychological traits or dispositions relative to a reference group of individuals—for example, human beings.

Any of the already discussed counterexamples to the necessity for luck of (i) subjective probabilistic conditions—for example, cases of agents without beliefs about events that are lucky for them, (ii) objective probabilistic conditions—for example, cases of highly probable lucky events, (iii) modal conditions—for example, Lackey’s buried treasure case, and (iv) lack of control conditions—for example, Lackey’s demolition case—are troublesome for hybrid views.

7. Luck and Related Concepts

There are several concepts that are closely related to the concept of luck. Here we will focus on the concepts of accident, coincidence, fortune, risk, and indeterminacy.

a. Accidents

The concept of accident is closely related to the concept of luck. After all, most accidents—for example, car crashes—involve luck—mostly bad luck. But as Pritchard (2005: 126) argues, there are paradigmatic cases of luck that involve no accidents. For example, if one self-consciously chooses a specific lottery ticket and wins the lottery, one’s winning is by luck, but it is not an accident given that one was trying to win.

From Pritchard’s example, we might infer that if an agent acts with the intention of bringing about some result, then if it occurs, it is not an accident. However, if someone prays with the intention of bringing about some event and the event occurs by sheer coincidence—because that person’s prayers are causally irrelevant to its occurrence—the event is accidental. But the mere causal relevance of an agent’s actions to an event’s occurrence is not sufficient for excluding accidentality either. If a pilot dancing in the cockpit unintentionally presses the depressurization button and as a result the plane crashes, the crash is an accident despite being caused by the pilot.

This suggests that what prevents the outcomes of an action from being accidental (but not from being lucky) is both the fact that an agent acts with the intention to bring about a certain outcome and the fact that her action is causally relevant to that outcome. For example, if someone wins a lottery in which participants have to pick a ball directly from the lottery drum with a blindfold on, that person’s winning is lucky but not accidental, because it is brought about by her direct intentional action.

b. Coincidences

The concept of coincidence is also closely related to the concept of luck. Owens (1992) gives an account according to which a coincidence is an inexplicable event in the following sense: we cannot explain why its constituents come together because they are produced by independent causal factors; see also Riggs (2014) for a similar account. More specifically, coincidences are such that we cannot explain why they occur because there is no common nomological antecedent of their components or a nomological connection between them. For example, if someone prays for rain and it rains, that it rains is a coincidence because there is no nomological connection between that person’s prayers and the fact that it rains. On the other hand, how close or immediate an antecedent should be in order to prevent two events from constituting a coincidence is a matter that usually becomes clear in context. For example, we would regard as a coincidence the fact that someone wishes that her favorite team wins the final and that, as a matter of fact, it ends up winning, even though both events have some distant common nomological antecedent (for example, the Big Bang); see Riggs (2014) for further discussion.

Not all lucky events are coincidental events. For example, it is no coincidence that a coin lands heads when someone flips it, but that outcome might be clearly lucky for that person. Just as causally relevant intentional action prevents an event from being an accident, it also seems to prevent a pair of events (someone’s flipping of the coin and the coin landing heads) from being a coincidence. By contrast, all coincidental events, if significant, are lucky. For example, if someone prays for rain because she is in need of water and it rains, the coincidental event that it rains is lucky for that person.

Probabilistic and modal views have difficulties when it comes to accounting for highly probable or modally robust lucky events arising out of coincidence. As Lackey’s buried treasure case illustrates, if the occurrence of the components of a coincidence—A’s burial of the treasure and B’s digging at the same location—is highly probable or modally robust, the occurrence of the resulting coincidental event—B’s discovery of A’s treasure—is also highly probable or modally robust. Yet, the event is lucky precisely because it arises out of a coincidence.

c. Fortune

In the literature, there is some disagreement concerning whether or not the concept of fortune is the same as the concept of luck. Most modal theorists think that luck and fortune are different and use the distinction to argue that Lackey’s buried treasure case is in reality a case of fortune, while their theories are theories of luck.

For example, Pritchard (2005: 144, n.15; 2014) thinks that fortunate events are events beyond our control that count in our favor, but unlike lucky events, they are not chancy or modally fragile. In this way, having good health or a good financial situation are instances of fortune, not of luck, while winning a fair lottery is only an instance of luck. Rescher (1995: 28–9) similarly thinks that we can be fortunate if something good happens to or for us in the natural course of things, but we are lucky only if such an eventuality is chancy. In a similar vein, Coffman (2007; 2014) thinks that we are lucky to win a fair lottery, given how unlikely it was, but we are merely unfortunate to lose it, given how likely it was.

Finally, Levy (2009; 2011: 17) thinks that fortunate events are non-chancy events—hence non-lucky—but luck-involving, in the sense that they have luck in their causal history and, in particular, in their proximate causes. His reply to Lackey’s buried treasure case is that luck in the circumstances—the lucky coincidence that someone places a plant at the same location in which someone has buried a treasure—is not inherited by the actions performed in those circumstances or by the events resulting from them—for example, the discovery of the treasure. So while there is luck involved in the circumstances of the discovery, the discovery itself is merely fortunate.

Against the distinction between luck and fortune, Broncano-Berrocal (2015) and Stoutenburg (2015) argue that the terms “luck” and “fortune” can be interchanged in English sentences without any significant semantic difference. Moreover, since English speakers use the terms interchangeably, arguing that luck and fortune are two distinct concepts entails that speakers are systematically mistaken in their usage of the terms, and such an error theory is hardly tenable. For example, we would be wrong in saying that someone is fortunate to win a raffle or lucky to win a lottery that, completely unbeknownst to her, has been rigged in her favor.

d. Risk

There is a close connection between the concepts of luck and risk. In fact, some theorists think that the connection is so close that the former can be explained in terms of the latter; see Broncano-Berrocal (2015), Coffman (2007), Pritchard (2014; 2015), and Williamson (2009) for relevant discussion. Pritchard (2015) explains that a risk or a risk event is a potential, unwanted event that is realistically possible (that is, something that could credibly occur), whereas a risky event is a potential, unwanted event that has a higher risk than normal of occurring; for example, there is always a risk that one’s plane might crash, but flying by plane is not risky. With that distinction in place, Pritchard distinguishes two competing ways to understand the notion of risk or of risk event.

The probabilistic account of risk says that an event is at risk of occurring just in case there is non-zero objective probability that it will occur. How high its risk of occurrence is—that is, how risky it is—depends on how probable its occurrence is. The modal account of risk, by contrast, says that an event is at risk of occurring just in case it would occur in at least some close possible worlds—see also Coffman (2007) and Williamson (2009). How high its risk of occurrence is—that is, how risky it is—depends on how large the proportion of close possible worlds in which it would occur is—call this the proportion view of degrees of risk—or on how distant possible worlds in which it would occur are—call this the distance view of degrees of risk.

Pritchard contends that the probabilistic account fails to adequately account for degrees of risk. In particular, he argues that if two risk events E1 and E2 have the same probability of occurring but E1, unlike E2, is such that its occurrence is easily possible, then E1 is riskier than E2, whereas the probabilistic account is committed to saying that they are equally risky.
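Pritchard’s contrast can be made concrete with a small Python sketch that compares the two orderings of risk. It is a toy model under loudly stated assumptions: the class `RiskEvent`, the two comparison functions, and all numerical values are hypothetical, and modal closeness is represented crudely as a single number (the distance view of degrees of risk), which is a simplification rather than anything Pritchard commits to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEvent:
    """Toy risk event; all values are illustrative assumptions."""
    name: str
    probability: float       # objective probability that the event occurs
    nearest_distance: float  # distance from the actual world to the nearest
                             # possible world in which the event occurs
                             # (smaller means more easily possible)


def riskier_probabilistically(a: RiskEvent, b: RiskEvent) -> bool:
    """Probabilistic account: a is riskier than b iff a is more probable."""
    return a.probability > b.probability


def riskier_modally(a: RiskEvent, b: RiskEvent) -> bool:
    """Modal account (distance view): a is riskier than b iff a occurs in a closer world."""
    return a.nearest_distance < b.nearest_distance


# Same probability, different modal profiles (hypothetical numbers).
e1 = RiskEvent("E1", probability=0.001, nearest_distance=1.0)
e2 = RiskEvent("E2", probability=0.001, nearest_distance=10.0)

print(riskier_probabilistically(e1, e2), riskier_probabilistically(e2, e1))
print(riskier_modally(e1, e2))
```

With these values the probabilistic ordering treats E1 and E2 as equally risky, while the modal ordering ranks E1 as riskier, which is the divergence Pritchard’s argument turns on.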

Pritchard (2014; 2015) also argues that when risk is understood in modal terms, the notions of luck and risk are basically co-extensive, because both how lucky and how risky an event is depend on the modal profile of the event’s occurrence, that is, on the size of the proportion of close possible worlds in which it would not obtain, or the distance to the actual world of possible worlds in which it would not occur. According to Pritchard, the only two minor differences between the two notions are, on the one hand, that risk is typically associated with negative events, whereas luck can be predicated of both negative and positive events; on the other, that while we can talk of very low levels of risk, we cannot so clearly talk of low levels of luck.

Broncano-Berrocal (2015) makes a further distinction between two ways in which we think of risk: the risk that an event has of occurring—or event-relative risk—and the risk at which an agent is with respect to an event—or agent-relative risk. The distinction serves to delimit the scope of Pritchard’s account: his modal account of risk is an account of event-relative risk—the same applies to the probabilistic view. For Broncano-Berrocal, the modal and probabilistic accounts of event-relative risk are both correct: while the probabilistic conception is the one that is typically used or assumed in scientific and technical contexts, the modal conception better fits our everyday thinking about risky events. On the other hand, the best way to understand the agent-relative sense of risk is, according to Broncano-Berrocal, in terms of lack of control: an agent is at risk with respect to the possible occurrence of an event just in case its occurrence is beyond her control. He further argues that the agent-relative sense of risk is the one that really serves to account for luck: when risk is understood in terms of lack of control, the notions of luck and risk are basically co-extensive, because whether an event is lucky or risky for an agent depends on whether it is under the agent’s control.

e. Indeterminacy

In a causally deterministic world, events are necessitated as a matter of natural law by antecedent conditions. It might be thought that lucky events are events whose occurrence was not predetermined in that way. Against this idea, Pritchard (2005: 126–27) argues that at least some lucky events are not brought about by indeterminate factors. For example, given the position and momentum of the balls in a lottery drum at time t1, it might be fully determinate that a certain combination of balls will be the winning combination at t2. To make the point more vivid, Coffman (2007) proposes an example in which someone’s life depends on the fact that a ball remains perfectly balanced on the tip of a cone in a deterministic world. According to Coffman, that person can be properly described as being lucky if her stay in the deterministic world corresponds to the predetermined temporal interval in which the ball would remain balanced on the cone’s tip. Another example is the following: a Laplacian demon, who is able to predict the future given his knowledge of the complete state of a deterministic world at a prior time, might be unlucky to know in advance that he will die in a car accident. The moral of all these cases is that luck is, or at least seems, fully compatible with determinism.

8. References and Further Reading

Ballantyne, Nathan. 2014. Does luck have a place in epistemology? Synthese 191: 1391–1407.
- Ballantyne argues that investigating the nature of luck does not help us to better understand knowledge.
Ballantyne, Nathan. 2012. Luck and interests. Synthese 185: 319–334.
- Ballantyne provides a detailed examination of the different ways to formulate the significance condition on luck.
Baumann, Peter. 2012. No luck with knowledge? On a dogma of epistemology. Philosophy and Phenomenological Research DOI: 10.1111/j.1933-1592.2012.00622.
- Baumann defends an objective probabilistic condition.
Broncano-Berrocal, Fernando. 2015. Luck as risk and the lack of control account of luck. Metaphilosophy 46: 1–25.
- Broncano-Berrocal proposes a lack of control account and argues that luck can be explained in terms of risk.
Coffman, E. J. 2015. Luck: Its nature and significance for human knowledge and agency. Palgrave Macmillan.
- Coffman’s monograph includes extensive criticism of leading theories of luck and argues that luck can be explained in terms of the notion of stroke of luck; it also explores the applications in epistemology and philosophy of action of that idea.
Coffman, E. J. 2014. Strokes of luck. Metaphilosophy 45: 477–508.
- Coffman proposes an account of strokes of luck.
Coffman, E. J. 2009. Does luck exclude control? Australasian Journal of Philosophy 87: 499–504.
- Coffman defends a specific way to understand the lack of control condition on luck.
Coffman, E. J. 2007. Thinking about luck. Synthese 158: 385–398.
- Coffman gives a hybrid account of luck in terms of easy possibility and lack of control.
Church, Ian M. 2013. Getting ‘Lucky’ with Gettier. European Journal of Philosophy 21: 37–49.
- Church explores several ways to model degrees of luck in modal terms.
Hales, Steven D. 2015. Luck: Its Nature and Significance for Human Knowledge and Agency, by E.J. Coffman. The Philosophical Quarterly, DOI:10.1093/pq/pqv093.
- Critical book review of Coffman’s monograph.
Hales, Steven D. 2014. Why every theory of luck is wrong. Noûs, DOI: 10.1111/nous.12076.
- Hales gives three kinds of counterexamples to probabilistic, modal, and lack of control accounts of luck.
Hales, Steven D. & Johnson, Jennifer Adrienne. 2014. Luck attributions and cognitive bias. Metaphilosophy 45: 509–528.
- Hales and Johnson conduct an empirical investigation on luck attributions and suggest that the results might indicate that luck is a cognitive illusion.
Lackey, Jennifer. 2008. What luck is not. Australasian Journal of Philosophy 86: 255–67.
- Lackey argues that the conditions of modal and lack of control analyses are neither sufficient nor necessary for luck.
Latus, Andrew. 2003. Constitutive luck. Metaphilosophy 34: 460–475.
- Latus gives a hybrid account of luck that features subjective probabilistic and lack of control conditions and uses the account to show that the concept of constitutive luck is not incoherent.
Levy, Neil. 2011. Hard luck: How luck undermines free will and moral responsibility. Oxford University Press.
- Levy proposes a hybrid account that conjoins a modal condition with a lack of control condition and argues that the epistemic requirements on control are so demanding that they are rarely met; he also applies this account to the free will debate.
Levy, Neil. 2009. What, and where, luck is: A response to Jennifer Lackey. Australasian Journal of Philosophy 87: 489–497.
- Levy argues, by appeal to the distinction between luck and fortune, that Lackey’s buried treasure case poses no problem for modal accounts.
McKinnon, Rachel. 2014. You make your own luck. Metaphilosophy 45: 558–577.
- McKinnon gives an answer to the question of what it means to say that someone creates her own luck and uses her account of diachronic luck to explain how we evaluate performances.
McKinnon, Rachel. 2013. Getting luck properly under control. Metaphilosophy 44: 496–511.
- McKinnon proposes an account of diachronic luck in terms of the notion of expected value.
Milburn, Joe. 2014. Subject-involving luck. Metaphilosophy 45: 578–593.
- Milburn distinguishes between subject-relative and subject-involving luck and argues that one of the upshots of focusing on the latter is that lack of control accounts of luck become more attractive.
Owens, David. 1992. Causes and coincidences. Cambridge University Press.
- Owens gives an account of coincidences according to which a coincidence is an event whose constituents are nomologically independent of each other.
Pritchard, Duncan. 2015. Risk. Metaphilosophy 46: 436–461.
- Pritchard argues that the standard way of conceptualizing risk in probabilistic terms is flawed and proposes an alternative modal conception.
Pritchard, Duncan. 2014. The modal account of luck. Metaphilosophy 45: 594–619.
- Pritchard defends the modal account of luck from several objections.
Pritchard, Duncan. 2005. Epistemic luck. Oxford University Press.
- Pritchard introduces the modal account of luck and gives corresponding accounts of epistemic and moral luck.
Pritchard, Duncan, & Smith, Matthew. 2004. The psychology and philosophy of luck. New Ideas in Psychology 22: 1–28.
- Pritchard and Smith survey psychological research on luck and argue that it supports the modal account of luck.
Pritchard, Duncan, & Whittington, Lee John (eds.). 2015. The philosophy of luck. Wiley-Blackwell.
- A volume with many of the papers contained in this bibliography.
Rescher, Nicholas. 2014. The machinations of luck. Metaphilosophy 45: 620–626.
- Rescher defends an objective probabilistic account of luck.
Rescher, Nicholas. 1995. Luck: The brilliant randomness of everyday life. Farrar, Straus and Giroux.
- Rescher provides an extensive examination of the concept of luck as well as of many other issues surrounding it.
Rescher, Nicholas. 1969. The concept of control. In Essays in Philosophical Analysis. University of Pittsburgh Press: 327–354.
- Rescher provides an extensive examination of the concept of control.
Riggs, Wayne D. 2014. Luck, knowledge, and “mere” coincidence. Metaphilosophy 45: 627–639.
- Riggs advances an account of coincidence and applies it to the theory of knowledge.
Riggs, Wayne. 2009. Knowledge, luck, and control. In Haddock, A., Millar, A. & Pritchard, D. (eds.). Epistemic value. Oxford University Press.
- Riggs proposes a lack of control account of luck and replies to some objections.
Riggs, Wayne. 2007. Why epistemologists are so down on their luck. Synthese 158: 329–344.
- Riggs criticizes the modal account of luck and defends a lack of control condition.
Steglich-Petersen, Asbjørn. 2010. Luck as an epistemic notion. Synthese 176: 361–377.
- Steglich-Petersen gives an epistemic analysis of luck in terms of the notion of being in a position to know.
Stoutenburg, Gregory. 2015. The epistemic analysis of luck. Episteme, DOI:10.1017/epi.2014.35.
- Stoutenburg gives an evidential account of degrees of luck.
Williamson, Timothy. 2009. Probability and danger. The Amherst Lecture in Philosophy 4: 1–35.
- Williamson compares probabilistic and modal conceptions of safety and risk and discusses how they bear on the theory of knowledge.

Author Information

Fernando Broncano-Berrocal
Email: fernando.broncanoberrocal@kuleuven.be
University of Leuven (KU Leuven)
Belgium

Epistemic Justification

We often believe what we are told by our parents, friends, doctors, and news reporters. We often believe what we see, taste, and smell. We hold beliefs about the past, the present, and the future. Do we have a right to hold any of these beliefs? Are any supported by evidence? Should we continue to hold them, or should we discard some? These questions are evaluative. They ask whether our beliefs meet a standard that renders them fitting, right, or reasonable for us to hold. One prominent standard is epistemic justification.

Very generally, justification is the right standing of an action, person, or attitude with respect to some standard of evaluation. For example, a person’s actions might be justified under the law, or a person might be justified before God.

Epistemic justification (from episteme, the Greek word for knowledge) is the right standing of a person’s beliefs with respect to knowledge, though there is some disagreement about what that means precisely. Some argue that right standing refers to whether the beliefs are more likely to be true. Others argue that it refers to whether they are more likely to be knowledge. Still others argue that it refers to whether those beliefs were formed or are held in a responsible or virtuous manner.

Because of its evaluative role, justification is often used synonymously with rationality. There are, however, many types of rationality, some of which are not about a belief’s epistemic status and some of which are not about beliefs at all. So, while it is intuitive to say a justified belief is a rational belief, it is also intuitive to say that a person is rational for holding a justified belief. This article focuses on theories of epistemic justification and sets aside their relationship to rationality.

Many philosophers hold that justification is not only evaluative but also normative. Having justified beliefs is better, in some sense, than having unjustified beliefs, and determining whether a belief is justified tells us whether we should, should not, or may believe a proposition. But this normative role is controversial, and some philosophers have rejected it for a more naturalistic, or science-based, role. Naturalistic theories focus less on belief-forming decisions (decisions from a subject’s own perspective) and more on describing, from an objective point of view, the relationship between belief-forming mechanisms and reality.

Regardless of whether justification refers to right belief or responsible belief, or whether it plays a normative or naturalistic role, it is still predominantly regarded as essential for knowledge. This article introduces some of the questions that motivate theories of epistemic justification, explains the goals that a successful theory must accomplish, and surveys the most widely discussed versions of these theories.

1. Starting Points
2. Internalist Foundationalism
a. Basic Beliefs
b. Arguments For and Against Foundationalism
3. Internalist Coherentism
a. Varieties of Coherence
b. Objections to Coherentism
4. Infinitism
a. Arguments for Infinitism
b. Objections to Qualified Infinitism
5. Types of Internalism and Objections
a. Accessibilism and Mentalism
b. Objections to Internalism
6. The Gettier Era
a. The History of the Gettier Problem
b. Responses to the Gettier Problem
7. Externalist Foundationalism
8. Justification as Virtue
9. The Value of Justification
10. Conclusion
11. References and Further Reading

1. Starting Points

Consider your simplest, most obvious beliefs: the color of the sky, the date of your birth, what chocolate tastes like. Are these beliefs justified for you? What would explain the rightness or fittingness of these beliefs? One prominent account of justification is that a belief is justified for a person only if she has a good reason for holding it. If you were to ask me why I believe the sky is blue and I were to answer that I am just guessing or that my horoscope told me, you would likely not consider either a good reason. In either case, I am not justified in believing the sky is blue, even if it really is blue. However, if I were to say, instead, that I remember seeing the sky as blue or that I am currently seeing that it is blue, you would likely think better of my reason. So, having good reasons is a very natural explanation of how our beliefs are justified.

Further, the possibility that my belief that the sky is blue is not justified, even if it is true that the sky is blue, suggests that justification is more than simply having a true belief. All of my beliefs may be true, but if I obtained them accidentally or by faulty reasoning, then they are not justified for me; if I am seeking knowledge, I have no right to hold them. Further still, true belief may not even be necessary for justification. If I understand Newtonian physics, and if Newton’s arguments seem right to me, and if all contemporary physicists testify that Newtonian physics is true, it is plausible to think that my belief that it is true is justified, even if Einstein will eventually show that Newton and I are wrong. We can imagine this was the situation of many physicists in the late 1700s. If this is right, justification is fallible—it is possible to be justified in believing false propositions. Though some philosophers have, in the past, rejected fallibilism about justification, it is now widely accepted. Having good reasons, it turns out, does not guarantee having true beliefs.

But the idea that justification is a matter of having good reasons faces a serious obstacle. Normally, when we give reasons for a belief, we cite other beliefs. Take, for example, the proposition, “The cat is on the mat.” If you believe it and are asked why, you might offer the following beliefs to support it:

1. I see that the cat is on the mat.

2. Seeing that X implies that X.

Together, these seem to constitute a good reason for believing the proposition:

3. The cat is on the mat.

But does this mean that proposition 3 is epistemically justified for you? Even if the combination of propositions 1 and 2 counts as a good reason to believe 3, proposition 3 is not justified unless both 1 and 2 are also justified. Do we have good reasons for believing 1 and 2? If not, then according to the good reasons account of justification, propositions 1 and 2 are unjustified, which means that 3 is unjustified. If we do have good reasons for believing 1 and 2, do we have good reasons for believing those propositions? How long does our chain of good reasons have to be before even one belief is justified? These questions lead to a classic dilemma.

a. The Dilemma of Inferential Justification

For simplicity, let’s focus on proposition 1: I see that the cat is on the mat.

Horn A: If there are no good reasons to believe proposition 1, then proposition 1 is unjustified, which means 3 is unjustified.

Horn B: If there is a good reason to believe proposition 1, say proposition 1_a, then either 1_a is unjustified or we need another belief, proposition 1_b, to justify 1_a. If this process continues infinitely, then 1 is ultimately unjustified, and, therefore, 3 is unjustified.

Either way, proposition 3 is unjustified.

Horn A of the dilemma is the problem of skepticism about justification. If our most obvious beliefs are unjustified, then no belief derived from them is justified; and if no belief is justified, we are left with an extreme form of skepticism. Horn B of the dilemma is called the regress problem. If every reason we offer requires a reason that also requires a reason, and so on, infinitely, then no belief is ultimately justified.

Both of these problems assume that all justification involves inferring beliefs from one or more other beliefs, so let’s call these two problems the dilemma of inferential justification (DIJ). And let’s call the assumption that all justification involves inference from other beliefs the inferential assumption (also called the doxastic assumption, Pollock 1986: 19).
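The structure of the dilemma can be made concrete with a small sketch. The Python below is only an illustrative model, not part of any standard account: the function name `classify_chain`, the `reason_for` lookup, and the `max_depth` cutoff (a finite stand-in for an infinite regress) are all assumptions introduced here. Under the inferential assumption, every belief must be supported by a further belief, so following the chain of reasons either reaches a belief with no reason behind it (Horn A) or never ends (Horn B).

```python
from typing import Callable, Optional


def classify_chain(belief: str,
                   reason_for: Callable[[str], Optional[str]],
                   max_depth: int = 1000) -> str:
    """Follow the chain of reasons behind `belief` and report which horn it hits.

    `reason_for(b)` is a hypothetical lookup: it returns the belief offered as
    a reason for b, or None when no reason is available.
    """
    current = belief
    for _ in range(max_depth):
        nxt = reason_for(current)
        if nxt is None:
            # The chain stops at a belief with no reason behind it.
            return "Horn A"
        current = nxt
    # The chain showed no sign of ending within the cutoff (regress).
    return "Horn B"


# A finite chain: 3 rests on 1, 1 on 1_a, 1_a on 1_b, and 1_b on nothing.
finite = {"3": "1", "1": "1_a", "1_a": "1_b"}


def finite_reasons(b: str) -> Optional[str]:
    return finite.get(b)


def endless_reasons(b: str) -> Optional[str]:
    # Always supplies a fresh reason, so the chain never terminates.
    return b + "'"


print(classify_chain("3", finite_reasons))   # Horn A
print(classify_chain("3", endless_reasons))  # Horn B
```

On this model there is no third outcome, which is why the non-skeptical responses discussed in this article either drop the inferential assumption or argue that one of the horns is less damaging than it appears.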

Responses to this dilemma typically take one of two forms. On one hand, we might embrace Horn A, which is, in effect, to adopt skepticism and eschew any further attempts to justify our beliefs. This is the classic route of the Pyrrhonian skeptics, such as Sextus Empiricus, and some later Academic skeptics, such as Arcesilaus. (For more on these views, see Ancient Greek Skepticism.)

On the other hand, we might offer an explanation of how beliefs can be justified in spite of the dilemma. In other words, we might offer an account of epistemic justification that resolves the dilemma, either by constructing a third, less problematic option or by showing that Horn B is not as troublesome as philosophers have traditionally supposed. This non-skeptical route is the majority position and the focus of the remainder of this article.

Philosophers tend to agree that any adequate account of epistemic justification—that is, an account that resolves the dilemma—must do at least three things: (1) explain how a belief comes to be justified for a person, (2) explain what role justification plays in our belief systems, and (3) explain what makes justification valuable in a way that is not merely practically or aesthetically valuable.

b. Explaining How Beliefs are Justified

One of the central aims of theories of epistemic justification is to explain how a person’s beliefs come to be justified in a way that resolves the DIJ. Those who accept the inferential assumption argue either that a belief is justified if it coheres with—that is, stands in mutual support with—the whole set of a person’s beliefs (coherentism) or that an infinite chain of sequentially supported beliefs is not as problematic as philosophers have claimed (infinitism).

Among those who reject the inferential assumption, some argue that justification is grounded in special beliefs, called basic beliefs, that are either obviously true or supported by non-belief states, such as perceptions (foundationalism). Others who reject the inferential assumption argue that justification is either a function of the quality of the mechanisms by which beliefs are formed (externalism) or at least partly a function of certain qualities or virtues of the believer (virtue epistemology).

In addition to resolving the DIJ, theories of justification must explain what it is about forming or holding a belief that justifies it in order to explain how a belief is justified. Some argue that justification is a matter of a person’s mental states: a belief is justified only if a person has conscious access to beliefs and evidence that support it (internalism). Others argue that justification is a matter of a belief’s origin or the mechanisms that produce it: a belief is justified only if it was formed in a way that makes the belief likely to be true (externalism), whether through an appropriate connection with the state of affairs the belief is about or through reliable processes. The former view is called internalism because the justifying reasons—whether beliefs, experiences, testimony, and so forth—are internal mental states, that is, states consciously available to a person. The latter view is called externalism because the justifying states are outside a person’s immediate mental access; they are relationships between a person’s belief states and the states of the world outside the believer’s mental states (see Internalism and Externalism in Epistemology).

c. Explaining the Role of Justification

A second central aim of epistemology is to identify and explain the role that justification plays in our belief-forming behavior. Some argue that justification is required for the practical work of having responsible beliefs. Having certain reasons makes it possible for us to choose well which beliefs to form and hold and which to reject. This is called the guidance model of justification. Some philosophers who accept the guidance model, like René Descartes and W. K. Clifford, pair it with a strongly normative role according to which justification is a matter of fulfilling epistemic obligations. This combination is sometimes called the guidance-deontological model of justification, where “deontology” refers to one’s duties with respect to believing. Other epistemologists reject the guidance and guidance-deontological models for more descriptive models. Justification, according to these philosophers, is simply a feature of our psychology, and though our minds form beliefs more effectively under some circumstances than others, the conditions necessary for forming justified beliefs are outside of our access and control. This objective, naturalistic model of justification has it that our understanding of justification should be informed, in large part, by psychology and cognitive science.

d. Explaining Why Justification is Valuable

A third central aim of theories of justification is to explain why justification is epistemically valuable. Some epistemologists argue that justification is crucial for avoiding error and increasing our store of knowledge. Others argue that knowledge is more complicated than attaining true beliefs in the right way and that part of the value of knowledge is that it makes the knower better off. These philosophers are less interested in the truth-goal in its unqualified sense; they are more interested in intellectual virtues that position a person to be a proficient knower, virtues such as intellectual courage and honesty, openness to new evidence, creativity, and humility. Though justification increases the likelihood of knowledge under some circumstances, we may rarely be in those circumstances or may be unable to recognize when we are; nevertheless, these philosophers suggest, there is a fitting way of believing regardless of whether we are in those circumstances.

A minority of epistemologists reject any connection between justification and knowledge or virtue. Instead, they focus either on whether a belief fits into an objective theory about the world or whether a belief is useful for attaining our many and diverse cognitive goals. An example of the former involves focusing solely on the causal relationship between a person’s beliefs and the world; if knowledge is produced directly by the world, the concept of justification drops out (for example, Alvin Goldman, 1967). Other philosophers, whom we might call relativists and pragmatists, argue that epistemic value is best explained in terms of what most concerns us in practice.

Debates surrounding these three primary aims inspire many others. There are questions about the sources of justification: Is all evidence experiential, or is some non-experiential? Are memory and testimony reliable sources of evidence? And there are additional questions about how justification is established and overturned: How strong does a reason have to be before a belief is justified? What sort of contrary, or defeating, reasons can overturn a belief’s justification? In what follows, we look at the strengths and weaknesses of prominent theories of justification in light of the three aims just outlined, leaving these secondary questions to more detailed studies.

e. Justification and Knowledge

The type of knowledge primarily at issue in discussions of justification is knowledge that a proposition is true, or propositional knowledge. Propositional knowledge stands in contrast with knowledge of how to do something, or practical knowledge. (For more on this distinction, see Knowledge.) Traditionally, three conditions must be met in order for a person to know a proposition—say, “The cat is on the mat.”

First, the proposition must be true; there must actually be a state of affairs expressed by the proposition in order for the proposition to be known. Second, that person must believe the proposition, that is, she must mentally assent to its truth. And third, her belief that the proposition is true must be justified for her. Knowledge, according to this traditional account, is justified true belief (JTB). And though philosophers still largely accept that justification is necessary for knowledge, it turns out to be difficult to explain precisely how justification contributes to knowing.

Historically, philosophers regarded the relationship between justification and knowledge as strong. In Plato’s Meno, Socrates suggests that justification “tethers” true belief “with chains of reasons why” (97A-98A, trans. Holbo and Waring, 2002). This idea of tethering came to mean that justification, when one is genuinely justified, guarantees or significantly increases the likelihood that a belief is true, and, therefore, we can tell directly when we know a proposition. But a series of articles in the 1960s and 1970s demonstrated that this strong view is mistaken; justification, even for true beliefs, can be a matter of luck. For example, imagine the following three things are true: (1) it is three o’clock, (2) the normally reliable clock on the wall reads three o’clock, and (3) you believe it is three o’clock because the clock on the wall says so. Now suppose the clock is broken. Even though you are justified in believing it is three o’clock, you are not justified in a way that constitutes knowledge. You got lucky; you looked at the clock at precisely the time it corresponded with reality, but its correspondence was not due to the clock’s reliability. Therefore, your justified true belief seems not to be an instance of knowledge. This sort of example is characteristic of what I call the Gettier Era (§6). During the Gettier Era, philosophers were pressed to revise or reject the traditional relationship.
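The broken-clock case can be set against the three traditional conditions in a short sketch. The Python below is a hedged illustration only: the `BeliefState` record and the `satisfies_jtb` check are names invented here, and the boolean fields simply record the verdicts stated in the text rather than computing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeliefState:
    """Hypothetical record of the three traditional conditions for one belief."""
    proposition: str
    is_true: bool       # condition 1: the proposition is true
    is_believed: bool   # condition 2: the person believes it
    is_justified: bool  # condition 3: the belief is justified for her


def satisfies_jtb(state: BeliefState) -> bool:
    """Return True when all three traditional conditions are met."""
    return state.is_true and state.is_believed and state.is_justified


# The broken-clock case as described: true, believed, and justified.
clock_case = BeliefState("It is three o'clock", True, True, True)

# A lucky guess for contrast: true and believed, but not justified.
guess = BeliefState("It is three o'clock", True, True, False)

print(satisfies_jtb(clock_case))  # True
print(satisfies_jtb(guess))       # False
```

The check returns True for the clock case even though, on the judgment reported above, the belief is not knowledge. Nothing in the three fields records that the clock is broken, and that gap is what the examples of this period exploit.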

In response, some have maintained that the relationship between justification and knowledge is strong, but they modify the concept of justification in an attempt to avoid lucky true beliefs. Others argue that the relationship is weaker than traditionally supposed: something is needed to increase the likelihood that a belief is knowledge, and justification is part of that, but justification is primarily about responsible belief. Still others argue that whether we can tell we are justified is irrelevant; justification is a truth-conducive relationship between our beliefs and the world, and we need not be able to tell, at least not directly, whether we are justified. The Gettier Era (§6) precipitated a number of changes in the conversation about justification’s relationship to knowledge, and these remain important to contemporary discussions of justification. But before we consider these developments, we address the DIJ.

2. Internalist Foundationalism

One way of resolving the DIJ is to reject the inferential assumption, that is, to reject the claim that all justification involves inference from other beliefs. The most prominent way of doing this while avoiding skepticism is to show that all chains of good inference culminate at a unique kind of belief called a basic belief. Basic beliefs are beliefs that need not be inferred from any other beliefs in order to be justified. This approach to resolving the dilemma is called foundationalism because basic beliefs serve as a foundation on which all other justified beliefs are supported; a person’s beliefs are related to one another like the parts of a building: beliefs justified by inference are analogous to the roof and walls, which are in turn supported by foundational basic beliefs (see Figure 1).

Foundationalism comprises a family of views, all of which claim, at minimum, that all justified beliefs are either basic or inferred from other justified beliefs. Classically, foundationalists combine this view with two further claims: that we can know whether a belief is justified (that is, whether it stands in an evidential chain that starts with a basic belief), and that knowing whether we are justified helps us fulfill our epistemic duties. In other words, we do well when we form or keep beliefs that are well supported and discard or refuse beliefs that are not; we do poorly when we do not.

The view that justification is a matter of having certain internal mental states is called internalism, and the family of views that combines internalism with foundationalism is called internalist foundationalism. There is a further debate among internalists as to whether justification requires simply having certain mental states (propositional justification) or whether justified beliefs must be based on those mental states (doxastic justification). Philosophers who reject internalism are called externalists (see §7 of this article). Another debate among internalists is whether justification helps us to fulfill epistemic duties, that is, whether it tells us which beliefs are epistemically permissible, obligatory, or impermissible (the deontological conception of justification), or whether it is simply a descriptive fact about our belief systems. (For an example of the latter, see Conee and Feldman 2004.)

Figure 1: Simple Foundationalist Justification. The dots represent beliefs; the arrows represent inferential relations.
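The minimal foundationalist claim, that every justified belief is either basic or inferred from other justified beliefs, has a recursive shape that a short sketch can make explicit. The Python below is an assumption-laden illustration, not a formulation taken from any foundationalist: the names `is_justified`, `basic`, and `premises` are invented here, each non-basic belief is given at most one inference for simplicity, and treating propositions 1 and 2 from the cat-on-the-mat example as basic is purely for demonstration.

```python
from typing import Dict, List, Optional, Set


def is_justified(belief: str,
                 basic: Set[str],
                 premises: Dict[str, List[str]],
                 visiting: Optional[Set[str]] = None) -> bool:
    """Return True if `belief` is basic or inferred from justified beliefs.

    `premises[b]` lists the beliefs from which b is inferred (the arrows in
    the figure). A belief that would have to support itself is not justified.
    """
    visiting = set() if visiting is None else visiting
    if belief in basic:
        return True   # the regress stops at a basic belief
    if belief in visiting:
        return False  # circular support does not count
    supports = premises.get(belief, [])
    if not supports:
        return False  # non-basic and without any reasons
    on_path = visiting | {belief}
    return all(is_justified(p, basic, premises, on_path) for p in supports)


inferences = {"3": ["1", "2"]}  # proposition 3 is inferred from 1 and 2

print(is_justified("3", {"1", "2"}, inferences))  # True
print(is_justified("3", {"1"}, inferences))       # False: 2 is unsupported
print(is_justified("p", set(), {"p": ["q"], "q": ["p"]}))  # False: circular
```

The sketch settles nothing about which beliefs belong in `basic`; that is the substantive question taken up next.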

a. Basic Beliefs

It is one thing to say basic beliefs resolve the DIJ and quite another thing to explain how they do. René Descartes famously argued that some beliefs are basic because they are indubitable. If a belief is genuinely indubitable, Descartes argued, it cannot be false. As it is commonly understood, dubitability is a psychological, not epistemic, matter. It might be indubitable for me that my mother loves me, even if it is not true and even if it is the sort of belief that could be doubted, even perhaps by me. But Descartes used “indubitable” to describe a belief that is clear and distinct, which is supposed to guarantee that the belief is true. (See Harry Frankfurt, 1973 for a fuller discussion of clarity and distinctness.) Other foundationalists have explained how some beliefs might stop the regress in virtue of self-evidence, or their privileged role in our belief-forming systems, or their incorrigibility.

Long before Descartes, simple mathematical propositions, such as 2 + 2 = 4, and logical propositions, such as “no one is taller than herself,” were thought to be so obvious that they could not be false. These propositions, many claimed, are self-evidently true, that is, they need no supporting evidence because any attempt to support them would be weaker than their intuitive truth. Some philosophers include perceptual experiences among self-evident beliefs, experiences such as seeing red and hearing a ringing sound. Even if you misperceive a color or a sound, or misperceive what seems to be colored or what seems to be ringing, you cannot doubt that you are having the experience of seeing redness or hearing ringing.

Another explanation for why some beliefs are basic is that they play a privileged role in our belief-forming systems. A common example of beliefs privileged in this way are those formed on the basis of sensory perception: seeing a red ball, touching what feels like a rough surface, hearing a bell. You could be hallucinating these experiences, so it is not self-evident that there is a ball, bell, or surface to experience. Nevertheless, the world impresses itself on you in this way, and it would be difficult to imagine functioning without any sense perceptions whatsoever; they play a highly privileged role in our belief systems and, therefore, can justify other beliefs (hence the emphasis that scientists have traditionally placed on observation).

Further candidates for basicality are beliefs that are true in virtue of being believed, that is, if you believe them, they are true. For example, propositions about intentional states (in other words, states about a mental state, such as hoping, doubting, thinking, believing, and so forth) logically imply the existence of the subject who is in the state. So, anyone who, while thinking, believes the proposition “I think” can logically infer “I exist.” Beliefs that, if held, are true are called incorrigible. Other examples may include beliefs about introspective states such as what you believe or feel or remember. If incorrigible beliefs can be recognized as true without appeal to any other beliefs, they are good candidates for justifying other, non-basic beliefs.

Unfortunately, it is not easy to see how all of our many and various non-basic justified beliefs can be inferred from this relatively small set of basic beliefs, even if we accepted every type of basic belief just mentioned. For example, imagine you have been looking for your laptop computer. When you find it, you form the belief, “There’s my laptop.” Did seeing your computer elicit the basic belief, “I seem to be perceiving a laptop there,” from which you then inferred the belief “There’s my laptop”? Not obviously. Seeing the laptop allowed you directly—without any reasoning at all—to form the belief that you found your laptop.

Examples like these have motivated some foundationalists to expand their accounts of basic beliefs to include a wider variety of experiences. These weaker accounts allow that there are many types of non-inferentially justified beliefs, all of which are at least properly basic, where “properly basic” means a belief that is either basic in the classic sense or that meets some other condition that makes it non-inferentially justified for a person. As long as there are a sufficient number of properly basic beliefs, these philosophers argue, a certain sort of foundationalism remains plausible.

One example of how proper basicality might work is Alvin Plantinga’s (1983; 1993a) argument for the rationality of religious belief. Plantinga’s notion of proper basicality is supposed to be weak enough to avoid problems with classic basic beliefs but strong enough to avoid the DIJ. According to him, if a belief is properly basic for a person, it is rational for that person to accept it without appealing to other reasons. He uses rational instead of justified to distance himself from classical problems. (Sometimes Plantinga puts it even more weakly, such that, if a belief is properly basic for a person, that person is not irrational in holding it.) As an example, Plantinga argues that if a person is raised in a religious community where the central religious claims he hears are corroborated by the community and none of those claims is undermined by contrary experience or argument, he is not violating any epistemic duty in believing that, say, God exists. His experiences and circumstances can “call forth belief in God” in a way that does not require other beliefs and can serve as a reason to accept other beliefs (1983: 81). This is a controversial view, not least because it either changes the discussion from justification to rationality or conflates justification and rationality. Nevertheless, basic beliefs are controversial no matter how they are characterized, and Plantinga’s proper basicality is just one among several. For another attempt to defend classical foundationalism against objections, see Timothy McGrew (1995).

b. Arguments For and Against Foundationalism

Foundationalism has remained competitive in the history of justification largely because of its intuitive advantages over competing views. The most common argument for foundationalism is the positive argument that it explains how we actually form beliefs on the basis of evidence. I believe the sky is blue because I see that it is blue, not because I infer it from other beliefs about the sky. Roderick Chisholm offers a sophisticated version of this argument, concluding that “[t]hinking and believing provide us with paradigm cases of the directly evident” (1966: 28). In addition to this positive argument, foundationalists offer the negative argument that no alternative account—skepticism, coherentism, or infinitism—has the resources to satisfactorily resolve the DIJ, that is, to avoid both skepticism and an infinite regress (see BonJour and Sosa 2003). This is, perhaps, the more powerful of the arguments and merits some attention.

Skepticism motivated epistemologists to inquire into justification in the first place, so the skeptical option is generally considered a loss. As an alternative, coherentists (§3) maintain that a person’s beliefs are justified in virtue of their relationship to the person’s belief set (see Lehrer 1974). If a belief stands or can stand in a consistent, mutually supportive relationship with other beliefs—a “web of belief,” as W. V. O. Quine (1970) calls it—that belief is justified. However, there is reason to believe that, since all beliefs stand in mutually supportive relationships, at least some beliefs (perhaps all) will play an indispensable role in their own support, rendering any coherentist argument viciously circular. Since circular arguments are fallacious, if coherentism entails that justification is circular, coherentism cannot resolve the DIJ.

A more recent alternative to skepticism is infinitism (see §4), according to which all justified beliefs stand in infinite chains of inferential relations (see Klein 2005). Skepticism is avoided because every belief is justified by some other belief. Unfortunately, infinitism requires that we accept one of two questionable assumptions: either that there simply is an infinite number of justifying beliefs available (and to which our minds, in virtue of being finite, do not have access) or that there is some algorithm that, for any belief, B, can direct us to a non-circular justifying belief for B. The problem with the former assumption is that it seems to depend on faith that there is an infinite series of justifiers, which is not obviously better than having no justification at all. And the problem with the latter is that it comes dangerously close to foundationalism, where the algorithm functions as a basic belief. If infinitists cannot refute these objections, infinitism cannot resolve the DIJ.

These are simple concerns about coherentism and infinitism, and we consider more sophisticated objections in sections 3 and 4. But, if neither coherentism nor infinitism can provide an alternative means of resolving the original dilemma, foundationalism may be the most promising alternative to skepticism. Unfortunately for foundationalists, even if they are right that some account of basic belief would adequately resolve the dilemma of inferential justification, it is not clear that such an account is currently available. Further, there are at least two other serious objections to foundationalism.

First, there is some concern that foundationalism cannot be justified by its own account of justification, that is, foundationalism is self-defeating. Alvin Plantinga (1993b) offers a version of this objection. According to foundationalists, a belief is justified if and only if it is either basic or inferred from other justified beliefs. This criterion, though, is not itself basic on any classical conception of basic beliefs (indubitability, self-evidence, evident to the senses, or incorrigibility), and it is not clear how it could be supported by other justified beliefs.

One straightforward response to this objection is that the arguments above (the positive argument and the negative argument by elimination) do provide, contra Plantinga, inferential support for foundationalism. In fact, Plantinga (1983; 1993) expands his own notion of proper basicality precisely to avoid the self-defeat objection. Further, if sophisticated reasoning strategies like induction could be justified on foundationalist grounds, then foundationalism itself may be justified on such grounds. For example, Laurence BonJour (1998) defends rational insight as a basic source of evidence and then argues that induction is justified by rational insight. If foundationalism is roughly correct and there are arguments grounded in rational insight that justify foundationalism, foundationalism might be vindicated. Of course, there remain concerns about the circularity of such arguments.

Other philosophers use an inference to the best explanation to defend a type of basic evidence, though these views may rightly be regarded as hybrids of foundationalism and coherentism. For example, Earl Conee and Richard Feldman (2008) argue that “[p]erceptual experiences can contribute toward the justification of propositions about the world when the propositions are part of the best explanation of those experiences that is available to the person.” On this view, the idea that what have been called basic beliefs are connected with the world, and with how we are positioned in it, better explains why we have the evidence we have than traditional accounts of justification do. Catherine Z. Elgin (2005) offers a similar account, arguing that, while perceptions have “initial tenability” given their privileged role in our belief formation, they do not obtain this tenability in isolation from our whole evidential context; over time, certain perceptual beliefs have proved themselves to have the plausibility that allows us to privilege them.

A second objection to foundationalism is the meta-justification argument. The idea is that basic beliefs cannot resolve the DIJ because, even if their justification does not depend on other beliefs, it does depend on reasons which themselves require reasons. If I believe a proposition because it is indubitable, then I must have some reason for thinking that indubitable beliefs are likely to be true. If I do not, I am stuck with Horn A, and if I do, I am stuck with Horn B. To demonstrate this problem, Peter Klein (2005) asks us to imagine an argument between Fred and Doris, where Fred has come to what he regards as the basic belief on which his argument depends; call it b.

According to Fred, b has autonomous justification, that is, is a type of basic belief. Doris happens to agree that b is autonomously justified but asks whether beliefs with autonomous warrant are likely to be true. As a foundationalist, the most plausible option for Fred is the following: “He can hold that autonomously warranted propositions are somewhat likely to be true in virtue of the fact that they are autonomously warranted” (2005: 133).

If Fred is right, however, b only works as a justification for the rest of his argument precisely because he has added something to b. What has he added? Namely, that he “has a very good reason for believing b, namely b has F and propositions with F are likely to be true.” These are propositions independent of b that serve to justify b. Klein continues: “Of course Fred, now, could be asked to produce his reasons for thinking that b has F and that basic propositions are somewhat likely to be true in virtue of possessing feature F” (2005: 134). If this is right, basic beliefs do not stop the regress of reasons (see also Smithies 2014).

One response to this criticism comes from Laurence BonJour, who argues that it is plausible to think that understanding b includes a sort of built-in awareness of the content of those additional premises Klein mentions, such that understanding b constitutes, in and of itself, a reason to hold b (BonJour and Sosa 2003: 60-68). If it is possible to have an evidential state that includes, non-inferentially, all the content necessary for having a reason to believe a proposition is true, foundationalists may be able to describe a basic belief that stops the regress and avoids skepticism. But explaining just what this state is remains a point of controversy.

Another response is to construct an inference to the best explanation, as mentioned above in response to the self-defeat objection (Elgin, 2005; Conee and Feldman, 2008). The result, again, is typically a hybrid view, which may be equivalent to giving up foundationalism. Conee and Feldman say their view is closer to a “non-traditional version of coherentism” (2008: 98). And Elgin calls her view “a very weak foundationalism or…a coherence theory” (2005: 166). This raises questions about the merits of coherentism, to which we now turn.

3. Internalist Coherentism

Like foundationalists, coherentists attempt to avoid skepticism while rejecting infinitism. But they find a further problem with foundationalism. Every sensory state (seeing red, smelling cinnamon, and so forth) must be understood in a mental context, that is, one must have a set of background experiences, beliefs, and vocabulary sufficiently large for forming and understanding beliefs. All sensory beliefs, such as “I see red” and “I smell cinnamon,” require an immensely complex set of assumptions about self-reference, seeing, colors, smelling, and scents. This means that individual beliefs are not isolated bits of information that act as bricks in a building; they are nodes of information that depend for their meaning and support on a web of relationships with other beliefs.

Many coherentists accept the inferential assumption and argue that the result is not an infinite regress of inferences, but a non-linear system of support from which justification emerges as a property of the combination of inferences. As Donald Davidson puts it, “[N]othing can count as a reason for holding a belief except another belief” (2000: 156). Other coherentists reject the inferential assumption and argue that the result is a non-linear system of support from which justification emerges as a property of the set as a whole. Keith Lehrer explains: “This does not make the belief self-justified, however, even though it might be non-inferential. The belief is not justified independently of relations to other beliefs. It is justified because of the way it coheres with other beliefs belonging to a system of beliefs” (1990: 89). As we see below (§3.b), some coherentists reject the belief requirement of the inferential assumption, arguing that perceptual experiences can play a justifying role in the set of mental states that includes a person’s beliefs.

Regardless of whether coherentists accept the inferential assumption, they can allow that some beliefs are non-inferentially generated, for example, by experiences, intuitions, hunches, and so forth. But they are committed to the idea that the justification for beliefs generated in these ways depends essentially on their relationship to the person’s complete set of beliefs. Construed in this way, coherentism is specifically a view about justification and should not be confused with coherentism about truth. Some philosophers have held both coherentism about truth and justification (Blanshard 1939 and Lewis 1946), but many who hold coherentism about justification reject coherentism about truth (see BonJour 1985, ch. 5, and Truth).

a. Varieties of Coherence

Broadly, coherentists argue that a belief is justified just in case it stands in a system of mutually supporting relationships with other beliefs in a person’s system of beliefs. For instance, my belief that the cat is on the mat involves a complicated set of beliefs: I am seeing a cat, I am seeing a mat, I am seeing a cat on a mat, a cat is a particular kind of mammal, a mat is a particular type of floor covering, my vision is generally reliable under normal circumstances, these are normal circumstances, and so forth. It is difficult to imagine arranging these in a linear, foundationalist fashion. In addition, it is not clear whether some of these beliefs are more basic than some others. Nevertheless, they all cohere, which means they are logically consistent with one another and with other beliefs in my belief set, and they mutually support one another. The challenge for coherentists is to explain just what “mutual support” amounts to.

Whereas foundationalists employ the metaphor of a building (or a pyramid, in some cases) to explain justificational relationships, coherentists employ the metaphor of a web (or, in some cases, a raft), according to which, each node (or plank) works alongside the others in a non-linear fashion to constitute a stable, interconnected whole (see Figure 2, as well as Neurath 1932, Quine 1970, and Sosa 1980). There are four candidates for how the web or raft holds together: logical consistency, logical entailment, inductive probability, and explanation.

Figure 2: Simple Coherentist Justification. P-S represent propositions; the arrows represent lines of inference.

The first candidate, logical consistency, is generally regarded as necessary for coherence but too weak to stand on its own. For example, the belief that P and the belief that probably not-P are logically consistent. But they are not coherent; if one of them is true, the other is not likely to be true (BonJour 1985, ch. 5). Therefore, some early coherentists added that the relationship must also include logical entailment. This view, which I will call entailment coherentism, has it that a belief is justified just in case it entails or is entailed by every other belief in a person’s belief set (Blanshard 1939). Most coherentists now reject this relationship as overly strict, primarily because it seems possible to have two very different beliefs, neither of which entails the other and yet which are both justified. For example, consider the beliefs “I am seeing a needle puncture my skin” and “I am feeling pain.” Neither belief entails the other; nevertheless, it is intuitively plausible that both belong to a coherent set of beliefs.

Because of the problems with mere consistency and consistency plus entailment, most coherentists allow that entailment is sufficient for coherence but not necessary. To capture weaker relationships, they expand the notion to include inductive probability. Inductive probability coherentism is the view that a belief is justified just in case it is a member of a set each of whose members is entailed by or made more probable by a subset of the rest. C. I. Lewis, calling this type of justification “congruence,” puts it eloquently: “A set of statements, or a set of supposed facts asserted, will be said to be congruent if and only if they are so related that the antecedent probability of any one of them will be increased if the remainder of the set can be assumed as given premises” (1962). With their emphasis on inferential relations among beliefs, entailment and inductive probability coherentism attempt to resolve the DIJ by capturing the intuitive plausibility of the inferential assumption while avoiding the difficulties with basic beliefs.
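Lewis’s definition of congruence can be written compactly. The following is only a sketch of the quoted condition; the symbols are assumptions introduced here for illustration and are not Lewis’s own notation: the set members are written A_1, ..., A_n, and P stands for antecedent probability.

```latex
\[
\{A_1,\dots,A_n\}\ \text{is congruent}
\iff
\forall i \in \{1,\dots,n\}:\quad
P\Bigl(A_i \,\Bigm|\, \bigwedge_{j \neq i} A_j\Bigr) > P(A_i)
\]
```

Read this way, congruence requires that each member be made more probable by the conjunction of all the others, which is stronger than mere logical consistency and weaker than mutual entailment.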

Unfortunately, inductive probability coherentism faces problems similar to those that face entailment coherentism. First, it seems plausible for a person to hold two justified beliefs without the antecedent probability of either increasing the epistemic probability of the other, even when conjoined with other beliefs in the set. Consider, for example, your beliefs that “the Red Sox will win the Pennant” and “John F. Kennedy was shot in 1963.” Both beliefs are reasonably part of a person’s belief system, and yet it is difficult to see how one might contribute to a set of beliefs that makes the other more probable. Second, even if a subset of beliefs in a set increases the probability of each other member, the set might not be sufficiently comprehensive or well-connected with one’s experiences to justify one’s beliefs. Imagine a set of 100 beliefs, any 99 of which render the 100th member more probable than its antecedent probability. This set passes the inductive probability test and is, therefore, coherent on this account, but it includes very few beliefs. This suggests that, in order to maintain coherence, we could arbitrarily expand or contract our set of beliefs at will to avoid loss of rationality. The only guideline is that we preserve strong inductive inferences. Unfortunately, such arbitrary sets ignore important differences in the sources of beliefs; we can imagine two inductively coherent sets, one that includes sensory beliefs and one that does not. Inductive probability coherentism, without further qualification, implies that neither set is more rational than the other. As Catherine Z. Elgin puts it, “A good nineteenth-century novel is highly coherent, but not credible on that account. Even though Middlemarch is far more coherent than our regrettably fragmentary and disjointed views…, the best explanation of its coherence lies in the novelist’s craft, not in the truth…of the story” (2005: 159-60).

A third prominent account of coherence aimed at avoiding this criticism allows that entailment and inductive probability can contribute to coherence but only insofar as they function in a plausible explanation of the set of beliefs. According to this view, known as explanatory coherentism, beliefs are justified just in case they explain or are explained by the other beliefs of the same type (Harman 1986 and Poston 2014). This view is not committed to the inferential assumption and argues that justification is an emergent property of the explanatory relations among beliefs. Catherine Z. Elgin says that “epistemic justification is primarily a property of a suitably comprehensive, coherent account, when the best explanation of coherence is that the account is at least roughly true” (2005: 158). Elgin adds that the beliefs comprising a coherent system “must be mutually consistent, cotenable, and supportive. That is, the components must be reasonable in light of one another” (2005: 158).

Explanatory coherentism takes its motivation from responses to a problem in philosophy of science that was similar to the problem that faces inductive probability coherentism (Neurath 1932 and Hempel 1935). Not every proposition in a scientific theory is derived inferentially from others, and so there is some question as to whether such propositions could be believed justifiably. It turns out, though, that those propositions play an important explanatory role in the theory that organizes evidence and concepts in plausible ways, even if those propositions have no antecedent probability outside of the system. Elgin explains, “For example, although there is no direct evidence of positrons, symmetry considerations show that a physical theory that eschewed them would be significantly less coherent than one that acknowledged them. So physicists’ commitment to positrons is epistemically appropriate” (2005: 164). This suggests that explanations can play a justifying role independently of inferential relations, thus lending plausibility to coherentism.

Explanatory coherence avoids criticisms of earlier accounts in that it (1) maintains that consistency is an important constraint on a belief set, and (2) maintains that inferential relations contribute to explanatory power, while (3) also accounting for the intuitive connection of certain beliefs with sensory evidence and non-inferential coherence relations. Nevertheless, some criticisms have led philosophers like BonJour (1985), Lehrer (1974; 1990), and Poston (2014) to add other interesting and influential conditions to coherence theories, though space prevents us from exploring them here.

b. Objections to Coherentism

There are three prominent objections to coherentism. The first, which we already encountered in §2.b, is called the circularity problem. Since coherentism depends on mutual support relations, every particular belief will likely play an essential role in its own justification, rendering coherentist justification a form of circular argument (see Figure 3).

The problem with circular justification is that it putatively undermines the goal of justification, which is to garner support for a claim. If a claim is inferred from itself (P → P), the concluding proposition has only as much support as the premise, but that is precisely what we do not know. Therefore, multiplying the inferences between a proposition and an inference to that proposition (for example, (P → Q); (Q → R); (R → S); (S → P)) cannot justify P.

In response, some coherentists argue that the circularity objection oversimplifies the view. While it is true that a belief will almost certainly play a role in its own justification, this is only problematic if we assume the justificational relationship is linear. Properly understood, justification is a property that emerges from non-linear relationships among beliefs, whether inferential or non-inferential. For example, Catherine Z. Elgin tells a story about Meg (adapted from a story by Lewis 1946), whose logic textbook was stolen. There were three witnesses to the theft, but all are unreliable witnesses (one is aloof, one has severe vision problems, and one is a known liar). Nevertheless, all three witnesses agree that the thief had spiked green hair. Despite the fact that no one of the witnesses is reliable, their independent testimony to a single, unique proposition increases the likelihood that the proposition is true. As Elgin puts it, “This [agreement] makes a difference. … Their accord evidently enhances the epistemic standing of the individual reports” (2005: 157). If this is right, the antecedently low probability of the thief’s having spiked green hair can be added to the combined strength of the testimonies to create a justified belief without vicious circularity.
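One way to see why agreement among unreliable witnesses might matter is a toy Bayesian calculation. The sketch below is not Elgin’s or Lewis’s own analysis. It assumes, purely for illustration, that there are 100 equally coarse descriptions a witness might give, that each witness independently reports the true one with probability 0.3 and otherwise picks a wrong one at random, and that the prior probability of spiked green hair is 0.01. The function name and all numbers are hypothetical.

```python
def posterior_after_agreement(prior, reliability, n_descriptions, n_witnesses):
    """Posterior probability that description d is true, given that
    n_witnesses independent witnesses all report d (toy model)."""
    # Chance that one witness reports d when d is in fact true.
    hit = reliability
    # Chance that one witness reports d when some other description is true:
    # the witness errs and happens to land on d among the wrong options.
    miss = (1.0 - reliability) / (n_descriptions - 1)
    true_branch = prior * hit ** n_witnesses
    false_branch = (1.0 - prior) * miss ** n_witnesses
    return true_branch / (true_branch + false_branch)


if __name__ == "__main__":
    # Assumed values: prior 0.01, reliability 0.3, 100 possible descriptions.
    for n in (1, 2, 3):
        print(n, posterior_after_agreement(0.01, 0.3, 100, n))
```

Under these assumed numbers, a single report leaves the proposition at about 0.3, while three agreeing reports push it above 0.99. The result depends on the witnesses being independent and on each being more likely to report the truth than any particular falsehood, so it illustrates the intuition without settling the philosophical dispute.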

A second objection to coherentism is called the isolation objection. Even if a collection of beliefs could explain, and thereby justify, its members, it is not obvious how this set of beliefs is connected with reality, that is, with the content the beliefs are about. In rejecting basic beliefs, coherentists reject privileging any particular cognitive state in the belief system, such as sensory experiences. All beliefs are treated equally and are evaluated according to whether they cohere with the belief set. But beliefs can cohere with one another regardless of whether their content expresses true propositions about reality. Coherence cannot guarantee that the set is not isolated from reality.

Some coherentists respond to this objection by making special provisions for beliefs that derive from coherence-increasing sources, such as sense experience. (BonJour (1985) calls such beliefs “cognitively spontaneous beliefs.”) This makes the degree of coherence partly a matter of how well the system of beliefs integrates sense perception. Others appeal to more abstract distinctions among types of justification. For example, Keith Lehrer (1986) distinguishes personal justification, which involves the traditional, internalist coherence requirement, from verific justification, which is an externalist requirement on coherence. While objective coherence may be outside a person’s ken, it nevertheless contributes, along with personal justification, to what Lehrer calls complete justification. This externalist requirement helps to ground a person’s system of beliefs in the world those beliefs are supposed to be about.

Another coherentist response to the isolation objection is to allow experience itself, not just beliefs about experience, to figure in the evaluation of coherence. Catherine Z. Elgin (2005) argues that we have good reasons to privilege some perceptual experiences over very coherent sets of beliefs. She argues that this is because perception does not—contra foundationalists—work in isolation from other sorts of evidence. She says, “Only observations we have reason to trust have the power to unseat theories. So it is not an observation in isolation, but an observation backed by reasons that actually discredits the theory” (162). This also explains how we are able to privilege some perceptual experiences over others (say, in unfavorable conditions), though she admits that her view includes “something other than coherence,” and allows that it is a very weak form of foundationalism. For a reply along these lines that maintains a more traditional version of coherentism, see Kvanvig and Riggs (1992).

A third objection is called the plurality objection. Because justification is determined solely by the internal coherence of a person’s beliefs, coherence theory cannot guarantee that there is “one uniquely justified system of beliefs” (BonJour 1985: 107). BonJour explains that this is because “on any plausible conception of coherence, there will always be many, probably infinitely many, different and incompatible systems of belief which are equally coherent” (ibid.). To show just how pernicious this problem is, Lehrer asks us to imagine one set of beliefs comprised of both necessary and contingent beliefs and then to imagine a second set created by negating all the contingent beliefs in the first set (1990: 90). This has the nasty implication that, if coherence is sufficient for justification, then “for any contingent statement a person is completely justified in accepting is such that he is also completely justified in accepting the denial of that statement” (ibid.).

One response to the plurality objection is to invoke a “total evidence” requirement on explanatory and probabilistic relations. While we can arbitrarily construct probabilistically and explanatorily coherent sets, there is a non-trivial sense in which non-belief states explain our beliefs: sensation, testimony, and so forth. A theory of explanation that includes the antecedent probabilities of the beliefs based on this evidence would be more coherent with our total evidence than an arbitrary set of beliefs that ignores them. Recent debates over the relationship between coherence and truth include sophisticated analyses of probabilistic assessments (Klein and Warfield 1994 and Fitelson 2003) and an interesting argument for the impossibility of coherence’s increasing the probability that a belief is true (Olsson 2009), but there is not space to develop these arguments here.

For more on coherentism, see Coherentism in Epistemology.

4. Infinitism

Infinitism is an internalist view that proposes to resolve the dilemma of inferential justification by showing that Horn B of the DIJ, properly construed, is an acceptable option. In fact, argue infinitists, there are no serious problems with an infinite chain of justifying beliefs.

Traditionally, epistemologists have rejected the idea that a belief’s linear chain of justifying beliefs can extend infinitely because it leaves all beliefs ultimately unjustified. Inferential justification is said to transmit justification, not create it; therefore, an infinite chain of justifying beliefs would have no source of support to transmit. Similarly, since one could not hold an infinite number of beliefs or mentally trace an infinitely long chain of beliefs, infinitism betrays a common internalist intuition that a person must be aware of good reasons for holding a belief.

Infinitists claim these criticisms are misguided. In practice, justification is not as tidy as epistemologists would have us believe. The traditional idea that the regress must stop or bottom out in basic beliefs is unrealistic and unnecessary. Few of us attempt to draw inferences long enough to arrive at basic beliefs. We often stop looking for reasons when we are content that we have fulfilled our epistemic responsibility, not because the chain has actually ended (Aikin 2011). Foundationalists and coherentists, then, are relatively unconcerned with ultimate justification in their own epistemic behavior and, therefore, to hold epistemic justification to such high standards renders very few of our beliefs justified. To accommodate this messiness, infinitists might reject the inferential assumption, at least as classically understood. Like coherentists, infinitists may hold that justification is an emergent property of a set of beliefs and that justification comes in degrees such that, the longer the inferential chain, the stronger the degree of justification (Klein 2005).

a. Arguments for Infinitism

There are two main lines of argument for infinitism. The first is that foundationalism and coherentism cannot stop the structure of justification from regressing infinitely. For example, Peter Klein (2005) constructs a version of the meta-justification argument against foundationalism and argues that the most plausible version of coherentism (emergent justification accounts), because of its appeal to a basic assumption about the reliability of coherent sets, is merely a disguised form of foundationalism. If these arguments hit their mark, and if externalism is ruled out, infinitism may be the only non-skeptical option available.

The second main line of argument for infinitism is that the classic objections to infinitism are aimed at overly simplistic versions of the view; they do not threaten suitably qualified versions. For example, Scott Aikin (2009) argues that concerns about the regress arise because of a conflict between two types of intuition: (1) proceduralism, which includes our standard intuitions about good reasons and responsible believing, and (2) egalitarianism, which includes our intuitions that people are generally justified in believing a lot of things (beliefs about how to set DVRs and beliefs about how to get from home to work). Aikin claims that infinitists take the demands of proceduralism more seriously than egalitarian intuitions, maintaining that justification and knowledge are very difficult to attain. The more committed we are to following our chains of evidence, the more likely we are to attain our epistemic goals. However, we often stop far from what even foundationalists would take to be the end of those chains. And at every proposed stopping point, there is an infinite number of justificational questions about the appropriateness of the terms we are using, the reliability of our perceptions and concept attributions, and so forth. If this is right, infinitism may be the most plausible implication of our epistemic intuitions.

Similarly, Peter Klein (2014) argues that infinitism is a minimal thesis about what makes justification valuable, namely, that it renders our beliefs “reason-enhanced.” He says, “Infinitism holds that a belief-state is reason-enhanced whenever S deploys a reason for believing that p. Importantly, S can make a belief-state reason-enhanced even if the basis is another belief-state that is not (yet) reason-enhanced” (2014: 105). If this is right, then the process of inferring can create or produce original epistemic support, and we need not appeal to anything like basic beliefs for ultimate support. Further, infinitists do not object to a chain of inference’s stopping, for instance, when some presuppositions are explicit. For example, reasoning about Euclidean geometry may appropriately stop at Euclid’s axioms when we agree that they are our standard of evaluation. But we can also admit that those axioms can be challenged, and our reasoning could continue indefinitely. Infinitists simply argue that this is a standard feature of all justification.

b. Objections to Qualified Infinitism

Carl Ginet (2005) argues that even qualified infinitism is motivated on spurious grounds. One argument against foundationalism is that, even for basic beliefs, one needs a reason to believe they are true, and this initiates an infinite regress of reasons. Ginet objects, however, that this argument threatens foundationalism only if all reasons are inferential reasons. Of course, this is precisely what foundationalists reject. If some non-belief reasons are justified independently of any additional reasons for thinking they are true, that is, if they are inherently reasonable, the infinitist argument against foundationalism is question-begging.

In response, the infinitist might contend that, even if its critique of foundationalism is flawed, infinitism may yet be the more plausible alternative. If infinitism captures our intuitions about justification as adequately as foundationalism, and if it requires fewer controversial concepts (basic beliefs), infinitism may be an attractive competitor.

Another objection to infinitism is that, given our finite minds, we lack complete access to the infinite set of justifying beliefs. If a person has no access to his reason for belief, then infinitism is no longer internalist and, thereby, loses its means of defusing the DIJ. Of course, the infinitist may concede this and fall back on a mentalist account of epistemic access (see §5.a below). As Ginet puts it: a belief (L) “is available to S as a reason for so believing only if S is disposed, upon entertaining and accepting (L), to believe that the fact that (L) was among his reasons for so believing” (2005: 146). If this is right, a person may have a disposition to recognize further evidence for his justifying beliefs when prompted to do so.

Nevertheless, even this mentalist-enhanced infinitism faces the concern that the process of justification is never complete. An assumption behind the DIJ is that, if for any belief, there is not a reason to believe it is true, that belief and any beliefs inferred from it are unjustified. If this is right, and the justification condition for infinitism is never actually met, then we are left with skepticism.

A variation on this criticism is the idea that inferential justification can only transmit justification and cannot originate it. The idea is that all inference is conditional ((P → Q); (Q → R); (R → S)). Given this set of propositions, is S justified for us? That depends on whether P is justified. Telling us that P is justified by N, (N → P), though, does not answer the question of whether S is justified. We still need to know whether N is justified (Dancy 1985: 55). If this is right, then no matter how long the chain of inference is—even if it is infinite—no belief is justified.
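The objector’s picture can be made concrete with a small sketch. It models the transmission-only assumption itself, not the infinitist’s view: a belief counts as justified only if it is independently justified or is supported by something that is. The function name, the dictionary representation, and the finite chain standing in for an infinite one are all assumptions made for illustration.

```python
def is_justified(belief, supports, basic, visiting=None):
    """Transmission-only model: a belief is justified if and only if it is
    in `basic` or at least one of its supporting beliefs is justified."""
    if visiting is None:
        visiting = set()
    if belief in basic:
        return True
    if belief in visiting:
        # Revisiting a belief means we went in a circle: no new source.
        return False
    visiting = visiting | {belief}
    for premise in supports.get(belief, ()):
        if is_justified(premise, supports, basic, visiting):
            return True
    return False


if __name__ == "__main__":
    # The chain from the text: N supports P, P supports Q, and so on to S.
    chain = {"S": ["R"], "R": ["Q"], "Q": ["P"], "P": ["N"]}
    print(is_justified("S", chain, basic=set()))   # no source anywhere
    print(is_justified("S", chain, basic={"N"}))   # N independently justified

    # A longer chain (finite stand-in for the regress) changes nothing.
    long_chain = {f"b{i}": [f"b{i + 1}"] for i in range(200)}
    print(is_justified("b0", long_chain, basic=set()))
```

On this model, lengthening the chain never helps: without some independently justified member, every query returns False. The infinitist reply discussed next rejects exactly the assumption built into this function.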

Infinitists may respond to this objection by arguing that the justification condition is not a matter of getting to a final, infinitely large set, but of increasing one’s epistemic reasons for the proposition in question. Peter Klein, using the term “warrant” for “justification,” says that infinitism is like coherentism in this respect. He says, “Infinitism is like the warrant-emergent form of coherentism because it holds that warrant for a questioned proposition emerges as the proposition becomes embedded in a set of propositions” (2005: 135). Further, Klein explains that “warrant increases not because we are getting closer to a basic proposition but rather because we are getting further from the questioned proposition” (137). This amounts to a rejection of the claim that inferential justification can only transmit justification and, therefore, that a justificational chain must be complete in order to be adequate (recall Catherine Z. Elgin’s story about Meg in §3.b above).

A worry for this response is similar to a worry for coherentism. Any criterion that implies the infinite set of beliefs is justified is either part of the set or independent of it, in which case, it, too, needs a justification. If some sort of justification-conferring awareness is built into the increasingly large set, infinitism seems like foundationalism in disguise.

A further worry is that, if infinitists do not require that a person actually have an infinite number of justifying beliefs or perform an infinite number of inferences, then infinitism seems committed to the idea that inference itself can create justification. This, however, seems implausible. Carl Ginet writes, “…acceptable inference preserves justification … [but] there is nothing in the inferential relation itself that contributes to making any of those beliefs justified” (2005: 148-49). If inference cannot produce justification, it is unclear how a belief in an infinite chain of inferences comes to be justified.

For a more detailed treatment of infinitism, see Infinitism in Epistemology.

5. Types of Internalism and Objections

As noted above (§2), the view that justification is something we can determine by directly consulting our mental states is called internalism. This view does not entail that all epistemic concepts are internal. John Greco gives an example to demonstrate the difference: “[S]uppose that someone learns the history of his country from unreliable testimony. Although the person has every reason to believe the books that he reads and the people that teach him, his understanding of history is in fact the result of systematic lies and other sorts of deception” (2005: 259). Objectively speaking, this person’s beliefs are not reliably connected with reality. Subjectively, though, he is following his evidence to its rational conclusion. Should we say this person’s beliefs are justified? Since the reliability of his sources is beyond his ability to evaluate, the internalist says he has fulfilled his epistemic duty: yes, he is justified.

For centuries, there was no serious alternative to internalism. As we will see in §6, the advent of the Gettier case in the 20th century constitutes a serious challenge to internalism, and it contributed to alternative, externalist accounts of knowledge and justification. This move to externalism also led to closer scrutiny of internalism, and new concerns about its adequacy arose. I review just three of these here. But before doing so, it is helpful to distinguish two types of internalism: accessibilism and mentalism.

a. Accessibilism and Mentalism

According to accessibilists, in order for a belief to be justified for a person, that person must have “reflective access” to good reasons for holding that belief. To have reflective access is to be directly mentally aware of reasons for holding a belief. Some accessibilists argue that a person’s access must be occurrent, that is, she must be currently aware of her reasons for holding a belief (Conee and Feldman 2004). Others hold the looser requirement that, as long as a person has had direct access to relevant justifying reasons, she is justified in holding the supported belief.

According to mentalists, reflective access may be sufficient for justification, but it is not necessary. All that is necessary for a belief to be justified is that a person has mental states that justify the belief, regardless of whether a person has reflective access to those states. Mentalists allow that some non-reflectively accessible mental states can justify beliefs.

Mentalism is supposed to have several advantages over accessibilism given the standard criticisms of internalism. For example, some have objected to internalism on the grounds that it cannot accommodate intuitive cases of stored or forgotten evidence. If, for example, you are driving and not thinking about whether Washington, D.C. is the capital of the United States, or you have forgotten any evidence for this belief, are you justified in believing that it is? If not, could we say that you know it is the capital? Accessibilists claim that a person must be able to access her evidence for a belief while she is currently thinking about it and presumably without prompting. Few of us, though, hold (or even could hold) a belief with all its attendant reasons in mind at once. Similarly, it seems reasonable to imagine that a person is justified in believing a proposition for which she has forgotten her evidence. Mentalists can handle these cases by claiming that the ability to access stored facts can constitute dispositional justification, and that even in cases of forgotten evidence, it could still be the case that the fact that it is justified is consciously available, either occurrently or dispositionally (Conee and Feldman 2004).

The worry for mentalism is that, in allowing non-occurrent mental states to count as reasons, mentalism betrays its claim to be internalist. For example, there may be a lot of evidence I could have that P is true if I were in the right place at the right time. But the existence of that evidence does not obviously justify P for me since being in such a place might be a matter of luck. Being at the right place at the right time may mean that the evidence that, say, “Washington is the capital,” is in a book nearby that I never happen to read or that the evidence is one of my mental states that I am not currently thinking about, even if I could when prompted. Specifying just what it means for evidence to be available but not occurrent turns out to be quite difficult. Richard Feldman (1988) argues that in neither of these examples am I justified in believing that Washington is the capital and that a mental state counts as evidence if and only if one is currently thinking of P. Feldman embraces the counterintuitive implication that “one does not know things such as that Washington is the capital when one is not thinking of them” (237). Despite these difficulties, the distinction between accessibilism and mentalism plays an important role in the debate over internalism.

For more on accessibilism and mentalism, see §1.c of Internalism and Externalism in Epistemology.

b. Objections to Internalism

In addition to the Gettier problem (§6), there are many other lines of argument that challenge internalism. Here, I review only three. One of these lines is called the access problem. Traditional foundationalists have accepted some version of accessibilism. For example, Roderick Chisholm writes that justification is “internal and immediate in that one can find out directly, by reflection, what one is justified in believing at any time” (1989: 7). But what if the belief P that justifies my current belief Q is tucked far back in the recesses of my memory and would require more time than I currently have to access it? Am I still justified in believing Q? Or worse, imagine that I have forgotten P; there is no possibility that I can directly access it. However, Q seems true to me, I remember that I had good reasons for believing it, and I do not have any reasons to doubt Q now. Am I justified in believing Q in this case?

Without some modification, the internalist must say no in both cases—the relevant evidence is neither immediately nor reflectively available—though intuitively these are normal cases of justified belief. The standard response is two-fold. First, we must admit that justification comes in degrees: having more evidence can increase one’s justification and some evidence is stronger than others. And second, the states of seeming to be justified or remembering that I am justified can themselves constitute reasons for belief. Therefore, in these cases, the internalist might respond that, while the justifications are not as strong as we would prefer, they are, nonetheless, based on accessible mental states.

A second, related objection to internalism is what, following John Greco, I will call the etiology problem. Internalism tends to make justification so easy that it is unclear how one is able to distinguish between good and bad reasons. Consider an example from Greco (2005):

Charlie is a wishful thinker and believes that he is about to arrive at his destination on time. He has good reasons for believing this, including his memory of train schedules, maps, the correct time at departure and at various stops, etc. However, none of these things is behind his belief—he does not believe what he does because he has these reasons. Rather, it is his wishful thinking that causes his belief. Accordingly, he would believe that he is about to arrive on time even if he were not. (261)

Why is the combination of his beliefs about schedules, maps, and time a better reason for thinking he is about to arrive than wishful thinking? Presumably, it is because those things are reliable indicators of truth, whereas wishful thinking is not. Being a reliable indicator of truth, though, is an external relationship between the belief and the world—something to which Charlie has no access. We can arrive at a similar result from imagining that Charlie does base his beliefs on his beliefs about train schedules, and so forth, but stipulating that he formed those beliefs carelessly and haphazardly, and only accidentally arrived at the correct conclusion. Nevertheless, based on these beliefs, it seems clear to Charlie that the conclusion follows.

An internalist might respond that this objection depends on the mistaken assumption that internal factors exclude empirical evidence. To see how this assumption slips in, consider how an externalist might determine that train schedules are more fitting sources of evidence than wishful thinking. Presumably, externalists would evaluate the past track record of each source of evidence to see which more reliably indicates truth. The act of “reviewing their past track records,” however, involves appealing to internal states about what seems to be their track records and, therefore, is not obviously different from what an internalist would do; one has internal access to evidence that train arrivals correspond more reliably with train schedules than with wishes. By demanding that justification depend only on external features of the belief-forming process, and then appealing to internal features to evaluate external reliability, Greco is not denying that one must have good, accessible reasons for one’s beliefs; he is simply disguising the internal features by including them in the external conditions (Feldman 2005: 281). Therefore, either objective etiology is essential to justification, in which case, since no one has access to it, we are left with skepticism, or subjective access to evidence of reliable etiology is sufficient for justification, in which case the externalist criticism misses its mark.

Both the access problem and the etiology problem challenge the idea that we can determine whether we are justified by appeal to internal states. But even if this challenge can be answered, internalism is sometimes thought to imply that we can voluntarily control or change what we believe, that is, that we are guided but not determined by our evidence. The view that we have voluntary control over what we believe is called doxastic voluntarism (from the Greek doxa, for “what is given” and sometimes for “what is believed”). The idea is that internalism is intuitive partially because it allows us to take responsibility for our epistemic behavior. In fact, “[n]onvoluntarism is generally taken to rule out responsibility, since one is not responsible for what one does not control” (Adler 2002: 64). Taking responsibility implies we can decide to respond to evidence well or poorly. This suggests a third objection to internalism called the guidance problem. (For presentations of the guidance problem, see John Heil 1983 and William Alston 1989.)

It turns out that it is difficult to control what we believe: try to make yourself believe you are not reading this page or that you are not real. It is unclear what it would take to convince you that such things are true. That kind of shift would seem to require a complete change in your evidence. But if that is right, then our beliefs are tied strongly to factors outside our control; we cannot simply decide what evidence we have or whether to believe on the basis of that evidence. According to this critique, the idea that internalism explains how we take responsibility for our beliefs is misguided.

In response, contemporary internalists tend to accept that our beliefs are largely determined by the evidence we perceive ourselves to have, but they reject the idea that complete or even partial voluntary control is necessary for responsibility. Carl Ginet (2001) argues that our control over our beliefs is limited but that we nonetheless may decide what to believe in those cases where the evidence is indecisive, cases “where the subject has it open to her also to not come to believe it” (74). Further, Earl Conee and Richard Feldman (2004) argue that a person’s beliefs may appropriately fit her evidence even if she cannot control whether she forms those beliefs. For instance:

Suppose that a person spontaneously and involuntarily believes that the lights are on in the room, as a result of the familiar sort of completely convincing perceptual evidence. This belief is clearly justified, whether or not the person can voluntarily acquire, lose, or modify the cognitive process that led to the belief. (85)

For a more comprehensive treatment of the debate between internalists and externalists, see Internalism and Externalism in Epistemology.

6. The Gettier Era

The idea that justification is the crucial link between true belief and knowledge seems to be implicit in epistemology since Plato. In the Theaetetus, Socrates gives an example of a jury that has been persuaded by hearsay of a true judgment that can only be known by an eye-witness (201b-c). This example shows that “true judgment” is not the same thing as “knowledge,” and, therefore, that some other element is needed. Theaetetus suggests that knowledge is true judgment plus a logos—an account or argument. Socrates considers three ways of giving an account of a true judgment but concludes that none is plausible. Nevertheless, from then until now, philosophers have generally thought something like Theaetetus’s suggestion must be right, and most of those accounts have been internalist. Socrates’s own suggestion, in Plato’s Meno, is that knowledge is a type of remembrance of what is true based on direct experience prior to being born. Descartes tries to close the gap between true belief and knowledge with the apprehension of clarity and distinctness. Kant attempts to bring them together with the transcendental apperception of the conditions for the possibility of veridical perception. In each case, the knower is assumed to have direct access to something that explains when true belief is knowledge.

Unfortunately, a thought experiment developed in the 20th century challenges the idea that any internal criteria can distinguish knowledge from accidentally true belief. This thought experiment was named the Gettier Problem after Edmund Gettier, who introduced the most influential examples in a famously brief 1963 paper. Examples from other philosophers proliferated after Gettier’s publication, but each new instance is standardly called a “Gettier Case.”

a. The History of the Gettier Problem

The idea is that there are cases where all three conditions on knowledge are met—a belief is justified and true—and yet that belief fails to be knowledge. Although some traditional internalists have allowed that a false belief can be justified, they have resisted the idea that a belief’s justification does not contribute to the likelihood of knowing. But if Gettier cases are successful, it is possible to be justified (in the classic internalist sense) in holding a true belief without that belief’s being knowledge.

The broken clock example in §1 is an early version of this problem, constructed by Bertrand Russell (1948). Here is another example Russell includes alongside his clock case:

There is the man who believes, truly, that the last name of the Prime Minister in 1906 began with a B, but believes this because he believes that Balfour was Prime Minister then, whereas in fact it was Campbell-Bannerman. … Such instances can be multiplied indefinitely, and show that you cannot claim to have known merely because you turned out to be right. (171)

The problem, though, contra Russell, is not merely that such a person turns out to be right; it is that the person’s belief is justified in cases where a belief turns out to be true by luck; justified true belief in these cases does not increase the likelihood that the belief is knowledge. The evidence that justifies the belief is not connected with the truth of the belief in the right way, and, recall from the introduction, believing in the right way is precisely the sort of thing justification is supposed to indicate.

Such cases trace at least as far back as Alexius Meinong (1906), but the most famous are Gettier’s. His cases are interesting because they show that such cases can occur even when our evidence includes logical entailment. In his first example, Gettier asks us to imagine that two men, Smith and Jones, have applied for the same job. Imagine also that Smith has very good reasons for believing: “Jones will get the job” and “Jones has 10 coins in his pocket.” From this, it follows logically that: “The man who will get the job has 10 coins in his pocket,” and Smith forms the belief that this is true. As it turns out, however, Smith has 10 coins in his pocket (though he does not know it) and he will get the job. So, Smith’s belief that the man who will get the job has 10 coins in his pocket is true, and he has good reasons for why this is so, but his reasons are unconnected with the real reasons it is true. Most philosophers have concluded that, since Smith’s true belief is just a matter of luck (and not a function of his reasons’ connection with the state of affairs that makes it true), Smith does not know that the man who will get the job has 10 coins in his pocket.

Because of the many possible variations on cases like these, the idea that justification is based on evidence to which we have direct access faces a serious challenge. There is no clear sense in which that sort of evidence always or even regularly increases the likelihood that a belief is knowledge.

b. Responses to the Gettier Problem

Some philosophers have tried to save strong internalist justification from Gettier cases. For example, D. M. Armstrong—although he ultimately defends an externalist theory of justification—argues that Gettier cases can be avoided by adding a requirement that all evidence for a belief must be, not merely justified, but also knowledge. In the Gettier case above, since it is false that Jones will get the job, this belief cannot be knowledge for Smith and, therefore, undermines Smith’s ability to know the man who will get the job has 10 coins in his pocket. (See Feldman 1974 for a counterexample.)

Others weaken the requirements on justification by arguing that, while knowledge may have constraints outside our conscious access, justification is more plausibly about responsible or apt belief than truth. Call this weak internalist justification (see Zagzebski, 1996).

Still others argue that Gettier cases suggest either that justification is simply not an internal matter or that knowledge does not require justification. Those who argue that justification is external claim that whether a belief is justified depends on whether there is a law-like connection (conceptual or physical) between a belief and the state of affairs it is about (Bergmann 2006). This approach is externalist because it explains justification in terms of belief-forming processes outside the mental life of the believer. In adopting externalism, some treat internal mental states as irrelevant for justification, while others argue that internal states can play an indirect and partial role in justification. Ernest Sosa (1991), for example, argues that internal states can contribute to the state of affairs that grounds the reliability of certain belief-forming behaviors.

7. Externalist Foundationalism

Gettier cases, in addition to other challenges to internalism, have led some epistemologists to reject the idea that justification requires an internal condition. In its most minimal form, externalism is the view that internalism is false, that is, that some features external to the mental life of a person play a necessary role in justification (Greco 2005: 258). However, many versions of externalism also explicitly reject internal conditions for justification, at least for non-inferential knowledge. Some philosophers have developed externalist accounts of knowledge that lack any account of justification (compare, Goldman, 1967, though he has since given up this view). The debate between externalists and internalists, though, is primarily about justification. Externalist accounts of justification differ from internalist accounts by challenging the idea that justification is primarily or ultimately about good reasons when good reasons are construed as mental states.

To accommodate the external features that connect beliefs with states of the world, externalists modify what was traditionally meant by justification; rather than appealing to a person’s subjective perspective on her evidence, externalists appeal to the objective features of the belief-forming and -holding behavior. Epistemic standing is not about the reasons a person has; it is about the relationship between a belief and the world, how that belief is formed or how it is maintained, and where the relationship is not a guarantee of truth but a strong indicator of truth, typically because of a causal, lawful, conceptual, or counterfactual connection with the states of affairs the belief is about. The most prominent version of externalism is the view that a belief is justified just in case it is caused by a reliable process, where “reliable” means that the process produces more true beliefs than false.

a. Externalism, Foundationalism, and the DIJ

Externalists agree that, to resolve the DIJ, one needs to avoid infinite regress and skepticism. So, rather than grounding justification in other beliefs (as coherentists do) or in non-belief states (as classical foundationalists do), externalists ground justification and knowledge in the objective way the world contributes to belief formation or maintenance.

Some externalists, like Armstrong (1973) and Goldman (1979), make room for something like basic beliefs, from which something like non-basic beliefs are inferred. This means that contemporary externalists tend to accept the foundationalist structure—some beliefs are produced reliably by non-belief states, and some beliefs can be produced by other beliefs—though they reject the distinction between basic and non-basic beliefs. All belief-forming processes are states external to the knower’s mental states, and whether a belief is justified (and, therefore, knowledge) depends on the reliability of those processes.

Unlike classical foundationalists, who appeal to internal seemings, indubitability, or self-evidence as justifying these states, externalists like Goldman argue that these states are knowledge simply because they stand in a reliable relationship with the world. A non-inferential belief is knowledge when and because it is lawfully (Armstrong) or reliably (Goldman) produced.

b. Reliabilism

The concept of reliability is crucial to externalist theories of justification (in contrast to externalist theories of knowledge, for example, Goldman 1967, 1976 and Armstrong 1973). There are two types of reliabilist theories of justification. According to reliable indicator theories, a belief is justified just in case its reason or ground is a reliable indicator of the belief’s truth (Swain 1981 and Alston 1988). According to process reliabilism, a belief is justified just in case it was causally produced by reliable processes (Goldman 1979 and Bach 1985). Although he focuses primarily on externalist theories of knowledge, D. M. Armstrong’s “thermometer theory of knowledge” explains that certain mental states serve as reliable indicators or signs of knowledge, and therefore make the belief reasonable, or “justifiable.” Comparing non-inferential belief and a thermometer, Armstrong writes:

In some cases, the thermometer-reading will fail to correspond to the temperature of the environment. Such a reading may be compared to non-inferential false belief. In other cases, the reading will correspond to the actual temperature. Such a reading is like non-inferential true belief. (166)

There are a number of important qualifications to Armstrong’s view, but the central point is that a belief is justified independently of whether the person has reasons to believe it: “The subject’s belief is not based on reasons, but it might be said to be reasonable (justifiable), because it is a sign, a completely reliable sign, that the situation believed to exist does in fact exist” (183).

The benefit of Armstrong’s law-like account is that it suggests a counterfactual account of causal relations along the following lines: as long as a person has a means of distinguishing a proposition, P, from a mutually exclusive but very similar proposition Q, then the person is justified in believing P. For example, if Judy and Trudy are twins, and when Sam sees someone who looks like Judy, he would not mistake Trudy for Judy, then Sam is justified in believing that he sees Judy. “But if Sam frequently mistakes Judy for Trudy, and Trudy for Judy, he presumably does not have any way of distinguishing between them” (Goldman 1976: 778).

Unfortunately, reliable indicator theories tend to be overly strict in their analysis of cases. Goldman asks us to consider Oscar, who is standing in an open field and sees a Dachshund, from which he forms the belief that he sees a dog. As it happens, Oscar often mistakes wolves, which frequent the field, for dogs of certain breeds. If he were to see a wolf, he might easily mistake it for a dog. Now, is his seeing a Dachshund a reliable indicator of seeing a dog? Since Oscar would likely believe he is seeing a dog regardless of whether he is seeing a wolf or a Dachshund, reliable indicator theories (at least Armstrong’s) would say his seeing a Dachshund is not a reliable indicator of seeing a dog. Whether or not this criticism is ultimately successful, and whether or not it applies to all reliable indicator theories, reliable process theories quickly overshadowed this type of reliabilism.

Process reliabilism is the view that a belief is justified just in case it is produced by a reliable cognitive process, where a cognitive process may include either conscious reasoning processes or unconscious mechanisms. As I formulated it earlier in this article, reliabilism is a necessary and sufficient condition for justification (“just in case”), but some reliabilists formulate weaker versions. Goldman treats it as a sufficient condition (though he argues against the plausibility of alternative sufficient conditions): “If S’s believing p at t results from a reliable cognitive belief-forming process (or set of processes), then S’s belief in p at t is justified,” (1979: 13). Kent Bach treats it as only a necessary condition: “The idea, roughly, is that to be justified a belief must be formed as the result of reliable processes…” (1985: 199). Despite these differences, externalists univocally reject internalist conditions as sufficient for justification. This commitment, however, leaves them open to a number of interesting criticisms.

c. Objections to Externalism

Though externalism, putatively, has the advantage of avoiding the Gettier problem (though this is controversial) and several other skeptical concerns and of capturing some important intuitions about knowledge, it faces several serious criticisms. On the basis of these criticisms, some internalists claim that externalists have simply changed the subject altogether and are not really talking about justification.

One famous criticism of externalism is called the generality problem. Earl Conee and Richard Feldman (1998) present an example to demonstrate the problem:

Suppose that Smith has good vision and is familiar with the visible differences among common species of trees. Smith looks out a house window one sunny afternoon and sees a plainly visible nearby maple tree. She forms the belief that there is a maple tree near the house. Assuming everything else in the example is normal, this belief is justified and Smith knows that there is a maple tree near the house. Process reliabilist theories reach the right verdict about this case only if it is true that the process that caused Smith’s belief is reliable. (372)

Is it reliable? That depends on which process formed the belief. Was it the unique causal set of events leading to that particular belief? If so, it is not reliable, since token, or one-time, events have no historical track record. Reliabilists respond to this challenge by saying it is the type of process that must be reliable in order for a belief to be justified, not the token. If that is right, then we face the problem of determining which type of process formed the belief. Was it the “visually initiated belief-forming process,” the “process of a retinal image of such-and-such specific characteristics leading to a belief that there is a maple tree nearby,” the “process of relying on a leaf shape to form a tree-classifying judgment,” the “perceptual process of classifying by species a tree located behind a solid obstruction,” or any number of others (373)? There are innumerable options, and even if a combination of types were involved, each type would have to meet reliability conditions. Conee and Feldman conclude, “Without a specification of the relevant type, process reliabilism is radically incomplete” (373).

A second objection to externalism is called the New Evil Demon Problem (NEDP) (Cohen and Lehrer 1983). In Descartes’s original evil demon problem, in order to motivate the problem of skepticism, we are asked to consider the possibility that all our current perceptions are the fictitious construction of a being intent on deceiving us such that all our perceptual and intuitive beliefs are false. Putting the thought experiment to a very different purpose, if the evil demon world is possible, we can imagine two worlds: (1) a non-deceptive world, where our perceptions are reliably produced by the world outside of our minds, and (2) an evil demon world, where there are people just like you and me, who have exactly the same mental states that we do but whose perceptions are systematically unreliable—they track nothing of truth at that world. There are no trees, buildings, bodies, and so forth. Whatever actually exists at that world, those people have no perception of it. According to externalists—process reliabilists, in particular—the beliefs of people in the real world are justified and those of people in the demon world are unjustified, despite the fact that their mental lives are identical. Yet it is difficult to imagine that demon world beliefs about looking both ways before crossing the street and getting a second opinion about a medical diagnosis are unjustified. People who believe such things are acting responsibly from their perspective on their evidence. This suggests that reliabilism is not really about justification at all.

A third objection to externalism is what Ernest Sosa (2001) calls the metaincoherence problem, which attempts to show that a person’s belief can be externally reliable while internally unjustified. In the literature, there are two versions of the metaincoherence problem. The first is what I call first-order metaincoherence, which attempts to show that externalism is insufficient for justification. The second is what I call second-order metaincoherence, which challenges the externalist’s reasons for holding externalism.

One famous example of first-order metaincoherence is a thought experiment given in various forms by Laurence BonJour (1985) and Keith Lehrer (1990). Consider Armstrong’s Thermometer Analogy from above. Imagine there was a human thermometer, that is, someone who “undergoes brain surgery by an experimental surgeon who invents a small device which is both a very accurate thermometer and a computational device capable of generating thoughts” (Lehrer 1990: 163). This person, whom Lehrer names Mr. Truetemp, is unaware of the device despite the fact that it regularly causes him to form reliable beliefs that he unreflectively accepts about the temperature. On a given day, he might reliably form and accept the belief that it is 104 degrees Fahrenheit outside. Is this belief knowledge? Lehrer concludes: “Surely not. He has no idea whether he or his thoughts about the temperature are reliable” (164). BonJour concludes similarly, “Part of one’s epistemic duty is to reflect critically upon one’s beliefs, and such critical reflection precludes believing things to which one has, to one’s knowledge, no reliable means of epistemic access” (1985: 42).

The second-order metaincoherence problem is stated by Barry Stroud (1989):

The scientific ‘externalist’ claims to have good reason to believe that his theory is true. It must be granted that if, in arriving at his theory, he did fulfill the conditions his theory says are sufficient for knowing things about the world, then if that theory is correct, he does in fact know that it is. But still, I want to say, he himself has no reason to think that he does have good reason to think that his theory is correct. (321)

The worry is that, since externalists claim that features of the world outside the mental life of a believer ultimately determine whether a belief is justified, then, if externalism is true, externalists have no reason to believe it is true; in fact, they are committed to believing that whether their belief that it is true is justified is outside their ability to determine from within their own perspective. Again, the belief may be externally reliable, but it is internally unjustified.

If these criticisms hit their mark, epistemologists must make some difficult decisions about which approach—internalism or externalism—has the fewest or least pernicious problems. In the 21st century, much work is underway to address these problems. If one remains unconvinced, there are recent developments that attempt to salvage some of the insights of internalism and externalism. A prominent example involves introducing character traits into the conditions for justification. We turn next to this view, called virtue epistemology.

8. Justification as Virtue

Classical theories of justification that imply a normative or belief-guiding dimension are modeled largely on normative ethical theories, whether teleological (outcome-based) or deontological (duty-based). They ask whether people are rationally obligated to, permitted to, or obligated not to hold particular beliefs given their evidence. These are decision-based theories of rational normativity, as opposed to character-based theories. Just as virtue theory offers a non-decision-based alternative in ethics, it also suggests a non-decision-based alternative in epistemology. The attitudes and circumstances under which people form, maintain, and discard beliefs can be described as virtuous or vicious, and just as decision-based theories in epistemology are concerned with rational obligation (as opposed to moral obligation), character-based theories in epistemology are concerned either with intellectual character (as opposed to moral character), or with cognitive faculties understood as traits of a person (such as reason, perception, introspection, and memory). Of course, in matters of normativity, it is not a simple task to distinguish moral dimensions from rational or intellectual ones, but space prevents us from exploring that relationship here.

Virtue theories of justification hold that part of what justifies a belief is the intellectual traits with which a believer forms or holds the belief. Just as a person’s moral virtues contribute to the goodness of an action (kindness, compassion, honesty), a person’s intellectual virtues contribute to the epistemic goodness of a belief. Virtue theorists, however, are sharply divided as to which intellectual virtues are relevant. One prominent view is that justification is a function of those virtues that enhance reliability, that is, they have a strong external component (Sosa 1980; 2007). This view is known as virtue reliabilism.

A second prominent view is that justification is a function of those intellectual virtues that contribute to more general epistemic goods, including intellectual well-being, social trust, and the righting of epistemic injustice. These virtue responsibilists regard the truth-goal in epistemology very differently than both traditional epistemologists and their virtue reliabilist counterparts (Code 1984; Montmarquet 1993; Zagzebski 2000).

a. Virtue Reliabilism

A prominent version of virtue reliabilism is offered by Ernest Sosa (1980) in an attempt to resolve the tension between foundationalists and coherentists. Sosa argues that if beliefs are grounded in truth-conducive intellectual virtues (where truth-conducive is conceived in process reliabilist terms), then foundationalists have empirically stable abilities or acquired habits that help explain the connection between sensory experience and non-inferential belief. Further, reliable virtues help explain how justification emerges from a coherent set of beliefs, since coherence is itself a type of intellectual virtue.

What do these intellectual virtues look like for Sosa? Borrowing an example from his (2007), consider an archer who is aiming at a target. In order to be successful, the archer must have a degree of competence, which Sosa calls “adroitness,” and the shot must be accurate. These features are analogous to the epistemic state of having a true belief (accuracy) that is formed on the basis of good evidence (adroitness). These two features alone, though, are insufficient for the person to believe in the right way. The person must also exercise his adroitness in circumstances that increase his likelihood of having accurate beliefs, that is, his shot must be accurate because it is adroit. Sosa calls this third feature “aptness,” “its being true because competent” (2007: 23). Some of these circumstances will be outside the believer’s control—wind gusts in the archer’s case; causal ties to the world in the epistemic case. But some—for example, the virtues—are within the believer’s control.

Sosa explains:

Aptness depends on just how the adroitness bears on the accuracy. The wind may help some, for example…. If the shot is difficult, however, from a great distance, the shot might still be accurate sufficiently through adroitness to count as apt, though with some help from the wind. (2007: 79)

Notice that the role of the wind is analogous to certain external features of a person’s belief-forming state. Nevertheless, intellectual virtues like those mentioned above can increase one’s adroitness and thereby increase the likelihood of accuracy.

Imagine a person who has good evidence that P but who either does not appeal to that evidence when forming the belief that P, appealing instead to, say, wishful thinking, or who appeals to that evidence carelessly, refusing to consider alternatives or just how strong the evidence is. Despite this person’s having good evidence, her belief is not apt because the belief’s truth was not due to the person’s competence with the evidence.

Because of this external dimension, this branch of virtue epistemology is regarded as a form of reliabilism. Unlike externalist foundationalism, however, the reliability condition is not restricted to belief-forming processes; it is also highly dependent on context. Sosa says:

An archer might manifest sublime skill in a shot that does hit the bull’s-eye. This shot is then both accurate and adroit. But it could still fail to be accurate because adroit. The arrow might be diverted by some wind, for example, so that, if conditions remained normal thereafter, it would miss the target altogether. However, shifting winds might then ease it back on track towards the bull’s-eye. (79)

In epistemic cases, the believer must be suitably virtuous such that, under normal conditions, her beliefs are accurate because they are adroit.

b. Virtue Responsibilism

Sosa’s account has been well-received, though there is disagreement as to whether it is sufficient for solving the problems at issue. One prominent criticism is that Sosa does not take his use of virtues far enough. Rather than serving a more basic truth-goal, some argue that virtues should be conceived as central to the epistemic project.

Lorraine Code (1984) coined the term virtue responsibilism in contrast to Sosa’s reliabilism. It is the view that justification, or rather, being an intellectually responsible agent, is a matter of acting virtuously in the practice of inquiry. Code argues that epistemic responsibility is the central intellectual virtue. Similarly, James Montmarquet argues that “S is subjectively justified in believing p insofar as S is epistemically virtuous in believing p” (1993: 99). This means that virtue responsibilism is internalist through and through.

Not all virtue responsibilists, however, eschew the truth-goal. As Linda Zagzebski explains, “It would not do any good for a person to be attentive, thorough, and careful unless she was generally on the right track” (2009: 82). But unlike externalist foundationalism, “the right track,” according to virtue epistemologists, does not necessarily include producing more true beliefs than false. There is more than one virtuous outcome, for example, in cases of creativity or inventiveness. It may be that “only 5 per cent of a creative thinker’s original ideas turn out to be true,” Zagzebski explains. “Clearly, their truth conduciveness in the sense of producing a high proportion of true beliefs is much lower than that of the ordinary virtues of careful and sober inquiry, but they are truth conducive in that they are necessary for the advancement of knowledge” (2000: 465). This suggests that the conditions under which a subject is justified are highly contingent on changing context and the goal of our epistemic behaviors. And virtue epistemologists argue that this captures the typical contingency of our epistemic lives.

c. Objections to Virtue Epistemology

In addition to internal disputes between virtue reliabilists and responsibilists, there are more serious concerns with the adequacy of virtue epistemology. Virtue reliabilism faces many of the same criticisms that face traditional reliabilism, including the generality problem, the New Evil Demon Problem, and the meta-incoherence problems. Further, although there is an intuitive sense in which a reliably functioning method of forming beliefs is virtuous (in the Aristotelian sense of “excellence”), it is not clear how virtue reliabilism is substantively different from classical reliabilism. To be sure, virtue reliabilists take special pains to explain the roles of context, luck, and the knower’s aptness in forming beliefs, but these do not seem unavailable to traditional reliabilists.

Similarly, virtue responsibilism faces many of the same problems as virtue ethics. There are questions about which intellectual states count as epistemic virtues (different responsibilists have different lists), whether some virtues should be privileged over others (for example, James Montmarquet (1992) argues that epistemic conscientiousness is the preeminent intellectual virtue), and the ontological status of virtues (whether they are real dispositions or simply heuristics for categorizing types of behavior). There are also serious concerns about some extreme versions of responsibilism that completely disconnect intellectual virtue from truth-seeking, as with Code’s account, rendering discussions of intellectual virtue the province of ethics rather than epistemology.

To alleviate some of these concerns, some virtue epistemologists defend a mixed theory, arguing that an adequate virtue epistemology requires both a reliability and a responsibility condition (Greco 2000).

A general concern for both types of virtue epistemology is that virtue theory associates justification too closely with the idea of credit or achievement, whether a person has formed beliefs well. Jennifer Lackey (2007, 2009), for example, argues that if knowledge is produced by the virtuous activity of others (like that of a reliable witness) or if knowledge is innate, then it is not obvious how a person’s belief-forming behavior can be virtuous or vicious, as there is no behavior involved. In the case of the reliable witness, a hearer simply accepts on the basis of the witness’s testimony. In the case of innate knowledge, the knower does nothing to increase the likelihood that her beliefs are reliable; they are reliable for reasons outside her epistemic behavior. If these criticisms are right, virtue epistemology may be unable to explain a range of important types of knowledge.

For a more detailed treatment of virtue epistemology, see Virtue Epistemology.

9. The Value of Justification

Each of the theories of justification reviewed in this article presumes something about the value of justification, that is, about why justification is good or desirable. Traditionally, as in the case of Theaetetus noted above, justification is supposed to position us to understand reality, that is, to help us obtain true beliefs for the right reasons. Knowledge, we suppose, is valuable, and justification helps us attain it. However, skeptical arguments, the influence of external factors on our cognition, and the influence of various attitudes on the way we conduct our epistemic behavior suggest that attaining true beliefs for the right reasons is a forbidding goal, and it may not be one that we can access internally. Therefore, there is some disagreement as to whether justification should be understood as aimed at truth or some other intellectual goal or set of goals.

a. The Truth Goal

All the theories we have considered presume that justification is a necessary condition for knowledge, though there is much disagreement about what precisely justification contributes to knowledge. Some argue that justification is fundamentally aimed at truth, that is, it increases the likelihood that a belief is true. Laurence BonJour writes, “If epistemic justification were not conducive to truth in this way…then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth” (1985: 8). Others argue that there are a number of epistemic goals other than truth and that in some cases, truth need not be among the values of justification. Jonathan Kvanvig explains:

[I]t might be the case that truth is the primary good that defines the theoretical project of epistemology, yet it might also be the case that cognitive systems aim at a variety of values different from truth. Perhaps, for instance, they typically value well-being, or survival, or perhaps even reproductive success, with truth never really playing much of a role at all. (2005: 285)

Given this disagreement, we can distinguish between what I will call the monovalent view, which takes truth as the sole, or at least fundamental, aim of justification, and the polyvalent view (or, as Kvanvig calls it, the plurality view), which allows that there are a number of aims of justification, not all of which are even indirectly related to truth.

b. Alternatives to the Truth Goal

One motive for preferring the monovalent view is that, if truth is not the primary goal of justification (that is, if justification does not serve to connect belief with reality in the right way), then one is left only with goals that are not epistemic, that is, goals that cannot contribute to knowledge. The primary worry is that, in rejecting the truth goal, one is left with pragmatism. In response, those who defend polyvalence argue that, in practice, there are other cognitive goals that (1) are not merely pragmatic, and (2) meet the conditions for successful cognition. Kvanvig explains that “not everyone wants knowledge…and not everyone is motivated by a concern for understanding. … We characterize curiosity as the desire to know, but small children lacking the concept of knowledge display curiosity nonetheless” (2005: 293). Further, much of our epistemic activity, especially in the sciences, is directed toward “making sense of the course of experience and having found an empirically adequate theory” (ibid., 294). Such goals can be pursued without appealing to truth at all. If this is right, justification aims at a wider array of cognitive states than knowledge.

Another argument for polyvalence allows that knowledge is the primary aim of justification but that much more is involved in justification than truth. The idea is that, even if one were aware of belief-forming strategies that are conducive to truth (following the evidence where it leads; avoiding fallacies), one might still not be able to use those strategies without having other cognitive aims, namely, intellectual virtues. Following John Dewey, Linda Zagzebski says that “it is not enough to be aware that a process is reliable; a person will not reliably use such a process without certain virtues” (2000: 463). As noted above, virtue responsibilists allow that the goal of having a large number of true beliefs can be superseded by the desire to create something original or inventive. Further still, following strategies that are truth-conducive under some circumstances can lead to pathological epistemic behavior. Amélie Rorty, for example, argues that belief-forming habits become pathological when they continue to be applied in circumstances no longer relevant to their goals (Zagzebski, ibid., 464). If this argument is right, then truth is, at best, an indirect aim of justification, and intellectual virtues like openness, courage, and responsibility may be more important to the epistemic project.

c. Objections to the Polyvalent View

One response to the polyvalent view is to concede that there are apparently many cognitive goals that fall within the purview of epistemology but to argue that all of these are related to truth in a non-trivial way. The goal of having true beliefs is a broad and largely indeterminate goal. According to Marian David, we might fulfill it by believing a truth, by knowing a truth, by having justified beliefs, or by having intellectually virtuous beliefs. All of these goals, argues David, are plausibly truth-oriented in the sense that they derive from, or depend on, a truth goal (David 2005: 303). David supports this claim by asking us to consider which of the following pairs is more plausible:

A1. If you want to have TBs [true beliefs] you ought to have JBs [justified beliefs].

A2. We want to have JBs because we want to have TBs.

B1. If you want to have JBs you ought to have TBs.

B2. We want to have TBs because we want to have JBs. (2005: 303)

David says, “[I]t is obvious that the A’s [sic] are way more plausible than the B’s. Indeed, initially one may even think that the B’s have nothing going for them at all, that they are just false” (ibid.). This intuition, he concludes, tells us that the truth-goal is more fundamental to the epistemic project than anything else, even if one or more other goals depend on it.

Almost all theories of epistemic justification allow that we are fallible, that is, that our justified beliefs, even if formed by reliable processes, may sometimes be false. Nevertheless, this does not detract from the claim that the aim of justification is true belief, so long as it is qualified as true belief held in the right way.

d. Rejections of the Truth Goal

In spite of these arguments, some philosophers explicitly reject the truth goal as essential to justification and cognitive success. Michael Williams (1991), for example, rejects the idea that truth even could be an epistemic goal when conceived of as “knowledge of the world.” Williams argues that in order for us to have knowledge of the world, there must be a unified set of propositions that constitute knowledge of the world. Yet, given competing uses of terms, vague domains of discourse, the failure of theoretical explanations, and the existence of domains of reality we have yet to encode into a discipline, there is not a single, unified reality to study. Williams argues that because of this, we do not necessarily have knowledge of the world:

All we know for sure is that we have various practices of assessment, perhaps sharing certain formal features. It doesn’t follow that they add up to a surveyable whole, to a genuine totality rather than a more or less loose aggregate. Accordingly, it does not follow that a failure to understand knowledge of the world with proper generality points automatically to an intellectual lack. (543)

In other words, our knowledge is not knowledge of the world—that is, access to a unified system of true beliefs, as the classical theory would have it. It is knowledge of concepts in theories putatively about the world, constructed using semantic systems that are evaluated in terms of other semantic systems. If this is, in fact, all there is to knowing, then truth, at least as classically conceived, is not a meaningful goal.

Another philosopher who rejects the truth goal is Stephen Stich (1988; 1990). Stich argues that, given the vast amount of disagreement among novices and experts about what counts as justification, and given the many failures of theories of justification to adequately ground our beliefs in anything other than calibration among groups of putative experts, it is simply unreasonable to believe that our beliefs track anything like truth. Instead, Stich defends pragmatism about justification, that is, justification just is practically successful belief; thus, truth cannot play a meaningful role in the concept of justification.

A response to both views might be that, in each case, the truth goal has not been abandoned but simply redefined or relocated. Correspondence theories of truth take it that propositions are true just in case they express the world as it is. If the world is not expressible propositionally, as Williams seems to suggest, then this type of truth is implausible. Nevertheless, a proposition might be true in virtue of being an implication of a theory, and so, for example, we might adopt a more semantic than ontological theory of truth, and it is not clear whether Williams would reject this sort of truth as the aim of epistemology.

Similarly, someone might object to Stich’s treating pragmatism as if it is not truth-conducive in any relevant sense. If something is useful, it is true that it is useful, even in the correspondence sense. Even if evidence does not operate in a classical representational manner, the success of beliefs in accomplishing our goals is, nevertheless, a truth goal. (See Kornblith 2001 for an argument along these lines.)

10. Conclusion

Epistemic justification is an evaluative concept about the conditions for right or fitting belief. A plausible theory of epistemic justification must explain how beliefs are justified, the role justification plays in knowledge, and the value of justification. A primary motive behind theories of justification is to solve the dilemma of inferential justification. To do this, one might accept the inferential assumption and argue that justification emerges from a set of coherent beliefs (internalist coherentism) or an infinite set of beliefs (infinitism). Alternatively, one might reject the inferential assumption and argue that justification derives from basic beliefs (internalist foundationalism) or through reliable belief-forming processes (externalist reliabilism). If none of these views is ultimately plausible, one might pursue alternative accounts. For example, virtue epistemology introduces character traits to help avoid problems with these classical theories. Other alternatives include hybrid views, such as Conee and Feldman’s (2008), mentioned above, and Susan Haack’s (1993) foundherentism.

11. References and Further Reading

Aikin, S. 2009. “Don’t Fear the Regress: Cognitive Values and Epistemic Infinitism.” Think, 23, 55-61.
Aikin, S. F. 2011. Epistemology and the Regress Problem. London: Routledge.
Alston, W. P. 1988. “An Internalist Externalism,” Synthese, 74, 265-283.
Alston, W. P. 1989. Epistemic Justification. Ithaca: Cornell University Press.
Armstrong, D. M. 1973. Belief, Truth, and Knowledge. Cambridge: Cambridge University Press.
Bach, K. 1985. “A Rationale for Reliabilism.” The Monist, 68, 246-63. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 199-213. Cited pages are to this anthology.
Bergmann, M. 2006. Justification Without Awareness. New York: Oxford.
Blanshard, B. 1939. The Nature of Thought. London: Allen & Unwin.
BonJour, L. 1980. “Externalist Theories of Empirical Knowledge.” Midwest Studies in Philosophy 5: Studies in Epistemology. Minneapolis: University of Minnesota Press, 53-73.
BonJour, L. 1985. The Structure of Empirical Knowledge. Cambridge: Harvard University Press.
BonJour, L. and E. Sosa. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden: Wiley-Blackwell.
Chisholm, R. 1966. Theory of Knowledge. Englewood Cliffs: Prentice Hall.
Chisholm, R. 1982. “A Version of Foundationalism,” in The Foundations of Knowing, ed. R. Chisholm. Minneapolis: University of Minnesota Press.
Chisholm, R. 1989. Theory of Knowledge, 3rd ed. Englewood Cliffs: Prentice Hall.
Code, L. 1984. “Toward a ‘Responsibilist’ Epistemology.” Philosophy and Phenomenological Research, 45 (1), 29–50.
Cohen, S. and K. Lehrer. 1983. “Justification, Truth, and Knowledge.” Synthese 55 (2), 191-207.
Conee, E. and R. Feldman. 2004. Evidentialism. New York: Oxford University Press.
Conee, E. and R. Feldman. 1998. “The Generality Problem for Reliabilism,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 372-386. Page numbers are to this anthology.
Dancy, J. 1985. Introduction to Contemporary Epistemology. Oxford: Basil Blackwell.
David, M. 2005. “Truth as the Primary Epistemic Goal: A Working Hypothesis,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 296-312.
Davidson, D. “A Coherence Theory of Truth and Knowledge,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 154-63.
Elgin, C. 2005. “Non-foundationalist Epistemology: Holism, Coherence, and Tenability,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 156-67.
Feldman, R. 1974. “An Alleged Defect in Gettier Counter-examples.” The Australasian Journal of Philosophy. 52, 68-69.
Feldman, R. 1988. “Having Evidence,” in Philosophical Analysis, ed. D. F. Austin. Kluwer Academic Publishers, 83-104.
Feldman, R. 2005. “Justification Is Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 270-84.
Fitelson, B. 2003. “A Probabilistic Measure of Coherence.” Analysis, 63, 194–199.
Frankfurt, H. 1973/2008. Demons, Dreamers, and Madmen: The Defense of Reason in Descartes’s Meditations. Princeton: Princeton University Press.
Gettier, E. 1963. “Is Justified True Belief Knowledge?” Analysis. 23, 121-23.
Ginet, C. 2001. “Deciding to Believe,” in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford: Oxford University Press, 63-76.
Ginet, C. 2005. “Infinitism Is not the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 140-149.
Goldman, A. 1967. “A Causal Theory of Knowing.” The Journal of Philosophy, 64, 357-72.
Goldman, A. 1976. “Discrimination and Perceptual Knowledge.” The Journal of Philosophy, 73, 771-91.
Goldman, A. 1979. “What Is Justified Belief?” in Knowledge and Justification, ed. George S. Pappas. Dordrecht, Holland: D. Reidel Publishing, 1-23.
Greco, J. 2005. “Justification Is Not Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 257-70.
Haack, S. 1993. Evidence and Inquiry. Malden: Blackwell Publishing.
Harman, G. 1986. Change in View. Cambridge: MIT Press.
Heil, J. 1983. “Doxastic agency.” Philosophical Studies, 43 (3), 355-364.
Hempel, C. 1935. “On the Logical Positivist’s Theory of Truth.” Analysis, 2 (4), 49-59.
Klein, P. 2005. “Infinitism Is the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 131-40.
Klein, P. 2014. “No Final End in Sight,” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 95-115.
Klein, P. and T. A. Warfield. 1994. “What Price Coherence?” Analysis, 54, 129–132.
Kornblith, H. 2001. Knowledge and Its Place in Nature. Oxford: Oxford University Press.
Kvanvig, J. L. and W. D. Riggs. 1992. “Can a Coherence Theory Appeal to Appearance States?” Philosophical Studies, 67, 197-217.
Kvanvig, J. 2005. “Truth Is not the Primary Epistemic Goal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 285-96.
Lackey, J. 2007. “Why we don’t deserve credit for everything we know.” Synthese, 158, 345–361.
Lackey, J. 2009. “Knowledge and credit,” Philosophical Studies, 142, 27–42.
Lehrer, K. 1974. Knowledge. Oxford: Clarendon Oxford Press.
Lehrer, K. 1986. “The Coherence Theory of Knowledge.” Philosophical Topics, 14, 5-25.
Lehrer, K. 1990. Theory of Knowledge. Boulder: Westview Press.
Lewis, C. I. 1946. An Analysis of Knowledge and Valuation. LaSalle: Open Court.
McGrew, T. 1995. The Foundations of Knowledge. Lanham: Rowman & Littlefield.
Meinong, A. 1906. “Über die Erfahrungsgrundlagen unseres Wissens” [“On the Experiential Foundations of Our Knowledge”], in Abhandlungen zur Didaktik und Philosophie der Naturwissenschaften, Band [Vol.] I, Heft [Issue] 6, Berlin: J. Springer. Reprinted in Meinong 1968–78, Vol. V: 367–481.
Montmarquet, J. 1987. “Epistemic Virtue.” Mind, 96, 482–497.
Neurath, O. 1983/1932. “Protocol Sentences,” in R. S. Cohen and M. Neurath, eds. Philosophical Papers 1913–1946. Dordrecht: Reidel.
Olsson, E. J. 2009. Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
Plantinga, A. 1983. “Reason and Belief in God,” in A. Plantinga and N. Wolterstorff, eds. Faith and Rationality. Notre Dame: University of Notre Dame Press, 16-93.
Plantinga, A. 1993a. Warranted Christian Belief. New York: Oxford.
Plantinga, A. 1993b. Warrant: The Contemporary Debate. New York: Oxford.
Pollock, J. 1986. Contemporary Theories of Knowledge. Lanham: Rowman & Littlefield Publishers.
Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism. Hampshire: Palgrave Macmillan.
Quine, W.V.O. 1970. Web of Belief. Cambridge: Harvard University Press.
Russell, B. 1948. Human Knowledge: Its Scope and Value. London: Routledge.
Smithies, D. 2014. “Can Foundationalism Solve the Regress Problem?” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 73-94.
Sosa, E. 1980. “The Raft and the Pyramid: Coherence Versus Foundations in the Theory of Knowledge.” Midwest Studies in Philosophy, 5 (1), 3–26.
Sosa, E. 1991. “Reliabilism and intellectual virtue” in Knowledge in Perspective: Selected Essays in Epistemology. New York: Cambridge University Press, 131-145.
Sosa, E. 2001. “Reliabilism and Intellectual Virtue,” in Epistemology: Internalism and Externalism, ed. Hilary Kornblith. Malden: Blackwell Publishers, 147-62.
Sosa, E. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Oxford University Press.
Stich, S. 1988. “Reflective Equilibrium, Analytic Epistemology, and the Problem of Cognitive Diversity.” Synthese, 74, 391-413.
Stich, S. 1990. The Fragmentation of Reason. Cambridge: The MIT Press.
Stroud, B. 1989. “Understanding Human Knowledge in General,” in Knowledge and Skepticism, Marjorie Clay and Keith Lehrer, eds. Boulder: Westview, 31-50. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 307-323. Page numbers to this anthology.
Swain, Marshall. 1981. Reasons and Knowledge. Ithaca: Cornell University Press.
Williams, M. 2000. “Epistemological Realism,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 536-555.
Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundation of Knowledge. Cambridge: Cambridge University Press.
Zagzebski, L. 2000. “Virtues of the Mind,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 457-467.
Zagzebski, L. 2009. On Epistemology. Belmont: Wadsworth.

Author Information

Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
Broward College
U. S. A.

Laozi (Lao-tzu, fl. 6th cn. B.C.E.)

Laozi is the name of a legendary Daoist philosopher, the alternate title of the early Chinese text better known in the West as the Daodejing, and the moniker of a deity in the pantheon of organized “religious Daoism” that arose during the later Han dynasty (25-220 C.E.). Laozi is the pinyin romanization for the Chinese characters which mean “Old Master.” Laozi is also known as Lao Dan (“Old Dan”) in early Chinese sources (see Romanization systems for Chinese terms). The Zhuangzi (late 4th century B.C.E.) is the first text to use Laozi as a personal name and to identify Laozi and Lao Dan. The earliest materials to mention Laozi are in the Zhuangzi’s Inner Chapters (Chs. 1-7) in the narration of Lao Dan’s funeral in Ch. 3. Two other passages provide support for the linkage of Laozi and Lao Dan (in Ch. 14 and Ch. 27). There are seventeen passages in which Laozi plays a role in the Zhuangzi. Three are in the Inner Chapters, eight occur in chapters 11-14 in the Yellow Emperor sections of the text (chs. 11, 12, 13, 14), five are in chapters likely belonging to Zhuang Zhou’s disciples as the sources (chs. 21, 22, 23, 25, 27), and one is in the final concluding editorial chapter (ch. 33). In the Yellow Emperor sections in which Laozi is the main figure, four passages contain direct attacks on Confucius and the Confucian virtues of ren, yi, and li in the form of dialogues. The sentiments expressed by Laozi in these passages are reminiscent of remarks from the Daodejing and probably date from the period in which that collection was reaching some near final form. Some of these themes include the advocacy of wu-wei, rejection of discursive reasoning and mind meddling, condemnation of making discriminations, and valorization of forgetting and fasting of the mind. The earliest ascriptions of authorship of the Daodejing to Laozi are in Han Feizi and the Huainanzi.
Over time, Laozi became a principal figure in institutionalized forms of Daoism and he was often associated with the many transformations and incarnations of the dao itself.

Laozi and Lao Dan in the Zhuangzi
Laozi and the Daodejing
The First Biography and the Establishment of Laozi as the Founder of Daoism
The Ongoing Laozi Myth
References and Further Reading

1. Laozi and Lao Dan in the Zhuangzi

The Zhuangzi gives the following, probably fictional, account of Confucius’s impression of Laozi:

“Master, you’ve seen Lao Dan—what estimation would you make of him?” Confucius said, “At last I may say that I have seen a dragon—a dragon that coils to show his body at its best, that sprawls out to display his patterns at their best, riding on the breath of the clouds, feeding on the yin and yang. My mouth fell open and I couldn’t close it; my tongue flew up and I couldn’t even stammer. How could I possibly make any estimation of Lao Dan!” Zhuangzi, Ch. 14

Laozi’s relationship to Confucius is a major part of the Zhuangzi’s picture of the philosopher. Of the seventeen passages mentioning Laozi, Confucius figures as a dialogical partner or subject in nine. While it is clear that Confucius is thought to have a long way to go to become a zhenren (the Zhuangzi’s way of speaking about the perfected person), Lao Dan seems to feel sorry for Confucius in his reply to Wuzhi “No-Toes” in Ch. 5, The Sign of Virtue Complete. Laozi recommends to Wuzhi that he try to release Confucius from the fetters of his tendency to make rules and human discriminations (for example, right/wrong; beautiful/ugly) and set him free to wander with the dao.

Lao Dan addresses Confucius by his personal name “Qiu” in three passages. Since such a liberty is one that only a person with seniority and authority would take, this style invites us to believe that Confucius was a student of Lao Dan’s and thereby acknowledged Laozi as an authority. In one of these passages, Lao Dan cautions Confucius against clever arguments and against making plans and strategies with which to solve life’s problems, telling him that such rhetoricians are simply like nimble monkeys and rat-catching dogs who are set aside when unable to perform (Ch. 12, Heaven and Earth). And on another occasion, Qiu claims that he knows the “six classics” thoroughly and that he has tried to persuade 72 kings of their truth, but they have been unmoved. Lao Dan’s reply is, “Good!” He tells Confucius not to occupy himself with such worn-out ways, and instead to live the dao himself (Ch. 14, Turning of Heaven).

In Sima Qian’s later attempt to provide an actual biography of Laozi (see below), Laozi’s vocation as a librarian figures prominently. If the ultimate source of this tradition is the Zhuangzi, we should not forget that the context of this record is as a component in the theme that Laozi taught Confucius, who was confused and having no success with his own teachings. Accordingly, the point of the story that mentions Laozi’s occupation as a librarian or archivist (ch. 13) is that Confucius’s writings, offered to Laozi by Confucius himself, are simply not worthy to be put into a library. We cannot be sure, then, that there is any real memory of Laozi’s occupation being preserved for us, as the story may be an entire fiction meant to make a point about the inadequacy of Confucius’s teachings.

Finally, in Ch. 14, Turning of Heaven, Lao Dan makes a direct attack not only on the rules and regulations of Confucius, but also on the teachings of the Mohists and the veneration of the ancient emperors and legendary sages of the past, displaying his preference for experiential oneness with dao over any teaching or tradition of philosophers or great minds of the past.

2. Laozi and the Daodejing

The ways in which Laozi’s remarks in the seventeen Zhuangzi passages in which he appears sound like sentiments in the Daodejing (hereafter, DDJ) collectively represent one basis for the traditional association of Laozi as author of the text. For example, at Laozi’s funeral in Ch. 3, Qin Shi valorizes Laozi by saying that he accomplished much, without appearing to do so, which is a reference both to the Old Master’s rejection of pursuit of fame and power and also praise for his conduct as wu-wei (effortless action) in oneness with dao. Qin Shi’s praise of Laozi is also consistent with Laozi’s teaching to Yangzi Ju in Ch. 7 not to seek fame and power. Such conduct and attitudes are encouraged strongly in DDJ 2, 7, 22, 24, 51 and 77. When Laozi tells Wuzhi to return to Confucius and set him free from the disease of problematizing life and tying himself in knots by helping him to empty himself of making discriminations (Zhuangzi ch. 5), this same teaching shows up in the DDJ in many places (for example, chs. 5 and 18). Likewise, when Laozi criticizes Confucius for trying to spread the classics (12 in number in ch. 13 and 6 in ch. 14) instead of valuing the wordless teaching, the DDJ has a ready parallel in Ch. 56. While Confucius is teaching his disciples to put forth effort and cultivate benevolence (ren) and appropriate conduct (yi), Laozi tells him that he should be teaching effortless action (wu-wei) (Zhuangzi chs. 13, 14, and 21). This teaching also shows up in the DDJ (chs. 2, 3, 20, 47, 48, 57, 63, and 64). Finally, if we take Zhuangzi Ch. 33 as an original part of the work, then Lao Dan (Laozi) actually quotes DDJ 28.

In addition to the ways in which Laozi’s teachings in the Zhuangzi sound like those of the DDJ, we should also note that both of the very early classical works known as the Hanfeizi and the Huainanzi contain passages that are direct quotes or unmistakable allusions to teachings in the DDJ and attribute them to Lao Dan or Laozi by name. Tae Hyun Kim has made a study of these passages in Hanfeizi and the recent English translation of Huainanzi by John Major and others makes it easy to locate these citations (for example, see Huainanzi, 11.3). All of these connections culminate in Sima Qian’s biography of Laozi (see below) which not only says that Laozi was the author of the DDJ, but explains that it was a written text of Laozi’s teachings given when he departed China to go to the West. So, by the 1st Cent. B.C.E., it was accepted by tradition and lore that Laozi was the author of the DDJ.

However, the attribution of authorship of the DDJ to Laozi is much more complicated than it first appears. The DDJ has 81 chapters and about 5,000 Chinese characters, depending on which text is used. Its two major divisions are the dao jing (chs. 1-37) and the de jing (chs. 38-81). But actually, this division probably rests on nothing other than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). Moreover, although the text has been studied by commentators in Chinese history for centuries, the general reverence shown to it, and the long standing tradition that it was the work of the great philosopher Laozi, were two factors militating against any critical literary analysis of its structure. What we know now is that in spite of the view that the text had a single author named Laozi, it is clear to textual critics that the work is a collection of smaller passages edited into sections and not the work of a single hand. Most of these probably circulated orally, perhaps as single teachings or in small collections. Later they were gathered and arranged by an editor.

The internal structure of the DDJ is only one ground for the denial of a single author for the text. The fact that we also now know there were multiple versions of the DDJ, even as early as 300 B.C.E., also suggests that it is unlikely that a single author wrote just one book that we now know as the DDJ. Consider that for almost 2,000 years the Chinese text used by commentators in China and upon which all except the most recent Western language translations were based has been called the Wang Bi, after the commentator who made a complete edition of the DDJ sometime between 226-249 C.E. Although Wang Bi was not a Daoist, the commentary he wrote after collecting and editing the text became a standard interpretive guide, and generally speaking even today scholars depart from his arrangement of the actual text only when they can make a compelling argument for doing so. However, based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s we have no doubt that there were several simultaneously circulating versions of the DDJ text that pre-dated Wang Bi’s compilation of what we now call the “received text.”

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries include two incomplete editions of the DDJ on silk scrolls (boshu) now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. Some word choice divergencies are present. The order of the chapters is reversed, with 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching.

The Guodian find consists of 730 inscribed bamboo slips found in a tomb near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding only to Chapters 1-66. Based on the probable date of the closing of the tomb, the version of the DDJ found within it may date as early as c. 300 B.C.E.

3. The First Biography and the Establishment of Laozi as the Founder of Daoism

We have now arrived at the stage where studies of Laozi’s biography usually begin.

The first known attempt to write a biography of Laozi is in the Shiji (Historical Records) by Sima Qian (145-89 B.C.E.). According to this text, Laozi was a native of Chu, a southern state of the Zhou dynasty. His surname was Li, and his personal name was Er, and his style name was Dan. Sima Qian reports that Laozi was a historiographer in charge of the archives of Zhou. Moreover, Sima Qian tells us that Confucius had traveled to see Laozi to learn about the performance of rituals from him. According to The Book of Rites (Liji), a master known as Lao Dan was an expert on mourning rituals. On four occasions, Confucius (Kongzi, Master Kong) is reported to have responded to questions by appealing to answers given by Lao Dan. The records even say that Confucius once assisted him in a burial service. Just what date we can put on this record from The Book of Rites is uncertain, but it may have informed Sima Qian’s biography.

According to the biography, during the course of their conversations Laozi told Confucius to give up his prideful ways and seeking of power. When Confucius returned to his disciples, he told them that he was overwhelmed by the commanding presence of Laozi, which was like that of a mighty dragon. The biography goes on to say that Laozi cultivated the dao and its de. However, as the state of Zhou continued to decline, Laozi decided to leave China through the Western pass (toward India), and upon his departure he gave to the keeper of the pass, one Yin Xi, a book divided into two parts, one on dao and one on de, and of 5,000 characters in length. After that, no one knew what became of him. This is perhaps the most familiar of the traditions narrated by Sima Qian and it contains the core of almost every subsequent biography or hagiography of Laozi of significance. However, the biography did not end here. Sima Qian went on to record what other sources said about Laozi.

In the first biography, Sima Qian says some report that Laolaizi came from Chu, was a contemporary of Confucius, and he authored a work in fifteen sections which speaks of the practical uses of the Daoist teachings. But Sima Qian leaves it undecided whether he thinks Laolaizi should be identified with Laozi, even if he does include this reference in the section on Laozi.

Sima Qian adds another layer to the biography without commenting on the degree of confidence he has in its truthfulness, according to which it is said that Laozi lived 160 years or even 200 years, as a result of cultivating the dao and nurturing his longevity.

An additional tradition included in the first biography is that Dan, the historiographer of Zhou, predicted in 479 B.C.E. that Zhou and Qin would break apart and that a new king would arise from Qin. The point of this tradition is that Dan (Lao Dan?) had the power to predict the political future of the people, including the fragmentation of the Zhou dynasty and the rise of the Qin in about 221 B.C.E. (that is, Qinshihuang, or the first emperor of China). But Sima Qian likewise refuses to identify Laozi with this Dan.

Finally, the first biography concludes with a reference to Laozi’s son and his descendants. Another movement in the evolution of the Laozi story was completed by about 240 B.C.E. This was necessitated by Lao Dan’s association with the grand historiographer Dan during the Zhou, who predicted the rise of the Qin state. This information, along with that of Laozi’s journey to the West and of the writing of the book for Yin Xi, won a favorable position for Laozi during the Qin dynasty. The association of Laozi with a text (the DDJ) that was becoming increasingly significant was important. However, with the demise of the Qin state, some realignment of Laozi’s connection with the Qin was needed. So, Sima Qian’s final remarks about Laozi’s son helped to associate the philosopher’s lineage with the new Han ruling family. The journey to the West component now also had a new force. It explained why Laozi was not presently advising the Han rulers.

Overall, it seems that the earliest biography conforms closely to passages contained in Zhuangzi Chapters 11-14 and 26 in associating Laozi with the archivist or historiographer of Zhou, Laozi’s rebuke of Confucius’s prideful seeking of fame and pursuit of power, and the report that Confucius told his disciples that Laozi was like a great dragon. It is possible, then, that the Zhuangzi is the ultimate source of Sima Qian’s information.

Sima Qian also says, “Laozi cultivated the dao and its virtue (de).” We recognize of course that “dao and its virtue” is dao and de and that this phrase is meant to solidify Laozi’s association with the Daodejing. What the Zhuangzi, Hanfeizi and Huainanzi only alluded to by putting near quotes from the DDJ in the mouth of Laozi, Sima Qian now makes into an explicit connection. He even tells us that when the Zhou kingdom began to decline, Laozi decided to leave China and head into the West. When he reached the mountain pass, the keeper of the pass (Yin Xi) insisted that he write down his teachings, so that the people would have them after he left. So, “Laozi wrote a book in two parts, discussing the ideas of the dao and of de in some 5,000 words, and departed. No one knows where he ended his life.” These remarks make an unmistakable connection between what Laozi is said to have delivered to Yin Xi and the two sectional divisions of the DDJ and a very close approximation to its exact number of characters.

Sima Qian classified the Six Schools as Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoist. Since his biography located Laozi in a time period predating the Zhuangzi, and the passages in the Zhuangzi seemed to be about a person who lived in the time of Confucius (and not to be simply a literary or traditional invention), the inference was easy to make that Laozi was the founder of the Daoist school.

4. The Ongoing Laozi Myth

In The Lives of the Immortals (Liexian zhuan) by Liu Xiang (79-8 B.C.E.) there are separate entries for Laozi and Yin Xi. According to the extension of the story of Laozi’s leaving China through the Western pass found in Liu Xiang’s work, Yin became a disciple of Laozi and begged him to allow him to go to the West as well. Laozi told him that he could come along, but only after he cultivated the dao. Laozi instructed Yin to study hard and await a summons which would be delivered to him in the marketplace in the city of Chengdu. There is now a shrine at the putative location of this site dedicated to the “ideal disciple.” Additionally, in Liu Xiang’s text it is clear that Laozi is valorized as the preeminent immortal and as a superior daoshi (fangshi) who had achieved not only immortality through wisdom and the practice of techniques for longevity, but also mastery of the arts associated with the abilities and skills of one who was united with dao (compare the “Spirit Man” living in the Gushe mountains in Zhuangzi ch. 1 and Wang Ni’s remarks on the perfected person or zhenren in ch. 2).

Another important stage in the development of Laozi’s place in Chinese philosophical history occurred when Emperor Huan (147-167 C.E.) built a palace on the traditional site of Laozi’s birthplace and authorized veneration and sacrifice to Laozi. The “Inscription to Laozi” (Laozi ming) written by Pian Shao in c. 166 C.E. as a commemorative marker for the site goes well beyond Sima Qian’s biography. It makes the first apotheosis of Laozi into a deity. The text makes reference to the many cosmic metamorphoses of Laozi, portraying him as having been counselor to the great sage kings of China’s misty pre-history. Accordingly, during this period of the 2nd and 3rd centuries, the elite at the imperial court divinized Laozi and regarded him as an embodiment or incarnation of the dao, a kind of cosmic emperor who knew how to bring things into perfect harmony and peace by acting in wu-wei.

The Daoist cosmological belief in the powers of beings who experienced unity with the dao to effect transformation of their bodies and powers (for example, Huzi in Zhuangzi, ch.7) was the philosophical underpinning of the work, Classic on the Transformations of Laozi (Laozi bianhua jing, late 100s C.E., available now in a Dunhuang manuscript dating 612 C.E.). This work reflects some of the ideas in Pian Shao’s inscription, but takes them even further. It tells how Laozi transformed into his own mother and gave birth to himself, taking quite literally comments in the DDJ where the dao is portrayed as the mother of all things (DDJ, ch. 1). The work associates Laozi with various manifestations or incarnations of the dao itself. In this text there is a complete apotheosis of Laozi into a numinal divinity. “Laozi rests in the great beginning, wanders in the great origin, floats through dark, numinous emptiness…He joins serene darkness before its opening, is present in original chaos before the beginnings of time….Alone and without relation, he has existed since before heaven and earth. Living deeply hidden, he always returns to be. Gone, the primordial; Present, a man” (Quoted in Kohn, “Myth,” 47). The final passage in this work is an address given by Laozi predicting his reappearance and promising liberation from trouble and the overthrow of the Han dynasty, an allusion that helps us fix the probable date of origin for the work. The millennial cults of the second century believed Laozi was a messianic figure who appeared to their leaders and gave them instructions and revelations (for example, the hagiography of Zhang Daoling, founder of the Celestial Master Zhengyi movement contained in the 5th century work, Taiping Guangji 8).

The period of the Celestial Masters (c. 142-260 C.E.) produced documents enhancing the myth of Laozi, who came then to be called Laojun (Lord Lao) or Taishang Laojun (Most High Lord Lao). Laojun could manifest himself in any time of unrest and bring Great Peace (taiping). Yet, the Celestial Masters never claimed that Laojun had done so in their day. Instead of such a direct manifestation, the Celestial Masters practitioners taught that Laojun transmitted to them talismans, registers, and new scriptures in the form of texts to guide the creation of communities of heavenly peace. One work, very likely from the late 3rd or early 4th century C.E., entitled The Hundred and Eighty Precepts Spoken by Lord Lao (Laojun shuo yibai bashi jie), became the earliest set of behavioral guides for Celestial Masters communities. According to the text, Laozi delivered these precepts after returning from India and finding the people in a state of corruption.

During the reign of Emperor Huidi of the Western Jin dynasty (290-306 C.E.), Wang Fu, a master within the Daoist sectarian group known as the Celestial Masters, often debated with the Buddhist monk Bo Yuan about philosophical beliefs. Scholarly consensus holds that, as a result of these exchanges, Wang Fu compiled a one-scroll work entitled Classic of the Conversion of the Barbarians (Huahu jing, c. 300 C.E.). The work is also known by the title The Supreme Numinous Treasure’s Sublime Classic on Laozi’s Conversion of the Barbarians (Taishang lingbao Laozi huahu miaojing). Perhaps the most inflammatory claim of this work was its teaching that when Laozi left China through the Western pass he went to India, where he transformed into the historical Buddha and converted the barbarians. The basic implication of the book was that Buddhism was actually only a form of Daoism. This work inflamed Buddhists for decades. In fact, both of the Tang Emperors Gaozong (649-683 C.E.) and Zhongzong (705-710 C.E.) gave imperial orders to prohibit its distribution. However, as bitter contention continued between Buddhism and Daoism, the Daoists actually expanded the Classic of the Conversion of the Barbarians, so that by 700 C.E. it was ten scrolls in length. Four of these were recovered in the Dunhuang cache of manuscripts. The much extended work came to include the account that Laozi entered the mouth of a queen in India and the next year was born from her right armpit to become the Buddha. He walked immediately after his birth, and “from then on Buddhist teaching came to flourish.” To those familiar with the hagiographies of the Buddha, virtually all of this birth account is recognizable as associated with Buddha, not Laozi.

In the course of the production of polemical writings on the Buddhist side of the debate, attempts were made to turn the tables on the Daoists. Laozi was portrayed as a bodhisattva or disciple of the Buddha sent to convert the Chinese. This theory had other desirable extensions from a Buddhist viewpoint, because it was also applied to Confucius, enabling Buddhist rhetoricians to hold that Confucius was an avatar of Buddhism and that Confucianism was actually a form of distorted Buddhism.

Most later writings about Laozi continued to base their appeals to Laozi’s authority on his ongoing transformations, but they likewise provide evidence of the growing tension between Daoism and Buddhism. The first mythological account of Laozi’s birth is in the Classic of the Inner Explanation of the Three Heavens (Santian neijie jing), a Celestial Master work dated about 420 C.E. In this text, Laozi has three births: as the manifestation of the dao from pure energy to become a deity in heaven; in human form as the ancient philosopher author of the DDJ; and as the Buddha after his journey to the West. In the first birth, his mother is known as “The Jade Maiden of Mystery and Wonder.” In his second, he is born to a human woman known as Mother Li. This was an eighty-one-year pregnancy, after which he was born from her left armpit (there is a tradition that Buddha had been born from his mother’s right armpit). At birth he had white hair and so he was called laozi (here meaning something more like lao haizi or Old Child). This birth is set in the time of the Shang dynasty, several centuries before the date Sima Qian reports. But the purpose of such a move in the Laozi legend is to allow him time to travel to the West and then become the Buddha. The third birth takes place in India as the Buddha.

In the Yuan dynasty (1285 C.E.), Emperor Shizu ordered the burning of the Daoist canon of texts, and according to lore, the first writing destroyed was the greatly extended version of Classic of the Conversion of the Barbarians in ten or more scrolls. Once again, though, the text and its story of Laozi seemed quite resilient. It reappeared in the form of an illustrated work entitled Eighty-one Transformations of Lord Lao (Laojun bashiyi hua tushuo). The Buddhist thinker Xiangmai wrote a detailed, but polemical, history of this text and few scholars trust its reliability. Whether the Eighty-one Transformations of Lord Lao still survives is arguable, although a work entitled Eighty-one Transformations of the Most High Lord Lao of Mysterious Origin of the Golden Portal (Jinque xuanyuan Taishang Laojun bashiyi hua tushuo) with illustrations and dating to 1598 is held in the Museum für Völkerkunde in Berlin. The version in Berlin provides an illustration for each of Laozi’s transformations, each accompanied by a short text. The first few depict his existence in cosmic time. It is not until the 11th transformation that he enters historical time during the era of Fu Xi by the name Yuhuazi. In his 34th transformation, Laozi sends Yin Xi to explain the sutras to the Indian barbarians. The 58th transformation is Laozi’s appearance in the clouds to Zhang Daoling, the founder of the Celestial Master Zhengyi sect of Daoism that still exists today.

Ge Hong’s (283-343 C.E.) The Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is arguably the most important Daoist philosophical work of the Jin dynasty. In this text, Ge Hong reports that in a state of visualization he saw Laozi, seven feet tall, with cloudlike garments of five colors, wearing a multi-tiered cap and carrying a sharp sword. According to Ge Hong, Laozi had a prominent nose, long eyebrows, and an elongated head. This physiological type was the template for portraying immortals in Daoist art. Liu Xiang’s Collected Biographies of the Immortals (Liexian zhuan, c. 18 B.C.E.) reports that Laozi was born during the Shang dynasty, served as an archivist under the Zhou, was a teacher of Confucius, and later made his way to the West just as said in Sima Qian’s standard biography. Ge Hong, for his part, also collected and edited the Biographies of Immortals (Shenxian zhuan). In its entry on Laozi, Ge Hong praises Laozi’s practice of stillness and wu-wei, but he also represents Laozi as a master of the techniques of immortality and the efficacy of external alchemy, herbs and control of qi. He attributes to Laozi what is called the alchemy of the nine cinnabars and eight minerals, as well as a vast knowledge of herbology and dietetics. Ge Hong also tells a story about one Xu Jia who was a retainer of Laozi. In the story, Laozi keeps Xu Jia alive by means of a powerful talisman placed in Xu’s mouth. Its removal causes Xu’s death. When replaced, Xu Jia lives again. In all this, Laozi is portrayed as a master of life and death by means of talismanic power, a practice used by the Celestial Masters and continued by Daoist masters as late as the Ming dynasty, if not into the present era.

Other reported manifestations of Laozi gave authority to new Daoist lineages or modifications of practice. For example, the Daoist master Kou Qianzhi reported a revelation received from Laozi in 415 C.E. which was a “New Code” for Daoist practitioners and communities. He wrote down the revelation in a text that became known as Classic on Precepts of Lord Lao Recited to the Melody of the Clouds (Laojun yinsong jiejing). This text contains 36 moral precepts, each of which traces its authority to the introductory phrase, “Lord Lao said….” Textual traces are not the only sources for the traditions and views of Laozi in Chinese philosophical history. Yoshiko Kamitsuka has done a study of how views about Laozi have changed and been reflected in material culture, especially sculpture and inscription.

Laozi was also often looked to for political validation. Throughout most of the Tang dynasty (618-907 C.E.), Laozi was regarded as the protector of the state because of the tradition that both the Tang ruling family and Laozi shared the surname Li and because of many reports of auspicious appearances of Laozi at the inauguration of the Tang dynasty in which he pledged his support during the rise and solidification of the ruling bureaucracy.

The hagiography of Laozi has continued to develop, down to the present day. There are even traditions that various natural geographic landmarks and features are the enduring imprint of Lord Lao on China and his face can be seen in them. It is more likely, of course, that Laozi’s immortality is in the mark made by the philosophical movement he has come to represent and the culture it created.

5. References and Further Reading

Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
Boltz, William. (2005). “The Composite Nature of Early Chinese Texts.” In Text and Ritual in Early China, ed. Martin Kern. 50-78. Seattle: University of Washington Press.
Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York Press.
Giles, Lionel. (1948). A Gallery of Chinese Immortals. London: John Murray.
Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
Graham, Angus. [1998 (1986)]. “The Origins of the Legend of Lao Tan.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 23-41. Albany: State University of New York Press.
Hansen, Chad. (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
Kamitsuka, Yoshiko. (1998). “Lao-Tzu in Six Dynasties Taoist Sculpture.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 63-89. Albany: State University of New York Press.
Kim, Tae Hyun. (2010). “Other Laozi Parallels in the Hanfeizi: An Alternative Approach to the Textual History of the Laozi and Early Chinese Thought.” Sino-Platonic Papers 199 (March 2010), ed. Victor H. Mair. Philadelphia: University of Pennsylvania Press.
Kohn, Livia. (2008). “Laojun yinsong jiejing [Classic on Precepts of Lord Lao, Recited to the Melody in the Clouds].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
Kohn, Livia. (1998). “The Lao-Tzu Myth.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 41-63. Albany: State University of New York Press.
Kohn, Livia. (1996). “Laozi: Ancient Philosopher, Master of Longevity, and Taoist God.” In Religions of China in Practice, ed. Donald S. Lopez, 52-63. Princeton: Princeton University Press.
Kohn, Livia and LaFargue, Michael. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
Kohn, Livia and Roth, Harold. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
Nylan, Michael and Csikszentmihalyi, Mark. (2003). “Constructing Lineages and Inventing Traditions through Exemplary Figures in Early China.” T’oung Pao 89: 1-41.
Penny, Benjamin. (2008). “Laojun bashiyi huatu [Eighty-one Transformations of Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
Penny, Benjamin. (2008). “Laojun shuo yibai bashi jie [The 180 Precepts Spoken by Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
Smith, Kidder. (2003). “Sima Tan and the Invention of Daoism, ‘Legalism,’ et cetera.” The Journal of Asian Studies 62.1: 129-156.
Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Daoist Philosophy

Along with Confucianism, “Daoism” (sometimes called “Taoism”) is one of the two great indigenous philosophical traditions of China. As an English term, Daoism corresponds to both Daojia (“Dao family” or “school of the Dao”), an early Han dynasty (c. 100s B.C.E.) term which describes so-called “philosophical” texts and thinkers such as Laozi and Zhuangzi, and Daojiao (“teaching of the Dao”), which describes various so-called “religious” movements dating from the late Han dynasty (c. 100s C.E.) onward. Thus, “Daoism” encompasses thought and practice that sometimes are viewed as “philosophical,” as “religious,” or as a combination of both. While modern scholars, especially those in the West, have been preoccupied with classifying Daoist material as either “philosophical” or “religious,” historically Daoists themselves have been uninterested in such categories and dichotomies. Instead, they have preferred to focus on understanding the nature of reality, increasing their longevity, ordering life morally, practicing rulership, and regulating consciousness and diet. Fundamental Daoist ideas and concerns include wuwei (“effortless action”), ziran (“naturalness”), how to become a shengren (“sage”) or zhenren (“perfected person”), and the ineffable, mysterious Dao (“Way”) itself.

What is Daoism?
Classical Sources for Our Understanding of Daoism
Is Daoism a Philosophy or a Religion?
The Daodejing
Fundamental Concepts in the Daodejing
The Zhuangzi
Basic Concepts in the Zhuangzi
Daoism and Confucianism
Daoism in the Han
Celestial Masters Daoism
Neo-Daoism
Shangqing and Lingbao Daoist Movements
Tang Daoism
The Three Teachings
The “Destruction” of Daoism
References and Further Reading

1. What is Daoism?

Strictly speaking, there was no Daoism before the literati of the Han dynasty (c. 200 B.C.E.) tried to organize the writings and ideas that represented the major intellectual alternatives available. The name daojia, “Dao family” or “school of the dao,” was a creation of the historian Sima Tan (d. 110 B.C.E.) in his Shi ji (Records of the Historian), written in the 2nd century B.C.E. and later completed by his son, Sima Qian (145-86 B.C.E.). In Sima Qian’s classification, the Daoists are listed as one of the Six Schools: Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoists. So, Daoism was a retroactive grouping of ideas and writings which were already at least one to two centuries old, and which may or may not have been ancestral to various post-classical religious movements, all self-identified as daojiao (“teaching of the dao”), beginning with the reception of revelations from the deified Laozi by the Celestial Masters (Tianshi) lineage founder, Zhang Daoling, in 142 C.E. This article privileges the formative influence of early texts, such as the Daodejing and the Zhuangzi, but accepts contemporary Daoists’ assertion of continuity between classical and post-classical, “philosophical” and “religious” movements and texts.

2. Classical Sources for Our Understanding of Daoism

Daoism does not name a tradition constituted by a founding thinker, even though the common belief is that a teacher named Laozi originated the school and wrote its major work, called the Daodejing, also sometimes known as the Laozi. The tradition is also called “Lao-Zhuang” philosophy, referring to what are commonly regarded as its two classical and most influential texts: the Daodejing or Laozi (3rd century B.C.E.) and the Zhuangzi (4th-3rd century B.C.E.). However, various streams of thought and practice were passed along by masters (daoshi) before these texts were finalized. There are two major source issues to be considered when forming a position on the origins of Daoism. 1) What evidence is there for beliefs and practices later associated with the kind of Daoism recognized by Sima Qian prior to the formation of the two classical texts? 2) What is the best reconstruction of the classical textual tradition upon which later Daoism was based?

With regard to the first question, Isabelle Robinet thinks that the classical texts are only the most lasting evidence of a movement she links to a set of writings and practices associated with the Songs of Chu (Chuci), and which she identifies as the Chuci movement. This movement reflects a culture in which male and female masters variously called fangshi, daoshi, zhenren, or daoren practiced techniques of longevity and used diet and meditative stillness to create a way of life that attracted disciples and resulted in wisdom teachings. While Robinet’s interpretation is controversial, there are undeniable connections between the Songs of Chu and later Daoist ideas. Some examples include a coincidence of names of immortals (sages), a commitment to the pursuit of physical immortality, a belief in the epistemic value of stillness and quietude, abstinence from grains, breathing and sexual practices used to regulate internal energy (qi), and the use of ritual dances that resemble those still done by Daoist masters (the step of Yu).

In addition to the controversial connection to the Songs of Chu, the Guanzi (350-250 B.C.E.) is a text older than both the Daodejing and probably all of the Zhuangzi, except the “inner chapters” (see below). The Guanzi is a very important work of 76 “chapters.” Three of the chapters of the Guanzi are called the Neiye, a title which can mean “inner cultivation.” The self-cultivation practices and teachings put forward in this material may be fruitfully linked to several other important works: the Daodejing; the Zhuangzi; a Han dynasty Daoist work called the Huainanzi; and an early commentary on the Daodejing called the Xiang’er. Indeed, there is a strong meditative trend in the Daoism of late imperial China known as the “inner alchemy” tradition and the views of the Neiye seem to be in the background of this movement. Two other chapters of the Guanzi are called Xin shu (Heart-mind book). The Xin shu connects the ideas of quietude and stillness found in both the Daodejing and Zhuangzi to longevity practices. The idea of dao in these chapters is very much like that of the classical works. Its image of the sage resembles that of the Zhuangzi. It uses the same term (zheng) that Zhuangzi uses for the corrections a sage must make in his body, the pacification of the heart-mind, and the concentration and control of internal energy (qi). These practices are called “holding onto the One,” “keeping the One,” “obtaining the One,” all of which are phrases also associated with the Daodejing (chs. 10, 22, 39).

The Songs of Chu and Guanzi still represent texts which are themselves creations of actual practitioners of Daoist teachings and sentiments, just as do the Daodejing and Zhuangzi. Who these persons were we do not know with certainty. It is possible that we do have the names, remarks, and practices of some of these individuals (daoshi) embodied in the passages of the Zhuangzi. For example, in Chs. 1-7 alone, we find Xu You, Ch. 1; Lianshu, Ch. 1; Ziqi, Ch. 2; Wang Ni, Ch. 2; Changwuzi, Ch. 2; Qu Boyu, Ch. 4; Carpenter Shi, Ch. 4; Bohun Wuren, Ch. 5; Nü Yu, Ch. 6; Sizi, Yuzi, Lizi, Laizi, Ch. 6; Zi Sanghu, Meng Zifan, Zi Qinzan, Ch. 6; Yuzi and Sangzi, Ch. 6; Wang Ni and Putizi, Ch. 7; Jie Yu, Ch. 7; Lao Dan, Ch. 7; and Huzi, Ch. 7.

As for a reasonable reconstruction of the textual tradition upon which Daoism is based, we should not think of this task as simply determining the relationship between the Daodejing and the Zhuangzi, such as which text was first and which came later. These texts are composite. The Zhuangzi, for example, repeats in very similar form sayings and ideas found in the Daodejing, especially in the essay composing Zhuangzi Chs. 8-10. However, we are not certain whether this means that whoever was the source of this material in the Zhuangzi knew the Daodejing and quoted it, or if they both drew from a common source, or even if the Daodejing in some way depended on the Zhuangzi. In fact, one theory about the legendary figure Laozi is that he was created first in the Zhuangzi and later became associated with the Daodejing. There are seventeen passages in which Laozi (a.k.a. Lao Dan) plays a role in the Zhuangzi, and he is not mentioned by name in the Daodejing.

Based on what we know now, we could offer the following summary of the sources of early Daoism. Stage One: Zhuang Zhou’s “inner chapters” (chs. 1-7) of the Zhuangzi (c. 350 B.C.E.) and some components of the Guanzi, including perhaps both the Neiye and the Xin shu. Stage Two: The essay in Chs. 8-10 of the Zhuangzi and some collections of material which represent versions of our final redaction of the Daodejing, as well as Chs. 17-28 of the Zhuangzi representing materials likely gathered by Zhuang Zhou’s disciples. Stage Three: the “Yellow Emperor” (Huang-Lao) manuscripts from Mawangdui and chapters of the Zhuangzi (Chs. 11-19 and 22), and the text known as the Huainanzi (c. 139 B.C.E.).

3. Is Daoism a Philosophy or a Religion?

In the late 1970s Western and comparative philosophers began to point out that an important dimension of the historical context of Daoism was being overlooked because the previous generation of scholars had ignored or even disparaged connections between the classical texts and Daoist religious belief and practice not previously thought to have developed until the 2nd century C.E. We have to lay some of the responsibility for a prejudice against Daoism as a religion and the privileging of its earliest forms as a pure philosophy at the feet of the eminent translators and philosophers Wing-Tsit Chan and James Legge, who both spoke of Daoist religion as a degeneration of a pristine Daoist philosophy arising from the time of the Celestial Masters (see below) in the late Han period. Chan and Legge were instrumental architects in the West of the view that Daoist philosophy (daojia) and Daoist religion (daojiao) are entirely different traditions.

Actually, our interest in trying to separate philosophy and religion in Daoism is more revealing of the Western frame of reference we use than of Daoism itself. Daoist ideas fermented among master teachers who had a holistic view of life. These daoshi (Daoist masters) did not compartmentalize practices by which they sought to influence the forces of reality, increase their longevity, have interaction with realities not apparent to our normal way of seeing things, and order life morally and by rulership. They offered insights we might call philosophical aphorisms. But they also practiced meditative stillness and emptiness to gain knowledge, engaged in physical exercises to increase the flow of inner energy (qi), studied nature for diet and remedy to foster longevity, practiced rituals related to their view that reality had many layers and forms with whom/which humans could interact, wrote talismans and practiced divination, engaged in spellbinding of “ghosts,” led small communities, and advised rulers on all these subjects. The masters transmitted their teachings, some of them only to disciples and adepts, but gradually these teachings became more widely available, as is evidenced in the very creation of the Daodejing and Zhuangzi themselves.

The anti-supernaturalist and anti-dualist agendas that provoked Westerners to separate philosophy and religion, dating at least to the classical Greek period of philosophy, were not part of the preoccupation of Daoists. Accordingly, the question whether Daoism is a philosophy or a religion is not one we can ask without imposing a set of understandings, presuppositions, and qualifications that do not apply to Daoism. But the hybrid nature of Daoism is not a reason to discount the importance of Daoist thought. Quite to the contrary, it may be one of the most significant ideas classical Daoism can contribute to the study of philosophy in the present age.

4. The Daodejing

The Daodejing (hereafter, DDJ) is divided into 81 “chapters” consisting of slightly over 5,000 Chinese characters, depending on which text is used. In its received form from Wang Bi (see below), the two major divisions of the text are the dao jing (chs. 1-37) and the de jing (chs. 38-81). Actually, this division probably rests on little else than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). The text is a collection of short aphorisms that were not arranged to develop any systematic argument. The long-standing tradition about the authorship of the text is that the “founder” of Daoism, known as Laozi, gave it to Yin Xi, the guardian of the pass through the mountains that he used to go from China to the West (i.e., India) at some unknown date in the distant past. But the text is actually a composite of collected materials, most of which probably originally circulated orally, perhaps even in single aphorisms or small collections. These were then redacted as someone might string pearls into a necklace. Although D.C. Lau and Michael LaFargue have made preliminary literary and redaction critical studies of the texts, these are still insufficient to generate any consensus about whether the text was composed using smaller written collections or who were the probable editors.

For almost 2,000 years, the Chinese text used by commentators in China and upon which all except the most recent Western language translations were based has been called the Wang Bi, after the commentator who used a complete edition of the DDJ sometime between 226-249 CE. Although Wang Bi was not a Daoist, his commentary became a standard interpretive guide, and generally speaking even today scholars depart from it only when they can make a compelling argument for doing so. Based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s we are certain that there were several simultaneously circulating versions of the Daodejing text as early as c. 300 B.C.E.

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries consist of two incomplete editions of the DDJ on silk scrolls (boshu) now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. Some word choice divergencies are present. The order of the chapters is reversed, with 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching (1989). Contemporary scholarship associates the Mawangdui versions with a type of Daoism known as the Way of the Yellow Emperor and the Old Master (Huanglao Dao).

The Guodian find consists of 730 inscribed bamboo slips found near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ and corresponding to Chapters 1-66. It may date as early as c. 300 B.C.E. If this is a correct date, then the Daodejing was already extant in a written form when the “inner chapters” (see below) of the Zhuangzi were composed. These slips contain more significant variants from the Wang Bi than do the Mawangdui versions. A complete translation and study of the Guodian cache has been published by Scott Cook (2013).

5. Fundamental Concepts in the Daodejing

The term Dao means a road, and is often translated as “the Way.” This is because sometimes dao is used as a nominative (that is, “the dao”) and other times as a verb (i.e. daoing). Dao is the process of reality itself, the way things come together, while still transforming. All this reflects the deep seated Chinese belief that change is the most basic character of things. In the Yi jing (Classic of Change) the patterns of this change are symbolized by figures standing for 64 relations of correlative forces and known as the hexagrams. Dao is the alteration of these forces, most often simply stated as yin and yang. The Xici is a commentary on the Yi jing formed in about the same period as the DDJ. It takes the taiji (Great Ultimate) as the source of correlative change and associates it with the dao. The contrast is not between what things are or that something is or is not, but between chaos (hundun) and the way reality is ordering (de). Yet, reality is not ordering into one unified whole. It is the 10,000 things (wanwu). There is the dao but not “the World” or “the cosmos” in a Western sense.

The Daodejing teaches that humans cannot fathom the Dao, because any name we give to it cannot capture it. It is beyond what we can express in language (ch. 1). Those who experience oneness with dao, known as “obtaining dao,” will be enabled to wu-wei. Wu-wei is a difficult notion to translate. Yet, it is generally agreed that the traditional rendering of it as “nonaction” or “no action” is incorrect. Those who wu-wei do act. Daoism is not a philosophy of “doing nothing.” Wu-wei means something like “act naturally,” “effortless action,” or “nonwillful action.” The point is that there is no need for human tampering with the flow of reality. Wu-wei should be our way of life, because the dao always benefits, it does not harm (ch. 81). The way of heaven (dao of tian) is always on the side of good (ch. 79) and virtue (de) comes forth from the dao alone (ch. 21). What causes this natural embedding of good and benefit in the dao is vague and elusive (ch. 35); not even the sages understand it (ch. 76). But the world is a reality that is filled with spiritual force, just as a sacred image used in religious ritual might be inhabited by numinal power (ch. 29). The dao occupies the place in reality that is analogous to the part of a family’s house set aside for the altar for venerating the ancestors and gods (the ao of the house, ch. 62). When we think that life’s occurrences seem unfair (a human discrimination), we should remember that heaven’s (tian) net misses nothing, it leaves nothing undone (ch. 37).

A central theme of the Daodejing is that correlatives are the expressions of the movement of dao. Correlatives in Chinese philosophy are not opposites, mutually excluding each other. They represent the ebb and flow of the forces of reality: yin/yang, male/female; excess/defect; leading/following; active/passive. As one approaches the fullness of yin, yang begins to horizon and emerge and vice versa. Its teachings on correlation often suggest to interpreters that the DDJ is filled with paradoxes. For example, ch. 22 says, “Those who are crooked will be perfected. Those who are bent will be straight. Those who are empty will be full.” While these appear paradoxical, they are probably better understood as correlational in meaning. The DDJ says, “straightforward words seem paradoxical,” implying, however, that they are not (ch. 78).

What is the image of the ideal person, the sage (sheng ren), or the perfected person (zhen ren) in the DDJ? Well, sages wu-wei (chs. 2, 63). They act effortlessly and spontaneously as one with dao and in so doing, they “virtue” (de) without deliberation or volitional challenge. In this respect, they are like newborn infants, who move naturally, without planning and reliance on the structures given to them by culture and society (ch. 15). The DDJ tells us that sages empty themselves, becoming void of the discriminations used in conventional language and culture. Sages concentrate their internal energies (qi). They clean their vision (ch. 10). They manifest naturalness and plainness, becoming like uncarved wood (pu) (ch. 19). They live naturally and free from desires rooted in the discriminations that human society makes (ch. 37). They settle themselves and know how to be content (ch. 46). The DDJ makes use of some very famous analogies to drive home its point. Sages know the value of emptiness as illustrated by how emptiness is used in a bowl, door, window, valley or canyon (ch. 11). They preserve the female (yin), meaning that they know how to be receptive to dao and its power (de) and are not unbalanced favoring assertion and action (yang) (ch. 28). They shoulder yin and embrace yang, blend internal energies (qi) and thereby attain harmony (he) (ch. 42). Those following the dao do not strive, tamper, or seek to control their own lives (ch. 64). They do not endeavor to help life along (ch. 55), or use their heart-mind (xin) to “solve” or “figure out” life’s apparent knots and entanglements (ch. 55). Indeed, the DDJ cautions that those who would try to do something with the world will fail, they will actually ruin both themselves and the world (ch. 29). Sages do not engage in disputes and arguing, or try to prove their point (chs. 22, 81). They are pliable and supple, not rigid and resistive (chs. 76, 78). They are like water (ch. 8), finding their own place, overcoming the hard and strong by suppleness (ch. 36). Sages act with no expectation of reward (chs. 2, 51). They put themselves last and yet come first (ch. 7). They never make a display of themselves (chs. 72, 22). They do not brag or boast (chs. 22, 24) and they do not linger after their work is done (ch. 77). They leave no trace (ch. 27). Because they embody dao in practice, they have longevity (ch. 16). They create peace (ch. 32). Creatures do not harm them (chs. 50, 55). Soldiers do not kill them (ch. 50). Heaven (tian) protects the sage and the sage’s spirit becomes invincible (ch. 67).

Among the most controversial of the teachings in the DDJ are those directly associated with rulers. Recent scholarship is moving toward a consensus that the persons who developed and collected the teachings of the DDJ played some role in advising civil administration, but they may also have been practitioners of ritual arts and what we would call religious rites. Be that as it may, many of the aphorisms directed toward rulers in the DDJ seem puzzling at first sight. According to the DDJ, the proper ruler keeps the people without knowledge (ch. 65), fills their bellies, opens their hearts and empties them of desires (ch. 3). A sagely ruler reduces the size of the state and keeps the population small. Even though the ruler possesses weapons, they are not used (ch. 80). The ruler does not seek prominence. The ruler is a shadowy presence, never standing out (chs. 17, 66). When the ruler’s work is done, the people say they are content (ch. 17). This picture of rulership in the DDJ is all the more interesting when we remember that the philosopher and legalist political theorist named Han Feizi used the DDJ as a guide for the unification of China. Han Feizi was the foremost counselor of the first emperor of China, Qin Shihuangdi (r. 221-206 B.C.E.). However, it is a pity that the emperor used the DDJ’s admonitions to “fill the bellies and empty the minds” of the people to justify his program of destroying all books not related to medicine, astronomy or agriculture. When the DDJ says that rulers keep the people without knowledge, it probably means that they do not encourage human knowledge as the highest form of knowing but rather encourage the people to “obtain oneness with the dao.”

6. The Zhuangzi

The second of the two most important classical texts of Daoism is the Zhuangzi. This text is a collection of stories and remembered as well as imaginary conversations. The text is well known for its creativity and skillful use of language. Within the text we find longer and shorter treatises, stories, poetry, and aphorisms. The Zhuangzi may date as early as the 4th century B.C.E. and according to imperial bibliographies of a later date, the Zhuangzi originally had 52 “chapters.” These were reduced to 33 by Guo Xiang in the 3rd century C.E., although he seems to have had the 52 chapter text available to him. Ronnie Littlejohn has argued that the later work Liezi may contain some passages from the so-called “Lost Zhuangzi” 52 chapter version. Unlike the Daodejing, which is ascribed to the mythological Laozi, the Zhuangzi may actually contain materials from a teacher known as Zhuang Zhou who lived between 370-300 B.C.E. Chapters 1-7 are those most often ascribed to Zhuangzi himself (which is a title meaning “Master Zhuang”) and these are known as the “inner chapters.” The remaining 26 chapters had other origins and they sometimes take different points of view from the Inner Chapters. Although there are several versions of how the remainder of the Zhuangzi may be divided, one that is gaining currency is Chs. 1-7 (Inner Chapters), Chs. 8-10 (the “Daode” essay), Chs. 11-16 and parts of 18, 19, and 22 (Yellow Emperor Chapters), and Chs. 17-28 (Zhuang Zhou’s Disciples’ material), with the remains of the text attributable to the final redactor.

7. Basic Concepts in the Zhuangzi

Zhuangzi taught that a set of practices, including meditative stillness, helped one achieve unity with the dao and become a “perfected person” (zhenren). The way to this state is not the result of a withdrawal from life. However, it does require disengaging or emptying oneself of conventional values and the demarcations made by society. In Chapter 23 of the Zhuangzi, Nanrong Chu, inquiring of the character Laozi about the solution to his life’s worries, was answered promptly: “Why did you come with all this crowd of people?” The man looked around and confirmed he was standing alone, but Laozi meant that his problems were the result of all the baggage of ideas and conventional opinions he lugged about with him. This baggage must be discarded before anyone can be zhenren, move in wu-wei and express profound virtue (de).

Like the DDJ, Zhuangzi also valorizes wu-wei, especially in the Inner Chapters, the Yellow Emperor sections on rulership, and the Zhuangzi disciples’ materials in Ch. 19. For its examples of such living the Zhuangzi turns to analogies of craftsmen, athletes (swimmers), ferrymen, cicada-catching men, woodcarvers, and even butchers. One of the most famous stories in the text is that of Ding the Butcher, who learned what it means to wu wei through the perfection of his craft. When asked about his great skill, Ding says, “What I care about is dao, which goes beyond skill. When I first began cutting up oxen, all I could see was the ox itself. After three years I no longer saw the whole ox. And now—now I go at it by spirit and don’t look with my eyes. Perception and understanding have come to a stop and spirit moves where it wants. I go along with the natural makeup, strike in the big hollows, guide the knife through the big openings, and follow things as they are. So I never touch the smallest ligament or tendon, much less a main joint. A good cook changes his knife once a year—because he cuts. A mediocre cook changes his knife once a month—because he hacks. I’ve had this knife of mine for nineteen years and I’ve cut up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone. There are spaces between the joints, and the blade of the knife has really no thickness….[I] move the knife with the greatest subtlety, until—flop! The whole thing comes apart like a clod of earth crumbling to the ground.” (Ch. 3, The Secret of Caring for Life) The recurring point of all of the stories in Zhuangzi about wu-wei is that such spontaneous and effortless conduct as displayed by these many examples has the same feel as acting in wu-wei. The point is not that wu-wei results from skill development. Wu-wei is not a cultivated skill. It is a gift of oneness with dao. 
The Zhuangzi’s teachings on wu-wei are closely related to the text’s consistent rejection of the use of reason and argument as means to dao (chs. 2, 12, 17, 19).

Persons who exemplify such understanding are called sages, zhenren, and immortals. Zhuangzi describes the Daoist sage in such a way as to suggest that such a person possesses extraordinary powers. Just as the DDJ said that creatures do not harm the sages, the Zhuangzi also has a passage teaching that the zhenren exhibits wondrous powers, frees people from illness and is able to make the harvest plentiful (ch. 1). Zhenren are “spirit like” (shen yi), cannot be burned by fire, do not feel cold in the freezing forests, and life and death have no effect on them (ch. 2). Just how we should take such remarks is not without controversy. To be sure, many Daoists in history took them literally and an entire tradition of the transcendents or immortals (xian) was collected in text and lore.

Zhuangzi is drawing on a set of beliefs about master teachers that were probably regarded as literal by many, although some think he meant these to be taken metaphorically. For example, when Zhuangzi says that the sage cannot be harmed or made to suffer by anything that life presents, does he mean this to be taken as saying that the zhenren is physically invincible? Or, does he mean that the sage has so freed himself from all conventional understandings that he refuses to recognize poverty as any more or less desirable than affluence, to recognize blindness as worse than sight, to recognize death as any less desirable than life? As the Zhuangzi says in Chapter One, Free and Easy Wandering, “There is nothing that can harm this man.” This is also the theme of Chapter Two, On Making All Things Equal. In this chapter people are urged to “make all things one,” meaning that they should recognize that reality is one. It is a human judgment that what happens is beautiful or ugly, right or wrong, fortunate or not. The sage knows all things are one (equal) and does not judge. Our lives are snarled and jumbled so long as we make conventional discriminations, but when we set them aside, we appear to others as extraordinary and enchanted.

An important theme in the Zhuangzi is the use of immortals to illustrate various points. Did Zhuangzi believe some persons physically lived forever? Well, many Daoists did believe this. Did Zhuangzi believe that our substance was eternal and only our form changed? Almost certainly Zhuangzi thought that we were in a constant state of process, changing from one form into another (see the exchange between Master Lai and Master Li in Ch. 6, The Great and Venerable Teacher). In Daoism, immortality is the result of what may be described as a wu xing transformation. Wu xing means “five phases” and it refers to the Chinese understanding of reality according to which all things are in some state of combined correlation of qi as wood, fire, water, metal, and earth. This was not exclusively a “Daoist” physics. It underlay all Chinese “science” of the classical period, although Daoists certainly made use of it. Zhuangzi wants to teach us how to engage in transformation through stillness, breathing, and experience of numinal power (see ch. 6). And yet, perhaps Zhuangzi’s teachings on immortality mean that the person who is free of discrimination makes no difference between life and death. In the words of Lady Li in Ch. 2, “How do I know that the dead do not wonder why they ever longed for life?”

Huangdi (the Yellow Emperor) is the most prominent immortal mentioned in the text of the Zhuangzi and he is a main character in the sections of the book called “the Yellow Emperor Chapters” noted above. He has long been venerated in Chinese history as a cultural exemplar and the inventor of civilized human life. Daoism is filled with other accounts designed to show that those who learn to live according to the dao have long lives. Pengzu, one of the characters in the Zhuangzi, is said to have lived eight hundred years. The most prominent female immortal is Xiwangmu (Queen Mother of the West), who was believed to reign over the sacred and mysterious Mount Kunlun.

The passages containing stories of the Yellow Emperor in Zhuangzi provide a window into the views of rulership in the text. On the one hand, the Inner Chapters (chs. 1-7) reject the role of ruler as a viable vocation for a zhenren and consistently criticize the futility of government and politics (ch. 7). On the other hand, the Yellow Emperor materials in Chs. 11-13 present rulership as valuable, so long as the ruler acts by wu-wei. This second position is also that taken in the work entitled the Huainanzi (see below).

The Daoists did not think of immortality as a gift from a god, or an achievement in the religious sense commonly thought of in the West. It was a result of finding harmony with the dao, expressed through wisdom, meditation, and wu-wei. Persons who had such knowledge were reputed to live in the mountains, thus the character for xian (immortal) is made up of two components, the one being shan “mountain” and the other being ren “person.” Undoubtedly, some removal to the mountains was a part of the journey to becoming a zhenren “true person.” Because Daoists believed that nature and our own bodies were correlations of each other, they even imagined their bodies as mountains inhabited by immortals. The struggle to wu-wei was an effort to become immortal, to be born anew, to grow the embryo of immortality inside. A part of the disciplines of Daoism included imitation of the animals of nature, because they were thought to act without the intention and willfulness that characterized human decision making. Physical exercises included animal dances (wu qin xi) and movements designed to enable the unrestricted flow of the cosmic life force from which all things are made (qi). These movements designed to channel the flow of qi became associated with what came to be called tai qi or qi gong. Daoists practiced breathing exercises, used herbs and other pharmacological substances, and they employed an instruction booklet for sexual positions and intercourse, all designed to enhance the flow of qi energy. They even practiced external alchemy, using burners to modify the composition of cinnabar into mercury and made potions to drink and pills to ingest for the purpose of adding longevity. Many Daoist practitioners died as a result of these alchemical substances, and even a few Emperors who followed their instructions lost their lives as well, Qinshihuang being the most famous.

The attitude and practices necessary to the pursuit of immortality made this life all the more significant. Butcher Ding is a master butcher because his qi is in harmony with the dao. Daoist practices were meant for everyone, regardless of their origin, gender, social position, or wealth. However, Daoism was a complete philosophy of life and not an easy way to learn.

When superior persons learn the Dao, they practice it with zest.

When average persons learn of the Dao, they are indifferent.

When petty persons learn of the Dao, they laugh loudly.

If they did not laugh, it would not be worthy of being the Dao. (DDJ, 41)

8. Daoism and Confucianism

Arguably, Daoism shared some emphases with classical Confucianism such as a this-worldly concern for the concrete details of life rather than speculation about abstractions and ideals. Nevertheless, it largely represented an alternative and critical tradition divergent from that of Confucius and his followers. While many of these criticisms are subtle, some seem very clear.

One of the most fundamental teachings of the DDJ is that human discriminations, such as those made in law, morality (good, bad) and aesthetics (beauty, ugly) actually create the troubles and problems humans experience, they do not solve them (ch. 3a). The clear implication is that the person following the dao must cease ordering his life according to human-made distinctions (ch. 19). Indeed, it is only when the dao recedes in its influence that these demarcations emerge (chs. 18; 38), because they are a form of disease (ch. 74). In contrast, Daoists believe that the dao is untangling the knots of life, blunting the sharp edges of relationships and problems, and turning down the light on painful occurrences (ch. 4). So, it is best to practice wu-wei in all endeavors, to act naturally and not willfully try to oppose or tamper with how reality is moving or try to control it by human discriminations.

Confucius and his followers wanted to change the world and be proactive in setting things straight. They wanted to tamper, orchestrate, plan, educate, develop, and propose solutions. Daoists, on the other hand, take their hands off of life when Confucians want their fingerprints on everything. Imagine this comparison. If the Daoist goal is to become like a piece of unhewn and natural wood, the goal of the Confucians is to become a carved sculpture. The Daoists put the piece before us just as it is found in its naturalness, and the Confucians polish it, shape it, and decorate it. This line of criticism is made very explicitly in the essay which makes up Zhuangzi Chs. 8-10.

Confucians think they can engineer reality, understand it, name it, control it. But the Daoists think that such endeavors are the source of our frustration and fragmentation (DDJ, chs. 57, 72). They believe the Confucians create a gulf between humans and nature that weakens and destroys us. Indeed, as far as the Daoists are concerned, the Confucian project is like a cancer that saps our very life. This is a fundamental difference in how these two great philosophical traditions think persons should approach life, and as shown above it is a consistent difference found also between the Zhuangzi and Confucianism.

The Yellow Emperor sections of the Zhuangzi in Chs. 12, 13 and 14 contain five text blocks in which Laozi is portrayed in dialogue with Confucius and according to which he is pictured as Confucius’ master and teacher. These materials provide a direct access into the Daoist criticism of the Confucian project.

9. Daoism in the Han

The teachings that were later called Daoism were closely associated with a stream of thought called Huanglao Dao (Yellow Emperor-Laozi Dao) in the 3rd and 2nd centuries B.C.E. The thought world transmitted in this stream is what Sima Tan meant by Daojia. The Huanglao school is best understood as a lineage of Daoist practitioners mostly residing in the state of Qi (modern Shandong area). Huangdi was the name for the Yellow Emperor, from whom the rulers of Qi said they were descended. When Emperor Wu, the sixth sovereign of the Han dynasty (r. 140-87 B.C.E.), elevated Confucianism to the status of the official state ideology and training in it became mandatory for all bureaucratic officials, the tension with Daoism became more evident. And yet, at court, people still sought longevity and looked to Daoist masters for the secrets necessary for achieving it. Wu continued to engage in many Daoist practices, including the use of alchemy, climbing sacred Taishan (Mt. Tai), and presenting talismanic petitions to heaven. Liu An, the Prince of Huainan and a nephew of Wu, is associated with the production of the work called the Masters of Huainan (Huainanzi, 180-122 B.C.E.). This is a highly synthetic work formed at what is known as the Huainan academy and greatly influenced by Yellow Emperor Daoism. John Major and a team of translators published the first complete English version of this text (2010). The text was an attempt to merge cosmology, Confucian ideals, and a political theory using “quotes” attributed to the Yellow Emperor, although the statements actually parallel closely the Daodejing and the Zhuangzi. All this is of added significance because in the later Han work, Laozi bianhua jing (Book of the Transformations of Laozi), the Chinese physics that persons and objects change forms was employed in order to identify Laozi with the Yellow Emperor.

10. Celestial Masters Daoism

Even though Emperor Wu forced Daoist practitioners from court, Daoist teachings found a fertile ground in which to grow in the environment of discontent with the policies of the Han rulers and bureaucrats. Popular uprisings sprouted. The Yellow Turban movement tried to overthrow Han imperial authority in the name of the Yellow Emperor and promised to establish the Way of Great Peace (Tai ping). Indeed, the basic moral and philosophical text that provided the intellectual justification of this movement was the Classic of Great Peace (Taiping jing), provided in an English version by Barbara Hendrischke. The present version of this work in the Daoist canon is a later and altered iteration of the original text dating about 166 CE and attributed to transnormal revelations experienced by Zhang Jiao.

Easily the most important of the Daoist trends at the end of the Han period was the wudou mi dao (Way of Five Bushels of Rice) movement, best known as the Way of the Celestial Masters (tianshi dao). This movement is traceable to a Daoist hermit named Zhang Ling, also known as Zhang Daoling, who resided on a mountain near modern Chengdu in Sichuan. According to an account in Ge Hong’s Biographies of Spirit Immortals, Laozi appeared to Zhang (c. 142 CE) and gave him a commission to announce the imminent end of the world and the coming age of Great Peace (taiping). The revelation said that those who followed Zhang would become part of the Orthodox One Covenant with the Powers of the Universe (Zhengyi meng wei). Zhang began the movement that culminated in a Celestial Master state. The administrators of this state were called libationers (ji jiu), because they performed religious rites, as well as political duties. They taught that personal illness and civil mishap were owing to the mismanagement of the forces of the body and nature. The libationers taught a strict form of morality and displayed registers of numinal powers they could access and control. Libationers were moral investigators, standing in for a greater celestial bureaucracy. The Celestial Master state developed against the background of the decline of the later Han dynasty. Indeed, when the empire finally decayed, the Celestial Master government was the only order in much of southern China.

When the Wei dynastic rulers became uncomfortable with the Celestial Masters’ power, they broke up the power centers of the movement. But this backfired because it actually served to disperse Celestial Masters followers throughout China. Many of the refugees settled near Xi’an in and around the site of Louguan tai. The movement remained strong because its leaders had assembled a canon of texts [Statutory Texts of the One and Orthodox (Zhengyi fawen)]. This group of writings included philosophical, political, and ritual texts. It became a fundamental part of the later authorized Daoist canon.

11. Neo-Daoism

The resurgence of Daoism after the Han dynasty is often known as Neo-Daoism. As a result, Confucian scholars sought to annotate and reinterpret their own classical texts to move them toward greater compatibility with Daoism, and they even wrote commentaries on Daoist works. A new type of Confucianism known simply as the Way of Mysterious Learning (Xuanxue) emerged. It is represented by a set of scholars, including some of the most prominent thinkers of the period: Wang Bi (226-249), He Yan (d. 249), Xiang Xiu (223?-300), Guo Xiang (d. 312) and Pei Wei (267-300). In general, these scholars share an effort to reinterpret the social and moral understanding of Confucianism in ways that make it more compatible with Daoist philosophy. In fact, the extent to which Daoist influence is evident in the texts of these writers has led some scholars to call this movement ‘Neo-Daoism.’ Wang Bi and Guo Xiang, who wrote commentaries on the Daodejing and the Zhuangzi respectively, were the most important voices in this development. Traditionally, the famous “Seven Sages of the Bamboo Grove” (Zhulin qixian) have also been associated with the new Daoist way of life that expressed itself in culture and not merely in mountain retreats. These thinkers included landscape painters, calligraphers, poets, and musicians.

Among the philosophers of this period, the great representative of Daoism in southern China was Ge Hong (283-343 CE). He practiced not only philosophical reflection, but also external alchemy, manipulating mineral substances such as mercury and cinnabar in an effort to gain immortality. His work the Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is the most important Daoist philosophical work of this period. For him, longevity and immortality are not the same, the former is only the first step to the latter.

12. Shangqing and Lingbao Daoist Movements

After the invasion of China by nomads from Central Asia, Daoists of the Celestial Master tradition who had been living in the north were forced to migrate into southern China, where Ge Hong’s version of Daoism was strong. The mixture of these two traditions is represented in the writings of the Xu family. The Xu family was an aristocratic group from what is today the city of Nanjing. Seeking Daoist philosophical wisdom and the long life it promised, many of them moved to Mao Shan Mountain, near the city. There they claimed to receive revelations from immortals, who dictated new wisdom and morality texts to them. Yang Xi was the most prominent medium recipient of the Maoshan revelations (360-370 CE). These revelations came from spirits who were local heroes named the Mao brothers, but they had been transformed into deities. Yang Xi’s writings formed the basis for Highest Purity (Shangqing) Daoism. The writings were extraordinarily well done and even the calligraphy in which they were written was beautiful.

The importance of these texts philosophically speaking is to be found in their idealization of the quest for immortality and transference of the material practices of the alchemical science of Ge Hong into a form of reflective meditation. In fact, the Shangqing school of Daoism is the beginning of the tradition known as “inner alchemy” (neidan), an individual mystical pursuit of wisdom.

Some thirty years after the Maoshan revelations, a descendant of Ge Hong named Ge Chaofu went into a mediumistic trance and authored a set of texts called the Numinous Treasure (Lingbao) teachings. These works were ritual recitation texts similar to Buddhist sutras, and indeed they borrowed heavily from Buddhism. At first, the Shangqing and Lingbao texts belonged to the general stream of the Celestial Masters and were not considered separate sects or movements within Daoism, although later lineages of masters emphasized the uniqueness of their teachings.

13. Tang Daoism

As the Lingbao texts illustrate, Daoism acted as a receiving structure for Buddhism. Many early translators of Buddhist texts used Daoist terms to render Indian ideas. Some Buddhists saw Laozi as an avatar of Shakyamuni (the Buddha), and some Daoists understood Shakyamuni as a manifestation of the dao, which also means he was a manifestation of Laozi. An often made generalization is that Buddhism held north China in the 4th and 5th centuries, and Daoism the south. But gradually this intellectual currency actually reversed. Daoism grew in scope and impact throughout China.

By the time of the Tang dynasty (618-906 CE) Daoism was the intellectual philosophy that underwrote the national understanding. The imperial family claimed to descend from Li (by lore, the family of Laozi). Laozi was venerated by royal decree. Officials received Daoist initiation as Masters of its philosophy, rituals, and practices. A major center for Daoist studies was created at Dragon and Tiger Mountain (longhu shan), chosen both for its feng shui and because of its strategic location at the intersection of numerous southern China trade routes. The Celestial Masters who held leadership at Dragon and Tiger Mountain were later called “Daoist popes” by Christian missionaries because they had considerable political power.

In aesthetics, two great Daoist intellectuals worked during the Tang. Wu Daozi developed the rules for Daoist painting and Li Bai became its most famous poet. Interestingly, Daoist alchemists invented gunpowder during the Tang. The earliest block-print book on a scientific subject is a Daoist work entitled Xuanjie lu (850 CE). As Buddhism gradually grew stronger during the Tang, Daoist and Confucian intellectuals sought to initiate a conversation with it. The Buddhism that resulted was a reformed version known as Chan (Zen in Japan).

14. The Three Teachings

During the Five Dynasties (907-960 CE) and Song periods (960-1279 CE) Confucianism enjoyed a resurgence and Daoists found their place by teaching that principal thinkers of their tradition were Confucian scholars as well. Most notable among these was Lu Dongbin, a legendary Daoist immortal that many believed was originally a Confucian teacher.

Daoism became a complete philosophy of life, reaching into religion, social action, and individual health and physical well-being. A huge network of Daoist temples known by the name Dongyue Miao (also called tianqing guan) was created through the empire, with a miao in virtually every town of any size. The Daoist masters who served these temples were often appointed as government officials. They also gave medical, moral, and philosophical advice, and led religious rituals, dedicated especially to the Lord of the Sacred Mountain of the East named Taishan. Daoist masters had wide authority. All this was obvious in the temple iconography. Taishan was represented as the emperor, the City God (cheng huang) was a high official, and the Earth God was portrayed as a prosperous peasant. Daoism of this period integrated the Three Teachings (sanjiao) of China: Confucianism, Buddhism, and Daoism. This process of synthesis continued throughout the Song and into the period of the Ming Dynasty.

Such a wide dispersal of Daoist thought and practice, taken together with its interest in merging Confucianism and Buddhism, eventually created a fragmented ideology. Into this confusion came Wang Zhe (1113-1170 CE), the founder of Quanzhen (Complete Perfection) Daoism. It was Wang’s goal to bring the three teachings into a single great synthesis. For the first time, Daoist teachers adopted monastic forms of life, created monasteries, and organized themselves in ways they saw in Buddhism. This version of Daoist thought interpreted the classical texts of the DDJ and the Zhuangzi to call for a rejection of the body and material world. The Quanzhen order became powerful as the main partner of the Mongols (Yuan dynasty), who gave their patronage to its expansion. Less frequently, the Mongol emperors favored the Celestial Masters and their leader at Dragon and Tiger Mountain in an effort to undermine the power of the Quanzhen leaders. For example, the Zhengyi (Celestial Master) master of Beijing in the 1220s was Zhang Liusun. Under patronage he was allowed to build a Dongyue Miao in the city in 1223 and make it the unofficial town hall of the capital. But by the time of Khubilai Khan (r. 1260-1294) the Buddhists were used against all Daoists. The Khan ordered all Daoist books except the DDJ to be destroyed in 1281, and he closed the Quanzhen monastery in the city known as White Cloud Monastery (Baiyun Guan).

When the Ming (1368-1644) dynasty emerged, the Mongols were expelled, and Chinese rule was restored. The emperors sponsored the creation of the first complete Daoist Canon (Daozang), which was edited between 1408 and 1445. This was an eclectic collection, including many Buddhist and Confucian related texts. Daoist influence reached its zenith.

15. The “Destruction” of Daoism

The Manchurian tribes that became rulers of China in 1644 and founded the Qing dynasty were already under the influence of conservative Confucian exiles. They stripped the Celestial Master of Dragon Tiger Mountain of his power at court. Only Quanzhen was tolerated. White Cloud Monastery (Baiyun Guan) was reopened, and a new lineage of thinkers was organized. They called themselves the Dragon Gate lineage (Longmen pai). In the 1780s, Western traders arrived, and so did Christian missionaries. In 1849, the Hakka people of Guangxi province, among China’s poorest citizens, rose in revolt. They followed Hong Xiuquan, who claimed to be Jesus’ younger brother. This millennial movement, built on a strange version of Chinese Christianity, sought to establish the Heavenly Kingdom of Peace (taiping). As the Taiping swept throughout southern China, they destroyed Buddhist and Daoist temples and texts wherever they found them. The Taiping army completely razed the Daoist complexes on Dragon Tiger Mountain. During most of the 20th century the drive to eradicate Daoist influence continued. In the 1920s, the “New Life” movement drafted students to go out on Sundays to destroy Daoist statues and texts. Accordingly, by the year 1926 only two copies of the Daoist Canon (Daozang) existed and Daoist philosophical heritage was in great jeopardy. But permission was granted to copy the canon kept at the White Cloud Monastery, and so the texts were preserved for the world. There are 1120 titles in this collection in 5,305 volumes. Much of this material has yet to receive scholarly attention and very little of it has been translated into any Western language.

The Cultural Revolution (1966-1976) attempted to complete the destruction of Daoism. Masters were killed or “re-educated.” Entire lineages were broken up and their texts were destroyed. The miaos were closed, burned, and turned into military barracks. At one time, there were 300 Daoist sites in Beijing alone, now there are only a handful. However, Daoism is not dead. It survives as a vibrant philosophical system and way of life as is evidenced by the revival of its practice and study in several new University institutes in the People’s Republic.

16. References and Further Reading

Ames, Roger and Hall, David. (2003). Daodejing: “Making This Life Significant” A Philosophical Translation. New York: Ballantine Books.
Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
Boltz, Judith M. (1987). A Survey of Taoist Literature: Tenth to Seventeenth Centuries, China Research Monograph 32. Berkeley: University of California Press.
Chan, Alan. (1991). Two Visions of the Way: A Translation and Study of the Heshanggong and Wang Bi Commentaries on the Laozi. Albany: State University of New York Press.
Cook, Scott (2013). The Bamboo Texts of the Guodian: A Study & Complete Translation. New York: Cornell University East Asia Program.
Coutinho, Steve (2014). An Introduction to Daoist Philosophies. New York: Columbia University Press.
Creel, Herrlee G. (1970). What is Taoism? Chicago: University of Chicago Press.
Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York.
Girardot, Norman J. (1983). Myth and Meaning in Early Taoism: The Theme of Chaos (hun-tun). Berkeley: University of California Press.
Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
Graham, Angus. (1979). “How much of the Chuang-tzu Did Chuang-tzu Write?” Journal of the American Academy of Religion, Vol. 47, No. 3.
Hansen, Chad (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
Hendrischke, Barbara (2015, reprint ed.). The Scripture on Great Peace: The Taiping jing and the Beginnings of Daoism. Berkeley: The University of California Press.
Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
Hochsmann, Hyun and Yang Guorong, trans. (2007). Zhuangzi. New York: Pearson.
Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
Kjellberg, Paul and Ivanhoe, Philip J., eds. (1996). Essays on Skepticism, Relativism, and Ethics in the Zhuangzi. Albany: State University of New York.
Kleeman, Terry (1998). Great Perfection: Religion and Ethnicity in a Chinese Millennial Kingdom. Honolulu: University of Hawaii Press.
Kohn, Livia, ed. (2004). Daoism Handbook, 2 vols. Boston: Brill.
Kohn, Livia (2009). Introducing Daoism. London: Routledge.
Kohn, Livia (2014). Zhuangzi: Text and Context. St. Petersburg: Three Pines Press.
Kohn, Livia and LaFargue, Michael., eds. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
Kohn, Livia and Roth, Harold., eds. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
Komjathy, Louis (2014). Daoism: A Guide for the Perplexed. London: Bloomsbury.
LaFargue, Michael. (1992). The Tao of the Tao-te-ching. Albany: State University of New York Press.
Lin, Paul J. (1977). A Translation of Lao-tzu’s Tao-te-ching and Wang Pi’s Commentary. Ann Arbor: University of Michigan.
Lau, D.C. (1982). Chinese Classics: Tao Te Ching. Hong Kong: Hong Kong University Press.
Littlejohn, Ronnie (2010). Daoism: An Introduction. London: I.B. Tauris.
Littlejohn, Ronnie (2011). “The Liezi’s Use of the Lost Zhuangzi.” Riding the Wind with Liezi: New Perspectives on the Daoist Classic. Eds. Ronnie Littlejohn and Jeffrey Dippmann. Albany: State University of New York.
Lynn, Richard John. (1999). The Classic of the Way and Virtue: A New Translation of the Tao-Te Ching of Laozi as Interpreted by Wang Bi. New York: Columbia University Press.
Mair, Victor, ed. (2010). Experimental Essays on Zhuangzi. St. Petersburg: Three Pines Press. New edition of University of Hawai’i, 1983.
Mair, Victor. (1990). Tao Te Ching: The Classic Book of Integrity and the Way. New York: Bantam Press.
Mair, Victor (1994). Wandering on the Way: Early Taoist Tales and Parables of Chuang Tzu. Honolulu: University of Hawai’i Press.
Major, John, Queen, Sarah, Seth Meyer, Andrew, and Roth, Harold, trans. (2010). The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press.
Maspero, Henri. (1981). Taoism and Chinese Religion. Amherst: University of Massachusetts Press.
Miller, James (2003). Daoism: A Short Introduction. Oxford: Oxford University Press.
Moeller, Hans-Georg (2004). Daoism Explained: From the Dream of the Butterfly to the Fishnet Allegory. Chicago: Open Court.
Robinet, Isabelle. (1997). Taoism: Growth of a Religion. Stanford: Stanford University Press.
Roth, Harold (1999). Original Tao: Inward Training (Nei-yeh) and the Foundations of Taoist Mysticism. New York: Columbia University Press.
Roth, Harold D. (1992). The Textual History of the Huai Nanzi. Ann Arbor: Association of Asian Studies.
Roth, Harold D. (1991). “Who Compiled the Chuang Tzu?” In Chinese Texts and Philosophical Contexts, ed. Henry Rosemont, 84-95. La Salle: Open Court.
Schipper, Kristofer. (1993). The Taoist Body. Berkeley: University of California Press.
Slingerland, Edward. (2003). Effortless Action: Wu-Wei As Conceptual Metaphor and Spiritual Ideal in Early China. New York: Oxford University Press.
Waley, Arthur (1934). The Way and Its Power: A Study of the Tao Te Ching and its Place in Chinese Thought. London: Allen & Unwin.
Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Slavoj Žižek (1949 —)

Slavoj Žižek is a Slovenian-born political philosopher and cultural critic. He was described by British literary theorist, Terry Eagleton, as the “most formidably brilliant” recent theorist to have emerged from Continental Europe.

Žižek’s work is infamously idiosyncratic. It features striking dialectical reversals of received common sense; a ubiquitous sense of humor; a patented disrespect towards the modern distinction between high and low culture; and the examination of examples taken from the most diverse cultural and political fields. Yet Žižek’s work, as he warns us, has a very serious philosophical content and intention. He challenges many of the founding assumptions of today’s left-liberal academy, including the elevation of difference or otherness to ends in themselves, the reading of the Western Enlightenment as implicitly totalitarian, and the pervasive skepticism towards any context-transcendent notions of truth or the good.

One feature of Žižek’s work is its singular philosophical and political reconsideration of German idealism (Kant, Schelling and Hegel). Žižek has also reinvigorated the challenging psychoanalytic theory of Jacques Lacan, controversially reading him as a thinker who carries forward founding modernist commitments to the Cartesian subject and the liberating potential of self-reflective agency, if not self-transparency. Žižek’s works since 1997 have become more and more explicitly political, contesting the widespread consensus that we live in a post-ideological or post-political world, and defending the possibility of lasting changes to the new world order of globalization, the end of history, or the war on terror.

This article explains Žižek’s philosophy as a systematic, if unusually presented, whole; and it clarifies the technical language Žižek uses, which he takes from Lacanian psychoanalysis, Marxism, and German idealism. In line with how Žižek presents his own work, this article starts by examining Žižek’s descriptive political philosophy. It then examines the Lacanian-Hegelian ontology that underlies Žižek’s political philosophy. The final part addresses Žižek’s practical philosophy, and the ethical philosophy he draws from this ontology.

Biography
Žižek’s Political Philosophy
Criticism of Ideology as “False Consciousness”
Ideological Cynicism and Belief
Jouissance as Political Factor
The Reflective Logic of Ideological Judgments (or How the King is King)
Sublime Objects of Ideology
Žižek’s Fundamental Ontology
The Fundamental Fantasy & the Split Law
Excursus: Žižek’s Typology of Ideological Regimes
Kettle Logic, or Desire and Theodicy
Fantasy as the Fantasy of Origins
Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)
From Ontology to Ethics—Žižek’s Reclaiming of the Subject
Žižek’s Subject, Fantasy, and the Objet Petit a
The Objet Petit a & the Virtuality of Reality
Forced Choice & Ideological Tautologies
The Substance is Subject, the Other Does Not Exist
The Ethical Act Traversing the Fantasy
Conclusion
References and Further Reading
1. Primary Literature (Books by Žižek)
2. Secondary Literature (Texts on Žižek)

1. Biography

Slavoj Žižek was born in 1949 in Ljubljana, Slovenia. He grew up in the comparative cultural freedom of the former Yugoslavia’s self-managing socialism. Here, significantly for his work, Žižek was exposed to the films, popular culture and theory of the noncommunist West. Žižek completed his PhD at Ljubljana in 1981 on German Idealism, and between 1981 and 1985 studied in Paris under Jacques-Alain Miller, Lacan’s son-in-law. In this period, Žižek wrote a second dissertation, a Lacanian reading of Hegel, Marx and Kripke. In the late 1980s, Žižek returned to Slovenia where he wrote newspaper columns for the Slovenian weekly “Mladina,” and cofounded the Slovenian Liberal Democratic Party. In 1990, he ran for a seat on the four-member collective Slovenian presidency, narrowly missing office. Žižek’s first published book in English, The Sublime Object of Ideology, appeared in 1989. Since then, Žižek has published over a dozen books, edited several collections, published numerous philosophical and political articles, and maintained a tireless speaking schedule. His earlier works are of the type “Introductions to Lacan through popular culture / Hitchcock / Hollywood …” Since at least 1997, however, Žižek’s work has taken on an increasingly engaged political tenor, culminating in books on September 11 and the Iraq war. As well as being visiting professor at the Department of Psychoanalysis, Université Paris VIII in 1982-3 and 1985-6, Žižek has lectured at the Cardozo Law School, Columbia, Princeton, the New School for Social Research, the University of Michigan, Ann Arbor, and Georgetown. He is currently a returning faculty member of the European Graduate School, and founder and president of the Society for Theoretical Psychoanalysis, Ljubljana.

2. Žižek’s Political Philosophy

a. Criticism of Ideology as “False Consciousness”

In a way that is oddly reminiscent of Nietzsche, Žižek generally presents his work in a polemical fashion, knowingly striking out against the grain of accepted opinion. One untimely feature of Žižek’s work is his continuing defense and use of the unfashionable term “ideology.” According to the classical Marxist definition, ideologies are discourses that promote false ideas (or “false consciousness”) in subjects about the political regimes they live in. Nevertheless, because these ideas are believed by the subjects to be true, they assist in the reproduction of the existing status quo, in an exact instance of what Umberto Eco dubs “the force of the fake.” To critique ideology, according to this position, it is sufficient to unearth the truth(s) the ideologies conceal from the subject’s knowledge. Then, so the theory runs, subjects will become aware of the political shortcomings of their current regimes, and be able and moved to better them. As Žižek takes up in his earlier works, this classical Marxian notion of ideology has come under theoretical attack in a number of ways. First, to criticize a discourse as ideological implies access to a Truth about political things, the Truth that the ideologies, as false, would conceal. But it has been widely disputed in the humanities that there could ever be any One such theoretically accessible Truth. Secondly, the notion of ideology is held to be irrelevant to describe contemporary sociopolitical life, because of the increased importance of what Jürgen Habermas calls “media-steered subsystems” (the market, public and private bureaucracies), and also because of the widespread cynicism of today’s subjects towards political authorities. For ideologies to have political importance, critics comment, subjects would have to have a level of faith in public institutions, ideals and politicians which today’s liberal-cosmopolitan subjects lack.
The widespread notoriety of left-leaning authors like Michael Moore or Noam Chomsky, as one example, bears witness to how subjects today can know very well what Moore claims is the “awful truth,” and yet act as if they did not know.

Žižek agrees with critics about this “false consciousness” model of ideology. Yet he insists that we are not living in a post-ideological world, as figures as different as Tony Blair, Daniel Bell or Richard Rorty have claimed. Žižek proposes instead that in order to understand today’s politics we need a different notion of ideology. In a typically bold reversal, Žižek’s position is that today’s widespread consensus that our world is post-ideological gives voice to what he calls the “arch-ideological” fantasy. Because “ideology” has carried a pejorative sense since Marx, no one who is taken in by such an ideology has ever believed that they were so duped, Žižek comments. If the term “ideology” has any meaning at all, ideological positions are always what people impute to Others (for today’s left, for example, the political right are the dupes of one or another noble lie about natural community; for the right, the left are the dupes of well-meaning but utopian egalitarianism bound to lead to economic and moral collapse, and so forth). For subjects to believe in an ideology, it must have been presented to them, and been accepted, as non-ideological: indeed, as True and Right, and what anyone sensible would believe. As we shall see in 2e, Žižek is alert to the realist insight that there is no more effective political gesture than to declare some contestable matter above political contestation. Just as the third way is said to be post-ideological or national security is claimed to be extra-political, so Žižek argues that ideologies are always presented by their proponents as being discourses about Things too sacred to profane by politics. Hence, Žižek’s bold opening in The Sublime Object of Ideology is to claim that today ideology has not so much disappeared from the political landscape as come into its own. It is exactly because of this success, Žižek argues, that ideology has also been able to be dismissed in accepted political and theoretical opinion.

b. Ideological Cynicism and Belief

Today’s typical first world subjects, according to Žižek, are the dupes of what he calls “ideological cynicism.” Drawing on the German political theorist Sloterdijk, Žižek contends that the formula describing the operation of ideology today is not “they do not know it, but they are doing it”, as it was for Marx. It is “they know it, but they are doing it anyway.” If this looks like nonsense from the classical Marxist perspective, Žižek’s position is that nevertheless this cynicism indicates the deeper efficacy of political ideology per se. Ideologies, as political discourses, are there to secure the voluntary consent (or what La Boétie called servitude volontaire) of people about contestable political policies or arrangements. Yet, Žižek argues, subjects will only voluntarily agree to follow one or other such arrangement if they believe that, in doing so, they are expressing their free subjectivity, and might have done otherwise.

However false such a sense of freedom is, Žižek insists that it is nevertheless a political instance of what Hegel called an essential appearance. Althusser’s understanding of ideological identification suggests that an individual is wholly “interpellated” into a place within a political system by the system’s dominant ideology and ideological state apparatuses. Contesting this notion by drawing on Lacanian psychoanalysis, however, Žižek argues that it is a mistake to think that, for a political position to win people’s support, it needs to effectively brainwash them into thoughtless automatons. Rather, Žižek maintains that any successful political ideology always allows subjects to have and to cherish a conscious distance towards its explicit ideals and prescriptions—or what he calls, in a further technical term, “ideological disidentification.”

Again bringing the psychoanalytic theory of Lacan to bear in political theory, Žižek argues that the attitude of subjects towards authority revealed by today’s ideological cynicism resembles the fetishist’s attitude towards his fetish. The fetishist’s attitude towards his fetish has the peculiar form of a disavowal: “I know well that (for example) the shoe is only a shoe, but nevertheless, I still need my partner to wear the shoe in order to enjoy.” According to Žižek, the attitude of political subjects towards political authority evinces the same logical form: “I know well that (for example) Bob Hawke / Bill Clinton / the Party / the market does not always act justly, but I still act as though I did not know that this is the case.” In Althusser’s famous “Ideology and Ideological State Apparatuses,” Althusser staged a kind of primal scene of ideology, the moment when a policeman (as bearer of authority) says “hey you!” to an individual, and the individual recognizes himself as the addressee of this call. In the “180 degree turn” of the individual towards this Other who has addressed him, the individual becomes a political subject, Althusser says. Žižek’s central technical notion of the “big Other” [grand Autre] closely resembles (to the extent that it is not modelled on) Althusser’s notion of the Subject (capital “S”) in the name of which public authorities (like the police) can legitimately call subjects to account within a regime—for example, “God” in a theocracy, “the Party” under Stalinism, or “the People” in today’s China.
As the central chapter of The Sublime Object of Ideology specifies, ideologies for Žižek work to identify individuals with such important or rallying political terms as these, which Žižek calls “master signifiers.” The strange but decisive thing about these pivotal political words, according to Žižek, is that no one knows exactly what they mean or refer to, or has ever seen with their own eyes the sacred objects which they seem to name (for example: God, the Nation, or the People). This is one reason why Žižek, in the technical language he inherits (via Lacan) from structuralism, says that the most important words in any political doctrine are “signifiers without a signified” (that is, words that do not refer to any clear and distinct concept or demonstrable object).

This claim of Žižek’s is connected to two other central ideas in his work:

First: Žižek adapts the psychoanalytic notion that individuals are always “split” subjects, divided between the levels of their conscious awareness and the unconscious. Žižek contends throughout his work that subjects are always divided between what they consciously know and can say about political things, and a set of more or less unconscious beliefs they hold concerning individuals in authority, and the regime in which they live (see 3a). Even if people cannot say clearly and distinctly why they support some political leader or policy, for Žižek no less than for Edmund Burke, this fact is not politically decisive, as we will see in 2e below.
Second: Žižek makes a crucial distinction between knowledge and belief. Exactly where and because subjects do not know, for example, what “the essence” of “their people” is, the scope and nature of their beliefs on such matters is politically decisive, according to Žižek (again, see 2e below).

Žižek’s understanding of political belief is modelled on Lacan’s understanding of transference in psychoanalysis. The belief or “supposition” of the analysand in psychoanalysis is that the Other (his analyst) knows the meaning of his symptoms. This is obviously a false belief, at the start of the analytic process. But it is only through holding this false belief about the analyst that the work of analysis can proceed, and the transferential belief can become true (when the analyst does become able to interpret the symptoms). Žižek argues that this strange intersubjective or dialectical logic of belief in clinical psychoanalysis is also what characterizes people’s political beliefs. Belief is always “belief through the Other,” Žižek argues. If subjects do not know the exact meaning of those “master signifiers” with which they politically identify, this is because their political belief is mediated through their identifications with others. Although they each themselves “do not know what they do” (which is the title of one of Žižek’s books [Žižek, 2002]), the deepest level of their belief is maintained through the belief that nevertheless there are Others who do know. A number of features of political life are cast into new relief given this psychoanalytic understanding, Žižek claims:

First, Žižek contends that the key political function of holders of public office is to occupy the place of what he calls, after Lacan, “the Other supposed to know.” Žižek cites the example of priests reciting mass in Latin before an uncomprehending laity, who believe that the priests know the meaning of the words, and for whom this is sufficient to keep the faith. Far from presenting an exception to the way political authority works, for Žižek this scenario reveals the universal rule of how political consensus is formed.
Second, and in connection with this, Žižek contends that political power is primarily “symbolic” in its nature. What he means by this further technical term is that the roles, masks, or mandates that public authorities bear are more important politically than the true “reality” of the individuals in question (whether they are unintelligent, unfaithful to their wives, good family women, and so forth). According to Žižek, for example, fashionable liberal criticisms of George W. Bush the man are irrelevant to understanding or evaluating his political power. It is the office or place an individual occupies in their political system (or “big Other”) that ensures the political force of their words, and the belief of subjects in their authority. This is why Žižek maintains that the resort of a political leader or regime to “the real of violence” (such as war or police action) amounts to a confession of its weakness as a political regime. Žižek sometimes puts this thought by saying that people believe through the big Other, or that the big Other believes for them, despite what they might inwardly think or cynically say.

c. Jouissance as Political Factor

A further key point that Žižek takes from Louis Althusser’s later work on ideology is Althusser’s emphasis on the “materiality” of ideology, its embodiment in institutions and people’s everyday practices and lives. Žižek’s realist position is that all the ideas in the world can have no lasting political effect unless they come to inform institutions and subjects’ day-to-day lives. In The Sublime Object of Ideology, Žižek cites Blaise Pascal’s advice that doubting subjects should get down on their knees and pray, and then they will believe. Pascal’s position is not any kind of simple proto-behaviorism, according to Žižek. The deeper message of Pascal’s directive, he asserts, is to suggest that once subjects have come to believe through praying, they will also retrospectively see that they got down on their knees because they always believed, without knowing it. In this way, in fact, Žižek can be read as a consistent critic not only of the importance of knowledge in the formation of political consensus, but also of the importance of “inwardness” in politics per se, in the tradition of the younger Carl Schmitt.

Prior political philosophy has placed too little emphasis, Žižek asserts, on communities’ cultural practices that involve what he calls “inherent transgression.” These are practices sanctioned by a culture that nevertheless allow subjects some experience of what is usually exceptional to or prohibited in their everyday lives as civilized political subjects—things like sex, death, defecation, or violence. Such experiences involve what Žižek calls jouissance, another technical term he takes from Lacanian psychoanalysis. Jouissance is usually translated from the French as “enjoyment.” As opposed to what we talk of in English as “pleasure”, though, jouissance is an always sexualized, always transgressive enjoyment, at the limits of what subjects can experience or talk about in public. Žižek argues that subjects’ experiences of the events and practices wherein their political culture organizes its specific relations to jouissance (in first world nations, for example, specific sports, types of alcohol or drugs, music, festivals, films) are as close as they will get to knowing the deeper Truth intimated for them by their regime’s master signifiers: “nation”, “God”, “our way of life,” and so forth (see 2b above). Žižek, like Burke, argues that it is such ostensibly nonpolitical and culturally specific practices as these that irreplaceably single out any political community from its others and enemies. Or, as one of Žižek’s chapter titles in Tarrying With the Negative puts it, where and although subjects do not know their Nation, they “enjoy (jouis) their nation as themselves.”

d. The Reflective Logic of Ideological Judgments (or How the King is King)

According to Žižek, like and after Althusser, ideologies are thus political discourses whose primary function is not to make correct theoretical statements about political reality (as Marx’s “false consciousness” model implies), but to orient subjects’ lived relations to and within this reality. If a political ideology’s descriptive propositions turn out to be true (for example: “capitalism exploits the workers,” “Saddam was a dictator,” “the Spanish are the national enemy,” and so forth), this does not in any way reduce their ideological character, in Žižek’s estimation. This is because this character concerns the political issue of how subjects’ belief in these propositions, instead of those of opponents, positions subjects on the leading political issues of the day. For Žižek, political speech is primarily about securing a lived sense of unity or community between subjects, something like what Kant called sensus communis or Rousseau the general will. If political propositions seemingly do describe things in the world, Žižek’s position is that we nevertheless need always to understand them as Marx understood the exchange value of commodities—as “a relation between people being concealed behind a relation between things.” Or again: just as Kant thought that the proposition “this is beautiful” really expresses a subject’s reflective sense of commonality with all other subjects capable of being similarly affected by the object, so Žižek argues that propositions like “Go Spain!” or “the King will never stop working to secure our future” are what Kant called reflective judgments, which tell us as much or more about the subject’s lived relation to political reality as about this reality itself.

If ideological statements are thus performative utterances that produce political effects by their being stated, Žižek in fact holds that they are a strange species of performative utterance overlooked by speech act theory. Just because, when subjects say “the Queen is the Queen!” they are at one level reaffirming their allegiance to a political regime, Žižek at the same time holds that this does not mean that this regime could survive without appearing to rest on such deeper Truths about the way the world is. As we saw in 2b, Žižek maintains that political ideologies always present themselves as naming such deeper, extra-political Truths. Ideological judgments, according to Žižek, are thus performative utterances which, in order to perform their salutary political work, must yet appear to be objective descriptions of the way the world is (exactly as when a chairman says “this meeting is closed!” only thereby bringing this state of affairs into effect). In The Sublime Object of Ideology, Žižek cites Marx’s analysis of being a King in Das Kapital to illustrate his meaning. A King is only King because his subjects loyally think and act like he is King (think of the tragedy of Lear). Yet, at the same time, the people will only believe he is King if they believe that this is a deeper Truth about which they can do nothing.

e. Sublime Objects of Ideology

In line with Žižek’s ideas of “ideological disidentification” and “jouissance as a political factor” (see 2b and 2c above) and in a clear comparison with Derrida’s deconstruction, arguably the unifying thought in Žižek’s political philosophy is that regimes can only secure a sense of collective identity if their governing ideologies afford subjects an understanding of how their regime relates to what exceeds, supplements or challenges its identity. This is why Kant’s analytic of the sublime in The Critique of Judgment, as an analysis of an experience in which the subject’s identity is challenged, is of the highest theoretical interest for Žižek. Kant’s analytic of the sublime isolates two moments to its experience, as Žižek observes. In the first moment, the size or force of an object painfully impresses upon the subject the limitation of its perceptual capabilities. In a second moment, however, a “representation” arises where “we would least expect it,” which takes as its object the subject’s own failure to perceptually take the object in. This representation resignifies the subject’s perceptual failure as indirect testimony about the inadequacy of human perception as such to attain to what Kant calls Ideas of Reason (in Kant’s system, God, the Universe as a Whole, Freedom, the Good).

According to Žižek, all successful political ideologies necessarily refer to and turn around sublime objects posited by political ideologies. These sublime objects are what political subjects take it that their regime’s ideologies’ central words mean or name: extraordinary Things like God, the Führer, the King, in whose name they will (if necessary) transgress ordinary moral laws and lay down their lives. When a subject believes in a political ideology, as we saw in 2b above, Žižek argues that this does not mean that they know the Truth about the objects which its key terms seemingly name—indeed, Žižek will finally contest that such a Truth exists (see 3c, d). Nevertheless, by drawing on a parallel with Kant on the sublime, Žižek makes a further and more radical point. Just as in the experience of the sublime, Kant’s subject resignifies its failure to grasp the sublime object as indirect testimony to a wholly “supersensible” faculty within herself (Reason), so Žižek argues that the inability of subjects to explain the nature of what they believe in politically does not indicate any disloyalty or abnormality. What political ideologies do, precisely, is provide subjects with a way of seeing the world according to which such an inability can appear as testimony to how Transcendent or Great their Nation, God, Freedom, and so forth is—surely far above the ordinary or profane things of the world. In Žižek’s Lacanian terms, these things are Real (capital “R”) Things (capital “T”), precisely insofar as they in this way stand out from the reality of ordinary things and events.

In the struggle of competing political ideologies, Žižek hence agrees with Ernesto Laclau and Chantal Mouffe, the aim of each is to elevate their particular political perspective (about what is just, best, and so forth) to the point where it can lay claim to name, give voice to or to represent the political whole (for example: the nation). In order to achieve this political feat, Žižek argues, each group must succeed in identifying its perspective with the extra-political, sublime objects accepted within the culture as giving body to this whole (for example: “the national interest,” “the dictatorship of the proletariat”). Or else, it must supplant the previous ideologies’ sublime objects with new such objects. In the absolute monarchies, as Ernst Kantorowicz argued, the King’s so-called “second” or “symbolic” body exemplified paradigmatically such sublime political objects as the unquestionable font of political authority (the particular individual who was King was contestable, but not the sovereign’s role itself). Žižek’s critique of Stalinism, in a comparable way, turns upon the thought that “the Party” had this sublime political status in Stalinist ideology. Class struggle in this society did not end, Žižek contends, despite Stalinist propaganda. It was only displaced from a struggle between two classes (for example, bourgeois versus proletarian) to one between “the Party” as representative of the people or the whole and all who disagreed with it, ideologically positioned as “traitors” or “enemies of the people.”

3. Žižek’s Fundamental Ontology

a. The Fundamental Fantasy & the Split Law

For Žižek, as we have seen, no political regime can sustain the political consensus upon which it depends, unless its predominant ideology affords subjects a sense both of individual distance or freedom with regard to its explicit prescriptions (2b), and that the regime is grounded in some larger or “sublime” Truth (2e). Žižek’s political philosophy identifies interconnected instances of these dialectical ideas: his notion of “ideological disidentification” (2b); his contention that ideologies must accommodate subjects’ transgressive experiences of jouissance (2c); and his conception of exceptional or sublime objects of ideology (2e). Arguably the central notion in Žižek’s political philosophy intersects with Žižek’s notion of “ideological fantasy”. “Ideological fantasy” is Žižek’s technical name for the deepest framework of belief that structures how political subjects, and/or a political community, comes to terms with what exceeds its norms and boundaries, in the various registers we examined above.

Like many of Žižek’s key notions, Žižek’s notion of the ideological fantasy is a political adaptation of an idea from Lacanian psychoanalysis: specifically, Lacan’s structuralist rereading of Freud’s psychoanalytic understanding of unconscious fantasy. As for Lacan, so for Žižek, the civilizing of subjects necessitates their founding sacrifice (or “castration”) of jouissance, enacted in the name of sociopolitical Law. Subjects, to the extent that they are civilized, are “cut” from the primal object of their desire. Instead, they are forced by social Law to pursue this special, lost Thing (in Žižek’s technical term, the “objet petit a”; see 4a, 4b) by observing their societies’ linguistically mediated conventions, deferring satisfaction, and accepting sexual and generational difference. Subjects’ “fundamental fantasies,” according to Lacan, are unconscious structures which allow them to accept the traumatic loss involved in this founding sacrifice. They turn around a narrative about the lost object, and how it was lost (see 3d). In particular, the fundamental fantasy of a subject resignifies the founding repression of jouissance by Law—which, according to Lacan, is necessary if the individual is to become a speaking subject—as if it were a merely contingent, avoidable occurrence. In the fantasy, that is, what for Žižek is a constitutive event for the subject, is renarrated as the historical action of some exceptional individual (in Enjoy Your Symptom! the pre-Oedipal “anal father”). Equally, the jouissance the subject considers itself to have lost is posited by the fantasy as having been taken from it by this persecutory “Other supposed to enjoy” (see 3b).

In the notion of ideological fantasy, Žižek takes this psychoanalytic framework and applies it to the understanding of the constitution of political groups. If after Plato, political theory concerns the Laws of a regime, the Laws for Žižek are always split or double in kind. Each political regime has a body of more or less explicit, usually written Laws which demand that subjects forego jouissance in the name of the greater good, and according to the letter of its proscriptions (for example, the US or French constitutions). Žižek identifies this level of the Law with the Freudian ego ideal. But Žižek argues that, in order to be effective, a regime’s explicit Laws must also harbor and conceal a darker underside, a set of more or less unspoken rules which, far from simply repressing jouissance, implicate subjects in a guilty enjoyment in repression itself, which Žižek likens to the “pleasure-in-pain” associated with the experience of Kant’s sublime (see 2e). The Freudian superego, for Žižek, names the psychical agency of the Law, as it is misrepresented and sustained by subjects’ fantasmatic imaginings of a persecutory Other supposed to enjoy (like the archetypal villain in noir films). This darker underside of the Law, Žižek agrees with Lacan, is at its base a constant imperative to subjects to jouis!, by engaging in the “inherent transgressions” of their sociopolitical community (see 2c).

Žižek’s notion of the split in the Law in this way intersects directly with his notion of ideological disidentification examined in 2b. While political subjects maintain a conscious sense of freedom from the explicit norms of their culture, Žižek contends, this disidentification is grounded in their unconscious attachment to the Law as superego, itself an agency of enjoyment. If Althusser famously denied the importance of what people “have on their consciences” in the explanation of how political ideologies work, then for Žižek the role of guilt—as the way in which the subject enjoys his subjection to the laws—is vital to understanding subjects’ political commitments. Individuals will only turn around when the Law hails them, Žižek argues, insofar as they are finally subjects also of the unconscious belief that the “big Other” has access to the jouissance they have lost as subjects of the Law, and which they can accordingly reattain through their political allegiance (see 2b). It is this belief, what could be termed this “political economy of jouissance,” that the fundamental fantasies underlying political regimes’ worldviews are there to structure in subjects.

b. Excursus: Žižek’s Typology of Ideological Regimes

With these terms of Žižek’s Lacanian ontology in place, it becomes possible to lay out Žižek’s theoretical understanding of the differences between different types of ideological-political regimes. Žižek’s works maintain a lasting distinction between modern and premodern political regimes, which he contends are grounded in fundamentally different ways of organizing subjects’ relations to Law and jouissance (3a). In Žižek’s Lacanian terms, premodern ideological regimes exemplified what Lacan calls in Seminar XVII the discourse of the master. In these authoritarian regimes, the word and will of the King or master (in Žižek’s mathemes, S1) was sovereign—the source of political authority, with no questions asked. Her/His subjects, in turn, are supposed to know (S2) the edicts of the sovereign and the Law (as the classical legal notion has it, “ignorance is no excuse”). In this arrangement, while jouissance and fantasy are political factors, as Žižek argues, regimes’ quasi-transgressive practices remain exceptional to the political arena, glimpsed only in such carnivalesque events as festivals or the types of public punishment Michel Foucault (for example) describes in the introduction to Discipline and Punish.

Žižek agrees with both Foucault and Marx that modern political regimes exert a form of power that is both less visible and more far-reaching than that of the regimes they replaced. Modern regimes, both liberal capitalist and totalitarian, for Žižek, are no longer predominantly characterized by the Lacanian discourse of the master. Given that the Oedipal complex is associated by him with this older type of political authority, Žižek agrees with the Frankfurt School theorists that, contra Deleuze and Guattari, today’s subjectivity as such is already post- or anti-Oedipal. Indeed, in Plague of Fantasies and The Ticklish Subject, Žižek contends that the characteristic discontents of today’s political world—from religious fundamentalism to the resurgence of racism in the first world—are not archaic remnants of, or protests against, traditional authoritarian structures, but the pathological effects of new forms of social organization. For Žižek, the defining agency in modern political regimes is knowledge (or, in his Lacanian mathemes, S2). The Enlightenment represented the unprecedented political venture to replace belief in authority as the basis of polity with human reason and knowledge. As Schmitt also complained, the legitimacy of modern authorities is grounded not in the self-grounding decision of the sovereign. It is grounded in the ability of authorities to muster coherent chains of reasons to subjects about why they are fit to govern. Modern regimes hence always claim to speak not out of ignorance of what subjects deeply enjoy (“I don’t care what you want; just do what I say!”) but in the very name of subjects’ freedom and enjoyment.

Whether fascist or communist, Žižek argues in his early books, totalitarian (as opposed to authoritarian) regimes justified their rule by final reference to quasi-scientific metanarratives. These metanarratives—a narrative concerning racial struggle in Nazism, or the Laws of History in Stalinism—each claimed to know the deeper Truth about what subjects want, and accordingly could both justify the most striking transgressions of ordinary morality, and justify these transgressions by reference to subjects’ jouissance. The most disturbing or perverse features of these regimes can only be explained by reference to the key place of knowledge in these regimes. Žižek describes, for instance, the truly Catch-22-esque logic of the Soviet show trials, wherein it was not enough for subjects to be condemned by the authorities as enemies, but they were made to avow their “objective” error in opposing the party as agent of the laws of history.

Žižek’s statements on today’s liberal capitalism are complex, if not in mutual tension. At times, Žižek tries to formalize the economic generation of surplus value as a meaningfully “hysterical” social arrangement. Yet Žižek predominantly argues that the market-driven consumerism of later capitalist subjects is characterized by a marketing discourse which—like totalitarian ideologies—does not appeal to subjects in the name of any collective cause justifying individuals’ sacrifice of jouissance. Instead, as social conservatives complain, it musters the quasi-scientific discourses of marketing and public relations, or (increasingly) Eastern religion, in order to recommend products to subjects as necessary means in the liberal pursuit of happiness and self-fulfillment. In line with this change, Žižek contends in The Ticklish Subject that the paradigmatic type of leader today is not some inaccessible boss but the uncannily familiar figure of Bill Gates—more like a little brother than the traditional father or master. Again: for Žižek it is deeply telling that at the same time as the nuclear family is being eroded in the first world, other institutions, from the so-called “nanny” welfare state to private corporations, are increasingly becoming “familiarized” (with self-help sessions for employees, company days, casual days, and so forth).

c. Kettle Logic, or Desire and Theodicy

We saw how Žižek claims that the truth of political ideologies concerns what they do, not what they say (2d). At the level of what political ideologies say, Žižek maintains, a Lacanian critical theory maintains that ideologies must be finally inconsistent. Freud famously talked of the example of a man who returns a borrowed kettle back to its owner broken. The man adduces mutually inconsistent excuses which are united only in terms of his ignoble desire to evade responsibility for breaking the kettle: he never borrowed the kettle, the kettle was already broken when he borrowed it, and when he gave the kettle back it was not really broken anyway. As Žižek reads political ideologies, they function in the same way in the political field—this is the sense of the subtitle of his 2004 Iraq: The Borrowed Kettle. As we saw in 2d, Žižek maintains that the end of political ideologies is to secure and defend the idea of the polity as a wholly unified community. When political strife, uncertainty or division occur, political ideologies and the fundamental fantasies upon which they lean (3a) operate to resignify this political discontent so that the political ideal of community can be sustained, and to deny the possibility that this discontent might signal a fundamental injustice or flaw within the regime. In what amounts to a kind of political theodicy, Žižek’s work points to a number of logically inconsistent ideological responses to political discontents, which are united only by the desire that informs them, like Freud’s “kettle logic”:

Saying that these divisions are politically unimportant, transient or merely apparent.
Or, if this explanation fails:
Saying that the political divisions are in any case contingent to the ordinary run of events, so that if their cause is removed or destroyed, things will return to normal.
Or, more perilously:
Saying that the divisions or problems are deserved by the people for the sake of the greater good (in Australia in the 90s, for example, we experienced “the recession we had to have”), or as punishment for their betrayal of the national Thing.

Žižek’s view of the political functioning of sublime objects of ideology can be charted exactly in terms of this political theodicy (see 2e). We saw in 3a how Žižek argues that subjects’ fantasy is what allows them to come to terms with the loss of jouissance fundamental to being social or political animals. Žižek centrally maintains that such narrative attempts at political self-understanding—whether of individuals or political regimes—are ultimately unable to achieve these ends, except at the price of telling inconsistencies.

As Žižek highlights in his analyses of the political discontents in former Yugoslavia following the fall of communism, each national or political community tends to claim that its sublime Thing is inalienable, and hence utterly incapable of being understood or destroyed by enemies. Nevertheless, the invariable correlative of this emphasis on the inalienable nature of one’s Thing, Žižek argues in Tarrying with the Negative (1993), is the notion that It is simultaneously deeply fragile if not under active threat. For Žižek, this mutual inconsistency is only theoretically resolvable if, despite first appearances, we posit a materialist teaching that says that the “substance” seemingly named by political regimes’ key rallying terms (see 2e) is only sustained in their lived communal practices (as we say when someone does not get a joke, “you had to be there”). Yet political ideologies, as such, cannot avow this possibility (see 2d). Instead, ideological fantasies posit various exemplars of a persecutory enemy or, as Žižek says, “the Other of the Other” to whom the explanation of political disunity or discontent can be traced. If only this other or enemy could be removed, the political fantasy contends, the regime would be fully equitable and just. Historical examples of such figures of the enemy include “the Jew” in Nazi ideology, or the “petty bourgeois” in Stalinism.

Again: a type of “kettle logic” applies to the way these enemies are represented in political ideologies, according to Žižek. “The Jew” in Nazi ideology, for example, was an inconsistent condensation of features of both the ruling capitalist class (money grabbing, exploitation of the poor) and of the proletariat (dirtiness, sexual promiscuity, communism). The only consistency this figure has, that is, is precisely as a condensation of everything that Nazi ideology’s Aryan Volksgemeinschaft (roughly, “national community”) was constructed in response and political opposition to.

d. Fantasy as the Fantasy of Origins

In a way that has drawn some critics (Bellamy, Sharpe) to question how finally political Žižek’s political philosophy is, Žižek’s critique of ideology ultimately turns on a set of fundamental ontological propositions about the necessary limitations of any linguistic or symbolic system. These propositions concern the widely known paradoxes that bedevil any attempt by a semantic system to explain its own limits, and/or how it came into being. If what preceded the system was radically different from what subsequently emerged, how could the system have emerged from it, and how can the system come to terms with it at all? If we name the limits of what the system can understand, do not we, in that very gesture, presuppose some knowledge of what is beyond these limits, if only enough to say what the system is not? The only manner in which we can explain the origin of language is within language, Žižek notes in For They Know Not What They Do. Yet we hence presuppose, again in the very act of the explanation, the very thing we were hoping to explain. Similarly, to take the example from political philosophy of Hobbes’ explanation of the origin of sociopolitical order, the only way we can explain the origin of the social contract is by presupposing that Hobbes’ wholly pre-social men nevertheless possessed in some way the very social abilities to communicate and make pacts that Hobbes’ position is supposed to explain.

For Žižek, fantasy as such is always fundamentally the fantasy of (one’s) origins. In Freud’s “Wolf Man” case, to cite the psychoanalytic example Žižek cites in For They Know Not What They Do, the primal scene of parental coitus is the Wolf Man’s attempt to come to terms with his own origin—or to answer the infant’s perennial question “where did I come from?” The problem here is this: who could the spectacle of this primal scene have been staged for or seen by, if it really transpired before the genesis of the subject that it would explain (see 3e, 4e)? The only answer is that the Wolf Man has imaginatively transposed himself back into the primal scene if only as an impassive object-gaze—whose historical occurrence he had yet hoped would explain his origin as an individual.

Žižek’s argument is that, in the same way, political or ideological systems cannot and do not avoid deep inconsistencies. No less than Machiavelli, Žižek is acutely aware that the act that founds a body of Law is never itself legal, according to the very order of Law it sets in place. He cites Bertolt Brecht: “what is the robbing of a bank, compared to the founding of a bank?” What fantasy does, in this register, is to try to historically renarrativize the founding political act as if it were or had been legal—an impossible application of the Law before the Law had itself come into being. No less than the Wolf Man’s false transposition of himself back into the primal scene that was to explain his origin, Žižek argues that the attempt of any political regime to explain its own origins in a political myth that denies the fundamental, extralegal violence of these origins is fundamentally false. (Žižek uses the example of the liberal myth of primitive accumulation to illustrate his position in For They Know Not What They Do, but we could cite here Plato’s myth of the reversed cosmos in the Laws and Statesman, or historical cases like the idea of terra nullius in colonial Australia).

e. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)

In a series of places, Žižek situates his ontological position in terms of a striking reading of Immanuel Kant’s practical philosophy. Žižek argues that in “Religion Within the Bounds of Reason Alone” Kant showed that he was aware of these paradoxes that necessarily attend any attempt to narrate the origins of the Law. The Judeo-Christian myth of the fall succumbs to precisely these paradoxes, as Kant analyses: if Adam and Eve were purely innocent, how could they have been tempted?; if their temptation was wholly the fault of the tempter, why then has God punished humans with the weight of original sin?; but if Adam and Eve were not purely innocent when the snake lured them, in what sense was this a fall at all? According to Žižek, Kant’s text also provides us with theoretical parameters which allow us to explain and avoid these paradoxes. The problems for the mythical narrative, Kant argues, hail from its nature as a narrative—or how it tries to render in a historical story what he argues is truly a logical or transcendental priority. For Kant, human beings are, as such, radically evil. They have always already chosen to assert their own self-conceit above the moral Law. This choice of radical evil, however, is not itself a historical choice either for individuals or for the species, for Kant. This choice is what underlies and opens up the space for all such historical choices. However, as Žižek argues, Kant withdraws from the strictly diabolical implications of this position. The key place in which this withdrawal is enacted is in the postulates of The Critique of Practical Reason, wherein Kant defends the immortality of the soul as a likely story on the basis of our moral experience. Because of radical evil, Kant argues, it is impossible for humans to ever act purely out of duty in this life—this is what Kant thinks our irremovable sense of moral guilt attests. 
But because people can never act purely in this life, Kant suggests, it is surely reasonable to hope and even to postulate that the soul lives on after death, striving ever closer towards the perfection of its will.

Žižek’s contention is that this argument does not prove the immortality of a disembodied soul. It proves the immortality of an embodied individual soul, always struggling guiltily against its selfish corporeal impulses (this, incidentally, is one reason why Žižek argues, after Lacan, that de Sade is the truth of Kant). In order to make his proof even plausible, Žižek notes, Kant has to tacitly smuggle the spatiotemporal parameters of embodied earthly existence into the postulated hereafter so that the guilty subject can continue endlessly to struggle against his radically evil nature towards good. In this way, though, Kant himself has to speak as if he knew what things are like on the other side of death—which is to say, from the impossible, because impossibly neutral, perspective of someone able to impassively see the spectacle of the immortal subject striving guiltily towards the good (see 4d). But in this way, also, Žižek argues that Kant enacts exactly the type of fantasmatic operation his reading of the fall (as a) narrative declaims, and which represents in nuce the basic operation also of all political ideologies.

4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject

a. Žižek’s Subject, Fantasy, and the Objet Petit a

Perhaps Žižek’s most radical challenge to accepted theoretical opinion is his defense of the modern, Cartesian subject. Žižek knowingly and polemically positions his writings against virtually all other contemporary theorists, with the significant exception of Alain Badiou. Yet for Žižek, the Cartesian subject is not reducible to the fully self-assured “master and possessor of nature” of Descartes’ Discourse on Method. It is what Žižek calls in “Kant With (Or Against) Kant,” an out of joint ontological excess or clinamen. Žižek takes his bearings here as elsewhere from a Lacanian reading of Kant, and the latter’s critique of Descartes’ cogito ergo sum. In the “Transcendental Dialectic” in The Critique of Pure Reason, Kant criticized Descartes’ argument that the self-guaranteeing “I think” of the cogito must be a thinking thing (res cogitans). For Kant (as for Žižek), while the “I think” must be capable of accompanying all of the subject’s perceptions, this does not mean that it is itself such a substantial object. The subject that sees objects in the world cannot see itself seeing, Žižek notes, any more than a person can jump over her own shadow. To the extent that a subject can reflectively see itself, it sees itself not as a subject but as one more represented object, what Kant calls the “empirical self” or what Žižek calls the “self” (versus the subject) in The Plague of Fantasies. The subject knows that it is something, Žižek argues. But it does not and can never know what Thing it is “in the Real”, as he puts it (see 2e). This is why it must seek clues to its identity in its social and political life, asking the question of others (and of the big Other (see 2b)) which Žižek argues defines the subject as such: che vuoi? (what do you want from me?). In Tarrying With the Negative, Žižek hence reads the Director’s Cut of Ridley Scott’s Blade Runner as revelatory of the Truth of the subject.
Within this version of the film, as Žižek emphasizes, the main character Deckard literally does not know what he is—a robot that perceives itself to be human. According to Žižek, the subject is a “crack” in the universal field or substance of being, not a knowable thing (see 4d). This is why Žižek repeatedly cites in his books the disturbing passage from the young Hegel describing the modern subject not as the “light” of the modern enlightenment, but “this night, this empty nothing …”

It is crucial to Žižek’s position, though, that Žižek denies the apparent implication of this that the subject is some kind of supersensible entity, for example, an immaterial and immortal soul, and so forth. The subject is not a special type of Thing outside of the phenomenal reality we can experience, for Žižek. As we saw in 1e above, such an idea would in fact reproduce in philosophy the type of thinking which, he argues, characterizes political ideologies and the subject’s fundamental fantasy (see 3a). It is more like a fold or crease in the surface of this reality, as Žižek puts it in Tarrying With the Negative, the point within the substance of reality wherein that substance is able to look at itself, and see itself as alien to itself. According to Žižek, Hegel and Lacan add to Kant’s reading of the subject as the empty “I think” that accompanies any individual’s experience the caveat that, because objects thus appear to a subject, they always appear in an incomplete or biased way. Žižek’s “formula” of the fundamental fantasy (see 2a, 2d) “$ <> a” tries to formalize exactly this thought. Its meaning is that the subject ($), in its fundamental fantasy, misrecognizes itself as a special object (the objet petit a or lost object (see 2a)) within the field of objects that it perceives. In terms which unite this psychoanalytic notion with Žižek’s political philosophy, we can say that the objet petit a is exactly a sublime object (2e). It is an object that is elevated or, in Freudian terms, “sublimated” by the subject to the point where it stands as a metonymic representative of the jouissance the subject unconsciously fantasizes was taken from her/him at castration (3a). It hence functions as the object-cause of the subject’s desire that exceptional “little piece of the Real” that s/he seeks out in all of her/his love relationships. Its psychoanalytic paradigms are, to cite the title of a collection Žižek edited, “the voice and gaze as love objects”. 
Examples of the voice as objet petit a include the persecutor’s voice in paranoia, or the very silence that some TV advertisements now use, and which captures our attention by making us wonder whether we may not have missed something. The preeminent Lacanian illustration of the gaze as objet petit a is the anamorphotic skull at the foot of Holbein’s Ambassadors, which can only be seen by a subject who looks at it awry, or from an angle. Importantly, then, neither the voice nor the gaze as objet petit a attests to the subject’s sovereign ability to wholly objectify (and hence control) the world it surveys. In the auditory and visual fields (respectively), the voice and the gaze as objet petit a represent objects like Kant’s sublime things that the subject cannot wholly get its head around, as we say. The fact that they can only be seen or heard from particular perspectives indicates exactly how the subject’s biased perspective—and so his/her desire, what s/he wants—has an effect on what s/he is able to see. They thereby bear witness to how s/he is not wholly outside of the reality s/he sees. The most mundane but telling example of this subjective objet petit a of Lacanian theory is someone in love, of whom we commonly say that they are able to see in their lover something special, an “X factor,” which others are utterly blind to. In the political field, similarly—and as we saw in part 2c—subjects of a particular political community will claim that others cannot understand their regime’s sublime objects. Indeed, as Žižek comments about the resurgence of racism across the first world today, it is often precisely the strangeness of others’ particular ethnic or national Things that animates subjects’ hatred towards them.

b. The Objet Petit a & the Virtuality of Reality

In Žižek’s theory, the objet petit a stands as the exact opposite of the object of the modern sciences, that can only be seen clearly and distinctly if it is approached wholly impersonally. If the objet petit a is not looked at from a particular, subjective perspective—or, in the words of one of Žižek’s titles, by “looking awry” —it cannot be seen at all. This is why Žižek believes this psychoanalytic notion can be used to structure our understanding of the sublime objects postulated by ideologies in the political field, which as we saw in 3c show themselves to be finally inconsistent when they are looked at dispassionately. What Žižek’s Lacanian critique of ideology aims to do is to demonstrate such inconsistencies, and thereby to show us that the objects most central to our political beliefs are Things whose very sublime appearance conceals from us our active agency in constructing and sustaining them. (We will return to this thought in 4d and 4e below).

Žižek argues that the first place that the objet petit a appeared in the history of Western philosophy was with Kant’s notion of the transcendental object in The Critique of Pure Reason. Analyzing this Kantian notion allows us to elaborate more precisely the ontological status of the objet petit a. Kant defines the transcendental object as “the completely indeterminate thought of an object in general.” Like the objet petit a, then, Kant’s transcendental object is not a normal phenomenal object, although it has a very specific function in Kant’s epistemological conception of the subject. The avowedly anti-Humean function of this Kantian positing in the “Transcendental Deduction” is to ensure that the purely formal categories of the subject’s understanding can actually affect and indeed structure the manifold of the subject’s sensuous intuition. As Žižek stresses, that is, the transcendental object functions in Kant’s epistemology to guarantee that sense will continue to emerge for the subject, no matter what particular objects s/he might encounter.

We saw in 3c how Žižek argues that ideologies adduce ultimately inconsistent reasons to support the same goal of political unity. According to Žižek, as we can now elaborate, this is because the deepest political function of sublime objects of ideology is to ensure that the political world will make sense for subjects no matter what events transpire, in a way that he directly compares with Kant’s transcendental object. No matter what evidence someone might produce that all Jewish people are not acquisitive, capitalist, cunning, for example, a true Nazi will be able to immediately resignify this evidence by reference to his ideological notion of “the Jew”: “surely it is part of their cunning to appear as though they are not truly cunning,” and so forth. Importantly, it follows for Žižek that political community is always, in its very structure, an anticipated community. Subjects’ sense of political belonging is always mediated, according to him, by their shared belief in their regime’s key words or master signifiers. But these are words whose only “meaning” lies finally in their function, which is to guarantee that there will (continue to) be meaning. There is, Žižek argues, ultimately no actual, Real Thing better than the other real things subjects encounter that these words name (2e). It is only by acting as if there were such a Thing that community is maintained. This is why Žižek specifies in The Indivisible Remainder that political identification can only be, “at its most basic, identification with the very gesture of identification”:

…the coordination [between subjects in a political community] concerns not the level of the signified [of some positive shared concern] but the level of the signifier. [In political ideologies], undecidability with regard to the signified (do others really intend the same as me?) converts into an exceptional signifier, the empty Master-Signifier, the signifier-without-signified. ‘Nation’, ‘Democracy’, ‘Socialism’ and other Causes stand for that ‘something’ about which we are never sure what, exactly, it is – the point is, rather, that identifying with the Nation we signal our acceptance of what others accept, with a Master-Signifier which serves as the rallying point for all the others. (Žižek, 1996: 142)

This is the sense also in which Žižek claims in Plague of Fantasies that today’s virtual reality is “not virtual enough.” It is not virtual enough because the many options it offers subjects to enjoy (jouis) are transgressive or exotic possibilities. VR leaves nothing to the imagination or, in Žižek’s Lacanian terms, to fantasy. Fantasy, as we saw in 2a, operates to structure subjects’ beliefs about the jouissance which must remain only the stuff of imagination, purely “virtual” for subjects of the social law. For Žižek, then, it is identification with this law, as mediated via subjects’ anticipatory identifications with what they suppose others believe, that involves true virtuality.

c. Forced Choice & Ideological Tautologies

As 4b confirms (and as we commented in 1c), Žižek’s political philosophy turns around the idea that the central words of political ideologues are at base “signifiers without signified,” words that only appear to refer to exceptional Things, and which thereby facilitate the identification between subjects. As Žižek argues, these sublime objects of ideology have exactly the ontological status of what Kant called “transcendental illusions”—illusions whose semblance conceals that there is nothing behind them to conceal. Ideological subjects do not know what they do when they believe in them, Žižek contends. Yet, through the presupposition that the Other(s) know (2c), and their participation in the practices involving inherent transgression of their political community (2c), they “identify with the very gesture of identification” (4b). Hence, their belief, coupled with these practices, is politically efficient.

One of Žižek’s most difficult, but also deepest, claims is that the particular sublime objects of ideology with which subjects identify in different regimes (the Nation, the People, and so forth) each give particular form to a meta-law (law about all other laws) that binds any political community as such. This is the meta-law that says simply that subjects must obey all the other laws. In 2b above, we saw how Žižek holds that political ideologies must allow subjects the sense of subjective distance from their explicit directives. Žižek’s critical position is that this apparent freedom ideologies thereby allow subjects is finally a lure. Like the choice offered Yossarian by the “catch 22” of Joseph Heller’s novel, the only option truly available to political subjects is to continue to abide by the laws. No regime can survive if it waives this meta-law. The Sublime Object of Ideology hence cites with approval Kafka’s comment that it is not required that subjects think the law is just, only that it is necessary. Yet no regime, despite Kafka, can directly avow its own basis in such naked self-assertion without risking the loss of all legitimacy, Žižek agrees with Plato. This is why it must ground itself in ideological fantasies (3a) which at once sustain subjects’ sense of individual freedom (2c) and the sense that the regime itself is grounded extra-politically in the Real, and some transcendent, higher Good (2e).

This thought underlies the importance Žižek accords in For They Know Not What They Do to Hegel’s difficult notion of tautology as the highest instance of contradiction in The Science of Logic. If you push a subject hard enough about why they abide by the laws of their regime, Žižek holds that their responses will inevitably devolve into some logical variant of Exodus 3:14’s “I am that I am”: statements of the form “because the Law (God / the People / the Nation) is … the Law (God / the People / the Nation)”. In such tautological statements, our expectation that the predicates in the second half of the sentence will add something new to the (logical) subject given at its beginning is “contradicted,” Hegel argues. There is indeed something even sinister when someone utters such a sentence in response to our enquiries, Žižek notes—as if, when (for example) “the Law” is repeated dumbly as its own predicate (“because the law is the law”), it intimates the uncanny dimension of jouissance the law as ego ideal usually proscribes (3a). What this uncanny effect of sense attests to, Žižek argues in For They Know Not What They Do, is the usually “primordially repressed” force of the universal meta-law (that everyone must obey the laws) being expressed in the different, particular languages of political regimes: “because the People are the People,” “because the Nation is the Nation”, and so forth.

Žižek’s ideology critique hence contends that all political regimes’ ideologies always devolve finally around a set of such tautological propositions concerning their particular sublime objects. In The Sublime Object of Ideology, Žižek gives the example of a key Stalinist proposition: “the people always support the party.” On its surface, this proposition looks like a proposition that asserts something about the world, and which might be susceptible of disproof: perhaps there are some Soviet citizens who do not support the party, or who disagree with this or that of the party’s policies. What such an approach misses, however, is how in this ideology, what is referred to as “the people” in fact means “all those who support the party.” In Stalinism, that is, “the party” is the fetishized particular that stands for the people’s true interests (see 1e). Hence, the sentence “the people always support the party” is a concealed form of tautology. Any apparent people who in fact do not support the party by that fact alone are no longer “people” within Stalinist ideology.

d. The Substance is Subject, the Other Does Not Exist

In 4b, we saw how Žižek argues that political identification is identification with the gesture of identification. In 4c, we saw how the ultimate foundation of a regime’s laws is a tautologous assertion of the bare political fact that there is law. What unites these two positions is the idea that the sublime objects of a political regime and the ideological fantasies that give narratives about their content conceal from subjects the absence of any final ground for Law beyond the fact of its own assertion, and the fact that subjects take it to be authoritative. Here as elsewhere, Žižek’s work surprisingly approaches leading motifs in the political philosophy of Carl Schmitt.

Importantly, once this position is stated, we can also begin to see how Žižek’s post-Marxist project of a critique of ideology intersects with his philosophical defense of the Cartesian subject. At several points in his oeuvre, Žižek cites Hegel’s statement in the “Introduction” to the Phenomenology of Spirit that “the substance is subject” as a rubric that describes the core of his own political philosophy. According to Žižek, critics have misread this statement by taking it to repeat the founding, triumphalist idea of modern subjectivity as such—namely, that the subject can master all of nature or “substance.” Žižek contends, controversially, that Hegel’s claim ought to be read in a directly opposing sense. For him, it indicates the truth that there can be no dominant political regime or, in Hegel’s terms, no “social substance” that does not depend for its authority upon the active, indeed finally anticipatory (4c) investment of subjects in it. Like the malign computer machines in The Matrix that literally run off the human jouissance they drain from deluded subjects, for Žižek the big Other of any political regime does not exist as a self-sustaining substance. It must ceaselessly run on the belief and actions of its subjects, and their jouissance (2c)—or, to recur to the example we looked at in 2d, the King will not be the King, for Žižek, unless he has his subjects. It is certainly telling that the leading examples of ideological tautology For They Know Not What They Do discusses invoke precisely some subject’s will or decision, as when a parent says to a child “do this … because I said so,” or when people do something “… because the King said so,” which means that no more questions can be asked.

In 4a, we saw how Žižek denies that the subject, because it is not itself a perceptible object, belongs to an order of being wholly outside of the order of experience. To elevate such a wholly Other order would, he argues, reproduce the elementary operation of the fundamental fantasy. We can now add to this thought the further position that the Cartesian subject is, according to Žižek, finally nothing other than the irreducible point of active agency responsible for the always minimally precipitous political gesture of laying down a regime’s law. For Žižek, accordingly, the critical question to be asked of any theoretical or political position that posits some exceptional Beyond, as we saw in his reading of Kant (2e), is: from which subject-position do you speak when you claim a knowledge of this Beyond? As we saw in 2e, Žižek’s Lacanian answer is that the perspective that one always presupposes when one speaks in this manner is one that is always “superegoic” (see 2a)—tied to what he terms in Metastases of Enjoyment a “malevolently neutral” God’s eye view from nowhere. It is deeply revealing, from Žižek’s perspective, that the very perspective which allows the Kantian subject in the “dynamic sublime” to resignify its own finitude as itself a source of pleasure-in-pain (jouissance) is precisely one which identifies with the supersensible moral Law, before which the sensuous subject remains irredeemably guilty, infinitely striving to pay off its moral debt. As Žižek cites Hegel’s Phenomenology of Spirit:

It is manifest that beyond the so-called curtain [of phenomena] which is supposed to conceal the inner world, there is nothing to be seen unless we go behind it ourselves as much in order that we may see, as that there may be something behind there which can be seen. (Žižek, 1989: 196, emphasis added)

In other words, Žižek’s final position about the sublime objects of political regimes’ ideologies is that these belief-inspiring objects are so many ways in which the subject misrecognizes its own active capacity to challenge existing laws, and to found new laws altogether. Žižek repeatedly argues that the most uncanny or abyssal Thing in the world is the subject’s own active subjectivity—which is why he also repeatedly cites the Eastern saying that “Thou art that.” It is finally the singularity of the subject’s own active agency that subjects misperceive in fantasies concerning the sublime objects of their regimes’ ideologies, in the face of which they can do nothing but reverentially abide by the rules. In this way, it is worth noting, Žižek’s work can claim a heritage not only from Hegel, but also from the Left Hegelians, and Marx’s and Feuerbach’s critiques of religion.

e. The Ethical Act Traversing the Fantasy

Žižek’s technical term for the process whereby we can come to recognize how the sublime objects of our political regimes’ ideologies are, like Marx’s commodities, fetish objects that conceal from subjects their own political agency is “traversing of the fantasy.” Traversing the fantasy, for Žižek, is at once the political subject’s deepest form of self-recognition, and the basis for his own radical political position or defense of the possibility of such positions. Žižek’s entire theoretical work directs us towards this “traversing of the fantasy” in the many different fields on which he has written, and despite the widespread consensus at the beginning of the new century that fundamental political change is no longer possible or desirable.

Insofar as political ideologies for Žižek, as for Althusser (see 2c), remain viable only because of the ongoing practices and beliefs of political subjects, this traversal of fantasy must always involve an active, practical intervention in the political world, which changes a regime’s political institutions. As for Kant, so for Žižek, the practical bearing of critical reason comes first, in his critique of ideology, and last, in his advocacy of the possibility of political change. Žižek hence also repeatedly speaks of traversing the fantasy in terms of an “Act” (capital “A”), which differs from normal human speech and action. Everyday speech and action typically do not challenge the framing sociopolitical parameters within which they take place, Žižek observes. By contrast, what he means by an Act is an action which “touches the Real” (as he says) of what a sociopolitical regime has politically repressed or wiped its hands of, and which it cannot publicly avow without risking fundamental political damage (see 2c). In this way, the Žižekian Act extends and changes the very political and ideological parameters of what is permitted within a regime, in the hope of bringing into being new parameters in the light of which its own justice will be able to be retrospectively seen. This is a point of significant parallel with the work of Alain Badiou, whose influence Žižek has increasingly avowed in his more recent books. Notably, as Žižek specifies in The Indivisible Remainder, the Act effectively repeats the very act that he claims founds all political regimes as such, namely, the excessive, law-founding gesture we examined in 4c. Just as the current political regime originated in a founding gesture excessive with regard to the laws it set in place, Žižek argues, so too can this political regime itself be superseded, and a new one replace it. In his reading of Walter Benjamin’s “Theses on the Philosophy of History” in The Sublime Object of Ideology, Žižek indeed argues that such a new Act also effectively repeats all previous, failed attempts at changing an existing political regime, which otherwise would be consigned forever to historical oblivion.

5. Conclusion

Slavoj Žižek’s work represents a striking challenge within the contemporary philosophical scene. Žižek’s very style, and his prodigious ability to write about and examine examples from widely divergent fields, are remarkable. His work reintroduces and reinvigorates for a wider audience ideas from the works of German Idealism. Žižek’s work is framed in terms of a polemical critique of other leading theorists within today’s new left or liberal academy (Derrida, Habermas, Deleuze), which claims to unmask their apparent radicality as concealing a shared recoil from the possibility of a subjective, political Act, a recoil which in fact sits comfortably with a passive resignation to today’s political status quo. Not the least interesting feature of his work, politically, is indeed how Žižek’s critique of the new left significantly mirrors criticisms from conservative and neoconservative authors, yet hails from an avowedly opposed political perspective. In political philosophy, Žižek’s Lacanian theory of ideology presents a radically new descriptive perspective that affords us a unique purchase on many of the paradoxes of liberal consumerist subjectivity, which is at once politically cynical (as the political right laments) and politically conformist (as the political left struggles to come to terms with). Prescriptively, Žižek’s work challenges us to ask questions about the possibility of sociopolitical change that have otherwise rarely been asked after 1989, including what forms such changes might take, and what might justify them or make them possible.

Looked at in a longer perspective, it is of course too soon to judge what the lasting effects of Žižek’s philosophy will be, especially given Žižek’s own comparative youth as a thinker (Žižek was born in 1949). In terms of the history of ideas, in particular, while Žižek’s thought certainly turns on their heads many of today’s widely accepted theoretical notions, it is surely a more open question whether his work represents any lasting break with the parameters that Kant’s critical philosophy set out in the three Critiques.

6. References and Further Reading

a. Primary Literature (Books by Žižek)

Iraq: The Borrowed Kettle, New York: Verso, 2004.
Organs Without Bodies: On Deleuze and Consequences, New York, London: Routledge, 2003.
The Puppet and the Dwarf, New York: Routledge, 2003.
Did Somebody Say Totalitarianism? Five Essays on the (Mis)Use of a Notion, London; New York: Verso, 2001.
The Fright of Real Tears, Kieslowski and The Future, Bloomington: Indiana University Press, 2001.
On Belief, London: Routledge, 2001.
The Fragile Absolute or Why the Christian Legacy is Worth Fighting For, London; New York: Verso, 2000.
The Art of the Ridiculous Sublime, On David Lynch’s Lost Highway, Walter Chapin Center for the Humanities: University of Washington, 2000.
Contingency, Hegemony, Universality: Contemporary Dialogues on the Left, Judith Butler, Ernesto Laclau and SZ. London; New York: Verso, 2000.
Enjoy Your Symptom! Jacques Lacan in Hollywood and Out, second expanded edition, New York: Routledge, 2000.
The Ticklish Subject: The Absent Centre of Political Ontology, London; New York: Verso, 1999.
The Abyss of Freedom/Ages of the World, with F.W.J. von Schelling, Ann Arbor: University of Michigan Press, 1997.
The Plague of Fantasies, London; New York: Verso, 1997.
Gaze And Voice As Love Objects, Renata Salecl and SZ editors. Durham: Duke University Press, 1996.
The Indivisible Remainder: An Essay On Schelling And Related Matters, London; New York: Verso, 1996.
The Metastases Of Enjoyment: Six Essays On Woman And Causality (Wo Es War), London; New York: Verso, 1994.
Mapping Ideology, SZ editor. London; New York: Verso, 1994.
Tarrying With The Negative: Kant, Hegel And The Critique Of Ideology, Durham: Duke University Press, 1993.
Enjoy Your Symptom! Jacques Lacan In Hollywood And Out, London; New York: Routledge, 1992.
Everything You Always Wanted to Know about Lacan (But Were Afraid To Ask Hitchcock), SZ editor. London; New York: Verso, 1992.
Looking Awry: an Introduction to Jacques Lacan through Popular Culture, Cambridge, Mass.: MIT Press, 1991.
For They Know Not What They Do: Enjoyment As A Political Factor, London; New York: Verso, 1991.
The Sublime Object of Ideology, London; New York: Verso, 1989.

b. Secondary Literature (Texts on Žižek)

Slavoj Žižek: A Little Piece of the Real, Matthew Sharpe, Hants: Ashgate, 2004.
Slavoj Žižek: A Critical Introduction, Ian Parker, London: Pluto Press, 2004.
Slavoj Žižek: Live Theory, Rex Butler, London: Continuum, 2004.
Žižek: A Critical Introduction, Sarah Kay, London: Polity, 2003.
Slavoj Žižek (Routledge Critical Thinkers), Tony Myers, London: Routledge, 2003.

Author Information

Matthew Sharpe
Email: matthew.sharpe@dewr.gov.au

Australia

Karl Popper: Philosophy of Science

Karl Popper (1902-1994) was one of the most influential philosophers of science of the 20th century. He made significant contributions to debates concerning general scientific methodology and theory choice, the demarcation of science from non-science, the nature of probability and quantum mechanics, and the methodology of the social sciences. His work is notable for its wide influence within the philosophy of science, within science itself, and within a broader social context.

Popper’s early work attempts to solve the problem of demarcation and offer a clear criterion that distinguishes scientific theories from metaphysical or mythological claims. Popper’s falsificationist methodology holds that scientific theories are characterized by entailing predictions that future observations might reveal to be false. When theories are falsified by such observations, scientists can respond by revising the theory, by rejecting the theory in favor of a rival, or by maintaining the theory as is and changing an auxiliary hypothesis. In any of these cases, however, this process must aim at the production of new, falsifiable predictions. While Popper recognizes that scientists can and do hold onto theories in the face of failed predictions when there are no predictively superior rivals to turn to, he holds that scientific practice is characterized by its continual effort to test theories against experience and make revisions based on the outcomes of these tests. By contrast, theories that are permanently immunized from falsification by the introduction of untestable ad hoc hypotheses can no longer be classified as scientific. Among other things, Popper argues that his falsificationist proposal allows for a solution of the problem of induction, since inductive reasoning plays no role in his account of theory choice.

Along with his general proposals regarding falsification and scientific methodology, Popper is notable for his work on probability and quantum mechanics and on the methodology of the social sciences. Popper defends a propensity theory of probability, according to which probabilities are interpreted as objective, mind-independent properties of experimental setups. Popper then uses this theory to provide a realist interpretation of quantum mechanics, though its applicability goes beyond this specific case. With respect to the social sciences, Popper argued against the historicist attempt to formulate universal laws covering the whole of human history and instead argued in favor of methodological individualism and situational logic.

Background
Falsification and the Criterion of Demarcation
Criticisms of Falsificationism
Realism, Quantum Mechanics, and Probability
Methodology in the Social Sciences
Popper’s Legacy
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Background

Popper began his academic studies at the University of Vienna in 1918, and he focused on both mathematics and theoretical physics. In 1928, he received a PhD in Philosophy. His dissertation, On the Problem of Method in the Psychology of Thinking, dealt primarily with the psychology of thought and discovery. Popper later reported that it was while writing this dissertation that he came to recognize “the priority of the study of logic over the study of subjective thought processes” (1976, p. 86), a sentiment that would be a primary focus in his more mature work in the philosophy of science.

In 1935, Popper published Logik der Forschung (The Logic of Research), his first major work in the philosophy of science. Popper later translated the book into English and published it under the title The Logic of Scientific Discovery (1959). In the book, Popper offered his first detailed account of scientific methodology and of the importance of falsification. Many of the arguments in this book, as well as throughout his early work, are directed against members of the so-called “Vienna Circle,” such as Moritz Schlick, Otto Neurath, Rudolf Carnap, Hans Reichenbach, Carl Hempel, and Herbert Feigl, among others. Popper shared these thinkers’ concern with general issues of scientific methodology, and he sympathized with their distrust of traditional philosophical methodology. His proposed solutions to the problems arising from these concerns, however, were significantly different from those favored by the Vienna Circle.

Popper stayed in Vienna until 1937, when he took a teaching position at Canterbury University College in Christchurch, New Zealand, and he stayed there throughout World War II. His major works on the philosophy of science from this period include the articles that would eventually make up The Poverty of Historicism (1957). In these articles, he offered a highly critical analysis of the methodology of the social sciences, in particular, of attempts by social scientists to formulate predictive, explanatory laws.

In 1946, Popper took a teaching position at the London School of Economics, where he stayed until he retired in 1969. While there, he continued to work on a variety of issues relating to the philosophy of science, including quantum mechanics, entropy, evolution, and the realism vs. anti-realism debate, along with the issues already mentioned. His major works from this period include “The Propensity Interpretation of Probability” (1959) and Conjectures and Refutations (1963). He continued to publish until shortly before his death in 1994. In The Philosophy of Karl Popper (1974), Popper offers responses to many of his most important critics and provides clarifications of his mature views. His intellectual autobiography Unended Quest (1976) gives a detailed account of Popper’s evolving views, especially as they relate to the philosophy of science.

2. Falsification and the Criterion of Demarcation

Much of Popper’s early work in the philosophy of science focuses on what he calls the problem of demarcation, or the problem of distinguishing scientific (or empirical) theories from non-scientific theories. In particular, Popper aims to capture the logical or methodological differences between scientific disciplines, such as physics, and non-scientific disciplines, such as myth-making, philosophical metaphysics, Freudian psychoanalysis, and Marxist social criticism.

Popper’s proposals concerning demarcation can be usefully seen as a response to the verifiability criterion of demarcation proposed by logical empiricists, such as Carnap and Schlick. According to this criterion, a statement is cognitively meaningful if and only if it is, in principle, possible to verify. This criterion is intended to, among other things, capture the idea that the claims of empirical science are meaningful in a way that the claims of traditional philosophical metaphysics are not. For example, this criterion entails that claims about the locations of mid-sized objects are meaningful, since one can, in principle, verify them by going to the appropriate location. By contrast, claims about the fundamental nature of causation are not meaningful.

While Popper shares the belief that there is a qualitative difference between science and philosophical metaphysics, he rejects the verifiability criterion for several reasons. First, it counts existential statements (like “unicorns exist”) as scientific, even though there is no way of definitively showing that they are false. After all, the mere fact that one has failed to see a unicorn in a particular place does not establish that unicorns could not be observed in some other place. Second, it inappropriately counts universal statements (like “all swans are white”) as meaningless simply because they can never be conclusively verified. These sorts of universal claims, though, are common within science, and certain observations (like the observation of a black swan) can clearly show them to be false. Finally, the verifiability criterion is by its own lights not meaningful, since it cannot be verified.
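
The logical asymmetry behind the first two objections can be set out schematically. The following is an illustrative formalization, not Popper’s own notation: the predicate letters S, W, and U and the individual constants a and a_1, ..., a_n are labels assumed here for the sake of the example.

```latex
% Requires amsmath and amssymb (for align* and \nvdash).
% S(x): x is a swan; W(x): x is white; U(x): x is a unicorn (labels assumed for illustration).
\begin{align*}
&\text{Universal claim:} && \forall x\,\bigl(S(x) \rightarrow W(x)\bigr) \\
&\text{One black swan refutes it:} && S(a) \land \lnot W(a) \;\vdash\; \lnot \forall x\,\bigl(S(x) \rightarrow W(x)\bigr) \\
&\text{Finitely many white swans do not prove it:} && \bigwedge_{i=1}^{n} \bigl(S(a_i) \land W(a_i)\bigr) \;\nvdash\; \forall x\,\bigl(S(x) \rightarrow W(x)\bigr) \\[4pt]
&\text{Existential claim:} && \exists x\,U(x) \\
&\text{One observed unicorn verifies it:} && U(a) \;\vdash\; \exists x\,U(x) \\
&\text{Finitely many failed searches do not refute it:} && \bigwedge_{i=1}^{n} \lnot U(a_i) \;\nvdash\; \lnot \exists x\,U(x)
\end{align*}
```

On this sketch, an unrestricted universal statement can be refuted by a single observation but never conclusively verified, while an unrestricted existential statement can be verified by a single observation but never conclusively refuted.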

Partially in response to worries such as these, the logical empiricists’ later work abandons the verifiability criterion of meaning and instead emphasizes the importance of the empirical confirmation of scientific theories. Popper, however, argues that verification and confirmation play no role in formulating a satisfactory criterion of demarcation. Instead, Popper proposes that scientific theories are characterized by being bold in two related ways. First, scientific theories regularly disagree with accepted views of the world based on common sense or previous theoretical commitments. To an uneducated observer, for example, it may seem obvious that Earth is stationary, while the sun moves rapidly around it. However, Copernicus posited that Earth in fact revolved around the sun. In a similar way, it does not seem as though a tree and a human share a common ancestor, but this is what Darwin’s theory of evolution by natural selection claims. As Popper notes, however, this sort of boldness is not unique to scientific theories, since most mythological and metaphysical theories also make bold, counterintuitive claims about the nature of reality. For example, the accounts of world creation provided by various religions would count as bold in this sense, but this does not mean that they thereby count as scientific theories.

With this in mind, he goes on to argue that scientific theories are distinguished from non-scientific theories by a second sort of boldness: they make testable claims that future observations might reveal to be false. This boldness thus amounts to a willingness to take a risk of being wrong. On Popper’s view, scientists investigating a theory make repeated, honest attempts to falsify the theory, whereas adherents of pseudoscientific or metaphysical theories routinely take measures to make the observed reality fit the predictions of the theory. Popper describes his proposal as follows:

Thus my proposal was, and is, that it is this second boldness, together with the readiness to look for tests and refutations, which distinguished “empirical” science from non-science, and especially from pre-scientific myths and metaphysics (1974, pp. 980-981).

In other places, Popper calls attention to the fact that scientific theories are characterized by possessing potential falsifiers—that is, that they make claims about the world that might be discovered to be false. If these claims are, in fact, found to be false, then the theory as a whole is said to be falsified. Non-scientific theories, by contrast, do not have any such potential falsifiers—there is literally no possible observation that could serve to falsify these theories.

Popper’s falsificationist proposal differs from the verifiability criterion in several important ways. First, Popper does not hold that non-scientific claims are meaningless. Instead, he argues that such unfalsifiable claims can often serve important roles in both scientific and philosophical contexts, even if we are incapable of ascertaining their truth or falsity. Second, while Popper is a realist who holds that scientific theories aim at the truth (see Section 4), he does not think that empirical evidence can ever provide us grounds for believing that a theory is either true or likely to be true. In this sense, Popper is a fallibilist who holds that while the particular unfalsified theory we have adopted might be true, we could never know this to be the case. For these same reasons, Popper holds that it is impossible to provide justification for one’s belief that a particular scientific theory is true. Finally, where others see science progressing by confirming the truth of various particular claims, Popper describes science as progressing on an evolutionary model, with observations selecting against unfit theories by falsifying them.

a. Popper on Physics and Psychoanalysis

In order to see how falsificationism works in practice, it will help to consider one of Popper’s most memorable examples: the contrast between Einstein’s theory of general relativity and the theories of psychoanalysis defended by Sigmund Freud and Alfred Adler. We might roughly summarize the theories as follows:

General relativity (GR): Einstein’s theory of special relativity posits that the observed speed of light in a vacuum will be the same for all observers, regardless of which direction or at what velocity these observers are themselves moving. GR allows this theory to be applied to cases where acceleration or gravity plays a role, specifically by treating gravity as a sort of distortion or bend in space-time created by massive objects.

Psychoanalysis: The theory of psychoanalysis holds that human behavior is driven at least in part by unconscious desires and motives. For example, Freud posited the existence of the id, an unconscious part of the human psyche that aims toward gratifying instinctive desires, regardless of whether this is rational. However, the desires of the id might be mediated or superseded in certain circumstances by its interaction with both the self-interested ego and the moral superego.

As we can see, both theories make bold, counter-intuitive claims about the fundamental nature of reality. Moreover, both theories can account for previously observed phenomena; for example, GR allows for an accurate description of the observed precession of the perihelion of Mercury, while psychoanalysis entails that it is possible for people to consistently act in ways that are against their own long-term best interest. Finally, both of these theories enjoyed significant academic support when Popper was first writing about these issues.

Popper argues, however, that GR is scientific while psychoanalysis is not. The reason for this has to do with the testability of Einstein’s theory. As a young man, Popper was especially impressed by Arthur Eddington’s 1919 test of GR, which involved observing during a solar eclipse the degree to which the light from distant stars was shifted when passing by the sun. Importantly, the predictions of GR regarding the magnitude of this shift disagreed with those of the then-dominant theory of Newtonian mechanics. Eddington’s observation thus served as a crucial experiment for deciding between the theories, since it was impossible for both theories to give accurate predictions. Of necessity, at least one theory would be falsified by the experiment, which would provide strong reason for scientists to accept its unfalsified rival. On Popper’s view, the continual effort by scientists to design and carry out these sorts of potentially falsifying experiments played a central role in theory choice and clearly distinguished scientific theorizing from other sorts of activities. Popper also takes care to note that insofar as GR was not a unified field theory, there was no question of GR’s being the complete truth, as Einstein himself repeatedly emphasized. The scientific status of GR, then, had nothing to do with either (1) the truth of GR as a general theory of physics (the theory was already known to be false) or (2) the confirmation of GR by evidence (one cannot confirm a false theory).

In contrast to such paradigmatically scientific theories as GR, Popper argues that non-scientific theories such as Freudian psychoanalysis do not make any predictions that might allow them to be falsified. The reason for this is that these theories are compatible with every possible observation. On Popper’s view, psychoanalysis simply does not provide us with adequate details to rule out any possible human behavior. Absent these sorts of precise predictions, the theory can be made to fit with, and to provide a purported explanation of, any observed behavior whatsoever.

To illustrate this point, Popper offers the example of two men, one who pushes a child into the water with the intent of drowning it, and another who dives into the water in order to save the child. Popper notes that psychoanalysis can explain both of these seemingly contradictory actions. In the first case, the psychoanalyst can claim that the action was driven by a repressed component of the (unconscious) id and in the second case, that the action resulted from a successful sublimation of this exact same sort of desire by the ego and superego. The point generalizes: regardless of how a person actually behaves, psychoanalysis can be used to explain the behavior. This, in turn, prevents us from formulating any crucial experiments that might serve to falsify psychoanalysis. Popper writes:

The point is very clear. Neither Freud nor Adler excludes any particular person’s acting in any particular way, whatever the outward circumstances. Whether a man sacrificed his life to rescue a drowning child (a case of sublimation) or whether he murdered the child by drowning (a case of repression) could not possibly be predicted or excluded by Freud’s theory (1974, p. 985).

Popper allows that there are often legitimate purposes for positing non-scientific theories, and he argues that theories which start out as non-scientific can later become scientific, as we determine methods for generating and testing specific predictions based on these theories. Popper offers the example of Copernicus’s theory of a sun-centered universe, which initially yielded no potentially falsifying predictions, and so would not have counted as scientific by Popper’s criteria. However, later astronomers determined ways of testing Copernicus’s hypothesis, thus rendering it scientific. For Popper, then, the demarcation between scientific and non-scientific theories is not grounded on the nature of entities posited by theories, on the truth or usefulness of theories, or even on the degree to which we are justified in believing in such theories. Instead, falsification provides a methodological distinction based on the unique role that observation and evidence play in scientific practice.

b. Auxiliary and Ad Hoc Hypotheses

While Popper consistently defends a falsification-based solution to the problem of demarcation throughout his published work, his own explications of it include a number of qualifications to ensure a better fit with the realities of scientific practice. It is in this context that Popper introduces several of his more notable contributions to the philosophy of science, including auxiliary versus ad hoc hypotheses, basic sentences, and degrees of verisimilitude.

One immediate objection to the simple proposal regarding falsification sketched in the previous section is based on the Duhem-Quine thesis, according to which it is in many cases impossible to test scientific theories in isolation. For example, suppose that a group of investigators uses GR to deduce a prediction about the perihelion of Mercury, but then discovers that this prediction disagrees with their measurements. This failure might lead them to conclude that GR is false; however, the failure of the prediction might also plausibly be blamed on the falsity of some other proposition that the scientists relied on to deduce the apparently falsifying prediction. There are generally a large number of such propositions, concerning everything from the absence of human error to the accuracy of the scientific theories underlying the construction and application of the measuring equipment.
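
The logical situation can be sketched as follows. This is a schematic reconstruction, not a formula taken from Popper, Duhem, or Quine: T stands for the theory under test, A_1, ..., A_n for the additional propositions used in the deduction, and P for the prediction, all of them labels assumed here for illustration.

```latex
% Requires amsmath and amssymb (for align* and \therefore).
% T: theory under test; A_1..A_n: auxiliary propositions; P: prediction (labels assumed for illustration).
\begin{align*}
&(T \land A_1 \land \dots \land A_n) \rightarrow P && \text{(the prediction is deduced from the whole conjunction)} \\
&\lnot P && \text{(the measurement disagrees with the prediction)} \\
&\therefore\; \lnot (T \land A_1 \land \dots \land A_n) && \text{(modus tollens)} \\
&\equiv\; \lnot T \lor \lnot A_1 \lor \dots \lor \lnot A_n && \text{(De Morgan)}
\end{align*}
```

Logic alone thus establishes only that at least one of the conjuncts is false. It does not single out T as the culprit.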

Popper recognizes that scientists routinely attribute the failure of experiments to factors such as these, and further grants that there is in many cases nothing objectionable about their doing so. On Popper’s view, the distinctive mark of scientific inquiry concerns the investigators’ responses to failed predictions in cases where they do not abandon the falsified theory altogether. In particular, Popper argues that a scientific theory can be legitimately saved from falsification by the introduction of an auxiliary hypothesis that allows for the generation of new, falsifiable predictions. Popper offers an example taken from the early 19th century, when astronomers noticed that the orbit of Uranus deviated significantly from what Newtonian mechanics seemed to predict. In this case, the scientists did not treat Newton’s laws as being falsified by such an observation. Instead, they considered the auxiliary hypothesis that there existed an additional and so far unobserved planet that was influencing the orbit of Uranus. They then used this auxiliary hypothesis, together with equations of Newtonian mechanics, to predict where this planet must be located. Their predictions turned out to be successful, and Neptune was discovered in 1846.

Popper contrasts this legitimate, scientific method of theory revision with the illegitimate, non-scientific use of ad hoc hypotheses to rescue theories from falsification. Here, an ad hoc hypothesis is one that does not allow for the generation of new, falsifiable predictions. Popper gives the example of Marxism, which he argues had originally made definite predictions about the evolution of society: the capitalist, free-market system would self-destruct and be replaced by joint ownership of the means of production, and this would happen first in the most highly developed economies. By the time Popper was writing in the mid-20th century, however, it seemed clear to him that these predictions were false: free market economies had not self-destructed, and the first communist revolutions happened in relatively undeveloped economies. The proponents of Marxism, however, neither abandoned the theory as falsified nor introduced any new, falsifiable auxiliary hypotheses that might account for the failed predictions. Instead, they adopted ad hoc hypotheses that immunized Marxism against any potentially falsifying observations whatsoever. For example, the continued persistence of capitalism might be blamed on the action of counter-revolutionaries but without providing an account of which specific actions these were, or what specific new predictions about society we should expect instead. Popper concludes that, while Marxism had originally been a scientific theory:

It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Ever since then, it can be described only as non-science—as a metaphysical dream, if you like, married to a cruel reality (1974, p. 985).

c. Basic Sentences and the Role of Convention

A second complication for the simple theory of falsification just described concerns the character of the observations that count as potential falsifiers of a theory. The problem here is that decisions about whether to accept an apparently falsifying observation are not always straightforward. For example, there is always the possibility that a given observation is not an accurate representation of the phenomenon but instead reflects theoretical bias or measurement error on the part of the observer(s). Examples of this sort of phenomenon are widespread and occur in a variety of contexts: students getting the “wrong” results on lab tests, a small group of researchers reporting results that disagree with those obtained by the larger research community, and so on.

In any specific case in which bias or error is suspected, Popper notes that researchers might introduce a falsifiable, auxiliary hypothesis allowing us to test this. And in many cases, this is just what they do: students redo the test until they get the expected results, or other research groups attempt to replicate the anomalous result obtained. Popper argues that this technique cannot solve the problem in general, however, since any auxiliary hypotheses researchers introduce and test will themselves be open to dispute in just the same way, and so on ad infinitum. If science is to proceed at all, then, there must be some point at which the process of attempted falsification stops.

In order to resolve this apparently vicious regress, Popper introduces the idea of a basic statement, which is an empirical claim that can be used both to determine whether a given theory is falsifiable and thus scientific and, where appropriate, to corroborate falsifying hypotheses. According to Popper, basic statements are “statements asserting that an observable event is occurring in a certain individual region of space and time” (1959, p. 85). More specifically, basic statements must be both singular and existential (the formal requirement) and be testable by intersubjective observation (the material requirement). On Popper’s view, “there is a raven in space-time region k” would count as a basic statement, since it makes a claim about an individual raven whose existence, or lack thereof, could be determined by appropriately located observers. By contrast, the negative existential claim “there are no ravens in space-time region k” does not do this, and thus fails to qualify as a basic statement.

In order to avoid the infinite regress alluded to earlier, where basic statements themselves must be tested in order to justify their status as potential falsifiers, Popper appeals to the role played by convention and what he calls the “relativity of basic statements.” He writes as follows:

Every test of a theory, whether resulting in its corroboration or falsification, must stop at some basic statement or other which we decide to accept. If we do not come to any decision, and do not accept some basic statement or other, then the test will have led nowhere… This procedure has no natural end. Thus if the test is to lead us anywhere, nothing remains but to stop at some point or other and say that we are satisfied, for the time being. (1959, p. 86)

From this, Popper concludes that a given statement’s counting as a basic statement requires the consensus of the relevant scientific community—if the community decides to accept it, it will count as a basic statement; if the community does not accept it as basic, then an effort must be made to test the statement by using it together with other statements to deduce a statement that the relevant community will accept as basic. Finally, if the scientific community cannot reach a consensus on what would count as a falsifier for the disputed statement, the statement itself, despite initial appearances, may not actually be empirical or scientific in the relevant sense.

d. Induction, Corroboration, and Verisimilitude

Falsification also plays a key role in Popper’s proposed solution to David Hume’s infamous problem of induction. On Popper’s interpretation, Hume’s problem involves the impossibility of justifying belief in general laws based on evidence that concerns only particular instances. Popper agrees with Hume that inductive reasoning in this sense could not be justified, and he thus rejects the idea that empirical evidence regarding particular individuals, such as successful predictions, is in any way relevant to confirming the truth of general scientific laws or theories. This places Popper’s view in explicit contrast to logical empiricists such as Carnap and Hempel, who had developed extensive, mathematical systems of inductive logic intended to explicate the degree of confirmation of scientific theories by empirical evidence.

Popper argues that there are in fact two closely related problems of induction: the logical problem of induction and the psychological problem of induction. The first problem concerns the possibility of justifying belief in the truth or falsity of general laws based on empirical evidence that concerns only specific individuals. Popper holds that Hume’s argument concerning this problem “establishes for good that all our universal laws or theories remain forever guesses, conjectures, [and] hypotheses” (1974, p. 1019). However, Popper claims that while a successful prediction is irrelevant to confirming a law, a failed prediction can immediately falsify it. On Popper’s view, then, observing 1,000 white swans does nothing to increase our confidence that the hypothesis “all swans are white” is true; however, the observation of a single black swan can, subject to the caveats mentioned in previous sections, falsify this same hypothesis.
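The asymmetry between confirmation and falsification described above can be made concrete with a small sketch. The function name and the string labels are hypothetical choices made for illustration, and the sketch assumes that every observation report has already been accepted as a basic statement, which sets aside the caveats from previous sections.

```python
from typing import Iterable


def assess_all_swans_white(observed_colors: Iterable[str]) -> str:
    """Assess the universal hypothesis 'all swans are white'.

    Illustrative assumption: each reported color is an accepted basic
    statement, so measurement error and bias are ignored here.
    """
    for color in observed_colors:
        if color != "white":
            # A single accepted counterexample refutes the universal claim.
            return "falsified"
    # No number of white swans moves the hypothesis to "verified";
    # the best available status is "not yet falsified".
    return "unfalsified"


print(assess_all_swans_white(["white"] * 1000))
print(assess_all_swans_white(["white"] * 1000 + ["black"]))
```

Note that the function has no "verified" return value at all. This is the point of the sketch: on Popper's account the status of the hypothesis is the same after one white swan as after a thousand.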

In contrast to the logical problem of induction, the psychological problem of induction concerns the possibility of explaining why reasonable people nevertheless have the expectation that unobserved instances will obey the same general laws as did previously observed instances. Hume tries to resolve the psychological problem by appeal to habit or custom, but Popper rejects this solution as inadequate, since it suggests that there is a “clash between the logic and the psychology of knowledge” (1974, p. 1019) and hence that people’s beliefs in general laws are fundamentally irrational.

Popper proposes to solve these twin problems of induction by offering an account of theory preference that does not rely upon inductive inference and thus avoids Hume’s problems altogether. While the technical details of this account evolve throughout his writings, he consistently emphasizes two main points. First, he holds that a theory with greater informative content is to be preferred to one with less content. Here, informative content is a measure of how much a theory rules out; roughly speaking, a theory with more informative content makes a greater number of empirical claims, and thus has a higher degree of falsifiability. Second, Popper holds that a theory is corroborated by passing severe tests, or “by predictions which were highly improbable in the light of our previous knowledge (previous to the theory which was tested and corroborated)” (1963, p. 220).

It is important to distinguish Popper’s claim that a theory is corroborated by surviving a severe test from the logical empiricist view that a theory is inductively confirmed by successfully predicting events that, were the theory false, would have been highly unlikely. According to the latter view, a successful prediction of this sort, subject to certain caveats, provides evidence that the theory in question is actually true. The question of theory choice is tightly tied to that of confirmation: scientists should adopt whichever theory is most probable in light of the available evidence. On Popper’s view, by contrast, corroboration provides no evidence whatsoever that the theory in question is true, or even that the theory is preferable to a so-far-untested but still unfalsified rival. Instead, a corroborated theory has shown merely that it is the sort of theory that could be falsified and thus can be legitimately classified as scientific. While a corroborated theory should obviously be preferred to an already falsified rival (see Section 2), the real work here is being done by the falsified theory, which has taken itself out of contention.

While Popper consistently rejects the idea that we are justified in believing that non-falsified, well-corroborated scientific theories with high levels of informative content are either true or likely to be true, his work on degrees of verisimilitude explores the idea that such theories are closer to the truth than were the falsified theories that they had replaced. The basic idea is as follows:

For a given statement H, let the content of H be the class of all of the logical consequences of H. So, if H is true, then all of the members of this class would be true; if H were false, however, then only some members of this class would be true, since every false statement has at least some true consequences.
The content of H can be broken into two parts: the truth content, consisting of all of the true consequences of H, and the falsity content, consisting of all of the false consequences of H.
The verisimilitude of H is defined as the difference between the truth content of H and the falsity content of H. This is intended to capture the idea that a theory with greater verisimilitude will entail more truths and fewer falsehoods than does a theory with less verisimilitude.
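The three steps above can be written compactly. The notation below is supplied for illustration and is an assumption rather than a quotation: $Ct$, $Ct_T$, $Ct_F$ and $Vs$ abbreviate content, truth content, falsity content and verisimilitude, and $\mu$ stands for some unspecified measure of the size of a class of statements.

```latex
\begin{align*}
Ct(H)   &= \{\, s : H \vdash s \,\}
        && \text{content of } H \\
Ct_T(H) &= \{\, s \in Ct(H) : s \text{ is true} \,\}
        && \text{truth content} \\
Ct_F(H) &= \{\, s \in Ct(H) : s \text{ is false} \,\}
        && \text{falsity content} \\
Vs(H)   &= \mu\bigl(Ct_T(H)\bigr) - \mu\bigl(Ct_F(H)\bigr)
        && \text{verisimilitude}
\end{align*}
```

On a purely comparative reading, which avoids the measure $\mu$, a theory $H_2$ would count as having greater verisimilitude than $H_1$ when $Ct_T(H_1) \subseteq Ct_T(H_2)$ and $Ct_F(H_2) \subseteq Ct_F(H_1)$, with at least one of the inclusions proper. If $H$ is true, $Ct_F(H)$ is empty, which matches the first step above.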

With this definition in hand, it might now seem that Popper could incorporate truth into his account of theory preference: non-falsified theories with high levels of informative content would be closer to the truth than either the falsified theories they replaced or their unfalsified but less informative competitors. Unfortunately, however, this definition does not work, as arguments from Tichý (1974), Miller (1974), Harris (1974), and others show. Tichý and Miller in particular demonstrate that Popper’s proposed definition cannot be used to compare the relative verisimilitude of false theories, which is Popper’s main purpose in introducing the notion of verisimilitude. While Popper (1976) explores ways of modifying his proposal to deal with these problems, he is never able to provide a satisfactory formal definition of verisimilitude. His work in this area is nevertheless invaluable in identifying a problem that has continued to interest many contemporary researchers.

3. Criticisms of Falsificationism

While Popper’s account of scientific methodology has continued to be influential, it has also faced a number of serious objections. These objections, together with the emergence of alternative accounts of scientific reasoning, have led many philosophers of science to reject Popper’s falsificationist methodology. While a comprehensive list of these criticisms and alternatives is beyond the scope of this entry, interested readers are encouraged to consult Kuhn (1962), Salmon (1967), Lakatos (1970, 1980), Putnam (1974), Jeffrey (1975), Feyerabend (1975), Hacking (1983), and Howson and Urbach (1989).

One criticism of falsificationism involves the relationship between theory and observation. Thomas Kuhn, among others, argues that observation is itself strongly theory-laden, in the sense that what one observes is often significantly affected by one’s previously held theoretical beliefs. Because of this, those holding different theories might report radically different observations, even when both are observing the same phenomena. For example, Kuhn argues that those working within the paradigm provided by classical, Newtonian mechanics may genuinely have different observations than those working within the very different paradigm of relativistic mechanics.

Popper’s account of basic sentences suggests that he clearly recognizes both the existence of this sort of phenomenon and its potential to cause problems for attempts to falsify theories. His solution to it, however, crucially depends on the ability of the overall scientific community to reach a consensus as to which statements count as basic and thus can be used to formulate tests of the competing theories. This remedy, however, looks less attractive to the extent that advocates of different theories consistently find themselves unable to reach an agreement on what sentences count as basic. For example, it is important to Popper’s example of the Eddington experiment that both proponents of classical mechanics and those of relativistic mechanics could recognize Eddington’s reports of his observations as basic sentences in the relevant sense—that is, certain possible results would falsify the Newtonian laws of classical mechanics, while other possible results would falsify GR. If, by contrast, adherents of rival theories consistently disagreed on whether or not certain reports could be counted as basic sentences, this would prevent observations such as Eddington’s from serving any important role in theory choice. Instead, the results of any such potentially falsifying experiment would be interpreted by one part of the community as falsifying a particular theory, while a different section of the community would demand that these reports themselves be subjected to further testing. In this way, disagreements over the status of basic sentences would effectively prevent theories from ever being falsified.

This purported failure to clearly distinguish the basic statements that formed the empirical base from other, more theoretical, statements would also have consequences for Popper’s proposed criterion of demarcation, which holds that scientific theories must allow for the deduction of basic sentences whose truth or falsity can be ascertained by appropriately located observers. If, contrary to Popper’s account, there is no distinct category of basic sentences within actual scientific practice, then his proposed method for distinguishing science from non-science fails.

A second, related criticism of falsifiability contends that falsification fails to provide an accurate picture of scientific practice. Specifically, many historians and philosophers of science have argued that scientists only rarely give up their theories in the face of failed predictions, even in cases where they are unable to identify testable auxiliary hypotheses. Conversely, it has been suggested that scientists routinely adopt and make use of theories that they know are already falsified. Instead, scientists will generally hold on to such theories unless and until a better alternative theory emerges.

For example, Lakatos (1970) describes a hypothetical case where pre-Einsteinian scientists discover a new planet whose behavior apparently violates classical mechanics. Lakatos argues that, in such a case, the scientists would surely attempt to account for these observed discrepancies in the way that Popper advocates—for example, by hypothesizing the existence of a hitherto unobserved planet or dust cloud. In contrast to what he takes Popper to be arguing, however, Lakatos contends that the failure of such auxiliary hypotheses would not lead them to abandon classical mechanics, since they had no alternative theory to turn to.

In a similar vein, Putnam (1975) argues that the initial widespread acceptance of Newtonian mechanics had little or nothing to do with falsifiable predictions, since the theory made very few of these. Instead, scientists were impressed by the theory’s success in explaining previously established phenomena, such as the orbits of the planets and the behavior of the tides. Putnam argues that, on Popper’s view, accepting such an uncorroborated theory would seem to be irrational. Finally, Hacking (1983) argues that many aspects of ordinary scientific practice, including a wide variety of observations and experiments, cannot plausibly be construed as attempts to falsify or corroborate any particular theory or hypothesis. Instead, scientists regularly perform experiments that have little or no bearing on their current theories and measure quantities about which these theories do not make any specific claims.

When considering the cogency of such criticisms, it is worth noting several things. First, Popper defends falsificationism as a normative, methodological proposal for how science ought to work in certain sorts of cases and not as an empirical description intended to accurately capture all aspects of historical scientific practice. Second, Popper does not commit himself to the implausible thesis that theories yielding false predictions about a particular phenomenon must immediately be abandoned, even if it is not apparent which auxiliary hypotheses must change. This is especially true in the absence of any rival theory yielding a correct prediction. For example, Newtonian mechanics had well-known problems with predicting certain sorts of phenomena, such as the orbit of Mercury, in the years preceding Einstein’s proposals regarding special and general relativity. Popper’s proposal does not entail that these failures of prediction should have led nineteenth-century scientists to abandon this theory.

This being said, Popper himself argues that the methodology of falsificationism has played an important role in the history of science and that adopting his proposal would not require a wholesale revision of existing scientific methodology. If it turns out that scientists rarely, if ever, make theory choices on the basis of crucial experiments that falsify one theory or another, then Popper’s methodological proposal looks to be considerably less appealing.

A final criticism concerns Popper’s account of corroboration and the role it plays in theory choice. Popper’s deductive account of theory testing and adoption posits that it is rational to choose highly informative, well-corroborated theories, even though we have no inductive grounds for thinking that these theories are likely to be true. For example, Popper explicitly rejects the idea that corroboration is intended as an analogue to the subjective probability or logical probability that a theory is true, given the available evidence. This idea is central to both Popper’s proposed solution to the problem of induction and to his criticisms of competing inductivist or “Bayesian” programs.

Many philosophers of science, however, including Salmon (1967, 1981), Jeffrey (1975), Howson (1984a), and Howson and Urbach (1989), have objected to this aspect of Popper’s account. One line of criticism has focused on the extent to which Popper’s falsificationism offers a legitimate alternative to the inductivist proposals that Popper criticizes. For example, Jeffrey (1975) points out that it is just as difficult to conclusively falsify a hypothesis as it is to conclusively verify it, and he argues that Bayesianism, with its emphasis on the degree to which empirical evidence supports a hypothesis, is much more closely aligned to scientific practice than Popper’s program.

A related line of objection has focused on Popper’s contention that it is rational for scientists to rely on corroborated theories, a claim that plays a central role in his proposed solution to the problem of induction. Urbach (1984) argues that, insofar as Popper is committed to the claim that every universal hypothesis has zero probability of being true, he cannot explain the rationality of adopting a corroborated theory over an already falsified one, since both have the same probability (zero) of being true. Taking a different tack, Salmon (1981) questions whether, on Popper’s account, it would be rational to use corroborated hypotheses for the purposes of prediction. After all, corroboration is entirely a matter of hypotheses’ past performance—a corroborated hypothesis is one that has survived severe empirical tests. Popper’s account, however, does not provide us with any reason for thinking that this hypothesis will have more accurate predictions about the future than any one of the infinite number of competing uncorroborated hypotheses that are also logically compatible with all of the evidence observed up to this point.

If these objections concerning corroboration are correct, it looks as though Popper’s account of theory choice either (1) is vulnerable to the same sorts of problems and puzzles that plague accounts of theory choice based on induction or (2) does not work as an account of theory choice at all.

While the sorts of objections mentioned here have led many to abandon falsificationism, David Miller (1998) provides a recent, sustained attempt to defend a Popperian-style critical rationalism. For more details on debates concerning confirmation and induction, see the entries on Confirmation and Induction and Evidence.

4. Realism, Quantum Mechanics, and Probability

While Popper holds that it is impossible for us to justify claims that particular scientific theories are true, he also defends the realist view that “what we attempt in science is to describe (and so far as possible) explain reality” (1975, p. 40). While Popper grants that realism is, according to his own criteria, an irrefutable metaphysical view about the nature of reality, he nevertheless thinks we have good reasons for accepting realism and for rejecting anti-realist views such as idealism or instrumentalism. In particular, he argues that realism is both part of common sense and entailed by our best scientific theories. By contrast, he contends that the most prominent arguments for anti-realism are based on a “mistaken quest for certainty, or for secure foundations on which to build” (1975, p. 42). Once one accepts the impossibility of securing such certain knowledge, as Popper contends we ought to do, the appeal of these sorts of arguments is considerably diminished.

Popper consistently emphasizes that scientific theories should be interpreted as attempts to describe a mind-independent reality. Because of this, he rejects the Copenhagen interpretation of quantum mechanics, in which the act of human measurement is seen as playing a fundamental role in collapsing the wave-function and randomly causing a particle to assume a determinate position or momentum. In particular, Popper opposes the idea, which he associates with the Copenhagen interpretation, that the probabilistic equations describing the results of potential measurements of quantum phenomena are about the subjective states of the human observers, rather than concerning mind-independent existing physical properties such as the positions or momenta of particles.

It is in the context of this debate over quantum mechanics that Popper first introduces his propensity theory of probability. This theory’s applicability, however, extends well beyond the quantum world, and Popper argues that it can be used to interpret the sorts of claims about probability that arise both in other areas of science and in everyday life. Popper’s propensity theory holds that probabilities are objective claims about the mind-independent external world and that it is possible for there to be single-case probabilities for non-recurring events.

Popper proposes his propensity theory as a variant of the relative frequency theories of probability defended by logical positivists such as Richard von Mises and Hans Reichenbach. According to simple versions of frequency theory, the probability of an event of type e can be defined as the relative frequency of e in a large, or perhaps even infinite, reference class. For example, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as the claim that, in a long sequence of rolls with a fair die (the reference class), six would come up 1/6 of the time. The main alternatives to frequency theory that concern Popper are logical and subjective theories of probability, according to which claims about probability should be understood as claims about the strength of evidence for or degree of belief in some proposition. On these views, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as a claim about our lack of evidence: if all we know is that the die is fair, then we have no reason to think that any particular number, such as a six, is more likely to come up on the next roll than any of the other five possible numbers.
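The frequency reading of the die example can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name is hypothetical, a pseudo-random generator stands in for a fair die, and a finite run of rolls stands in for the "long sequence" that serves as the reference class.

```python
import random


def relative_frequency_of_six(num_rolls: int, seed: int = 0) -> float:
    """Return the fraction of simulated fair-die rolls that show a six.

    Illustrative assumption: random.Random.randint(1, 6) models a fair die.
    The seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls


# On the frequency view, the probability claim is a claim about how this
# ratio behaves as the reference class of rolls grows.
for n in (60, 6_000, 100_000):
    print(n, round(relative_frequency_of_six(n), 4))
```

The sketch also shows what the frequency view leaves out: the function says nothing about any single roll taken on its own, which is the gap that single-case probabilities are meant to fill.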

Like other defenders of frequency theories, Popper argues that logical or subjective theories incorrectly interpret scientific claims about probability as being about the scientific investigators, and the evidence they have available to them, rather than the external world they are investigating. However, Popper argues that traditional frequency theories cannot account for single-case probabilities. For example, a frequency theorist would have no problem answering questions about “the probability that it will rain on an arbitrarily chosen August day,” since August days form a reference class. By contrast, questions about the probability that it will rain on a particular, future August day raise problems, since each particular day only occurs once. At best, frequency theories allow us to say the probability of it raining on that specific day is either 0 or 1, though we do not know which.

On Popper’s view, the failure to provide adequate treatment of single-case probabilities is a serious one, especially given what he saw as the centrality of such probabilities in quantum mechanics. To resolve this issue, Popper proposes that probabilities should be treated as the propensities of experimental setups to produce certain results, rather than as being derived from the reference class of results that were produced by running these experiments. On the propensity view, the results of experiments are important because they allow us to test hypotheses concerning the values of certain probabilities; however, the results are not themselves part of the probability itself. Popper argues that this solves the problem of single-case probability, since propensities can exist even for experiments that only happen once. Importantly, Popper does not require that these experiments utilize human intervention—instead, nature can itself run experiments, the results of which we can observe. For example, the propensity theory should, in theory, be able to make sense of claims about the probability that it will rain on a particular day, even though the experimental setup in this case is constituted by naturally occurring, meteorological phenomena.

Popper argues that the propensity theory of probability helps provide the grounds for a realist solution to the measurement problem within quantum mechanics. As opposed to the Copenhagen interpretation, which posits that the probabilities discussed in quantum mechanics reflect the ignorance of the observers, Popper argues these probabilities are in fact the propensities of the experimental setups to produce certain outcomes. Interpreted this way, he argues that they raise no interesting metaphysical dilemmas beyond those raised by classical mechanics and that they are equally amenable to a realist interpretation. Popper gives the example of tossing a penny, which he argues is strictly analogous to the experiments performed in quantum mechanics: if our experimental setup consists of simply tossing the penny, then the probability of getting heads is 1/2. If the experimental setup, however, is expanded to include the results of our looking at the penny, and thus includes the outcome of the experiment itself, then the probability will be either 0 or 1. This does not, though, involve positing any collapse of the wave-function caused merely by the act of human observation. Instead, what has occurred is simply a change in the experimental setup. Once we include the measurement result in our setup, the probability of a particular outcome will trivially become 0 or 1.
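The penny example can be sketched as a calculation in which the probability depends on how the experimental setup is specified. The names below are hypothetical, and the sketch assumes a fair penny whose two outcomes are given equal weight by the bare tossing setup; it is meant to illustrate the structure of the argument, not to represent Popper's own formalism.

```python
from fractions import Fraction
from typing import Optional

OUTCOMES = ("heads", "tails")  # assumed equally weighted for a fair penny


def probability_of_heads(observed_result: Optional[str] = None) -> Fraction:
    """Probability of heads relative to a given experimental setup.

    observed_result is None when the setup is just "toss the penny".
    When the setup is expanded to include the observed result, pass that
    result ("heads" or "tails").
    """
    # Keep only the outcomes that the specified setup still allows.
    compatible = [o for o in OUTCOMES
                  if observed_result is None or o == observed_result]
    favourable = [o for o in compatible if o == "heads"]
    return Fraction(len(favourable), len(compatible))


print(probability_of_heads())          # setup: toss only
print(probability_of_heads("heads"))   # setup: toss plus observed heads
print(probability_of_heads("tails"))   # setup: toss plus observed tails
```

Nothing in the sketch changes the penny when `observed_result` is supplied. Only the description of the setup changes, which is the sense in which the move from 1/2 to 0 or 1 is trivial on the propensity reading.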

5. Methodology in the Social Sciences

Much of Popper’s early work on the methodology of science is concerned with physics and closely related fields, especially those where experimentation plays a central role. On Popper’s view, which was discussed in detail in previous sections, these sciences make progress by formulating a theory and then carefully designing experiments and observations aimed at falsifying the purported theory. The ever-present possibility that a theory might be falsified by these sorts of tests is, on Popper’s view, precisely what differentiates legitimate sciences, such as physics, from non-scientific activities, such as philosophical metaphysics, Freudian psychoanalysis, or myth-making.

This picture becomes somewhat more complicated, however, when we consider methodology in social sciences such as sociology and economics, where experimentation plays a much less central role. On Popper’s view, there are significant problems with many of the methods used in these disciplines. In particular, Popper argues against what he calls historicism, which he describes as “an approach to the social sciences which assumes that historical prediction is their principal aim, and which assumes that this aim is attainable by discovering the ‘rhythms’ or ‘patterns’, the ‘laws’ or ‘trends’ that underlie the evolution of history” (1957, p. 3).

Popper’s central argument against historicism contends that, insofar as the whole of human history is a singular process that occurs only once, it is impossible to formulate and test any general laws about history. This stands in stark contrast to disciplines such as physics, where the formulation and testing of laws plays a central role in making progress. For example, potential laws of gravitation can be tested by observations of planetary motions, by controlled experiments concerning the rates of falling objects near the earth’s surface, or in numerous other ways. If the relevant theories are falsified, scientists can easily respond, for instance, by changing one or more auxiliary hypotheses, and then conducting additional experiments on the new, slightly modified theory. By contrast, a law that purports to describe the future progress of history in its entirety cannot easily be tested in this way. Even if a particular prediction about the occurrence of some particular event is incorrect, there is no way of altering the theory to retest it, since each historical event occurs only once, thus ruling out the possibility of carrying out more tests regarding this event. Popper also rejects the claim that it is possible to formulate and test laws of more limited scope, such as those that purport to describe an evolutionary process that occurs in multiple societies, or that attempt to capture a trend within a given society.

Popper’s opposition to historicism is also evident in his objections to what he calls utopian social engineering, which involves attempts by governments to fundamentally restructure the whole of society based on an overall plan or blueprint. On Popper’s view, the problem again concerns the impossibility of carrying out critical tests of the effectiveness of such plans. This impossibility stems from the holism of utopian plans, which involve changing everything at the same time. When the planners’ actions fail to achieve their predicted results (as Popper thinks is inevitably the case with human interventions in society), the planners have no method for determining what in particular went wrong with their plan. This lack of testability, in turn, means that there is no way for the utopian engineers to improve their plans. This argument, among others, plays a central role in Popper’s critique of Marxism and totalitarianism in The Open Society and its Enemies (1945). More details on Popper’s political philosophy, including his critique of totalitarian societies, can be found in the entry on his political philosophy.

In place of historicism and utopian holism, Popper argues that the social sciences should embrace both methodological individualism and situational analysis. On Popper’s definition, methodological individualism is the view that the behavior of social institutions should be analyzed in terms of the behaviors of the individual humans that make them up. This individualism is motivated, in part, by Popper’s contention that many important social institutions, such as the market, are not the result of any conscious design but instead arise out of the uncoordinated actions of individuals with widely disparate motives. Scientific hypotheses about the behavior of such unplanned institutions, then, must be formulated in terms of the constituent participants. Popper’s presentation and defense of methodological individualism is closely related to that provided by the Austrian economist Friedrich von Hayek (1942, 1943, 1944), with whom Popper maintained close personal and professional relationships throughout most of his life. For both Popper and Hayek, the defense of methodological individualism within the social sciences plays a key role in their broader argument in favor of liberal, market economies and against planned economies.

While Popper endorses methodological individualism, he rejects the doctrine of psychologism, according to which laws about social institutions must be reduced to psychological laws concerning the behavior of individuals. Popper objects to this view, which he associates with John Stuart Mill, on the grounds that it ends up collapsing into a form of historicism. The argument can be summarized as follows: once we begin trying to explain or predict the behavior of currently existing institutions in terms of individuals’ psychological motives, we quickly notice that these motives themselves cannot be understood without reference to the broader social environment within which these individuals find themselves. In order to eliminate the reference to the particular social institutions that make up this environment, we are then forced to demonstrate how these institutions were themselves a product of individual motives that had operated within some other previously existing social environment. This, though, quickly leads to an unsustainable regress, since humans always act within particular social environments, and their motives cannot be understood without reference to these environments. The only way out for the advocate of psychologism is to posit that both the origin and evolution of all human institutions can be explained purely in terms of human psychology. Popper argues that there is no historical support for the idea that there was ever such an origin of social institutions. He also argues that this is a form of historicism, insofar as it commits us to discovering laws governing the evolution of society as a whole. As such, it inherits all of the problems mentioned previously.

In place of psychologism, Popper endorses a version of methodological individualism based on situational analysis. On this method, we begin by creating abstract models of the social institutions that we wish to investigate, such as markets or political institutions. In keeping with methodological individualism, these models will contain, among other things, representations of individual agents. However, instead of stipulating that these agents will behave according to the laws governing individual human psychology, as psychologism does, we animate the model by assuming that the agents will respond appropriately according to the logic of the situation. Popper calls this constraint on model building within the social sciences the rationality principle.

Popper recognizes that both the rationality principle and the models built on the basis of it are empirically false; after all, real humans often respond to situations in ways that are irrational and inappropriate. Popper also rejects, however, the idea that the rationality principle should be thought of as a methodological principle that is a priori immune to testing, since part of what makes theories in the social sciences testable is the fact that they make definite claims about individual human behavior. Instead, Popper defends the use of the rationality principle in model building on the grounds that it is generally good policy to avoid blaming the falsification of a model on the inaccuracies introduced by the rationality principle and that we can learn more if we blame the other assumptions of our situational analysis (1994, p. 177). On Popper’s view, the errors introduced by the rationality principle are generally small ones, since humans are generally rational. More importantly, holding the rationality principle fixed makes it much easier for us to formulate crucial tests of rival theories and to make genuine progress in the social sciences. By contrast, if the rationality principle were relaxed, he argues, there would be almost no substantive constraints on model building.

6. Popper’s Legacy

While few of Popper’s individual claims have escaped criticism, his contributions to philosophy of science are immense. As mentioned earlier, Popper was one of the most important critics of the early logical empiricist program, and the criticisms he leveled against it helped shape the future work of both the logical empiricists and their critics. In addition, while his falsification-based approach to scientific methodology is no longer widely accepted within philosophy of science, it played a key role in laying the ground for later work in the field, including that of Kuhn, Lakatos, and Feyerabend, as well as contemporary Bayesianism. It is also plausible that the widespread popularity of falsificationism, both within and outside of the scientific community, has had an important role in reinforcing the image of science as an essentially empirical activity and in highlighting the ways in which genuine scientific work differs from so-called pseudoscience. Finally, Popper’s work on numerous specialized issues within the philosophy of science (including verisimilitude, quantum mechanics, the propensity theory of probability, and methodological individualism) has continued to influence contemporary researchers.

7. References and Further Reading

Popper Selections (1985) is an excellent introduction to Popper’s writings for the beginner, while The Philosophy of Karl Popper (Schilpp 1974) contains an extensive bibliography of Popper’s work published before that date, together with numerous critical essays and Popper’s responses to these. Finally, Unended Quest (1976) is an expanded version of the “Intellectual Autobiography” from Schilpp (1974), and it provides a helpful, non-technical overview of many of Popper’s main works in his own words.

a. Primary Sources

1945. The Open Society and Its Enemies. 2 volumes. London: Routledge.
1957. The Poverty of Historicism. London: Routledge. Originally published as a series of three articles in Economica 42, 43, and 46 (1944-1945).
1959. The Logic of Scientific Discovery. London: Hutchinson. This is an English translation of Logik der Forschung, Vienna: Springer (1935).
1959. “The Propensity Interpretation of Probability.” The British Journal for the Philosophy of Science 10 (37): 25–42.
1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge. Fifth edition 1989.
1970. “Normal Science and Its Dangers.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 51–58. Cambridge: Cambridge University Press.
1972. Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon Press. Revised edition 1979.
1974. “Replies to My Critics” and “Intellectual Autobiography.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp. 2 volumes. La Salle, Ill: Open Court.
1976. Unended Quest. London: Fontana. Revised edition 1984.
1976. “A Note on Verisimilitude.” The British Journal for the Philosophy of Science 27 (2): 147–59.
1978. “Natural Selection and the Emergence of Mind.” Dialectica 32 (3-4): 339–55.
1982. The Open Universe: An Argument for Indeterminism. Edited by W. W. Bartley III. London: Routledge.
1982. Quantum Theory and the Schism in Physics. Edited by W. W. Bartley III. New York: Routledge.
1983. Realism and the Aim of Science. Edited by W. W. Bartley III. New York: Routledge.
1985. Popper Selections. Edited by David W Miller. Princeton: Princeton University Press.
1994. The Myth of the Framework: In Defense of Science and Rationality. Edited by Mark Amadeus Notturno. London: Routledge.
1999. All Life Is Problem Solving. London: Routledge.

b. Secondary Sources

Ackermann, Robert John. 1976. The Philosophy of Karl Popper. Amherst: University of Mass. Press.
Agassi, Joseph. 2014. Popper and His Popular Critics: Thomas Kuhn, Paul Feyerabend and Imre Lakatos. 2014 edition. New York: Springer.
Blaug, Mark. 1992. The Methodology of Economics: Or, How Economists Explain. 2nd edition. New York: Cambridge University Press.
Caldwell, Bruce J. 1991. “Clarifying Popper.” Journal of Economic Literature 29 (1): 1–33.
Carnap, Rudolf. 1936. “Testability and Meaning.” Philosophy of Science 3 (4): 419–71. Continued in Philosophy of Science 4 (1): 1-40.
Carnap, Rudolf. 1995. An Introduction to the Philosophy of Science. New York: Dover. Originally published as Philosophical Foundations of Physics (1966).
Carnap, Rudolf. 2003. The Logical Structure of the World and Pseudoproblems in Philosophy. Translated by Rolf A. George. Chicago and La Salle, Ill: Open Court. Originally published in 1928 as Der logische Aufbau der Welt and Scheinprobleme in der Philosophie.
Catton, Philip, and Graham MacDonald, eds. 2004. Karl Popper: Critical Appraisals. New York: Routledge.
Currie, Gregory, and Alan Musgrave, eds. 1985. Popper and the Human Sciences. Dordrecht: Martinus Nijhoff.
Edmonds, David, and John Eidinow. 2002. Wittgenstein’s Poker: The Story of a Ten-Minute Argument Between Two Great Philosophers. Reprint edition. New York: Harper Perennial.
Feyerabend, Paul. 1975. Against Method. London; New York: New Left Books. Fourth edition 2010.
Fuller, Steve. 2004. Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
Gattei, Stefano. 2010. Karl Popper’s Philosophy of Science: Rationality without Foundations. London; New York: Routledge.
Grünbaum, Adolf. 1976. “Is Falsifiability the Touchstone of Scientific Rationality? Karl Popper Versus Inductivism.” In Essays in Memory of Imre Lakatos, edited by R. S. Cohen, P. K. Feyerabend, and M. W. Wartofsky, 213–52. Dordrecht: Springer Netherlands.
Hacking, Ian. 1983. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. Cambridge; New York: Cambridge University Press.
Hacohen, Malachi Haim. 2002. Karl Popper: The Formative Years, 1902–1945: Politics and Philosophy in Interwar Vienna. Cambridge: Cambridge University Press.
Hands, Douglas W. 1985. “Karl Popper and Economic Methodology: A New Look.” Economics and Philosophy 1 (1): 83–99.
Harris, John H. 1974. “Popper’s Definitions of ‘Verisimilitude.’” The British Journal for the Philosophy of Science 25 (2): 160–66.
Hausman, Daniel M. 1985. “Is Falsificationism Unpractised or Unpractisable?” Philosophy of the Social Sciences 15 (3): 313–19.
Hayek, Friedrich von. 1942. “Scientism and the Study of Society. Part I.” Economica, New Series, 9 (35): 267–91.
Hayek, Friedrich von. 1943. “Scientism and the Study of Society. Part II.” Economica, New Series, 10 (37): 34–63.
Hayek, Friedrich von. 1944. “Scientism and the Study of Society. Part III.” Economica, New Series, 11 (41): 27–39.
Hempel, Carl G. 1945a. “Studies in the Logic of Confirmation (I.).” Mind, New Series, 54 (213): 1–26.
Hempel, Carl G. 1945b. “Studies in the Logic of Confirmation (II.).” Mind, New Series, 54 (214): 97–121.
Howson, Colin. 1984a. “Popper’s Solution to the Problem of Induction.” The Philosophical Quarterly 34 (135): 143–47.
Howson, Colin. 1984b. “Probabilities, Propensities, and Chances.” Erkenntnis 21 (3): 279–93.
Howson, Colin, and Peter Urbach. 1989. Scientific Reasoning: The Bayesian Approach. Chicago: Open Court Publishing. Third edition 2006.
Hudelson, Richard. 1980. “Popper’s Critique of Marx.” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 37 (3): 259–70.
Hume, David. 1993. An Enquiry Concerning Human Understanding: With Hume’s Abstract of A Treatise of Human Nature and A Letter from a Gentleman to His Friend in Edinburgh. Edited by Eric Steinberg. 2nd ed. Indianapolis: Hackett Publishing Company, Inc.
Jeffrey, Richard C. 1975. “Probability and Falsification: Critique of the Popper Program.” Synthese 30 (1/2): 95–117.
Keuth, Herbert. 2004. The Philosophy of Karl Popper. New York: Cambridge University Press.
Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press. Third edition 1996.
Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 91–196. Cambridge: Cambridge University Press.
Lakatos, Imre. 1980. The Methodology of Scientific Research Programmes: Volume 1: Philosophical Papers. Cambridge University Press.
Lakatos, Imre, and Alan Musgrave, eds. 1970. Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.
Levi, Isaac. 1963. “Corroboration and Rules of Acceptance.” The British Journal for the Philosophy of Science 13 (52): 307–13.
Magee, Bryan. 1985. Philosophy and the Real World: An Introduction to Karl Popper. La Salle, Ill: Open Court.
Maher, Patrick. 1990. “Why Scientists Gather Evidence.” The British Journal for the Philosophy of Science 41 (1): 103–19.
Miller, David. 1974. “Popper’s Qualitative Theory of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 166–77.
Miller, David. 1998. Critical Rationalism: A Restatement and Defense. Chicago: Open Court.
Munz, Peter. 1985. Our Knowledge of the Growth of Knowledge: Popper or Wittgenstein? London; New York: Routledge.
O’Hear, Anthony. 1996. Karl Popper: Philosophy and Problems. Cambridge; New York: Cambridge University Press.
Putnam, Hilary. 1974. “The ‘corroboration’ of Theories.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp, 221–40. La Salle, Ill: Open Court.
Rowbottom, Darrell. 2010. Popper’s Critical Rationalism: A Philosophical Investigation. New York: Routledge.
Runde, Jochen. 1996. “On Popper, Probabilities, and Propensities.” Review of Social Economy 54 (4): 465–85.
Ruse, Michael. 1977. “Karl Popper’s Philosophy of Biology.” Philosophy of Science 44 (4): 638–61.
Salmon, Wesley. 1967. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press.
Salmon, Wesley. 1981. “Rational Prediction.” The British Journal for the Philosophy of Science 32 (2): 115–25.
Schilpp, Paul Arthur, ed. 1974. The Philosophy of Karl Popper. 2 volumes. La Salle, Ill: Open Court.
Thornton, Stephen. 2014. “Karl Popper.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.
Tichý, Pavel. 1974. “On Popper’s Definitions of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 155–60.
Urbach, Peter. 1978. “Is Any of Popper’s Arguments against Historicism Valid?” The British Journal for the Philosophy of Science 29 (2): 117–30.

Author Information

Brendan Shea
Email: Brendan.Shea@rctc.edu
Rochester Community and Technical College, Minnesota Center for Philosophy of Science
U. S. A.

Albert Camus (1913—1960)

Albert Camus was a French-Algerian journalist, playwright, novelist, philosophical essayist, and Nobel laureate. Though he was neither by advanced training nor profession a philosopher, he nevertheless made important, forceful contributions to a wide range of issues in moral philosophy in his novels, reviews, articles, essays, and speeches—from terrorism and political violence to suicide and the death penalty. He is often described as an existentialist writer, though he himself disavowed the label. He began his literary career as a political journalist and as an actor, director, and playwright in his native Algeria. Later, while living in occupied France during WWII, he became active in the Resistance and from 1944-47 served as editor-in-chief of the newspaper Combat. By mid-century, based on the strength of his three novels (The Stranger, The Plague, and The Fall) and two book-length philosophical essays (The Myth of Sisyphus and The Rebel), he had achieved an international reputation and readership. It was in these works that he introduced and developed the twin philosophical ideas—the concept of the Absurd and the notion of Revolt—that made him famous. These are the ideas that people immediately think of when they hear the name Albert Camus spoken today. The Absurd can be defined as a metaphysical tension or opposition that results from the presence of human consciousness—with its ever-pressing demand for order and meaning in life—in an essentially meaningless and indifferent universe. Camus considered the Absurd to be a fundamental and even defining characteristic of the modern human condition. The notion of Revolt refers to both a path of resolved action and a state of mind. It can take extreme forms such as terrorism or a reckless and unrestrained egoism (both of which are rejected by Camus), but basically, and in simple terms, it consists of an attitude of heroic defiance or resistance to whatever oppresses human beings. 
In awarding Camus its prize for literature in 1957, the Nobel Prize committee cited his persistent efforts to “illuminate the problem of the human conscience in our time.” He was honored by his own generation, and is still admired today, for being a writer of conscience and a champion of imaginative literature as a vehicle of philosophical insight and moral truth. He was at the height of his career—at work on an autobiographical novel, planning new projects for theatre, film, and television, and still seeking a solution to the lacerating political turmoil in his homeland—when he died tragically in an automobile accident in January 1960.

Life
Literary Career
Camus, Philosophical Literature, and the Novel of Ideas
Works
Philosophy
Existentialism
Camus, Colonialism, and Algeria
Significance and Legacy
References and Further Reading
1. Works by Albert Camus
2. Critical and Biographical Studies

1. Life

Albert Camus was born on November 7, 1913, in Mondovi, a small village near the seaport city of Bône (present-day Annaba) in the northeast region of French Algeria. He was the second child of Lucien Auguste Camus, a military veteran and wine-shipping clerk, and of Catherine Hélène (Sintes) Camus, a house-keeper and part-time factory worker. (Note: Although Camus believed that his father was Alsatian and a first-generation émigré, research by biographer Herbert Lottman indicates that the Camus family was originally from Bordeaux and that the first Camus to leave France for Algeria was actually the author’s great-grandfather, who in the early 19th century became part of the first wave of European colonial settlers in the new melting pot of North Africa.)

Shortly after the outbreak of WWI, when Camus was less than a year old, his father was recalled to military service and, on October 11, 1914, died of shrapnel wounds suffered at the first battle of the Marne. As a child, about the only thing Camus ever learned about his father was that he had once become violently ill after witnessing a public execution. This anecdote, which surfaces in fictional form in the author’s novel The Stranger and is also recounted in his philosophical essay “Reflections on the Guillotine,” strongly affected Camus and influenced his lifelong opposition to the death penalty.

After his father’s death, Camus, his mother, and his older brother moved to Algiers where they lived with his maternal uncle and grandmother in her cramped second-floor apartment in the working-class district of Belcourt. Camus’s mother Catherine, who was illiterate, partially deaf, and afflicted with a speech pathology, worked in an ammunition factory and cleaned homes to help support the family. In his posthumously published autobiographical novel The First Man, Camus recalls this period of his life with a mixture of pain and affection as he describes conditions of harsh poverty (the three-room apartment had no bathroom, no electricity, and no running water) relieved by hunting trips, family outings, childhood games, and scenic flashes of sun, seashore, mountain, and desert.

Camus attended elementary school at the local École Communale, and it was there that he encountered the first in a series of teacher-mentors who recognized and nurtured the young boy’s lively intelligence. These father figures introduced him to a new world of history and imagination and to literary landscapes far beyond the dusty streets of Belcourt and working-class poverty. Though stigmatized as a pupille de la nation (that is, a war veteran’s child dependent on public welfare) and hampered by recurrent health issues, Camus distinguished himself as a student and was eventually awarded a scholarship to attend high school at the Grand Lycée. Located near the famous Kasbah district, the school brought him into close proximity with the native Muslim community and thus gave him an early recognition of the idea of the “outsider” that would dominate his later writings.

It was in secondary school that Camus became an avid reader (absorbing Gide, Proust, Verlaine, and Bergson, among others), learned Latin and English, and developed a lifelong interest in literature, art, theatre, and film. He also enjoyed sports, especially soccer, of which he once wrote (recalling his early experience as a goal-keeper): “I learned . . . that a ball never arrives from the direction you expected it. That helped me in later life, especially in mainland France, where nobody plays straight.” It was also during this period that Camus suffered his first serious attack of tuberculosis, a disease that was to afflict him, on and off, throughout his career.

By the time he finished his Baccalauréat degree in June 1932, Camus was already contributing articles to Sud, a literary monthly, and looking forward to a career in journalism, the arts, or higher education. The next four years (1933-37) were an especially busy period in his life during which he attended college, worked at odd jobs, married his first wife (Simone Hié), divorced, briefly joined the Communist Party, and effectively began his professional theatrical and writing career. Among his various employments during this time were stints of routine office work: one job consisted of a Bartleby-like recording and sifting of meteorological data, and another involved paper shuffling in an auto license bureau. One can well imagine that it was as a result of this experience that his famous conception of Sisyphean struggle, heroic defiance in the face of the Absurd, first began to take shape within his imagination.

In 1933, Camus enrolled at the University of Algiers to pursue his diplôme d’études supérieures, specializing in philosophy and gaining certificates in sociology and psychology along the way. In 1936, he became a co-founder, along with a group of young fellow intellectuals, of the Théâtre du Travail, a professional acting company specializing in drama with left-wing political themes. Camus served the company as both an actor and director and also contributed scripts, including his first published play Revolt in Asturias, a drama based on the ill-fated 1934 workers’ revolt in the Asturias region of Spain. That same year Camus also earned his degree and completed his dissertation, a study of the influence of Plotinus and neo-Platonism on the thought and writings of St. Augustine.

Over the next three years Camus further established himself as an emerging author, journalist, and theatre professional. After his disillusionment with and eventual expulsion from the Communist Party, he reorganized his dramatic company and renamed it the Théâtre de l’Equipe (literally the Theater of the Team). The name change signaled a new emphasis on classic drama and avant-garde aesthetics and a shift away from labor politics and agitprop. In 1938 he joined the staff of a new daily newspaper, the Alger Républicain, where his assignments as a reporter and reviewer covered everything from contemporary European literature to local political trials. It was during this period that he also published his first two literary works—Betwixt and Between, a collection of five short semi-autobiographical and philosophical pieces (1937) and Nuptials, a series of lyrical celebrations interspersed with political and philosophical reflections on North Africa and the Mediterranean.

The 1940s witnessed Camus’s gradual ascendance to the rank of world-class literary intellectual. He started the decade as a locally acclaimed author and playwright, but he was a figure virtually unknown outside the city of Algiers; however, he ended the decade as an internationally recognized novelist, dramatist, journalist, philosophical essayist, and champion of freedom. This period of his life began inauspiciously—war in Europe, the occupation of France, official censorship, and a widening crackdown on left-wing journals. Camus was still without stable employment or steady income when, after marrying his second wife, Francine Faure, in December of 1940, he departed Lyons, where he had been working as a journalist, and returned to Algeria. To help make ends meet, he taught part-time (French history and geography) at a private school in Oran. All the while he was putting finishing touches to his first novel The Stranger, which was finally published in 1942 to favorable critical response, including a lengthy and penetrating review by Jean-Paul Sartre. The novel propelled him into immediate literary renown.

Camus returned to France in 1942 and a year later began working for the clandestine newspaper Combat, the journalistic arm and voice of the French Resistance movement. During this period, while contending with recurrent bouts of tuberculosis, he also published The Myth of Sisyphus, his philosophical anatomy of suicide and the absurd, and joined Gallimard Publishing as an editor, a position he held until his death.

After the Liberation, Camus continued as editor of Combat, oversaw the production and publication of two plays, The Misunderstanding and Caligula, and assumed a leading role in Parisian intellectual society in the company of Sartre and Simone de Beauvoir among others. In the late 40s his growing reputation as a writer and thinker was enlarged by the publication of The Plague, an allegorical novel and fictional parable of the Nazi Occupation and the duty of revolt, and by lecture tours to the United States and South America. In 1951 he published The Rebel, a reflection on the nature of freedom and rebellion and a philosophical critique of revolutionary violence. This powerful and controversial work, with its explicit condemnation of Marxism-Leninism and its emphatic denunciation of unrestrained violence as a means of human liberation, led to an eventual falling out with Sartre and, along with his opposition to the Algerian National Liberation Front, to his being branded a reactionary in the view of many European Communists. Yet his position also established him as an outspoken champion of individual freedom and as an impassioned critic of tyranny and terrorism, whether practiced by the Left or by the Right.

In 1956, Camus published the short, confessional novel The Fall, which unfortunately would be the last of his completed major works and which in the opinion of some critics is the most elegant and most underrated of all his books. During this period he was still afflicted by tuberculosis and was perhaps even more sorely beset by the deteriorating political situation in his native Algeria, which had by now escalated from demonstrations and occasional terrorist and guerrilla attacks into open violence and insurrection. Camus still hoped to champion some kind of rapprochement that would allow the native Muslim population and the French pied-noir minority to live together peaceably in a new de-colonized and largely integrated, if not fully independent, nation. Alas, by this point, as he painfully realized, such an outcome was becoming increasingly unlikely.

In the fall of 1957, following publication of Exile and the Kingdom, a collection of short fiction, Camus was shocked by the news that he had been awarded the Nobel Prize for literature. He absorbed the announcement with mixed feelings of gratitude, humility, and amazement. On the one hand, the award was obviously a tremendous honor. On the other, not only did he feel that his friend and esteemed fellow novelist André Malraux was more deserving, he was also aware that the Nobel itself was widely regarded as the kind of accolade usually given to artists at the end of a long career. Yet, as he indicated in his acceptance speech at Stockholm, he considered his own career as still in mid-flight, with much yet to accomplish and even greater writing challenges ahead:

Every person, and assuredly every artist, wants to be recognized. So do I. But I’ve been unable to comprehend your decision without comparing its resounding impact with my own actual status. A man almost young, rich only in his doubts, and with his work still in progress…how could such a man not feel a kind of panic at hearing a decree that transports him all of a sudden…to the center of a glaring spotlight? And with what feelings could he accept this honor at a time when other writers in Europe, among them the very greatest, are condemned to silence, and even at a time when the country of his birth is going through unending misery?

Of course Camus could not have known as he spoke these words that most of his writing career was in fact behind him. Over the next two years, he published articles and continued to write, produce, and direct plays, including his own adaptation of Dostoyevsky’s The Possessed. He also formulated new concepts for film and television, assumed a leadership role in a new experimental national theater, and continued to campaign for peace and a political solution in Algeria. Unfortunately, none of these latter projects would be brought to fulfillment. On January 4, 1960, Camus died tragically in a car accident while he was a passenger in a vehicle driven by his friend and publisher Michel Gallimard, who also suffered fatal injuries. The author was buried in the local cemetery at Lourmarin, a village in Provence where he and his wife and children had lived for nearly a decade.

Upon hearing of Camus’s death, Sartre wrote a moving eulogy in the France-Observateur, saluting his former friend and political adversary not only for his distinguished contributions to French literature but especially for the heroic moral courage and “stubborn humanism” which he brought to bear against the “massive and deformed events of the day.”

2. Literary Career

According to Sartre’s perceptive appraisal, Camus was less a novelist and more a writer of philosophical tales and parables in the tradition of Voltaire. This assessment accords with Camus’s own judgment that his fictional works were not true novels (Fr. romans), a form he associated with the densely populated and richly detailed social panoramas of writers like Balzac, Tolstoy, and Proust, but rather contes (“tales”) and récits (“narratives”) combining philosophical and psychological insights.

In this respect, it is also worth noting that at no time in his career did Camus ever describe himself as a deep thinker or lay claim to the title of philosopher. Instead, he nearly always referred to himself simply, yet proudly, as un ecrivain—a writer. This is an important fact to keep in mind when assessing his place in intellectual history and in twentieth-century philosophy, for by no means does he qualify as a system-builder or theorist or even as a disciplined thinker. He was instead (and here again Sartre’s assessment is astute) a sort of all-purpose critic and modern-day philosophe: a debunker of mythologies, a critic of fraud and superstition, an enemy of terror, a voice of reason and compassion, and an outspoken defender of freedom—all in all a figure very much in the Enlightenment tradition of Voltaire and Diderot. For this reason, in assessing Camus’s career and work, it may be best simply to take him at his own word and characterize him first and foremost as a writer—advisedly attaching the epithet “philosophical” for sharper accuracy and definition.

3. Camus, Philosophical Literature, and the Novel of Ideas

To pin down exactly why and in what distinctive sense Camus may be termed a philosophical writer, we can begin by comparing him with other authors who have merited the designation. Right away, we can eliminate any comparison with the efforts of Lucretius and Dante, who undertook to unfold entire cosmologies and philosophical systems in epic verse. Camus obviously attempted nothing of the sort. On the other hand, we can draw at least a limited comparison between Camus and writers like Pascal, Kierkegaard, and Nietzsche—that is, with writers who were first of all philosophers or religious writers, but whose stylistic achievements and literary flair gained them a special place in the pantheon of world literature as well. Here we may note that Camus himself was very conscious of his debt to Kierkegaard and Nietzsche (especially in the style and structure of The Myth of Sisyphus and The Rebel) and that he might very well have followed in their literary-philosophical footsteps if his tuberculosis had not side-tracked him into fiction and journalism and prevented him from pursuing an academic career.

Perhaps Camus himself best defined his own particular status as a philosophical writer when he wrote (with authors like Melville, Stendhal, Dostoyevsky, and Kafka especially in mind): “The great novelists are philosophical novelists”; that is, writers who eschew systematic explanation and create their discourse using “images instead of arguments” (The Myth of Sisyphus 74).

By his own definition then Camus is a philosophical writer in the sense that he has (a) conceived his own distinctive and original world-view and (b) sought to convey that view mainly through images, fictional characters and events, and via dramatic presentation rather than through critical analysis and direct discourse. He is also both a novelist of ideas and a psychological novelist, and in this respect, he certainly compares most closely to Dostoyevsky and Sartre, two other writers who combine a unique and distinctly philosophical outlook, acute psychological insight, and a dramatic style of presentation. (Like Camus, Sartre was a productive playwright, and Dostoyevsky remains perhaps the most dramatic of all novelists, as Camus clearly understood, having adapted both The Brothers Karamazov and The Possessed for the stage.)

4. Works

Camus’s reputation rests largely on the three novels published during his lifetime—The Stranger, The Plague, and The Fall—and on his two major philosophical essays—The Myth of Sisyphus and The Rebel. However, his body of work also includes a collection of short fiction, Exile and the Kingdom; an autobiographical novel, The First Man; a number of dramatic works, most notably Caligula, The Misunderstanding, The State of Siege, and The Just Assassins; several translations and adaptations, including new versions of works by Calderon, Lope de Vega, Dostoyevsky, and Faulkner; and a lengthy assortment of essays, prose pieces, critical reviews, transcribed speeches and interviews, articles, and works of journalism. A brief summary and description of the most important of Camus’s writings is presented below as preparation for a larger discussion of his philosophy and world-view, including his main ideas and recurrent philosophical themes.

a. Fiction

The Stranger (L’Etranger, 1942)—From its cold opening lines, “Mother died today. Or maybe yesterday; I can’t be sure,” to its bleak concluding image of a public execution set to take place beneath the “benign indifference of the universe,” Camus’s first and most famous novel takes the form of a terse, flat, first-person narrative by its main character Meursault, a very ordinary young man of unremarkable habits and unemotional affect who, inexplicably and in an almost absent-minded way, kills an Arab and then is arrested, tried, convicted, and sentenced to death. The neutral style of the novel—typical of what the critic Roland Barthes called “writing degree zero”—serves as a perfect vehicle for the descriptions and commentary of its anti-hero narrator, the ultimate “outsider” and a person who seems to observe everything, including his own life, with almost pathological detachment.

The Plague (La Peste, 1947)—Set in the coastal town of Oran, Camus’s second novel is the story of an outbreak of plague, traced from its subtle, insidious, unheeded beginnings and horrible, seemingly irresistible dominion to its eventual climax and decline, all told from the viewpoint of one of the survivors. Camus made no effort to conceal the fact that his novel was partly based on and could be interpreted as an allegory or parable of the rise of Nazism and the nightmare of the Occupation. However, the plague metaphor is both more complicated and more flexible than that, extending to signify the Absurd in general as well as any calamity or disaster that tests the mettle of human beings, their endurance, their solidarity, their sense of responsibility, their compassion, and their will. At the end of the novel, the plague finally retreats, and the narrator reflects that a time of pestilence teaches “that there is more to admire in men than to despise,” but he also knows “that the plague bacillus never dies or disappears for good,” that “the day would come when, for the bane and the enlightening of men, it would rouse up its rats again” and send them forth yet once more to spread death and contagion into a happy and unsuspecting city.

The Fall (La Chute, 1956)—Camus’s third novel, and the last to be published during his lifetime, is in effect an extended dramatic monologue spoken by M. Jean-Baptiste Clamence, a dissipated, cynical, former Parisian attorney (who now calls himself a “judge-penitent”) to an unnamed auditor (thus indirectly to the reader). Set in a seedy bar in the red-light district of Amsterdam, the work is a small masterpiece of compression and style: a confessional (and semi-autobiographical) novel, an arresting character study and psychological portrait, and at the same time a wide-ranging philosophical discourse on guilt and innocence, expiation and punishment, good and evil.

b. Drama

Camus began his literary career as a playwright and theatre director and was planning new dramatic works for film, stage, and television at the time of his death. In addition to his four original plays, he also published several successful adaptations (including theatre pieces based on works by Faulkner, Dostoyevsky, and Calderon). He took particular pride in his work as a dramatist and man of the theatre. However, his plays never achieved the same popularity, critical success, or level of incandescence as his more famous novels and major essays.

Caligula (1938, first produced 1945)—“Men die and are not happy.” Such is the complaint against the universe pronounced by the young emperor Caligula, who in Camus’s play is less the murderous lunatic, slave to incest, narcissist, and megalomaniac of Roman history than a theatrical martyr-hero of the Absurd: a man who carries his philosophical quarrel with the meaninglessness of human existence to a kind of fanatical but logical extreme. Camus described his hero as a man “obsessed with the impossible” willing to pervert all values, and if necessary destroy himself and all those around him in the pursuit of absolute liberty. Caligula was Camus’s first attempt at portraying a figure in absolute defiance of the Absurd, and through three revisions of the play over a period of several years he eventually achieved a remarkable composite by adding to Caligula’s original portrait touches of Sade, of revolutionary nihilism, of the Nietzschean Superman, of his own version of Sisyphus, and even of Mussolini and Hitler.

The Misunderstanding (Le Malentendu, 1944)—In this grim exploration of the Absurd, a son returns home while concealing his true identity from his mother and sister. The two women operate a boarding house where, in order to make ends meet, they quietly murder and rob their patrons. Through a tangle of misunderstanding and mistaken identity they wind up murdering their unrecognized visitor. Camus has explained the drama as an attempt to capture the atmosphere of malaise, corruption, demoralization, and anonymity that he experienced while living in France during the German occupation. Despite the play’s dark themes and bleak style, he described its philosophy as ultimately optimistic: “It amounts to saying that in an unjust or indifferent world man can save himself, and save others, by practicing the most basic sincerity and pronouncing the most appropriate word.”

State of Siege (L’Etat de Siege, 1948)—This odd allegorical drama combines features of the medieval morality play with elements of Calderon and the Spanish baroque; it also has apocalyptic themes, bits of music hall comedy, and a collection of avant-garde theatrics thrown in for good measure. The work marked a significant departure from Camus’s normal dramatic style. It also resulted in virtually universal disapproval and negative reviews from Paris theatre-goers and critics, many of whom came expecting a play based on Camus’s recent novel The Plague. The play is set in the Spanish seaport city of Cadiz, famous for its beaches, carnivals, and street musicians. By the end of the first act, the normally laid-back and carefree citizens fall under the dominion of a gaudily beribboned and uniformed dictator named Plague (based on Generalissimo Franco) and his officious, clipboard-wielding Secretary (who turns out to be a modern, bureaucratic incarnation of the medieval figure Death). One of the prominent concerns of the play is the Orwellian theme of the degradation of language via totalitarian politics and bureaucracy (symbolized onstage by calls for silence, scenes in pantomime, and a gagged chorus). As one character observes, “we are steadily nearing that perfect moment when nothing anybody says will rouse the least echo in another’s mind.”

The Just Assassins (Les Justes, 1950)—First performed in Paris to largely favorable reviews, this play is based on real-life characters and an actual historical event: the 1905 assassination of the Russian Grand Duke Sergei Alexandrovich by Ivan Kalyayev and fellow members of the Combat Organization of the Socialist Revolutionary Party. The play effectively dramatizes the issues that Camus would later explore in detail in The Rebel, especially the question of whether acts of terrorism and political violence can ever be morally justified (and if so, with what limitations and in what specific circumstances). The historical Kalyayev passed up his original opportunity to bomb the Grand Duke’s carriage because the Duke was accompanied by his wife and two young nephews. However, this was no act of conscience on Kalyayev’s part but a purely practical decision based on his calculation that the murder of children would prove a setback to the revolution. After the successful completion of his bombing mission and subsequent arrest, Kalyayev welcomed his execution on similarly practical and purely political grounds, believing that his death would further the cause of revolution and social justice. Camus’s Kalyayev, on the other hand, is a far more agonized and conscientious figure, neither so cold-blooded nor so calculating as his real-life counterpart. Upon seeing the two children in the carriage, he refuses to toss his bomb not because doing so would be politically inexpedient but because he is overcome emotionally, temporarily unnerved by the sad expression in their eyes. Similarly, at the end of the play he embraces his death not so much because it will aid the revolution, but almost as a form of karmic penance, as if it were indeed some kind of sacred duty or metaphysical requirement that must be performed in order for true justice to be achieved.

c. Essays, Letters, Prose Collections, Articles, and Reviews

Betwixt and Between (L’Envers et l’endroit, 1937)—This short collection of semi-autobiographical, semi-fictional, philosophical pieces might be dismissed as juvenilia and largely ignored if it were not for the fact that it represents Camus’s first attempt to formulate a coherent life-outlook and world-view. The collection, which in a way serves as a germ or starting point for the author’s later philosophy, consists of five lyrical essays. In “Irony” (“L’Ironie”), a reflection on youth and age, Camus asserts, in the manner of a young disciple of Pascal, our essential solitariness in life and death. In “Between yes and no” (“Entre Oui et Non”) he suggests that to hope is as empty and as pointless as to despair, yet he goes beyond nihilism by positing a fundamental value to existence-in-the-world. In “Death in the soul” (“La Mort dans l’ame”) he supplies a sort of existential travel review, contrasting his impressions of central and Eastern Europe (which he views as purgatorial and morgue-like) with the more spontaneous life of Italy and Mediterranean culture. The piece thus affirms the author’s lifelong preference for the color and vitality of the Mediterranean world, and especially North Africa, as opposed to what he perceives as the soulless cold-heartedness of modern Europe. In “Love of life” (“Amour de vivre”) he claims there can be no love of life without despair of life and thus largely re-asserts the essentially tragic, ancient Greek view that the very beauty of human existence is largely contingent upon its brevity and fragility. The concluding essay, “Betwixt and between” (“L’Envers et l’endroit”), summarizes and re-emphasizes the Romantic themes of the collection as a whole: our fundamental “aloneness,” the importance of imagination and openness to experience, the imperative to “live as if….”

Nuptials (Noces, 1938)—This collection of four rhapsodic narratives supplements and amplifies the youthful philosophy expressed in Betwixt and Between. That joy is necessarily intertwined with despair, that the shortness of life confers a premium on intense experience, and that the world is both beautiful and violent—these are, once again, Camus’s principal themes. “Summer in Algiers,” which is probably the best (and best-known) of the essays in the collection, is a lyrical, at times almost ecstatic, celebration of sea, sun, and the North African landscape. Affirming a defiantly atheistic creed, Camus concludes with one of the core ideas of his philosophy: “If there is a sin against life, it consists not so much in despairing as in hoping for another life and in eluding the implacable grandeur of this one.”

The Myth of Sisyphus (Le Mythe de Sisyphe, 1943)—If there is a single non-fiction work that can be considered an essential or fundamental statement of Camus’s philosophy, it is this extended essay on the ethics of suicide (eventually translated and repackaged for American publication in 1955). It is here that Camus formally introduces and fully articulates his most famous idea, the concept of the Absurd, and his equally famous image of life as a Sisyphean struggle. From its provocative opening sentence—“There is but one truly serious philosophical problem, and that is suicide”—to its stirring, paradoxical conclusion—“The struggle itself toward the heights is enough to fill a man’s heart. One must imagine Sisyphus happy”—the book has something interesting and challenging on nearly every page and is shot through with brilliant aphorisms and insights. In the end, Camus rejects suicide: the Absurd must not be evaded either by religion (“philosophical suicide”) or by annihilation (“physical suicide”); the task of living should not merely be accepted, it must be embraced.

The Rebel (L’Homme Revolte, 1951)—Camus considered this work a continuation of the critical and philosophical investigation of the Absurd that he began with The Myth of Sisyphus. Only this time his primary concern is not suicide but murder. He takes up the question of whether acts of terrorism and political violence can be morally justified, which is basically the same question he had addressed earlier in his play The Just Assassins. After arguing that an authentic life inevitably involves some form of conscientious moral revolt, Camus winds up concluding that only in rare and very narrowly defined instances is political violence justified. Camus’s critique of revolutionary violence and terror in this work, and particularly his caustic assessment of Marxism-Leninism (which he accused of sacrificing innocent lives on the altar of History), touched nerves throughout Europe and led in part to his celebrated feud with Sartre and other French leftists.

Resistance, Rebellion, and Death (1960)—This posthumous collection is of interest to students of Camus mainly because it brings together an unusual assortment of his non-fiction writings on a wide range of topics, from art and politics to the advantages of pessimism and the virtues (from a non-believer’s standpoint) of Christianity. Of special interest are two pieces that helped secure Camus’s worldwide reputation as a voice of liberty: “Letters to a German Friend,” a set of four letters originally written during the Nazi Occupation, and “Reflections on the Guillotine,” a denunciation of the death penalty cited for special mention by the Nobel committee and eventually revised and re-published as a companion essay to go with fellow death-penalty opponent Arthur Koestler’s “Reflections on Hanging.”

5. Philosophy

To re-emphasize a point made earlier, Camus considered himself first and foremost a writer (un ecrivain). Indeed, Camus’s dissertation advisor penciled onto his dissertation the assessment “More a writer than a philosopher.” And at various times in his career he also accepted the labels journalist, humanist, novelist, and even moralist. However, he apparently never felt comfortable identifying himself as a philosopher—a term he seems to have associated with rigorous academic training, systematic thinking, logical consistency, and a coherent, carefully defined doctrine or body of ideas.

This is not to suggest that Camus lacked ideas or to say that his thought cannot be considered a personal philosophy. It is simply to point out that he was not a systematic, or even a notably disciplined thinker and that, unlike Heidegger and Sartre, for example, he showed very little interest in metaphysics and ontology, which seems to be one of the reasons he consistently denied that he was an existentialist. In short, he was not much given to speculative philosophy or any kind of abstract theorizing. His thought is instead nearly always related to current events (e.g., the Spanish War, revolt in Algeria) and is consistently grounded in down-to-earth moral and political reality.

a. Background and Influences

Though he was baptized, raised, and educated as a Catholic and invariably respectful towards the Church, Camus seems to have been a natural-born pagan who showed almost no instinct whatsoever for belief in the supernatural. Even as a youth, he was more of a sun-worshipper and nature lover than a boy notable for his piety or religious faith. On the other hand, there is no denying that Christian literature and philosophy served as an important influence on his early thought and intellectual development. As a young high school student, Camus studied the Bible, read and savored the Spanish mystics St. Theresa of Avila and St. John of the Cross, and was introduced to the thought of St. Augustine. St. Augustine would later serve as the subject of his baccalaureate dissertation and become—as a fellow North African writer, quasi-existentialist, and conscientious observer-critic of his own life—an important lifelong influence.

In college Camus absorbed Kierkegaard, who, after Augustine, was probably the single greatest Christian influence on his thought. He also studied Schopenhauer and Nietzsche—undoubtedly the two writers who did the most to set him on his own path of defiant pessimism and atheism. Other notable influences include not only the major modern philosophers from the academic curriculum—from Descartes and Spinoza to Bergson—but also, and just as importantly, philosophical writers like Stendhal, Melville, Dostoyevsky, and Kafka.

b. Development

The two earliest expressions of Camus’s personal philosophy are his works Betwixt and Between (1937) and Nuptials (1938). Here he unfolds what is essentially a hedonistic, indeed almost primitivistic, celebration of nature and the life of the senses. In the Romantic poetic tradition of writers like Rilke and Wallace Stevens, he offers a forceful rejection of all hereafters and an emphatic embrace of the here and now. There is no salvation, he argues, no transcendence; there is only the enjoyment of consciousness and natural being. One life, this life, is enough. Sky and sea, mountain and desert, have their own beauty and magnificence and constitute a sufficient heaven.

The critic John Cruikshank termed this stage in Camus’s thinking “naïve atheism” and attributed it to his ecstatic and somewhat immature “Mediterraneanism.” Naïve seems an apt characterization for a philosophy that is romantically bold and uncomplicated yet somewhat lacking in sophistication and logical clarity. On the other hand, if we keep in mind Camus’s theatrical background and preference for dramatic presentation, there may actually be more depth and complexity to his thought here than meets the eye. That is to say, just as it would be simplistic and reductive to equate Camus’s philosophy of revolt with that of his character Caligula (who is at best a kind of extreme or mad spokesperson for the author), so in the same way it is possible that the pensées and opinions presented in Nuptials and Betwixt and Between are not so much the views of Camus as they are poetically heightened observations of an artfully crafted narrator—an exuberant alter ego who is far more spontaneous and free-spirited than his more naturally reserved and sober-minded author.

In any case, regardless of this assessment of the ideas expressed in Betwixt and Between and Nuptials, it is clear that these early writings represent an important, if comparatively raw and simple, beginning stage in Camus’s development as a thinker, one in which his views differ markedly from his more mature philosophy in several noteworthy respects. In the first place, the Camus of Nuptials is still a young man of twenty-five, aflame with youthful joie de vivre. He favors a life of impulse and daring as it was honored and practiced both in Romantic literature and in the streets of Belcourt. Recently married and divorced, raised in poverty and in close quarters, beset with health problems, this young man develops an understandable passion for clear air, open space, colorful dreams, panoramic vistas, and the breath-taking prospects and challenges of the larger world. Consequently, the Camus of the period 1937-38 is a decidedly different writer from the Camus who will ascend the dais at Stockholm nearly twenty years later.

The young Camus is more of a sensualist and pleasure-seeker, more of a dandy and aesthete, than the more hardened and austere figure who will endure the Occupation while serving in the French underground. He is a writer passionate in his conviction that life ought to be lived vividly and intensely—indeed rebelliously (to use the term that will take on increasing importance in his thought). He is also a writer attracted to causes, though he is not yet the author who will become world-famous for his moral seriousness and passionate commitment to justice and freedom. All of which is understandable. After all, the Camus of the middle 1930s had not yet witnessed and absorbed the shattering spectacle and disillusioning effects of the Spanish Civil War, the rise of Fascism, Hitlerism, and Stalinism, the coming into being of total war and weapons of mass destruction, and the terrible reign of genocide and terror that would characterize the period 1938-1945. It was under the pressure of, and in direct response to, the events of this period that Camus’s mature philosophy—with its core set of humanistic themes and ideas—emerged and gradually took shape. That mature philosophy is no longer a “naïve atheism” but a very reflective and critical brand of unbelief. It is proudly and inconsolably pessimistic, but not in a polemical or overbearing way. It is unbending, hardheaded, determinedly skeptical. It is tolerant and respectful of world religious creeds, but at the same time wholly unsympathetic to them. In the end it is an affirmative philosophy that accepts and approves, and in its own way blesses, our dreadful mortality and our fundamental isolation in the world.

c. Themes and Ideas

Regardless of whether he is producing drama, fiction, or non-fiction, Camus in his mature writings nearly always takes up and re-explores the same basic philosophical issues. These recurrent topoi constitute the key components of his thought. They include themes like the Absurd, alienation, suicide, and rebellion that almost automatically come to mind whenever his name is mentioned. Hence any summary of his place in modern philosophy would be incomplete without at least a brief discussion of these ideas and how they fit together to form a distinctive and original world-view.

i. The Absurd

Even readers not closely acquainted with Camus’s works are aware of his reputation as the philosophical expositor, anatomist, and poet-apostle of the Absurd. Indeed, as even sitcom writers and stand-up comics apparently understand (odd fact: the comic-bleak final episode of Seinfeld has been compared to The Stranger, and Camus’s thought has been used to explain episodes of The Simpsons), it is largely through the thought and writings of the French-Algerian author that the concept of absurdity has become a part not only of world literature and twentieth-century philosophy but also of modern popular culture.

What then is meant by the notion of the Absurd? Contrary to the view conveyed by popular culture, the Absurd (at least in Camus’s terms) does not simply refer to some vague perception that modern life is fraught with paradoxes, incongruities, and intellectual confusion. (Although that perception is certainly consistent with his formula.) Instead, as he himself emphasizes and tries to make clear, the Absurd expresses a fundamental disharmony, a tragic incompatibility, in our existence. In effect, he argues that the Absurd is the product of a collision or confrontation between our human desire for order, meaning, and purpose in life and the blank, indifferent “silence of the universe.” (“The absurd is not in man nor in the world,” Camus explains, “but in their presence together . . . it is the only bond uniting them.”)

So here we are: poor creatures desperately seeking hope and meaning in a hopeless, meaningless world. Sartre, in his essay-review of The Stranger, provides an additional gloss on the idea: “The absurd, to be sure, resides neither in man nor in the world, if you consider each separately. But since man’s dominant characteristic is ‘being in the world,’ the absurd is, in the end, an inseparable part of the human condition.” The Absurd, then, presents itself in the form of an existential opposition. It arises from the human demand for clarity and transcendence on the one hand and a cosmos that offers nothing of the kind on the other. Such is our fate: we inhabit a world that is indifferent to our sufferings and deaf to our protests.

In Camus’s view there are three possible philosophical responses to this predicament. Two of these he condemns as evasions, and the other he puts forward as a proper solution.

The first choice is blunt and simple: physical suicide. If we decide that a life without some essential purpose or meaning is not worth living, we can simply choose to kill ourselves. Camus rejects this choice as cowardly. In his terms it is a repudiation or renunciation of life, not a true revolt.

The second choice is the religious solution of positing a transcendent world of solace and meaning beyond the Absurd. Camus calls this solution “philosophical suicide” and rejects it as transparently evasive and fraudulent. To adopt a supernatural solution to the problem of the Absurd (for example, through some type of mysticism or leap of faith) is to annihilate reason, which in Camus’s view is as fatal and self-destructive as physical suicide. In effect, instead of removing himself from the absurd confrontation of self and world like the physical suicide, the religious believer simply removes the offending world and replaces it, via a kind of metaphysical abracadabra, with a more agreeable alternative.

The third choice—in Camus’s view the only authentic and valid solution—is simply to accept absurdity, or better yet to embrace it, and to continue living. Since the Absurd in his view is an unavoidable, indeed defining, characteristic of the human condition, the only proper response to it is full, unflinching, courageous acceptance. Life, he says, can “be lived all the better if it has no meaning.”

The example par excellence of this option of spiritual courage and metaphysical revolt is the mythical Sisyphus of Camus’s philosophical essay. Doomed to eternal labor at his rock, fully conscious of the essential hopelessness of his plight, Sisyphus nevertheless pushes on. In doing so he becomes for Camus a superb icon of the spirit of revolt and of the human condition. To rise each day to fight a battle you know you cannot win, and to do this with wit, grace, compassion for others, and even a sense of mission, is to face the Absurd in a spirit of true heroism.

Over the course of his career, Camus examines the Absurd from multiple perspectives and through the eyes of many different characters—from the mad Caligula, who is obsessed with the problem, to the strangely aloof and yet simultaneously self-absorbed Meursault, who seems indifferent to it even as he exemplifies and is finally victimized by it. In The Myth of Sisyphus, Camus traces it in specific characters of legend and literature (Don Juan, Ivan Karamazov) and also in certain character types (the Actor, the Conqueror), all of whom may be understood as in some way a version or manifestation of Sisyphus, the archetypal absurd hero.

[Note: A rather different, yet possibly related, notion of the Absurd is proposed and analyzed in the work of Kierkegaard, especially in Fear and Trembling and Repetition. For Kierkegaard, however, the Absurd describes not an essential and universal human condition, but the special condition and nature of religious faith—a paradoxical state in which matters of will and perception that are objectively impossible can nevertheless be ultimately true. Though it is hard to say whether Camus had Kierkegaard particularly in mind when he developed his own concept of the absurd, there can be little doubt that Kierkegaard’s knight of faith is in certain ways an important predecessor of Camus’s Sisyphus: both figures are involved in impossible and endlessly agonizing tasks, which they nevertheless confidently and even cheerfully pursue. In the knight’s quixotic defiance and solipsism, Camus found a model for his own ideal of heroic affirmation and philosophical revolt.]

ii. Revolt

The companion theme to the Absurd in Camus’s oeuvre (and the only other philosophical topic to which he devoted an entire book) is the idea of Revolt. What is revolt? Simply defined, it is the Sisyphean spirit of defiance in the face of the Absurd. More technically and less metaphorically, it is a spirit of opposition against any perceived unfairness, oppression, or indignity in the human condition.

Rebellion in Camus’s sense begins with a recognition of boundaries, of limits that define one’s essential selfhood and core sense of being and thus must not be infringed—as when a slave stands up to his master and says in effect “thus far, and no further, shall I be commanded.” This defining of the self as at some point inviolable appears to be an act of pure egoism and individualism, but it is not. In fact Camus argues at considerable length to show that an act of conscientious revolt is ultimately far more than just an individual gesture or an act of solitary protest. The rebel, he writes, holds that there is a “common good more important than his own destiny” and that there are “rights more important than himself.” He acts “in the name of certain values which are still indeterminate but which he feels are common to himself and to all men” (The Rebel 15-16).

Camus then goes on to assert that an “analysis of rebellion leads at least to the suspicion that, contrary to the postulates of contemporary thought, a human nature does exist, as the Greeks believed.” After all, “Why rebel,” he asks, “if there is nothing permanent in the self worth preserving?” The slave who stands up and asserts himself actually does so for “the sake of everyone in the world.” He declares in effect that “all men—even the man who insults and oppresses him—have a natural community.” Here we may note that the idea that there may indeed be an essential human nature is actually more than a “suspicion” as far as Camus himself was concerned. Indeed for him it was more like a fundamental article of his humanist faith. In any case it represents one of the core principles of his ethics and is one of the tenets that sets his philosophy apart from existentialism.

True revolt, then, is performed not just for the self but also in solidarity with and out of compassion for others. And for this reason, Camus is led to conclude that revolt too has its limits. If it begins with and necessarily involves a recognition of human community and a common human dignity, it cannot, without betraying its own true character, treat others as if they were lacking in that dignity or not a part of that community. In the end it is remarkable, and indeed surprising, how closely Camus’s philosophy of revolt, despite the author’s fervent atheism and individualism, echoes Kantian ethics with its prohibition against treating human beings as means and its ideal of the human community as a kingdom of ends.

iii. The Outsider

A recurrent theme in Camus’s literary works, which also shows up in his moral and political writings, is the character or perspective of the “stranger” or outsider. Meursault, the laconic narrator of The Stranger, is the most obvious example. He seems to observe everything, even his own behavior, from an outside perspective. Like an anthropologist, he records his observations with clinical detachment at the same time that he is warily observed by the community around him.

Camus came by this perspective naturally. As a European in Africa, an African in Europe, an infidel among Muslims, a lapsed Catholic, a Communist Party drop-out, an underground resister (who at times had to use code names and false identities), a “child of the state” raised by a widowed mother (who was illiterate and virtually deaf and dumb), Camus lived most of his life in various groups and communities without really being integrated within them. This outside view, the perspective of the exile, became his characteristic stance as a writer. It explains both the cool, objective (“zero-degree”) precision of much of his work and also the high value he assigned to longed-for ideals of friendship, community, solidarity, and brotherhood.

iv. Guilt and Innocence

Throughout his writing career, Camus showed a deep interest in questions of guilt and innocence. Once again Meursault in The Stranger provides a striking example. Is he legally innocent of the murder he is charged with? Or is he technically guilty? On the one hand, there seems to have been no conscious intention behind his action. Indeed the killing takes place almost as if by accident, with Meursault in a kind of absent-minded daze, distracted by the sun. From this point of view, his crime seems surreal and his trial and subsequent conviction a travesty. On the other hand, it is hard for the reader not to share the view of other characters in the novel, especially Meursault’s accusers, witnesses, and jury, in whose eyes he seems to be a seriously defective human being—at best, a kind of hollow man and at worst, a monster of self-centeredness and insularity. That the character has evoked such a wide range of responses from critics and readers—from sympathy to horror—is a tribute to the psychological complexity and subtlety of Camus’s portrait.

Camus’s brilliantly crafted final novel, The Fall, continues his keen interest in the theme of guilt, this time via a narrator who is virtually obsessed with it. The significantly named Jean-Baptiste Clamence (a voice in the wilderness calling for clemency and forgiveness) is tortured by guilt in the wake of a seemingly casual incident. While strolling home one drizzly November evening, he shows little concern and almost no emotional reaction at all to the suicidal plunge of a young woman into the Seine. But afterwards the incident begins to gnaw at him, and eventually he comes to view his inaction as typical of a long pattern of personal vanity and as a colossal failure of human sympathy on his part. Wracked by remorse and self-loathing, he gradually descends into a figurative hell. Formerly an attorney, he is now a self-described “judge-penitent” (a combination sinner, tempter, prosecutor, and father-confessor) who shows up each night at his local haunt, a sailor’s bar near Amsterdam’s red light district, where, somewhat in the manner of Coleridge’s Ancient Mariner, he recounts his story to whoever will hear it. In the final sections of the novel, amid distinctly Christian imagery and symbolism, he declares his crucial insight that, despite our pretensions to righteousness, we are all guilty. Hence no human being has the right to pass final moral judgment on another.

In a final twist, Clamence asserts that his acid self-portrait is also a mirror for his contemporaries. Hence his confession is also an accusation—not only of his nameless companion (who serves as the mute auditor for his monologue) but ultimately of the hypocrite lecteur as well.

v. Christianity vs. “Paganism”

The theme of guilt and innocence in Camus’s writings relates closely to another recurrent tension in his thought: the opposition of Christian and pagan ideas and influences. At heart a nature-worshipper, and by instinct a skeptic and non-believer, Camus nevertheless retained a lifelong interest in and respect for Christian philosophy and literature. In particular, he seems to have recognized St. Augustine and Kierkegaard as intellectual kinsmen and writers with whom he shared a common passion for controversy, literary flourish, self-scrutiny, and self-dramatization. Christian images, symbols, and allusions abound in all his work (probably more so than in the writing of any other avowed atheist in modern literature), and Christian themes—judgment, forgiveness, despair, sacrifice, passion, and so forth—permeate the novels. (Meursault and Clamence, it is worth noting, are presented not just as sinners, devils, and outcasts, but in several instances explicitly, and not entirely ironically, as Christ figures.)

Meanwhile alongside and against this leitmotif of Christian images and themes, Camus sets the main components of his essentially pagan worldview. Like Nietzsche, he maintains a special admiration for Greek heroic values and pessimism and for classical virtues like courage and honor. What might be termed Romantic values also merit particular esteem within his philosophy: passion, absorption in pure being, an appreciation for and indeed a willingness to revel in raw sensory experience, the glory of the moment, the beauty of the world.

As a result of this duality of influence, Camus’s basic philosophical problem becomes how to reconcile his Augustinian sense of original sin (universal guilt) and rampant moral evil with his personal ideal of pagan primitivism (universal innocence) and with his conviction that the natural world and our life in it have intrinsic beauty and value. Can an absurd world have intrinsic value? Is authentic pessimism compatible with the view that there is an essential dignity to human life? Such questions raise the possibility that there may be deep logical inconsistencies within Camus’s philosophy, and some critics (notably Sartre) have suggested that these inconsistencies cannot be surmounted except through some sort of Kierkegaardian leap of faith on Camus’s part—in this case a leap leading to a belief not in God but in man.

Such a leap is certainly implied in an oft-quoted remark from Camus’s “Letter to a German Friend,” where he wrote: “I continue to believe that this world has no supernatural meaning…But I know that something in the world has meaning—man.” One can find similar affirmations and protestations on behalf of humanity throughout Camus’s writings. They are almost a hallmark of his philosophical style. Oracular and high-flown, they clearly have more rhetorical force than logical potency. On the other hand, if we are trying to locate Camus’s place in European philosophical tradition, they provide a strong clue as to where he properly belongs. Surprisingly, the sentiment here, a commonplace of the Enlightenment and of traditional liberalism, is much closer in spirit to the exuberant secular humanism of the Italian Renaissance than to the agnostic skepticism of contemporary post-modernism.

vi. Individual vs. History and Mass Culture

A primary theme of early twentieth-century European literature and critical thought is the rise of modern mass civilization and its suffocating effects of alienation and dehumanization. This became a pervasive theme by the time Camus was establishing his literary reputation. Anxiety over the fate of Western culture, already intense, escalated to apocalyptic levels with the sudden emergence of fascism, totalitarianism, and new technologies of coercion and death. Here then was a subject ready-made for a writer of Camus’s political and humanistic views. He responded to the occasion with typical force and eloquence.

In one way or another, the themes of alienation and dehumanization as by-products of an increasingly technical and automated world enter into nearly all of Camus’s works. Even his concept of the Absurd becomes multiplied by a social and economic world in which meaningless routines and mind-numbing repetitions predominate. The drudgery of Sisyphus is mirrored and amplified in the assembly line, the business office, the government bureau, and especially in the penal colony and concentration camp.

In line with this theme, the ever-ambiguous Meursault in The Stranger can be understood as both a depressing manifestation of the newly emerging mass personality (that is, as a figure devoid of basic human feelings and passions) and, conversely, as a lone hold-out, a last remaining specimen of the old Romanticism—and hence a figure who is viewed as both dangerous and alien by the robotic majority. Similarly, The Plague can be interpreted, on at least one level, as an allegory in which humanity must be preserved from the fatal pestilence of mass culture, which converts formerly free, autonomous, independent-minded human beings into a soulless new species.

At various times in the novel, Camus’s narrator describes the plague as if it were a dull but highly capable public official or bureaucrat:

It was, above all, a shrewd, unflagging adversary; a skilled organizer, doing his work thoroughly and well. (180)

But it seemed the plague had settled in for good at its most virulent, and it took its daily toll of deaths with the punctual zeal of a good civil servant. (235)

This identification of the plague with oppressive civil bureaucracy and the routinization of charisma looks forward to the author’s play The State of Siege, where plague is used once again as a symbol for totalitarianism—only this time it is personified in an almost cartoonish way as a kind of overbearing government functionary or office manager from hell. Clad in a gaudy military uniform bedecked with ribbons and decorations, the character Plague (a satirical portrait of Generalissimo Francisco Franco—or El Caudillo as he liked to style himself) is closely attended by his personal Secretary and loyal assistant Death, depicted as a prim, officious female bureaucrat who also favors military garb and who carries an ever-present clipboard and notebook.

So Plague is a fascist dictator, and Death a solicitous commissar. Together these figures represent a system of pervasive control and micro-management that threatens the future of mass society.

In his reflections on this theme of post-industrial dehumanization, Camus differs from most other European writers (and especially from those on the Left) in viewing mass reform and revolutionary movements, including Marxism, as representing at least as great a threat to individual freedom as late-stage capitalism. Throughout his career he continued to cherish and defend old-fashioned virtues like personal courage and honor that other Left-wing intellectuals tended to view as reactionary or bourgeois.

vii. Suicide

Suicide is the central subject of The Myth of Sisyphus and serves as a background theme in Caligula and The Fall. In Caligula the mad title character, in a fit of horror and revulsion at the meaninglessness of life, would rather die—and bring the world down with him—than accept a cosmos that is indifferent to human fate or that will not submit to his individual will. In The Fall, a stranger’s act of suicide serves as the starting point for a bitter ritual of self-scrutiny and remorse on the part of the narrator.

Like Wittgenstein (who had a family history of suicide and suffered from bouts of depression), Camus considered suicide the fundamental issue for moral philosophy. However, unlike other philosophers who have written on the subject (from Cicero and Seneca to Montaigne and Schopenhauer), Camus seems uninterested in assessing the traditional motives and justifications for suicide (for instance, to avoid a long, painful, and debilitating illness or as a response to personal tragedy or scandal). Indeed, he seems interested in the problem only to the extent that it represents one possible response to the Absurd. His verdict on the matter is unqualified and clear: The only courageous and morally valid response to the Absurd is to continue living—“Suicide is not an option.”

viii. The Death Penalty

From the time he first heard the story of his father’s literal nausea and revulsion after witnessing a public execution, Camus began a vocal and lifelong opposition to the death penalty. Executions by guillotine were a common public spectacle in Algeria during his lifetime, but he refused to attend them and recoiled bitterly at their very mention.

Condemnation of capital punishment is both explicit and implicit in his writings. For example, in The Stranger Meursault’s long confinement during his trial and his eventual execution are presented as part of an elaborate, ceremonial ritual involving both public and religious authorities. The grim rationality of this process of legalized murder contrasts markedly with the sudden, irrational, almost accidental nature of his actual crime. Similarly, in The Myth of Sisyphus, the would-be suicide is contrasted with his fatal opposite, the man condemned to death, and we are continually reminded that a sentence of death is our common fate in an absurd universe.

Camus’s opposition to the death penalty is not specifically philosophical. That is, it is not based on a particular moral theory or principle (such as Cesare Beccaria’s utilitarian objection that capital punishment is wrong because it has not been proven to have a deterrent effect greater than life imprisonment). Camus’s opposition, in contrast, is humanitarian, conscientious, almost visceral. Like Victor Hugo, his great predecessor on this issue, he views the death penalty as an egregious barbarism—an act of blood riot and vengeance covered over with a thin veneer of law and civility to make it acceptable to modern sensibilities. That it is also an act of vengeance aimed primarily at the poor and oppressed, and that it is given religious sanction, makes it even more hideous and indefensible in his view.

Camus’s essay “Reflections on the Guillotine” supplies a detailed examination of the issue. An eloquent personal statement with compelling psychological and philosophical insights, it includes the author’s direct rebuttal to traditional retributionist arguments in favor of capital punishment (such as Kant’s claim that death is the legally appropriate, indeed morally required, penalty for murder). To all who argue that murder must be punished in kind, Camus replies:

Capital punishment is the most premeditated of murders, to which no criminal’s deed, however calculated, can be compared. For there to be an equivalency, the death penalty would have to punish a criminal who had warned his victim of the date on which he would inflict a horrible death on him and who, from that moment onward, had confined him at his mercy for months. Such a monster is not to be encountered in private life.

Camus concludes his essay by arguing that, at the very least, France should abolish the savage spectacle of the guillotine and replace it with a more humane procedure (such as lethal injection). But he still retains a scant hope that capital punishment will be completely abolished at some point in the time to come: “In the unified Europe of the future the solemn abolition of the death penalty ought to be the first article of the European Code we all hope for.” Camus himself did not live to see the day, but he would no doubt be gratified to know that abolition of capital punishment is now an essential prerequisite for membership in the European Union.

6. Existentialism

Camus is often classified as an existentialist writer, and it is easy to see why. Affinities with Kierkegaard and Sartre are patent. He shares with these philosophers (and with the other major writers in the existentialist tradition, from Augustine and Pascal to Dostoyevsky and Nietzsche) an habitual and intense interest in the active human psyche, in the life of conscience or spirit as it is actually experienced and lived. Like these writers, he aims at nothing less than a thorough, candid exegesis of the human condition, and like them he exhibits not just a philosophical attraction but also a personal commitment to such values as individualism, free choice, inner strength, authenticity, personal responsibility, and self-determination.

However, one troublesome fact remains: throughout his career Camus repeatedly denied that he was an existentialist. Was this an accurate and honest self-assessment? On the one hand, some critics have questioned this “denial” (using the term almost in its modern clinical sense), attributing it to the celebrated Sartre-Camus political “feud” or to a certain stubbornness or even contrariness on Camus’s part. In their view, Camus qualifies as, at minimum, a closet existentialist, and in certain respects (e.g., in his unconditional and passionate concern for the individual) as an even truer specimen of the type than Sartre.

On the other hand, besides his personal rejection of the label, there appear to be solid reasons for challenging the claim that Camus is an existentialist. For one thing, it is noteworthy that he never showed much interest in (indeed he largely avoided) metaphysical and ontological questions (the philosophical raison d’être of Heidegger and Sartre). Of course there is no rule that says an existentialist must be a metaphysician. However, Camus’s seeming aversion to technical philosophical discussion does suggest one way in which he distanced himself from contemporary existentialist thought.

Another point of divergence is that Camus seems to have regarded existentialism as a complete and systematic world-view, that is, a fully articulated doctrine. In his view, to be a true existentialist one had to commit to the entire doctrine (and not merely to bits and pieces of it), and this was apparently something he was unwilling to do.

A further point of separation, and possibly a decisive one, is that Camus actively challenged and set himself apart from the existentialist motto that existence precedes essence. Ultimately, against Sartre in particular and existentialists in general, he clings to his instinctive belief in a common human nature. In his view human existence necessarily includes an essential core element of dignity and value, and in this respect he seems surprisingly closer to the humanist tradition from Aristotle to Kant than to the modern tradition of skepticism and relativism from Nietzsche to Derrida (the latter his fellow-countryman and, at least in his commitment to human rights and opposition to the death penalty, his spiritual successor and descendant).

7. Camus, Colonialism, and Algeria

One of the main topics and even preoccupations of recent Camus studies has been the writer’s attitude, as reflected in both his fiction and his non-fiction, towards European colonialism in general and his response to the French-Algerian “problem” or “question” (as it was often termed) in particular. The first thing that can be noted in this respect is that, unlike Sartre and many other European intellectuals, Camus never delivered a formal critique of colonialism. Nor did he sign any of the frequent manifestos and declarations deploring the practice, a sin for which he was sharply criticized and even accused of moral cowardice. In 1958, partly to explain and vindicate himself, but mainly to illustrate and give voice to the painful complexities of colonial reform and decolonization, he published Algerian Chronicles, a collection of his writings on the vexing “problem” that he had personally agonized over for more than twenty years.

In addition to his perceived silence on the issue of colonialism (a silence, as Algerian Chronicles reveals, motivated by his fear that speaking out aggressively would be more likely to heighten tensions than secure the united and independent post-colonial Algeria he hoped for), Camus has also been criticized for the virtual erasure of Arab characters and culture from his fiction. The Irish writer and politician Conor Cruise O’Brien made a partial attempt to rescue Camus from this criticism by arguing that The Fall should be read as an autobiographical work in which Camus confesses his own personal failures, including his guilt at becoming a privileged citizen in a poor country. Several writers, and most prominently and forcefully Edward Said, have denounced the nearly total absence of Arab characters in Camus’s novels and stories. Moreover, the few Arab characters who do appear, these critics point out, are inevitably mute and anonymous. They are either shadow figures, including the nameless murder victim at the climactic center of The Stranger, or mere bodies, like the uncounted and unidentified native Algerians who presumably make up the major part of the death toll in The Plague but who otherwise have no speaking role or even visible presence in the novel. Along this same line of criticism, The Meursault Investigation is a fictional and metafictional riposte to Camus by the Algerian writer Kamel Daoud. A reimagining of the characters and events of The Stranger, told from the point of view of the brother of the murdered Arab, the novel represents both a corrective rebuke and a literary tribute to its famous original. In the introduction to her recent expanded edition of Algerian Chronicles, Alice Kaplan addresses these and related criticisms and cites relevant passages from Camus’s own writing in response to them.

8. Significance and Legacy

Obviously, Camus’s writings remain the primary reason for his continuing importance and the chief source of his cultural legacy, but his fame is also due to his exemplary life. He truly lived his philosophy; thus it is in his personal political stands and public statements as well as in his books that his views are clearly articulated. In short, he bequeathed not just his words but also his actions. Taken together, those words and actions embody a core set of liberal democratic values—including tolerance, justice, liberty, open-mindedness, respect for personhood, condemnation of violence, and resistance to tyranny—that can be fully approved and acted upon by the modern intellectual engagé.

On a purely literary level, one of Camus’s most original contributions to modern discourse is his distinctive prose style. Terse and hard-boiled, yet at the same time lyrical, and indeed capable of great, soaring flights of emotion and feeling, Camus’s style represents a deliberate attempt on his part to wed the famous clarity, elegance, and dry precision of the French philosophical tradition with the more sonorous and opulent manner of nineteenth-century Romantic fiction. The result is something like a cross between Hemingway (a Camus favorite) and Melville (another favorite) or between Diderot and Hugo. For the most part when we read Camus we encounter the plain syntax, simple vocabulary, and biting aphorism typical of modern theatre or noir detective fiction. (It is worth noting that Camus was a fan of the novels of Dashiell Hammett and James M. Cain and that his own work has influenced the style and the existentialist loner heroes of a succession of later crime writers, including John D. MacDonald and Lee Child.) Moreover, this muted, laconic style frequently becomes a counterpoint or springboard for extended musings and lavish descriptions almost in the manner of Proust. Here we may note that this attempted reconciliation or union of opposing styles is not just an aesthetic gesture on the author’s part: It is also a moral and political statement. It says, in effect, that the life of reason and the life of feeling need not be opposed; that intellect and passion can, and should, operate together.

Perhaps the greatest inspiration and example that Camus provides for contemporary readers is the lesson that it is still possible for a serious thinker to face the modern world (with a full understanding of its contradictions, injustices, brutal flaws, and absurdities) with hardly a grain of hope, yet utterly without cynicism. To read Camus is to find words like justice, freedom, humanity, and dignity used plainly and openly, without apology or embarrassment, and without the pained or derisive facial expressions or invisible quotation marks that almost automatically accompany those terms in public discourse today.

At Stockholm Camus concluded his Nobel acceptance speech with a stirring reminder and challenge to modern writers: “The nobility of our craft,” he declared, “will always be rooted in two commitments, both difficult to maintain: the refusal to lie about what one knows and the resistance to oppression.” He left behind a body of work faithful to his own credo that the arts of language must always be used in the service of truth and the service of liberty.

9. References and Further Reading

a. Works by Albert Camus

The Stranger. Trans. Stuart Gilbert. New York: Vintage-Random House, 1946.
Camus’s first novel, a classic portrait of the “outsider” originally published in France as L’Etranger by Librairie Gallimard in 1942.
The Plague. Trans. Stuart Gilbert. New York: Vintage-International, 1991.
Camus’s second novel, originally published in France as La Peste by Librairie Gallimard in 1947.
The Fall. Trans. Justin O’Brien. New York: Vintage-Random House, 1956.
Camus’s third novel, a confessional monologue originally published in France as La Chute by Librairie Gallimard in 1956.
The Myth of Sisyphus and other Essays. Trans. Justin O’Brien. New York: Vintage-Random House, 1955.
A philosophical meditation on suicide originally published as Le Mythe de Sisyphe by Librairie Gallimard in 1942.
The Rebel. Trans. Anthony Bower. New York: Vintage-Random House, 1956.
A philosophical essay on the ethics of rebellion and political violence originally published as L’Homme révolté by Librairie Gallimard in 1951.
Exile and the Kingdom. Trans. Justin O’Brien. New York: Vintage-Random House, 1958.
A collection of short fiction originally published as L’Exil et le Royaume by Librairie Gallimard in 1957.
Lyrical and Critical Essays. Ed. Philip Thody. Trans. Ellen Conroy Kennedy. New York: Vintage-Random House, 1970.
A selection of critical writings, including essays on Melville, Faulkner, and Sartre, plus all the early essays from Betwixt and Between and Nuptials.
Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Vintage International, 1995.
A collection of essays on a wide variety of political topics ranging from the death penalty to the Cold War.
Caligula and Three Other Plays. Trans. Stuart Gilbert. New York: Vintage-Random House, 1958.
A collection of four of Camus’s best-known dramatic works: Caligula, The Misunderstanding, The State of Siege, and The Just Assassins, with a foreword by the author.
The First Man. Trans. David Hapgood. New York: Alfred Knopf, 1995.
A posthumous novel, partly autobiographical.
Camus at Combat: Writings 1944-1947. Ed. Jacqueline Lévi-Valensi. Trans. Arthur Goldhammer. Princeton, NJ: Princeton University Press, 2006.
A collection of articles and editorials that Camus wrote during and after WW II for the French Resistance journal Combat.
Algerian Chronicles. Ed. Alice Kaplan. Trans. Arthur Goldhammer. Cambridge, MA: Belknap Press, 2013.
A collection of Camus’s political writings on Algeria.

b. Critical and Biographical Studies

Barthes, Roland. Writing Degree Zero. New York: Hill and Wang, 1968.
Bloom, Harold, ed. Albert Camus. New York: Chelsea House, 1989.
Brée, Germaine. Camus. New Brunswick, NJ: Rutgers University Press, 1961.
Brée, Germaine, ed. Camus: A Collection of Critical Essays. Englewood Cliffs, NJ: Prentice-Hall, 1962.
Cruickshank, John. Albert Camus and the Literature of Revolt. London: Oxford University Press, 1959.
Cruickshank, John. The Novelist as Philosopher. London: Oxford University Press, 1959.
Foley, John. Albert Camus: From the Absurd to Revolt. Montreal: McGill-Queen’s University Press, 2008.
Hughes, Edward J., ed. The Cambridge Companion to Camus. Cambridge, UK: Cambridge University Press, 2007.
Kaufmann, Walter, ed. Religion from Tolstoy to Camus. New York: Harper, 1964.
Lottman, Herbert R. Albert Camus: A Biography. Corte Madera, CA: Gingko Press, 1997.
Malraux, Andre. Anti-Memoirs. New York: Holt, Rinehart, and Winston, 1968.
Margerrison, Christine, et al. Albert Camus in the 21st Century: A Reassessment of his Thinking at the Dawn of the New Millennium. Amsterdam, NL: Rodopi, 2008.
McBride, Joseph. Albert Camus: Philosopher and Littérateur. New York: St. Martin’s Press, 1992.
O’Brien, Conor Cruise. Camus. London: Faber and Faber, 1970.
Said, Edward. “Camus and the French Imperial Experience.” In Culture and Imperialism. New York: Vintage Books, 1994.
Sartre, Jean-Paul. “Camus’s The Outsider.” In Situations. New York: George Braziller, 1965.
Srigley, Ronald D. Albert Camus’ Critique of Modernity. Columbia, MO: University of Missouri Press, 2011.
Thody, Philip. Albert Camus, 1913-1960. London: Hamish Hamilton, 1961.
Todd, Olivier. Albert Camus: A Life. New York: Alfred A. Knopf, 1997.
Zaretsky, Robert. A Life Worth Living: Albert Camus and the Quest for Meaning. Cambridge, MA: Belknap Press of Harvard University Press, 2013.

Author Information

David Simpson
Email: dsimpson@depaul.edu
DePaul University
U. S. A.

Proper Functionalism

‘Proper Functionalism’ refers to a family of epistemological views according to which whether a belief (or some other doxastic state) was formed by way of properly functioning cognitive faculties plays a crucial role in whether it has a certain kind of positive epistemic status (such as being an item of knowledge, or a justified belief). Alvin Plantinga’s proper functionalist theory of knowledge has been the most prominent among these theories. Michael Bergmann’s (2006) proper functionalist theory of justification has also been the focus of much discussion. But proper functionalist theories of other epistemic properties have also been developed. Richard Otte (1987) and Alvin Plantinga (1993b: Chapter 9) offer proper functionalist theories of epistemic probability, for example. Nicholas Wolterstorff (2010) defends a proper functionalist theory of epistemic oughts. And Peter Graham (2010) develops a proper functionalist theory of epistemic entitlement. Since Plantinga’s theory of knowledge and Bergmann’s theory of justification are the most widely known and most discussed proper functionalist views, and because they share many features with other proper functionalist theories, this article focuses primarily on them—what can be said in their favor, the challenges they face, the ways in which they might be defended, and how they compare with some of their closest rivals.

1. Plantinga’s Proper Functionalist Theory of Knowledge
2. Bergmann’s Proper Functionalist Theory of Justification
a. Some Advantages of Bergmann’s Theory
b. Some Objections to Bergmann’s Theory
3. Rival Theories
a. Proper Functionalism and Phenomenal Conservatism
b. Proper Functionalism and Virtue Epistemology
4. References and Further Reading

1. Plantinga’s Proper Functionalist Theory of Knowledge

This article begins with a discussion of Alvin Plantinga’s proper functionalist theory of knowledge. As Plantinga himself frames matters, he takes himself to be giving a proper functionalist theory of a property he calls “warrant,” where warrant is whatever precisely it is which makes the difference between knowledge and mere true belief.

a. Motivations of Plantinga’s Theory

A theory of warrant is subject to Gettier-style counterexamples if a belief can meet all the conditions the theory specifies as jointly sufficient for knowledge, but meet them merely by accident (in a manner that precludes that belief’s being an item of knowledge). Plantinga argues that any theory that fails to construe a proper function condition as necessary for warrant is subject to counterexamples of this sort. This is so whether the theory emphasizes the believer’s internal states as most relevant to whether her belief has warrant, external factors, or both of these.

By way of illustration, Plantinga (1993b: 31-37) adopts a scenario originally introduced by Roderick Chisholm, who attributes it to Alexius Meinong. The scenario envisions an aging forest ranger living in the mountains, with a set of wind chimes hanging from a bough. The ranger is unaware of the fact that his hearing has been degenerating of late, and it has gotten to the point where he can no longer hear the chimes. He is also unaware that he is occasionally subject to small auditory hallucinations in which he appears to hear the wind chimes. On one occasion, he is thus appeared to and comes to believe that the wind is blowing. As it happens, the wind is blowing and causing the ringing of the chimes. Even if we stipulate that all is going well with this belief from the ranger’s own internal perspective, it is clear nonetheless that his belief lacks warrant. The reason his belief lacks warrant, Plantinga maintains, is that it is due to cognitive malfunction.

One might question whether this explanation is correct, however, on the ground that certain cognitively external environmental conditions are also amiss in this case. In particular, the case is one in which there is no reliable connection between the ranger’s appearing to hear the wind chimes and the wind’s blowing. And one might think that it is primarily for this reason that the ranger’s belief lacks warrant. This thought might push one toward bypassing proper functionalism and endorsing a reliabilist theory of warrant instead (that is, an account according to which a belief having warrant is primarily a matter of it being formed or sustained in a way that involves a reliable connection to the truth). But Plantinga argues that any reliabilist theory which does not incorporate a proper function condition is also subject to Gettier-style counterexamples.

Plantinga (1993a: 195-198, 205-207) takes this to be illustrated by The Case of the Epistemically Serendipitous Brain Lesion. Imagine Sam has a brain lesion, one that engenders cognitive processes which mostly result in false beliefs. One process the lesion engenders, however, is a process that results in the belief that one has a brain lesion. This particular process is highly reliable (it always results in one’s having a true belief). But clearly the belief that results is not a matter of knowledge. What explains why this is so, Plantinga maintains, is that the belief in question (though formed by a truth-reliable process) is not the result of cognitive proper function. Accordingly, Plantinga concludes that any reliabilist account of warrant must be augmented with a proper function condition.

Kenneth Boyce and Alvin Plantinga (2012: 127-128) have emphasized that there may be an even stronger lesson to be drawn from these cases. Once these cases are on the table, one can imagine variations of them in which different combinations of internal and external conditions (other than proper function ones) are met, but in which the belief in question lacks warrant because it ends up being true merely by accident. Furthermore, Boyce and Plantinga contend that in these cases it seems that part of what explains why these are cases in which the beliefs are true merely by accident (in a way that precludes their being items of knowledge) is that they were not formed in a manner specified by cognitive proper function; that is, the way they get at the truth is accidental from the perspective of the cognitive design plan. If that is correct, however, then (as Boyce and Plantinga point out), there is reason to believe that the notion of cognitive proper function is centrally involved in the notion of non-accidentality that any adequate analysis of warrant must capture.

b. The Content of Plantinga’s Theory

Examples of the sort discussed above are used by Plantinga to motivate the claim that cognitive proper function is necessary for warrant. Plantinga (1993b: 21-24) also maintains that the relevant notion of proper function presupposes that of a design plan—something that specifies the manner in which a thing is supposed to function in various circumstances. As Plantinga conceives of it, a design plan may be modeled as a set of ordered triples, where each triple specifies a circumstance, a response, and a purpose or function. One need not initially take this notion of a design plan to involve conscious design or purpose. The notion of a design plan at issue here is whatever notion is presupposed by talk of proper function for biological systems (as when a physician determines that a human heart is functioning the way it is supposed to on account of its pumping at 70 beats per minute). Plantinga himself gives a theistic account of this notion, but other proper functionalists, such as Ruth Millikan (1984) and Peter Graham (2012), have offered naturalistic, evolutionary accounts.
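The modeling of a design plan as a set of ordered triples can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not Plantinga's own formalism: the names `Triple`, `HEART_PLAN`, `specified_responses`, and `functions_properly` are hypothetical, the "at rest" label and the "vigorous exercise" entry are invented for the example, and treating circumstances and responses as plain strings is a simplification.

```python
from typing import FrozenSet, NamedTuple, Set


class Triple(NamedTuple):
    """One entry of a design plan: (circumstance, response, purpose)."""
    circumstance: str
    response: str
    purpose: str


# Hypothetical design plan for a heart. Only the 70-beats-per-minute figure
# comes from the article's physician example; the rest is assumed.
HEART_PLAN: FrozenSet[Triple] = frozenset({
    Triple("at rest", "pump at about 70 beats per minute", "circulate blood"),
    Triple("vigorous exercise", "pump faster", "circulate blood"),
})


def specified_responses(plan: FrozenSet[Triple], circumstance: str) -> Set[str]:
    """Return the responses the plan specifies for the given circumstance."""
    return {t.response for t in plan if t.circumstance == circumstance}


def functions_properly(plan: FrozenSet[Triple], circumstance: str,
                       actual_response: str) -> bool:
    """A response counts as proper function when the plan specifies it."""
    return actual_response in specified_responses(plan, circumstance)
```

On this sketch, proper function is simply a match between what a thing does in a circumstance and what the plan specifies for that circumstance. Nothing in the data structure says where the plan comes from, which mirrors the point that the notion is neutral between theistic and evolutionary accounts.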

While Plantinga (1993b: 46) takes cognitive proper function to be necessary for warrant, he does not take it to be sufficient (or even nearly sufficient). Other conditions must also be satisfied. To a rough, first approximation, Plantinga takes a belief to be warranted if and only if it satisfies the following four conditions:

(1) The belief in question is formed by way of cognitive faculties that are properly functioning.

(2) The cognitive faculties in question are aimed at the production of true beliefs.

(3) The design plan is a good one. That is, when a belief is formed by way of truth-aimed cognitive proper function in the sort of environment for which the cognitive faculties in question were designed, there is a high objective probability that the resulting belief is true.

(4) The belief is formed in the sort of environment for which the cognitive faculties in question were designed.

While Plantinga adds various nuances, these four conditions serve to capture the main outlines of his view.
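The logical shape of this first approximation (four conditions, individually necessary and jointly sufficient) can be sketched as a simple conjunction. This is a rough illustration only: the class and field names are hypothetical, each condition is flattened to a boolean, and the verdicts assigned to the forest ranger case are stipulated for the example rather than drawn from Plantinga's text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeliefFormation:
    """Hypothetical record of how a belief was formed (booleans are a simplification)."""
    properly_functioning: bool      # condition (1)
    aimed_at_truth: bool            # condition (2)
    good_design_plan: bool          # condition (3)
    appropriate_environment: bool   # condition (4)


def is_warranted(b: BeliefFormation) -> bool:
    """First approximation: warranted if and only if all four conditions hold."""
    return (b.properly_functioning
            and b.aimed_at_truth
            and b.good_design_plan
            and b.appropriate_environment)


# The forest ranger: the belief arises from an auditory hallucination, so
# condition (1) fails; the other three are assumed to hold for illustration.
ranger = BeliefFormation(
    properly_functioning=False,
    aimed_at_truth=True,
    good_design_plan=True,
    appropriate_environment=True,
)

# An ordinary perceptual belief formed when all goes well (assumed case).
ordinary = BeliefFormation(True, True, True, True)
```

Because the account is a biconditional, the failure of any single condition is enough to withhold warrant, which is why the ranger's true belief does not qualify.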

Many objections have been raised to Plantinga’s theory. Two of the most prominent among them are considered below. The first amounts to an objection to the claim that Plantinga’s four conditions are necessary for warrant. The second amounts to an objection to the claim that they are sufficient. For a sampling of other objections, one would do well to examine the collection of essays on Plantinga’s theory of warrant edited by Jonathan L. Kvanvig (1996).

c. Swampman

Some have argued that there are counterexamples to Plantinga’s theory involving beings who have warranted beliefs but who nevertheless fail to exhibit cognitive proper function. The most well-known version of this objection comes from Ernest Sosa (1993), who adapts a scenario originally proposed by Donald Davidson, and uses it against proper functionalism. In that scenario, Davidson is standing next to a swamp when lightning strikes a nearby dead tree, thereby obliterating Davidson. Simultaneously, by sheer accident, the lightning also causes the molecules of the tree to arrange themselves into a perfect duplicate of Davidson as he was at the time of his demise. The Davidson duplicate—this “Swampman”—leaves the swamp, acting and talking as if it were Davidson, having all the same intrinsic properties that Davidson would have had, had he left the swamp without having his unfortunate encounter. According to Sosa, “it … seems logically possible for … Swampman to have warranted beliefs not long after creation if not right away” (p. 54). Yet, not being the product of intentional design, and not having any evolutionary history, it would seem that Swampman has no design plan. And so we have what appears to be a counterexample to proper functionalism.

There are various responses to the Swampman objection. Plantinga (1993c: 206-208) and Graham (2012: 466-467) have each argued, albeit for different reasons, that it is doubtful the Swampman scenario is metaphysically possible. They have also suggested, again for different reasons, that if this scenario is possible, perhaps Swampman can acquire conditions for proper functioning without natural selection or intentional design. See Plantinga (1993c: 78) and Graham (2014). Bergmann (2006: 147-149) has argued that we are intuitively inclined to assign positive epistemic status to Swampman’s beliefs only to the extent we are inclined to think that his beliefs are fitting responses to the inputs he receives. And we are inclined to think that Swampman’s beliefs are fitting, argues Bergmann, only to the extent we are inclined to think of those responses as exhibiting cognitive proper function. Boyce and Plantinga (2012: 130-131) have suggested that since it is merely by accident that Swampman is forming his beliefs reliably, we can think of this case as a Gettier scenario (or at least, relevantly analogous to one), and thereby deny that Swampman’s beliefs have warrant. For a similar response, see McNabb (2015).

Since then, Kenneth Boyce and Andrew Moon (2015) have argued that the Swampman objection relies on a false intuition concerning the conditions under which the belief of one creature has warrant if the belief of another, similar creature does. According to them, the central intuition that motivates our intuitive reaction to the Swampman case may be stated as follows:

(CI) If a belief B is warranted for a subject S and another subject S* comes to hold B in the same way that S came to hold B in a relevantly similar environment to the one in which S came to hold B, then B is warranted for S*.

They argue that it is CI, in conjunction with the stipulation that Swampman forms his beliefs in the same way that an ordinary human being would (an ordinary human being to whom we would be inclined to attribute knowledge), that explains our tendency to regard Swampman as having warranted beliefs. Boyce and Moon then go on to argue that CI is subject to counterexamples, and that this undercuts the force of the Swampman objection. See Section 3b for further discussion of their argument.

d. Gettier Cases

Plantinga has conceded that his theory, as he originally formulated it, is subject to Gettier-style counterexamples. In 2000, Plantinga formulated this counterexample:

I own a Chevrolet van, drive to Notre Dame on a football Saturday, and unthinkingly park in one of the many places reserved for the football coach. Naturally, his minions tow my van away and, as befits such lèse majesté, destroy it. By a splendid piece of good luck, however, I have won the Varsity Club’s Win-a-Chevrolet-Van contest, although I haven’t yet heard the good news. You ask me what sort of automobile I own; I reply, both honestly and truthfully, “A Chevrolet van.” My belief that I own such a van is true, but ‘just by accident’ (more accurately, it is only by accident that I happen to form a true belief); hence it does not constitute knowledge. All of the non-environmental conditions for warrant, furthermore, are met. It also looks as if the environmental condition is met: after all, isn’t the cognitive environment here on earth and in South Bend just the one for which our faculties were designed?

Clearly Plantinga’s belief (though true) is not an item of knowledge in this case and thus lacks warrant. So Plantinga’s original four conditions are not jointly sufficient for warrant. Something else must be added. But what?

According to Plantinga, what the original account requires is an addition to the environmental condition. More specifically, the problem in the above case is that while the global environment that Plantinga is in is the one for which his faculties were designed, his more local environment is epistemically misleading. So in order to deal with this counterexample, Plantinga proposes adding a resolution condition. This condition involves a distinction between two different kinds of environment, what Plantinga refers to as the “maxi-environment” and what he refers to as the “mini-environment.” The maxi-environment, Plantinga stipulates, is the kind of global environment in which we live here on earth, the kind of environment for which our cognitive faculties were designed (or to which they were adapted). The mini-environment, by contrast, is a much more specific state of affairs, one that includes, for a given exercise of one’s cognitive faculties E resulting in a belief B, all of the epistemically relevant circumstances obtaining when B is formed (though diminished with respect to whether B is true).

Letting ‘MBE’ denote the cognitive mini-environment with respect to B and E (which Plantinga says may contain as large a fragment of the actual world as one likes, up to whether B is true), Plantinga maintains that the needed resolution condition may be stated as follows:

(RC) A belief B produced by an exercise of cognitive powers has warrant sufficient for knowledge only if MBE (the mini-environment with respect to B and E) is favorable for E.

This, of course, raises the question of just what it is for a mini-environment to be “favorable.” Plantinga has, in the past, offered various proposals for what favorableness consists in that he has subsequently admitted to be unsatisfactory. A further proposal is found in Boyce and Plantinga (2012: 134). For other proposals, see Crisp (2000) and Chignell (2003).

2. Bergmann’s Proper Functionalist Theory of Justification

Plantinga’s theory of warrant is not the only kind of proper functionalist theory. Proper functionalist theories of other epistemic concepts have also been developed. Noteworthy among these is Michael Bergmann’s proper functionalist theory of epistemic justification. The kind of epistemic justification that Bergmann (2006: 4-5) is interested in is doxastic justification. The having of this property is frequently (though not universally) held to be a necessary condition for a belief being an item of knowledge. In fact, it is often held that a belief having this property, in conjunction with its being non-accidentally true (in a way that rules out Gettier cases), is not only necessary, but also sufficient, for its being an item of knowledge.

A major divide in the literature occurs between those philosophers who are “externalists” about this kind of justification and those who are “internalists” about it. Just how this divide should be characterized is itself a matter of dispute. But for present purposes, we may characterize internalists about justification as being committed (at least) to the view that whether a belief is justified depends entirely on which mental states that belief is based upon (in such a way that necessarily, any two believers who are exactly alike in terms of their mental states and in terms of which of those mental states their beliefs are based upon are also alike in terms of which of their beliefs are justified). Externalists, by contrast, maintain that whether a belief is justified may depend on other factors.

It should be noted, however, that Bergmann (2006: chapter 3) divides up the territory a bit differently, though not in a way that impacts the current discussion. He takes it to be a necessary condition for a view of justification to count as “internalist” that it include an awareness requirement (that is, that it require, in order for a belief to be justified, that the believing subject is actually or potentially aware of some justification-contributor to that belief). The characterization of internalism given here, by contrast, includes no such requirement (and is similar to the characterization of a view of justification that Bergmann calls “mentalism,” one which he takes to be distinct from both externalism and internalism).

As Bergmann (2006: 3-7) points out, it is not always clear that philosophers who appear to dispute the nature of justification are actually disagreeing with one another. That is because it is plausible that epistemologists sometimes use the term ‘justification’ in different ways. He notes, for example, that some epistemologists use this term to pick out a subjective notion, one that is satisfied by a belief provided that the subject is blameless in holding it. Others, by contrast, he observes, use the term to pick out a more objective notion, one according to which a belief is justified only if it is fitting with respect to the believer’s evidence or other epistemically relevant inputs. It is this objective notion of justification in which Bergmann is interested (see also pp. 111-113). He takes it to be a conceptually open question as to whether this kind of justification is necessary for knowledge (though he thinks it is). And he also takes some disputes between self-avowed externalists (like himself) and self-avowed internalists (such as Richard Feldman and Earl Conee) to involve a genuine disagreement concerning the nature of this kind of justification.

Bergmann argues that the right way to analyze this kind of justification is in terms of proper function. More specifically, Bergmann’s (2006: 132-137) theory of epistemic justification takes the first three of Plantinga’s conditions (leaving out the fourth, environmental condition) to be necessary for a belief to be justified. Bergmann also takes the first three of Plantinga’s conditions, in conjunction with the condition that the subject does not take the relevant belief to be defeated, to be sufficient for a belief being justified. The motivations for this view are perhaps best appreciated by looking to its purported advantages.
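The structure of Bergmann's account, with Plantinga's first three conditions necessary and those three plus a no-believed-defeater condition sufficient, can be sketched as follows. This is a hedged illustration, not Bergmann's own formulation: every name below is hypothetical, the conditions are reduced to booleans, and the demon-world victim is an assumed test case. Note that the environmental condition appears in the record but plays no role in either check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Believing:
    """Hypothetical summary of a subject's belief (booleans are a simplification)."""
    properly_functioning: bool       # Plantinga's condition (1)
    aimed_at_truth: bool             # Plantinga's condition (2)
    good_design_plan: bool           # Plantinga's condition (3)
    appropriate_environment: bool    # Plantinga's condition (4), unused below
    takes_belief_defeated: bool      # does the subject take the belief to be defeated?


def meets_necessary_conditions(b: Believing) -> bool:
    """Conditions (1)-(3): required for justification on this sketch."""
    return b.properly_functioning and b.aimed_at_truth and b.good_design_plan


def meets_sufficient_conditions(b: Believing) -> bool:
    """Conditions (1)-(3) plus no believed defeater: enough for justification."""
    return meets_necessary_conditions(b) and not b.takes_belief_defeated


# Assumed case: a demon-world victim whose faculties function as designed but
# who is not in the environment for which those faculties were designed.
demon_victim = Believing(
    properly_functioning=True,
    aimed_at_truth=True,
    good_design_plan=True,
    appropriate_environment=False,
    takes_belief_defeated=False,
)
```

Since the environment field is never consulted, a subject in a hostile environment can still satisfy the sufficient conditions, which is the feature that separates this account of justification from the four-condition account of warrant.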

a. Some Advantages of Bergmann’s Theory

Epistemic justification of the kind Bergmann has in mind has some puzzling features. On the one hand, it involves some notion of truth-aptness. In particular, there would appear to be some important, non-trivial, connection between a belief being justified and it being objectively likely to be true. At the very least, it would be a significant cost for a theory of justification to deny this. But which ways of forming and sustaining beliefs result in a high proportion of true beliefs depends on what sort of environment one is in. Our tending to believe that occluded objects still exist, for example, results in a high proportion of true beliefs in our environment, but it is easy to imagine environments in which this would not be the case. These considerations push in the direction of regarding what makes for epistemic justification a contingent matter, one that depends on the sort of environment one inhabits.

On the other hand, justification is a normative concept, the satisfaction of which does not appear to depend on the sort of environment in which one is located. This aspect of justification is made especially vivid by “The New Evil Demon Problem,” originally put forward by Keith Lehrer and Stewart Cohen (1983) as a problem for reliabilist theories of justification. Consider a population of beings, just like ourselves, who form their beliefs in response to experience in just the ways that we do, but who (unlike us) are victims of a Cartesian demon who renders their belief-forming processes unreliable. From many reliabilist theories of justification, it follows that these beings have far fewer justified beliefs than we do (since most of their beliefs are not formed in a truth-reliable manner). But this seems false. These beings are in an epistemically bad situation, to be sure, but they are still forming their beliefs in ways that are appropriate given their experiences. Their beliefs, it seems, are at least justified.

Bergmann’s theory of epistemic justification nicely combines these puzzling features. First, it accommodates the intuition that inhabitants of a demon world, who are like us, and who form their beliefs in response to experience in the same ways we do, have the same proportion of justified beliefs as we do. For, as Bergmann (2006: 141-143) notes, his theory entails that provided these beings have a cognitive design plan comparable to ours and are properly functioning, many of their beliefs are justified, even though their ways of forming beliefs are, for the most part, unreliable. This analysis also, as Bergmann points out, accommodates the intuition that justification is importantly and non-trivially connected with truth-aptness. For, insofar as the beings living in a demon world fulfill Bergmann’s conditions for justification, the manner in which they form their beliefs would be truth-apt if they were placed in the environment for which their cognitive faculties were designed. Finally, since different design plans may be tailored to different kinds of environments, Bergmann’s theory accommodates the possibility that what makes for justification is a contingent matter, one that depends on the kind of environment for which the cognitive faculties of the creatures at issue were designed.

b. Some Objections to Bergmann’s Theory

Like Plantinga’s theory, Bergmann’s faces the objection that it is subject to counterexamples involving creatures like Swampman. There is no need, though, to rehearse the various responses that might be given to this objection here (since many of them will be the same or similar to those described in Section 1c). As a theory of justification, however, Bergmann’s view also faces other objections, ones which are not (or not as obviously) applicable to a theory of warrant.

Todd R. Long (2012: 264-265) questions, for example, whether Bergmann’s theory does in fact do a better job than alternative views in handling the New Evil Demon Problem. He grants that Bergmann’s view does accommodate the intuition that demon-world victims with the same design plan as ours do in fact have justified beliefs (in the same proportions that we do). But he notes that Bergmann’s view also entails that the same cannot be said for demon-world victims who are mentally indistinguishable from ourselves but whose ways of forming beliefs run contrary to their design plan. And Long maintains that to deny that beliefs of demon-world victims in the latter situation are justified also runs contrary to our intuitions. Bergmann (2006: 150), however, anticipates an objection like this. He suggests that there is an analogy between Swampman and the demon victims in such a scenario; accordingly, he adapts his reply to the former so as to apply it to the latter.

Another kind of objection to a proper functionalist theory of justification involves cases in which the design plan specifies ways of belief formation that appear to be objectively bad in some way, in spite of the fact that this component of the design plan is successfully aimed at truth. Long (2012) and Tucker (2014b) each present variations of this objection directed specifically against Bergmann’s view. There are also precedents found among objections to Plantinga’s theory of warrant (see for example Feldman 1993: 44). There are at least two kinds of cases of this sort. The following discussion will make reference to cases described by Tucker, who provides examples of each kind.

In the first kind of case, the design plan specifies coming to hold a belief on the basis of what appears to be an objectively bad form of reasoning. Tucker (2014b: 3321-3322) presents a case, for instance, in which a design plan specifies coming to hold a certain belief on the basis of the fallacy of denying the antecedent. As Tucker points out, even though denying the antecedent is, from a logical point of view, an objectively bad form of reasoning, there are circumstances in which reasoning that way is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify forming a belief in that way, under the right conditions. Even so, it is counterintuitive to think that a belief formed by way of committing a logical fallacy could be justified (at least in the absence of having any further basis).

In the second kind of case, the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Tucker (pp. 3318-3319) presents an example, for instance, in which a person comes to believe Gödel’s incompleteness theorem solely on the basis of his belief that his students hate a particular type of beer. Since Gödel’s incompleteness theorem is a necessary truth, there is no question that this belief-forming process is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify that a belief be formed in this way. Even so, it seems wrong to say that someone could come to be justified in believing Gödel’s incompleteness theorem solely on the basis of that belief.

This is a formidable objection. But there may be things that can be said on the proper functionalist’s behalf. Consider once again the first kind of case, a case that involves coming to hold a belief on the basis of formally bad reasoning. Some things that have been said in defense of reliabilism might also be of use to the proper functionalist here. Alvin Goldman (2002: 146-153), for example, points to research on the part of cognitive psychologists (such as Amos Tversky and Daniel Kahneman) indicating that human beings tend to rely on heuristics when engaged in probabilistic reasoning. As is now well known, these heuristics make people prone to commit elementary probabilistic fallacies. The conclusion that some psychologists have drawn is that these findings indicate that human beings are terrible at probabilistic reasoning. But as Goldman notes, other psychologists have drawn a more optimistic conclusion.

Goldman points to the work of a group of evolutionary psychologists (led by Gerd Gigerenzer, Leda Cosmides, and John Tooby) who argue that, given the limited information and computational power with which organisms must contend, an inference mechanism can be advantageous if it (in Goldman’s words) “often draws accurate conclusions about real-world environments, and does so quickly and with little computational effort” (p. 152). The heuristics humans rely on in probabilistic reasoning, some of these psychologists maintain, are mechanisms of just that sort. If that is the case, then perhaps human beings often do come to hold justified beliefs by way of these mechanisms after all, in spite of the fact that they are formally suspect. And if that is so, then perhaps other kinds of beings might come to form justified beliefs on the basis of kinds of reasoning that (from a purely logical point of view) are formally suspect, but nonetheless reliable in the environments for which their cognitive faculties were designed.

Now consider the second kind of case, the case in which the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Why is it exactly (concerning Tucker’s example) that we are inclined to deny that a person’s belief that his students dislike a particular type of beer could justify the belief that Gödel’s incompleteness theorem is true? Perhaps it is because there does not appear to be any interesting logical connection between the content of the latter belief and the belief on which it is based. But a similar observation concerning the relationship between our sense experiences and the content of our perceptual beliefs is part of what motivates Bergmann’s proper functionalist theory.

As Bergmann (2006: 119) points out, “Thomas Reid emphasized that there does not seem to be any logical connection between our sense experiences and the content of the beliefs based on them.” Bergmann notes, for example, that “the tactile sensations we experience when touching a hard surface seem to have no logical relation to (nor do they resemble) the content of the hardness beliefs they prompt.” Because this is so, Bergmann argues that the evidential support relations that hold between various sensory experiences and the beliefs formed in response to them cannot be explained in terms of necessary connections. But this prompts the question as to what does explain these support relations. Bergmann (2006: 130-131) argues that proper functionalism provides a good answer to this question. The connections are to be explained by way of which belief-forming responses to sensory inputs are specified by the cognitive design plan.

To accept this motivation for proper functionalism is to accept the claim that at least some epistemic support relations hold only contingently. It is also to countenance the possibility that the epistemic support relations that hold for certain cognizers might seem utterly bizarre from the perspective of creatures like us. So, perhaps, for those who do take this motivation on board, the possibility of an agent’s coming to justifiably believe that Gödel’s incompleteness theorem is true solely on the basis of a belief concerning the beer preferences of his students no longer seems so counterintuitive. (See Bergmann (2006: 141) for a similar response to BonJour’s purported counterexamples to externalist views of justification involving reliably formed clairvoyant beliefs).

3. Rival Theories

Proper functionalist theories do not exist in a vacuum. A full appreciation of their merits or demerits requires an investigation into how well they stack up against their rivals. Two kinds of theories in particular are often put up against proper functionalism: phenomenal conservatism and virtue epistemology. It is sometimes claimed by the proponents of these theories that they satisfy many of the same motivations as proper functionalism, while having fewer costs, as well as other advantages.

a. Proper Functionalism and Phenomenal Conservatism

At least to a first approximation, a phenomenal conservative theory of doxastic justification may be characterized as the view that a belief with the content that p is justified for an agent if it seems to the agent that p, the agent appropriately bases her belief that p on that seeming, and the agent has no defeaters for that belief. (See Phenomenal Conservatism for more details). As noted in Section 2a, proper functionalists about justification point to the apparent contingency of the connection between various experiences and the beliefs they justify as a motivation for their view. Phenomenal conservatives sometimes claim that their view does just as well at accommodating this apparent contingency while preserving the claim that there is a necessary connection between the things that justify our beliefs and the beliefs they justify. For this reason, phenomenal conservatism might be thought to do a better job than proper functionalism in accommodating the New Evil Demon intuition. Some phenomenal conservatives have also contended that it does a better job in accounting for the nature of evidential support.

Tucker (2011: 58-63) presses this point in connection with his objection (discussed in Section 2b) that proper functionalism allows for inputs which intuitively fail to provide any kind of epistemic support for a belief to justify that belief. In the example previously discussed, Tucker pointed to an instance in which a belief served as such an input. But Tucker also supplies examples in which the same seems to be true of the support relations that hold between various sensory experiences and the beliefs they are purported to justify. He notes, for example, that it is counterintuitive to think that a sensory experience associated with seeing a beautiful sunset could justify the belief that Gödel’s incompleteness theorem is true. But a design plan (presumably different from ours) might well specify that this is an appropriate belief-forming process.

Here the proper functionalist might attempt once more to press the Reidian point that in general it appears true that there is no inherent connection between our sensory experiences and the contents of the beliefs based on them. But Tucker (2011: 56-58, 61-63) suggests a way the advocate of phenomenal conservatism could account for the role that sensory experience plays in justifying our beliefs that accommodates this fact. According to Tucker, sensory experience might play a role in the justification of a certain belief by triggering a seeming with the content of that belief, it being a contingent matter which sensations trigger which seemings. Andrew Cullison (2013: 34-37) makes a similar suggestion, noting that just as two different sentences from different languages might well express the same proposition, two different kinds of cognitive apparatus associated with different species might cause seemings of the same content in response to differing kinds of phenomenology. This accommodates the Reidian point while preserving the claim that there is a necessary connection between the things that justify our beliefs (that is, our seeming states) and the beliefs they justify (via the identities of their contents).

Suppose one agrees that a phenomenal conservative view of justification does better than a proper functionalist view on these counts. This of course does not commit one to agreeing that phenomenal conservatism does better than proper functionalism overall. Bergmann (2013) argues, for example, that proper functionalists can accommodate many of the intuitions that motivate phenomenal conservatism, while also doing a better job in accommodating the intuition that some belief formations, downstream from sensory experiences, are objectively fitting responses to those experiences, whereas others are not.

Bergmann notes, for instance, that proper functionalists might adopt a model according to which, for humans (though not necessarily for all cognizers), when all goes well, a belief formed in response to a sensory experience is justified via being based on an intermediate seeming (one that is appropriately caused by the experience). He argues that this model accommodates many of the intuitions to which phenomenal conservatives appeal. But it also, he points out, allows for the possibility that there is an objective mismatch between a belief formed in response to a sensory experience and the nature of that experience, one which prevents the belief in question from being justified, even when the content of that belief matches the content of the intermediate seeming.

Bergmann describes, for example, a case in which a human cognizer, suffering from brain damage, forms the belief that she is holding a hard spherical object, in response to the olfactory sensation she experiences while smelling a lilac bush. Even if it is stipulated that she bases this belief on an intermediate seeming with the same content as her belief, it can still seem that her belief is objectively unfitting (in relation to her experience) and, for that reason, unjustified. A proper functionalist can accommodate this intuition, Bergmann claims, whereas a phenomenal conservative cannot. The proper functionalist can maintain that the reason the cognizer’s belief is objectively unfitting in this case is that, even though it is based on an appropriate intermediate seeming, it is not the appropriate response to the relevant sensation; it is not the belief her design plan specifies should result.

Relatedly, one might think that proper functionalism does better than phenomenal conservatism in accounting for the relation between justification and truth-aptness. A common objection to phenomenal conservative views is that they suffer from a “cognitive penetration” problem. In certain kinds of wishful thinking cases, for example, a seeming state might be caused by a desire; and in some such cases the believer in question will be unaware of this fact, and have no defeater for the belief in question. According to phenomenal conservatives, a belief properly based on such a seeming will still be justified. But to many this seems wrong. One explanation for why this consequence seems wrong is that it threatens to radically undermine the connection between justification and truth. A proper functionalist, by contrast, might maintain that when such cognitively penetrated seemings are produced in human beings, this is due either to cognitive malfunction or to one of the non-truth-aimed facets of our cognitive design plan (either of which, according to her view, would render the belief unjustified). See Tucker (2014a), however, for an argument that proper functionalists also suffer from cognitive penetration problems.

b. Proper Functionalism and Virtue Epistemology

According to John Greco (1993: 414), “the central idea of virtue epistemology is that, Gettier problems aside, knowledge is true belief which results from one’s cognitive virtues.” Similarly, Sosa (1993: 64) characterizes it as consisting of a family of theories which may be seen as “varieties of a single more fundamental option in epistemology, one which puts the explicative emphasis on truth-conducive intellectual virtues or faculties.”

Virtue epistemology is often thought of as coming in at least two varieties. Virtue responsibilists emphasize character traits: intellectual virtues such as open-mindedness, conscientiousness, perseverance in seeking the truth, and so on. Virtue reliabilists emphasize cognitive faculties, abilities, or competencies. (See Virtue Epistemology for more details.) Of these two, it is virtue reliabilism that is most akin to proper functionalism. Accordingly, virtue reliabilism serves as a closer competitor. Or rather, since Greco (1993: 414) and Sosa (1993: 64) have both classified proper functionalism as a version of virtue epistemology, perhaps it should be said that it is the non-proper-functionalist versions thereof which may be seen as close competitors. For ease of exposition, the following discussion will focus on Sosa’s development of such a version.

According to Sosa’s (2015: 10) virtue theory of knowledge, knowledge is “apt belief” where apt belief is “belief that gets it right through competence rather than luck.” More precisely, according to Sosa, an apt belief is a belief that sufficiently manifests an “epistemic competence” (that is, a competence to get at the truth) (p. 9), where “a competence is in turn understood as a disposition to succeed in a given field of aimings, these being performances with an aim, whether the aim be intentional and even conscious, or teleological and functional” (p. 2). Note the similarity to proper functionalism here. Sosa’s epistemic competences are akin to Plantinga’s truth-aimed cognitive faculties. Both involve the property of being aimed at the formation of true beliefs, and both (when all goes well) are exercised in a way that is conducive to the fulfilment of that aim.

One way in which Sosa’s epistemic competences differ from Plantinga’s truth-aimed cognitive faculties, however, is that the former do not initially seem to presuppose any notion of a design plan. And this might make Sosa’s theory more adept at accommodating things like Swampman scenarios (see the discussion in Section 1c). Indeed, it was Sosa (1993) who made famous that objection to proper functionalism. It might also make Sosa’s view more appealing to those who are both naturalistically inclined and skeptical about the prospects for a naturalistically acceptable account of cognitive proper function.

Proper functionalists have called into question whether Sosa’s view does in fact have these advantages. Plantinga (1993c: 79) has argued, for example, that in order to handle The Case of the Epistemically Serendipitous Brain Lesion (discussed in Section 1a), Sosa’s epistemic virtues must involve competencies or faculties that are subject to proper function conditions. If that is right, then, as Plantinga (p. 81) points out, Sosa’s view (developed so as not to be subject to this counterexample) becomes a variety of proper functionalism. It should be noted, however, that virtue epistemologists may have other ways of dealing with this case. John Greco (2010: 152) has suggested, for instance, that “in the brain lesion case, the problem is not so much a lack of health as it is a lack of cognitive integration.” “The cognitive processes associated with the brain lesion,” claims Greco, “are not sufficiently integrated with other of the person’s cognitive dispositions so as to count as being part of cognitive character.” Whether this reply is successful may turn on just what is necessary for a cognitive process to exhibit the kind of cognitive integration required. Greco (2010: 152) suggests his own, non-proper-functionalist criteria. But it is open to proper functionalists to argue that part of what is required is incorporation into one’s cognitive design plan.

More recently, Boyce and Moon (2015) have argued that there are other kinds of cases that pose a challenge to the claim that a true belief manifesting a competence is sufficient for its being an item of knowledge. As noted in Section 1c, Boyce and Moon propose a counterexample to what they regard as the central intuition underlying the Swampman Objection to proper functionalism. Their counterexample employs some of the cognitive science literature on initial knowledge, which supports the claim that human beings sometimes come to know things by way of innate, unlearned cognitive responses (see for example Spelke, 1994). Drawing from this literature (as well as from Bergmann, 2006: 116-121), Boyce and Moon argue that some of these innate responses are merely contingently appropriate ways of forming beliefs (where the appropriateness at issue is of a kind necessary for warrant). They argue that while these responses are appropriate for human beings, given the kinds of environments to which humans are adapted, the same need not have been true for other kinds of beings.

Boyce and Moon then go on to argue that these facts entail there are possible cases involving two cognitive agents, who are members of different species, coming to hold the same belief, in the same way, in the same environment, but in which that belief is warranted for one of them (on account of its resulting from a way of forming beliefs that is appropriate for members of that species) but not the other. They further argue that not only do these cases furnish counterexamples to the central intuition motivating the Swampman objection to proper functionalism, but that they also provide a challenge to alternative theories. Boyce and Moon suggest, for instance, that they afford potential counterexamples to Sosa’s theory, at least insofar as it does not recognize factors such as proper function conditions or species membership as relevant to competence possession.

Proper functionalists point to the kinds of cases alluded to above as lending support to the view that a belief having arisen by way of cognitive proper function is necessary for it to count as an item of knowledge. It should be acknowledged, however, that virtue epistemologists have pointed to other kinds of cases in which the opposite seems true. John Greco (2010: 151-153) has noted, for example, that there appear to be “cases of improper function that actually increase a person’s capacity to know.” Greco cites various cases documented by the neurologist Oliver Sacks (1970) in order to illustrate this point. “An obvious example,” says Greco, “is the story of autistic twins, who enjoyed incredible mathematical abilities associated with their autism.” Another case is that of “a man whose illness resulted in an increase in detail and vividness regarding childhood memory.” So much so, Greco notes, that when “these memories were put to use in accurate and detailed paintings of the man’s hometown in Italy…the man came to be considered an expert on the layout and appearance of that town, even though he had not visited there in decades.” Greco claims that these are cases in which “dysfunction gave rise to knowledge.”

What might a proper functionalist say in response to these scenarios? A couple of strategies are suggested by Plantinga (1993c: 74-75) in a reply to Richard Feldman (1993: 48-49). Feldman also points to these kinds of cases as creating difficulties for proper functionalism; in particular, Feldman cites the case of the autistic twins described above. As Feldman notes, these twins had the ability to “just see” (apparently without counting) that the number of matches that had fallen out of a box was 111. In his reply, Plantinga further notes that these same twins could also “just see,” it seems, whether a given six or eight digit number was prime. The first strategy Plantinga suggests for dealing with these cases is to call into question whether the individuals involved really do acquire knowledge in the scenarios described. The second is to concede (at least for the sake of argument) that they do, but argue that this is consistent with proper functionalism.

Regarding the first strategy, Plantinga notes that while the twins mentioned above can in fact reliably identify prime numbers, they lack, according to Sacks, the concept of multiplication. But if the twins lack the concept of multiplication, Plantinga argues, it is not clear that they genuinely grasp the concept of a prime number; so it is not clear that they have the relevant beliefs. Plantinga concedes, however, that this is a less plausible thing to say regarding the twins’ ability to discern the number of matches that had fallen out of a box. Here Plantinga turns to the second strategy. He concedes that while the twins’ “faculties obviously seem to malfunction in some ways,” it is doubtful that they are malfunctioning in producing the belief that there are 111 matches on the floor. Plantinga suggests that, perhaps, the twins have a different design plan than that of other human beings, and that this belief-forming tendency of theirs is subject to proper function conditions. In support of this claim, he notes that it seems possible that this remarkable ability of theirs might become damaged (in such a way that it is no longer reliable); in that case, he contends, we would be inclined to say that this ability had malfunctioned.

Another possibility open to proper functionalists is to concede that these are cases in which cognitive malfunction enables the acquisition of knowledge, but only by way of truth-aimed proper function. If a typical human being, as a result of cognitive malfunction, suddenly found it seeming to her that she could just see that 111 matches had fallen out of a box, we might doubt that she really knows there are 111 matches. We might think that this belief, formed by way of this new-found tendency of hers, fails to count as knowledge, unless or until she has independent confirmation that the tendency is reliable. Once she does have such confirmation, we might concede that the resulting beliefs do count as knowledge, but only because she learned this to be a reliable way of getting at the truth. So perhaps, in at least some of the cases at issue, the individuals in question do acquire knowledge via belief-forming tendencies resulting from cognitive malfunction, but only by way of having learned those tendencies to be reliable. And if this learning occurs by way of cognitive processes that are in accord with proper function, these cases pose no difficulties for a proper functionalist theory.

This is perhaps not a plausible thing to say regarding all of these cases, however. It is not as plausible a thing to say regarding the individual whose illness caused him to form detailed memorial beliefs pertaining to his hometown in Italy, for example. One reason this is a less plausible thing to say concerning that case is that the person in question is (presumably) forming these beliefs in response to memory phenomenology, which is an epistemically appropriate way for human beings to form beliefs downstream from experience. We would be much less likely to judge this person as having knowledge if these same beliefs arose, say, in response to the kind of phenomenology associated with a vivid daydream, unaccompanied by memorial seemings, even if the resulting beliefs should turn out to be reliably formed. So, the proper functionalist might say, if cognitive malfunction is somehow enabling the acquisition of knowledge in this case, it is not by virtue of causing the subject to respond deviantly to his experience (since, in that regard, he is responding as proper function dictates). It must, rather, be by virtue of its causing some deviation upstream from experience (that is, by virtue of its producing an abnormality in the manner in which the subject’s memorial experiences are produced). Whether this creates a significant problem for proper functionalism, furthermore, may depend on just how the malfunction in question enables knowledge.

However exactly memory information is processed, stored, and retrieved so as to generate belief-producing memorial experiences, it is plausible that the cognitive system responsible (or set of systems responsible) has different functions associated with it. One of these functions is to generate experiences that reliably produce true beliefs. But no doubt there are other functions associated with this system that do not pertain to that goal (indeed, some may even be in tension with it). It is plausible, for instance, that some of those functions pertain to filtering information as it comes in, either by preventing some of that information from being stored in the first place, discarding some of that information after it has been stored, or preventing some of it from being encoded in the relevant experiences. The purpose of this filtration process might not be to secure the production of true beliefs, but to prevent various kinds of information overload, or to highlight important items of information at the expense of discarding others. Plausibly, what occurs in the case at issue is that a malfunction results in the suppression of these kinds of functions, leaving various other truth-aimed functions associated with the production of the relevant memorial experiences intact.

This consideration suggests yet another possible strategy the proper functionalist might have for dealing with these kinds of cases. Yes, she might grant, some of these are cases in which cognitive malfunction enables knowledge, but not by way of interfering with truth-aimed cognitive proper function (at least not with respect to the process that issued in the relevant beliefs). In at least some of these cases, the malfunction enables knowledge by preventing various non-truth-aimed aspects of cognitive proper function from interfering with or dampening various truth-aimed aspects (or perhaps by preventing some truth-aimed aspects of cognitive proper function from interfering with or dampening various other truth-aimed aspects). The consequence is that certain truth-aimed aspects of cognitive proper function result in various items of knowledge they would not have otherwise produced. So even though these are cases in which cognitive malfunction enables knowledge, the proper functionalist might say, they are not counterexamples to the claim that knowledge itself must come by way of truth-aimed cognitive proper function. Or, she might insist, to the extent to which it is unclear that these purported items of knowledge come by way of truth-aimed cognitive proper function, it is also unclear that we should count them as genuine items of knowledge.

It should be pointed out that many virtue theories of knowledge also quite naturally lend themselves to virtue theories of justification. As Sosa (2007: 22-23) points out, for instance, an agent can manifest skill in a performance even when that performance fails to achieve its aim or achieves it merely by luck. An archer might take a skillful shot (to use one of Sosa’s frequent analogies), for instance, while still missing the target (or hitting it only by luck) on account of erratic wind conditions. Similarly, a believer might manifest her skill at coming to hold true beliefs while nonetheless getting it wrong (or getting it right only by luck) on account of being in an epistemically bad environment. Under these circumstances, the belief in question may be said (in Sosa’s terminology) to be “adroit” but not “apt” (p. 23). A belief that is adroit, according to Sosa, may be said to be justified (in one good sense at least) even if it is not an item of knowledge (BonJour and Sosa 2003: 157).

As Sosa (2015: 26-27) himself is well aware, the having of a skill presupposes something like a normal environment. As Sosa points out, we do not say that a person lacks driving skill merely because she is disposed to perform poorly on an icy road in the midst of a snowstorm. What matters is whether she is disposed to perform well under ordinary driving conditions. Similarly, what matters for whether an agent is skilled at coming to hold true beliefs is whether she is capable of doing so in a certain kind of environment. But which sort of environment is the relevant one? According to Bergmann, this question points to an area in which a proper functionalist theory of justification has the advantage.

As Bergmann (2006: 142-143) notes, in a 1991 work Sosa takes justification to be relativized to an environment. The person in the demon world has justified beliefs relative to our environment, according to this view, but not relative to her own. Similarly, alien cognizers who have radically different methods of belief formation than we do (ones that are adapted to their own environment) may have beliefs that are justified relative to their environment but not relative to ours. Bergmann argues, however, that our ordinary concept of justification does not appear to be relativized in this way.

In later work from 2003, as Bergmann also points out, Sosa holds that there are two different senses in which a belief might be said to be justified. A belief is “adroit-justified” if the method by which it is formed is reliable in the actual world, and “apt-justified” if the method by which it is formed is reliable in the subject’s world. As Bergmann notes, however, this view does not account for our intuition that there is a single sense in which our beliefs, the alien cognizers’ beliefs, and the demon victims’ beliefs are all justified.

Proper functionalism, by contrast, Bergmann maintains, has no difficulty accommodating these intuitions, since it holds that the relevant environment is the one specified by the design plan (which is the same between us and the demon victims but different for the alien cognizers). Whether Bergmann points to a genuine advantage of his theory over Sosa’s in this regard has, however, been disputed. Markie (2009: 374-377) argues, for example, that Bergmann’s own theory faces disadvantages akin to those he attributes to Sosa’s.

As with most disputed views, the extent to which one is drawn to proper functionalist theories will depend in large measure on which intuitions one has, the relative weight one assigns to them, one’s assessment of how well the theories in question accommodate those intuitions, and whether their rivals do any better. And here one’s mileage may vary. But it is a safe bet that proper functionalist theories will continue to serve as serious contenders for the foreseeable future.

4. References and Further Reading

Bergmann, Michael. 2006. Justification Without Awareness: A Defense of Epistemic Externalism (Oxford: Oxford UP).
Bergmann, Michael. 2013. “Externalist Justification and the Role of Seemings” Philosophical Studies 166: 163-184.
BonJour, Laurence and Sosa, Ernest. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues (Malden, MA: Blackwell Publishing).
Boyce, Kenneth and Plantinga, Alvin. 2012. “Proper Functionalism” The Continuum Companion to Epistemology, ed. Andrew Cullison (London: Continuum International Publishing Group).
Boyce, Kenneth and Moon, Andrew. 2015. “In Defense of Proper Functionalism: Cognitive Science Takes on Swampman,” Synthese Online First: DOI 10.1007/s11229-015-0899-6. http://link.springer.com/article/10.1007/s11229-015-0899-6.
Crisp, Thomas M. “Gettier and Plantinga’s Revised Account of Warrant” Analysis 60: 42-50.
Chignell, Andrew. 2003. “Accidentally True Belief and Warrant” Synthese 137: 445-458.
Cullison, Andrew. 2013. “Seemings and Semantics” Seemings and Justification, ed. Chris Tucker (Oxford: Oxford UP).
Feldman, Richard. 1993. “Proper Functionalism” Nous 27: 34-50.
Goldman, Alvin “The Sciences and Epistemology” The Oxford Handbook of Epistemology (Oxford: Oxford UP).
Graham, Peter. 2012. “Epistemic Entitlement” Nous 46: 449-482.
Graham, Peter. 2014. “Warrant, Functions, History” Naturalizing Epistemic Virtue, eds. Abrol Fairweather and Owen Flanagan (Cambridge: Cambridge University Press).
Greco, John. 1993. “Virtues and Vices of Virtue Epistemology” Canadian Journal of Philosophy 23: 413-432.
Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity (Cambridge: Cambridge University Press).
Kvanvig, Jonathan L. (ed.). 1996. Warrant in Contemporary Epistemology: Essays in Honor of Plantinga’s Theory of Knowledge (London: Rowman & Littlefield Publishers).
Lehrer, Keith, and Cohen, Stewart. 1983. “Justification, Truth, and Coherence” Synthese 55: 191-207.
Long, Todd R. 2012. “Mentalist Evidentialism Vindicated (and a Super-Blooper Epistemic Design Problem for Proper Function justification)” Philosophical Studies 157: 251-266.
Markie, Peter. 2009. “Justification and Awareness” Philosophical Studies 146: 361-377.
McNabb, Tyler Dalton. 2015. “Warranted Religion: Answering Objections to Alvin Plantinga’s Epistemology” Religious Studies 51: 477-495.
Millikan, Ruth. 1984. “Naturalist Reflections on Knowledge” Pacific Philosophical Quarterly, 4: 315-334.
Otte, Richard. 1987. “A Theistic Conception of Probability” Faith and Philosophy 4: 427-447.
Plantinga, Alvin. 1993a. Warrant: The Current Debate (Oxford: Oxford UP).
Plantinga, Alvin. 1993b. Warrant and Proper Function (Oxford: Oxford UP).
Plantinga, Alvin. 1993c. “Why We Need Proper Function” Nous 27: 66-82.
Spelke, Elizabeth. 1994. “Initial Knowledge: Six Suggestions” Cognition 50: 431-445.
Sacks, Oliver. 1970. The Man Who Mistook His Wife for a Hat (New York: HarperCollins).
Sosa, Ernest. 1993. “Proper Functionalism and Virtue Epistemology” Nous 27: 51-65.
Sosa, Ernest. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Vol. 1 (Oxford: Oxford UP).
Sosa, Ernest. 2015. Judgement and Agency (Oxford: Oxford UP).
Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism” Evidence and Religious Belief, eds. Kelly James Clark and Raymond J. VanArragon (Oxford: Oxford UP).
Tucker, Chris. 2014a. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too” Dialectica 68: 35-62.
Tucker, Chris. 2014b. “On What Inferentially Justifies What: The Vices of Reliabilism and Proper Functionalism,” Synthese 191: 3311-3328.
Wolterstorff, Nicholas. 2010. “Ought to Believe—Two Concepts” Practices of Belief: Selected Essays, Vol. 2, ed. Terence Cuneo (Cambridge: Cambridge University Press).

Author Information

Kenneth Boyce
Email: boyceka@missouri.edu
University of Missouri
U. S. A.

Dynamic Epistemic Logic

This article tells the story of the rise of dynamic epistemic logic. The rise began in the 1960s with the creation and development of epistemic logic, the logic of knowledge. Then in the late 1980s came dynamic epistemic logic, the logic of change of knowledge. Much of it was motivated by puzzles and paradoxes.

The number of active researchers in these logics grows significantly every year because there are so many relations and applications to computer science, to multi-agent systems, to philosophy, and to cognitive science.

The modal knowledge operators in epistemic logic are formally interpreted by employing binary accessibility relations in multi-agent Kripke models (relational structures), where these relations should be equivalence relations to respect the properties of knowledge. The operators for change of knowledge correspond to another sort of modality, more akin to a dynamic modality. A peculiarity of this dynamic modality is that it is interpreted by transforming the Kripke structures used to interpret knowledge, and not, at least not at first sight, by an accessibility relation given with a Kripke model. Although called “dynamic epistemic logic,” this two-sorted modal logic applies to more general settings than the logic of merely S5 knowledge.

The present article discusses in depth the early history of dynamic epistemic logic. It then mentions briefly a number of more recent developments involving factual change, one (of several) standard translations to temporal epistemic logic, and a relation to situation calculus (a well-known framework in artificial intelligence to represent change). Special attention is then given to the relevance of dynamic epistemic logic for belief revision, for speech act theory, and for philosophical logic. The part on philosophical logic pays attention to Moore sentences, the Fitch paradox, and the Surprise Examination.

Introduction
An Example Scenario
A History of DEL
1. Announcements
2. Other Informative Events
DEL and Belief Revision
DEL and Language
DEL and Philosophy
References and Further Reading

1. Introduction

In this overview we tell the story of the rise of dynamic epistemic logic. It is a bit presumptuous to call it a rise, but we can only observe this rather peculiar phenomenon. The number of active researchers in these logics grows every year because there are so many relations to computer science, to multi-agent systems, to philosophy, and to cognitive science. It all began with the logic of knowledge in the 1960s, and much of it was motivated by puzzles and paradoxes.

Dynamic epistemic logic is the logic of changing knowledge. The starting point of dynamic epistemic logic (DEL) is therefore the logic of knowledge. A founding publication is [42]. We refer to [41] for an overview of epistemic logic and references. A key feature of epistemic logic is that the information state of several agents can be represented by a Kripke model. Given a set of agents and a set of propositional variables, a Kripke model consists of a set of states, a set of accessibility relations (each one a binary relation on the domain), namely one for each agent, and a valuation (that tells which propositional variables are true in which states). In epistemic logic the set of states of a Kripke model is interpreted as a set of epistemic alternatives. The information state of an agent consists of those epistemic alternatives that are possible according to the agent, which is represented by the binary accessibility relation R_α. An agent α knows that a proposition φ is true in a state a of a Kripke model M (M, a ⊨ K_αφ), if and only if that proposition φ is true in all the states that agent α considers possible in that state (that is, which are R_α-accessible from a). A proposition known by agent α may itself pertain to the knowledge of some agent (for instance if one considers the formula K_αK_βψ). In this way, a Kripke model with accessibility relations for all the agents represents the (higher-order) information of all relevant agents simultaneously.
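
The definition above can be made concrete with a small sketch. The class and method names below (KripkeModel, accessible, holds, knows) are illustrative choices and not notation from the literature, and the two-state model with agents "a" and "b" is a made-up toy example. Formulas are represented as functions from states to truth values, so that knowledge operators can be nested.

```python
# Minimal sketch of a multi-agent Kripke model (all names are illustrative).

class KripkeModel:
    def __init__(self, states, relations, valuation):
        self.states = set(states)    # the epistemic alternatives
        self.relations = relations   # agent -> set of (state, state) pairs
        self.valuation = valuation   # variable -> set of states where it is true

    def accessible(self, agent, state):
        """States that are accessible for `agent` from `state`."""
        return {t for (s, t) in self.relations[agent] if s == state}

    def holds(self, var):
        """The formula for a propositional variable."""
        return lambda state: state in self.valuation[var]

    def knows(self, agent, formula):
        """The formula K_agent(formula): true in every accessible state."""
        return lambda state: all(formula(t) for t in self.accessible(agent, state))


# Toy model: p is true only at s; agent "a" cannot tell s from t, agent "b" can.
model = KripkeModel(
    states={"s", "t"},
    relations={
        "a": {("s", "s"), ("s", "t"), ("t", "s"), ("t", "t")},
        "b": {("s", "s"), ("t", "t")},
    },
    valuation={"p": {"s"}},
)

p = model.holds("p")
K_a_p = model.knows("a", p)
K_b_p = model.knows("b", p)
K_a_K_b_p = model.knows("a", K_b_p)   # higher-order knowledge

print(K_a_p("s"), K_b_p("s"), K_a_K_b_p("s"))   # False True False
```

In state s, agent b knows p, but agent a neither knows p nor knows that b knows p, since a also considers t possible and p fails there.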

In DEL, information change is modeled by transforming Kripke models. Since DEL is mostly about information change due to communication, the model transformations usually do not involve factual change. The bare physical facts of the world remain unchanged, but the agents’ information about the world changes. In terms of Kripke models that means that the accessibility relations of the agents have to change (and consequently the set of states of the model might change as well). Modal operators in dynamic epistemic languages denote these model transformations. The accessibility relation associated with these operators is not one within the Kripke model, but pertains to the transformation relation between the Kripke models, as the example in the next section will show.

In Section 2 an example scenario is presented which can be captured by DEL. In Section 3 an historical overview of the main approaches in DEL is presented, with details on their modelling techniques. Section 4 discusses how to model belief revision in DEL. Section 5 connects ideas between speech act theory and DEL. Finally, Section 6 is on the relation between DEL and philosophy: it deals with Moore sentences, the Fitch paradox, and the Surprise Examination.

2. An Example Scenario

Figure 1: A Kripke model for the situation in which two agents are each given a red or a white card.

Consider the following scenario: Ann and Bob are each given a card that is either red or white on one side (the face side) and nondescript on the other side (the back side). They only see their own card, and so they are ignorant about the other agent’s card. There are four possibilities: both have white cards, both have red cards, Ann has a white card and Bob has a red card, or the other way round. These are the states of the model, and are represented by informative names such as rw, meaning Ann was dealt a red card (r) and Bob was dealt a white card (w). Let us assume that both have red cards, that is, let the actual state be rr. This is indicated by the double lines around state rr in Figure 1. The states of the Kripke model are connected by lines, which are labeled (α or β, denoting Ann or Bob respectively) to indicate that the agents cannot distinguish the states thus connected. (To be complete it should also be indicated that no state can ever be distinguished from itself. For readability these “reflexive lines” are not drawn, but indeed the accessibility relations R_α and R_β are equivalence relations, since epistemic indistinguishability is reflexive, symmetric and transitive.) In the model of Figure 1 there are no α-lines between those states where Ann has different cards, that is, she can distinguish states at the top, where she has a red card, from those at the bottom, where she has a white one. Likewise, Bob is able to distinguish the left half from the right half of the model. This represents the circumstance that Ann and Bob each know the colour of their own card but not the colour of the other’s card. In the Kripke model of Figure 1 we also see that the higher-order information is represented correctly. Both agents know that the other agent knows the colour of his or her card, and they know that they know this, and so on. It is remarkable that a single Kripke model can represent the information of both agents simultaneously.
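
As a hedged illustration, the model of Figure 1 can be written down and checked mechanically. The sketch assumes, as the text states, that each agent can distinguish two states exactly when their own card differs; the function names and the agent labels "ann" and "bob" are illustrative.

```python
from itertools import product

# States are named as in the text: "rw" means Ann holds red, Bob holds white.
STATES = ["".join(pair) for pair in product("rw", repeat=2)]   # rr, rw, wr, ww

def indistinguishable(agent, s, t):
    """An agent cannot tell two states apart when their own card is the same."""
    position = 0 if agent == "ann" else 1
    return s[position] == t[position]

def accessible(agent, s):
    return [t for t in STATES if indistinguishable(agent, s, t)]

def knows(agent, formula):
    """K_agent(formula) as a function from states to truth values."""
    return lambda s: all(formula(t) for t in accessible(agent, s))

def is_equivalence(agent):
    """Check reflexivity, symmetry and transitivity of the agent's relation."""
    reflexive = all(indistinguishable(agent, s, s) for s in STATES)
    symmetric = all(indistinguishable(agent, t, s)
                    for s in STATES for t in accessible(agent, s))
    transitive = all(indistinguishable(agent, s, u)
                     for s in STATES
                     for t in accessible(agent, s)
                     for u in accessible(agent, t))
    return reflexive and symmetric and transitive

ann_red = lambda s: s[0] == "r"
bob_red = lambda s: s[1] == "r"
bob_white = lambda s: s[1] == "w"
# "Bob knows the colour of his own card", whichever colour it is.
bob_knows_own = lambda s: knows("bob", bob_red)(s) or knows("bob", bob_white)(s)

actual = "rr"
print(knows("ann", ann_red)(actual))         # True: Ann knows her card is red
print(knows("ann", bob_red)(actual))         # False: she does not know Bob's colour
print(knows("ann", bob_knows_own)(actual))   # True: she knows that Bob knows his own
print(is_equivalence("ann") and is_equivalence("bob"))   # True
```

The third check is an instance of the higher-order information mentioned above: Ann does not know Bob's colour, but she does know that Bob knows it.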

Figure 2: A Kripke model for the situation after Ann tells Bob she has a red card.

Figure 3: A Kripke model for the situation after Ann might have looked at Bob’s card.

Suppose that after picking up their cards, Ann truthfully says to Bob “I have a red card”. The Kripke model representing the resulting situation is displayed in Figure 2. Now both agents know that Ann has a red card, and they know that they know she has a red card, and so on: it is common knowledge among them. (A formula φ is common knowledge among a group of agents if everyone in the group knows that φ, everyone knows that everyone knows that φ, and so on.) Hence there is no need anymore for states where Ann has a white card, so those do not appear in the Kripke model. Note that in the new Kripke model there are no longer any lines labeled β. No matter how the cards were dealt, Bob only considers one state to be possible: the actual one. Indeed, Bob is now fully informed.

Now that Bob knows the colour of Ann’s card, Bob puts his card face down on the table, and leaves the room for a moment. When he returns he considers it possible that Ann took a look at his card, but also that she didn’t. Assuming she did not look, the Kripke model representing the resulting situation is the one displayed in Figure 3. In contrast to the previous model, there are in this model lines for Bob again. This is because he is no longer completely informed about the situation. He does not know whether Ann knows the colour of his card, yet he still knows that both Ann and he have a red card. Only his higher-order information has changed. Ann on the other hand knows whether she has looked at Bob’s card and also knows whether she knows the colour of Bob’s card. She also knows that Bob considers it possible that she knows the colour of his card. In the model of Figure 3 we see that two states representing the same factual information can differ by virtue of the lines connecting them to other states: the state rr on the top and rr on the bottom only differ in higher-order information.

In this section, we have seen two ways in which information change can occur. Going from the first model to the second, the information change was public, in the sense that all agents received the same information. Going from the second to the third model involved information change where not all agents had the same information, because Bob did not know whether Ann looked at his card while he was away. The task of DEL is to provide a logic with which to describe these kinds of information change.

3. A History of DEL

DEL did not arise in a scientific vacuum. The “dynamic turn” in logic and semantics ([72], [34] and [60]) very much inspired DEL, and DEL itself can also be seen as a part of the dynamic turn. The formal apparatus of DEL closely resembles that of propositional dynamic logic (PDL) [40] and quantified dynamic logic (QDL) [39]. There is also a relation to update semantics (US) [36, 93], although in DEL, unlike in US, not every formula is interpreted dynamically: formulas and updates are clearly distinguished.

The study of epistemic logic within computer science and AI led to the development of epistemic temporal logic (ETL) in order to model information change in multi-agent systems (see [25] and [55]). Rather than model change by modal operators that transform the model, change is modeled by the progression of time in these approaches. Yet the kinds of phenomena studied by ETL and DEL largely overlap.

After this brief sketch of the context in which DEL was developed, the remainder of the section focuses on the development of its two main approaches. The first is public announcement logic, which is presented in Section 3a. The second, presented in Section 3b, covers more general informative events and is the dominant approach in DEL (sometimes identified with DEL).

a. Announcements

The original publication: Plaza

The first dynamic epistemic logic, called public announcement logic (PAL), was developed by Plaza in [61], published in 1989. The example where Ann says to Bob that she has a red card is an example of a public announcement. A public announcement is a communicative event where all agents receive the same information and it is common knowledge among them that this is so. The language of PAL is given by the following Backus-Naur Form:

φ ::= p | ¬φ | (φ ∧ ψ) | K_α φ | [φ]ψ

Besides the usual propositional language, K_αφ is read as agent α knows that φ, and [φ] ψ is read as after φ is announced ψ is the case. In the example above, we could for instance translate “After it is announced that Ann has a red card, Bob knows that Ann has a red card” as [r_α]K_βr_α.

An announcement is modeled by removing the states where the announcement is false, that is, by going to a submodel. This model transformation is the main feature of PAL’s semantics.
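The truth conditions behind this model transformation can be written out as follows. This is a reconstruction in standard notation rather than a quotation: it assumes a Kripke model M = (S, R, V) with valuation V, and the numbering (i) to (v) is chosen to agree with the reference to clause (v) in the surrounding text.

```latex
\begin{align*}
\text{(i)}\quad   & M, a \models p                      && \text{iff } a \in V(p) \\
\text{(ii)}\quad  & M, a \models \neg\varphi            && \text{iff } M, a \not\models \varphi \\
\text{(iii)}\quad & M, a \models \varphi \wedge \psi    && \text{iff } M, a \models \varphi \text{ and } M, a \models \psi \\
\text{(iv)}\quad  & M, a \models K_\alpha \varphi       && \text{iff } M, b \models \varphi \text{ for all } b \text{ such that } a R_\alpha b \\
\text{(v)}\quad   & M, a \models [\varphi]\psi          && \text{iff } M, a \models \varphi \text{ implies } M|\varphi, a \models \psi
\end{align*}
```

Here M|φ denotes the restriction of M to the states where φ holds.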

In clause (v) the condition that the announced formula be true at the actual state entails that only truthful announcements can take place. The model M|φ is the model obtained from M by removing all non-φ states. The new set of states consists of the φ-states of M. Consequently, the accessibility relations as well as the valuation are restricted to these states. The propositional letters true at a state remain true after an announcement. This reflects the idea that communication can only bring about information change, not factual change.
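Announcement as model restriction is easy to prototype. The sketch below is an illustration under stated assumptions: formulas are encoded as nested tuples, and the names `holds`, `restrict`, `r_ann` and the dictionary layout of the model are hypothetical choices, not notation from the literature. It evaluates the example formula [r_α]K_β r_α on the card model of Figure 1.

```python
def restrict(model, formula):
    """Return the submodel containing only the states where `formula` holds."""
    keep = {s for s in model["states"] if holds(model, s, formula)}
    return {
        "states": keep,
        "rel": {
            agent: {(s, t) for (s, t) in pairs if s in keep and t in keep}
            for agent, pairs in model["rel"].items()
        },
        "val": {atom: ext & keep for atom, ext in model["val"].items()},
    }


def holds(model, state, formula):
    """Evaluate a formula given as a nested tuple at a state of the model."""
    op = formula[0]
    if op == "atom":
        return state in model["val"][formula[1]]
    if op == "not":
        return not holds(model, state, formula[1])
    if op == "and":
        return holds(model, state, formula[1]) and holds(model, state, formula[2])
    if op == "K":  # ("K", agent, f): f holds in all accessible states
        agent, f = formula[1], formula[2]
        return all(holds(model, t, f) for (s, t) in model["rel"][agent] if s == state)
    if op == "ann":  # ("ann", f, g) encodes [f]g: if f is true, g holds after announcing f
        f, g = formula[1], formula[2]
        return (not holds(model, state, f)) or holds(restrict(model, f), state, g)
    raise ValueError(f"unknown operator: {op}")


states = {"rr", "rw", "wr", "ww"}  # first letter: Ann's card, second: Bob's
cards = {
    "states": states,
    "rel": {
        "ann": {(s, t) for s in states for t in states if s[0] == t[0]},
        "bob": {(s, t) for s in states for t in states if s[1] == t[1]},
    },
    "val": {
        "r_ann": {s for s in states if s[0] == "r"},
        "r_bob": {s for s in states if s[1] == "r"},
    },
}

r_ann = ("atom", "r_ann")
print(holds(cards, "rr", ("K", "bob", r_ann)))                  # False
print(holds(cards, "rr", ("ann", r_ann, ("K", "bob", r_ann))))  # True
print(sorted(restrict(cards, r_ann)["states"]))                 # ['rr', 'rw']
```

The restricted model has exactly the two states of Figure 2, and the valuation of the surviving states is unchanged, in line with the remark that announcements change information but not facts.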

Gerbrandy and Groeneveld’s approach

A logic similar to PAL was developed independently by Gerbrandy and Groeneveld in [32], which is more extensively treated in Gerbrandy’s PhD thesis [30]. There are three main differences between this approach and Plaza’s approach. First of all, Gerbrandy and Groeneveld do not use Kripke models in the semantics of their language. Instead, they use structures called possibilities which are defined by means of non-wellfounded set theory [1], a branch of set theory where the foundation axiom is replaced by another axiom. Possibilities and Kripke models are closely linked: possibilities correspond to bisimulation classes of Kripke models [18]. Later, Gerbrandy provided semantics without using non-wellfounded set theory for a simplified version of his public announcement logic [31].

The second difference is that Gerbrandy and Groeneveld also consider announcements that are not truthful. In their view, a logic for announcements should model what happens when new information is taken to be true by the agents. Hence, according to them, what happens to be true deserves no special status. This is more akin to the notion of update in US. In terms of Kripke models this means that by updating, agents may no longer consider the actual state to be possible, that is, R_α may no longer be reflexive. In a sense it would therefore be more accurate to call this logic a dynamic doxastic logic (a dynamic logic of belief) rather than a dynamic epistemic logic, since according to most theories, knowledge implies truth, whereas beliefs need not be true.

Thirdly, their logic is more general in the sense that subgroup announcements are treated (where only a subgroup of the group of all agents acquires new information); and especially private announcements are considered, where only one agent gets information. These announcements are modeled in such a way that the agents who do not receive information do not even consider it possible that anyone has learned anything. In terms of Kripke models, this is another way in which R_α may lose reflexivity.

Adding common knowledge

Semantics for public, group and private announcements using Kripke models was proposed by Baltag, Moss, and Solecki in [14]. This semantics is equivalent to Gerbrandy’s semantics (as was shown in [58]). The main contribution in [14] to PAL was that their approach also covered common knowledge, which is an important concept when one is interested in higher-order information and plays an important role in social interaction (cf. [92]). The inclusion of common knowledge poses a number of technical problems.

b. Other Informative Events

Groeneveld and Gerbrandy’s approach

In addition to a logic for announcements, Gerbrandy also developed a system for more general information change involving many agents, each of whom may have a different perspective. This is for instance the case when Ann may look at Bob’s card.

In order to model this information change it is important to realize that distinct levels of information are not distinctly represented in a Kripke model. For instance what Ann actually knows about the cards depends on R_α, but what Bob knows about what Ann knows about the cards depends on R_α as well. Therefore changing something in the Kripke model, such as cutting a line, changes the information on many levels. In order to come to grips with this issue it really pays to use non-wellfounded semantics. One of the ways to think about the possibilities defined by Gerbrandy and Groeneveld is as infinite trees. In such a tree, distinct levels of information are represented by certain paths in the tree. By manipulating the appropriate part of the tree, one can change the agents’ information at the appropriate level. This insight stems from Groeneveld [37] and was also used by Renardel de Lavalette in [62], who introduces treelike lean modal structures using ordinary set theory in the semantics of a dynamic epistemic logic.

Van Ditmarsch’s approach

Inspired by Gerbrandy and Groeneveld’s work, Van Ditmarsch developed a dynamic epistemic logic for modeling information change in knowledge games, where the goal of the players is to obtain knowledge of some aspect of the game. Clue and Battleships are typical examples of knowledge games. Players are never deceived in such games, and therefore the dynamic epistemic logic of Gerbrandy and Groeneveld, in which reflexivity might be lost, seems unsuitable. In Van Ditmarsch’s Ph.D. thesis [86], a logic is presented where all model transformations are from Kripke models with equivalence relations to Kripke models with equivalence relations, which is thus tailored to information change involving knowledge. This approach was further streamlined by Van Ditmarsch in [87] and later extended to include concurrent actions (when two or more events occur at the same time) in [90]. One of the open problems of these logics is that a completeness proof for the axiom systems has not been obtained.

The dominant approach: Baltag, Moss and Solecki

Another way of modeling complex informative events was developed by Baltag, Moss, and Solecki in [14], and it has become the dominant approach in DEL. Their approach is highly intuitive and lies at the basis of many papers in the field: indeed, many refer to this approach simply as DEL. Their key insight was that information changing events can be modeled in the same way as situations involving information. Given a situation, such as when Ann and Bob each have a card, one can easily provide a Kripke model for such a situation. One simply considers which states might occur and which of those states the agents cannot distinguish. One can do the same with events involving information. Given a scenario, such as Ann possibly looking at Bob’s card, one can determine which events might occur: either she looks and sees it is red (she learns that r_β) or she sees that it is white (she learns that w_β), or she does not look at the card (she learns nothing new, indicated by the tautology ⊤). It is clear that Ann can distinguish these particular events, but Bob cannot. Such models are called action models or event models.

An event model A is a triple consisting of a set of events E, a binary relation over E for each agent, and a precondition function which assigns a formula to each event. This precondition determines under what circumstances the event can actually occur. For instance, Ann can only truthfully say that she has a red card if in fact she does have a red card. The event model for the event where Ann might have looked at Bob’s card is given in Figure 4, where each event is represented by its precondition.

Figure 4: An event model for when Ann might look at Bob’s card.

Figure 5: The product update for the models of Figure 2 and Figure 4.

The Kripke model of the situation following the event is constructed with a procedure called a product update. For each state in the original Kripke model one determines which events could take place in that state (that is, one determines whether the precondition of the event is true at that state). The set of states of the new model consists of those pairs of states and events (a, e), which represent the result of event e occurring in state a. The new accessibility relation is now easy to determine. If two states were indistinguishable to an agent and two events were also indistinguishable to that agent, then the result of those events taking place in those states should also be indistinguishable. This implication also holds the other way round: if the result of two events happening in two states are indistinguishable, then the original states and events should be indistinguishable as well. (Van Benthem [73] characterizes product update as having perfect recall, no miracles, and uniformity.) The basic facts about the world do not change due to a merely communicative event. And so the valuation in (a, e) simply follows the old valuation in a.

Figure 6: The product update for the models of Figure 1 and Figure 4.

The model in Figure 5 is the result of a product update of the model in Figure 2 and the event model of Figure 4. One can see that this is the same as the model in Figure 3 (except for the names of the states), which indicates that product update yields the intuitively right result.
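The product update construction is short enough to sketch in code. The following is an illustration under simplifying assumptions: preconditions are represented as Python predicates on states rather than as formulas (which suffices here because the preconditions r_β, w_β and ⊤ are purely factual), and the names `product_update`, `sees_red`, `sees_white` and `no_look` are hypothetical labels for the three events of Figure 4.

```python
def product_update(states, rel, events, event_rel, pre):
    """Build the states and relations of the product of a Kripke model and an event model."""
    # A pair (s, e) exists when event e can occur in state s.
    new_states = {(s, e) for s in states for e in events if pre[e](s)}
    # Two pairs are indistinguishable iff both their states and their events are.
    new_rel = {
        agent: {
            (x, y)
            for x in new_states
            for y in new_states
            if (x[0], y[0]) in rel[agent] and (x[1], y[1]) in event_rel[agent]
        }
        for agent in rel
    }
    return new_states, new_rel


def knows(rel, agent, prop, state):
    return all(prop(t) for (s, t) in rel[agent] if s == state)


# Event model of Figure 4: Ann can tell the events apart, Bob cannot.
events = {"sees_red", "sees_white", "no_look"}
event_rel = {
    "ann": {(e, e) for e in events},
    "bob": {(e, f) for e in events for f in events},
}
pre = {
    "sees_red": lambda s: s[1] == "r",    # Bob's card is red
    "sees_white": lambda s: s[1] == "w",  # Bob's card is white
    "no_look": lambda s: True,            # always possible
}

# Model of Figure 2: Ann is known to have red; Bob is fully informed.
fig2_states = {"rr", "rw"}
fig2_rel = {
    "ann": {(s, t) for s in fig2_states for t in fig2_states},
    "bob": {(s, s) for s in fig2_states},
}
new_states, new_rel = product_update(fig2_states, fig2_rel, events, event_rel, pre)
print(sorted(new_states))
# [('rr', 'no_look'), ('rr', 'sees_red'), ('rw', 'no_look'), ('rw', 'sees_white')]

actual = ("rr", "no_look")
bob_red = lambda x: x[0][1] == "r"  # valuation is inherited from the old state
ann_knows_bob_red = lambda x: knows(new_rel, "ann", bob_red, x)
print(knows(new_rel, "bob", bob_red, actual))            # True
print(knows(new_rel, "bob", ann_knows_bob_red, actual))  # False

# Same event model applied to the model of Figure 1.
fig1_states = {"rr", "rw", "wr", "ww"}
fig1_rel = {
    "ann": {(s, t) for s in fig1_states for t in fig1_states if s[0] == t[0]},
    "bob": {(s, t) for s in fig1_states for t in fig1_states if s[1] == t[1]},
}
fig6_states, _ = product_update(fig1_states, fig1_rel, events, event_rel, pre)
print(len(fig6_states))  # 8
```

In the resulting four-state model Bob still knows that his card is red but no longer knows whether Ann knows it, which is the change in higher-order information described for Figure 3.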

One may wonder whether the model in Figure 4 represents the event accurately. According to the event model Bob considers it possible that Ann looks at his card and sees that it is white. Bob, however, already knows that the card is red, and therefore should not consider this event possible. This criticism is justified and one could construct an event model that takes this into account, but the beauty of the event model is precisely that it is detached from the agents’ information about the world in such a way that it provides an accurate model of just the information the agents have about the event. This means that product update yields the right outcome regardless of the Kripke model of the situation in which the event occurred. For instance, taking the product update with the model of Figure 1 yields the Kripke model depicted in Figure 6, which represents the situation where Ann might look at Bob’s card immediately after the cards were dealt. The resulting model also represents that situation correctly. This indicates that in DEL static information and dynamic information can be separated.

In the logical language of DEL these event models appear as modalities [A, e], where e is taken to be the event that actually occurs. The language is given by the following Backus-Naur Form

Clauses (i)–(iv) are the same as for PAL. In clause (v), common knowledge among a group Γ is evaluated along the reflexive transitive closure of the union of the accessibility relations of the members of Γ. Clause (vi) is a standard clause for dynamic modalities, except that the accessibility relation for dynamic modalities is a relation on the class of all Kripke models. In clause (vii) it is required that the precondition of the event model is true in the actual state, thus ensuring that (a, e), the new actual state, exists in the product update. Clauses (viii) and (ix) are the usual semantics for non-deterministic choice and sequential composition.
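The grammar and clauses (v) to (ix) described above can be written out as follows. This is a reconstruction from the prose description and should be read as a sketch: the symbols C_Γ for common knowledge, π for programs, ⊗ for product update, pre(e) for the precondition of e, and the double brackets for the relation a program induces on pointed Kripke models are notational assumptions.

```latex
\begin{align*}
\varphi &::= p \mid \neg\varphi \mid (\varphi \wedge \psi) \mid K_\alpha \varphi \mid C_\Gamma \varphi \mid [\pi]\varphi \\
\pi &::= (A, e) \mid (\pi \cup \pi) \mid (\pi \,;\, \pi)
\end{align*}

\begin{align*}
\text{(v)}\quad    & M, a \models C_\Gamma \varphi && \text{iff } M, b \models \varphi \text{ for all } b \text{ such that } a R_\Gamma^{*} b,
                     \text{ where } R_\Gamma^{*} = \Big(\bigcup_{\alpha \in \Gamma} R_\alpha\Big)^{*} \\
\text{(vi)}\quad   & M, a \models [\pi]\varphi && \text{iff } M', a' \models \varphi \text{ for all } (M', a') \text{ such that } (M, a)\, [\![\pi]\!]\, (M', a') \\
\text{(vii)}\quad  & (M, a)\, [\![A, e]\!]\, (M', a') && \text{iff } M, a \models \mathrm{pre}(e) \text{ and } (M', a') = (M \otimes A, (a, e)) \\
\text{(viii)}\quad & [\![\pi \cup \pi']\!] && = [\![\pi]\!] \cup [\![\pi']\!] \\
\text{(ix)}\quad   & [\![\pi \,;\, \pi']\!] && = [\![\pi]\!] \circ [\![\pi']\!]
\end{align*}
```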

DEL can model not only informative events in which different agents have different perspectives, but also public announcements, which can likewise be thought of in terms of event models. A public announcement can be modeled by an event model containing just one event: the announcement. All agents know this is the actual event, so it is the only event considered possible. Indeed, DEL is a generalization of PAL.

Criticism, alternatives, and extensions

Many people feel somewhat uncomfortable with having models as syntactical objects. Baltag and Moss have tried to accommodate this by proposing different languages while maintaining an underlying semantics using event models [10, 13]. This issue is extensively discussed in [91, Section 6.1]. There are alternatives using hybrid logic [70], and algebraic logic ([11], [12]). Most papers just use event models in the language.

DEL has been extended in various ways. Operators for factual change [85, 81] and past operators from temporal logic have been added [64, 5]. DEL has been combined with probability [45] and justification logic [63], and extended so that belief revision is also within its grasp. Connections have been made between DEL and various other logics. Its relation to PDL, ETL [80], AGM belief revision, and situation calculus [83] has been studied. DEL has been applied to a number of puzzles and paradoxes from recreational mathematics and philosophy. It has also been applied to problems in game theory (see [79] for a very detailed survey), as well as issues in computer security [94]. The complexity and succinctness of DEL have been investigated in [54, 69, 6]. Two recent overviews of DEL are [17, 78]. In the next section we pay attention to DEL and belief revision.

4. DEL and Belief Revision

Something you cannot model in DEL is changing your mind. Once you know a fact, you know it forever; that is, once Kp is true, it remains true after every update. Even when we have weaker constraints on the accessibility relations (for belief, or even general accessibility), this remains the case. But sometimes, when you believe a fact, you change your mind, and you may come to believe the opposite. This is not shocking: it may be that you merely did not believe it firmly. This means a change of Kp into K¬p or, using the better suited belief modality for that, a change of Bp into B¬p. In a different community, that of (AGM) belief revision, this is the most natural operation around, and is indeed called ‘belief revision’. In this section we briefly survey interactions between such AGM belief revision and dynamic epistemic logic.

Belief revision has been studied from the perspective of structural properties of reasoning about changing beliefs [29], from the perspective of changing, growing and shrinking knowledge bases, and from the perspective of models and other structures of belief change wherein such knowledge bases may be interpreted, or that satisfy assumed properties of reasoning about beliefs. A typical approach involves preferential orders to express increasing or decreasing degrees of belief [48, 56], where these works refer to the ‘systems of spheres’ in [51, 38]. Within this tradition multi-agent belief revision has also been investigated, for example, belief merging [46]. Belief operators are normally not explicit in the logical language, so that higher-order beliefs (I know that you are ignorant of a certain proposition) cannot be formalized. Iterated belief revision may also be problematic.

The link between belief revision and modal logic, that is, explicit belief modalities and belief change modalities in the logical language, was made in a strand of research known as dynamic doxastic logic. This was proposed and investigated by Segerberg and collaborators in works such as [68, 52, 67, 22]. These works are distinct from other approaches to belief revision in modal logics, without dynamic modal operators, such as [19, 50, 20], that also influenced the development of dynamic logics combining knowledge and belief change. In dynamic doxastic logics belief operators are in the logical language, and belief revision operators are dynamic modalities. Higher-order belief change, that is, revising one’s beliefs about one’s own or other agents’ beliefs and ignorance, is considered problematic in dynamic doxastic logic, see [52]. In [68, 67] belief revision is restricted to propositional formulas (factual revision). There are dynamic doxastic logics wherein [*φ] merely means belief revision with φ according to some externally defined strategy, as in AGM style (this is the general setup in [68], not unlike the nonepistemic/doxastic modal setup in [71]), but there are also dynamic doxastic logics, such as [67], wherein [*φ] is a recipe operating on a semantic structure and outputting a novel structure, the standard approach in dynamic epistemic logic.

Belief revision in dynamic epistemic logic was initiated in [4, 88, 77, 15]. Of these, [4, 88] propose a treatment involving degrees of belief and based on degrees of plausibility among states in structures interpreting such logics, so-called quantitative dynamic belief revision; whereas [77, 15] propose a treatment involving comparative statements about plausibilities (a binary relation between states denoting more/less plausible), so-called qualitative dynamic belief revision. The latter is clearly more suitable for logics of belief revision, and for notions such as conditional belief. The analogue of the AGM postulate of success must be given up when one incorporates higher-order belief change as in dynamic epistemic logic, where the prime movers are again Moore-sentences of the form ‘proposition p is true but you don’t know it’, which cannot after acceptance be believed by you. Many more works on dynamic belief revision have appeared since, for example, [33, 53, 24]. An earlier, independent strand of work modeled belief revision in temporal epistemic logic, and was initiated in the mid 1990s in [28]. Its integrated treatment of belief, knowledge, plausibilities, and change is similar to the more recent developments to model belief revision in dynamic epistemic logic, and the relation between the two approaches is not yet fully understood.

For an example of belief revision in dynamic epistemic logic, consider one agent and a proposition p that the agent is uncertain about. The agent could be Ann, who is uncertain whether Bob has a red card, as in the example before. We get the Kripke model depicted in Figure 7, not dissimilar from that in Figure 2. There are two states of the world, one where p is false and another one where p is true. Let us suggestively call them 0 and 1, respectively. The agent has epistemic preferences among these states. Namely, she considers it most plausible that 1 is the actual state, that is, that p is true, and less plausible that 0 is the actual state. We write 1 < 0 where, as is common in the area, the minimal element in the order is the most plausible state (and not, as might perhaps be expected, the least plausible state). Let us further assume that p is false.

Figure 7: Ann believes p but considers ¬p epistemically possible.

The agent believes a proposition when it holds in the most plausible states. For example, she believes that p is true. This is formalized as

Bp

We write B (for belief) instead of K (for knowledge) as beliefs may be mistaken. Indeed, the agent believes that p but in fact p is false! But we also distinguish a modality for knowledge.

The agent knows a proposition when it holds in all plausible states. These are her strongest beliefs, or knowledge. In the case of this example her factual knowledge only involves tautologies such as p ∨ ¬p. This is described as

K(p ∨ ¬p)

Now imagine that the agent wants to revise her current beliefs. She believes that p is true, but has been given sufficient reason to be willing to revise her beliefs with ¬p instead. We can accomplish that when we allow a model transformation that makes the 0 state more plausible than the 1 state. There are various ways to do that. In this simple example we can simply observe that it suffices to make the state satisfying the revision formula ¬p, that is, 0, more plausible than the other state, 1. See Figure 8. As a consequence of that, the agent now believes ¬p: B¬p is true. Therefore, the revision was successful. This can already be expressed in the initial situation by using a dynamic modal operator [*¬p] for the relation induced by the program “belief revision with ¬p”, followed by what should hold after that program is executed. In this dynamic modal setting we can then write that

¬p ∧ Bp ∧ [*¬p]B¬p

was already true at the outset.

Figure 8: Ann revises her belief with ¬p.

In dynamic epistemic logic, unlike in the original AGM or the subsequent DDL setting, beliefs and knowledge can also be about modal formulas. For example, we not only have that Bp, but we also have that the agent believes that she does not know whether p. We might say: Ann is aware that her belief in p is not very strong, that it is defeasible.
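The two-state plausibility example can be checked mechanically. The sketch below makes several assumptions that go beyond the text: plausibility is encoded as an integer rank per state (lower means more plausible), and `revise` implements just one of the various possible revision policies, namely moving all states that satisfy the revision formula ahead of all states that do not. The function names are illustrative.

```python
def believes(rank, prop):
    """Belief: prop holds in all most plausible (lowest-ranked) states."""
    best = min(rank.values())
    return all(prop(s) for s, r in rank.items() if r == best)


def knows(rank, prop):
    """Knowledge: prop holds in all states the agent considers possible."""
    return all(prop(s) for s in rank)


def revise(rank, prop):
    """One possible policy: make every prop-state more plausible than every other state."""
    offset = max(rank.values()) + 1
    return {s: (r if prop(s) else r + offset) for s, r in rank.items()}


def p(s):
    return s == 1  # p is true in state 1 and false in state 0


def not_p(s):
    return not p(s)


rank = {1: 0, 0: 1}  # 1 < 0: state 1 is the most plausible
print(believes(rank, p))                               # True  (Bp)
print(knows(rank, p))                                  # False (not Kp)
print(knows(rank, lambda s: p(s) or not_p(s)))         # True  (K(p or not p))
print(believes(rank, lambda s: not knows(rank, p)))    # True  (B not Kp)

revised = revise(rank, not_p)
print(believes(revised, not_p))                        # True  (B not p after revision)
```

With a single agent and two mutually accessible states, knowledge does not depend on the state of evaluation, which is why `knows` takes no state argument in this sketch.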

5. DEL and Language

Consider the connection between DEL and speech act theory. Speech act theory started with the work of Austin [7], who argued that language is used to perform all sorts of actions; we make promises, we ask questions, we issue commands, and so forth. An example of a speech act is a bartender who says “The bar will be closed in five minutes” [8]. Austin distinguishes three kinds of acts that are performed by the bartender: (i) the locutionary act of uttering the words, (ii) the illocutionary act of informing his clientele that the bar will close in five minutes, and (iii) the perlocutionary act of getting the clientele to order one last drink and leave.

Truth conditions, which determine whether an indicative sentence is true or false, are generalized to success conditions to determine whether a speech act is successful or not. In speech act theory there are several distinctions when it comes to the ways in which something can be wrong with a speech act [7, p. 18]. Here we do not make such distinctions and simply speak of success conditions. Searle gives in [66, p. 66] the following success conditions, among others, for an assertion that p by speaker S to hearer H:

S has evidence (reasons, and so forth) for the truth of p.
It is not obvious to both S and H that H knows (does not need to be reminded of, and so forth) p.
S believes p.

Speech act theory has been embraced by the multi-agent systems community, for example, by the Foundation for Intelligent Physical Agents (FIPA). FIPA is an IEEE Computer Society standards organization that promotes agent-based technology and the interoperability of its standards with other technologies. It published a Communicative Act Library Specification [26] that includes a specification of the inform action, which is similar to Searle’s analysis of assertions.

It is worthwhile to join this analysis of assertions to the analysis of public announcements in PAL. It is clear from the list of success conditions that one usually only announces what one believes (or knows) to be true. So, an extra precondition for an announcement that φ by an agent α should be that K_α φ. Public announcements are indeed modeled in this way in [61].

As an example, consider the case when Ann tells Bob she has a red card: it is more appropriate to model this as an announcement that K_α r_α rather than the announcement that r_α. Fortunately, these formulas were equivalent in the model under consideration. Suppose that Ann had said “We do not both have white cards”. When this is modeled as an announcement that ¬(w_α ∧ w_β), we obtain the model in Figure 9(a). However, Ann only knows this statement to be true when she in fact has a red card herself. Indeed, when we look at the result of the announcement that K_α ¬(w_α ∧ w_β) we obtain the model in Figure 9(b). We see that the result of this announcement is the same as when Ann says that she has a red card (see Figure 2). By making presuppositions part of the announcement, we are in a way accommodating the precondition (see also [44]).

Figure 9: An illustration of the difference between the effect of the announcement that φ and the announcement that K_α φ, and an announcement that only changes the agents’ higher-order information.
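The difference between announcing a formula and announcing that Ann knows it can be computed on the four-state card model. This is a small sketch with illustrative names (`announce`, `ann_knows`, `not_both_white`); states are written as two letters, Ann's card first, and announcements are modeled simply as filtering the list of states.

```python
STATES = ["rr", "rw", "wr", "ww"]  # first letter: Ann's card, second: Bob's


def ann_knows(prop, state, states):
    """Ann knows prop at `state` iff prop holds wherever her own card is the same."""
    return all(prop(t) for t in states if t[0] == state[0])


def announce(prop, states):
    """A truthful public announcement keeps exactly the states where prop holds."""
    return [s for s in states if prop(s)]


def not_both_white(s):
    return s != "ww"


def ann_red(s):
    return s[0] == "r"


plain = announce(not_both_white, STATES)
known = announce(lambda s: ann_knows(not_both_white, s, STATES), STATES)
red = announce(ann_red, STATES)

print(plain)          # ['rr', 'rw', 'wr']
print(known)          # ['rr', 'rw']
print(known == red)   # True
```

Announcing the bare formula removes only the state ww, whereas announcing that Ann knows it also removes wr, leaving the same states as the announcement that Ann has a red card.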

The second success condition in Searle’s analysis conveys that an announcement ought to provide the hearer with new information. In the light of DEL, one can revise this second success condition by saying that p is not common knowledge, thus taking higher-order information into account. It seems natural to assume that a speaker wants to achieve common knowledge of p, since that plays an important role in coordinating social actions; and so lack of common knowledge of p is a condition for the success of announcing p.

Consider the situation where Ann did look at Bob’s card when he was away and found out that he has a red card (Figure 9(c)). Suppose that upon Bob’s return Ann tells him “I do not know that you have a white card”. Both Ann and Bob already know this, and they also both know that they both know it. Therefore Searle’s second condition is not fulfilled, and so according to his analysis there is something wrong with Ann’s assertion. The result of this announcement is given in Figure 9(d). We see that the information of the agents has changed. Now Bob no longer considers it possible that Ann considers it possible that Bob considers it possible that Ann knows that Bob has a white card. And so the announcement is informative. One can give more and more involved examples to show that indeed change of common knowledge is a more natural requirement for announcements than Searle’s second condition, especially in multi-agent scenarios.

Van Benthem [76] analyzes question and answer episodes using DEL. One of the success conditions of questions as speech acts is that the speaker does not know the answer [66, p. 66]. Therefore posing a question can reveal crucial information to the hearer in such a way that the hearer only knows the answer after the question has been posed ([74],[91, p. 61],[82]).

Professor a is program chair of a conference on Changing Beliefs. It is not allowed to submit more than one paper to this conference, a rule that all authors of papers abided by (although the belief that this rule makes sense is gradually changing, but this is beside the point here). Our program chair a likes to have all decisions about submitted papers out of the way before the weekend, since on Saturday he is due to travel to attend a workshop on Applying Belief Change. Fortunately, although there appears not to be enough time to notify all authors, just before he leaves for the workshop, his reliable secretary assures him that she has informed all authors of rejected papers, by personally giving them a call and informing them about the sad news concerning their paper.

Freed from this burden, Professor a is just in time for the opening reception of the workshop, where he meets the brilliant Dr. b. The program chair remembers that b submitted a paper to Changing Beliefs, but to his own embarrassment he must admit that he honestly cannot remember whether it was accepted or not. Fortunately, he does not have to demonstrate his ignorance to b, because b’s question ‘Do you know whether my paper has been accepted?’ makes a reason as follows: a is sure that, had b’s paper been rejected, b would have had that information, in which case b would not have shown his ignorance to a. So, instantaneously, a updates his belief with the fact that b’s paper is accepted, and he now can answer truthfully with respect to this new revised belief set.

This phenomenon shows that when a question is regarded as a request [49], the success condition that the hearer is able to grant the request, that is, provide the answer to the question, must be fulfilled after the request has been made, and not before. (However, it is not commonly agreed upon in the literature that questions can be regarded as requests (cf. [35, Section 3]).) This analysis of questions in DEL fits well within the broad interest in questions in dynamic semantics [3]. Recent work on DEL and questions is [2, 59, 23].

6. DEL and Philosophy

The role of public announcements as typical informative speech acts focussed the attention on a number of situations wherein that form of success cannot be achieved. This has been investigated mainly within philosophical logic, under the heading of ‘Moore sentences’ and the ‘Fitch paradox’. The ‘Moore sentence’ was introduced by Moore in [57] and his original analysis is that p ∧¬Kp (p is true and I don’t know/believe it) cannot sincerely be uttered. As this is an informative speech act, you are supposed to believe your beliefs. It seems incoherent, and maybe even paradoxical, to believe a proposition stating that you do not believe it. In the DEL setting we can give this a dynamic interpretation. It is then no longer paradoxical.

If I tell you “You don’t know that I play cello”, this has the conversational implicature “You don’t know that I play cello and it is true that I play cello”. This has the form p ∧ ¬Kp. Suppose I were to tell you again “You don’t know that I play cello.” Then you can respond: “You’re lying. You just told me that you play cello.” We can analyze what is going on here in modal logic. We model your uncertainty, for which a single epistemic modality suffices. Initially, there are two possible worlds, one in which p is true and another in which p is false, which you cannot distinguish from one another. Although in fact p is true, you don’t know that: p ∧ ¬Kp. The announcement of p ∧ ¬Kp results in a restriction of these two possibilities to those where the announcement is true: in the p-world, p ∧ ¬Kp is true, but in the ¬p-world, p ∧ ¬Kp is false.

In the model restriction consisting of the single world where p is true, p is known: Kp. Given that Kp is true, so is ¬p ∨ Kp, and ¬p ∨ Kp is equivalent to ¬(p ∧ ¬Kp), the negation of the announced formula. So, announcement of p ∧ ¬Kp makes it false! Gerbrandy [30, 31] calls this phenomenon an unsuccessful update; the matter is also taken up in [89, 43, 84].
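The two-world model and the effect of the announcement can be replayed mechanically. The following is a minimal sketch, assuming a single agent who cannot distinguish any of the worlds remaining in the model; the world names and the helpers `knows`, `moore`, and `announce` are illustrative choices, not notation from the literature.

```python
# Illustrative two-world model: one world where p holds, one where it does not.
# Assumption: the single agent cannot tell apart any worlds still in the model.

def p(worlds, w):
    """The atomic proposition p ('I play cello') holds only in the p-world."""
    return w == "p_world"

def knows(worlds, w, phi):
    """K phi: phi holds in every world the agent considers possible.
    Under the assumption above, that is every world in the current model,
    so the evaluation point w plays no role here."""
    return all(phi(worlds, v) for v in worlds)

def moore(worlds, w):
    """The Moore sentence p AND NOT Kp."""
    return p(worlds, w) and not knows(worlds, w, p)

def announce(worlds, phi):
    """Public announcement of phi: keep the worlds where phi was true
    before the announcement."""
    return {w for w in worlds if phi(worlds, w)}

before = {"p_world", "not_p_world"}
after = announce(before, moore)

print(moore(before, "p_world"))     # True: p holds but is not known
print(after)                        # {'p_world'}
print(knows(after, "p_world", p))   # True: p is now known
print(moore(after, "p_world"))      # False: the announced sentence became false
```

The last line is the unsuccessful update: the formula was true when announced and is false in the restricted model.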

We continue with some words on the Fitch paradox [27]. A standard analysis of the Fitch paradox is as follows – see the excellent review of the literature on Fitch’s paradox in the Stanford Encyclopedia of Philosophy [21], and the volume dedicated to knowability [65]. The existence of unknown truths is formalized as ∃p (p ∧ ¬Kp). The requirement that all truths are knowable is formalized as ∀p (p → ◊Kp), where ◊ formalizes the existence of some process after which p is known, or an accessible world in which p is known. Fitch’s paradox is that the existence of unknown truths is inconsistent with the requirement that all truths are knowable.

The Moore sentence p ∧ ¬Kp witnesses the existential statement ∃p (p ∧ ¬Kp). Assume that it is true. From ∀p (p → ◊Kp) follows the truth of its instance (p ∧ ¬Kp) → ◊K(p ∧ ¬Kp), and from that and p ∧ ¬Kp follows ◊K(p ∧ ¬Kp). Whatever the interpretation of ◊, it results in having to evaluate K(p ∧ ¬Kp). But this is inconsistent for knowledge and belief.
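The inconsistency of K(p ∧ ¬Kp) can be spelled out in a few steps. The derivation below is a sketch under assumptions not stated explicitly in the text: for knowledge, that K distributes over conjunction and is factive; for belief (written B here), that B distributes over conjunction, satisfies positive introspection, and is consistent.

```latex
% Knowledge: distribution over conjunction plus factivity (K\varphi \to \varphi)
\begin{align*}
K(p \land \lnot Kp) &\rightarrow Kp \land K\lnot Kp
  && \text{distribution over } \land \\
K\lnot Kp &\rightarrow \lnot Kp
  && \text{factivity} \\
K(p \land \lnot Kp) &\rightarrow Kp \land \lnot Kp
  && \text{a contradiction}
\end{align*}

% Belief: distribution, positive introspection (B\varphi \to BB\varphi),
% and consistency (\lnot B\bot)
\begin{align*}
B(p \land \lnot Bp) &\rightarrow Bp \land B\lnot Bp
  && \text{distribution over } \land \\
Bp &\rightarrow BBp
  && \text{positive introspection} \\
B(p \land \lnot Bp) &\rightarrow BBp \land B\lnot Bp
  && \text{from the two lines above} \\
BBp \land B\lnot Bp &\rightarrow B\bot
  && \text{distribution, contradicting } \lnot B\bot
\end{align*}
```

In both cases the formula inside the scope of ◊ is unsatisfiable, so ◊K(p ∧ ¬Kp) cannot be true on any reading of ◊ as "true after some process" or "true in some accessible world".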

We now get to the relation between knowability and DEL. The suggestion to interpret ‘knowable’ as ‘known after an announcement’ was made by van Benthem in [75], and [9] proposes a logic where ‘φ is knowable’ is interpreted in that way. In this setting, ◊p stands for ‘there is an announcement after which p (is true)’, so that ◊Kp stands for ‘there is an announcement after which p is known’, which is a form of ‘proposition p is knowable’.

For example, consider the proposition p for ‘it rains in Liverpool’. Suppose you are ignorant about p: ¬(Kp ∨ K¬p). First, suppose that p is true. I can announce to you here and now that it is raining in Liverpool (according to your expectations, maybe…), after which you know that: 〈p〉Kp stands for ‘p is true and after announcing p, p is known’ (〈φ〉 is the dual of [φ], that is, 〈φ〉ψ is defined by abbreviation as ¬[φ]¬ψ). Now, suppose that p is false. In a similar way, after I announce that, you know that; so that we have 〈¬p〉K¬p. If you already knew whether p, having its value announced does not have any informative consequence for you. Therefore, 〈p〉Kp ∨ 〈¬p〉K¬p is a validity. Therefore we also have 〈p〉(Kp ∨ K¬p) ∨ 〈¬p〉(Kp ∨ K¬p). We can generalize the statement ‘there is a proposition p such that after its announcement, p is known’, to ‘there exists a proposition q, such that after its announcement, p is known’, where q is not necessarily the same as p. Then we have informally captured the meaning of ◊Kp. In other words, this operator is a quantification over announcements. But we have then just proved that ◊(Kp ∨ K¬p) is a validity. For more on such matters, see [9, 84].
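The quantification over announcements can be illustrated on the Liverpool example. This is only a sketch under simplifying assumptions: the model has two worlds, the agent cannot distinguish any worlds that remain, and the only candidate announcements considered are p and ¬p. The names `knows_whether_p` and `witnesses` are hypothetical helpers introduced for the illustration.

```python
def knows_whether_p(worlds, valuation):
    """Kp OR K(not p): all remaining worlds agree on the value of p.
    Assumption: the agent cannot distinguish any of the remaining worlds."""
    return len({valuation[w] for w in worlds}) == 1

def witnesses(worlds, valuation, actual):
    """Names of the announcements that are true at `actual` and after which
    the agent knows whether p. A non-empty result means that
    diamond(Kp OR K not p) holds at `actual`."""
    announcements = {
        "p": {w for w in worlds if valuation[w]},
        "not p": {w for w in worlds if not valuation[w]},
    }
    return [name for name, kept in announcements.items()
            if actual in kept and knows_whether_p(kept, valuation)]

# Two worlds: in "rain" it rains in Liverpool, in "dry" it does not.
valuation = {"rain": True, "dry": False}
worlds = set(valuation)

print(knows_whether_p(worlds, valuation))    # False: initial ignorance
print(witnesses(worlds, valuation, "rain"))  # ['p']
print(witnesses(worlds, valuation, "dry"))   # ['not p']
```

Whichever world is actual, some truthful announcement makes the agent know whether p, which is the semantic content of the validity discussed above for this small model.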

Another paradox in philosophical logic that has been analyzed with DEL methods (and that has similar ‘Moore sentence’-like symptoms) is the Surprise Examination. This has been investigated in works such as [30, 31, 89], and more recently by Baltag and Smets using plausibility epistemic structures, along the lines of [16].

Parts of the materials for this overview have been taken from [88, 47, 84], and subsequently revised to make it into a single comprehensive text.

7. References and Further Reading

[1] P. Aczel. Non-Well-Founded Sets. CSLI Publications, Stanford, CA, 1988. CSLI Lecture Notes 14.
[2] T. Ågotnes, J. van Benthem, H. van Ditmarsch, and S. Minica. Question-Answer games, 2011.
[3] M. Aloni, A. Butler, and P. Dekker, editors. Questions in Dynamic Semantics. Elsevier, Amsterdam, 2007.
[4] G. Aucher. A combined system for update logic and belief revision. In Proc. of 7th PRIMA, pages 1–17. Springer, 2005. LNAI 3371.
[5] G. Aucher and A. Herzig. From DEL to EDL : Exploring the power of converse events. In K. Mellouli, editor, Proc. of ECSQARU, LNCS 4724, pages 199–209. Springer, 2007.
[6] G. Aucher and F. Schwarzentruber. On the complexity of dynamic epistemic logic. In Proc. of 14th TARK, 2013.
[7] J. L. Austin. How to Do Things with Words. Clarendon Press, Oxford, 1962.
[8] K. Bach. Speech acts. In E. Craig, editor, Routledge Encyclopedia of Philosophy, volume 8, pages 81–87. Routledge, London, 1998.
[9] P. Balbiani, A. Baltag, H. van Ditmarsch, A. Herzig, T. Hoshi, and T. De Lima. ‘Knowable’ as ‘known after an announcement’. Review of Symbolic Logic, 1(3):305–334, 2008.
[10] A. Baltag. A logic for suspicious players: epistemic actions and belief-updates in games. Bulletin of Economic Research, 54(1):1–45, 2002.
[11] A. Baltag, B. Coecke, and M. Sadrzadeh. Algebra and sequent calculus for epistemic actions. Electronic Notes in Theoretical Computer Science, 126:27–52, 2005.
[12] A. Baltag, B. Coecke, and M. Sadrzadeh. Epistemic actions as resources. J. of Logic Computat., 17(3):555–585, 2007.
[13] A. Baltag and L. S. Moss. Logics for epistemic programs. Synthese, 139:165–224, 2004.
[14] A. Baltag, L. S. Moss, and S. Solecki. The logic of public announcements, common knowledge, and private suspicions. In I. Gilboa, editor, Proceedings of TARK 98, pages 43–56, 1998.
[15] A. Baltag and S. Smets. A qualitative theory of dynamic interactive belief revision. In Proc. of 7th LOFT, Texts in Logic and Games 3, pages 13–60. Amsterdam University Press, 2008.
[16] A. Baltag and S. Smets. Group belief dynamics under iterated revision: fixed points and cycles of joint upgrades. In Proc. of 12th TARK, pages 41–50, 2009.
[17] A. Baltag, H. van Ditmarsch, and L.S. Moss. Epistemic logic and information update. In J. van Benthem and P. Adriaans, editors, Handbook on the Philosophy of Information, pages 361–456, Amsterdam, 2008. Elsevier.
[18] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, Cambridge, 2001. Cambridge Tracts in Theoretical Computer Science 53.
[19] O. Board. Dynamic interactive epistemology. Games and Economic Behaviour, 49:49–80, 2004.
[20] G. Bonanno. A simple modal logic for belief revision. Synthese (Knowledge, Rationality & Action), 147(2):193–228, 2005.
[21] B. Brogaard and J. Salerno. Fitch’s paradox of knowability, 2004. http://plato.stanford.edu/archives/sum2004/entries/fitch-paradox/.
[22] J. Cantwell. Some logics of iterated belief change. Studia Logica, 63(1):49–84, 1999.
[23] I. Ciardelli and F. Roelofsen. Inquisitive dynamic epistemic logic. Manuscript, 2013.
[24] C. Dégremont. The Temporal Mind. Observations on the logic of belief change in interactive systems. PhD thesis, University of Amsterdam, 2011. ILLC Dissertation Series DS-2010-03.
[25] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT, Cambridge, Massachusetts, 1995.
[26] FIPA. FIPA communicative act library specification, 2002. http://www.fipa.org/.
[27] F.B. Fitch. A logical analysis of some value concepts. The Journal of Symbolic Logic, 28(2):135–142, 1963.
[28] N. Friedman and J.Y. Halpern. A knowledge-based framework for belief change – part i: Foundations. In Proc. of 5th TARK, pages 44–64. Morgan Kaufmann, 1994.
[29] P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. Bradford Books, MIT Press, Cambridge, MA, 1988.
[30] J. Gerbrandy. Bisimulations on Planet Kripke. PhD thesis, University of Amsterdam, 1998. ILLC Dissertation Series DS-1999-01.
[31] J. Gerbrandy. The surprise examination in dynamic epistemic logic. Synthese, 155(1):21–33, 2007.
[32] J. Gerbrandy and W. Groeneveld. Reasoning about information change. J. Logic, Lang., Inform., 6:147–169, 1997.
[33] P. Girard. Modal logic for belief and preference change. PhD thesis, Stanford University, 2008. ILLC Dissertation Series DS-2008-04.
[34] P. Gochet. The dynamic turn in twentieth century logic. Synthese, 130(2):175–184, 2002.
[35] J. Groenendijk and M. Stokhof. Questions. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 1055–1124. Elsevier, Amsterdam, 1997.
[36] J. Groenendijk, M. Stokhof, and F. Veltman. Coreference and modality. In S. Lappin, editor, The Handbook of Contemporary Semantic Theory, pages 179–213. Blackwell, Oxford, 1996.
[37] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam, 1995. ILLC Dissertation Series DS-1995-18.
[38] A. Grove. Two modellings for theory change. Journal of Philosophical Logic, 17:157–170, 1988.
[39] D. Harel. First-Order Dynamic Logic. LNCS 68. Springer, 1979.
[40] D. Harel. Dynamic logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume II, pages 497–604, Dordrecht, 1984. Kluwer Academic Publishers.
[41] Vincent Hendricks and John Symons. Epistemic logic. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Spring 2006.
[42] J. Hintikka. Knowledge and Belief. Cornell University Press, Ithaca, NY, 1962.
[43] W. Holliday and T. Icard. Moorean phenomena in epistemic logic. In L. Beklemishev, V. Goranko, and V. Shehtman, editors, Advances in Modal Logic 8, pages 178–199. College Publications, 2010.
[44] J. Hulstijn. Presupposition accommodation in a constructive update semantics. In G. Durieux, W. Daelemans, and S. Gillis, editors, Proceedings of CLIN VI, 1996.
[45] J. van Benthem, J. Gerbrandy, and B. Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.
[46] S. Konieczny and R. Pino Pérez. Merging information under constraints: A logical framework. Journal of Logic and Computation, 12(5):773–808, 2002.
[47] B.P. Kooi. Dynamic epistemic logic. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 671–690. Elsevier, 2011. Second edition.
[48] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990.
[49] R. Lang. Questions as epistemic requests. In H. Hiż, editor, Questions, pages 301–318. Reidel, Dordrecht, 1978.
[50] N. Laverny. Révision, mises à jour et planification en logique doxastique graduelle. PhD thesis, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France, 2006.
[51] D.K. Lewis. Counterfactuals. Harvard University Press, Cambridge (MA), 1973.
[52] S. Lindström and W. Rabinowicz. DDL unlimited: dynamic doxastic logic for introspective agents. Erkenntnis, 50:353–385, 1999.
[53] F. Liu. Changing for the Better: Preference Dynamics and Agent Diversity. PhD thesis, University of Amsterdam, 2008. ILLC Dissertation Series DS-2008-02.
[54] C. Lutz. Complexity and succinctness of public announcement logic. In Proceedings AAMAS 06, Hakodate, Japan, 2006.
[55] J.-J. Ch. Meyer and W. van der Hoek. Epistemic Logic for AI and Computer Science. Cambridge University Press, Cambridge, 1995.
[56] T.A. Meyer, W.A. Labuschagne, and J. Heidema. Refined epistemic entrenchment. Journal of Logic, Language, and Information, 9:237–259, 2000.
[57] G.E. Moore. A reply to my critics. In P.A. Schilpp, editor, The Philosophy of G.E. Moore, pages 535–677. Northwestern University, Evanston IL, 1942. The Library of Living Philosophers (volume 4).
[58] L. S. Moss. From hypersets to Kripke models in logics of announcements. In J. Gerbrandy, M. Marx, M. de Rijke, and Y. Venema, editors, JFAK. Essays Dedicated to Johan van Benthem on the Occasion of his 50th Birthday, Amsterdam, 1999. Amsterdam University Press.
[59] Michal Peliš and Ondrej Majer. Logic of questions and public announcements. In Nick Bezhanishvili, Sebastian Löbner, Kerstin Schwabe, and Luca Spada, editors, Logic, Language, and Computation, pages 145–157. Springer, 2011. LNCS 6618.
[60] J. Peregrin, editor. Meaning: the dynamic turn. Elsevier, Amsterdam, 2003.
[61] J. Plaza. Logics of public communications. Synthese, 158(2):165–179, 2007. This paper was originally published as Plaza, J. A. (1989). Logics of public communications. In M. L. Emrich, M. S. Pfeifer, M. Hadzikadic, and Z.W. Ras (Eds.), Proceedings of ISMIS: Poster session program (pp. 201–216). Publisher: Oak Ridge National Laboratory, ORNL/DSRD-24.
[62] G. R. Renardel de Lavalette. Changing modalities. J. Logic and Comput., 14(2):253–278, 2004.
[63] B. Renne. A survey of dynamic epistemic logic. manuscript, 2008.
[64] J. Sack. Adding Temporal Logic to Dynamic Epistemic Logic. PhD thesis, Indiana University, Bloomington, USA, 2007.
[65] J. Salerno, editor. New Essays on the Knowability Paradox. Oxford University Press, Oxford, UK, 2009.
[66] J. R. Searle. Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press, Cambridge, 1969.
[67] K. Segerberg. Irrevocable belief revision in dynamic doxastic logic. Notre Dame Journal of Formal Logic, 39(3):287–306, 1998.
[68] K. Segerberg. Two traditions in the logic of belief: bringing them together. In H. J. Ohlbach and U. Reyle, editors, Logic, Language, and Reasoning, pages 135–147, Dordrecht, 1999. Kluwer.
[69] T. French, W. van der Hoek, P. Iliev, and B. Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56–85, 2013.
[70] B. D. ten Cate. Internalizing epistemic actions. In M. Martinez, editor, Proceedings of the NASSLLI 2002 student session, pages 109–123, Stanford University, 2002.
[71] J. van Benthem. Semantic parallels in natural language and computation. In Logic Colloquium ’87, Amsterdam, 1989. North-Holland.
[72] J. van Benthem. Exploring Logical Dynamics. CSLI Publications, Stanford, 1996.
[73] J. van Benthem. Games in dynamic-epistemic logic. Bulletin of Economic Research, 53(4):219–248, 2001.
[74] J. van Benthem. Logics for information update. In J. van Benthem, editor, Proceedings of TARK 2001, pages 51–67, San Francisco, 2001. Morgan Kaufmann.
[75] J. van Benthem. What one may come to know. Analysis, 64(2):95–105, 2004.
[76] J. van Benthem. ‘One is a lonely number’: on the logic of communication. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. ASL, Poughkeepsie, 2006.
[77] J. van Benthem. Dynamic logic of belief revision. Journal of Applied Non-Classical Logics, 17(2):129–155, 2007.
[78] J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
[79] J. van Benthem. Logic in Games. MIT Press, 2013. To appear.
[80] J. van Benthem, J.D. Gerbrandy, T. Hoshi, and E. Pacuit. Merging frameworks for interaction. Journal of Philosophical Logic, 38:491–526, 2009.
[81] J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
[82] W. van der Hoek and R. Verbrugge. Epistemic logic: a survey. In L.A. Petrosjan and V.V. Mazalov, editors, Game theory and Applications, volume 8, pages 53–94, 2002.
[83] H. van Ditmarsch, A. Herzig, and T. De Lima. From situation calculus to dynamic epistemic logic. Journal of Logic and Computation, 21(2):179–204, 2011.
[84] H. van Ditmarsch, W. van der Hoek, and P. Iliev. Everything is knowable – how to get to know whether a proposition is true. Theoria, 78(2):93–114, 2012.
[85] H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic epistemic logic with assignment. In Proc. of 4th AAMAS, pages 141–148. ACM, 2005.
[86] H. P. van Ditmarsch. Knowledge games. PhD thesis, University of Groningen, 2000. ILLC Dissertation Series DS-2000-06.
[87] H. P. van Ditmarsch. Descriptions of game actions. J. Logic, Lang., Inform., 11:349–365, 2002.
[88] H. P. van Ditmarsch. Prolegomena to dynamic logic for belief revision. Synthese, 147:229–275, 2005.
[89] H. P. van Ditmarsch and B. Kooi. The secret of my success. Synthese, 151(2):201–232, 2006.
[90] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Concurrent dynamic epistemic logic. In V. F. Hendricks, K. F. Jørgensen, and S. A. Pedersen, editors, Knowledge Contributors, pages 45–82. Kluwer, Dordrecht, 2003.
[91] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Springer, Berlin, 2007.
[92] Peter Vanderschraaf and Giacomo Sillari. Common knowledge. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Fall 2007.
[93] F. Veltman. Defaults in update semantics. Journal of Philosophical Logic, 25:221–261, 1996.
[94] Y. Wang, L. Kuppusamy, and J. van Eijck. Verifying epistemic protocols under common knowledge. In Proc. of 12th TARK, pages 257–266. ACM, 2009.

Author Information

Hans van Ditmarsch
Email: hans.van-ditmarsch@loria.fr
University of Lorraine
France

and

Wiebe van der Hoek
Email: wiebe@csc.liv.ac.uk
The University of Liverpool
United Kingdom

and

Barteld Kooi
Email: B.P.Kooi@rug.nl
University of Groningen
Netherlands

Classification

One of the main topics of scientific research is classification. Classification is the operation of distributing objects into classes or groups, which are, in general, less numerous than the objects themselves. It has a long history that has developed over four periods: (1) Antiquity, where its lineaments may be found in the writings of Plato and Aristotle; (2) the Classical Age, with natural scientists from Linnaeus to Lavoisier; (3) the 19th century, with the growth of chemistry and information science; and (4) the 20th century, with the arrival of mathematical models and computer science. Since that time, and from an extensional viewpoint, mathematics, specifically the theory of orders and the theory of graphs or hypergraphs, has facilitated the precise study of strong and weak forms of order in the world, and the computation of all the possible partitions, chains of partitions, covers, hypergraphs or systems of classes that we can construct on a domain. With the development of computer science, Artificial Intelligence, and new kinds of languages such as object-oriented languages, an intensional approach has complemented the previous one. Ancient discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph found some kind of revival via object-oriented modeling and programming, most object-oriented languages being concerned with hierarchies, or partial orders: these structures in fact reflect the relations between classes in those languages, which generally admit single or multiple inheritance. In spite of these advances, most classifications are still based on the evaluation of resemblances between the objects that constitute the empirical data. This resemblance is almost always computed by means of some notion of distance and some algorithm for aggregating classes. So all these classifications remain, for technical and epistemological reasons that are detailed below, very unstable.
A real algebra of classifications, which could explain their properties and the relations existing between them, is lacking. Though the aim of a general theory of classifications is surely wishful thinking, a recent conjecture gives hope that a metaclassification (or classification of all classification schemes) may exist.

General Introduction: Classification Problems
A Brief History of Classifications
The Problem of Information Storage and Retrieval
Ranganathan and the PMEST Scheme
Order and Mathematical Models
1. Extensional Structures
2. A Glance at an Intensional Approach
The Idea of a General Theory of Classifications
References and Further Readings

1. General Introduction: Classification Problems

Classification problems are one of the basic topics of scientific research. For example, mathematics, physics, natural sciences, social sciences and, of course, library and information sciences all make use of taxonomies. Classification is a very useful tool for ordering and organization. It has increased knowledge and helped to facilitate information retrieval.

Roughly speaking, ‘classification’ is the operation consisting of sharing, distributing or allocating objects in classes or groups which are, in general, less numerous than the objects themselves. Commonly, classifications are defined on finite sets. However, if the objects are, for example, mathematical structures, there can be infinite classifications. In this case, the previous requirement must, of course, be weakened: we may only want the (infinite) cardinal of the classification to be less than or equal to the (infinite) cardinal of the set of objects to be classified. What we call ‘classification’ is also the result of this operation. We want, as far as possible, this result to be constant, namely, that the classification itself remains stable under small transformations of the data (of course, the sense of this requirement will have to become clearer). Various situations may arise: the classes may intersect or not, be finite or infinite, formal or fuzzy, hierarchically ordered or not, and so on.

The basic operation of grouping elements into classes, which simplifies the world, is a very powerful operation, but it also raises many questions. In particular, a number of philosophers, from Socrates to Diderot and even post-modern philosophers, criticized such an operation (see, for instance, Foucault 1967). Yet this operation has multiple benefits. First is the substitution of a rational and regular order for chaotic and muddled multiplicities. Second is the reduction of the size of sets, so that, once we have constituted equivalence classes, we can work with these classes and no longer with the elements. Third, and finally, to make a partition of a set means locating in it a symmetry that decreases the complexity of the problem and so simplifies the world. We can say with Dagognet (1984, 1990) that “less is more”: compressing the data really brings an intellectual gain.

Having outlined the main reasons for classifications, let us see how these classifications have developed and which forms they have taken over the course of time.

2. A Brief History of Classifications

The history of classifications (Dahlberg 1976) develops in four periods. From Plato and Aristotle to the 18th century, ancient classifications are hierarchical, finite, and generally based on one single criterion. During the 18th century, some new classifications appear, which are multicriteria (a domain can be co-divided in many ways, as Kant said in his Logic; see Kant 1988) and indefinite or virtually infinite (Kant believed that we could endlessly subdivide the extension of a concept). At the end of the 18th and at the beginning of the 19th century, with the chemical classifications of Lavoisier and then of Mendeleyev, one discovers combinatorial classifications or multiple crossed orders, like the chemical table of elements, which correspond to a new concept of classification. In the 20th century, through the progress of mathematical order theory, factorial analysis of correspondence, and automatic classification, formal models begin to develop.

a. From Antiquity to the Renaissance

R. Joly, a French commentator on the Greek philosophers, said that a typical trend of the Greek spirit was to reduce a multiple and complex reality to a few categories which satisfy reason, both by their restricted number and by the clear and precise sense that becomes attached to each of them. Indeed, Plato and Aristotle are among the great classifiers of these ancient times.

In all of Plato’s Dialogues, and especially in the latest ones (Parmenides, Sophist, Politicus, Philebus), Plato obviously classified a lot of things (ways of life, political constitutions, pleasures, arts, jobs, kinds of knowledge, and so forth). Generally, for Plato, things were classified according to the distance that separates them from their archetypal forms, which yields some order (or pre-order) on them. Plato’s classifications are finite, hierarchical, dichotomous, and based on a single criterion. For example, in Gorgias (465c), the set of all practices is divided into two classes, the practices concerning the body and the practices concerning the soul, each of them being then divided into two others: gymnastics and medicine, on one hand, and legislation and justice, on the other hand. In the same way, in Republic (510a), the whole universe, viewed as the set of all real things, is divided into the visible world and the invisible world, each class being subdivided into images and objects or living beings on one hand, mathematical objects and ideas, on the other hand.

According to Plato, the rules of classification are very simple. First, we have to make symmetric divisions in order to get well-balanced classes. For example, if we classify peoples, we have to avoid setting the Greeks against all other peoples, because one of the classes will be plethoric while the other will have only one element (Politicus, 262a). Second, like a good cook carving an animal (the metaphor is from the Phaedrus), we must choose the right joints or articulations. For example, in the field of numbers, it would be senseless to set 1000 against the 999 other numbers. In contrast, the opposition even/odd, or prime/not prime, is a real one. Third, in general, we must also avoid using negative determinations. For example, we have to avoid determinations like not-A: because it is impossible for non-being to have sorts or species, such determinations block the development of thought.

Plato did not observe these wise rules, thereby incurring Aristotle’s criticisms. Against Plato’s theory, Aristotle argues that the method of division is not a powerful tool because it is non-conclusive: it does not yield syllogisms (Prior Analytics, I, 31). In another text (Posterior Analytics, II, 5), Aristotle insists on the contingency of the passage from one predicate to another; that is, in the Platonic division, for every new attribute, we can wonder why it is this attribute as opposed to another one. The differences introduced by dichotomies can also be purely negative and thus do not necessarily define a real being. Moreover, binary divisions presuppose that the number of primitive species is a power of 2. In a division, a predicate can belong to different primitive species; for example, “bipedalism” can apply to both birds and humans. But, according to Aristotle, the application of this term is not the same in both cases. Finally, the Platonic division confuses extensional and intensional views. It can identify the triangle, which is a kind, with one of its properties, for example, the equality of the sum of its angles to two right angles.

The previous questions get no answer in Plato’s theory, and so Aristotle rejected Plato’s method of division. But Aristotle also rejected the Platonic doctrine of forms. According to Aristotle (Metaphysics, I, 9), Plato’s forms fail to explain how there could be permanence and order in the world. Moreover, he argued, Plato’s theory of forms cannot explain anything at all in our material world. The properties that the forms have (according to Plato the forms are eternal, unchanging, transcendent, and so forth) are not compatible with material objects, and the metaphor of participation or imitation breaks down in a number of cases. For instance, it is unclear what it means for a white object to participate in, or to copy, the form of whiteness; that is, it is hard to understand the relationship between the form of whiteness and white objects themselves.

For all these reasons, Aristotle develops his own concepts and his own logic of classifications. In the Topics (I, chap. 1), Aristotle introduces the notions of kind, species, and property, and a whole theory of basic predication that was subsequently developed in the work of Porphyry and Boethius. This theory is based on the opposition between essence, all of the characters that define a thing, and accident, the qualities whose presence or absence does not modify the thing’s essence. A commentator on the Aristotelian system, Porphyry (234-305), puts these distinctions to good use and tries to specify the hierarchy of kinds and species as defined by Aristotle. The famous Porphyrian Tree is the first abstract tree outlining these distinctions, and it illustrates the subordination existing between them (see Figure 1).

Figure 1: The Porphyrian Tree

In a passage of his Commentary on Aristotle’s Categories (2014), Porphyry asked the questions that gave rise to a hotly debated controversy over whether universals are physical or immaterial substances, that is, over whether universals are separate from sensible things or are involved in them, finding their consistency therein. In opposition to the traditional views (Platonic and Aristotelian or scholastic realisms), other solutions appeared. For example, Nominalism (Roscelin, 11th c.) claimed that universals are but words and that nothing corresponds to them in nature, which knows only the singular. Against that was Conceptualism (Abélard, 12th c., and Ockham, 14th c.), the view that kinds exist as predicates of subjects that, themselves, are real. In the last centuries of the Middle Ages and in the Renaissance, we also find great scholars who worked on classification. In particular, Francis Bacon (1561-1626) produced work on the classification of knowledge that inspired the great librarians of the 19th century. But the logic of classifications, which in this period remains the Aristotelian logic, receives practically no new development until the 18th century.

b. From Classical Age to Victorian Taxonomy

In the Classical Age, taxonomy as a fully-fledged discipline began to develop for several reasons. One important reason emerges from the birth of natural science and the need to organize floras and faunas in connection with the growth of the human population on Earth, in the context of the beginning of agronomy (Dagognet, 1970). In this period, naturalists like Tournefort (1656-1708), Linnaeus (1707-1778), De Jussieu (1748-1836), Desfontaines (1750-1833) and Cuvier (1769-1832) tried to classify plants and animals all around the world.

When classifying things or beings, you need a criterion or an index in order to make classes and to separate varieties inside the classes. Indeed, all those naturalists differ on the criteria of their classifications. For example, concerning the classification of plants, Tournefort chose the corolla, while Linnaeus chose the sexual organs of the plant. Concerning animals, the classification of Cuvier violates Aristotle’s recommendations by opposing vertebrates and invertebrates, a division that nevertheless happens to correspond to something real. At the end of the century, Kant summarizes, in his Logic (1800), the main part of the knowledge about classifications in this period, by specifying the definitions of a certain number of terms and operations that the naturalists of the time use empirically. Kant was only interested in the forms of classifications. In his Logic he defines a logical division of a concept as “the division of all the possible contained in it”. The rules of this division are the following: 1) the members of the division are mutually exclusive, 2) their union restores the sphere of the divided concept, 3) each member of the division can itself be divided (the division of such divided members is a subdivision). (1) and (2) seem to indicate that Kant was approaching our concept of a partition. But (3) shows that he does not have the concept of a chain of partitions, since he does not see that the subdivisions at the same level together form one and the same partition.
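Kant’s rules (1) and (2), and the modern notion of a chain of partitions that rule (3) falls short of, can be made concrete on a small finite domain. The following is a minimal sketch in modern set-theoretic terms, not Kant’s own formulation; the function names, the requirement that classes be non-empty, and the sample data are assumptions made for the illustration.

```python
def is_partition(domain, classes):
    """Rules (1) and (2): the classes are pairwise disjoint and their union
    restores the whole domain. Non-emptiness is an added modern assumption."""
    classes = [set(c) for c in classes]
    if any(not c for c in classes):
        return False
    union = set().union(*classes)
    pairwise_disjoint = sum(len(c) for c in classes) == len(union)
    return pairwise_disjoint and union == set(domain)

def refines(finer, coarser):
    """The finer family subdivides the coarser one: every finer class lies
    inside some coarser class. Two partitions related in this way form a
    (two-step) chain of partitions."""
    return all(any(set(f) <= set(c) for c in coarser) for f in finer)

# Hypothetical domain with a division and a subdivision of it.
domain = {"oak", "pine", "rose", "tulip"}
division = [{"oak", "pine"}, {"rose", "tulip"}]
subdivision = [{"oak"}, {"pine"}, {"rose"}, {"tulip"}]
overlapping = [{"oak", "pine"}, {"pine", "rose", "tulip"}]

print(is_partition(domain, division))      # True
print(is_partition(domain, subdivision))   # True: all subdivisions of one level together
print(is_partition(domain, overlapping))   # False: rule (1) fails
print(refines(subdivision, division))      # True: the two levels form a chain
```

The second call shows the point made above: taken together, the subdivisions at one level are themselves a single partition of the domain, one that refines the division above it.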

These problems were also discussed during the 19th century in Anglo-Saxon countries, even after Darwin’s theory of evolution. One may think that Darwin’s belief in branching evolution was based upon his familiarity with the taxonomy of his day, of which he was very aware. There were great taxonomists in England in the Victorian age, and some of them (for instance, the paleontologist H. Alleyne Nicholson, a specialist of British Stromatoporoids) were prodigious and wrote monographs still in force today (Woodward 1903). At approximately the same time, H. Agassiz (Agassiz 1957), a scholar in classification theory, wrote about taxonomic concepts like categories, divisions, forms, homologies, analogies, and so on. The taxonomic systems mentioned in his Essay on Classification include the classical systems of Leuckart, Vogt, Linnaeus, Cuvier, Lamarck, de Blainville, Burmeister, Owen, Ehrenberg, Milne-Edwards, von Siebold, Stannius, Oken, Fitzinger, MacLeay, von Baer, van Beneden, and van der Hoeven. In The Origin of Species, Darwin himself said that it was a

truly wonderful fact…that all animals and all plants throughout all time and space should be related to each other in group subordinate to group, in the manner which we everywhere behold−namely, varieties of the same species most closely related together, species of the same genus less closely and unequally related together, forming sections and sub-genera, species of distinct genera much less closely related, and genera related in different degrees, forming sub-families, families, orders, subclasses, and classes. (1859, 128)

But what he called the “principle of divergence” (namely, the fact that during the modification of the descendants of any one species and during the incessant struggle of all species to increase in numbers, the more diversified these descendants become, the better will be their chance of succeeding in the battle of life) was illustrated by his famous tree-like diagram sketched in 1837 in the notebook in which he first posited evolution. From this time, tree-like structures, which have also been of great use in chemistry and would be formalized at the end of the century by the mathematician Arthur Cayley, tended to replace classifications.

c. The Beginning of Modernity

A new kind of classification appeared at the end of the 18th century, with the development of chemistry: combinatorial classifications, or cross multiple orders. A classification of this kind is either the crossing of two or more divisions, or the crossing of two or more hierarchies of divisions. In such a structure, as Granger (1967) said, “elements are distributed according to two or several dimensions, giving rise to a multiplication table”. In a combinatorial classification, the elements themselves are not necessarily distributed into classes. Only the components of these elements are classified. For Granger, this model refers to the Cartesian plane and to the ordinal principle on which it is based. The Cartesian plane results from a will to order a certain distribution of points in space, by ordering the points in every row and then by ordering the rows themselves. The virtue of multiple orders is to place what is classified at the intersection of a row and a column. So, as Dagognet (1969) has shown, when an element is absent or there is an empty compartment, it can be defined by its surroundings. This is what happened in the Mendeleyev table. This table has two main advantages. First, the table is creative, so the mass of a chemical element can be calculated from those which surround it (see Figure 2), and hence chemical elements which did not exist in Nature, and were synthesized in laboratories only 30 years later, had already been accounted for by Mendeleyev. Second, the classification is not a purely spatial picture of the world. Temporality, in particular the future, is already present in it.

Figure 2: The mass of an unknown element in the Mendeleyev Table
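
The idea that an empty compartment is defined by its surroundings can be sketched numerically. The sketch below assumes, as a simplification that the article does not spell out, that the missing mass is estimated by averaging the four neighbours of the empty cell (above, below, left, right). The helper name estimate_from_neighbours and the choice of the cell later filled by germanium are illustrative assumptions, and the atomic masses are rounded modern values rather than those available to Mendeleyev.

```python
from statistics import mean

# Rounded modern atomic masses of the four neighbours of the cell that
# was left empty in the table and later filled by germanium.
neighbours = {
    "Si (above)": 28.09,
    "Sn (below)": 118.71,
    "Ga (left)": 69.72,
    "As (right)": 74.92,
}

def estimate_from_neighbours(masses):
    """Estimate the mass of an empty cell as the mean of its neighbours."""
    return mean(list(masses))

estimate = estimate_from_neighbours(neighbours.values())
print(f"estimated mass of the empty cell: {estimate:.2f}")
```

With these inputs the average comes out close to 72.9, against a measured atomic mass of about 72.63 for germanium, which illustrates why a cross-classified table can say something about an element before it has been observed.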

3. The Problem of Information Storage and Retrieval

At the end of the 19th century, the development of scientific research, which raised the question of information storage and retrieval, encouraged the constitution of voluminous library catalogues. These included Dewey’s Decimal Classification, Otlet and La Fontaine’s Universal Decimal Classification, and the Library of Congress Classification. The aim of these classifications was to account for the whole of knowledge in the world. But many problems arose from this attempt of library science to organize the whole of knowledge. Three rules were commonly respected in more natural classifications: 1) everything classified must appear in the catalogue (which must be, in principle, finite and complete), 2) there is no empty class, 3) nothing can belong to more than one class. Generally, these rules are not respected in library classifications. To face the extraordinary challenge of cataloguing knowledge that grows indefinitely over time, the big library classifications designed at the end of the 19th century adopted the principle of decimalization. This system was used because decimal numbers, used as class marks, allow indefinite extensions of classifications. Suppose you start with 10 main classes, from 0 to 9. If you add a second digit to each number, you get the possibility of forming 100 classes (from 00 to 99), and if you go on, you can obtain 1000 classes (from 000 to 999). Then you can also put a comma or a point, and define items like 150.234. After the point, the sequence of digits is potentially infinite and you can go as far as is needed. Another difference is that library classifications can sometimes allow for vacant classes in their hierarchy, and can also accept the inscription of classified subjects in several places. Vacant classes are used because a librarian must keep some place for new documents that are still temporarily unclassified.
Multiple inscriptions are also used because readers, who sometimes do not know exactly what they are looking for, need broad-ranging access to knowledge. This made way for entries like author, subject, time, place, and so forth. The requirement of decimalization is obvious in the Dewey Decimal Classification (DDC) proposed by Melvil Dewey in 1876 (Béthery 1982). This classification is made up of ten main classes or categories, each of them being divided into ten secondary classes or subcategories. These in turn contain ten subdivisions. The partition of the ten main classes thus gives successively 100 divisions and 1000 sections.

DDC — main sections

000 – Computer Science, Information and General Works
100 – Philosophy and Psychology
200 – Religion
300 – Social Sciences
400 – Language
500 – Science (including Mathematics)
600 – Technology and Applied Science
700 – Arts and Recreation
800 – Literature
900 – History, Geography and Biography
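
The way a decimal class mark encodes its own place in the hierarchy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a class mark is taken to be a three-digit number optionally followed by a point and further digits, and the function names ancestors and classes_at_depth are hypothetical, not part of any library standard.

```python
def ancestors(notation):
    """Return the chain of broader classes for a decimal class mark.

    Assumes a three-digit base number optionally followed by a point
    and further digits, e.g. "150.234".
    """
    base, _, extension = notation.partition(".")
    chain = [base[0] + "00", base[:2] + "0", base]
    for i in range(1, len(extension) + 1):
        chain.append(f"{base}.{extension[:i]}")
    # Drop repeats (e.g. "150" listed twice) while keeping the order.
    return list(dict.fromkeys(chain))

def classes_at_depth(digits):
    """Number of classes that can be formed with a given number of digits."""
    return 10 ** digits

print(ancestors("150.234"))
print([classes_at_depth(d) for d in (1, 2, 3)])
```

Each added digit multiplies the number of available classes by ten, and every prefix of a class mark names a broader class, which is what makes the scheme indefinitely extensible.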

In the same way, the Universal Decimal Classification (UDC) of Otlet and La Fontaine globally presents the same hierarchical organization, except in the fourth nodal class, which is left empty (thus, applying the previous principle of vacant classes).

As librarians rapidly observed, one undesirable consequence of such decimal schemes is the increasing fragmentation of subjects as the taxonomists’ work proceeds. For example, the Dewey Classification, though having the useful advantage of being infinitely extendible, rapidly turns out to be a list or a nomenclature. This is also the case of the UDC of Otlet and La Fontaine, and of all the classifications of the same type. A first attempt to make up for this disadvantage consisted of allowing some junctions between categories in the classification. A second is the possibility of using some tables (7 in the DDC) to aid in the search for a complex object, which may be located in different places. For instance, a book of poetry, written by various poets from around the world, would appear in several classes, indexed thanks to the tables. In general, DDC is used to combine elements from different parts of the structure, in order to construct a number representing the subject content. This number often combines two or more subject elements with linking numbers and geographical and temporal elements. The method consists of forming a new item rather than drawing upon a list containing each class and its meaning. For example, 330 (for Economics) + 9 (for Geographic Treatment) + 04 (for Europe) and the use of ‘/’ gives 330/94 (European Economy). Another example is the following: 973 (for United States) + 05 (division for periodicals) and the use of the point ‘.’ gives 973.05 (periodicals concerning the United States generally).

Other specific features occur in library classifications, which tend to make them very different from classical scientific taxonomies. One spectacular difference from hierarchical classifications in Zoology or Botany is, as we have already seen, that it is possible for subjects to appear in more than one class. For example, in DDC, a book on Mathematics could appear in the 372.7 section or in the 510 section, depending on whether the book is a monograph instructing teachers on how to teach mathematics, or a mathematics textbook for children. Another difference is the relative flexibility of library classifications.

Though there exist improvements, UDC and DDC, like most of the classifications constructed at the same time (see Bliss 1929) are based on a perception of knowledge and of the relationships between academic disciplines extant from 1890 to 1910. Moreover, though updated regularly, UDC and DDC, as decimal systems, are less hospitable to the addition of new subjects. These kinds of classification are based on fixed and historically dated categories. One may observe, for example, that none of the main concepts of our present library science (digital library, knowledge organization, automatic indexing, information retrieval, and so forth) were included in the index of the 2005 UDC edition, and that technical taxonomies generally require more complex features (Dobrowolski 1964).

4. Ranganathan and the PMEST Scheme

There have been many attempts to solve the aforementioned library problems. Some of them have been well known since the middle of the 20th century. In the course of the 20th century, new modes of indexing and original classification schedules appeared in library science with the Indian librarian Shiyali Ramamrita Ranganathan (1933, 2006) and his faceted classification, also called “Colon classification” (CC) because of its use of the colon to indicate relations between subjects in the former edition.

Ranganathan was at first a mathematician and knew little about libraries. But he took charge of the Madras University Library, and was then deputed by his University to study Library Science in London. There, he attended the School of Librarianship at University College and discovered, as he said later, the “charm of classifications”, and also their problems. He saw very quickly that Decimal Classifications did not satisfy users. By contrast, he had the vision of a Meccano set, where, instead of having ready-made rigid toys, one can construct them with a few fundamental components. This made him think of a new kind of classification.

It appeared to Ranganathan that the new theory might be organized at the highest level into five fundamental categories (FC) called facets: Personality, Matter, Energy, Space and Time, in summary PMEST. Each isolate facet of a compound subject is deemed to be a manifestation of one (and only one) of the five fundamental categories. There are also subfacets, so that the facet scheme PMEST and the subfacets we may form from it are then used to sort subclasses in the main classes of the classification.

The difference with previous classifications is in the way one defines ‘subfacets’. Rather than simply dividing the main classes into a series of subordinate classes, one subdivides each main class by particular characteristics into facets. Facets, labeled by Arabic numbers, are then combined to make subordinate classes as needed. For example, Literature may be divided by the characteristic of language into the facet of Language, including English, German, and French. It may also be divided by form, which yields the facet of Form, including poetry, drama and fiction. So CC contains both basic subjects and their facets, which contain isolates. A basic subject stands alone, for example: Literature in the subject English Literature, while an isolate, in contrast, is a term that modifies a basic subject, for example, the term ‘English’. Every isolate in every facet must be a manifestation of one of the five fundamental categories in the PMEST scheme.
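
The combination of facets described above can be illustrated with a small sketch. It assumes, for illustration only, that a facet is a named list of isolates and that a compound subject takes one isolate from each facet; the function name compound_subjects is hypothetical, and no attempt is made to reproduce the actual notation of the Colon classification.

```python
from itertools import product

# Facets of the basic subject "Literature", as in the example above.
facets = {
    "Language": ["English", "German", "French"],
    "Form": ["poetry", "drama", "fiction"],
}

def compound_subjects(basic_subject, facets):
    """Combine one isolate from each facet with the basic subject."""
    names = list(facets)
    for isolates in product(*(facets[name] for name in names)):
        yield (basic_subject, dict(zip(names, isolates)))

subjects = list(compound_subjects("Literature", facets))
print(len(subjects))
print(subjects[0])
```

Two facets of three isolates each yield nine subordinate classes without any of them having to be enumerated in advance, which is the point of building classes "as needed".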

The advantages of the CC are numerous. The first is a greater flexibility in determining new subjects and subject numbers. A second is the concept of phases, which allows taxonomists to readily combine most of the main classes in a subject. Consider, for example, a subject like Mathematics for biologists. In this case, single-class-number enumerative systems, such as those predominating in US libraries, tend to force classifiers to choose either Mathematics or Biology as the main subject. However, CC supplies a specific notation to indicate this bi-phased condition.

Still, some problems remain unsolved. In CC, facets, that is, small components of larger entities or units, are similar to the flat faces of a diamond, which reflect the underlying symmetry of the crystal structure, so that the general structure of Ranganathan’s classification, like that of a faceted classification in general, is a kind of permutohedron. In principle, all descriptions may be given, whatever their order. For example, if we have to classify a paper about seasonal variations of the concentration of noradrenaline in the tissue of the rat, we must get the same access whether we have the direct sequence (1) Seasonal, variations, concentration, noradrenaline, tissue, rat, or the reversed one (2) Rat, tissue, noradrenaline, concentration, variations, seasonal. In mathematical terms, this clearly means that the underlying structure that makes this transformation possible must be a commutative group. But this is not always the case, and for some dihedral groups, this structure is even forbidden. Another potential worry is that the PMEST scheme, which certainly has some connections with Indian thought, is far from being universally accepted (see De Grolier 1962) and has not been very often implemented in libraries, even in India.
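
The requirement that both sequences give the same access can be made concrete with a minimal sketch. It assumes a deliberately simple design, an index keyed by the unordered set of descriptors, which discards order altogether instead of modelling the group of permutations mentioned above. The names add_document and find, and the document identifier, are hypothetical.

```python
index = {}

def add_document(terms, document):
    """File a document under an order-independent key."""
    index[frozenset(terms)] = document

def find(terms):
    """Look a document up by its descriptors, in any order."""
    return index.get(frozenset(terms))

direct = ["seasonal", "variations", "concentration",
          "noradrenaline", "tissue", "rat"]
reverse = list(reversed(direct))

add_document(direct, "paper-001")  # hypothetical document identifier
print(find(direct) == find(reverse))
```

A key of this kind guarantees the same access for sequences (1) and (2), but at the price of losing any information carried by the order of the facets.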

So, in spite of all the improvements they received in the course of time, library classifications have faced many problems. In particular, library classifications were strongly challenged in the 20th century by the proliferating development of knowledge. First, the ceaseless flux of new documents forbids a rigid topology for classifications. The problem, then, is to know how to construct evolutionary structures. Second, the successive reorderings of knowledge (groupings and revisions, and not only ramifications) have called for powerful, relational and automated documentary languages. Classifications still remain necessary, because documentary languages cannot do everything. So the problem is still open. But, with the great development of mathematics in the last century, this general problem, which is the great problem of order, has to be investigated by means of mathematical structures.

5. Order and Mathematical Models

The first attempts to study orders in mathematics began to develop at the end of the 19th century with Peano, Dedekind and Cantor (especially with his theory of ordinals, which are linearly ordered sets). They continued with Peirce (1880) and Schröder (1890) and their works on the question of an algebra of logic. Then, in the first part of the 20th century, came the notion of partial order with an article of MacNeille (1937) and the famous work of G. Birkhoff (1967), who introduced the notion of lattice, developed algebraically later in the great book of Rasiowa and Sikorski (1970). During the same period, mathematical models of hierarchical classifications, which had been investigated in the USA by Sokal and Sneath (1963, 1973) or, in England, by Jardine and Sibson (1971), were developed in France in the works of Barbut and Monjardet (1970), Lerman (1970, 1981), and Benzécri (1973). All these works presupposed the great advances of the last century in mathematical order theory, especially the papers of Birkhoff (1935), Dubreil-Jacotin (1939), Ore (1942, 1943), Krasner (1953-1954) and Riordan (1958). The Belgian logician Leo Apostel (1963) and the Polish mathematicians Luszczewska-Romahnowa and Batog (1965a, 1965b) also published important articles on the subject. The increasingly important use of computers in the search for automatic classifications was also, in those years, a reason for researchers to become interested in mathematical models.

As there are many forms of classifications in the world of knowledge (we can find them, as we have seen, in mathematics, natural sciences, library and information science, and so forth) there are also many possible mathematical models for classifications. We begin with the study of extensional structures.

a. Extensional Structures

In order to clarify the situation, we start with the weakest form of them and move to stronger forms. Mathematics allows us to begin with very few axioms, that usually define weak general structures, and afterwards, by adding new conditions, one can get other properties and stronger models. In our case, the weakest structure is just a hypergraph H = (X,P) in the sense of Berge (1970), with X a set of vertices and P a set of nonempty subsets called edges (See Figure 3).

Figure 3: A Hypergraph

In this case, the set of edges P does not necessarily cover the set X, and some nodes (vertices of degree zero) may belong to no edge. Assume the following conditions:

(C0) X ∈ P,

(C1) For all x ∈ X, {x} ∈ P,

Accordingly, we have a system of classes (in the sense of Brucker-Barthélemy 2007).

Add now the following new conditions on the edges P_i ∈ P:

(C2) P_i ∩ P_j = Ø whenever i ≠ j,

(C3) ∪ P_i = X.

Then P is a partition of X and the P_i are the blocks of the partition P.
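
Conditions (C2) and (C3) are easy to check mechanically on a finite set. The following sketch is one possible check, not a construction taken from the cited authors; the function name is_partition is an assumption, and blocks are represented as Python sets.

```python
def is_partition(blocks, X):
    """Check (C2) pairwise disjointness and (C3) covering of X."""
    blocks = [frozenset(b) for b in blocks]
    if any(len(b) == 0 for b in blocks):
        return False  # edges of the hypergraph are nonempty by definition
    union = frozenset().union(*blocks)
    # Blocks are pairwise disjoint exactly when no element is counted twice.
    disjoint = sum(len(b) for b in blocks) == len(union)
    covers = union == frozenset(X)
    return disjoint and covers

X = {1, 2, 3, 4}
print(is_partition([{1, 2}, {3}, {4}], X))
print(is_partition([{1, 2}, {2, 3}, {4}], X))
print(is_partition([{1, 2}, {3}], X))
```

The second family fails (C2) because the element 2 lies in two blocks, and the third fails (C3) because the element 4 is not covered.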

Now let P(X) be the set of partitions of a nonempty finite set X. We may define on P(X) a partial order relation ≤ (reflexive, antisymmetric and transitive) such that (P(X), ≤) is a lattice in the sense of Birkhoff (1967), that is, a partially ordered set in which every pair of elements has a least upper bound and a greatest lower bound. Then, one can prove that all the chains (all the linearly ordered sequences of partitions) of this lattice are equivalent to hierarchical classifications. So, the set C(X) of all these chains is exactly the set of all hierarchical classifications on X. This set C(X) has itself a mathematical structure: it is a semilattice for set intersection. This model allows us to get all the possible partitions of P(X) and all the possible chains of C(X) (see Figure 4).

Figure 4: The lattice of partitions of a 4-element set.
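
The link between chains of partitions and hierarchical classifications can be illustrated on a four-element set. The sketch assumes that the order ≤ on partitions is the usual refinement order (P ≤ Q when every block of P is contained in a block of Q), which the text does not name explicitly; the helper names refines, is_chain and blocks are hypothetical.

```python
def refines(p, q):
    """True if every block of p is contained in some block of q (p <= q)."""
    return all(any(b <= c for c in q) for b in p)

def is_chain(partitions):
    """True if each partition refines the next one in the sequence."""
    return all(refines(p, q) for p, q in zip(partitions, partitions[1:]))

def blocks(*groups):
    """Build a partition from strings, e.g. blocks("ab", "cd")."""
    return frozenset(frozenset(g) for g in groups)

# A hierarchy on X = {a, b, c, d}, listed from finest to coarsest.
hierarchy = [
    blocks("a", "b", "c", "d"),
    blocks("ab", "c", "d"),
    blocks("ab", "cd"),
    blocks("abcd"),
]
print(is_chain(hierarchy))
print(refines(blocks("ab", "cd"), blocks("ac", "bd")))
```

The first sequence is a chain and can be read as a tree (a and b are grouped first, then c and d, then everything), whereas the two partitions in the last line are incomparable and therefore cannot belong to the same hierarchy.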

A first problem is that such partitions are very numerous. For |X| = 9, for example, there are already 21147 partitions. So, when we want to classify some domain of objects (plants, animals, books, and so forth), it is not easy to determine which classification is the best one among, say, several thousand.
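
The figure quoted for nine elements is a Bell number, the number of partitions of a finite set. As a quick check, the sketch below computes the first Bell numbers with the Bell triangle; the function name bell_numbers is an assumption made for this illustration.

```python
def bell_numbers(n):
    """Return [B_0, ..., B_n], computed with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

print(bell_numbers(9))
```

The growth is faster than exponential, so exhaustive comparison of all partitions quickly becomes impractical as the set to be classified grows.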

A second problem is that the world is not made of chains of partitions. If it were, of course, the game would be over. Everything could be inserted in some hierarchical classification. But the real world has no reason to present itself as a hierarchical classification. In the real world, we generally have to deal with quite chaotic entities, complicated fuzzy classes and poorly structured objects, all of which form what we can call ‘rough data’. So when we want to get a clear order, we have to construct it by extracting it from the complicated data. For that, we have to compare objects, to know the degree to which they are similar, and to do so, we need of course a notion of ‘similarity’. In order to make empirical classifications we must evaluate the similarities or dissimilarities between the elements to be classified. In the history of taxonomic science, Buffon (1749) and Adanson (1757) tried to understand the meaning of this evaluation in the following way. First, they claim, we have to measure the distance between the objects by means of some index, so that we can build classes. Afterwards, we have to measure the distance between the classes themselves, so that we can group some classes into classes of classes, and so replace the initial set of objects with an ordered set of classes that are less numerous than the objects.

What the old taxonomists did on the sole basis of observation can now be carried out with the help of mathematics, using a modern notion of distance. Lerman (1970) and Benzécri (1973) showed that a hierarchical classification, that is, a chain of partitions, is nothing but a particular kind of distance or a particular kind of dissimilarity (Van Cutsem 1994). It is an ultrametric distance, that is, a distance d satisfying d(x, z) ≤ max(d(x, y), d(y, z)), which gives tree representations (Barthélemy and Guénoche 1988) and also has the special property of corresponding exactly with the chain, so that, when considering all the chains, the set of their corresponding distance matrices forms a semiring (R, +, ×) when we interpret the lattice operations min and max in an unusual but clever manner (+ for min, × for max) (Gondran 1976). Problems arise when the distance between the objects classified is not ultrametric. In such cases, we have to choose the closest ultrametric smaller than the given distance, and so gain access to the best hierarchical classification we can get, the one closest to the data. However, this kind of approach leads, in general, to relatively unstable classifications.
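
Both the ultrametric condition and the choice of the closest ultrametric below a given distance can be sketched in code. The example assumes a symmetric distance matrix with a zero diagonal and computes the largest ultrametric lying below it (often called the subdominant ultrametric) with a Floyd-Warshall style pass in which min plays the role of addition and max that of multiplication. The function names are hypothetical, and this is one standard way to obtain the result, not necessarily the procedure used by the cited authors.

```python
def is_ultrametric(d):
    """Check d[i][k] <= max(d[i][j], d[j][k]) for all i, j, k."""
    n = len(d)
    return all(
        d[i][k] <= max(d[i][j], d[j][k])
        for i in range(n) for j in range(n) for k in range(n)
    )

def subdominant_ultrametric(d):
    """Largest ultrametric below d, via a (min, max) Floyd-Warshall pass."""
    n = len(d)
    u = [row[:] for row in d]  # work on a copy
    for j in range(n):
        for i in range(n):
            for k in range(n):
                # Going through j costs the larger of the two steps;
                # keep the cheaper of the direct and indirect routes.
                u[i][k] = min(u[i][k], max(u[i][j], u[j][k]))
    return u

d = [
    [0, 2, 5],
    [2, 0, 3],
    [5, 3, 0],
]
u = subdominant_ultrametric(d)
print(is_ultrametric(d), is_ultrametric(u))
print(u)
```

In this small example the distance 5 between the first and third objects is lowered to 3, the largest step on the route through the second object, and the corrected matrix corresponds to a tree in which the first two objects merge at level 2 and the third joins them at level 3.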

Indeed, there are two kinds of instability for classifications. The first, intrinsic instability, is associated with the plurality of methods (distances, algorithms and so forth) that can be used to classify objects. The second, extrinsic instability, is connected to the fact that our knowledge changes with time, so the definitions of objects (or of the attributes of the objects) evolve.

An answer to the question of intrinsic instability is a theorem of Lerman (1970) which says that if the number of attributes (or properties) possessed by the objects of a set X is constant, the associated quasi-order given by any natural metric is the same. But this result has two limits. First, when the sample variance of the number of attributes is a big one, of course, the stability is lost and second, if we classify the attributes, instead of classifying the objects, the reverse is not true.

For extrinsic instability the answers are more difficult to find. We may appeal to methods used in library decimal classifications (UDC, Dewey, and so forth), which make infinitely ramified extensions possible, but these classifications, as we have seen, tend to assume that the higher levels are invariant and also have the disadvantage of being enumerative and of degenerating rapidly into simple lists. There are also pseudo-complemented structures (Hilman 1964) that admit some kinds of waiting boxes (or compartments) for indexing things that are not yet classified. There are, as well, structures whose transformations obey certain rules that have been fixed in advance. That is the case of Hopcroft’s 2-3 trees (Aho, Hopcroft, Ullman 1983), for instance, or of structures close to these (Larson and Walden, 1979). In recent years, new models for making classifications came from conceptual formal analysis (Barwise and Seligman, 2003), computer science or views using non-classical logics in the domain of formal ontologies (Smith 1997, 2003). In computer science, for example, the concept of Abstract Data Type (ADT), related to the concept of Data Abstraction, important in object-oriented programming, may be viewed as a generalization of mathematical structures. An ADT is a mathematical model for data types, where a data type is defined by its behavior from the point of view of a user of the data. More formally, an ADT may be defined as a “class of objects whose logical behavior is defined by a set of values and a set of operations” (Dale-Walker 1996), which is strictly analogous to algebraic structures in mathematics.
So, if we are not satisfied by a rough classification like the partition into collections, streams and iterators (which support loops accessing data items) and relational data structures that capture relationships between data items, we must admit that ADTs can also be regarded as a generalized approach to a number of algebraic structures, such as lattices, groups, and rings (Lidi 2004). Hence, classifications of ADTs turn into classifications in algebraic specifications of ADTs (Veglioni 1996). In this context, computer science adds nothing to mathematics, and the problem is now that a classification of mathematical structures using, for instance, Category theory, as Pierce (1970) tried, does not bring a sufficient answer, because a category may exist while its objects are not necessarily constructible (Parrochia-Neuville 2013).
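
The analogy between an ADT and an algebraic structure can be made concrete with a small sketch. It assumes a stack as the example type (the text does not single out any particular ADT): the abstract class fixes only the operations, one possible representation implements them, and a behavioural law plays the role of an axiom. All names here are illustrative.

```python
from abc import ABC, abstractmethod

class Stack(ABC):
    """An abstract data type: only the operations are specified."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...

class ListStack(Stack):
    """One possible representation; users rely only on the operations."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items

def obeys_push_pop_law(stack, item):
    """Axiom checked on an empty stack: pop after push returns the item
    and leaves the stack empty again."""
    stack.push(item)
    return stack.pop() == item and stack.is_empty()

print(obeys_push_pop_law(ListStack(), 42))
```

Just as a group is any set with operations satisfying the group axioms, any representation that satisfies such laws counts as an instance of the same ADT, which is why classifying ADTs leads back to classifying their algebraic specifications.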

So, none of the previous approaches is very convincing for solving the basic problem, which always remains the same. We lack a general theory of classifications, which alone would be able to study and, in the best case, solve some of the main problems of classification.

b. A Glance at an Intensional Approach

Instead of making partitions by dividing a set of entities, so that the classes obtained in this way are extensional classes, as we saw in the previous section, we can proceed by associating a description with a set of entities. In this case, the classes are called intensional classes. Aristotle himself mixed the two points of view in his logic, but Leibniz was the first to propose a purely intensional interpretation of classes. For a long time, that view was a minority one and never won unanimous support among the Ancient philosophers and logicians (as the numerous discussions between Aristotle and Plato, Ramus and Pascal, Jevons and Joseph demonstrate). However, the development of computer science brought this view back, since for declarative languages and particularly object-oriented languages, pure extensional classes or sets are rather uncommon. In this approach, the intension can be given either a priori, for example by a human actor from his knowledge of the domain, or a posteriori, when it is deduced from the analysis of a set of objects. In object-oriented modeling and programming, classes are traditionally defined a priori, with their extension mostly derived at run time. This is usually done manually (the intension being represented by logical predicates or tags), but techniques for a posteriori class discovery and organization also exist. In the context of programming languages, they deal with local class hierarchy modification by adding interclasses and use similarity-based clustering techniques or the Galois lattice approach (Wille 1996).
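
The contrast between the two kinds of classes can be shown in a few lines. In this sketch, which uses even numbers as an arbitrary example, an extensional class is a listed set, an intensional class is a predicate, and the extension of the latter is derived only when the predicate is applied to a domain. The names is_even and extension_of are assumptions made for the illustration.

```python
# Extensional class: given by listing its members.
even_extension = {2, 4, 6, 8, 10}

# Intensional class: given by a description (here, a predicate).
def is_even(n):
    return n % 2 == 0

def extension_of(predicate, domain):
    """Derive the extension of an intensional class over a domain."""
    return {x for x in domain if predicate(x)}

print(extension_of(is_even, range(1, 11)) == even_extension)
print(extension_of(is_even, range(1, 13)) == even_extension)
```

The same description yields a different extension as soon as the domain grows, which is one reason why an intensional class and an extensional class that coincide at one moment need not remain identical.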

When there is an unrelated collection of sets, which is the case in artifact-based software classification, the issue is whether to compare and organize these sets simply by inclusion, or to apply conceptual clustering techniques. However, most object-oriented languages are concerned with hierarchies, whose structure may be a tree, a lattice, or any partial order. The reason is that such structures reflect the variety of languages, some of them admitting multiple inheritance (C++, Eiffel), others only single inheritance (Smalltalk). Java has a special policy concerning this point: it admits two kinds of concepts, classes and interfaces, with single inheritance for classes and multiple inheritance for interfaces.

The viewpoint of Aristotle was the following: the division must be exhaustive, with mutually exclusive parts, and an indirect consequence of Aristotle’s principles is that only the leaves of the hierarchy should have instances. Furthermore, the divisions must be based on a common concern whose modern name is the ‘discriminator’ in the Unified Modeling Language (UML). But usual programming practices do not necessarily satisfy those principles. Multiple inheritance, for example, contradicts the assumption of mutually exclusive parts, and instances may in general be directly created from all (non-abstract) classes. Direct subclasses of a class can be derived according to different needs with different discriminators, but there is no evidence that this approach leads to relevant classifications. Object-oriented approaches, which transgress Aristotelian principles, are almost always practical storage modes but do not satisfy the main requisites of good classifications.

Some main principles yield good classifications, and they can be described from the intensional perspective. First, following Apostel (1963), come some basic definitions.

From an intensional viewpoint, a division (or partition) is a closed formula F, which contains some assertion of the type (P ⊃ (Q₁ ∨ Q₂ ∨…∨ Q_n)). So, a classification is a sequence of implicative-disjunctive propositions which takes the following form: everything which has the property P has also one of the n properties Q₁… Q_n. Everything which has the property Q_r has also the property S, and so on (Apostel 1963, 188).
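
The implicative-disjunctive form of a division can be tested on a finite domain. The sketch below is only an approximation of Apostel's logical definition: it replaces the closed formula by a check over an explicitly listed domain, and the function names is_division and is_exclusive, like the integer example, are assumptions made for illustration.

```python
def is_division(p, qs, domain):
    """Check P ⊃ (Q1 ∨ ... ∨ Qn): every x with P has at least one Q."""
    return all(any(q(x) for q in qs) for x in domain if p(x))

def is_exclusive(p, qs, domain):
    """Check that no x with property P has more than one of the Q's."""
    return all(sum(bool(q(x)) for q in qs) <= 1 for x in domain if p(x))

def nonzero(x):
    return x != 0

def positive(x):
    return x > 0

def negative(x):
    return x < 0

domain = range(-5, 6)
print(is_division(nonzero, [positive, negative], domain))
print(is_exclusive(nonzero, [positive, negative], domain))
print(is_division(nonzero, [positive], domain))
```

Here "nonzero" is divided exhaustively and exclusively into "positive" and "negative", whereas "positive" alone does not form a division because the negative integers have the property P without having any of the Q's.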

A division is essential if the individuals having the property P, and only these individuals, may also have one of the properties Q_i. So, we can see that there are degrees of essentiality, according to whether the number of individuals having the Q’s without having the P’s is greater or smaller. At every level, a classification may be probably or necessarily essential, exhaustive, or exclusive.

We call intensional weight w(P) of a property P, the set of disjunctions implied by this property (with necessity, factuality or probability). Properties defining classes in the same level may have extremely variable intensional weights. The basis of a division is the constant relation R, if any, between the properties of two different classes of this division.

A basis of division is (partially or totally) exhausted in some level insofar as, for this level, we do not find, in any case, true disjunctive propositions that are implied by the properties of this level and whose terms are connected by this very relation R.

A division is said to follow another one immediately (or to be immediately subsequent) if, for all P properties of the first, and for all Q properties of the second that are disjunctively implied by the P’s, there exists no sequence of R properties disjunctively implied by the P’s and disjunctively implying the Q’s.

The form of a property defining a class is the logical form of this property (conjunction of properties, disjunction of properties, negation of properties, single property).

For Apostel, an optimal classification should satisfy the following requisites:

Every level needs a basis for division;
No new basis for division shall be introduced before the previous one is exhausted;
Every division is essential;
Intensional weights of classes in a given level are comparable, and relations between intensional weights of subsequent division properties in the classification must be constant;
Properties used to define classes are conjunctive ones, and not negative ones;
From the intensional viewpoint, divisions must be immediately subsequent.

In real domains, these requirements, or some of them, fail to hold. Levels are often extensionally equivalent but intensionally, the basis of division, the intensional weight, and so forth may change or not.

A natural classification is such that the definition of the domain classified determines in one and the same way the choice of the criteria of classification. It means that the fundamental set may be divided such that the division in the first level of the classification is an essential and subsequent one.

Intensional and extensional classifications are intimately related. Gathering entities in sets to produce extensional classes implies tagging these entities by their membership in these classes. But intensional classes, built according to these descriptions, have an extension, which may differ from the initial extensional classes. So the two perspectives are not totally isomorphic, and from Peirce (Hulswit 1997) to Quine (1969) and up to the present, the question of natural classes has remained open and somewhat controversial.

6. The Idea of a General Theory of Classifications

The idea of a general theory of classifications is not new. Such a project was anticipated by Kant’s logic at the end of the 18th century. It was followed by many attempts to classify the sciences at the beginning of the 19th century (Kedrov 1977), and was posed by Auguste Comte in his Cours de philosophie positive (Comte 1975) as a general theory based on the study of symmetries in nature. Comte was inspired by the mathematician Gaspard Monge and his classification of surfaces in geometry. In Comte’s work, however, this remained wishful thinking. Similarly, the French naturalist Augustin-Pyramus de Candolle published in 1813 an Elementary Theory of Botany, a book in which he introduced the term ‘taxonomia’ for the first time (de Candolle 1813). De Candolle showed that botany had to abandon artificial methods for natural ones, in order to obtain a method independent of the nature of the objects. Unfortunately, nothing very concrete or precise followed his remarks. Moreover, these earlier projects were concerned only with finite classifications, particularly biological ones. A higher and more general view emerged around the 1960s with the Belgian logician Leo Apostel. Apostel (1963) wanted to write a concrete version of set theory and, to do so, needed axioms that allowed him to include in the theory only the classes actually existing in the world. Apostel was thus led to question some of the well-known axioms of Zermelo-Fraenkel (ZF) set theory. He did not reject the ZF axiomatics as a whole, but he was suspicious of axioms such as the pairing axiom, the axiom of separation, and the power set axiom. He also left the axiom of infinity optional and had a rather negative opinion of the axiom of choice. This project was revived by the recent book of Parrochia-Neuville (2013).

The difficulty of solving the problem of the instability of classifications has motivated a search for clear composition laws defined on the set of classifications over a set and, if possible, for a true algebra of classifications. This is very difficult, because such an algebra would have to be, in principle, commutative and non-associative. The search is all the more crucial because a theorem proved by Kleinberg (2002) shows that one cannot hope to find a classifying function that is at once scale-invariant, rich enough, and consistent. This result means that we cannot find stable empirical classifications by using traditional clustering methods.
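For readers who want the three conditions spelled out, the following is a brief sketch of how they are usually stated for Kleinberg's theorem. The notation is an assumption introduced here for illustration and does not come from this article: S is a finite set of n ≥ 2 points, d is a distance function on S, and f is a function that maps d to a partition Γ of S.

```latex
% Notation (assumed for illustration): S is a finite set of n >= 2 points,
% d a distance function on S, and f a function sending d to a partition
% \Gamma of S.
\begin{align*}
\text{Scale-invariance:}\quad & f(\alpha \cdot d) = f(d) \quad \text{for every } \alpha > 0,\\
\text{Richness:}\quad & \operatorname{Range}(f) = \{\text{all partitions of } S\},\\
\text{Consistency:}\quad & f(d) = \Gamma \ \text{and}\ d' \text{ is a } \Gamma\text{-transformation of } d
  \ \Longrightarrow\ f(d') = \Gamma,
\end{align*}
where $d'$ is a $\Gamma$-transformation of $d$ if
\[
d'(i,j) \le d(i,j) \ \text{for } i,j \text{ in the same cluster of } \Gamma,
\qquad
d'(i,j) \ge d(i,j) \ \text{for } i,j \text{ in different clusters of } \Gamma .
\]
% Theorem (Kleinberg 2002): for n >= 2, no f satisfies all three conditions.
```

Read informally: rescaling all distances should not change the result, every partition should be reachable for some choice of distances, and tightening clusters while pushing them apart should leave the result unchanged. The theorem says that these three demands cannot all be met by a single function.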

In the past, some attempts have been made to formalize non-commutative parenthesized products: Comtet (1970), and Neuville in the 1980s, used Lukasiewicz’s Reverse Polish Notation (RPN), also called Postfix Notation, whose advantage is not only to make brackets or parentheses superfluous, but also to perform calculations on trees in the required order. But a general algebra of classifications on a set is not known, even if some new models (Loday’s dendriform algebras, for example, which work very well for trees; see Dzhumadil’daev-Löfwall 2002) are good candidates. In any event, we are invited to look for it, for two reasons. First, the world is not completely chaotic and our knowledge evolves according to some laws. Second, there exist quasi-invariant classifications in physics (elementary particle classification), chemistry (Mendeleyev’s table of elements), and crystallography (the 230 groups of crystallographic structures), among others. Most of these good classifications are based on mathematical structures (Lie groups, discrete groups, and so forth). To address questions concerning classification theory, and to clarify its different domains, one may propose this final view (see Figure 6):

When our mathematical tools apply only to sense data, we get phenomenal classifications (by clustering methods): these are generally quite unstable.
When our mathematical tools deal with crystallographic or quantum structures, we get what we call, using a Kantian concept, noumenal classifications (for instance, by invariance of discrete groups or Lie groups). These are generally more stable.
When we search for a general theory of classifications (including infinite ones), we are in the domain of pure mathematics. In this field, ordering and articulating the infinite set of classifications amounts to constructing the continuum.

Figure 6: Metaclassification

This problem is far from being solved because there are a lot of unstable theories (Shelah 1978, 1988). However, the recent work of Parrochia-Neuville (2013) advances the conjecture that a metaclassification, that is, a classification of all mathematical schemes of classification, does exist. The reason is that all these forms may be expressed as ellipsoids of an n-dimensional space (Jambu 1983) that must necessarily converge on a point, the index of the classification. If an actual proof is found, it will yield an existence theorem for such a structure, from which a number of important results could follow.

7. References and Further Readings

Adanson, M. 1757. Histoire naturelle du Sénégal. Paris: Claude-Jean-Baptiste Bauche.
Aho, A.V., Hopcroft, J.E., Ullman, J.D. 1983. Data Structures and Algorithms. Reading (Mass.): Addison-Wesley Publishing Company.
Agassiz, L. 1962. Essay on Classification (1857), reprint. Cambridge: Harvard University Press.
Apostel, L. 1963. Le problème formel des classifications empiriques. La Classification dans les Sciences. Gembloux: Duculot.
Aristotle, 1984. The Complete Works. Princeton: Princeton University Press.
Barbut M., Monjardet, B. 1970. Ordre et classifications, 2 vol. Paris: Hachette.
Barthélemy, J.-P., A. Guénoche. 1988. Les arbres et les représentations des proximités. Paris: Masson.
Barwise, J., Seligman, J. 2003. The logic of distributed systems. Cambridge: Cambridge University Press.
Béthery, A. 1982. Abrégé de la classification décimale de Dewey. Paris: Cercle de la librairie.
Bliss, H. E. 1929. The organization of knowledge and the system of the sciences. New York: H. Holt and Company.
Benzécri, J.-P., et alii. 1973. L’analyse des données, 1, La taxinomie, 2 Correspondances. Paris: Dunod.
Birkhoff, G. 1935. On the structure of abstract algebras. Proc. Camb. Philos. Soc. 31, 433-454.
Birkhoff, G. 1967. Lattice theory (1940), 3rd ed. Providence: A.M.S.
Brucker F., Barthélemy, J.-P. 2007. Eléments de Classification, aspects combinatoires et algorithmiques. Paris: Hermès-Lavoisier.
Buffon, G. L. Leclerc de, 1749. Histoire naturelle générale et particulière (vol. 1). Paris: Imprimerie royale.
Candolle (de), A. P. 1813. Théorie élémentaire de la Botanique ou exposition des principes de la classification naturelle et de l’art d’écrire et d’étudier les végétaux, first edition. Paris: Deterville.
Comte, A. 1975. Philosophie Première, Cours de Philosophie Positive (1830), Leçons 1-45. Paris: Hermann.
Comtet, L. 1970. Analyse combinatoire. Paris: P.U.F.
Dagognet, F. 2002. Tableaux et Langages de la Chimie (1967). Seyssel: Champ Vallon.
Dagognet, F. 1970. Le Catalogue de la Vie. Paris: P.U.F.
Dagognet, F. 1984. Le Nombre et le lieu. Paris: Vrin.
Dagognet, F. 1990. Corps réfléchis. Paris: Odile Jacob.
Dahlberg, I., 1976. Classification theory, yesterday and today. International Classification 3 n°2, pp. 85-90.
Dale, N., Walker, H. M. 1996. Abstract Data Types: Specifications, Implementations, and Applications. Lexington, Massachusetts: D.C. Heath and Company.
Darwin, C.R., 1964. On the Origin of Species (1859), reprint. Cambridge: Harvard University Press.
De Grolier, E. 1962. Etude sur les catégories générales applicables aux classifications documentaires, Unesco.
Dobrowolski, Z. 1964. Etude sur la construction des systèmes de classification. Paris, Gauthier-Villars.
Dubreil, P., Jacotin, M.-L. 1939. Théorie algébrique des relations d’équivalence. J. Math. 18, pp. 63-95.
Dzhumadil’daev, A., Löfwall, C. 2002. Trees, free right-symmetric algebras, free Novikov algebras and identities. Homology, Homotopy and Applications, vol. 4(2), pp. 165-190.
Foucault, M. 1967. Les Mots et les Choses. Paris: Gallimard.
Gondran, M. 1976. La structure algébrique des classifications hiérarchiques. Annales de l’Insee, pp. 22-23.
Granger, G.-G. 1980. Pensée formelle et Science de l’Homme (1967). Paris: Aubier-Montaigne.
Hilman, D.J. 1965. Mathematical classification techniques for non static document collections, with particular reference to the problem of relevance. Classification Research, Elsinore Conference Proceedings, Munksgaard, Copenhagen, pp. 177-209.
Huchard, M., Godin, R., Napoli, A. 2003. Objects and Classification. ECOOP 2000 Workshop reader, J. Malenfant, S. Moisan, A. Moreira (Eds), LNCS 1964. Berlin-Heidelberg-New York: Springer-Verlag, pp. 123-137.
Hulswit, M. 1997. Peirce’s Teleological Approach to Natural Classes. Transactions of the Charles S. Peirce Society, pp. 722-772.
Jambu, M. 1983. Classification automatique pour l’analyse des données, 2 vol. Paris: Dunod.
Jardine N., Sibson, R. 1971. Numerical Taxonomy. New York: Wiley.
Joly, R. 1956. Le thème philosophique des genres de vie dans l’Antiquité grecque. Bruxelles: Mémoires de l’Académie royale de Belgique, classe des Lettres et des Sciences mor. et pol., tome Ll, fasc. 3.
Kant, E. 1988. Logic. New York: Dover Publications.
Kedrov, B. 1977. La Classification des Sciences (vol. 2). Moscou: Editions du Progrès.
Kleinberg, J. 2002. An impossibility theorem for Clustering. Advances in Neural Information Processing Systems (NIPS), 15, pp. 463-470.
Krasner, M. 1953-1954. Espaces ultramétriques et ultramatroïdes. Paris: Séminaire, Faculté des Sciences de Paris.
Larson, J.A., Walden, W.E. 1979. Comparing insertion schemes used to update 3-2 trees. Information Systems, vol. 4, pp. 127-136.
Lerman, I.C. 1970. Les bases de la classification automatique. Paris: Gauthier-Villars.
Lerman, I.C. 1981. Classification et analyse ordinale des données. Paris: Dunod.
Lidi R., 2004. Abstract Algebra. Berlin-Heidelberg-New York: Springer-Verlag.
Luszczewska-Romahnowa S., Batog T. 1965a. A generalized classification theory I. Stud. Log., tom XVI, pp. 53-70.
Luszczewska-Romahnowa S., Batog T. 1965b. A generalized classification theory II. Stud. Log., tom XVII, pp. 7-30.
MacNeille, H. M. 1937. Partially ordered sets. Transactions Amer. Math. Soc., vol. 42, pp. 416-460.
Ore O. 1942. Theory of equivalence relations. Duke Math. J. 9, pp. 573-627.
Ore O. 1943. Some studies on closer relations. Duke Math. J. 10, pp. 761-785.
Parrochia, D., Neuville, P. 2013. Towards a general theory of classifications. Basel: Birkhäuser.
Peirce C. S. 1880. On the Algebra of Logic. American Journal of Mathematics 3, pp. 15-57.
Pierce, R.S. 1970. Classification problems. Mathematical System theory, vol. 4, n°1, March, pp. 65-80.
Plato, 1997. The Complete Works. Cambridge: Hackett Publishing Company.
Porphyry, 2014. On Aristotle’s Categories. London, New York: Bloomsbury Publishing Plc.
Quine, W.V.O. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
Ranganathan, S. R. 1933. Colon Classification. Madras: Madras Library Association.
Ranganathan, S. R. 2006. Prolegomena to Library Classification (1937), Reprint. New Delhi: Ess Pub.
Rasiowa H., Sikorski, R. 1970. The Mathematics of Metamathematics. Cracovia: Drukarnia Uniwersytetu Jagiellonskiego.
Riordan, J. 1958. Introduction to combinatorial analysis. New York: Wiley.
Roux, M. 1985. Algorithmes de classification. Paris: Masson.
Shelah, S. 1988. Classification Theory (1978). Amsterdam: North Holland.
Schröder, E. 1890. Vier Kombinatorische Probleme. Z. Math. Phys. 15, pp. 361-376.
Smith, B. 1997. Boundaries: An Essay in Mereotopology. L. Hahn (ed.), The Philosophy of Roderick Chisholm. La Salle, Open Court: Library of Living Philosophers, pp. 534-561.
Smith, B. 2003. Groups, sets and wholes. Rivista di estetica, NS (P. Bozzi Festschrift), 24-3, 1209-130.
Sokal, R. R., Sneath, P. H. 1963. Principles of numerical taxonomy. San Francisco: W. H. Freeman.
Sokal, R. R., and Sneath, P. H. 1973. Numerical Taxonomy, the principles and practice of numerical classifications. San Francisco: W. H. Freeman.
Van Cutsem B. (ed.) 1994. Classification and dissimilarity analysis. New York-Berlin-Heidelberg: Springer Verlag.
Veglioni, S. 1996. Classifications in Algebraic specifications of Abstract Data Types. CiteSeerX.
Winsor, M. P. 2009. Taxonomy was the foundation of Darwin’s evolution. Taxon 58, 1, pp. 43-49.
Wille, R. 1996. Restructuring lattice theory: an approach based on hierarchy of concepts. Rival, I (ed.) Ordered Sets. Boston: Reidel, pp. 445-470.
Woodward, H. 1903. Memorial to Henry Alleyne Nicholson. M.D., D.Sc., F.R.S. Geological Magazine, 10, pp. 451-452.

Author Information

Daniel Parrochia
Email: daniel.parrochia@wanadoo.fr
Université Jean Moulin – Lyon III
France

The Aim of Belief

It is often said that belief has an aim. This aim has been traditionally identified with truth and, since the late 1990s, with knowledge. With this claim, philosophers designate a feature of belief according to which believing a proposition carries with it some sort of commitment or teleological directedness toward the truth (or knowledge) of that proposition. This feature is taken to be constitutive of belief (that is, it is part of what a belief is that it is an attitude having this aim) and individuative of that type of mental state (that is, it is sufficient for distinguishing beliefs from other types of mental attitude like desire and imagining). Philosophers appeal to belief’s aim mainly for explanatory purposes: the aim is supposed to explain a number of other features of belief, such as the impossibility of believing at will, the infelicity of asserting Moorean sentences (for example, “I believe that it is raining, but it is not raining”), and the normative force of evidential considerations in the processes of belief-formation and revision.

Though many tend to agree on the above aspects of the aim, there are major disagreements over two further issues: (1) how to interpret the claim that belief has an aim, and (2) what this aim is. With respect to (1), the claim has received very different interpretations. Some have interpreted it literally, taking the aim as an intentional purpose of believers or a functional goal of beliefs; others have interpreted it metaphorically, as some kind of commitment or norm governing beliefs and their regulation (formation, maintenance, and revision); still others deny that beliefs aim at truth in a substantive sense and endorse minimalist accounts of belief’s truth-directedness. With respect to (2), there is an ongoing debate on whether the aim of belief is truth, knowledge, or some other condition such as epistemic justification.

The Truth-Directedness of Belief
Interpretations of the Aim
What Does Belief Aim At?
Relevance of the Topic
References and Further Reading

1. The Truth-Directedness of Belief

The claim that “belief aims at truth” was first coined by Bernard Williams (1973) to designate a set of properties of beliefs, namely (1) that truth and falsehood are dimensions of assessment of beliefs as opposed to other psychological states and dispositions; (2) that to believe that p is to believe that p is true; and (3) that to say “I believe that p” carries, in general, a claim that p is true; that is, it is a qualified way of asserting that p is true (Williams, 1973, p. 137).

Since Williams, many have taken up the claim that belief aims at truth. However, with such an expression, these philosophers do not refer to a set of properties as Williams did, but to a unique feature of belief (sometimes also called truth-directedness). This feature (that is, aiming at truth) is supposed to capture the specific relation of belief with truth. This relation seems to be peculiar to belief, and to play an important role in the characterization of this type of attitude. No other attitude seems to stand in such a special relation to truth. Like belief, the content of attitudes like (propositional) desires, imaginings, and mere thoughts can be true or false. But unlike these attitudes, beliefs are considered defective if their content is false, or correct if it is true: if I imagine that snow is black, there is nothing defective in my imagination; but if I believe that snow is black, there is something wrong with my belief. Also, we can arbitrarily decide to form or revise attitudes like imagining and assuming regardless of whether we take their contents to be true or false, but this seems not to be possible for beliefs. In short, these attitudes are not sensitive to truth-regarding considerations in the way beliefs are (in both normative and descriptive ways). The relation of belief with truth also differs from that of factive attitudes like knowledge and regret. Unlike beliefs, these attitudes imply the truth of their content. If I know that it is raining now in Paris, then it is true that it is raining now in Paris. But if I believe that, the content of my belief may be false. The relation of belief to truth is thus neither as weak as that of other attitudes like imagining, nor as strong as that of knowledge. This is why it is often conceived as an aim or a commitment toward the truth (or knowledge) of the believed proposition: beliefs may fail to be true (that is, fail to achieve that aim), but they cannot fail to aim at truth.

That granted, a further question is how to interpret the claim that beliefs aim at truth. Philosophers conceive of truth-directedness in very different ways: as an intentional aim of the believer to accept a proposition if and only if it is true; as a function regulating our cognitive processes; as a norm requiring one to believe a proposition only if true; as a value attached to believing truly. In this section I remain neutral on the specific interpretations of the aim, postponing a discussion of these interpretations to §2. The objective of the present section is to introduce some properties commonly attributed to truth-directedness, independent of its specific interpretation. For ease of exposition, it will also be assumed that truth is the aim of belief until §3, where alternative candidates are considered.

Section 1.a introduces two properties commonly attributed to truth-directedness: (1) that it is a constitutive or essential feature of belief, and (2) that it is individuative of belief with respect to other mental attitudes. Section 1.b considers the differences between truth-directedness and other truth-related properties of belief such as the direction of fit and the value of having true beliefs. The truth-aim is usually attributed to belief in order to explain a number of characteristics of this attitude concerning its relation with truth. Section 1.c lists the main features that truth-directedness is supposed to explain.

a. The Aim as Constitutive and Individuative of Belief

When philosophers attribute an aim to belief, they conceive of this property as constitutive of this type of attitude. This means, roughly, that it is part of what a belief is (that is, part of the essence or the concept of belief) that it is a mental attitude directed at the truth. Let us label this the constitutivity thesis. Depending on how we conceive truth-directedness, there will be different ways of working up to this thesis. If, for example, we interpret truth-directedness as a goal of the agent (compare §2.a), we can conceive of beliefs as analogous to acts like concealing (Steglich-Petersen, 2006, p. 512). Part of what it is to conceal an object X is that it is a type of act involving the goal that someone will not find X. It is in virtue of this goal that an action counts as an instance of concealing. Similarly, a way of stating the constitutivity thesis for belief is that it is part of what S’s believing that p is that S has an aim or goal (or that it is a function of S’s cognitive system) to retain that attitude only if it is true. It is in virtue of this aim of the agent who believes (or this function of her cognitive system) that that attitude counts as belief.

Alternatively, if one interprets truth-directedness as a norm to believe only the truth (compare §2.b), the constitutivity thesis amounts to understanding this norm by analogy to rules constitutive of practices like games (Wedgwood, 2002, p. 268). A practice is constituted by a set of rules if and only if it is part of what that practice is that this set of rules is in force for agents engaged in that practice (Glüer & Pagin, 1998). Consider a specific example: chess is a game constituted by a set of rules stating which moves are legal or permissible in the game. If one plays chess, one is thereby committed by the rules of the game to perform only legal moves. The performance of a particular act does not count as a chess-move if it cannot be assessed (justified, criticized…) according to the constitutive rules of the game. Similarly, if it is part of what a belief is that it is an attitude governed by a norm to believe only the truth, a mental attitude does not count as a belief if it cannot be assessed (criticized, justified…) on the basis of this norm, as right or correct if true and wrong or incorrect if false. One can also conceive of the constitutivity thesis by analogy to other types of entity essentially constituted by norms or values. For example, it is constitutive of what it is to be a citizen to be subject to certain rights and commitments, and it is constitutive of murder to be an act of killing in a wicked, inhumane, or barbarous way (for the latter example, see Dretske, 2000, pp. 243-245).

The claim that truth-directedness is constitutive of belief can be conceived of in at least two ways, as relative to the concept of belief or to its nature. According to the conceptual interpretation, it is a condition of understanding the concept of belief that we conceive of beliefs as mental attitudes directed toward truth (Boghossian, 2003; Engel, 2004; Shah, 2003). A proper understanding of the concept of bachelor implies conceiving of a bachelor as an unmarried man. Analogously, if one has a correct grasp of the concept of belief and conceives of a mental attitude as a belief, she understands it as one that, in some sense to be specified, is directed toward truth.

Other philosophers consider truth-directedness as constitutive of the nature or essence of belief (Brandom, 2001; Railton, 1994; Velleman, 2000a; Wedgwood, 2002, 2007). The relation between belief and truth-directedness is here conceived of as one of metaphysical dependence of the former on the latter: as it is essential to water that it has a certain chemical composition (H₂O), it is essential to belief that it is an attitude involving a commitment to or an aim at truth. A mental attitude counts as a belief at least partially in virtue of aiming at the truth. It is simply impossible for an attitude to be a belief if it lacks this property.

It is usually held that the essentialist interpretation of the thesis does not entail the conceptual one (for example, Wedgwood, 2007, ch. 6). It is part of the essence of water, but not of its concept, that water is H₂O—we can understand the concept of water without conceiving water as having that specific chemical composition. Similarly the truth-aim may be constitutive of the essence of belief but not of its concept (see Zangwill, 2005 for a similar view). Also, some philosophers have argued that the conceptual interpretation does not entail the essentialist one (Papineau, 2013; Shah, 2003, fn. 41; Shah & Velleman, 2005, fn. 43; Wedgwood, 2007).

The second property commonly attributed to truth-directedness is the individuativity of belief: the aim is the feature that individuates belief as that type of mental state and distinguishes beliefs from other mental attitudes (Engel, 2004; Lynch, 2009; Railton, 1994; Velleman, 2000a; Wedgwood, 2002). Though many other attitudes entertain relations with truth (compare §1.b), it is claimed that belief is the only attitude aiming at truth. The truth-aim plays a fundamental role in sorting out beliefs from other mental attitudes, being the distinctive feature of beliefs with respect to other types of attitude like thoughts, suppositions, desires, and imagining.

Philosophers usually appeal to the individuativity of truth-directedness for belief for two main reasons: (1) singling out the aim as a peculiarly distinctive property of belief helps to achieve a better grasp of what truth-directedness is and to distinguish this property from other properties of belief (a philosopher who assumes individuativity in order to define truth-directedness is Velleman, 2000a, pp. 247-252); and (2) individuativity provides an argument to the best explanation for the claim that belief aims at truth: as the argument goes, without assuming that belief’s truth-directedness has this peculiar individuative role, one cannot account for the difference between beliefs and other attitudes (Engel, 2004; Railton, 1994).

It has also been suggested that if truth-directedness is the distinctive feature of belief with respect to other mental attitudes, this would provide an argument for the claim that this property is also constitutive of belief (Lynch, 2009b, p. 81; McHugh & Whiting, 2014; Velleman, 2000a; Wedgwood, 2002). Here is a way in which this argument may proceed: if the truth-aim were not a necessary and constitutive feature of belief, it would be possible for a belief not to aim at truth. But then, assuming that the aim is the only feature distinguishing beliefs from other mental attitudes, it would be impossible to classify that attitude as a belief rather than as a different type of attitude. Thus, the truth-aim must be a feature that beliefs possess necessarily and essentially. The argument from individuativity is not the only one supporting the constitutivity of truth-directedness for belief. Since other arguments partially depend on normativist interpretations of the aim, they will be considered in §2.b.

A number of critics have pointed out that it is possible to distinguish beliefs from other types of attitude without stipulating that they involve a constitutive aim at truth. These philosophers identify the attitude of believing a proposition with that of merely holding it true or accepting it (Glüer & Wikforss, 2013; Vahid, 2009), or they take other dispositional or motivational properties of belief as distinctive of this type of attitude. For a discussion of some of these views see §2.c.

b. Differences between the Aim and Other Properties of Belief

According to many philosophers engaged in the present debate (in particular those endorsing teleological and normative interpretations), truth-directedness is supposed to characterize and distinguish belief from other types of mental attitude. This property is conceived of as unique to belief, not possessed by any other attitude. These philosophers are careful to distinguish it from other properties relating belief to truth that other attitudes also possess. In this subsection I will introduce some of these properties and explain in which respects they are supposed to differ from the aim of belief. Mentioning these other properties will provide a rough idea of what truth-directedness is not. However, before considering these properties, it is worth mentioning that some philosophers endorsing minimalist conceptions of truth-directedness tend to identify the aim with some of these properties; these alternative interpretations of the aim will be briefly mentioned in this subsection and considered in more detail in §2.c.

An obvious truth-related feature of belief is the fact that believing something is believing it to be true (Velleman, 2000a). In other words, beliefs have propositions as content, and propositions can be true or false. This property is obviously not individuative of belief, and thus cannot be identified with truth-directedness. All propositional attitudes share it with beliefs. For instance, believing that p is believing that p is true, hoping that p is hoping that p is true, imagining that p is imagining that p is true, and so on (Engel, 2004; Velleman, 2000a).

It is also commonly held that beliefs involve specific causal, functional, and dispositional-motivational roles with respect to action and behavior. Some of these roles determine another aspect under which beliefs are related to truth. Using Ramsey’s (1931) metaphor, beliefs are like maps by which we steer in the world and upon which we are disposed to act. Belief is an attitude involving dispositions to act and behave as if its content were true and to use it as a premise in reasoning (Armstrong, 1973; Stalnaker, 1984). Some have argued that belief’s aim at truth can be identified with the possession of similar dispositional and functional properties. In response to this challenge, it has been argued that these properties are not sufficient to set belief apart from other mental attitudes, and thus to capture the distinctive relationship between belief and truth (Engel, 2004; Velleman, 2000a). Other types of attitude seem to possess these very same properties. For instance, attitudes like acceptance and pretense all seem to dispose the subject to act as if their content were true and have the same motivational role.

Another property commonly attributed to belief, and concerning the way it is related to truth, is its mind-to-world direction of fit. On the one hand, some attitudes, like desires, have a world-to-mind direction of fit: if what is desired is not the case, the world should be changed in order to fit what is desired, and not vice versa. On the other hand, other attitudes, like beliefs, have a mind-to-world direction of fit: if what is believed is not the case (that is, it does not fit what it is supposed to represent), the belief’s contents should be revised to fit the world, and not vice versa. This is only one way of fleshing out the distinction (see Frost, 2014 and Humberstone, 1992 for overviews of the distinction). Another popular way is to distinguish between cognitive and conative states, where cognitive states are such that the proposition in their content is regarded as something that is true, while conative states are such that they involve regarding the proposition in the content as something to be made true (Velleman, 2000a). It is difficult to evaluate the relation of the truth-directedness of belief with direction of fit, since this depends on which account of direction of fit one accepts, and there is no unique and undisputed account. Some philosophers seem to identify belief’s direction of fit with its aim at truth (Humberstone, 1992; Platts, 1979). Others (Engel, 2004; Shah & Velleman, 2005; Velleman, 2000a) distinguish the two features, arguing that other mental attitudes such as suppositions, assumptions, and imagining possess the same direction of fit as beliefs, and thus this property cannot be identified with truth-directedness, which is distinctive of beliefs. Notice that the persuasiveness of this argument depends on whether one endorses an account of direction of fit according to which other attitudes would have the same direction of fit as belief.

It is also important to distinguish the truth-directedness of belief from the value of possessing true beliefs. It has been argued that having true beliefs is something valuable (David, 2005; Horwich, 2006; Kvanvig, 2003; Lynch, 2004). We naturally prefer to have true rather than false beliefs, and tend to attribute some sort of value to true beliefs and disvalue to false ones. It seems to be a platitude that true beliefs are at least extrinsically and instrumentally valuable. For example, we might prefer true beliefs to false ones because the former are more conducive to the satisfaction of one’s desires and the avoidance of dangers. Some philosophers have argued that true beliefs also have epistemic value. For example, it has been argued that believing the truth is an intrinsically valuable cognitive success. Though one might expect there to be important connections between the two topics, the issue of whether true beliefs are valuable must be distinguished from the further issue of whether truth is the aim of belief. While the former is a matter of aims, goals, and evaluations extrinsic to the notion of belief (for example, the goal of believing truths and not believing falsehoods), the latter is a property intrinsic and constitutive of such a mental state (Vahid, 2006, 2009, p. 19). Another respect in which the two features must be distinguished is that the value of true beliefs is hardly individuative of beliefs: other types of mental state such as guesses, hypotheses, and conjectures are evaluable according to their being true or false. In spite of these important differences, some philosophers have suggested that the value of true beliefs can be at least in part related to and explained by the constitutive aim of belief, even if not identified with it (Engel, 2004; Lynch, 2004; Railton, 1994; Williams, 2002).

c. The Explanatory Role of the Aim

The hypothesis that beliefs involve an aim at truth has been used to explain a number of features specific to this mental attitude. Before considering such features, it is important to stress that not everyone who endorses some version of this hypothesis thinks that it can explain all of these features. The main features supposed to be explained by truth-directedness are the following:

(1) The difficulty or impossibility of believing at will,
(2) The infelicity of asserting Moorean sentences and the absurdity of having Moorean beliefs,
(3) The normativity of mental content,
(4) The motivational force of evidential considerations in deliberative contexts,
(5) The nature of epistemic normativity and the norms governing belief and theoretical reasoning, and
(6) The correctness standard of belief.

(1) As famously argued by Williams (1973), belief’s truth-aim would enable one to explain the difficulty of believing at will (see also Velleman, 2000a). Believing a proposition p at will would entail believing it without regard to whether p is true. However, if beliefs constitutively involve aiming at truth, the only considerations relevant to forming and maintaining a belief would be those in conformity to its constitutive aim; that is, truth-relevant considerations. Believing at will would thus be either impossible or very difficult. This line of argument has been widely discussed in the literature. For critical discussions see, for example, Frankish (2007); Hieronymi (2006); Setiya (2008); and Yamada (2012).

(2) Belief’s truth-directedness could also explain the infelicity of asserting Moorean sentences and the absurdity of thinking Moorean thoughts—sentences and thoughts having the form “I believe that p, but not p” (for example, Baldwin, 2007; Littlejohn, 2010; Millar, 2009; Moran, 1997; Railton, 1994). Though these sentences are not self-contradictory, if asserted, they sound odd and infelicitous. As Moore (1942, p. 543) observes, this feature of belief-ascription seems to show that self-ascribing a belief in the first person carries with it an implied claim to the truth of the believed proposition. Similar ascriptions relative to many other mental states involve either no infelicity (there is no paradox in asserting “I assume that p but it is false that p”) or a contradiction (it is contradictory to assert “I know that p but it is false that p”). The infelicity of asserting Moorean sentences can be explained as follows: on the one hand, an assertion is an act by which the speaker commits herself to the truth of what she says; on the other hand, a belief is a mental state involving an aim at the truth of the believed proposition. We can also think of this aim as a sort of commitment (Baldwin, 2007; Millar, 2009; see §2.a for normative interpretations of the aim). The infelicity would thus be due to a conflict between the respective constitutive commitments or aims of assertion and belief. By asserting a Moorean sentence like “p and I do not believe that p,” a speaker would both endorse a commitment to the truth of p and deny such a commitment at the same time. This explanation can be easily extended to an explanation of the unreasonableness of Moorean thoughts and judgments, since a judgment, like an assertion, can be considered an act involving a commitment to the truth of what is adjudged.

(3) Many philosophers argue that mental content is normative (for an overview and references, see Glüer & Wikforss, 2010). This thesis is often interpreted as the claim that there are norms governing the correct use of concepts in the content of propositional mental attitudes. An example of such a norm is that the concept white is correctly applied to an object x if and only if x is white. Some have suggested that the aim of belief can provide an explanation of the normativity of mental content. In particular, Velleman (2000a) has suggested that the normativity of content can be entirely reduced to the truth-directedness of belief: if there is a norm governing mental content, this norm applies only to the contents of attitudes that aim at truth; that is, to beliefs. Boghossian (2003) has provided an argument according to which the normativity of mental content would derive from that of belief. First, he argues that the truth-directedness of belief has to be conceived as a norm constitutive of the concept of belief. Second, he argues that there is a constitutive connection between the notions of content and belief: our grasp of the concept of content depends on the grasp of the concept of belief. The normativity of content would thus be inherited from the normativity of belief. This argument has been the target of several criticisms; see, in particular, Glüer & Wikforss (2009) and Miller (2008).

(4) Belief’s truth-directedness has also been invoked to explain certain aspects of doxastic deliberation (namely deliberation concerning what to believe). One such aspect is the motivational force of evidential considerations in deliberative contexts. In particular, Shah (2003) and Shah and Velleman (2005) have argued that truth-directedness can explain doxastic transparency, the phenomenon according to which, in the context of doxastic deliberation, the question whether to believe that p is invariably settled by the answer to the further question whether p is true. Roughly, the idea is that when an agent engages in deliberation whether to believe a given proposition, only evidential (truth-regarding) considerations can be treated as reasons for believing. Other types of considerations (for example, practical) have no motivational force in the deliberation. This can be explained by the hypothesis that the concept of belief is constitutively governed by a norm to believe p only if p is true, and that in doxastic deliberation, the agent deploying that concept in the question whether to believe that p is motivated by the truth-norm to form a belief only if it is true. This in turn explains why only truth-relevant considerations matter in answering the question. Other philosophers have provided similar explanations of doxastic transparency—and more generally of the central role of evidence in deliberative belief-formation processes—compatible with non-normative interpretations of the truth-aim (for example, Steglich-Petersen, 2006, §5). It is worth noting here that similar explanations of the impossibility of believing in response to non-evidential considerations can also be used to explain the impossibility of believing at will (see (1) above).

(5) Belief’s aim has also been invoked to explain the various norms governing belief and theoretical reasoning, and to shed light on the nature of epistemic normativity in general. For example, according to Velleman, belief’s truth-directedness accounts for the justificatory force of theoretical reasoning. Theoretical reasoning justifies a belief by adducing considerations that indicate it to be true (2000a, p. 246). This is the case because being true is what satisfies the aim of belief. Other philosophers have argued that belief’s aim helps to explain norms of rationality and justification governing beliefs, and, more generally, the nature of epistemic normativity (Boghossian, 2003; Millar, 2004, 2009; Shah & Velleman, 2005; Sosa, 2007; Wedgwood, 2002, 2013). A common explanation takes these norms as instrumentally conducive to the satisfaction of the constitutive truth-aim of belief. This approach to epistemic normativity is not new in the literature. Many philosophers of the past have argued that epistemic standards of justification and rationality would be derivable from the fundamental goal of believing truly and avoiding falsehoods (for an overview see Alston, 2005, chs. 1 and 2). Criticisms of this type of approach to epistemic normativity typically mirror arguments against similar approaches in the practical domain. See, for example, Berker (2013); Firth (1981); Kelly (2003); and Maitzen (1995).

The various attempts to reduce or explain epistemic normativity in terms of a fundamental aim or norm of truth governing belief are considered by some philosophers as part of a wider project directed at providing analogous accounts for other normative domains. In particular, some have argued that practical normativity can be traced back to constitutive norms of action and agency, which in turn would determine derivative norms of practical rationality and justification (Korsgaard, 1996; Shah, 2008; Velleman, 2000a; Wedgwood, 2007).

(6) According to some philosophers, the aim at truth would also explain why a belief is correct if and only if it is true, that is, the so-called correctness standard of belief (Steglich-Petersen, 2006, 2009; Velleman, 2000a). Philosophers endorsing teleological interpretations of the aim hold that the standard would be an instrumental assessment indicating the measure of success that a belief must attain in order to achieve its constitutive aim. However, this thesis is the subject of major disagreements. Philosophers giving normative interpretations of truth-directedness either identify the correctness standard with the constitutive aim of belief (Engel, 2007; Wedgwood, 2002), or argue for the independence of the two (Shah & Velleman, 2005).

2. Interpretations of the Aim

In the contemporary debate, there is wide disagreement on how to interpret the claim that belief aims at truth. There are two main interpretations of the aim: teleological and normativist. According to teleological accounts, the aim of belief is an intentional purpose of subjects holding beliefs, or a functional goal of cognitive systems regulating the formation, maintenance, and revision of beliefs. Normativist accounts hold that the claim that beliefs have an aim must be interpreted metaphorically. According to normativists, truth-directedness is better understood as a commitment, a norm governing the regulation of beliefs (their formation, maintenance, and revision). Other philosophers have endorsed minimalist accounts of truth-directedness, denying that beliefs aim at truth in a substantive sense.

a. Teleological Interpretations

Teleological (also called “teleologist”) interpretations hold that beliefs are literally directed at truth as an aim, an end, or a goal (telos in Greek). This aim would be realized in truth-conducive processes and practices of belief-regulation, whose role is the formation, maintenance, and revision of beliefs. An attitude would count as a belief only if it is formed and regulated by these processes and practices. An advantage commonly attributed to teleological interpretations is that these interpretations seem more compatible with a naturalistic account of belief than rival interpretations (in particular, normativist ones). The thought is that intentions, goals, or functions can be accounted for in naturalistic terms. Furthermore, this interpretation would naturally fit with broadly instrumentalist, naturalistically unproblematic conceptions of epistemic normativity and epistemic rationality (note however that these conceptions have been the target of many criticisms; for example, Berker, 2013; Kelly, 2003).

Teleological interpretations differ with respect to how they conceive the aim at truth. Some teleologists interpret the aim of belief as an intentional goal of the subject, like an interest to accept a proposition only if it is true. For example, according to Steglich-Petersen (2006), believing is accepting a proposition with the purpose of getting its truth-value right. According to such an interpretation, the aim is realized through deliberative practices like judgments, in which an agent accepts a proposition only if she has evidence in support of its truth, and maintains that acceptance in the absence of contrary evidence. Steglich-Petersen recognizes that many of our beliefs are regulated in entirely sub-intentional ways. However, he argues that only beliefs considered at an intentional level are connected to a literal aim:

cognitive states and processes that are not connected with any literal aim or intention of a believer can nevertheless count as ‘beliefs’ in virtue of […] being to some degree conducive to the hypothetical aim of someone intending to form a belief in the primary strong sense. (2006, p. 515)

Other philosophers have advanced sub-intentional interpretations of the aim, conceiving it as a functional goal of the attitude or the psychological system to form true beliefs and revise false ones. This function would be regulated at a sub-personal, often unconscious level. A similar approach has been defended by Bird (2007) and McHugh (2012b). Some authors also interpret certain functionalist accounts such as those of Burge (2003), Millikan (1984), and Plantinga (1993) as teleological in this sense (see, for example, McHugh, 2012a, fn. 6, 2012b, fn. 49).

The most popular interpretation of the aim is a “mixed” one, according to which truth-directedness would be constituted by both intentional and sub-intentional processes. In particular, Velleman (2000a) maintains that there is a broad spectrum of ways in which the aim can be regulated. While sometimes it is realized in the intentional aim of a subject in an act of judgment about a certain matter, at other times there are cognitive systems in charge of the regulation of belief designed to ensure the truth of such mental states. Such systems would carry out this function more or less automatically, not relying on the subject’s intentions. Other philosophers who distinguish between intentional and sub-personal levels of regulation of the aim are Millar (2004, ch. 2) and Sosa (2007, 2009).

A well-known objection to teleological accounts, provided by Owens (2003), is specifically directed at intentional and mixed interpretations of the aim (for similar objections see Kelly, 2003). Owens observes that if beliefs aim at truth as argued by teleologists, believing would be similar to guessing. Guesses are mental acts aiming at truth, in the sense that when one guesses, one strives to give the true answer to a question. As Owens writes,

a guesser intends to guess truly. The aim of a guess is to get it right: a successful guess is a true guess and a false guess is a failure as a guess. Someone who does not intend to guess truly is not really guessing. (2003, p. 290)

According to a teleologist perspective, similar considerations are valid for belief, which is a mental state held with the purpose of holding it only if true. But there are at least two important disanalogies between the intentional aim involved in guessing and the aim of belief.

First, the aim of belief does not interact with other aims of the subject the way the truth-aim of guesses does. The aim of guessing (as well as that of other goal-directed activities) can interact with other goals and objectives of the subject: it can conflict with these other goals, and it can be weighed with them. In particular, when we guess, we integrate the truth-aim constitutive of guessing with other purposes, such as the practical relevance of guessing, and we consider guessing that p reasonable when aiming at the truth by means of a guess that p would maximize expected utility (Owens, 2003, p. 292). If beliefs, like guesses, constitutively involve an aim at truth, then we should expect that, on at least some occasions, we would weigh the aim of belief against other aims. For example, when engaged in deliberation about whether to believe a given proposition, our pursuit of the truth-goal may be constrained by other goals and purposes of the subject in the usual way. But belief’s aim does not work like this. A large reward for believing that today it is not sunny gives me a reason to try to believe it, but, in deliberation about what to believe, these considerations do not interact and cannot be weighed with the truth-aim of belief in the way they do with other aims and purposes of the subject. In this respect, belief appears to be “insulated” from all but one aim, in a way that aim-directed behaviors in general are not (McHugh, 2012a, p. 430).

The second disanalogy suggested by Owens is that, in guessing, we can exercise a kind of voluntary control that is not possible in the case of belief. The guesser can compare different considerations and then decide whether to terminate her inquiry and guess. Nothing similar happens in deliberation about whether to believe a given proposition, where one cannot decide when to conclude one’s inquiry and start believing a proposition. The deliberation is concluded more or less automatically and cannot be controlled by reflection on how best to achieve the aim. Given these disanalogies, Owens concludes that while a guess is an attitude regulated by an intentional aim at truth, belief is not.

Teleologists have provided some replies to Owens’s argument. In particular, it has been argued that the aim of belief does in fact interact and can be weighed with other aims (Steglich-Petersen, 2009); it has been denied that evidential considerations play the exclusively prominent role in belief-formation suggested by Owens’s argument (McHugh, 2012a; for a similar point, though not directly related to Owens’s argument, see Frankish, 2007); and it has been argued that the direct form of control we have on the formation of guesses, but not of beliefs, can be explained by the fact that belief is a mental state, while guessing is a mental act (Shah & Velleman, 2005).

A related problem for a teleological interpretation of the aim is that sometimes we are completely indifferent to certain matters, and sometimes we even prefer (have the goal or aim) not to have any belief on certain matters. Nevertheless, evidence for these truths constitutes reasons for us to believe them, and if presented with such evidence in normal circumstances, we cannot refrain from forming beliefs on these matters. This seems to show that truth-directedness, and more generally epistemic rationality, cannot be reduced to aim-directed activities in the common sense of the term (Kelly, 2003).

Another very popular argument against teleological accounts of truth-directedness is Shah’s (2003) “teleologist’s dilemma”. The dilemma relies on the following observation: on the one hand, in practices of doxastic deliberation—deliberation directed at forming a belief about a certain matter—considerations concerning the evidence in support of the truth of a given proposition are the only ones that are relevant in order to answer the question whether to believe that proposition (this is what Shah calls the phenomenon of doxastic transparency; compare §1.c). On the other hand, some belief-formation processes can be influenced by non-evidential factors (for example, cases of wishful thinking). In an attempt to explain these two types of belief formation, the teleologist is pushed in two incompatible directions: she can consider the truth-aim as a disposition so weak as to allow cases in which beliefs are caused by non-evidential processes, in which case she cannot account for the exclusive influence of evidential considerations in deliberative contexts of belief-formation; alternatively, in order to account for the exclusive role of evidence in doxastic deliberation, she can strengthen the disposition that constitutes aiming at truth so that it excludes the influence of non-truth-regarding considerations from such kinds of reasoning—but then she cannot accommodate non-deliberative cases in which non-evidential factors influence belief-formation. In either case, the teleologist cannot explain the truth-regulation of belief in both deliberative and non-deliberative contexts. Therefore, a teleologist interpretation of the aim is not sufficient alone to provide an explanation for the truth-directedness of beliefs in all processes of belief formation.

In order to address this problem, Shah & Velleman (2005) argue that belief is regulated by two levels of truth-directedness: a sub-intentional teleological mechanism responsible for weak regulation in non-deliberative contexts, and one conceived in normative terms, able to explain the strong truth-regulation in deliberative contexts (see §2.b). For accounts of the dilemma compatible with a teleologist perspective, see, for example, Steglich-Petersen (2006) and Toribio (2013). For other objections to the teleological account, see Engel (2004) and Zalabardo (2010).

b. Normative Interpretations

Another way of interpreting belief’s truth-directedness is in normative terms. According to normativist accounts of the aim of belief, the claim that “belief aims at truth” is just a metaphorical way of expressing the thought that beliefs are constitutively governed by a norm prescribing (or permitting) one to believe the truth (or only the truth). For example, if Mary forms the belief that it is now 12 a.m., she does what the norm requires (that is, she possesses a correct belief) if that proposition is true, and she violates the norm if that proposition is false. Many normativists identify the norm of belief with a standard of correctness: a belief is correct if and only if it is true.

These philosophers take this standard to be constitutive of the essence or the concept of belief: belief would be a mental state characterized by the fact of being correct if and only if it is true (see §1.a for more details on normativist interpretations of the constitutivity thesis). This interpretation of the aim is probably the most popular in the early 21st century. It has been defended by, among others, Boghossian (2003); Engel (2004, 2013); Gibbard (2005); Millar (2004); Shah (2003); and Wedgwood (2002, 2007, 2013).

Let us here clarify a common confusion about the claim that belief is constitutively governed by a norm: that a truth-norm constitutively governs belief does not mean that all beliefs necessarily satisfy that norm. What is constitutive of belief is not the satisfaction of the norm (as a matter of fact, many beliefs happen to be false, and thus incorrect), but that the norm is in force and believers and their beliefs can be assessed and criticized according to it—as correct if the belief is true, and incorrect if it is false.

One of the best known arguments for a normative interpretation of truth-directedness, suggested by Shah (2003), is the argument to the best explanation of doxastic transparency. As mentioned in §2.a, transparency is the (alleged) phenomenon according to which the deliberative question whether to believe a given proposition p is invariably settled by answering the further question whether it is true that p. The two questions are answerable to the same set of considerations; that is, considerations concerning the evidence for or against the truth of p. This phenomenon is specific to deliberative contexts in which an agent explicitly considers whether to believe a given proposition. In such contexts, only evidential (truth-relevant) considerations can influence belief-formation. In contexts in which a subject forms a belief without passing through a deliberative process, on the contrary, non-evidential considerations could influence the belief-formation.

According to Shah, only a normative interpretation of the aim of belief can explain these facts—doxastic transparency, why this phenomenon is specific to doxastic deliberation, and the exclusive role of evidential considerations in deliberative contexts. The explanation is the following: let us assume that it is constitutive of the concept of belief that a belief is correct if and only if it is true. This is interpreted as the claim that someone believing a proposition p is under a normative commitment to believe p only if it is true. When a subject engages in doxastic deliberation and asks herself whether to believe a given proposition, she deploys the concept of belief. Assuming she understands this concept and is aware of its application conditions, she interprets this question as whether to form a mental attitude that she should have only if the proposition is true. This in turn determines a disposition to be moved only by considerations relevant to the truth of p. This explains transparency and the exclusive role of evidential considerations in deliberative contexts. In contrast, in non-deliberative contexts where belief-formation works at a sub-intentional level, the subject does not explicitly consider the question whether to believe p, does not deploy the concept of belief, and is not thereby motivated by the norm to regard only truth-relevant considerations as relevant in the process of belief-formation. For this reason, non-evidential factors can influence belief-formation in these contexts. In sum, Shah’s normativist account allows him to explain both the strong role of truth in the belief regulation in deliberative contexts, and its weak role in non-deliberative ones.

An objection to Shah’s argument is that it assumes an implausibly strong form of motivational internalism according to which the norm of belief necessarily and immediately motivates the agent when she recognizes and accepts it. This contrasts with the ways in which, in general, norms tend to motivate agents (McHugh, 2013; Steglich-Petersen, 2006).

Another argument for a normativist account of truth-directedness, suggested by Wedgwood (2002), is composed of two steps. First, it is argued that the correctness standard of belief expresses a relation of strong supervenience (correctness of a belief strongly supervenes on the truth of that belief’s content). The standard thereby articulates a necessary feature of belief: necessarily, all true beliefs are correct and all false beliefs are incorrect. Second, since the standard articulates a necessary feature of belief, it is an essential feature of beliefs. Both steps of the argument have been criticized (for example, Steglich-Petersen, 2008, pp. 277-278). Against the second step, one cannot infer from a thing necessarily possessing a certain property to the property being essential to the thing it is a property of—using a well-known example of Fine (1994, pp. 4-5), one cannot infer from the necessary claim that Socrates is the only member of the singleton having as its only member Socrates to the claim that it is essential to Socrates that he is the only member of that singleton. Against the first step, it has been argued that it relies on contentious assumptions about normative supervenience: it is an error to deduce from the supervenience of a normative property N over a non-normative property G the necessity of the claim that every object having property G also has property N. The most one can conclude is that, necessarily, if some object has property N in virtue of having property G, then anything with property G also has property N (where necessity here takes a wide scope on the conditional). For similar considerations on normative relations of supervenience, see Blackburn (1993, p. 132); and Steglich-Petersen (2008).
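The contrast at issue in the objection to the first step can be set out in a rough modal notation. This is only an illustrative sketch: the symbols below (G for the non-normative property, N for the normative property, and an "in virtue of" connective written as a subscript) are notational assumptions introduced here, not the symbolism of Wedgwood, Blackburn, or Steglich-Petersen.

```latex
% Assumed notation (not from the source); requires amsmath and amssymb:
%   Gx : x has the non-normative property G (e.g. being a true belief)
%   Nx : x has the normative property N (e.g. being correct)
%   N_{G}x : x has N in virtue of having G
\begin{align*}
\text{Claim the argument needs:}\quad
  & \Box\,\forall x\,(Gx \rightarrow Nx) \\
\text{Claim supervenience licenses:}\quad
  & \Box\,\bigl(\exists x\,N_{G}x \;\rightarrow\; \forall y\,(Gy \rightarrow Ny)\bigr)
\end{align*}
% The second formula does not entail the first: it is silent about
% worlds in which nothing has N in virtue of having G.
```

On this reading, the objection is that supervenience yields only the second, conditional claim, whereas the argument requires the first, unconditional one.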

Other arguments often used in support of the normativist interpretation do not clearly favor this interpretation over alternative substantive conceptions of truth-directedness, such as teleological ones. For example, it has been argued that unless one assumes that belief is constitutively governed by a truth-norm, one is not in a position to distinguish beliefs from other cognitive propositional attitudes, such as assuming, thinking, or imagining. The assumption that belief is constitutively governed by a truth-norm has also been used in arguments to the best explanation of a number of features of belief such as (1) the infelicity of asserting Moorean sentences; (2) the disposition to rely on a believed proposition as a reason for action and a premise in practical reasoning (Baldwin, 2007, p. 83); and (3) the relation between belief, assertion, evidence, and action (Griffiths, 1962). See §1.c for discussion of some of these arguments. These various arguments have received formulations both in normativist and teleological terms (for teleological formulations, simply replace occurrences of “truth-norm” with “truth-aim”); for this reason, they do not favor either interpretation. It is also worth mentioning that the claim that belief is constitutively normative has received indirect support from views that, for independent reasons, hold that intentional attitudes in general are constitutively normative (Brandom, 1994; Millar, 2004; Wedgwood, 2007).

Though the normativist interpretation has been the most popular in the last two decades, it has also been the target of several criticisms. According to the No Guidance Argument, a truth-norm is incapable of guiding an agent in the formation and revision of her beliefs. One can conform one’s beliefs to a norm requiring one to believe only true propositions only by first forming beliefs about whether these propositions are true. The only way to follow this norm will thus be continuing to believe what one already believes. Such a norm would not provide any guidance as to what a subject should do in order to comply with it. More precisely, this norm would have no guiding role in processes of belief regulation (formation, maintenance, and revision). Versions of this argument have been given by Glüer & Wikforss (2009, pp. 44-45); Horwich (2006, p. 354); and Mayo (1963, p. 141). A reply to this argument consists in arguing that even if the truth-norm does not provide any direct guidance, it can guide belief regulation indirectly, via some other derived principle like norms of evidence and rationality (Boghossian, 2003; Wedgwood, 2002); or it could guide in specific contexts, such as in doxastic deliberation where an agent explicitly considers her evidence for a given proposition p with the aim of making up her mind about whether p (Shah & Velleman, 2005). For a further defense of the argument see Glüer & Wikforss (2010b, 2013).

Another criticism of the normative interpretation is that, in general, an agent subject to a norm should have some form of intentional control over the actions necessary to satisfy it and be free to choose whether to conform to the norm or not. These conditions on control and freedom to comply seem to be constraints on norms in general. However, belief formation is (at least often) an involuntary process and is realized at an automatic, non-inferential level. It is thus unclear how a truth-norm governing belief can satisfy the above constraints. For versions of this objection, see Glüer & Wikforss (2009); and Steglich-Petersen (2006).

Another problem for normative interpretations of truth-directedness concerns the formulation of the alleged norm of belief. If beliefs are constitutively governed by a truth-norm, it should be possible to state this norm in terms of some duty, prescription, or permission. However, all the suggested formulations seem to be affected by some problem. Bykvist and Hattiangadi (2007), in particular, consider several possible formulations of the norm and conclude that none of them is free from problems. The best known formulations are the following:

(1) For any S, p: S ought to (believe that p) iff p

(2) For any S, p: if S ought to (believe that p), then p

(3) For any S, p: S ought to (believe that p iff p)

All these formulations are flawed in some way. (1) implies that one ought to believe every true proposition, including trivial and uninteresting ones (see also Sosa, 2003 for a similar point). Furthermore, provided there are true propositions that it is impossible to believe (for example, it is raining and nobody believes that it is raining), (1) violates the commonly accepted rule according to which “ought” implies “can.” (2) seems not to be normatively interesting because it is unable to place any requirement on believers: if p is true, nothing follows from it about what S ought to believe; and if p is false, it only follows that it is not the case that S ought to believe that p; it does not follow that S ought not to believe that p. (3) is problematic because it does not allow one to derive claims about what one ought or ought not to believe. For example, from (3) and the falsity of p, one cannot derive that one ought not to believe p. Furthermore, (3) seems to be subject to the same objections raised against (1).
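The scope differences among the three formulations, and what each does or does not license, can be made explicit in a rough deontic notation. This is only an illustrative sketch: the operator O (read "it ought to be that") and the belief operator B_S are notational assumptions introduced here, and the gloss on (3) assumes, as the objection does, that an "ought" cannot be detached from a wide-scope norm by adding a factual premise.

```latex
% Assumed notation (not from the source); requires amsmath:
%   O(\varphi) : it ought to be that \varphi
%   B_S p      : S believes that p
\begin{align*}
(1)\quad & \forall S\,\forall p\;\bigl(O(B_S p) \leftrightarrow p\bigr)
         && \text{narrow scope, biconditional} \\
(2)\quad & \forall S\,\forall p\;\bigl(O(B_S p) \rightarrow p\bigr)
         && \text{narrow scope, conditional} \\
(3)\quad & \forall S\,\forall p\;O\bigl(B_S p \leftrightarrow p\bigr)
         && \text{wide scope}
\end{align*}
% What each formulation yields:
\begin{align*}
(1),\; p       &\;\vdash\; O(B_S p)
         && \text{an obligation for every truth} \\
(2),\; \neg p  &\;\vdash\; \neg O(B_S p)
         && \text{but } (2),\, \neg p \nvdash O(\neg B_S p) \\
(3),\; \neg p  &\;\nvdash\; O(\neg B_S p)
         && \text{no detachment from the wide-scope norm}
\end{align*}
```

Read this way, the difference between (2) and a genuine prohibition is the difference between the absence of an obligation to believe p and the presence of an obligation not to believe it.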

Bykvist and Hattiangadi raise similar objections to other formulations of the truth-norm. From this, they conclude that this general failure could be considered a clue that belief is not at all a normative concept, at least not in the way suggested by normativists. Many have considered this conclusion too hasty. First, even if all the available formulations of the norm are wrong, this does not mean that it is impossible to formulate the norm of belief in “ought” terms; it could simply mean that the right formulation is yet to come. Second, some have suggested alternative formulations that seem to avoid the above problems. For example, Whiting (2010) has suggested that interpreting the truth-norm as a norm of permissibility could avoid most of the problems. Other formulations avoiding these problems have been suggested by Littlejohn (2010), Fassio (2011), and Raleigh (2013). For a discussion, see Bykvist and Hattiangadi (2013). A third way of avoiding these objections is to deny that the norm of belief is a truth-norm (see §3).

A reply to the various considered objections to normative interpretations consists in abandoning a deontic conception of the truth-norm, according to which the norm would be like a prescription, directive, or permission. Adopting alternative non-deontic interpretations of the norm would allow one to avoid the various objections considered above. Some have suggested interpreting the normativity of belief in evaluative terms; that is, in terms of what it is good (in a certain sense of “good” to be specified) to believe (Fassio, 2011; McHugh, 2012b). Others have interpreted the norm of belief as involving a type of normativity sui generis (McHugh, 2014; Rosen, 2001, p. 621), as an ideal of reason (Engel, 2013), or as an “ought to be” in the Sellarsian sense, not requiring addressees of the norm to be capable of voluntarily following it (Chrisman, 2008).

For other criticisms of normativist interpretations of truth-directedness, see also Davidson (2001); Dretske (2000); Horwich (2006, 2013); and Papineau (1999, 2013). The common factor in these criticisms is the defense of the thesis that if there are norms governing beliefs, these are practical, contingent, and not constitutive of belief. Replies to some of the above objections are in Engel (2007, 2013); Shah & Velleman (2005); and Wedgwood (2013).

c. Minimalist Interpretations

The label “minimalist interpretations” is used here for a range of different views. The common factor of these views is that they deny that there is such a property as a truth-aim of belief, at least if one identifies it with some feature different from those considered in §1.b. Minimalists hold that the features supposedly explained by the aim of belief (see §1.c) can be explained by other properties commonly ascribed to these mental states, such as their causal, dispositional, functional, or motivational roles, or their direction of fit (Davidson, 2001; Dretske, 2000; Papineau, 1999).

Given the present characterization, one may wonder whether there is a clear-cut dividing line between teleological and minimalist views; in particular, between sub-intentional teleological views, identifying the aim at truth with functional mechanisms of the cognitive system, and dispositionalist and functionalist minimalist accounts. A way of distinguishing these two approaches is by looking at the dispositional or functional role distinctive of belief (compare McHugh, 2012a, fn. 6): while functionalist accounts congenial to teleological approaches to the truth-aim focus on the input side of belief’s functional role, and exclusively identify this role with a truth-directed goal (for example, forming true beliefs), minimalists think that the role that individuates belief is at least partially on the output side, and they are mainly concerned with practical roles of belief, such as satisfying the subject’s desires or providing reasons for action.

Some philosophers have endorsed accounts of belief according to which causal, dispositional, and/or functional roles of beliefs with respect to action and behavior would be sufficient to characterize and individuate this type of mental attitude. Some have argued that the specific relation between belief and truth can be fully explained by the functional role of beliefs of providing reasons for action. Others have argued that all that is necessary for an attitude to qualify as a belief is that it dispose the subject to behave in ways that would promote the satisfaction of his desires if its content were true. For similar views see, for example, Stalnaker (1984). Armstrong (1973) argues that an essential function of beliefs is moving a subject to action given the presence of suitable dominant desires and purposes, and locates in this causal role the peculiar difference between belief and other mental attitudes such as mere thoughts. Still others have identified the aim at truth of belief with its direction of fit (Humberstone, 1992; Platts, 1979).

A “deflationary” interpretation of truth-directedness has been defended by Vahid (2006, 2009). Vahid first considers the feature of accepting-as-true, introduced by Velleman (2000a), as common to all cognitive states (beliefs, assumptions, conjectures, imaginations…). He suggests that to capture the truth-directedness of belief, one should not add any further (teleological or normative) property to the fact that belief is an attitude of regarding-as-true. Rather, what is distinctive of belief according to Vahid is the specific way in which one regards-as-true a given proposition. While other attitudes involve regarding a proposition as true for the sake of something else, in order to reach certain specific goals (for example, assuming is regarding-as-true for the sake of argument, imagining involves regarding a proposition as true for motivational purposes), believing is regarding a proposition as true for its own sake, as an end in itself.

The main criticism directed at minimalist interpretations is that other mental states such as suppositions and assumptions possess these same properties (same causal, functional, dispositional, and motivational roles; same direction of fit) and, thus, that these properties are not sufficient alone to individuate the peculiar truth-directedness of belief, to explain the special features of belief listed in §1.c, and to distinguish beliefs from other types of mental attitude (Engel, 2004; Velleman, 2000a). For a reply, see, for instance, Zalabardo (2010, §10), who challenges the claim that a purely motivational conception of belief would not be sufficient to distinguish beliefs from other mental attitudes. See also Glüer & Wikforss (2009, p. 42).

3. What Does Belief Aim At?

There is debate concerning whether the aim of belief is truth, as has been traditionally argued, or some other property. Since the late 1990s, an increasing number of philosophers have defended the claim that knowledge is the fundamental aim or norm of belief. Upholders of this view are, among others, Adler (2002); Bird (2007); Huemer (2007); Littlejohn (2013); Peacocke (1999); Sutton (2007); and Williamson (2000). The best known defender of the thesis that belief aims at knowledge is Williamson (2000). Williamson’s main motivations for holding this thesis derive from his view about the nature of knowledge and its relation to belief. Williamson criticizes the idea that it is possible to provide an analysis of knowledge in terms of other more fundamental notions. Rather, other epistemic notions such as belief and justification should be understood as derivative from the more fundamental notion of knowledge; this is the so-called Knowledge First approach in epistemology. In particular, Williamson suggests that belief be considered roughly as the attitude of treating a proposition as if one knew it. Knowledge would thus fix the standard of appropriateness or the success condition for a belief, and merely believing p without knowing it would be a sort of “botched knowing” (2000, p. 47). In this sense, belief would not aim merely at truth but at knowledge.

A well-known argument for the knowledge aim is based on a parallel between assertion and belief. On the one side, many have argued that assertion is constitutively governed by a knowledge norm (Adler, 2002; Bird, 2007; Sutton, 2007; Williamson, 2000):

(KNA) one should assert p only if one knows p.

On the other hand, some philosophers have suggested that occurrent belief is the inner analogue of assertion (for example, Williamson, 2000, pp. 255-256). More precisely, the idea is that (flat-out) assertion is the verbal counterpart of a judgment, and a judgment is a form of occurrent (outright) belief. If so, it is plausible that assertion and belief are governed by the same norm, and knowledge would be the norm of belief too:

(KNB) one should believe p only if one knows p.

Similar arguments have been suggested by Adler (2002); Bird (2007); McHugh (2011); Sosa (2010, p. 48); Sutton (2007); and Williamson (2000). To this line of argument, it may be objected that knowledge is not the norm of assertion. Some philosophers have suggested counterexamples to this thesis (Brown 2008; Lackey 2007). Others have argued that assertion is governed by other norms such as truth or justification (Douven, 2006; Weiner, 2005). Another way to criticize this argument consists in challenging the similarity between belief and assertion, arguing in particular that they would not be governed by the same norms. Whiting (2013, pp. 187-188) provides some reasons why one should expect standards of belief and assertion to diverge: since assertion is an “external” act, involving a social dimension, in evaluating an assertion one might have to take into account the expectations and needs of interlocutors and the role of speech acts in the unfolding conversation. Furthermore, assertion is a potential source of testimony. In asserting, one takes on responsibility for others’ beliefs. All these considerations are extraneous to belief, which is a “private” state of mind. It would thus not be surprising if assertions were governed by more demanding epistemic standards than belief due to their social character and their communicative role. Brown (2012, pp. 137-144) provides another argument against the claim that assertion and belief share the same epistemic standard: she first argues that whether an assertion or belief is epistemically appropriate partially depends on its consequences (for example, the epistemic propriety of asserting varies with the stakes), and second, that the consequences of asserting p may differ from those of believing p. It follows that there can be cases in which it is epistemically appropriate for a subject to believe that p, but not to assert that p, and vice versa.

Considerations about versions of Moore’s paradox with “know” in place of “believe” provide another argument for the claim that knowledge is the aim or norm of belief (Adler, 2002; Gibbons, 2013; Huemer, 2007; Sutton, 2007). Just as it sounds absurd or infelicitous to assert sentences like “it is raining but I do not know it is raining,” it seems incoherent to believe that it is raining and at the same time that one does not know that it is raining. An explanation of this fact could be that knowledge is the aim or norm of belief. A subject believing that it is raining but that she does not know it would violate (KNB). This type of argument has been the target of two objections: first, some have argued that a weaker standard, like a truth-norm, would be sufficient to explain the absurdity of this sort of Moorean belief (see, for example, the explanation considered in §1.c). Second, it has been argued that while asserting Moorean propositions of the form “it is raining but I do not know it is raining” sounds absurd, there is no such absurdity in believing these same propositions. It seems both reasonable and appropriate to believe something even while believing that one does not know it (McGlynn, 2013; Whiting, 2013, pp. 188-189). To use an example from McGlynn (2013, p. 387), there is nothing unreasonable or incongruous in Jane’s believing that her ticket will lose, that this belief is justified, and that nonetheless this belief fails to count as knowledge. A reply to the latter criticism consists in distinguishing between outright (or full) belief and partial belief. Only outright belief would be subject to a knowledge norm. For a reply, see McGlynn (2013, §3) and Whiting (2013, p. 189).

Another argument for the knowledge aim/norm of belief is provided by the way in which we tend to assess (justify and criticize) our beliefs. Williamson (2005, p. 109) provides the following case: John is at the zoo and sees what appears to him to be a zebra in a cage. The animal in the cage is really a zebra. However, unbeknownst to John, to save money, most of the other animals in the zoo have been replaced by cleverly disguised farm animals. In this scenario, John’s belief is true and fully reasonable (after all, he has no reason to believe that the animal in the cage could not be a zebra). Still, John does not know it is a zebra. Intuitively, John needs an excuse for believing that the animal is a zebra. He can excuse himself by pointing out that he is in no position to distinguish his state from one of knowing. The need for an excuse indicates that it is wrong for John to believe that the animal is a zebra. Despite the fact that John’s belief is both reasonable and true, it is somewhat defective. This contrasts with knowing that it is a zebra, which provides a full justification for believing it, not a mere excuse. A reply to this argument is that John is not wrong in believing that the animal is a zebra (and thus does not need an excuse for this). Rather, if John stands in need of correction, this is due to the false background beliefs he has—for example, the belief that animals in this area are all zebras (Littlejohn, 2010; Whiting, 2013).

Some philosophers have argued that knowledge is the aim or norm of belief on the ground that knowledge has more value than mere true belief (Bird, 2007). However, on the one hand, there is no general agreement on whether knowledge is more valuable than true belief. On the other hand, from the fact that knowledge is more valuable than true belief, it does not follow that belief is governed by a norm or involves a constitutive aim at knowledge.

For other arguments in support of the claim that the aim or norm of belief is knowledge, see Bird (2007); Engel (2004); McHugh (2011); and Sutton (2007). For other criticisms of the claim, see Littlejohn (2010); McGlynn (2013); and Whiting (2013).

Though truth and knowledge are widely identified as the main candidates for being the aim or norm of belief, some philosophers have suggested other properties. Another available option is that the aim of belief is reasonability or justification. For a defense of the claim that non-factive justification is the condition of epistemic success for belief, see, for example, Feldman (2002, pp. 378-379). It should be noted that this view has not found many proponents in the 21st-century literature, at least if we exclude philosophers for whom justified belief is factive or requires knowledge (for example, Gibbons, 2013; Littlejohn, 2012; Sutton, 2007). An original approach is that of Smithies (2012), who argues that the fundamental norm of belief is that one be in a position to know what one believes, where “[o]ne is in a position to know a proposition just in case one satisfies all the epistemic, as opposed to psychological, conditions for knowledge, such as having ungettiered justification to believe a true proposition” (2012, p. 4).

Some philosophers have identified the aim of belief with some specific kind of epistemic virtue one could manifest in the possession of belief (Zagzebski, 2004) or with understanding (Kvanvig, 2003). According to other philosophers, the fundamental aim of belief consists in the satisfaction of practical goals such as survival, utility, or the satisfaction of desires and wants. Similar views are particularly popular among philosophers favoring naturalistic approaches in the philosophy of mind. For example, Millikan (1984) argues that beliefs are integrated in a naturally selected cognitive system having the function of tracking features of the world in order to help in the satisfaction of biological needs such as survival and reproduction. A similar view has been more or less explicitly endorsed by, for example, Horwich (2006); Kornblith (1993); and Papineau (1999, 2013).

Another option is that belief possesses multiple aims or norms equally fundamental and irreducible to each other. For example, Weiner (2014) endorses pluralism about epistemic norms, arguing that there are many different epistemic norms, each valid from a different standpoint, and that no one of these standpoints need be better than another. In a virtue-theoretic framework, Wright (2014) argues that there are two fundamental epistemic aims: believing in accordance with the intellectual virtues (such as intellectual courage, carefulness, and open-mindedness), and believing the truth and avoiding falsehoods.

4. Relevance of the Topic

The topic discussed in the present article has relevance for several more general philosophical debates. It is clearly relevant in those areas of philosophy directly concerned with the notion of belief, such as the Philosophy of Mind, and in particular, the ontology of mental attitudes. One of the issues that traditionally have most concerned philosophers is that of individuating the distinctive feature of belief with respect to other mental attitudes such as trusting, mere thinking, imagining, guessing, and so on. As David Hume admits (1739, book I, part III, §7), the distinction between belief and mere thought was the first philosophical problem that the Scottish philosopher posed himself, and also one of the hardest he found to solve (for a discussion, see Armstrong, 1973, part I, §5). Whereas the difference between belief and other types of mental attitude seems to reside in the specific relationship that belief entertains with truth (or knowledge), it has been extremely difficult to grasp the peculiar nature of such a relationship. The 21st-century debate on the aim of belief promises to provide an answer to such a problem. In answering questions about the nature of the aim (see §2), it also promises to shed some light on the issue of whether belief is a normative attitude or whether it can be characterized by a fully naturalistic account.

The progress in the debate about the aim/norm of belief also substantially contributes to the study of norms and aims of other attitudes such as desires, emotions, and intentions. The aim of such studies is to provide a unified and coherent representation of the various norms and aims of mental attitudes and of their reciprocal relations (for example, Millar, 2009; Railton, 1997; Shah, 2008; Velleman, 2000b; Wedgwood, 2007). For instance, Wedgwood (2007) interprets the aim of attitudes as norms of correctness and argues that similar norms are constitutive and individuative of all intentional attitudes. Similarly, Shah (2008) applies an argument analogous to the transparency argument for belief (considered in §2.b) to other attitudes. In particular, he argues that the hypothesis that the concept of intention is governed by a constitutive norm would best explain the presumed fact that in order to conclude deliberation on whether to intend to A one must answer the question whether to A.

The debate on the aim of belief is also particularly relevant to certain views in the philosophy of normativity. According to a prominent view in meta-ethics (so-called Constitutivism), normative facts can be grounded in facts about the constitution of action or agency. According to this view, agency is constitutively governed by practical norms (for example, Korsgaard, 1996; Velleman, 2000b). Some philosophers have tried to extend the view to other normative domains such as epistemology and aesthetics. The epistemological analogue of Ethical Constitutivism holds that epistemic normativity can be grounded in the constitutive aim or norm of belief. For a criticism of constitutivist approaches, see Enoch (2006).

The view that epistemic normativity is grounded in a fundamental truth-aim of belief has deep roots in the 20th-century history of epistemology. Many philosophers in the past have argued that there is a strict dependence relation between the fundamental aim or norm of belief (sometimes presented as a conjunction of two values or goals of believing truly and avoiding falsehoods) and other derivable normative epistemic standards such as justification and rationality. Versions of this view have been defended by many well-known epistemologists, including Chisholm, Goldman, Lehrer, Plantinga, Alston, Foley, and Sosa (see Alston, 2005, ch. 1, for an overview). Accounts of this relation differ depending on the notions of justification and rationality adopted by philosophers, and on how philosophers conceive the relation between the truth-aim and other derived normative properties (for example, consequentialist, deontologist, virtue-based…).

Another approach in epistemology concerned with the topic of the aim or norm of belief is the so-called Knowledge First approach introduced in §3. According to this view, knowledge has a prominent role among epistemic notions and constitutes the fundamental epistemic standard of assertion, belief, action, practical reasoning, and disagreement (compare “Knowledge Norms”). This approach has generated a comparative study of these standards (for example, Smithies, 2012). An example is the reformulation of various arguments for the knowledge norm of assertion in order to defend other knowledge norms, such as those of belief and action (compare §3). In this perspective, the debate on the aim of belief can help in understanding important aspects of epistemic norms of assertion, action, practical reasoning, and disagreement, and in turn can receive important contributions from advances in the debates about norms governing these other practices.

5. References and Further Reading

Adler, J. (2002). Belief’s own ethics (Vol. 112). MIT Press.
Alston, W. (2005). Beyond “justification”: Dimensions of epistemic evaluation (Vol. 81). Ithaca: Cornell University Press.
Armstrong, D. M. (1973). Belief, truth and knowledge (Vol. 24). London: Cambridge University Press.
Baldwin, T. (2007). The normative character of belief. In M. S. Green & J. N. Williams (Eds.), Moore’s paradox: New essays on belief, rationality, and the first person (pp. 76–89). Oxford: Oxford University Press.
Berker, S. (2013). The rejection of epistemic consequentialism. Philosophical Issues, 23(1), 363–387.
Bird, A. (2007). Justified judging. Philosophy and Phenomenological Research, 74(1), 81–110.
Blackburn, S. (1993). Essays in quasi-realism. New York, NY: Oxford University Press.
Boghossian, P. A. (2003). The normativity of content. Philosophical Issues, 13(1), 31–45.
Brandom, R. B. (1994). Making it explicit: Reasoning, representing, and discursive commitment. Cambridge, MA: Harvard University Press.
Brandom, R. B. (2001). Modality, normativity, and intentionality. Philosophy and Phenomenological Research, 63(3), 587–609.
Brown, J. (2008). The knowledge norm for assertion. Philosophical Issues, 18(1), 89–103.
Brown, J. (2012). Assertion and practical reasoning: Common or divergent epistemic standards? Philosophy and Phenomenological Research, 84(1), 123–157.
Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
Bykvist, K., & Hattiangadi, A. (2007). Does thought imply ought? Analysis, 67(296), 277–285.
Bykvist, K., & Hattiangadi, A. (2013). Belief, truth, and blindspots. In T. Chan (Ed.), The aim of belief (pp. 100–122). New York, NY: Oxford University Press.
Chrisman, M. (2008). Ought to believe. Journal of Philosophy, 105(7), 346–370.
David, M. (2005). Truth as the primary epistemic goal: A working hypothesis. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 296–312). Oxford: Blackwell.
Davidson, D. (2001). Comments on Karlovy Vary papers. In P. Kotatko (Ed.), Interpreting Davidson (pp. 285–308). Stanford, CA: CSLI Publications.
Douven, I. (2006). Assertion, knowledge, and rational credibility. Philosophical Review, 115(4), 449–485.
Dretske, F. (2000). Norms, history and the constitution of the mental. In Perception, knowledge and belief (pp. 242–258). Cambridge: Cambridge University Press.
Engel, P. (2004). Truth and the aim of belief. In D. Gillies (Ed.), Laws and models in science (pp. 77–97). London: King’s College Publications.
Engel, P. (2007). Belief and normativity. Disputatio, 2(23), 179–203.
Engel, P. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 199–216.
Enoch, D. (2006). Agency, shmagency: Why normativity won’t come from what is constitutive of action. Philosophical Review, 115(2), 169–198.
Fassio, D. (2011). Belief, correctness and normativity. Logique Et Analyse, 54(216), 471–486.
Feldman, R. (2002). Epistemological duties. In P. Moser (Ed.), The Oxford handbook of epistemology (pp. 362–384). Oxford: Oxford University Press.
Fine, K. (1994). Essence and modality. Philosophical Perspectives, 8, 1–16.
Firth, R. (1981). Epistemic merit, intrinsic and instrumental. Proceedings and Addresses of the American Philosophical Association, 55(1), 5–23.
Frankish, K. (2007). Deciding to believe again. Mind, 116(463), 523–547.
Frost, K. (2014). On the very idea of direction of fit. Philosophical Review, 123(4), 429–484.
Gibbard, A. (2005). Truth and correct belief. Philosophical Issues, 15(1), 338–350.
Gibbons, J. (2013). The norm of belief. Oxford: Oxford University Press.
Glüer, K., & Pagin, P. (1998). Rules of meaning and practical reasoning. Synthese, 117(2), 207–227.
Glüer, K., & Wikforss, Å. (2009). Against content normativity. Mind, 118(469), 31–70.
Glüer, K., & Wikforss, Å. (2010a). The normativity of meaning and content. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
Glüer, K., & Wikforss, Å. (2010b). The truth norm and guidance: A reply to Steglich-Petersen. Mind, 119(475), 757–761.
Glüer, K., & Wikforss, Å. (2013). Against belief normativity. In T. Chan (Ed.), The aim of belief. New York, NY: Oxford University Press.
Griffiths, A. P. (1962). On belief. Proceedings of the Aristotelian Society, 63, 167–186.
Hieronymi, P. (2006). Controlling attitudes. Pacific Philosophical Quarterly, 87(1), 45–74.
Horwich, P. (2006). The value of truth. Noûs, 40(2), 347–360.
Horwich, P. (2013). Belief-truth norms. In T. Chan (Ed.), The aim of belief (pp. 17–31). New York, NY: Oxford University Press.
Huemer, M. (2007). Moore’s paradox and the norm of belief. In S. Nuccetelli & G. Seay (Eds.), Themes from G.E. Moore (pp. 142–57). Oxford: Oxford University Press.
Humberstone, I. L. (1992). Direction of fit. Mind, 101(401), 59–83.
Hume, D. (1739). A treatise of human nature. Oxford: Oxford University Press.
Kelly, T. (2003). Epistemic rationality as instrumental rationality: A critique. Philosophy and Phenomenological Research, 66(3), 612–640.
Kornblith, H. (1993). Epistemic normativity. Synthese, 94(3), 357–376.
Korsgaard, C. M. (1996). The sources of normativity (Vol. 110). Cambridge: Cambridge University Press.
Kvanvig, J. L. (2003). The value of knowledge and the pursuit of understanding (Vol. 113). Cambridge: Cambridge University Press.
Lackey, J. (2007). Norms of assertion. Noûs, 41(4), 594–626.
Littlejohn, C. (2010). Moore’s paradox and epistemic norms. Australasian Journal of Philosophy, 88(1), 79–100.
Littlejohn, C. (2012). Justification and the truth-connection. Cambridge: Cambridge University Press.
Littlejohn, C. (2013). The Russellian retreat. Proceedings of the Aristotelian Society, 113, 293–320.
Lynch, M. (2004). True to life: Why truth matters. Cambridge, MA: MIT Press.
Lynch, M. (2009a). The value of truth and the truth of values. In A. Haddock, A. Millar, & D. Pritchard (Eds.), Epistemic value. Oxford: Oxford University Press.
Lynch, M. (2009b). Truth, value and epistemic expressivism. Philosophy and Phenomenological Research, 79(1), 76–97.
Maitzen, S. (1995). Our errant epistemic aim. Philosophy and Phenomenological Research, 55(4), 869–876.
Mayo, B. (1963). Belief and constraint. Proceedings of the Aristotelian Society, 64, 139–156.
McGlynn, A. (2013). Believing things unknown. Noûs, 47(2), 385–407.
McHugh, C. (2011). What do we aim at when we believe? Dialectica, 65(3), 369–392.
McHugh, C. (2012a). Belief and aims. Philosophical Studies, 160(3), 425–439.
McHugh, C. (2012b). The truth norm of belief. Pacific Philosophical Quarterly, 93(1), 8–30.
McHugh, C. (2013). Normativism and doxastic deliberation. Analytic Philosophy, 54(4), 447–465.
McHugh, C. (2014). Fitting belief. Proceedings of the Aristotelian Society, 114(2pt2), 167–187.
McHugh, C., & Whiting, D. (2014). The normativity of belief. Analysis, 74(4), 698–713.
Millar, A. (2004). Understanding people: Normativity and rationalizing explanation. New York, NY: Oxford University Press.
Millar, A. (2009). How reasons for action differ from reasons for belief. In S. Robertson (Ed.), Spheres of reason (pp. 140–163). New York, NY: Oxford University Press.
Miller, A. (2008). Thoughts, oughts and the conceptual primacy of belief. Analysis, 68(299), 234–238.
Millikan, R. G. (1984). Language, thought and other biological categories. Cambridge, MA: MIT Press.
Moore, G. E. (1942). A reply to my critics. In P. A. Schilpp (Ed.), The philosophy of G. E. Moore. Chicago, IL: Open Court.
Moran, R. A. (1997). Self-knowledge: Discovery, resolution, and undoing. European Journal of Philosophy, 5(2), 141–61.
Owens, D. J. (2003). Does belief have an aim? Philosophical Studies, 115(3), 283–305.
Papineau, D. (1999). Normativity and judgment. Proceedings of the Aristotelian Society, 73(73), 16–43.
Papineau, D. (2013). There are no norms of belief. In T. Chan (Ed.), The aim of belief (pp. 64–79). New York, NY: Oxford University Press.
Peacocke, C. (1999). Being known. Oxford: Oxford University Press.
Plantinga, A. (1993). Warrant and proper function. New York, NY: Oxford University Press.
Platts, M. B. (1979). Ways of meaning: An introduction to a philosophy of language. London: Routledge & K. Paul.
Railton, P. (1994). Truth, reason, and the regulation of belief. Philosophical Issues, 5, 71–93.
Railton, P. (1997). On the hypothetical and non-hypothetical in reasoning about belief and action. In G. Cullity & B. N. Gaut (Eds.), Ethics and practical reason (pp. 53–79). New York, NY: Oxford University Press.
Raleigh, T. (2013). Belief norms and blindspots. Southern Journal of Philosophy, 51(2), 243–269.
Ramsey, F. P. (1931). Foundations of mathematics and other logical essays. London: Routledge.
Rosen, G. (2001). Brandom on modality, normativity and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
Setiya, K. (2008). Believing at will. Midwest Studies in Philosophy, 32(1), 36–52.
Shah, N. (2003). How truth governs belief. Philosophical Review, 112(4), 447–482.
Shah, N. (2008). How action governs intention. Philosophers’ Imprint, 8(5), 1–19.
Shah, N., & Velleman, J. D. (2005). Doxastic deliberation. Philosophical Review, 114(4), 497–534.
Smithies, D. (2012). The normative role of knowledge. Noûs, 46(2), 265–288.
Sosa, E. (2003). The place of truth in epistemology. In L. Zagzebski & M. DePaul (Eds.), Intellectual virtue: Perspectives from ethics and epistemology (pp. 155–180). New York, NY: Oxford University Press.
Sosa, E. (2007). A virtue epistemology: Apt belief and reflective knowledge, Volume I. Oxford: Oxford University Press.
Sosa, E. (2009). Knowing full well: The normativity of beliefs as performances. Philosophical Studies, 142(1), 5–15.
Sosa, E. (2010). Knowing full well. Princeton, NJ: Princeton University Press.
Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
Steglich-Petersen, A. (2006). No norm needed: On the aim of belief. Philosophical Quarterly, 56(225), 499–516.
Steglich-Petersen, A. (2008). Against essential normativity of the mental. Philosophical Studies, 140(2), 263–283.
Steglich-Petersen, A. (2009). Weighing the aim of belief. Philosophical Studies, 145(3), 395–405.
Sutton, J. (2007). Without justification. Cambridge, MA: MIT Press.
Toribio, J. (2013). Is there an “ought” in belief? Teorema: Revista Internacional de Filosofía, 32(3), 75–90.
Vahid, H. (2006). Aiming at truth: Doxastic vs. epistemic goals. Philosophical Studies, 131(2), 303–335.
Vahid, H. (2009). The epistemology of belief. London: Palgrave Macmillan.
Velleman, D. (2000a). On the aim of belief. In D. Velleman (Ed.), The possibility of practical reason (pp. 244–281). New York, NY: Oxford University Press.
Velleman, D. (2000b). The possibility of practical reason (Vol. 106). New York, NY: Oxford University Press.
Wedgwood, R. (2002). The aim of belief. Philosophical Perspectives, 16(s16), 267–297.
Wedgwood, R. (2007). The nature of normativity. New York, NY: Oxford University Press.
Wedgwood, R. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 217–234.
Weiner, M. (2005). Must we know what we say? Philosophical Review, 114(2), 227–251.
Weiner, M. (2014). The spectra of epistemic norms. In J. Turri & C. Littlejohn (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 201–218). Oxford: Oxford University Press.
Whiting, D. (2010). Should I believe the truth? Dialectica, 64(2), 213–224.
Whiting, D. (2013). Nothing but the truth: On the norms and aims of belief. In T. Chan (Ed.), The Aim of Belief. New York, NY: Oxford University Press.
Williams, B. (1973). Deciding to believe. In B. Williams (Ed.), Problems of the Self (pp. 136–151). Cambridge: Cambridge University Press.
Williams, B. (2002). Truth and truthfulness: An essay in genealogy. Princeton, NJ: Princeton University Press.
Williamson, T. (2000). Knowledge and its limits. Oxford: Oxford University Press.
Williamson, T. (2005). Knowledge, context, and the agent’s point of view. In G. Preyer & G. Peter (Eds.), Contextualism in philosophy: Knowledge, meaning, and truth (pp. 91–114). New York, NY: Oxford University Press.
Wright, S. (2014). The dual-aspect norms of belief and assertion: A virtue approach to epistemic norms. In C. Littlejohn & J. Turri (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 239–258). New York, NY: Oxford University Press.
Yamada, M. (2012). Taking aim at the truth. Philosophical Studies, 157(1), 47–59.
Zagzebski, L. (2004). Epistemic value and the primacy of what we care about. Philosophical Papers, 33(3), 353–377.
Zalabardo, J. L. (2010). Why believe the truth? Shah and Velleman on the aim of belief. Philosophical Explorations, 13(1), 1–21.
Zangwill, N. (2005). The normativity of the mental. Philosophical Explorations, 8(1), 1–19.

Author Information

Davide Fassio
Email: davide.fassio@unige.ch
University of Geneva
Switzerland

Ernst Cassirer (1874-1945)

Ernst Cassirer was the most prominent, and the last, Neo-Kantian philosopher of the twentieth century. His major philosophical contribution was the transformation of his teacher Hermann Cohen’s mathematical-logical adaptation of Kant’s transcendental idealism into a comprehensive philosophy of symbolic forms intended to address all aspects of human cultural life and creativity. In doing so, Cassirer paid equal attention to both sides of the traditional Neo-Kantian division between the Geisteswissenschaften and Naturwissenschaften, that is, between the human sciences and the natural sciences. This is expressed most systematically in his masterwork, the multi-volume Philosophie der symbolischen Formen (1923-1929). Here Cassirer marshaled the widest learning of human cultural expression (in myth, religion, language, philosophy, history, art, and science) for the sake of completing and correcting Kant’s transcendental program. The human being, for Cassirer, is not simply the rational animal, but the animal whose experience with and reaction to the world is governed by symbolic relations. Cassirer was a quintessential humanistic liberal, believing freedom of rational expression to be coextensive with liberation. Cassirer was also the twentieth century’s greatest embodiment of the Enlightenment ideal of comprehensive learning, having written widely acclaimed histories of the ideas of science, historiography, mathematics, mythology, political theory, and philosophy. Though Cassirer was cordial with both Moritz Schlick and Martin Heidegger, his popularity was eclipsed by the simultaneous rise of logical positivism in the English-speaking world and of phenomenology on the European continent. His professional career was the victim, too, of the political events surrounding the ascendancy of Nazism in German academies.

Biography
Philosophy of Symbolic Forms
Cultural Anthropology
Myth
Language
Science
Political Philosophy
The Davos Conference
References and Further Reading
1. Cassirer’s Major Works
2. Further Reading

1. Biography

Ernst Cassirer was born in 1874, the son of the established Jewish merchant Eduard Cassirer, in the former German city of Breslau (modern day Wrocław, Poland). He matriculated at the University of Berlin in 1892. His father intended that he study law, but Cassirer’s interest in literature and philosophy prevented him from doing so. After sampling various courses at the universities of Leipzig, Munich, and Heidelberg, Cassirer was first exposed to Neo-Kantian philosophy by the social theorist Georg Simmel in Berlin. In 1896, Cassirer began his doctoral studies under Hermann Cohen at the University of Marburg.

Cassirer’s interests at Marburg ran, as they would always, toward framing Neo-Kantian thought in the wider contexts of historical thinking. These interests culminated in his dissertation, Descartes’ Kritik der mathematischen und naturwissenschaftlichen Erkenntnis (1899). Three years later, Cassirer published a similarly historical book, Leibniz’ System in seinen wissenschaftlichen Grundlagen (1902). Cassirer was also the editor of Leibniz’ Philosophische Werke (1906). His focus on the development of modern idealist epistemology and its foundational importance for the history of the various natural sciences and mathematics reached its apex in Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit, whose first three volumes appeared between 1906 and 1920 and for which he was awarded the Kuno Fischer Medal by the Heidelberg Academy. The first volume, Cassirer’s Habilitationsschrift at the University of Berlin (1906), examines the development of epistemology from the Renaissance through Descartes; the second (1907) continues from modern empiricism through Kant; and the third (1920) deals with the development of epistemology after Kant, especially the division between Hegelians and Neo-Kantians up to the mid-twentieth century. A fourth volume of Das Erkenntnisproblem, on contemporary epistemology and science, was written in exile in 1940 but published only after the end of the war, in 1946.

Although his quality as a scholar of ideas was unquestioned, anti-Jewish sentiment in German universities made finding suitable employment difficult for Cassirer. Only through the personal intervention of Wilhelm Dilthey was Cassirer given a Privatdozent position at the University of Berlin in 1906. His writing there was prolific and continued the Neo-Kantian preoccupation with the intersections among epistemology, mathematics, and natural science. Two works exemplify the quality of his contributions to the philosophy of science, including his work on, and with, Einstein: Substanzbegriff und Funktionsbegriff (1910) and Zur Einstein’schen Relativitätstheorie: Erkenntnistheoretische Betrachtungen (1921). These works also mark Cassirer’s conviction that an historian of ideas could make a major contribution to the most contemporary problems in every field.

After the First World War, and in the more tolerant Weimar Republic, Cassirer was invited to a chair at the new University of Hamburg in 1919. There, Cassirer came into the cultural circle of Erwin Panofsky and the Warburg Library of the Cultural Sciences. Immediately Cassirer was absorbed into the vast cultural-anthropological data collected by the Library, effecting the widest expansion of Neo-Kantian ideas into the previously uncharted philosophical territories of myth, the evolution of language, zoology, primitive cultures, fine art, and music. The acquaintance with the Warburg circle transformed Cassirer from a student of the Marburg School’s analysis of the transcendental conditions of thinking into a philosopher of culture whose inquisitiveness touched nearly all areas of human cultural life. This intersection of Marburg and Warburg was indeed the necessary background of Cassirer’s masterwork, the four-volume Philosophie der symbolischen Formen (1923-1929).

In addition to his programmatic work, Cassirer was a major contributor to the history of ideas and the history of science. In conscious contrast with Hegelian accounts of history, Cassirer does not begin with the assumption of a theory of dialectical progress that would imply the inferiority of earlier stages of historical developments. By starting instead with the authors, cultural products, and historical events themselves, Cassirer finds characteristic frames of mind that are defined by the kinds of philosophical questions and responses that frame them, which are in turn constituted by characteristic forms of rationality. Among his works at this time, which influenced a generation of historians of ideas from Arthur Lovejoy to Peter Gay, are Individuum und Kosmos in der Philosophie der Renaissance (1927); Die Platonische Renaissance in England und die Schule von Cambridge (1932); Philosophie der Aufklärung (1932); Das Problem Jean-Jacques Rousseau (1932); and Descartes: Lehre, Persönlichkeit, Wirkung (1939). Cassirer’s philosophy of science had a similar influence on the historical analyses of Alexander Koyré and, through him, Thomas Kuhn.

In 1929, Cassirer was chosen Rektor of the University of Hamburg, making him the first Jewish person to hold that position in Germany. However, even as Cassirer’s star was rising, the situation for Jewish academics was deteriorating. With Hitler’s appointment as Chancellor came the ban on Jews holding academic positions. Cassirer saw the writing on the wall and emigrated with his family in 1933. He spent two years at Oxford and then six at Göteborg, where he wrote Determinismus und Indeterminismus in der modernen Physik (1936), Descartes: Lehre, Persönlichkeit, Wirkung (1939), and Zur Logik der Kulturwissenschaften (1942). In 1939, he also wrote the first comprehensive study of the Swedish legal theorist and proto-Analytic philosopher, Axel Hägerström.

In 1941, Cassirer boarded the last ship the Germans permitted to sail from Sweden to the United States, where he would hold positions at Yale for two years and then at Columbia for one. His final books, written in English, were the career-synopsis, An Essay on Man (1944), and his first philosophical foray into contemporary politics, The Myth of the State (1946), published posthumously. Cassirer’s death in New York City on April 13, 1945, preceded that of Hitler and the surrender of Germany by mere weeks.

2. Philosophy of Symbolic Forms

“The Philosophy of Symbolic Forms is not concerned exclusively or even primarily with the purely scientific, exact conceiving of the world; it is concerned with all the forms assumed by man’s understanding of the world” (Philosophy of Symbolic Forms, vol. III, 13). For Cassirer, Neo-Kantianism was less about doctrinal allegiance than it was about a common commitment to explore the cognitive structures that underlie the variety of human experience. After the death of Cohen, Cassirer became increasingly interested in value and culture. Inspired by the Warburg Library, Cassirer cast his net into an ocean of cultural expression, trying to find the common thread that united the manifold of cultural forms, that is, to move from the critique of reason to the critique of culture.

As to what precisely symbolic forms are, Cassirer offers perhaps his clearest definition in an early lecture at the Warburg Library (1921):

By ‘symbolic form’ I mean that energy of the spirit through which a mental meaning-content is attached to a sensual sign and inwardly dedicated to this sign. In this sense language, the mythical-religious world, and the arts each present us with a particular symbolic form. For in them all we see the mark of the basic phenomenon, that our consciousness is not satisfied to simply receive impressions from the outside, but rather that it permeates each impression with a free activity of expression. In what we call the objective reality of things we are thus confronted with a world of self-created signs and images. (“Der Begriff der Symbolischen Form im Aufbau der Geisteswissenschaften”)

An illustration Cassirer uses is that of the curved line on a flat plane. To the geometer, the line means a quantitative relation between the two dimensions of the plane; to the physicist, the line perhaps means a relation of energy to mass; and to the artist, the line means a relation between light and darkness, shape and contour. More than simply a reflection of different practical interests, Cassirer believes each of these brings different mental energies to bear in turning the visual sensation of the line into a distinct human experience. No one of these ways of experiencing is the true one, though each has its distinctive pragmatic uses within its own field. The task of the philosopher is to understand the internal directedness of each of these mental energies independently and in relation to the others as the sum total of human mental expression, which is to say, culture.

The first two forms Cassirer discusses, in the first two volumes respectively, are language and myth. The third volume of the Philosophy of Symbolic Forms concerns contemporary advances in epistemology and natural science: “We shall show how the stratum of conceptual, discursive knowledge is grounded in those other strata of spiritual life which our analysis of language and myth has laid bare; and with constant reference to this substructure we shall attempt to determine the particularity, organization, and architectonics of the superstructure – that is, of science” (Philosophy of Symbolic Forms, vol. III, xiii). Cassirer works historically, tracing the problem of philosophical knowledge from the Ancient Greeks up through the Neo-Kantian tradition. The seemingly endless battle between intuition and conceptualization has been fought in various forms between the originators of myths and the earliest theorists of number, between the Milesians and Eleatics, between the empiricists and rationalists, and again right up to Ernst Mach and Max Planck. Cassirer’s position here is conciliatory: both sides have contributed, and will continue to contribute, their perspectives on the eternal questions of philosophy insofar as both recognize their efforts as springing from the human’s multifaceted and spontaneous creativity, that is, as symbol-forming rather than designating endeavors that, in their dialectic with each other, construct more elaborate and yet universal ways to navigate our world:

Physics gains this unity and extension by advancing toward ever more universal symbols. But in this process it cannot jump over its own shadow. It can and must strive to replace particular concepts and signs with absolutely universal ones. But it can never dispense with the function of concepts and signs as such: this would demand an intellectual representation of the world without the basic instruments of representation. (Philosophy of Symbolic Forms, vol. III, 479)

The fourth volume, The Metaphysics of Symbolic Forms, was published posthumously. Along with other papers left at the time of his death, the German original is now found in the first volume of Cassirer’s Nachgelassene Manuskripte und Texte, edited by John Michael Krois and Oswald Schwemmer in 1995. The English volume, assembled and edited by Donald Philip Verene and John Michael Krois in 1996, contains two texts from different periods in Cassirer’s writings. The first, from 1928, deals with human nature rather than metaphysics proper. In agreement with Heidegger, curiously, Cassirer seeks to replace traditional metaphysics with a fundamental study of human nature. Much of the thematic discussion of this part receives a refined and more complete expression in Cassirer’s 1944 Essay on Man. What is of novel interest here is his discussion of then-contemporary philosophical anthropologists like Dilthey, Bergson, and Simmel and also the Lebensphilosophen, Schopenhauer, Kierkegaard, and Nietzsche, who otherwise receive short shrift in his work. His critical remarks on these latter thinkers concern their treatment of life as a new sort of metaphysics, one marred, however, by the kind of dogmatism found in pre-Kantian metaphysics.

The second text in Verene and Krois’s assembled volume comes from 1940, well after the project had been otherwise finished, and its theme is what Cassirer terms “basis phenomena”: phenomena so fundamental that they cannot be derived from anything else. The main basis phenomenon concerns how the tripartite structure of the self’s personal relation to the environment is mirrored in a tripartite social structure of the “I,” the “you,” and that which binds society: “work.” Not to be confused with the Marxist conception of work, for Cassirer work is anything made or effected, any subjective operation on the objective world. The initial and most fundamental production of work, for Cassirer, is culture, the sphere in which the “I” and “you” come together in active life.

Several objections to Cassirer’s masterwork have been raised. First, the precise identity and number of forms is ambiguous across Cassirer’s corpus. In the lecture from 1921, Cassirer names language, myth-religion, and art as forms, but that list cannot be considered exhaustive. Even in his summatory Essay on Man, consecutive pages maintain different lists: “myth, language, art, religion, history, science” (222) and then “language, myth, art, religion, science” (223); elsewhere science is omitted (63); mathematics is sometimes added; and religion is sometimes considered part of mythic thinking. The first two of the four volumes of The Philosophy of Symbolic Forms, on language and myth respectively, would seem to indicate that each volume would treat a specific form. But the latter two volumes break the trend to deal with a host of different forms. Moreover, it is ambiguous how precisely those forms are related. For example, myth is sometimes treated as a primitive form of language and sometimes non-developmentally as an equal correlate. Arithmetic and geometry are the logic that undergirds the scientific symbolic form, but in no way do they undergird primitive forms of science that have been superseded. Whether the forms are themselves developmental or whether development takes place by the instantiation of a new form is also left vague. For example, Cassirer indicates that the move from Euclidean to non-Euclidean geometry involves not just progress but an entirely new system of symbolization. However, myth does not seem to develop into anything other than something wholly different, that is, representational language.

There is, however, a certain necessity to Cassirer’s imprecision on these points. Taken together, the Philosophy of Symbolic Forms is a grand narrative that exposits how various human experiences evolve out of an originally animalistic and primitive articulation of expressive signs into the complicated and more abstract forms of culture in the twentieth century. As “energies of the spirit” they cannot be affixed with the kind of rigid architectonic featured in Kant’s transcendental deduction of purely logical forms. Though spontaneous acts of mental energy, symbolic forms are both developmental and pragmatic insofar as they adapt over time to changing environments in response to real human needs, something that resists an overly rigid structuralism. Those responses feature a loose sort of internal logic, but one characterized according to contingent cultural interactions with the world. Therefore, one ought not to expect Cassirer to offer the same logical precision that comes with the typical Neo-Kantian discernment of mental forms insofar as logic is only one form among many cultural relations with life.

3. Cultural Anthropology

Cassirer’s late Essay on Man (1944) expresses neatly his lifelong attempt to combine his Neo-Kantian view of the actively-constituting subject with his Warburgian appreciation for the diversity of human culture. Here, as ever, Cassirer begins with the history of views up into his present time, culminating in the presentation of a definitive scientific thesis that he would then proceed to refute. Johannes von Uexküll’s Umwelt und Innenwelt der Tiere (1909) argued that evolutionary biology has taken too far the view that animal parts and functions develop as a response to environmental factors. In its place Uexküll offers the “functional circle” of animal activity, which identifies the interaction of distinct receptor and effector systems. Animals are not simply reacting to the environment as it presents itself in sensory stimuli. They adapt themselves, consciously and unconsciously, to their environments, sometimes with clear signs of intelligence and insight. Different animals use diverse and sometimes highly complex systems of signals to better respond to and manipulate their environments to their advantage. Dogs, for example, are adroit at reading signals in body language, vocal tones, and even hormone changes while being remarkably effective in expressing a complex range of immediate inner states in terms of the vocalized pitch of their whimpers, grunts, or barks, as well as the bends of their tails, or the posture of their spines. In Pavlov’s famous experiments, dogs were conditioned to react both to the immediate signals of meat (its visual appearance and smell) and also to mediate signals, like a ringing bell, to the same effect.

Cassirer thinks this theory makes good sense of the animal world as a corrective to a too-simple version of evolution, but doubts this can be applied to humans. Over and above the signals received and expressed by animals, human beings evolved to use symbols to make their world meaningful. The same ringing of the bell would not be considered by man a physical signal so much as a symbol whose meaning transcends its real, concrete stimulation. For man, a bell does not indicate simply that food is coming, but induces him to wonder why that bell might indicate food, or perhaps whether an exam is over, or the fulfillment of a sacrament, or that someone is on the telephone. None of those symbols would lead necessarily to a response in the way the conditioned dog salivates at the bell. They instead prompt a range of freely creative responses in human knowers within distinct spheres of meaning:

Symbols—in the proper sense of this term—cannot be reduced to mere signals. Signals and symbols belong to two different universes of discourse: a signal is a part of the physical world of being; a symbol is a part of the human world of meaning. Signals are ‘operators’; symbols are ‘designators’. Signals, even when understood and used as such, have nevertheless a sort of physical or substantial being; symbols have only a functional value. (Essay on Man 32)

Between the straightforward reception of physical stimuli and the expression of an inner world lies, for Cassirer, the symbolic system: “This new acquisition transforms the whole of human life. As compared with the other animals man lives not merely in a broader reality; he lives, so to speak, in a new dimension of reality” (Essay on Man 24). That dimension is distinctively Kantian: the a priori forms of space and time. Animals have little trouble working in three-dimensional space; their optical, tactile, acoustic, and kinesthetic apprehension of spatial distances functions at least as well as it does in humans. But only to the human is the symbol of pure geometrical space meaningful, a universal, non-perceptual, theoretical space that persists without immediate relationship to his or her own interaction with the world: “Geometrical space abstracts from all the variety and heterogeneity imposed upon us by the disparate nature of our senses. Here we have a homogenous, a universal space” (Essay on Man 45). In terms of time, too, there can be no doubt that higher animals remember past sensations, or that memory affects the manner in which they respond when similar sensations are presented. But in the human person the past is not simply repeated in the present, but transformed creatively and constructively in ways that reflect values, regrets, hopes, and so forth:

It is not enough to pick up isolated data of our past experience; we must really re-collect them, we must organize and synthesize them, and assemble them into a focus of thought. It is this kind of recollection which gives us the characteristic human shape of memory, and distinguishes it from all the other phenomena in animal or organic life. (Essay on Man 51)

As animals recall pasts and live within sensory space, human beings construct histories and geometries. Both history and geometry, then, are symbolic engagements that render the world meaningful in an irreducibly human fashion.

This symbolic dimension of the person carries him or her above the effector-receptor world of environmental facts and subjective responses. He or she lives instead in a world of possibilities, imaginations, fantasy, and dreams. However, just as there is a kind of logic to the language of contrary-to-fact conditionals or to the rules of poetic rhythm, so too is there a natural directedness expressed in how human beings construct a world of meaning out of those raw effections and receptions. That directedness cannot, however, be restricted to rational intentionality, though reason is indeed an essential component. In distinction from the Neo-Kantian theories of experience and representation, Cassirer thinks there is a wider network of forms that enable a far richer engagement between subject and object than reason could produce: “Hence, instead of defining man as an animal rationale, we should define him as an animal symbolicum” (Essay on Man 26).

With his definition of man as the symbolic animal, Cassirer is in position to reenvision the task of philosophy. Philosophy is not merely the analysis and eventual resolution of a set of linguistic problems, as Wittgenstein would have it, nor is it restricted, as it was for many Neo-Kantians, to transcendentally deducing the logical forms that would ground the natural sciences. Philosophy’s “starting point and its working hypothesis are embodied in the conviction that the varied and seemingly dispersed rays may be gathered together and brought into a common focus” (Essay on Man 222). The functions of the human person are not merely aggregate, loosely-connected expressions and factual conditions. Philosophy seeks to understand the connections that unite those expressions and conditions as an organic whole.

4. Myth

Max Müller was the leading theorist of myth in Cassirer’s day. In the face of Anglophone linguistic analysis, Müller held myth to be the necessary means by which earlier people communicated, one which left a number of traces within more-developed contemporary languages. What is needed for the proper study of myth, beyond this appreciation of its utility, is a step-by-step un-riddling of the mythical objects in non-mythical concepts so as to rationally articulate what a myth really means. Sigmund Freud, of course, also considered myth to be a sort of unconscious expression, one that stands as a primitive version of the naturally occurring expression of subconscious drives.

Cassirer considers myth in terms of the Neo-Kantian reflex by first examining the conditions for thinking and then analyzing the objects which are thought. In his Sprache und Mythos (1925), which is a sort of condensed summary of the first two volumes of Philosophy of Symbolic Forms, Cassirer comes to criticize Müller, more so than Freud, for an unreflective realism about the objects of myth. To say that objects of any sort are what they are independent of their representation is to misunderstand the last century of transcendental epistemology. Accordingly, to treat myth as a false representation of those objects, one waiting to be “corrected” by a properly rational representation, is to ignore the wider range of human intellectual power. Naturalizing myths, as Müller and his followers sought to do, does not dissolve an object’s mythical mask so much as transplant it into the foreign soil of an alternative symbolic form:

From this point of view all artistic creation becomes a mere imitation, which must always fall short of the original. Not only simple imitation of a sensibly presented model, but also what is known as idealization, manner, or style, must finally succumb to this verdict; for measured by the naked ‘truth’ of the object to be depicted, idealization is nothing but subjective misconception and falsification. And it seems that all other processes of mental gestation involve the same sort of outrageous distortion, the same departure from objective reality and the immediate data of experience. (Language and Myth, trans. Langer [1946], 6)

Müller’s view of myth is a symptom of a wider problem. For if myth is akin to art or language in falsifying the world as it really is, then language is limited to merely expressing itself without any claim to truth either: “From this point it is but a single step to the conclusion which the modern skeptical critics of language have drawn: the complete dissolution of any alleged truth content of language, and the realization that this content is nothing but a sort of phantasmagoria of the spirit” (Language and Myth, trans. Langer [1946], 7). Cassirer rejects such fictionalism in myth and language both as an appeal to psychologistic measures of truth that fail to see a better alternative in the philosophy of symbolic forms.

For Cassirer, myth (and language, discussed below) does reflect reality: the reality of the subject. Accordingly, the study of myth must focus on the mental processes that create myth instead of the presupposed ‘real’ objects of myth:

Instead of measuring the content, meaning, and truth of intellectual forms by something extraneous which is supposed to be reproduced in them, we must find in these forms themselves the measure and criterion for their truth and intrinsic meaning. Instead of taking them as mere copies of something else, we must see in each of these spiritual forms a spontaneous law of generation; an original way and tendency of expression which is more than a mere record of something initially given in fixed categories of real existence. (Language and Myth, trans. Langer [1946], 8)

The mythic symbol creates its own “world” of meaning distinct from that created by language, mathematics, or science. The question is no longer whether mythic symbols, or any of these other symbolic forms, correspond to reality, since myth is distinct from that mode of representation. It is instead a question of how myths relate to those other forms as limitations and supplementations. No matter how heterogeneous and variegated are the myths that come down to us, they move along definite avenues of feeling and creative thought.

An example Cassirer uses to illustrate his understanding of myth-making is the Avesta myth of Mithra. Attempts to identify Mithra as the sun-god, and thereby analogize it to the sun-god of the Egyptians, Greeks, and other early people, are misguided insofar as they stem from the attempt to explain away the object of mythical thinking in naturalistic rational terms. Cassirer points out that the analogy doesn’t hold for strictly interpretive reasons: Mithra is said to appear on mountain tops before dawn and is said to illuminate the earth at night as well, and cannot be the mythical analog of the sun. Mithra is not a thing to be naturalized, but evidence of an alternate spiritual energy that fashions symbolic responses to experiential confusions. What Mithra specifically reflects is a mode of thinking as it struggles to make sense of how the qualities of light and darkness result from a single essential unity: the cosmos.

As historical epochs provide new and self-enclosed worlds of experience, so too does myth evolve in conjunction with the needs of the age as an expression of overlapping but quite distinct patterns of mental life. Myths are hardly just wild stories with a particular pragmatic lesson. There is a specific mode of perception that imbues mythic thinking with its power to transcend experience. Similar to Giambattista Vico’s vision of historical epochs, Cassirer views the development of culture out of myth as a narrative of progressively more abstract systems of representation that serve as the foundation for human culture. Like Vico, too, there is continuity between the most elevated systems of theoretical expression of modern day—namely, religion, philosophy, and above all natural science—and a more primitive mind’s reliance upon myth and magic. However, Cassirer shares more with Enlightenment optimism than with Vico’s pessimistic conviction about the progressive degeneracy of scientific abstraction.

5. Language

The first volume of The Philosophy of Symbolic Forms (1923), on language, is guided by the search for epistemological reasons sufficient to explain the origin and development of human speech. Language is neither a nominal nor arbitrary designation of objects, nor, however, does language hold any immediate or essential connection to the object of its designation. The use of a word to designate an object is already caught in a web of intersubjectively determined meanings which of themselves contain much more than the simple reference. Words are meaningful within experience, and that experience lies, as it did for Kant, as a sort of middle ground between the pure reception of objects and the autonomous activity of reason to generate forms within which content could be meaningful. In contrast to Kant and the Neo-Kantians, however, those forms cannot be presumed to be identical among all rational agents over the spans of history. Animal language is essentially a language of emotion, expressions of desires and aversions in response to environmental factors. Similarly, the earliest words uttered by our primitive ancestors were signs to deal with objects, every bit a tool alongside other tools to deal with the primitive’s sensed reality. As the human mind evolved to add spatio-temporal intuitions to mere sensation, a representational function overtook the mind’s merely expressive operations. The primitive vocalized report of received sensations became representations of enduring objects within fixed spatial points: “The difference between propositional language and emotional language is the real landmark between the human and the animal world. All the theories and observations concerning animal language are wide of the mark if they fail to recognize that fundamental difference” (Essay on Man 30). The features of those objects were further abstracted such that from commonalities there emerged a host of types, kinds, and eventually universals, whose meaning allowed for the emergence of mathematics, science, and philosophy.

The animal’s emotive signals operate as a practical imagination in a world of immediate experience. Proper human propositional speech, on the other hand, is already imbued at even its most basic levels with theoretical structures that involve quintessentially spatio-temporal forms linking subjects and their objects: “Language has a new task wherever such relationships are signified linguistically, where ‘here’ is distinguished from ‘there,’ where the location of the speaker is distinguished from the one spoken to, or where the greater nearness or distance is rendered by various indicative particles” (“The Problem of the Symbol and its Place in the System of Philosophy” in Luft [2015], 259). The application of dimensionality, and temporality as well, transforms the subjective sensation into an objective representation. Prepositions, participles, subjunctives, conditionals, and the rest, all involve either temporal or spatial prescriptions, and none of them seems to be a feature of animal space. The older animalistic content is not entirely discarded as the same basic desires and emotions are expressed. The means of that expression, however, are formally of an entirely different character that binds the subject to the object in ways supposed to be binding for other rational agents. Although the interjection “ouch!” expresses pain well enough, and although animals have variously similar yelps and cries, it lacks the representational form of the proposition “I (this one, here and now) am (presently) in pain.” In the uniquely human sphere of ethics, too, the reliance on subjunctive and conditional verbal forms (“I ought not to have done that,” for example) always carries language beyond simple evocations of pleasures and aversions into the symbolic realm of meaningfulness.

The Neo-Kantian position on language allows Cassirer to address two contemporary anomalies in linguistic science. The first is the famous case of Helen Keller, the unfortunate deafblind girl from Alabama, who, with the help of her teacher Anne Sullivan, went on to become a prolific author and social activist. Sullivan had taught Helen signs by using a series of taps on her hand to correspond to particular sense impressions. Beyond her disabled sensory capacities, Cassirer argued, Helen was unable to cognize in the characteristically human way. One day at a water pump, Sullivan tapped “water” and Helen recognized the disjunction between the various sensations of water (varying temperatures, viscosities, and degrees of pressure) and the “thing” which is universally referred to as such. That moment opened up for Helen an entire world of names, not as mere expressive signals covering various sensations but as intersubjectively valid objective symbols. This discovery marked her entry into a new, symbolic mode of thinking: “The child had to make a new and much more significant discovery. She had to understand that everything has a name – that the symbolic function is not restricted to particular cases but is a principle of universal applicability which encompasses the whole field of human thought” (Essay on Man 34f).

The second case is the pathology of aphasia. As with Helen Keller, what had long been thought a deficiency of the senses was revealed by Cassirer to be a cognitive failing. In the case of patients with traumatic injuries to certain areas of the brain, particular classes of speech act became impossible. The mechanical operation of producing the words was not the problem, but an inability to speak objectively about “unreal” conditions: “A patient who was suffering from a hemiplegia, from a paralysis of the right hand, could not, for instance, utter the words: ‘I can write with my right hand,’ because this was to him the statement of a fact, not of a hypothetical or unreal case” (Essay on Man 57). These types of aphasics were confined to the data provided by their sense impressions and therefore could not make the crucial symbolic move to theoretical possibility. For Cassirer, this was good evidence that language was neither mere emotional expression nor free-floating propositional content that could be analyzed logically only a posteriori.

In addition to these cases of abnormal speech pathology, Cassirer’s attention to the evolution of language enabled him to take a much wider view of both the form of utterance and its content than his more famous counterparts among the linguistic analysts. In Carnap’s Logical Syntax of Language, for example, the attempt is made to reduce semantic rules to syntax. The expected outcome was a philosophical grammar, a sound and complete system of words in the sort of logical relation that would be universally valid. For Cassirer, however, “human speech has to fulfill not only a universal logical task but also a social task which depends on the specific social conditions of the speaking community. Hence we cannot expect a real identity, a one-to-one correspondence between grammatical and logical forms” (Essay on Man 128). Contrary to the early analytical school, language cannot be considered a given thing waiting to be assessed according to independent logical categories, but instead needs to be assessed according to the a priori application of those categories to verbal expressions. Accordingly, the task of the philosopher of language must be refocused to account for the diversity and creativity of linguistic dynamics in order to better encapsulate the human rational agent in the fullest possible range of his or her powers.

6. Science

Cassirer was perhaps the last systematic philosopher to have both exhaustive knowledge of the historical development of each of the individual sciences as well as thorough familiarity with his day’s most important advancements. Substance and Function (1910) could still serve as a primer for the history of major scientific concepts prior to the twentieth century. The first part examines the concepts of number, space, and a vast array of special problems such as Emil du Bois-Reymond’s “limiting concepts”; Robert Mayer’s methodological advancements in thermodynamics; the spatial continuities of atoms in the physics of Roger Boscovich and Gustav Fechner; Galileo’s concept of inertia; Heinrich Hertz’s mechanics; and John Dalton’s law of multiple proportions. Each of these is examined with a view toward the epistemological presuppositions that gave rise to those problems and how each scientist’s innovations represented a novel way of posing problems through an application of spatio-temporal concepts.

This historical survey allows Cassirer to offer his own contributions to these problems along recognizably Neo-Kantian lines in the second part of Substance and Function. Science cannot be considered a collection of empirical facts. Science discovers no absolute qualities but only qualities in relation to other qualities within a particular field, such as the concept of mass as the sum of relations with respect to external impulses in motion, or energy as the momentary condition of a given physical system. Concrete sensuous impressions are only transformed into empirical objects by the determination of spatial and temporal form. The properties of objects, in bringing them into meaningful discourse by means of measurement, are thus mathematized as a field of relations: “The chaos of impressions becomes a system of numbers; but these numbers first gain their denomination, and thus their specific meaning, from the system of concepts which are theoretically established as universal standards of measurement” (Substance and Function 149). Objects as they stand outside possible experience are not the proper subject matter of science, any more than they are for mathematics. Proper science examines the logical connections among the spatio-temporal relationships of objects precisely as they are constituted by experience.

Abandoning the particular sensuous properties of objects for their logical relations as members of a system refocuses the scientific inquiry on how the natural world is symbolized by mathematical logic. Science becomes anthropomorphized insofar as whatever content is available to experience will be content that the human being spontaneously and creatively renders meaningful: “No content of experience can ever appear as something absolutely strange; for even in making it a content of our thought, in setting it in spatial and temporal relations with other contents we have thereby impressed it with the seal of our universal concepts of connection, in particular those of mathematical relations” (Substance and Function 150). However, this in no way reduces science to mere relativism of personal inner projections, as if one way of representing the world were no better than any other. Though we do not know objects independent of mental representation, scientific understanding functions objectively by fixing the permanent logical elements and their connections within a uniform manifold of experience: “The object marks the logical possession of knowledge, and not a dark beyond forever removed from knowledge” (Substance and Function 303). Thus, science is absolutely tied to empirical reality, by which Cassirer means the sum of logical relations through which humans cognize the world. Therefore science, too, as much as language or myth, symbolically constitutes the world in its particular idiom: “The symbol possesses its adequate correlate in the connection according to law, that subsists between the individual members, and not in any constitutive part of the perception; yet it is this connection that gradually reveals itself to be the real kernel of the thought of empirical ‘reality’” (Substance and Function 149).

This Neo-Kantian vision of science is not something Cassirer thinks stands to “correct” science as currently practiced. On the contrary, the great modern scientists themselves have assumed precisely the same view, though in terms lacking the proper philosophical rigor. Newton’s assumption of absolute space and time put science on its first firm foundation, and in doing so he had to relinquish a purely sense-certain view of experience. Space and time in classical physics fix natural processes within a geometric schema, and fix mass as a self-identical thing within infinitely different spaces and different times. What Newton failed to realize was that this vision of space and time imputed ideal forms into what he believed was the straightforward observation of real objects. Kant had already shown as much. James Clerk Maxwell’s theory of light waves breaks with this system of transcribing observational circumstances with mathematical equations that associate spatial positions with affair-states. Maxwell’s spatial point simultaneously has two correlate directional quantities: the magnetic and electrical vectors, whose representations in mathematics are readily cognizable but whose observation as such is impossible. The theory of Maxwell was therefore functionally meaningful without requiring a substantial ontology behind it. The definitive theory of light he discovered was not about a permanent thing situated within space and time but a set of interrelated magnitudes that could be functionally represented as a universal constant.

Hermann Ludwig von Helmholtz was among the first natural scientists to properly acknowledge the difference between observational descriptions of reality and symbolic theoretical constructions of it. As Cassirer quotes Helmholtz:

[I]n investigating [phenomena] we must proceed on the supposition that they are comprehensible. Accordingly, the law of sufficient reason is really nothing more than the urge of our intellect to bring all our perceptions under its own control. It is not a law of nature. Our intellect is the faculty of forming general conceptions. It has nothing to do with our sense-perceptions and experiences unless it is able to form general conceptions or laws. (Essay on Man 220)

The alleged sensory manifold held so dear in naively realist science gave way before Helmholtz’s demonstration that such is an ideally defined totality according to the rule which distinguishes properties on the basis of numerical series. That ideal unit is, for Helmholtz, the “symbol,” which cannot be considered a “copy” of a non-signifying object-in-itself (for how could that be conceived?) but the functional correspondence between two or more conceptual structures. Thus what is discovered by Helmholtzian science are the laws of interrelation among phenomena, the laws which are the very condition of our experiencing something as an object in the first place.

To Helmholtz’s experimental demonstration, Cassirer is able to add the relational but still universal nature of scientific designation; that is, the crucial differentiation between substance-concepts and function-concepts:

For laws are never mere compendia of perceptible facts, in which the individual phenomena are merely placed end to end as on a string. Rather every law, as compared to immediate perception, comprises a […] transition to a new perspective. This can occur only when we replace the concrete data provided by experience with symbolic representations, which on the basis of certain theoretical presuppositions that the observer accepts as true and valid are thought to correspond to them. (The Philosophy of Symbolic Forms III, 21)

Accordingly, the truth of science does not depend upon an accurate conceptualization of substances so much as it does on demonstrating the limits of conceptual thinking about those substances, that is, their symbolic functions.

The scientist cannot attain his end without strict obedience to the facts of nature. But this obedience is not passive submission. The work of all the great natural scientists – of Galileo and Newton, of Maxwell and Helmholtz, of Planck and Einstein—was not mere fact collecting; it was theoretical, and that means constructive, work. This spontaneity and productivity is the very center of all human activities. It is man’s highest power and it designates at the same time the natural boundary of our human world. In language, in religion, in art, in science, man can do no more than to build up his own universe – a symbolic universe that enables him to understand and interpret, to articulate and organize, to synthesize and universalize his human experience. (Essay on Man 221)

Cassirer’s essay Zur Einsteinschen Relativitätstheorie (1921) was his last major thematic enterprise before the first volume of The Philosophy of Symbolic Forms. In it he sees himself following Cohen’s task of updating Kant’s philosophical groundwork for science. Kant had taken for granted that the forms of science in his own day represented scientific thinking as such. His epistemological groundwork accordingly needed to support Newtonian physics. After Kant’s death, science leapt past the limits set by Newton just as mathematics pushed the limits of Euclidean three-dimensional geometry. Einstein’s theories of relativity effectively dismantled the authority of both; the fact that they did proved to Cassirer the non-absolute status of scientific symbolization as a doctrine about objects. An elucidation of the epistemological conditions that could allow for Einstein’s relativity was now necessary.

Cassirer replaced Kant’s static formalism with his attention to the varied and alterable features of mathematical science that could accommodate radical new forms of mathematical logic and, by extension, systems of natural science. Pure Euclidean geometry was so influential because it dealt concretely and intuitively with real things as uniform and absolute substances. And it still works with most material applications. When non-Euclidean geometry came to the fore with Gauss, Riemann, and Christoffel, it was considered a mere play of analytical concepts that held some logical curiosity but no applicability. Over time a gradual shift ensued from the widening of the concept of experience to include non-uniform concepts of space.

Pure Euclidean space stands, as it now seems, not closer to the demands of empirical and physical knowledge than the non-Euclidean manifolds but rather more removed. For precisely because it represents the logically simplest form of spatial construction it is not wholly adequate to the complexity of content and the material determinateness of the empirical. Its [i.e., the Euclidean] fundamental property of homogeneity, its axiom of the equivalence in the principal of all points, now marks it as an abstract space; for, in the concrete and empirical manifold, there never is such uniformity, but rather thorough-going differentiation reigns in it. (“Euclidean and non-Euclidean Geometry,” in Luft [2015], 243)

It is thus not the case, as traditionally thought, that the new physical sciences simply adopted a more abstract vision of mathematics as their basis. Their physics represents a more widely encompassing symbolic representation that expresses a new mode of experience, one less concerned with the sense impressions of real objects than with the reality of their logical relations.

Einstein needed a geometry of curvature that varied according to the relation of mass and energy in order for general relativity to work, but this of itself does not mean Euclidean geometry was or even could be proven wrong by Minkowski space-time. In the terminology of symbolic forms, Cassirer thinks Einstein’s relativity has transcended the symbolic forms of natural objects with those of pure mathematical relations. The result is the fracture of non-commensurable ways of analyzing one and the same “substance”: physically, chemically, mathematically, and so forth. Those forms ought not to be reduced to a single “meta” method that levels their differences as merely partial views. Each ought to be retained as equally valid parts of the total determination of the object. Thus Einstein was right to abandon absolute Newtonian space-time for relative Minkowski space-time. But his reason for doing so did not concern the former’s falsity. In place of a single absolutist description, the new relativism embraced an epistemology that featured a wider variety of equally valid modes of thinking about one and the same object. Objects, in Cassirer’s idiom, are relative to the symbolic form under which they are expressed.

The One reality can only be disclosed and defined as the ideal limit of diversely changing theories; but the setting of this limit itself is not arbitrary; it is inescapable, since the continuity of experience is established only thereby. No particular astronomical system, the Copernican no more than the Ptolemaic…may be taken as an expression of the ‘true’ cosmic order, but only the whole of these systems as they continuously unfold in accordance with a certain context. …We do not need the objectivity of absolute things, but we do require the objective determinacy of the way of experience itself. (Philosophy of Symbolic Forms III, 476)

Cassirer’s view of the evolution of science may be compared with Thomas Kuhn’s view insofar as both reject a single consistent progress toward absolute truth. Cassirer’s symbolic forms echo in Kuhn’s paradigms as incommensurable frameworks of meaning that stand in discomfited relationships with one another. But where Kuhn sees the conditions for shifted paradigms in the quasi-sociological language of the community crises brought about by insoluble intra-paradigm problems, Cassirer sees a more epistemological metamorphosis in the evolution and expansion of human thinking. More than just a professional and social shift away from Pythagoras or Galileo to Einstein or Planck, Cassirer thinks rational agency matures to embrace more variegated, more useful, and more precise symbols. This evolution does not bring the rational agent closer to the truth of objects, but it does bring more useful and exacting means by which to think about those objects. Insofar as science, more so than myth or language, cultivates that progression through its activity, it presents, for Cassirer, the prospect to carry human nature to the very highest cultural achievements possible: “Science is the last step in man’s mental development and it may be regarded as the highest and most characteristic attainment of human culture” (Essay on Man 207).

7. Political Philosophy

Cassirer’s political philosophy has its roots in Renaissance humanism and the classics of Modern thought: Machiavelli, Rousseau, Kant, Goethe, and Humboldt. Ever concerned with a subject’s connection to the wider sphere of cultural life, Cassirer noted that the Ancient, Medieval, and Renaissance conceptions of politics were framed within a holistic worldview. In Modern times, a holistic order still obtained, but after Machiavelli, this order is based upon interpersonal relationships rather than the divine or the natural. These social and political relationships are, like symbolic forms, neither entirely objective nor entirely subjective. They represent the construction of ourselves in the framework of our ideal comprehensive social life.

Man’s social consciousness depends upon a double act of identification and discrimination. Man cannot find himself, he cannot become aware of his individuality, except through the medium of his social life. […] Man, like the animals, submits to the rules of society but, in addition, he has an active share in bringing about, and an active power to change, the forms of social life. (Essay on Man 223)

As it did for Kant, human dignity derives from the capacity of rational agents to pose and constrain themselves by normative laws of their own making. Cassirer stresses against Marx and Heidegger, respectively, that it is neither the material nor the ontological conditions that man is born or thrown into that determine political order or social value. Rather, the active processes by which human persons create laws, social institutions, and norms for themselves are paramount in determining the place of the human being in society. Politics is not simply the study of the relations between social institutions, as Marx and his sociological disciples believed, but of their meaningful construction within the symbolic forms of myth-making, art, poetry, religion, and science.

Human culture taken as a whole may be described as the process of man’s progressive self-liberation. Language, art, religion, science, are various phases in this process. In all of them man discovers and proves a new power – the power to build up a world of his own, an ‘ideal’ world. Philosophy cannot give up its search for a fundamental unity in this ideal world. (Essay on Man 228)

The opponent in Cassirer’s last work, The Myth of the State, is Heidegger and the kind of twentieth-century totalitarian mythologies of “crisis” by which he and so much of Germany were then entranced. Even if he did stand mostly alone, Cassirer stood firmly against the myth of Aryan supremacy, the myth of the eternal Jew, and the myth of Socialist utopia. He did not oppose the creative acts that gave rise to these myths but the unthinking allegiance they demanded of their acolytes. In so doing, Cassirer felt Germany, and not just Germany, had abandoned its heritage of classical liberalism, its tradition of laws, and its belief in the rational progress of both science and religion for a worldview based in power and struggles for personal gain masquerading as equality. With obvious reference toward Heidegger and the National Socialists, Cassirer laments:

Perhaps the most important and the most alarming feature in this development of modern political thought is the appearance of a new power: the power of mythical thought. The preponderance of mythical thought over rational thought in some of our modern systems is obvious. (Myth of the State, 3)

Cassirer’s focus in Myth of the State is mostly not, however, the contemporary state of European politics. In fact, only in the last chapter is the word Nazi mentioned. The great majority is caught up instead with history, almost jarringly so given the immediate crisis and Cassirer’s personal place in it. He has far more to say about medieval theories of grace, Plato’s Republic, and Hegel than he does about the rise of Hitler or the War. Of the First World War, Cassirer’s wife Toni would later write in her biography, Mein Leben mit Ernst Cassirer, that despite some limited clerical duties on behalf of Germany, their major wartime concerns were whether there was sufficient electricity to write and whether the train tickets were first class (Toni Cassirer, 1948, 116-20): “We weren’t politicians, and didn’t even know any politicians” (Ibid., 117). And that aloofness stayed with Cassirer until the end. Charles W. Hendel, who was responsible for Cassirer’s appointment at Yale and who later became the posthumous editor of Myth of the State, illustrates how frustrating Cassirer’s silence on contemporary political matters was: “Won’t you tell us the meaning of what is happening today, instead of writing about past history, science, and culture? You have so much knowledge and wisdom – we who are working with you know that so well – but you should give others, too, the benefit of it” (Myth of the State x). In the early twenty-first century, Edward Skidelsky decried Cassirer’s reluctance to speak about contemporary politics as a symptom of a greater philosophical shortcoming:

“[Cassirer’s] is an enchanting vision. But it is also a fundamentally innocent one. Liberalism may have triumphed in the political sphere, but it was the illiberal philosophy of Heidegger that won the day at Davos and went on to leave the deepest stamp on twentieth-century culture. Who now shares Cassirer’s faith in the humanizing power of art or the liberating power of science? Who now believes that the truth will make us free?” (Skidelsky 2008, 222)

8. The Davos Conference

The historical event for which Cassirer is best known is the famous conference held in Davos, Switzerland in 1929. Planned as a symposium to bring together French- and German-speaking academics in a spirit of international collaboration, the conference was set in the resort town made famous by Thomas Mann’s epic The Magic Mountain (1924). Counting nearly 1,300 attendees, more than 900 of whom were the town’s residents, the conference featured 56 lectures delivered over the span of three weeks. Among those in attendance were contemporary heavyweights like Fritz Heinemann and Karl Joël, and rising stars like Emmanuel Lévinas, Joachim Ritter, Maurice de Gandillac, Ludwig Binswanger, and a young Rudolf Carnap. The centerpiece of the conference was to have been the showdown between the two most important philosophers in Germany: Cassirer and Heidegger. Curiously, there never was a disputation proper, in the sense of an official point-by-point debate, in part because neither man was up for it: Cassirer was bed-ridden by illness and Heidegger was less interested in attending lectures than in the resort town’s recreational activities. As a characteristic expression of his disdain toward stuffy academic conferences, Heidegger even gave one of his own talks while wearing his ski-suit.

Cassirer was the student and heir of Hermann Cohen, the unchallenged leader of Marburg Neo-Kantianism. Heidegger was the most brilliant student of the Southwest Neo-Kantian Heinrich Rickert, but was recommended to the chair of Marburg by none other than Marburger Paul Natorp. On at least three separate occasions, Cassirer and Heidegger were considered for the same academic post, as successor to Husserl, then to Rickert, and finally for the leading position in Berlin in 1930 (Gordon, 2010, 40). Cassirer and Heidegger were thus the two greatest living thinkers in the tradition of Kantian philosophy, and were invited to Davos to defend their rival interpretations on the question of whether an ontology could be derived from Kant’s epistemology. Their positions were contradictory in clear ways: Cassirer held the Marburg line that Kant’s entire project required that the thing-in-itself be jettisoned for a transcendental analysis of the forms of knowing. Heidegger wanted to recast not only Kant but philosophy itself as a fundamental investigation into the meaning of Being, and by specific extension, the human way of Being: Dasein. The debate about the proper interpretation of Kant went nearly nowhere, and Heidegger’s interpretation had more to do with Heidegger than with Kant. Cassirer, the co-editor of the critical edition of Kant’s works and the author of a superb intellectual biography, was no doubt the superior exegete. Nevertheless, Heidegger was doubtless the more captivating and original philosopher.

Beyond their divergent interpretations of Kant, the debate brought to the fore two competing intellectual forces that were at genuine odds: Cassirer’s Neo-Kantian maintenance of the spontaneous mental freedom requisite for the production of symbolic forms was pitted against Heidegger’s existential-phenomenological concentration on the irrevocable “thrownness” of human beings into a world of which the common denominator was their realization of death. Cassirer thought Heidegger vastly overstated Dasein’s thrownness and understated its spontaneity, and that his subjectivism discounted the objectivity of the sciences and of moral laws. Also, if both the character of rationality and the inviolable value of the human person lie in a subject’s spontaneous use of theoretical and practical forms of reasoning, then the danger was clear: Heidegger’s Dasein had one foot in irrationality and the other in nihilism.

The historical significance of the Davos Conference thus lay, ironically, in its symbolic meaning. Primed by the cultural clash between humanism and iconoclasm represented by Thomas Mann’s characters Settembrini and Naphta, the participants in Davos expected the same battle between the stodgy old Enlightenment Cassirer and the exciting, young, radical Heidegger. No doubt some in the audience fancied themselves a Hans Castorp, whose soul, and the very fate of Europe, was caught in the tug of war between Settembrini/Cassirer’s liberal rationalism and Naphta/Heidegger’s conservative mysticism. (Though, to be sure, Mann’s model for Naphta was György Lukács and not Heidegger.) In the Weimar Republic’s “Age of Crisis,” it was not so much what either man said, but what each symbolized that mattered. As Rudolf Carnap wrote in his journal, “Cassirer speaks well, but somewhat pastorally. […] Heidegger is serious and objective, as a person very attractive” (Friedman, 2000, 7). In a subsequent satirical reenactment, a young Emmanuel Lévinas mocked Cassirer by performing in buffo what he took to be the salient point of his lectures at Davos: “Humboldt, culture, Humboldt, culture” (Skidelsky, 2008, 1). Indeed what Cassirer defended was then subject to parody among the young. Cassirer was the last of the great polymaths like Goethe, the last comprehensive historian like Ranke, the last optimist like Humboldt, and the last of the Neo-Kantian academic establishment. Heidegger represented the revolution of a new German nation, one that would sweep away the old ways of philosophy as much as Hitler would sweep away Wilhelmine politics. Heidegger welcomed crisis as the condition for new growth and invention; Cassirer saw in crisis the collapse of a culture that took so long to achieve. Cassirer was the great scholar. Heidegger was the great philosopher. Cassirer clung to rational optimism and humanist culture while Heidegger championed existential fatalism. In 1929, the Zeitgeist clearly favored the latter.

The consequences of Davos, like the meaning of the conference itself, operated on two levels. On the level of the factual, Cassirer and Heidegger would maintain a somewhat detached respect for the other, with mutually critical yet professionally cordial responses in print over the years to come. Neither man came to change either his interpretation of Kant or his philosophy generally in any major way due to the conference. Symbolically, however, Davos was a disaster for Cassirer and for Neo-Kantianism. Europe was immediately swept up in increasingly violent waves of nationalism. Within months of Hitler’s appointment as Chancellor in 1933, Jews were banned from teaching in state schools. The Night of the Long Knives happened five years after Davos, and then the Night of Broken Glass four years after that. Neo-Kantian philosophers, especially the followers and friends of Hermann Cohen, were mainly Jewish. Cassirer fled to England and then Sweden in 1933 in fear of the Nazis, even while Heidegger was made Rektor at Freiburg. The Wilhelmine era’s enlightened cultural humanism, and its last defender, had clearly lost.

9. References and Further Reading

What follows is a list of Cassirer’s major works. For an exhaustive bibliography, see http://www1.uni-hamburg.de/cassirer/bib/bibgr.htm. For the contents of Cassirer’s archive at Yale, see http://www1.uni-hamburg.de/cassirer/bib/yale1.htm.

a. Cassirer’s Major Works

(1899) Descartes: Kritik der mathematischen und naturwissenschaftlichen Erkenntnis (dissertation at Marburg).
(1902) Leibniz’ System in seinen wissenschaftlichen Grundlagen. Marburg: Elwert.
(1906) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Erster Band. Berlin: Bruno Cassirer.
(1907) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Zweiter Band. Berlin: Bruno Cassirer.
(1910) Substanzbegriff und Funktionsbegriff: Untersuchungen über die Grundfragen der Erkenntniskritik. Berlin: Bruno Cassirer.
(1916) Freiheit und Form: Studien zur deutschen Geistesgeschichte. Berlin: Bruno Cassirer.
(1921) Zur Einsteinschen Relativitätstheorie. Erkenntnistheoretische Betrachtungen. Berlin: Bruno Cassirer.
(1923) Philosophie der symbolischen Formen. Erster Teil: Die Sprache. Berlin: Bruno Cassirer.
(1925) Philosophie der symbolischen Formen. Zweiter Teil: Das mythische Denken. Berlin: Bruno Cassirer.
(1925) Sprache und Mythos: Ein Beitrag zum Problem der Götternamen. Leipzig: Teubner.
(1927) Individuum und Kosmos in der Philosophie der Renaissance. Leipzig: Teubner.
(1929) Die Idee der republikanischen Verfassung. Hamburg: Friedrichsen.
(1929) Philosophie der symbolischen Formen. Dritter Teil: Phänomenologie der Erkenntnis. Berlin: Bruno Cassirer.
(1932) Die Platonische Renaissance in England und die Schule von Cambridge. Leipzig: Teubner.
(1932) Die Philosophie der Aufklärung. Tübingen: J.C.B. Mohr.
(1936) Determinismus und Indeterminismus in der modernen Physik. Göteborg: Göteborgs Högskolas Årsskrift.
(1939) Axel Hägerström: Eine Studie zur Schwedischen Philosophie der Gegenwart. Göteborg: Högskolas Årsskrift.
(1939) Descartes: Lehre, Persönlichkeit, Wirkung. Stockholm: Bermann-Fischer Verlag.
(1942) Zur Logik der Kulturwissenschaften. Göteborg: Högskolas Årsskrift.
(1944) An Essay on Man. New Haven: Yale University Press.
(1945) Rousseau, Kant, Goethe: Two Essays. New York: Harper & Row.
(1946) The Myth of the State. New Haven: Yale University Press.

b. Further Reading

Barash, Jeffrey Andrew (2008), The Symbolic Construction of Reality: The Legacy of Ernst Cassirer. Chicago: University of Chicago Press.
Bayer, Thora Ilin (2001), Cassirer’s Metaphysics of Symbolic Forms: A Philosophical Commentary. New Haven: Yale University Press.
Braun, H.J., Holzhey, H., & Orth, E.W. (eds.) (1998), Über Ernst Cassirers Philosophie der symbolischen Formen. Frankfurt: Suhrkamp.
Cassirer, Toni (1948), Mein Leben mit Ernst Cassirer. Hildesheim: Gerstenberg.
Friedman, Michael (2000), A Parting of the Ways: Carnap, Cassirer, and Heidegger. Peru, IL: Open Court.
Gaubert, Joël (1996), La science politique d’Ernst Cassirer: pour une réfondation symbolique de la raison pratique contre le mythe politique contemporain. Paris: Éd. Kimé.
Gordon, Peter E. (2010), Continental Divide: Heidegger, Cassirer, Davos. Cambridge: Harvard University Press.
Hamlin, C., & Krois, J.M. (eds.) (2004), Symbolic Forms and Cultural Studies: Ernst Cassirer’s Theory of Culture. New Haven: Yale University Press.
Hanson, J. & Nordin, S. (2006), Ernst Cassirer: The Swedish Years. Bern: Peter Lang.
Heidegger, Martin (1928), “Ernst Cassirer: Philosophie der symbolischen Formen. 2. Teil: Das mythische Denken.” Deutsche Literaturzeitung, 21: 1000–1012.
Itzkoff, Seymour (1971), Ernst Cassirer: Scientific Knowledge and the Concept of Man. South Bend, IN: Notre Dame University Press.
Krois, John Michael (1987), Cassirer: Symbolic Forms and History. New Haven: Yale University Press.
Langer, Susanne (1942), Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, Mass.: Harvard University Press.
Lipton, D. R. (1978), Ernst Cassirer: The Dilemma of a Liberal Intellectual in Germany, 1914-1933. Toronto: University of Toronto Press.
Lofts, S.G. (2000), Cassirer: A “Repetition” of Modernity. Albany: SUNY Press.
Lübbe, Hermann (1975), Cassirer und die Mythen des zwanzigsten Jahrhunderts. Göttingen: Vandenhoeck & Ruprecht.
Luft, Sebastian (ed.) (2015), The Neo-Kantian Reader. New York: Routledge.
Paetzold, Heinz (1995), Ernst Cassirer — Von Marburg nach New York: eine philosophische Biographie. Darmstadt: Wissenschaftliche Buchgesellschaft.
Renz, Ursula (2002), Die Rationalität der Kultur: Zur Kulturphilosophie und ihrer transzendentalen Begründung bei Cohen, Natorp und Cassirer. Hamburg: Felix Meiner.
Rudolph, Enno (ed.) (1999), Cassirers Weg zur Philosophie der Politik. Hamburg: Felix Meiner.
Schilpp, Paul Arthur (ed.) (1949), The Philosophy of Ernst Cassirer. Evanston: Library of Living Philosophers.
Schultz, William (2000), Cassirer and Langer on Myth: An Introduction. New York: Routledge.
Schwemmer, Oswald (1997), Ernst Cassirer. Ein Philosoph der europäischen Moderne. Berlin: Akademie Verlag.
Skidelsky, Edward (2008), Ernst Cassirer: The Last Philosopher of Culture. Princeton: Princeton University Press.
Tomberg, Markus (1996), Der Begriff von Mythos und Wissenschaft bei Ernst Cassirer und Kurt Hübner. Münster: LIT Verlag.
Verene, Donald Phillip (ed.) (1979), Symbol, Myth, and Culture: Essays and Lectures of Ernst Cassirer, 1935-1945. New Haven: Yale University Press.
Verene, Donald Phillip (2011), The Origins of the Philosophy of Symbolic Forms: Kant, Hegel, and Cassirer. Evanston, IL: Northwestern University Press.

Author Information

Anthony K. Jensen
Email: Anthony.Jensen@providence.edu
Providence College
U. S. A.

Legal Hermeneutics

The question of how best to determine the meaning of a given text (legal or otherwise) has always been the chief concern of the general field of inquiry known as hermeneutics. Legal hermeneutics is rooted in philosophical hermeneutics and takes as its subject matter the nature of legal meaning. Legal hermeneutics asks the following sorts of questions: How do we come to decide what a given law means? Who makes that decision? What are the criteria for making that decision? What should be the criteria? Are the criteria that we use for deciding what a given law means good criteria? Are they necessary criteria? Are they sufficient? In whose service do our interpretive criteria operate? How were these criteria chosen and by whom? Within what sociopolitical, sociocultural, and sociohistorical contexts were these criteria generated? Are the criteria we have used in the past to ascertain the meaning of a given law the criteria we should still use today? Why or why not? What personal or political goals do the meanings of laws serve? How can we come up with better meanings of laws? On what basis can one meaning of a given law be justifiably prioritized over another? Through an interrogation into these meta-interpretive questions, legal hermeneutics serves the critical role of helping the interpreter of laws reach a higher level of self-reflexivity about the interpretive process. From a legal hermeneutical point of view, it is primarily through this heightened transparency about the process of interpretation that better meaning assessments are generated.

Some distinctive features of legal hermeneutics are (1) it is rooted in philosophical hermeneutics; (2) within the schema of mainstream philosophies of law, it is most closely conceptually related to legal interpretivism; (3) it shares an antifoundationalist sensibility with many alternative theories of law; and (4) within jurisprudence proper (legal theory), its primary substantive focus is on the debate in constitutional theory between the interpretive methods of originalism and non-originalism.

Roots: Philosophical Hermeneutics
Legal Hermeneutics and Mainstream Philosophy of Law
Legal Hermeneutics and Alternative Theories of Law
Legal Hermeneutics in Jurisprudence Proper
Conclusion
References and Further Reading
1. References
2. Further Reading

1. Roots: Philosophical Hermeneutics

The term hermeneutics can be traced back at least as far as Ancient Greece. David Hoy traces the origin of the term to the Greek god Hermes, who was, among other things, the inventor of language and an interpreter between the gods and humanity. In addition, the Greek term ἑρμηνεύω, or hermeneutice, is central to Aristotle’s On Interpretation (Περὶ Ἑρμηνείας), which concerns the relationship between language, logic, and meaning.

Hermeneutical approaches to meaning are thematized and utilized in many academic disciplines: archaeology, architecture, environmental studies, international relations, political theory, psychology, religion, and sociology. Specifically philosophical hermeneutics is unique in that rather than taking a particular approach to meaning, it is concerned with the nature of meaning, understanding, or interpretation.

Legal hermeneutics is rooted in philosophical hermeneutics, which asks not only the question of how best to interpret a given text, but also the deeper question of what it means to interpret a text at all. In other words, philosophical hermeneutics takes as its object of inquiry the interpretive process itself and seeks interpretive practices designed to respect that process (Dostal 2002; Malpas 2014; Wachterhauser 1994). Philosophical hermeneutics, then, can be alternately described as the philosophy of interpretation, the philosophy of understanding, or the philosophy of meaning. The central problem of philosophical hermeneutics is how to successfully ascertain anything on the order of an objective interpretation, understanding, or meaning in light of the apparent fact that all meaning is ascertained through the filter of at least one interpreter’s subjectivity (Bleicher 1980: 1). Philosophical hermeneutics seeks transparency in the interpretive process en route to better meaning determinations. On this view, better theories of interpretation (1) capture the key features of the interpretive process, (2) recognize each act of understanding as an interpretation, and (3) are able to distinguish between more and less legitimate or objective interpretations, understandings, or meanings.

Philosophical hermeneutics has its theoretical origins in the work of the 19th-century German philologist Friedrich Ast. Ast’s Basic Elements of Grammar, Hermeneutics, and Criticism (Grundlinien der Grammatik, Hermeneutik und Kritik) of 1808 contains an early articulation of the main components of what later became known as the hermeneutic circle. Ast wrote that the basic principle of all understanding was a cyclical process of coming to understand the parts through the whole and the whole through the parts. This basic principle derived, for Ast, from the “original unity of all being” (Ast 1808: Section 72) or what Ast called spirit or Geist. (Ast’s Geist is commonly understood to have been derived from Herder’s concept of Volksgeist.)

To understand a text, for Ast, was to determine its inner meaning or spirit, its own internal development, through a circularity of reason, a dialectical relation between the parts of a given work and the whole (Ast 1808: Section 76). What Ast called the hermeneutic of the spirit involved, in turn, the development of an understanding of the spirit of the writer and her era and an attempt to identify the one idea, or Grundidee, that unified a given text and that provided clarification regarding the relationship of the whole to the parts and the parts to the whole. In this process, for Ast, it was incumbent upon the interpreter to always remain cognizant of the historical period in which the text was situated.

Friedrich August Wolf was a contemporary of Ast’s and a fellow philologist. His Lecture on the Encyclopedia of Classical Studies (Vorlesung über die Enzyklopädie der Altertumswissenschaft) of 1831 defined hermeneutics as the science of the rules by which the meaning of signs is determined. These rules pointed, for Wolf, to a knowledge of human nature. Both historical and linguistic facts have a proper role in the interpretive process, according to Wolf, and help us to understand the organic whole that is the text. For Wolf, however, the primary task of hermeneutics was not the identification of the Grundidee or focal point of the text à la Ast, but the much more practical goal of the achievement of a high level of communication or dialogue between the interpreter of the text and the author, as well as between the interpreter and those to whom the text is to be explained.

Although aspects of the hermeneutics of both Ast and Wolf have survived into contemporary philosophical hermeneutics, the hermeneutics of both are generally understood to be concerned with what later became known as regional hermeneutics, or hermeneutics applicable to specific fields of study. Friedrich D. E. Schleiermacher, by contrast, was the first to define hermeneutics as the art of understanding itself, irrespective of field of study (Palmer 1969: 84). Underlying and grounding the specific rules of interpretation of the various fields of study, for Schleiermacher, was a unity grounded in the fact that all interpretation takes place in language (Ibid.). Schleiermacher thought that a general, rather than regional, hermeneutics was possible and that such a general hermeneutics would consist of the principles for the understanding of language (Ibid.). Specifically, for Schleiermacher, proper interpretation, or understanding, was not merely a function of grasping the thoughts of the author, but of coming to grips with the extent to which the language in which the thoughts took place affected, constrained, and informed those thoughts. Schleiermacher, then, is calling our philosophical attention to the fact that when we say we understand something, we are essentially just comparing it to something we already know, most basically a given language. Here, to understand is to place something within a pre-existing context of intelligibility. Understanding, for Schleiermacher, was therefore decidedly circular, but for him this did not amount to the conclusion that understanding was impossible. Instead, circularity is how understanding is defined. Understanding necessarily and structurally entails that the text and the interpreter share the same language and the same context of intelligibility.

Wilhelm Dilthey continued Schleiermacher’s pursuit of understanding qua understanding, but he sought to do so within the specific context of what he called the human sciences, or the Geisteswissenschaften (Dilthey 1883). The methods of scientific knowledge, for Dilthey, were too “reductionist and mechanistic” to capture the fullness of human-created phenomena (Palmer 1969: 100). The human sciences, or humanities, required instead two particular processes: (1) the development of an appreciation for the role of historical consciousness in our conceptions of meaning, and (2) a recognition that human-created phenomena are generated from “life itself” rather than through theory or concepts (Ibid.). In contemporary hermeneutic theory, the first process is often referred to as the historicality, or Geschichtlichkeit, of meaning and the second as life-philosophy, or Lebensphilosophie, the phenomenological view that meaning can only be generated and understanding can only be had through lived experience (Erlebnis) and not through the examination of concepts, theories, or other purely idealistic or rational methods (Nenon 1995).

While Dilthey observed that the categorical methods of understanding useful in science were inappropriate for use in the human sciences, Martin Heidegger switched the entire hermeneutic enterprise from an epistemological focus to an ontological one. This switch is customarily referred to as the ontological turn in hermeneutics (Kisiel 1993; Tugendhat 1992). For Heidegger, in his classic, Being and Time (1962/2008), the question of the nature of understanding, or Verstehen, could only be answered by first answering the question of the nature of what it means to be. Accordingly, Heidegger set out in Being and Time to discover the nature of being qua being. To do so Heidegger went to the things themselves, or die Sachen selbst, in keeping with the phenomenological methodology he learned from his teacher, Edmund Husserl. Heidegger called his phenomenological inquiry into the nature of being qua being fundamental ontology. He also called it hermeneutic ontology, which highlights that, for Heidegger, being and interpretation are inextricably linked almost to the point of identity.

The idea that, for Heidegger, being and interpretation were virtually the same phenomenon is arguably best captured in two of Heidegger’s key concepts: Dasein and being-in-the-world. Dasein can be roughly translated as the human way of being, but its literal translation is “being there” or “being here.” With these concepts, Heidegger is attempting to stress that the human way of being is interactive both with one’s environment and with others in the world. To be human is to be active and involved in one’s world and with other people rather than to be in a particular static state. There are no isolated human subjects separate from the world, for Heidegger, and the human way of being is not adequately characterized by the traditional philosophical distinction between subject and object, or by the distinction between subject and other subjects (or minds and other minds, as this polarity is sometimes described) that originates, for Heidegger, in Descartes’s Meditations. Instead, being, for humans, is being-in-the-world, a term meant to highlight the lack of clear barriers between human beings and the contexts, or schemas of intelligibility, in which they find themselves (Dreyfus & Wrathall 2002). According to Heidegger, what this means for the phenomenon of understanding is that it is always a function of how a given human being is in the world, that is, a function of context. The relationship between being, or context, and understanding is reciprocal. Understanding, for Heidegger, discloses to us what it means to be, and who we are affects how we understand things. In other words, understanding, for Heidegger, is not a sort of apprehension of the way things really are, as the canonical, modern, philosophical tradition might think of it, but rather it is the process of appreciating the manner in which things are there for a particular person, or group of persons, in the world. Further, the manner in which things are there for us in the world is a function primarily of shared social and cultural practices. To understand something, then, is to be able to place it within a schema of intelligibility, which is generated by the shared social and cultural practices in which one finds oneself (Dreyfus & Wrathall 2005).

In his Truth and Method (1975), Hans-Georg Gadamer picks up on Heidegger’s concept of the hermeneutic circle of understanding that is at the core of what it means to be human in the world. While Gadamer works within the Heideggerian paradigm to the extent that he fully accepts the ontological turn in hermeneutics, his own stated project in Truth and Method is to get at the question of understanding qua understanding. Specifically, Gadamer observes that the traditional paths to truth are wrong-headed and run antithetical to the reality that being and interpretive understanding are intertwined. In the traditional paths to truth, truth and method are at odds. The methods used in the Western tradition will not get us to truth. These methods are critical interpretation, or traditional hermeneutics, and the Enlightenment focus on reason as the path to truth. Both of these methods have what Gadamer calls a pre-judgment against pre-judgment. That is, they both fail to acknowledge the role of the interpreter in determining truth. Traditional critical interpretation is inadequate because it seeks original intent or original meaning, that is, it holds on to the fiction that the meaning of the text can be found in the original intent of the author or in the words of the text. The Enlightenment focus on reason is an equally inadequate path to truth because it retains the subject/object distinction and thinks the path to truth is through the scientific method, both of which are wrong-headed.

For Gadamer, the word pre-judgment, or Vorurteil, means the same thing as Heidegger’s fore-structure of understanding. Gadamer claims that today’s negative connotation of pre-judgment only develops with the Enlightenment (Schmidt 2006: 100). The original meaning of pre-judgment, according to Gadamer, was neither positive nor negative, but simply a view we hold, either consciously or unconsciously. All understanding necessarily starts with pre-judgments. The pre-judgments of the interpreter, for Gadamer, rather than being a barrier to truth, actually facilitate its generation. The pre-judgments of the interpreter, held as a result of the interpreter’s personal facticity, not only contribute to the generation of the question being raised in the first instance, but, if taken into account on the path to truth, are capable of being critically evaluated and revised, with the result that the quality of the interpretation is improved. Additionally, pre-judgments are either legitimate or illegitimate. Legitimate pre-judgments lead to understanding. Illegitimate pre-judgments do not. One of the goals of Truth and Method is to provide a theoretically sound basis upon which to distinguish between legitimate and illegitimate pre-judgments (Schmidt 2006: 102). Understanding or meaning, for Gadamer, is a function of legitimate pre-judgments.

The model for how understanding actually operates, for Gadamer, is the conversation or dialogue. In an authentic dialogue, says Gadamer, understanding or meaning is something that occurs inside of a tradition, which is just a set of cultural assumptions and beliefs. A tradition is a worldview, or Weltanschauung, a system of intelligibility, a framework of ideas and beliefs through which a given culture experiences and interprets the world. A tradition, in this Gadamerian sense, is the theoretical grandchild of what Ast called a given text’s Grundidee, or one idea that unified it. For Gadamer, a legitimate pre-judgment is a pre-judgment that survives throughout time, eventually becoming a central part of a given culture, a part of its tradition. Understanding or meaning is an event, a happening, the substance of which is a fusion of this narrowly defined concept of tradition and the pre-judgments of the interpreter. In this sense, understanding is not willed by the participants. If it were, the dialogue would not be authentic and understanding or meaning could never be achieved. Instead, the conversation or dialogue wills the path to understanding. The thing itself reveals the truth.

In the course of the dialogue, and as a direct and organic result of the things being discussed by the particular participants of the conversation, a question arises. This question becomes the matter at hand, the topic of the conversation. As the conversation proceeds, the answer will show up as well, and it will be a function of the “fusion of horizons” between the perspectives or pre-judgments of the participants of the conversation (Gadamer 1975). This fusion is understanding/meaning. It is the answer to the question and the closest thing there is to truth. In this way, both the things themselves and the participants of the conversation together generate both the conversation topic (the question) and the answer. Together, the things and the participants of the conversation generate the truth of the matter. Moreover, all of this takes place within a tradition that gives legitimacy and weight to the meaning generated.

It is important, for Gadamer, that the path to truth is phenomenological, that is, we must go to the things themselves, and the path is also hermeneutic in that it appreciates that pre-judgment is unavoidable. Every interpreter arrives at a text with what Gadamer calls a given horizon, or conglomeration of pre-judgments, which is analogous to a Heideggerian world or fore-structure of understanding and which has been described as a given schema of intelligibility in which an interpreter finds himself or herself. A Gadamerian horizon is a shared system of social and cultural practices that provides the scope of what shows up as meaningful for an interpreter as well as for how things show up. Picking up on the hermeneutic circle, Gadamer holds that an act of understanding is always interpretive.

Another key element of Gadamerian philosophical hermeneutics is Gadamer’s insistence that interpretation, understanding, or meaning cannot take place outside of practical application. Interpretation is more than mere explication for Gadamer. It is more than mere exegesis. Beyond these things, interpretation of a given text (and it is important that everything is a text) always and necessarily takes place through the lens of present concerns and interests. The interpreter always and necessarily, in other words, comes to the table of the interpretive conversation or dialogue with a present concern that is grounded in a given epistemological or metaphysical horizon in which the interpreter dwells. In this way, for Gadamer, Aristotle got it right that understanding necessarily occurs through practical reasoning, or phronesis. For Gadamer, “[a]pplication does not mean first understanding a given universal in itself and then afterward applying it to a concrete case. It is the very understanding of the universal….itself” (Gadamer 1975). That phronesis is central to Gadamer’s hermeneutics is not disputed (Arthos 2014).

But, even more important than this, for Gadamer, the distance in time between the interpreter and the text is not a barrier to understanding but that which enables it. Temporal distance between text and interpretation is a “positive and productive condition enabling understanding” (Gadamer 1975). When we seek to interpret a text, we are trying to figure out not the author’s original intent but “what the text has to say to us” (Schmidt 2006: 104), and this is a function of the extent to which the author’s original intent and the meaning generated by the contemporary context and the contemporary interpreter agree, that is, the extent to which the horizons of the author and the instant interpreter fuse or blend. (Gadamer specifically discusses legal hermeneutics in Truth and Method. He writes that there are two commonly understood ways of determining meaning in the law. The first is when a judge decides a case. In such a scenario, the judge must necessarily factor the present facts into the decision. The second is the case of the legal historian. In this second scenario, although it may seem as if the task is to discover the meaning of the law by only considering the history of the law, the reality is that it is impossible for the legal historian to understand the law solely in terms of its historical origin to the exclusion of considerations of the continuing effect of the law. In other words, determinations of meaning in the law, as is the case with all determinations of meaning, necessarily and at all times involve practical application.)

Post-Gadamerian philosophical hermeneutics takes many forms but can arguably be said to begin with the work of Emilio Betti. Finding what he saw as an epistemological relativism in the philosophical hermeneutics of Gadamer, Betti returns to the general hermeneutics of Schleiermacher and Dilthey and resists the tide of the ontological turn (Pinton 1972, 1973). Betti was a legal theorist who tried to bring the hermeneutic project back to one of interpretation without reference to the human way of being. Betti believed in and sought objective understanding or objective interpretation, or Auslegung, while at the same time stressing that texts reflected human intentions. Accordingly, he thought it was possible to ascertain the meaning of the text through replicating the original creative process, the train of thought, so to speak, of the text’s author. Betti believed in the autonomy of the text (Bleicher 1980: 58). Objective interpretation was possible, for him, but this objectivity was grounded both in a priori epistemological existence à la Plato’s forms and in historical and cultural coherence (Bleicher 1980: 28-29).

Jürgen Habermas, like Emilio Betti, seeks objective understanding (Habermas 1971), but, unlike Betti and in agreement with Gadamer, Habermas believes that hermeneutics is not and cannot be merely a matter of trying to find the best method of interpretation. Instead, objectivity of interpretation is grounded in something Habermas called communicative action, a sort of Gadamerian dialogue modified by the recognition that power imbalances often distort what passes for collective understanding, and that real consensus, the closest thing available to truth and/or objective understanding, can only be had where that consensus has been generated impartially and in circumstances where agreement has been unconstrained. (Habermas’s communicative action concept is also known as communicative praxis or communicative rationality.) However, while Gadamer’s philosophical hermeneutics grounded a kind of quasi-objectivity in the authority of tradition, Habermas found this approach insufficiently able to guide social liberation and progress. The task of hermeneutics is not merely to deconstruct the process of understanding and/or to somehow ground that understanding in either method à la Betti or tradition à la Gadamer, but to determine rules for ascertaining universal validity in the social sciences en route to social change. In this way, Habermas claims that hermeneutics can and does permit the kind of value judgments of which some critics say hermeneutics is incapable.

Paul Ricoeur was a contemporary philosophical hermeneutist who is known for creating what is often described as a critical hermeneutics. For Ricoeur, meaning and understanding are to be obtained through culture and narrative, as these take place in time. Influenced by Freud, Ricoeur thought all ideology required a critique to uncover repressed and hidden meanings that exist behind surface meanings that pass for truth. In The Conflict of Interpretations (1974), Ricoeur argued that there were many and various paths to understanding and that each uniquely adds to meaning. Ricoeur’s work has been taken up recently by phenomenologists interested in questions of the nature of paradox (Geniusas 2015).

The work of Jacques Derrida (Derrida 1976, 1978) is more commonly associated with a 20th-century movement in French philosophy known as deconstruction than with philosophical hermeneutics per se. However, there are important similarities between the two movements. First, deconstruction on its own terms, like hermeneutics, is not a method. Instead, deconstruction is a critique of authoritative systems of intelligibility or meaning that exposes the hierarchies of power within those systems. In understanding itself as outside of existing theoretical schemas, in other words, Derrida’s deconstruction is within the hermeneutic tradition. Second, deconstruction is based on Heidegger’s concept of Destruktion, a central concept in his hermeneutic ontology. But, while Heidegger’s Destruktion, a project of critiquing authoritative systems of meaning that are based on structures of foundationalist metaphysics or epistemology, concludes that every act of understanding is an act of interpretation (Heidegger, 2008/1962), Derrida’s deconstruction involves identifying that language, or text, contains conceptual oppositions that involve the prioritizing of one side of a given conceptual opposition over the other, for example, speech over writing. Still, Derrida’s deconstruction is clearly in the hermeneutic tradition in that it is designed to highlight the elliptical and enigmatic nature of language and meaning. This is particularly evident in Derrida’s concept of différance, according to which every word in a given language implicates other words, which implicate other words, in a process of infinite reference and therefore what Derrida calls absence, meaning an absence of definitive meaning.

Susan-Judith Hoffman argues that Gadamerian hermeneutics furthers feminist objectives and can be understood as a form of feminist theorizing. Highlighting Gadamer’s account of the importance of difference, his notion of understanding as an inclusive dialogue, his account of pre-judgments as conditions for understanding that must always remain provisional, his account of tradition as that which is transformed by our reflection, and his account of language (Hoffman, 2003: 103), Hoffman argues that Gadamer’s philosophical hermeneutics is in line with feminist theorizing in that it “overthrows the false universalism of the natural sciences as the privileged model of human understanding” (Ibid.: 81). In the process, Gadamer’s hermeneutics amounts to feminist theorizing in two important ways. First, it contains a sensitivity to the historical and cultural situation of knowledge and knowledge seekers, and second, it contains the critical power to challenge reductive universalizing tendencies in traditional canons of thought (Ibid.: 82).

Linda Martín Alcoff also sees value for feminist theory in Gadamer’s hermeneutics. For Alcoff, Gadamer’s “openness to alterity,” his “move from knowledge to understanding,” and his “holism in justification and immanent realism” all align themselves with feminist theorizing (Alcoff 2003: 256). That Gadamer’s philosophical hermeneutics contained these elements was insisted upon by Gadamer himself, who saw his philosophical hermeneutics as a critique of the Enlightenment view that truth could be had through abstract reasoning, divorced from historical considerations, as well as a call for the acknowledgment that the path to truth was through the particular rather than through the universal (Gadamer 1975). Miranda Fricker has recently developed hermeneutical themes into what she calls hermeneutical injustice, according to which an injustice is done when the collective hermeneutical resources available to a given individual or group are inadequate for expressing one or more important areas of their experience (Fricker 2007).

The work of Donatella Di Cesare, Günter Figal, and James Risser is at the forefront of contemporary hermeneutics. For Di Cesare, the ground of hermeneutics is in Heideggerian existentialism, but this does not mean that hermeneutics is a kind of nihilism. Instead, hermeneutics, or the philosophy of understanding, is aimed at consensus; it is a constructive enterprise (Cesare 2005). For Figal, hermeneutics is most fundamentally a critique of objectivity and a call to understand things previously understood as objective elements of human life (Figal 2010). Risser has been interpreted as attempting to advance beyond Gadamer’s philosophical hermeneutics by acknowledging the radical finitude at stake in the phenomenon of tradition (George 2014).

2. Legal Hermeneutics and Mainstream Philosophy of Law

Within mainstream philosophy of law, legal hermeneutics is most closely aligned with legal interpretivism. Legal interpretivism is conceptually positioned between the two main subfields of philosophy of law: legal positivism and natural law theory. While mainstream philosophy of law has many faces, and includes, among other theories, legal realism, legal formalism, legal pragmatism, and legal process theory, legal positivism and natural law theory form the theoretical poles between which each of the mainstream theories can be understood to lie. Legal positivism is the view, in broad strokes, that there is no necessary connection between law and morality and that law owes neither its legitimacy nor its authority to moral considerations (Feinberg and Coleman 2008; Patterson 2003). The validity of law, for the legal positivist, is determined not by its moral content but by certain social facts (Hart 1958, 1961; Dickson 2001; Coleman 2001; Gardner 2001). Natural law theory is grounded in the work of two main thinkers: John Finnis and Lon Fuller. For Finnis, an unjust law has no authority (Finnis 1969; 1980; 1991), and for Fuller, an immoral law is no law at all (Fuller 1958). Natural law theory, generally speaking, then, is the view that there is a necessary connection between law and morality and that an immoral law is not a law (Raz 2009; Simmonds 2007; Murphy 2006).

Sometimes called a third main theory of law, legal interpretivism, developed by Ronald Dworkin, is the view that the law is essentially interpretive in nature and that it gains its authority and legitimacy from legal principles. Dworkin understands these principles to be neither bare rules nor moral tenets, but a set of guidelines to interpretation that are generated from legal practice (Dworkin 2011, 1996, 1986, 1985, 1983). Some describe legal interpretivism as a hybrid between legal positivism and natural law theory for the reason that Dworkin’s principles seem both to qualify as rules and to have a kind of normative quality that is similar to moral tenets (Hiley et al. eds. 1991; Brink 2001; Burley 2004; Greenberg 2004). But what distinguishes legal interpretivism from both legal positivism and natural law theory is its insistence that legal meaning is tempered by the legal tradition within which it operates (Greenberg 2004; Hershovitz 2006). For the legal interpretivist, in other words, the line between legal positivism and natural law theory is not clearly drawn. Instead, rules and normative guidelines together shape and form both what the law is and what it means. This approach to legal ontology and meaning is known as the interpretive turn in analytic jurisprudence (West 2000; Feldman 1991).

Arguably, however, what legal positivism, natural law theory, and legal interpretivism all have in common is epistemological and metaphysical foundationalism. While, for the legal positivist, the answer to both the question of what the law is and the question of what the law means can be found in rules (Hart 1958), for the natural law theorist, the answer to both questions can be found in morality (Fuller 1958). Similarly, for the legal interpretivist, the answer to both questions is found in legal principles. In other words, for the legal interpretivist, law gains its legitimacy and authority from principles emanating from legal practice. Although law is interpretive in nature, on this view, the interpretative process stops at the point at which a judgment call has to be made as to what the/a law means, preferably by someone well-versed in the relevant legal tradition. Once that judgment call is made, we have our answer. Meaning has been determined.

The legal hermeneutical approach is similar but different in the important respect that no meaning determination is ever understood to be fixed. As is the case for the legal interpretivist, for the legal hermeneutist, law is interpretive in nature, but at no point can any meaning determination rise to the level of definitive. Things like Dworkinian principles are acknowledged and considered, along with myriad other factors relevant to good interpretive practice, but the story of the meaning of the law, for the legal hermeneutist, most certainly cannot end at any point. There is no foundation to the law, for the legal hermeneutist, and there can be none. Instead, there can only be better or worse interpretations, measured comparatively and by the quality of the interpretive practices used to generate the various interpretations. More importantly, however, for the legal hermeneutist, objective interpretation simply is not and cannot be the project. Instead, the search for legal meaning is a critical project. The search for legal meaning involves critical engagement with previous interpretations and the current interpretation and includes critical analysis of the conditions for the possibility for both.

Legal hermeneutics, then, while similar to legal interpretivism in many respects, provides an alternative to the three main theories of law in that its approach to legal meaning can be understood to avoid engagement with the question of foundationalism that is characteristic of the traditional approaches. Rather than offering a new theory of law, legal hermeneutics “provides us with the necessary protocols for determining meaning” (Douzinas, Warrington, and McVeigh 1992: 30). Legal hermeneutics provides no specific theory of law and privileges no particular methodology or ideology. Instead, legal hermeneutics calls the attention of the interpreter of legal texts first and foremost to the fact that every act of understanding a law is an act of interpretation, and at the same time highlights that better interpretation takes conscious and proactive account of what philosophical hermeneutics, as described above, reveals as the necessary structures and components of the interpretive process. Some might describe this feature of legal hermeneutics as taking the determinacy of meaning to be context-dependent and open-ended. While this account is on track, another key feature of legal hermeneutics is that it is a descriptive rather than a normative project. Legal hermeneutics, then, is more a way of clarifying the nature of how legal interpretation actually works than a theory of how legal interpretation ought to work. In this way, legal hermeneutics can be understood to provide the tools with which to investigate, clarify, and help solve what appear from other perspectives to be insoluble legal problems, particularly problems based in conflicts of interpretation.

3. Legal Hermeneutics and Alternative Theories of Law

Legal hermeneutics shares an antifoundationalist sensibility with many alternative theories of law, including the critical legal studies movement, Marxist legal theory, deconstructionist legal theory, postmodernist legal theory, outsider jurisprudence, and the law and literature movement. For each of these theories of law, the goal of locating law’s ultimate legitimacy, authority, or meaning anywhere at all is understood as an exercise in futility. Some characterize this feature of these theories as the failure of complete determinacy as a semantic thesis, rather than as a failure of ultimate justification as an epistemological thesis. However, for others, this distinction is not meaningful and fails to adequately account for the radical rejection of the entire project of justification inherent in alternative theories.

The critical legal studies movement was an intellectual movement in the late 1970s and early 1980s that stood for the proposition that there is radical indeterminacy in the law. Conceptually based in the critical theory of the Frankfurt School, critical legal studies holds that legal doctrine is an empty shell. For the critical legal theorist, there is no such thing as the law, if the law is understood as an entity that exists out of context (Binder 1996/1999: 282). Instead, law is produced by power differentials that have their origins in differences in levels of property ownership. The liberal ideal of the rule of law devoid of influence from power differentials, contained in all analytic approaches to jurisprudence, is an illusion. For this reason, law is inherently self-contradictory and self-defeating and can never be a mere formality, as liberal theory and analytic jurisprudence would have us believe. This way of understanding the law is known as the indeterminacy thesis. For some, this does not necessarily mean that law is indeterminable; it means only that determinability is context-dependent. Others do not find this distinction meaningful.

Marxist legal theory begins with the work of Evgeny Pashukanis and takes place in contemporary form in the work of Alan Hunt, among others. For Pashukanis, law was inextricably linked to capitalism and hopelessly bourgeois. Outside of capitalism, things like legal rights are unnecessary, since outside of capitalism there are no conflicting interests or rights to be meted out or over which it is necessary for persons to fight. In the socialist society that Pashukanis envisions on the other side of capitalism, what would take the place of law and all talk of individual rights would be a sort of quasi-utilitarianism that values collective satisfaction over the perceived need to protect the individual interests of individual legal subjects (Pashukanis 1924). What contemporary Marxist legal theory retains from Pashukanis is the view that law is inescapably political, merely one form of politics. In this way, law is always potentially coercive and expressive of prevailing economic relations, and the content of law always manifests the interests of the dominant class (Hunt 1996, 1999: 355). So described, the content of law, for Marxist legal theorists, has no theoretical or practical basis in anything epistemologically foundational or universal.

Deconstructionist legal theories can be considered post-structuralist like critical legal studies but are unique in that they center around conceptual oppositions or binary concepts, also known as binaries. According to the deconstructionist approach, within a given conceptual opposition, one term in the opposition has been traditionally privileged over the other in a particular context, or text. A text can be a written text, an argument, a historical tradition, or a social practice. Jacques Derrida, considered the forerunner of deconstruction as a philosophy of language and meaning, famously identified a conceptual opposition between speech and writing, for example, with speech being the privileged form (Derrida 1976). Privileged, in deconstruction, means truer, more valuable, more important, or more universal than the opposing term (Balkin, 1996, 1999: 368). According to deconstructionist theories of law, legal distinctions are often masked conceptual oppositions that privilege one term over another. For example, individualism is privileged over altruism, and universalizability is privileged over the attention to the particular that is an inherent part of equitable distribution. These binary concepts and the privileging of one term in each binary lend an instability to the law, on deconstructionist terms, that is decidedly anti-foundationalist. J.M. Balkin, for example, argues that the true nature of the legal subject is ignored and obliterated by conventional legal theory (Balkin 2010; 1993). Balkin argues that when an attempt is made to understand a law, we bring our subjective experiences to bear on that attempted understanding (Ibid.). For Balkin, mainstream philosophy of law’s failure to acknowledge that this is the case is its very deep and abiding flaw.

Postmodernist legal theories are grounded in a 20th century movement in aesthetic and intellectual thought, which departed from interpretation based in universal truths, essences, and foundations. Postmodern legal theory departs from a belief in the rule of law, or any generalized or universalizable Grand Theory of Jurisprudence, in favor of using “local, small-scale problem-solving strategies to raise new questions about the relation of law, politics, and culture” (Minda 1995: 3). Other than this statement, it is difficult to describe postmodernist legal theory in any general way, since the entire point of postmodernist legal theory is that generalized theories are vacuous, even impossible. Instead, there are only individual theories, individual authors of theories, and individual texts/laws. It is fair to say, however, that postmodern legal theorists generally resist the sort of conceptual theorization routinely practiced by more mainstream legal academics and analytic philosophers for the reason that more mainstream approaches unduly emphasize abstract theory at the expense of pragmatic concerns (Ibid.). The postmodern rejection of ultimate theories can be construed as a form of antifoundationalism.

Outsider jurisprudence is an area of legal theorizing that is highly skeptical of the ability of mainstream legal theory to address the needs of members of historically marginalized groups. Although there has been a proliferation of kinds of outsider jurisprudence in the early 21st century, including LatCrit and QueerCrit (Mahmud 2014; Valdes 2003; Eskridge 1994), there are two main kinds of outsider jurisprudence: critical race theory and feminist jurisprudence (Parks 2008; Jones 2002; Delgado 2012; Levit and Verchick 2006). Critical race theorists are concerned with the particularized experiences of African Americans in American jurisprudence. They share with the postmodernists a rejection of the idea of the existence of one grand and universally applicable theory of law that applies equally to everyone: “There is a hidden category of persons to whom the laws do not equally and universally apply, for the critical race theorists, and that category of persons is African Americans” (Minda 1995: 167). Key themes in critical race theory are a call to contextualized theorizing about the law that acknowledges that the lives and experiences of African Americans in America have a juridical tenor very different from the lives and experiences of other Americans, a critique of political liberalism, which bases its apportionment of rights on the fiction that African Americans as a group have the same degree of access to rights, in American society, as other Americans, and a call for juridical acknowledgment of the persistence of racism in American society (Ladson-Billings 2011; Whyte 2005; Delgado 1995: xv).

Feminist jurisprudence “[goes] beyond rules and precedents to explore the deeper structures of the law” (Chamallas 2003: xix). It operates under the belief that gender is a significant factor in American life and explores the ways in which gender, and related power dynamics between men and women throughout American legal history, have affected how American law has developed (Ibid.). Feminist jurisprudence concerns itself with legal issues of particular significance to women, such as sexual harassment, domestic violence, and pay equity. It also approaches legal theory in a way that comports with many women’s lived experiences, that is, without pretending, as mainstream jurisprudence tends to do, that gender is irrelevant to the outcome of legal disputes (Ibid.). Of primary concern to feminist legal scholars is the systemic nature of women’s inequality and the pervasiveness of female subordination through law in America. The methodology of feminist jurisprudence is the excavation and examination of hidden legalized mechanisms of discrimination to uncover hierarchies in law that operate to the detriment of the ideal of equal rights for women (Ibid.). The feminist legal scholar’s identification of hidden power dynamics at work in American law can be construed as yet another antifoundationalist perspective on law.

A recent development in outsider jurisprudence is intersectionality theory, or the idea that oppression takes place across multiple, intersecting systems, or axes, of oppression (Cho, Crenshaw, and McCall 2013; MacKinnon 2013; Walby 2007). Intersectionality theory is grounded in the thought of Kimberlé Crenshaw (Crenshaw 1989, 1991) and reinforces the idea from critical race theory and feminist jurisprudence that law operates differently on the bodies of the oppressed. For Crenshaw, race and gender discrimination combine on the bodies of black women in a way that neither race discrimination nor gender discrimination alone captures or is able to address. Crenshaw’s point is that ignoring race when taking up gender reinforces the oppression of people of color, and anti-racist perspectives that ignore patriarchy reinforce the oppression of women (Crenshaw 1991: 1252). But, more specifically, taking up any form of oppression in a vacuum ignores the way that oppression actually works in the lives of the oppressed. For the law to help combat oppression, it must grapple with the complexities and nuances of lived experience.

Containing very similar themes to legal hermeneutics is what is known as the law and literature movement (Fish 1999; Rorty 2007, 2000, 1998, 1991, 1979; Bruns 1992; Fiss 1982). The law and literature movement, like certain forms of legal hermeneutics, is heavily influenced by the deconstructionist philosophy of Jacques Derrida (Derrida 1990, 1992). The literary legal theorist, in other words, has developed an appreciation for the costs of excluding certain types of questions from the process of ascertaining meaning in the law (Levinson and Mailloux 1988: xi). Moreover, there is an active attempt on the part of the literary legal theorist to dismantle or undo the conventional illusion that the structures that support claims to authentic, legitimate, or official meaning are built on solid ground. The role of the interpreter is also highlighted in these approaches, as is the inextricability of determinations of meaning from the power dynamics in which they take place (Thorsteinsson 2015; Surrette 1995).

4. Legal Hermeneutics in Jurisprudence Proper

Legal hermeneutics in jurisprudence proper, legal theory, can be traced back to the publication of Francis Lieber’s 19th century work, Legal and Political Hermeneutics (Lieber 2010/1880). There, Lieber tried to identify principles of legal interpretation that would bring consistency and objectivity to the interpretation of the U.S. Constitution, and at the same time exposed strict intentionalist interpretative methods—defined as those in which the so-called intent of the Framers had interpretive authority—as incoherent (Binder and Weisberg 2000: 48). More than 125 years after Lieber’s landmark text, contemporary legal hermeneutics is still trying to find that balance. Contemporary legal hermeneutics retains Lieber’s goal of objectivity of interpretation and his attention to the roles of history, temporality, politics, and socio-historical context in credible meaning assessments. The central question of legal hermeneutics in constitutional theory is: What sorts of interpretive methods should we use to come up with an interpretation of the constitution that approaches objectivity despite the fact that, owing to certain realities about how the interpretive process works, it is impossible for us to ascertain the intent of the Framers?

Another question at the core of legal hermeneutics, however, is: Even if we could ascertain the intent of the Framers, which all legal hermeneutists think is impossible, why would we want to do so, given the nature of what a constitution is—a living, breathing text designed to govern real people in real life contexts—and the fact that legal hermeneutical principles based in philosophical hermeneutics dictate that the particular time and place, that is, the context, of a given application of a given law significantly influences, and should influence, the content of the interpretation? This is an example of the hermeneutic circle at work in legal interpretation. That is, from the vantage point of legal hermeneutics: What the constitution means in a particular instance is importantly influenced by the context in which the interpretation is taking place, the application, and the context in which the interpretation is taking place, the application, is importantly influenced by what the constitution means in that same context.

The primary focus of contemporary legal hermeneutics is the debate in constitutional theory between the interpretive methods of originalism and non-originalism. Originalism is the view, generally, that the meaning of the constitution is to be found by determining the original intent of the Framers, understood to be most prudently found in the text of the constitution itself (Scalia and Garner 2012; Calabresi 2007; Monaghan 2004). By contrast, non-originalism is the view, generally, that the constitution is a living, breathing document meant more as a set of guidelines for future lawmakers than as a strict rulebook demanding literal compliance (Cross 2013; Balkin 2011; Goodwin 2010). For clarification purposes, it should be noted that the divide between originalism and non-originalism is akin to the divide between epistemological foundationalism and antifoundationalism.

Within the debate between originalists and non-originalists, clearly all legal hermeneutists are necessarily non-originalists, since by the basic tenets of legal hermeneutics, original intent cannot be ascertained. But, what separates the legal hermeneutist from the average non-originalist is a high degree of respect for the text of the constitution as an interpretive starting point, together with a call to heightened self-reflexivity regarding the degree to which one’s own pre-judgments, and the pre-judgments of previous interpreters, may be affecting the interpretive process. By the same token, just as the legal interpretivist is constrained by the principles of legal practice in the interpretive process, the legal hermeneutist is similarly constrained by the spirit of the text. Finally, while the goal of the average non-originalist is a definitive interpretation of the text, however at odds with the original intent of the Framers, the legal hermeneutist has the more modest goal of deconstructing the mosaic of considerations that went into previous interpretations in an effort to examine each tile of the mosaic, one by one, more in the service of understanding the text/law within a given context than in the service of producing anything on the order of a definitive interpretation for posterity.

Another way of thinking about legal hermeneutics, however, is to see it as neither originalist nor non-originalist, but orthogonal to the originalist/non-originalist continuum. In other words, it is consistent with the themes of legal hermeneutics that it rejects the originalist/non-originalist continuum itself as wrong-headed and unproductive. Indeed, legal hermeneutics rejects interpretive method altogether in favor of a call to an increased level of self-reflexivity on the part of the interpreter, a call that is meant to actively and consciously engage the interpreter in the interpretive process in a way that neither originalism nor non-originalism demands.

On the contemporary scene, George Taylor’s work in legal hermeneutics follows Ricoeur’s in philosophical hermeneutics. In his “Hermeneutics and Critique in Legal Practice,” Taylor argues that Ricoeur’s approach to hermeneutics gets it right when it attempts to mediate the difference between understanding and explanation (Taylor 2000: 1101 et seq.). Understanding, on this view, is obtained through hermeneutic methods, but explanation is obtained through science. Ricoeur, according to Taylor, sees the interpretive enterprise as containing both elements. The way Taylor sees it, Ricoeur’s emphasis on the narrative nature of meaning acknowledges the roles of both understanding and explanation in a successful interpretation (Taylor 2000: 1123). The usefulness of legal hermeneutics, for Taylor, is that it correctly identifies and brings to the forefront that there is explanation or fact in understanding or interpretation, and there is understanding or interpretation in explanation or fact, shedding a kind of glaring light on all understandings that might deny this reality. The goals of originalism, on this view, are simply impossible to reach.

Francis J. Mootz III agrees with Taylor about the impossibility of ascertaining original meaning (Mootz 1994). Accordingly, instead of engaging in what he understands as the necessarily fruitless exercise of attempting to ascertain original meaning, Mootz argues, we should instead attempt to find the interpretation that “allows the text to be most fully realized in the present situation” (Mootz 1988: 605).

Georgia Warnke describes the interpretive turn in the study of justice as an abandonment of the attempt to discern universally valid principles of justice in favor of attempts to “articulate those principles of justice that are suitable for a particular culture and society” in light of that society’s culture and traditions, “the meanings of its social goods,” and its public values (Warnke 1993: 158). We would then appeal to hermeneutical standards of coherence to reject interpretations that fail to respect that culture or those traditions, or meanings (Ibid.). Such an approach, for Warnke, “[shifts] the emphasis from a conflict between two opposing rights…to a conflict between two interpretations of…actions and practices that are consonant with [a given culture’s] traditions and self-understandings” (Ibid.: 162).

For Gregory Leyh, legal hermeneutics reveals to us the political nature of every act of constitutional interpretation. This includes originalist as well as non-originalist approaches to constitutional interpretation. However, for Leyh, legal hermeneutics also provides us with some constructive lessons for improving the quality of our necessarily political acts of interpretation. Specifically, in “Toward a Constitutional Hermeneutics,” Leyh makes the case for a legal hermeneutics based in the philosophical hermeneutics of Hans-Georg Gadamer (Gadamer 1975) in which, as the self-understanding of the interpreters of legal texts is increased, the quality of the interpretation produced by those interpreters is increased (Leyh 1988). This self-understanding would include primarily an explicit acknowledgment of the role that history plays in the development of both understanding and meaning (Ibid.: 370), an explicit acknowledgment of the “irreducible conditions of all human knowing” (Ibid.: 371), and attentiveness to the kinds of issues characteristically associated with the interpretation of all texts, including legal texts (Ibid.). For Leyh, a call to the constitution’s original meaning, à la a standard originalist approach, entails certain assumptions about historical understanding, for example, that it is fixed and identifiable by subsequent interpreters, which legal hermeneutics exposes as impossible. What constitutional theorists need, for Leyh, is not greater insight into the intent of the framers, for this is not obtainable, but deeper reflection on the issue of the conditions that make historical knowledge possible at all (Ibid.: 372). For Leyh, legal hermeneutics “sets for itself an ontological task, namely, that of identifying the ineluctable relationships between text and reader, past and present, that allow for understanding to occur in the first place” (Ibid.).

There are two key aspects to Leyh’s legal hermeneutics: (1) an appreciation for the role of language in understanding, which sharpens our awareness of the “historical structures constitutive of all knowledge,” and (2) a recognition of the “enabling character of our prejudgments and preconceptions as windows to the past” (Ibid.: 372). Taking these things into consideration, it is impossible, according to Leyh, for us to obtain an understanding of historical texts like the constitution without going through the language we use today and our present-day prejudgments and preconceptions, or what Hans-Georg Gadamer called our pre-judgments. For Leyh, all reason is historical, and there is a historicity to all inquiry (Ibid.: 375). “No text simply sits before us and announces its meaning,” Leyh writes (1988: 375).

Rather than understanding the historicity of all inquiry as an impediment between the contemporary interpreter and the text, however, Leyh suggests that this information should aid us in recognizing that reason “finds its expression only as it is applied concretely” (Ibid.). In other words, interpretation is always practical: it always occurs in a particular set of circumstances at a particular time and place, and it applies itself to a particular set of facts. An acknowledgement of this reality on the part of the interpreter, for Leyh, adds a level of awareness vis-à-vis the interpretive process that can only aid in making sound judgments of constitutional interpretation.

David Hoy’s take on legal hermeneutics involves a focused critique of the intentionalist position in constitutional theory, according to which the so-called intent of the framers is the ultimate authority on constitutional meaning (Hoy 1992). For Hoy, while the intentionalist believes that no interpretation is needed to locate the intent of the framers, the hermeneutist understands that the concept of intended meaning presupposes a prior understanding of meaning in a different sense of the word. The concept of an ambiguous sentence highlights this prior understanding of meaning. A given sentence can have two different meanings in this prior sense, Hoy explains, whether either or both of them were intended or not (Hoy 1992: 175). The hermeneutist acknowledges, in other words, according to Hoy, a difference between sentence meaning and speaker’s meaning. However, while the intentionalist incorrectly presumes that there are only two possible bases for a theory of meaning—intention and convention (Ibid.)—the hermeneutist understands that there can be no fact of the matter vis-à-vis sentence meaning. Hoy writes, “[Hermeneutics] acknowledges semantic complexity. It does not exclude questions about intention when these are relevant to interpretation, but it believes that since textual meaning is not reducible to intended meaning, there are many other kinds of questions that can be asked about texts” (Hoy 1992: 178).

At the same time, Hoy’s hermeneutics stands for the proposition that the traditional way law is practiced operates as a constraint on judicial discretion. It provides a schema of intelligibility in which a judge must necessarily decide a case. As Hoy indicates, using discretion to decide what the law means within the tradition of the practice of law is what judges do all the time. “Only when the judges know that the law entails one decision and they nevertheless decide something else could they be said to be rewriting,” writes Hoy (1992: 183), and the hermeneutic claim is that this is almost never the case. (See also Hoy 1985 and Hoy 1987.) If Hoy is right, then, as Leyh points out as well, there is no act of judicial understanding that takes place without interpretation. Such a possibility is an illusion. Instead, all acts of understanding are acts of interpretation, including originalist and intentionalist acts of understanding.

In the early 21st century, John T. Valauri argued that the new questions for legal hermeneutics are different from the ones of the late 1980s and early 1990s (Valauri 2010). For Valauri, the continuing significance of hermeneutics for legal theory is to help us sort through the varieties of originalism that compete for our allegiance in the aftermath of what he sees as a kind of unanimous consent to originalism’s legitimacy. In other words, for Valauri, the hermeneutical question is no longer whether originalism is valuable, but what kind of originalism is valuable (Ibid.). The remaining questions that need to be answered to help us sort through the varieties of originalism, for Valauri, are (1) whether the various forms of originalism share a common conception of understanding and interpretation, and (2) whether hermeneutics is a descriptive or normative practice. To address these questions, says Valauri, we need to “[recover]…the fundamental hermeneutical problem” (Gadamer 1975), which means focusing on three key hermeneutical paradigms: (1) the process of application, (2) Aristotle’s practical wisdom, and (3) a focus on the “Aristotelian face” of hermeneutics over the Heideggerian one (Valauri 2010). The significance of paradigms (1) and (2) is self-explanatory and common to all forms of hermeneutics, legal and otherwise. By paradigm (3), Valauri hopes to recover legal hermeneutics from its Heideggerian-based, full-scale rejection of method that many mainstream legal theorists find so unpalatable.

Drawing themes and seeking overlap between the various contemporary legal hermeneutists, a legal hermeneutical approach to constitutional theory can be understood as a call to the interpreter of the constitution to take into conscious consideration the following factors when engaged in constitutional interpretation: (1) the identity of the interpreter, of previous interpreters, and the original author, (2) the sociohistorical context in which the text was written and in which the interpretation is taking place, (3) the political climate at the time the text was written and in which the interpretation is taking place, (4) the extent to which the meaning of words and concepts relevant to the interpretation have changed or have not changed over time, (5) the particularity of experience of those affected by a given law, (6) the extent to which that experience is acknowledged or unacknowledged by previous interpretations, (7) the relationship between who the interpreter is, who the interpreter takes herself to be, and the kinds of interpretive choices the interpreter makes, (8) the necessary truth that original meaning is an illusion and cannot be ascertained, and (9) the extent to which one’s own pre-judgments enter into one’s attempt at ascertaining meaning. This final aspect adds a level of self-reflexivity to the interpretive enterprise that is understood to significantly improve the quality of the interpretation. In other words, from the vantage point of legal hermeneutics, the more that assumptions customarily unacknowledged in mainstream legal theory are excavated and examined, the greater the degree of legitimacy of the interpretation.

5. Conclusion

Legal hermeneutics is an approach to legal texts that understands that the legal text is always historically embedded and contextually informed, so that it is impossible to understand the law simply as a product of reason and argument. Instead, meaning in the law takes place according to practical, material, and context-dependent factors such as power, social relations, and other contingent considerations. As Gerald Bruns has put it:

Legal hermeneutics is what occurs in the give-and-take—the dialogue—between meaning and history. The historicality of the law means that its meaning is always supplemented whenever the law is understood. This understanding is always situated, always an answer to some unique question that needs deciding, and so is different from the understanding of the law in its original meaning, say, the understanding a legal historian would have in figuring the law in terms of the situation in which it was originally handed down. The historicality of the law means that its meaning is always supplemented whenever it is understood or interpreted. Supplementation always takes the form of self-understanding; that is, it is generated by the way we understand ourselves—how we see and judge ourselves—in light of the law. But, this self-understanding throws its light on the law in turn, allowing us to grasp the original meaning of the law in a new way. The present gives the past its point. (Bruns 1992)

This seems to mean, at a minimum, that every Supreme Court decision is an interpretation, which directly undermines all originalist approaches to constitutional theory.

The claim that every Supreme Court decision is an act of interpretation, however, is not a claim about the indeterminacy of meaning itself but a more modest claim about the impossibility of ascertaining original meaning. The difference between these two positions is subtle but important. While for the non-originalist, the possibility of authoritative meaning is an illusion, for the legal hermeneutist more and less authoritative meanings are possible and are a function of the interpreter’s taking conscious account of several key factors that inform and shape the interpretive process. Taking conscious account of each of these factors when attempting to interpret a given legal text lends to the interpretative process a sort of legitimacy and authority, the possibility of which most non-originalist positions deny.

Legal hermeneutics, then, can be understood as an anti-method in constitutional theory. As Gregory Leyh has observed, “[H]ermeneutics neither supplies a method for correctly reading texts nor underwrites an authoritative interpretation of any given text, legal or otherwise” (Leyh 1992: xvii). Instead, “the activity of questioning and adopting a suspicious attitude toward authority is at the heart of hermeneutical discourse. Hermeneutics involves confronting the aporias that face us, and it attempts to undermine, at least in partial ways, the calm assurances transmitted by the received views and legal orthodoxies” (Ibid.). Arguably, any approach to legal hermeneutics that rejects its distinctively critical enterprise, then, misses the point of legal hermeneutics entirely. As an approach to legal interpretation, it is necessarily, following Heidegger and Gadamer, a complete rejection of the gods of both truth and method in favor of a call to the interpreter of laws to cast an incisive and self-reflexive gaze on all that is called mainstream legal orthodoxy.

6. References and Further Reading

a. References

Alcoff, Linda Martín (2003) “Gadamer’s Feminist Epistemology” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
Aquinas, Thomas (1998) On Law, Morality, and Politics. Ed. William P. Baumgarth and Richard J. Regan Indianapolis: Hackett Publishing Co.
Aristotle (350 B.C.E./ 2000) On Interpretation. Trans. E.M. Edghill. Adelaide: University of Adelaide Library.
Arthos, John (2014) “What is Phronesis? Seven Hermeneutic Differences in Gadamer and Ricoeur,” Philosophy Today, 58(1): 53-66.
Ast, F. (1808) Grundlinien der Grammatik, Hermeneutik und Kritik. Landshut, Ger.: Jos. Thomann.
Balkin, J.M. (2011) Living Originalism. Cambridge: Belknap Press of Harvard University Press.
Balkin, J.M. (2010) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory (Second Edition), Dennis Patterson, ed., Malden: Wiley-Blackwell.
Balkin, J.M. (1999, 1996) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 367-374.
Balkin, J.M. (1993) “Understanding Legal Understanding: The Legal Subject and the Problem of Legal Coherence,” 103 Yale Law Journal 105.
Binder, Guyora (1996/1999) “Critical Legal Studies,” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 280-290.
Bleicher, J. (1980) Contemporary Hermeneutics: Hermeneutics as Method, Philosophy and Critique. London, Boston: Routledge & Kegan Paul.
Brink, David (2001) “Legal Interpretation and Morality,” in B. Leiter (ed.), Objectivity in Law and Morals. Cambridge: Cambridge University Press.
Bruns, Gerald L. (1992) “Law and Language: A Hermeneutics of the Legal Text,” in Legal Hermeneutics: History, Theory, and Practice, ed. Gregory Leyh, Berkeley: University of California Press, 23-40.
Burley, Justine (ed.) (2004) Dworkin and His Critics: With Replies by Dworkin. Oxford: Blackwell.
Calabresi, Steven (2007) Originalism: A Quarter-Century of Debate. Washington, D.C.: Regnery Pub.
(Di) Cesare, Donatella (2005) “Reinterpreting Hermeneutics,” Philosophy Today, 49(4): 325-332.
Chamallas, Martha (2003) Introduction to Feminist Legal Theory. 2nd Ed. New York: Aspen Publishers.
Cho, S., K.W. Crenshaw and L. McCall (2013) “Toward a Field of Intersectionality Studies: Theory, Applications, and Praxis,” Signs, 38(4): 785-810.
Coleman, Jules (2001) The Practice of Principle. Oxford: Clarendon Press.
Crenshaw, K.W. (1991) “Mapping the Margins: Intersectionality, Identity Politics, and Violence against Women of Color,” Stanford Law Review, 43(6): 1241-99.
Crenshaw, K.W. (1989) “Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics,” University of Chicago Legal Forum, 140: 139-67.
Cross, Frank B. (2013) The Failed Promise of Originalism. Stanford: Stanford Law Books, an imprint of Stanford University Press.
Delgado, Richard (ed.) (1995) Critical Race Theory: The Cutting Edge. Philadelphia: Temple University Press.
Derrida, Jacques (1976) Of Grammatology. Baltimore: Johns Hopkins University Press.
Derrida, Jacques (1978) Writing and Difference. Trans. A. Bass. London: Routledge & Kegan Paul.
Derrida, Jacques (1990) “Force of Law: ‘The Mystical Foundation of Authority.’” Cardozo Law Review, 97, 1, 276.
Derrida, Jacques (1992) “Before the Law” in Acts of Literature. Ed. Derek Attridge. New York and London: Routledge, 181-220.
Dickson, Julie (2001) Evaluation and Legal Theory. Oxford: Hart Publishing.
Dilthey, Wilhelm (1883) Introduction to the Human Sciences. In Makkreel, Rudolf A. and Frithjof Rodi, eds. 1989. Selected Works: Volume I: Introduction to the Human Sciences. Princeton: Princeton University Press.
Dostal, Robert J. (ed.) (2002) The Cambridge Companion to Gadamer. Cambridge: Cambridge University Press.
Douzinas, Costas, Ronnie Warrington and Shaun McVeigh (1991) Postmodern Jurisprudence: The Law of Text in the Texts of Law. London, New York: Routledge.
Dreyfus, H.L. and M.A. Wrathall (eds.) (2005) A Companion to Heidegger. Oxford: Blackwell.
Dreyfus, H.L. and M.A. Wrathall (eds.) (2002) Heidegger Reexamined (4 Volumes). London: Routledge.
Dworkin, Ronald (1983) “My Reply to Stanley Fish (and Walter Benn Michaels): Please Don’t Talk about Objectivity Any More,” in Mitchell, W.J.T., ed., The Politics of Interpretation. Chicago: University of Chicago Press, 287-313.
Dworkin, Ronald (1985) A Matter of Principle. Cambridge: Harvard University Press.
Dworkin, Ronald (1986) Law’s Empire. Cambridge: Harvard University Press.
Dworkin, Ronald (1996) “Objectivity and Truth: You’d Better Believe It,” Philosophy and Public Affairs, 25:88.
Dworkin, Ronald (2011) Justice for Hedgehogs. Cambridge: Harvard University Press.
Feldman, Stephen Matthew (1991) “The New Metaphysics: The Interpretive Turn in Jurisprudence,” Iowa Law Review, Vol. 76, 1991.
Figal, Günter (2010) Objectivity: The Hermeneutical and Philosophy. Albany: State University of New York Press.
Finnis, John (1980) Natural Law and Natural Rights. Oxford: Clarendon Press.
Finnis, John (ed.) (1991) Natural Law, 2 Vols. New York: New York University Press.
Fuller, Lon (1969) The Morality of Law. New Haven: Yale University Press, rev. edn.
Fiss, Owen (1982) “Objectivity and Interpretation,” Stanford Law Review, 34: 739-763.
Fish, Stanley (1999) Doing What Comes Naturally: Change, Rhetoric, and the Practice of Theory in Literary and Legal Studies. Durham, London: Duke University Press.
Fricker, M. (2007) Epistemic Injustice: Power and Ethics of Knowing, Oxford: Oxford University Press.
Fuller, Lon (1958) “Positivism and fidelity to law—a response to Professor Hart.” Harvard Law Review, 71: 630-72.
Gadamer, Hans-Georg (1975) Truth and Method. London: Sheed & Ward.
Gardner, John (2001) “Legal Positivism: 5 ½ Myths,” 46 American Journal of Jurisprudence, 199.
Geniusas, Saulius (2015) “Between Phenomenology and Hermeneutics: Paul Ricoeur’s Philosophy of Imagination,” Human Studies: A Journal for Philosophy and the Social Sciences, 38: 2, 223-241.
Gibbons, Michael T. (2006) “Hermeneutics, Political Inquiry, and Practical Reason: An Evolving Challenge to Political Science,” The American Political Science Review, 100(4): 563-571.
Goodwin, Liu (2010) Keeping Faith with the Constitution. Oxford: Oxford University Press.
Greenberg, Mark (2004) “How Facts Make Law,” Legal Theory, 10: 157-98.
Habermas, Jürgen (1971) “Der Universalitätsanspruch der Hermeneutik” (The Hermeneutic Claim to Universality) in Karl-Otto Apel et al., eds., Hermeneutik und Ideologiekritik (Hermeneutics and Ideology) Frankfurt: Suhrkamp, 120-158.
Hart, H.L.A. (1961) The Concept of Law. Oxford: Oxford University Press.
Hart, H.L.A. (1958) “Positivism and the Separation of Law and Morals.” Harvard Law Review 71: 593-629.
Heidegger, Martin (2008) Being and Time. Trans. John Macquarrie and Edward Robinson. New York: HarperPerennial/Modern Thought. (Translation of Sein und Zeit. Reprint. Originally published: Harper & Row, 1962).
Hekman, Susan (1986) Hermeneutics and the Sociology of Knowledge. Notre Dame: University of Notre Dame Press.
Hershovitz, Scott (ed.) (2006) Exploring Law’s Empire. Oxford: Oxford University Press.
Hiley, David R. et al. (eds.) (1991) The Interpretive Turn: Philosophy, Science, Culture. Ithaca: Cornell University Press.
Hinchman, Lewis P. (1995) “Aldo Leopold’s Hermeneutic of Nature,” The Review of Politics, 57(2): 225-249.
Hoffman, S.J. (2003) “Gadamer’s Philosophical Hermeneutics and Feminist Projects,” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
Hoy, David Couzens (1992) “Intentions and the Law: Defending Hermeneutics,” in Legal Hermeneutics: History, Theory, and Practice. Ed. Gregory Leyh. Berkeley, Los Angeles, Oxford: University of California Press.
Hoy, David Couzens (1987) “Dworkin’s Constructive Optimism v. Deconstructive Legal Nihilism,” Law and Philosophy 6: 321-56.
Hoy, David Couzens (1985) “Interpreting the Law: Hermeneutical and Poststructuralist Perspectives,” Southern California Law Review 58: 136-76.
Hunt, Alan (1996, 1999) “Marxist Theory of Law”. In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 355-366.
Johnsen, Harald and Bjornar Olsen (1992) “Hermeneutics and Archaeology: On the Philosophy of Contextual Archaeology,” American Antiquity, 57(3): 419-436.
Kisiel, Theodore (1993) The Genesis of Heidegger’s Being and Time. Berkeley: University of California Press.
Kornprobst, Markus (2009) “International Relations as Rhetorical Discipline: Toward (Re-) Newing Horizons,” International Studies Review, 11(1): 87-108.
Levinson, Sanford and Steven Mailloux (1988) Interpreting Law and Literature: A Hermeneutic Reader. Evanston: Northwestern University Press.
Leyh, Gregory (1988) “Toward a Constitutional Hermeneutics.” American Journal of Political Science. 32(2): 369-387.
Leyh, Gregory (ed.) (1992) Legal Hermeneutics: History, Theory, and Practice. Berkeley, Los Angeles, Oxford: University of California Press.
Lieber, Francis (2010/1880) Legal and Political Hermeneutics. Lawbook Exchange, Ltd.
Malpas, Jeff and Hans-Helmuth Gander (eds.) (2014) The Routledge Companion to Hermeneutics. London, New York: Routledge, Taylor & Francis Group.
Malpas, Jeff and Hans-Helmuth Gander (1992) “Analysis and Hermeneutics,” Philosophy & Rhetoric, 25(2): 93-123.
Minda, Gary (1995) Postmodern Legal Movements. New York, London: New York University Press.
Monaghan, Henry Paul (2004) “Doing Originalism,” Columbia Law Review, 104(1): 32-38.
Mootz, Francis J., III (1994) “The New Legal Hermeneutics,” 47 Vand. L. Rev. 116.
Mootz, Francis J., III (1988) “The Ontological Basis of Legal Hermeneutics: A Proposed Model of Inquiry Based on the Work of Gadamer, Habermas, and Ricoeur,” 68 B.U.L. Rev. 523.
Murphy, Mark C. (2006) Natural Law in Jurisprudence & Politics. Cambridge: Cambridge University Press.
Nenon, Tom (1995) “Hermeneutical Truth and the Structure of Human Experience: Gadamer’s Critique of Dilthey,” in The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, ed. Schmidt, Lawrence. Evanston: Northwestern University Press. 39-55.
Palmer, Richard E. (1969) Hermeneutics: Interpretation Theory in Schleiermacher, Dilthey, Heidegger, and Gadamer. Evanston: Northwestern University Press.
Pashukanis, Evgeny (1924/2002) The General Theory of Law and Marxism. New Brunswick: Transaction Publishers.
Pérez-Gómez, Alberto (1999) “Hermeneutics as Discourse in Design,” Design Issues, 15(2) Design Research: 71-79.
Pinton, Giorgio Alberto (1972/1973) “Emilio Betti’s (1890-1969) Theory of General Interpretation.” Ph.D. Dissertation. Hartford Seminary Foundation.
Ricoeur, P. (1974) The Conflict of Interpretations. Evanston: Northwestern University Press.
Rorty, Richard (2007) Philosophy as Cultural Politics. Cambridge: Cambridge University Press.
Rorty, Richard (2000) Philosophy and Social Hope. CITY: Penguin.
Rorty, Richard (1998) Achieving Our Country: Leftist Thought in Twentieth Century America. Cambridge: Harvard University Press.
Rorty, Richard (1991) Essays on Heidegger and Others: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press.
Rorty, Richard (1979) Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
Scalia, Antonin (1997) A Matter of Interpretation: Federal Courts and the Law: An Essay. Princeton: Princeton University Press.
Scalia, Antonin and Bryan Garner (2010) Reading Law: The Interpretation of Legal Texts. St. Paul: Thomson/West.
Schmidt, Lawrence (2006) Understanding Hermeneutics. Stocksfield: Acumen Publishing Ltd.
Surette, Leon (1995) “Richard Rorty Lays Down the Law,” Philosophy and Literature, 19(2): 261-275.
Taylor, George H. (2000) “Hermeneutics and Critique in Legal Practice: Critical Hermeneutics: The Intertwining of Explanation and Understanding as Exemplified in Legal Analysis,” 76 Chi.-Kent L. Rev. 1101.
Tugendhat, Ernst (1992) “Heidegger’s Idea of Truth.” trans. Christopher Macann, in Macann, Christopher (ed.) Martin Heidegger: Critical Assessments. 4 Vols. London: Routledge.
Valauri, John T. (2010) “As Time Goes By: Hermeneutics and Originalism,” Nevada Law Review, August 23, 2010.
Wachterhauser, Brice R (ed.) (1994) Hermeneutics and Truth. Evanston: Northwestern University Press.
Walby, S. (2007) “Complexity Theory, Systems Theory and Multiple Intersecting Social Inequalities,” Philosophy of the Social Sciences, 37(4): 449-70.
Warnke, Georgia (1993) Justice and Interpretation. Cambridge: MIT Press.
Wolf, Friedrich August (1831) Vorlesung über die Enzyklopadie der Altertumswissenschaft. Vorlesungen über die Altertumswissenschaft series, Ed. J.D. Gürtler, Vol. I. Leipzig: Lehnhold.

b. Further Reading

Attridge, Derek (ed.) (1992) Acts of Literature. New York, London: Routledge. 181-220.
Austin, John (1832) The Province of Jurisprudence Determined.
Austin, John (1987) “Deconstructive practice and legal theory.” Yale Law Journal, 96, 743.
Barnett, Randy E. (1995-1996) The Relevance of the Framers’ Intent. 19 Harv. J. L. & Pub. Pol’y 403.
Bentham, Jeremy (1970) Of Laws in General. Ed. H.L.A. Hart. London: University of London, Athlone Press.
Binder, Guyora, and Robert Weisberg. Literary Criticisms of Law. Princeton: Princeton University Press, 2000.
Bix, Brian (1999) Natural Law Theory. In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson. Malden, Oxford: Blackwell Publishing.
Bobbitt, Philip (1996, 1999) “Constitutional Law and Interpretation.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing.
Bobbitt, Philip (1982) “A Typology of Constitutional Arguments.” Constitutional Fate: Theory of the Constitution. Oxford: Oxford University Press.
Brennan, Jr. William (1986) “The Constitution of the United States: Contemporary Ratification,” The Great Debate: Interpreting Our Written Constitution. Washington, D.C.: The Federalist Society.
Brest, Paul (1980) “The Misconceived Quest for the Original Understanding,” 60 B.U. L. Rev. 204.
Campos, Paul (1992) “Against Constitutional Theory,” Yale Journal of Law and the Humanities 4.
Campos, Paul (1993) “That Obscure Object of Desire: Hermeneutics and the Autonomous Legal Text,” Minnesota Law Review 77.
Cicero, Marcus Tullius (1988) De Re Publica; (On the Commonwealth). trans. C.W. Keyes, Cambridge: Harvard University Press.
Cicero, Marcus Tullius (1988) De Legibus (On the Laws). trans. C.W. Keyes, Cambridge: Harvard University Press.
Delgado, Richard (2012) “Centennial Reflections on the California Law Review’s Scholarship on Race: The Structure of Civil Rights Thought,” 100 Calif. L. Rev. 431.
Derrida, Jacques (1986) “But beyond…(Open Letter to Anne McClintock and Rob Nixon)” Trans. Peggy Kamuf. Critical Inquiry 13 (Autumn). 167-168.
Derrida, Jacques (1973) “Différance.” Speech and Phenomena, and Other Essays on Husserl’s Theory of Signs. Evanston: Northwestern University Press.
Dilthey, Wilhelm (1958) Gesammelte Schriften. Vol. II. Stuttgart: B.G. Teubner.
Epictetus (1926-1928) The Discourses as reported by Arrian, the Manual, and Fragments. London, W. Heinemann; New York, G.P. Putnam’s Sons.
Epictetus (2008) Enchiridion. Auckland: Floating Press.
Eskridge, Jr., William N. (1994) “Gaylegal Narratives,” 46 Stan. L. Rev. 607, 633.
Eskridge, Jr., William N. (1990) “Gadamer/Statutory Interpretation,” 90 Colum. L. Rev. 609.
Feinberg, Joel and Jules Coleman (2008) Philosophy of Law. Belmont: Thomson Wadsworth.
Fish, Stanley (1984) “Fiss v. Fiss,” 36 Stanford Law Review.
Fiss, Owen (1982) “Objectivity and Interpretation,” 34 Stanford Law Review 739.
George, Theodore D (2014) “Remarks on James Risser’s ‘The Life of Understanding: A Contemporary Hermeneutics,’” Philosophy Today, 58(1): 107-116.
Grotius, Hugo (1625) De iure belli ac pacis libri tres. Paris: Buon
Heidegger, Martin (1999) Ontology: Hermeneutics of Facticity. John van Buren (trans.) Bloomington: Indiana University Press.
Hoy, David Couzens (1978) The Critical Circle: Literature, History, and Philosophical Hermeneutics. Berkeley, Los Angeles, Oxford: University of California Press.
Hutchinson, A. (ed.) (1989) Critical Legal Studies. Totowa: Rowman & Littlefield.
Jones, Bernie D. (2002) “Critical Race Theory: New Strategies for Civil Rights in the New Millennium?” 18 Harv. BlackLetter J. 1.
Ladson-Billings, Gloria, “Race…to the Top, Again: Comments on the Genealogy of Critical Race Theory,” 43 Conn. L. Rev. 1439.
Levinson, Sanford (1980) “Law as Literature,” 60 Texas Law Review 373-403.
Levit, Nancy and Robert R.M. Verchick, Feminist Legal Theory, New York, London: New York University Press.
Mahmud, Tayyab (2014) “Foreword: Looking Back, Moving Forward: Latin Roots of the Modern Global and Global Orientation of Latcrit,” 12 Seattle J. Soc. Just. 699.
Levit, Nancy and R.M. Verchick (eds.) (2006) Feminist Legal Theory: A Primer. New York, London: New York University Press.
MacKinnon, C. (2013) “Intersectionality as a Method: A Note,” Signs, 38:4, Intersectionality: Theorizing Power, Empowering Theory, 1019-30.
Marx, Karl (1967) Capital. A Critical Analysis of Capitalist Production, Vol. 1, New York: International.
Marx, Karl (1859/1994) Preface to A Contribution to the Critique of Political Economy, in Simon, supra, 209.
Parks, Gregory Scott (2008) “Note: Toward a Critical Race Realism,” 17 Cornell J.L. & Pub. Pol’y 683.
Patterson, Dennis (ed.) (2003) Philosophy of Law and Legal Theory. Malden: Blackwell Publishing.
Patterson, Dennis (1996/1999) “Postmodernism.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 375-384.
Patterson, Dennis (1996) Law and Truth. Oxford, New York: Oxford University Press.
Poteat, W. H. (1985) Polanyian Meditations: in Search of a Post-Critical Logic. Durham: Duke University Press.
Raz, Joseph (2009) Between Authority and Interpretation: On the Theory of Law & Practical Reason, Oxford: Oxford University Press.
Rorty, R. (1988) The Linguistic turn: recent essays in philosophical method. Midway Reprint edition. Chicago: University of Chicago Press.
Rorty, R. (1979) Philosophy and the mirror of nature. Princeton: Princeton University Press.
Schmidt, Lawrence (ed.) (1995) The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, Evanston: Northwestern University Press.
Simmonds, N.E. (2007) Law as a Moral Idea, Oxford: Oxford University Press.
Teo, Thomas (2011) “Empirical Race Psychology and the Hermeneutics of Epistemological Violence,” Human Studies, 34(3): 237-255.
Thorsteinsson, Björn (2015) “From ‘Différance’ to Justice: Derrida and Heidegger’s ‘Anaximander’s Saying,’” Continental Philosophy Review, 48(2): 255-271.
Tushnet, Mark V. (1983) “Following the Rules Laid Down: A Critique of Interpretivism and Neutral Principles” 96 Harvard Law Review 781.
Valauri, John T. (1991) “Constitutional Hermeneutics” in The Interpretive Turn: Philosophy, Science, Culture, David R. Hiley, James F. Bohman, and Richard Shusterman, eds. Ithaca: Cornell University Press.
Valdes, Francisco (2003) “Outsider Jurisprudence, Critical Pedagogy and Social Justice Activism: Marking the Stirrings of Critical Legal Education,” Asian American Law Journal, 10(1): Article 7.
Vedder, Ben (2002) “Religion and Hermeneutic Philosophy,” International Journal for Philosophy and Religion, 51(1): 39-54.
Warner, Richard (1996, 1999) “Legal Pragmatism” In A Companion to Philosophy of Law and Legal Theory. ed. Dennis Patterson, Malden: Blackwell Publishing, 385-393.
West, Robin L. (2000) “Commentary: Are There Nothing but Texts in this Class? Interpreting the Interpretive turns in Legal Thought,” Chicago-Kent College of Law Chicago-Kent Law Review, 76 Chi.-Kent L. Rev. 1125, 19547.
Whyte, Megan K. “Going Back to Class? The Reemergence of Class in Critical Race Theory Symposium: Introduction: From Discourse to Struggle: A New Direction in Critical Race Theory,” 11 Mich. J. Race & L. 1.

Author Information

Tina Botts
Email: tina.botts@oberlin.edu
Oberlin College
U. S. A.

Charles Sanders Peirce: Logic

Charles Sanders Peirce (1839-1914) was an accomplished scientist, philosopher, and mathematician, who considered himself primarily a logician. His contributions to the development of modern logic at the turn of the 20th century were colossal, original and influential. Formal, or deductive, logic was just one of the branches in which he exercised his logical and analytical talent. His work built upon Boole’s algebra of logic and De Morgan’s logic of relations. He worked on the algebra of relatives (1870-1885), the theory of quantification (1883-1885), graphical or diagrammatic logic (1896-1911), trivalent logic (1909), and higher-order and modal logics. He also contributed significantly to the theory and methodology of induction, and discovered a third kind of reasoning, different from both deduction and induction, which he called abduction or retroduction, and which he identified with the logic of scientific discovery.

Philosophically, logic became for Peirce a broad discipline with internal divisions and external architectonic relations to other parts of scientific inquiry. Logic depends upon, or draws its principles from, mathematics, phaneroscopy (=phenomenology), and ethics, while metaphysics and psychology depend upon logic. One of the most important characters of Peirce’s late logical thought is that logic becomes coextensive with semeiotic (his preferred spelling), namely the theory of signs. Peirce divides logic, when conceived as semeiotic, into (i) speculative grammar, the preliminary analysis, definition, and classification of those signs that can be used by a scientific intelligence; (ii) critical logic, the study of the validity and justification of each kind of reasoning; and (iii) methodeutic or speculative rhetoric, the theory of methods. Peirce’s logical investigations cover all these three areas.

Logic among the Sciences
Logic as Semeiotic
Peirce’s Logic in Historical Perspective
References and Further Reading

1. Logic among the Sciences

Peirce’s idea of logic is guided by finding the location of logic in the map of the sciences. Peirce’s mature classification of the sciences (CP 1.180-202, 1903; see Brent 1987), which is a “ladder-like scheme” (MS 328, p. 20, c. 1905), takes superordinate sciences to provide principles to subordinated sciences, forming a ladder of decreasing generality.

According to Peirce’s 1903 scheme, which he as late as 1911 considered a satisfactory account (MS 675), sciences are either sciences of discovery, sciences of review, or practical sciences. Logic is a science of discovery. The sciences of discovery are divided into mathematics, philosophy and idioscopy. Mathematics studies the necessary consequences of purely hypothetical states of things. Philosophy, by contrast, is a positive science, concerning matters of fact. Idioscopy embraces more special physical and psychical sciences, and depends upon philosophy. Philosophy in turn divides into phaneroscopy, normative sciences and metaphysics. Phaneroscopy is the investigation of what Peirce calls the phaneron: whatever is present to the mind in any way. The normative sciences (aesthetic, ethics, and logic) introduce dichotomies, in that they are, in general, the investigation of what ought and what ought not to be. Metaphysics gives an account of the universe in both its physical and psychical dimensions. Since every science draws its principles from the ones above it in the classification, logic must draw its principles from mathematics, phaneroscopy, aesthetics and ethics, while metaphysics, and a fortiori psychology, draw their principles from logic (EP 2, pp. 258-262, 1903).

In sharp contrast to the logicist hypothesis, Peirce did not believe that mathematics depends upon deductive logic. On the contrary, in a sense it is deductive logic that depends upon mathematics. For Peirce, mathematics is the practice of deduction, logic its description and analysis: Peirce’s father Benjamin Peirce had defined mathematics as the science which draws necessary conclusions (B. Peirce 1870, p. 1). Hence deductive logic for Charles became the science of drawing necessary conclusions (CP 4.239, 1902). Logic cannot furnish any justification of a piece of deductive reasoning: deduction in general is in the first place mathematically, rather than logically, valid. And deductive logic is at any rate only a part of logic: “logic is the theory of all reasoning, while mathematics is the practice of a particular kind of reasoning” (MS 78, p. 4; see Haack 1993 and Houser 1993). Logic rather draws its principles from phaneroscopy, as the latter analyzes the structure of appearance but does not pronounce upon the veracity of such appearance. Logic also draws its principles from the normative sciences of ethics and esthetics (Peirce’s preferred spelling), which precede normative logic in the ladder of generality. Ethics depends on esthetics because ethics draws from esthetics the principles involved in the idea of a summum bonum, the highest good. Since ethics is the science that distinguishes good from bad conduct, it must be concerned with deliberate, self-controlled conduct, because only of deliberate conduct is it possible to say whether the conduct is good or bad. Logic treats of a special kind of deliberate conduct, thought, and distinguishes good from bad thinking, that is, valid from invalid reasoning. Since deliberate thought is a species of deliberate conduct, logic must draw its principles from ethics (CP 5.120-50; EP 2, pp. 196-207, 1903).

Of the sciences down the ladder of generality, metaphysics and psychology come out next. Peirce had learnt from Kant that metaphysical conceptions mirror those of formal logic. Peirce’s criticism ever since the 1860s had been that Kant’s table of categories was mistaken not because he based them upon formal logic but because the formal logic that Kant had used was itself poor and ultimately wrong (see NEM 4, p. 162, 1898). The only way to arrive at a good metaphysics is to begin with a good logical theory (EP 2, pp. 30-31, 1898). Psychology, too, depends upon logic. According to Peirce, different versions of logical psychologism characterized the logics of his time, especially in Germany. Logic for Peirce considers not what or how we in fact think but how we ought to think; logic is a normative, not a descriptive, science. The validity of an argument consists in the fact that its conclusion is true, always or for the most part, when its premises are true; it has nothing to do with reference to a mind. Logical necessity is a necessity of (non-empirical) facts, not a necessity of thinking. No appeal to psychology is thereby of any aid in logic. On the contrary, it is psychology that stands in the need of a science of logic (EP 2, pp. 242-257, 1903).

2. Logic as Semeiotic

In the 1890s (MS 595, 787), Peirce divided logic into three branches: speculative grammar (also called stechiology), logical critics (or just critics) and methodeutic (also called speculative rhetoric). The division echoes the three sciences of the medieval Trivium: grammar, dialectic and rhetoric.

Perhaps the most salient character of Peirce’s logic as a whole is that in his later works (MS L 75, 1902; MS 478, 1903; MS 693, 1904; MS 640, 1909) logic is identified with semeiotic, the science and philosophy of signs and representations. Already in his early works on the theory of inference Peirce had affirmed that logic is the branch of semeiotic that treats of one particular kind of representations, namely symbols, in their reference to their objects (W1, p. 309, 1865). By the beginning of the 20th century, he had shifted from the idea of “logic-within-semeiotic” to that of “logic-as-semeiotic.” He thus needed to distinguish between logic in the narrow sense, which he now calls logical critics, and logic in the wide sense; the latter is made coextensive with semeiotic. “Logic, in its general sense, is, as I believe I have shown, only another name for semiotic (σημειωτική), the quasi-necessary, or formal, doctrine of signs” (CP 2.227, c.1897; cf. Fisch 1986, pp. 338-341).

According to Peirce’s mature views, an enlargement of logic to cover all varieties of signs was a valuable methodological guide to the building of an objective, anti-psychological and formal logical theory: “The study of the provisional table of the Divisions of Signs will, if I do not deceive myself, help a student to many a lesson in logic” (MS S 46, 1906; cf. MS 283, c. 1905; MS 675-676, 1911; MS 12, 1912). Therefore, his logic contains, as a proper part of it, a study of its own scope and expansions. In homage to Thomas of Erfurt’s grammatica speculativa, which in Peirce’s time was misattributed to Duns Scotus, Peirce names this part of logic “speculative grammar.”

a. Speculative Grammar

In the 1890s Peirce regarded speculative grammar as an analysis of the nature of assertion (MS 409-8, 1894; CP 3.432, 1896, MS 787, c. 1897). Starting with the Syllabus of his 1903 Lowell Lectures (A Syllabus of Certain Topics of Logic, MS 478, MS 540), speculative grammar becomes a classification of signs. In the Syllabus Peirce defines a Sign or Representamen as

the first Correlate of a triadic relation, the second Correlate being termed its Object, and the possible Third Correlate being termed its Interpretant, by which triadic relation the possible Interpretant is determined to be the first correlate of the same triadic relation to the same Object, and for some possible Interpretant. (MS 540, CP 2.242; EP 2, p. 290).

A sign for Peirce is something that represents an independent object and which thereby brings another sign, called interpretant, to represent that object as the sign does. According to a long tradition in the history of logic, Peirce declares that the principal classes of signs that logic is concerned with are terms, propositions, and arguments. But by 1903 these three elements become parts of a larger taxonomic scheme.

From the Syllabus until at least 1909, Peirce continued experimenting with principles and terminologies, without however settling on any definitive division. This section presents the main principles of the Syllabus classification.

Signs are divisible by three trichotomies; first, according as the sign in itself is a mere quality, is an actual existent, or is a general law; secondly, according as the relation of the sign to its Object consists in the sign’s having some character in itself, or in some existential relation to that Object, or in its relation to an Interpretant; thirdly, according as its Interpretant represents it as a sign of possibility, or as a sign of fact, or a sign of reason. (CP 2.243, 1903)

The first trichotomy considers signs as (i) tones, when taken in their material qualities (such as the blueness of the ink), as (ii) tokens (such as any instance of the word “the”), and as (iii) general types (such as the word “the”). The second trichotomy is the best known, namely that of (i) icons, or signs that bear similarity or resemblance to their objects, (ii) indices, which have factual connections to their objects, and (iii) symbols, which have rational connections to their objects. The third trichotomy divides signs into terms, propositions, and arguments: Through his work on the logic of relatives (see § 2.b.iii.), Peirce had come to consider the terms as (i) rhemas, which are unsaturated predicates with logical bonds or subject-positions, in some ways similar to Frege’s Begriff and Russell’s propositional function; (ii) propositions, which unify subject and predicate and thus assert, or as in the Syllabus are dicisigns, signs that tell, and (iii) arguments, which embody the ultimate perfection and end of signs as a representation of facts that are signs of other facts, such as the premises being the sign of the conclusion. [Peirce’s theory of the proposition is articulated and highly original and has been thoroughly investigated in Hilpinen 1982, 1992; Ferriani 1987; Chauviré 1994; Stjernfelt 2014.]

There are cross-divisions of these three trichotomies across speculative grammar. A term or rhema is a symbol which is represented by its interpretant as an icon of its object, while a proposition or dicisign is a symbol which is represented by its interpretant as an index of its object. Arguments themselves are considered as symbols that represent their conclusion in three different ways: iconically in abduction, indexically in deduction, and symbolically in induction. (In an early cross-division proposed in 1867 these last two were interchanged. See W2, p. 58). Other outcomes of the classifications consisted in further divisions of objects and interpretants into various subtypes. [For more on Peirce’s classifications, see Weiss & Burks 1945; Short 2007, chs. 7-9; and Burch 2011.]

Grammatical taxonomy shows that there are three kinds of arguments, each manifesting a different semiotic principle. But it is up to the second branch of logic, critics, to investigate the question of logical validity and justification of such arguments. The analysis of the conditions of validity of these three kinds of reasoning is a critical, not grammatical, question.

b. Logical Critics

i. From Three Types of Inference to Three Stages of Inquiry

Logical critics is the heart of Peirce’s logic. It covers what usually goes under the name of logic proper, that is, the investigation of inference and arguments. Many 19th-century logicians (for example, John S. Mill, George Boole, John Venn and William Stanley Jevons) took the range of logic to include deductive as well as inductive logic. As appears from the classification, the remarkable novelty of Peirce’s logical critics is that it embraces three essentially distinct though not entirely unrelated types of inferences: deduction, induction, and abduction. Initially, Peirce had conceived deductive logic as the logic of mathematics, and inductive and abductive logic as the logic of science. Later in his life, however, he saw these as three different stages of inquiry rather than different kinds of inference employed in different areas of scientific inquiry.

Peirce had formulated a definite theory of logical leading principles as early as the late 1860s. His argument is roughly as follows. In any inference, we pass from some fact to some other fact that follows logically from it. The former is the premise (for in cases where there is more than one they may be colligated or compounded into one copulative premise), the latter is the conclusion.

P
∴C

The conclusion follows from the premise logically, that is, according to some leading principle, L. As logic supposes inferences to be analyzed and criticized, as soon as the logician asks what is it that warrants the passing from such premise to the conclusion she is obliged to express the leading principle L in a proposition and to lay it down as an additional premise:

P
L
∴C

This gives what Peirce calls a complete argument, in opposition to incomplete, rhetorical or enthymematic arguments. This second argument has itself its own leading principle L₁, which may again be expressed in a proposition and laid down as a further premise:

P
L
L₁
∴C

When L₁ is not a substantially different leading principle than L, then L is said to be a logical leading principle. In Peirce’s words:

This second argument has certainly itself a leading principle, although it is a far more abstract one than the leading principle of the original argument. But you might ask, why not express this new leading principle as a premise, and so obtain a third argument having a leading principle still more abstract? If, however, you try the experiment, you will find that the third argument so obtained has no more abstract a leading principle than the second argument has. Its leading principle is indeed precisely the same as that of the second argument. This leading principle has therefore attained a maximum degree of abstractness; and a leading principle of maximum abstractness may be termed a logical principle. (NEM 4, p.175, 1898)

A logical leading principle is therefore a formal or logical proposition which, when explicitly stated, adds nothing to the premises of the inference which it governs. The central question of logical critics becomes that of determining different kinds of logical leading principles.
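The regress described above can be made concrete with a small propositional illustration. This is an assumed reading for exposition, not Peirce’s own notation: the material leading principle L is taken to be the conditional P → C, and L₁ is taken to be modus ponens written out as a proposition. The display uses `\therefore`, which assumes the amssymb package.

```latex
\[
\begin{array}{ll}
\text{(i)}   & P \;\therefore\; C \\
\text{(ii)}  & P,\;\; P \rightarrow C \;\therefore\; C \\
\text{(iii)} & P,\;\; P \rightarrow C,\;\; \bigl(P \land (P \rightarrow C)\bigr) \rightarrow C \;\therefore\; C
\end{array}
\]
```

On this reading, argument (i) is governed by the material principle P → C, and argument (ii) by modus ponens. Argument (iii) adds modus ponens itself as a premise, yet the step to C is still licensed by modus ponens and by nothing more abstract. Stating the principle explicitly adds nothing to the premises, which is the mark of a logical leading principle.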

Peirce’s initial strategy to prove that there are three and only three irreducible kinds of reasoning was to use the syllogism. He gave the demonstration that the second and the third figure are reducible to the first only through the employment of the very figure that is to be reduced. The principles involved in the three syllogistic figures cannot then be reduced to a combination of other, more primitive principles, as they invariably enter as parts into the reduction proof itself. From this Peirce drew the broader conclusion that the three figures of syllogism correspond to the three kinds of inference in general: deduction corresponds to the first figure, abduction to the second, and induction to the third.

In Peirce 1878 and Peirce 1883, abduction and induction are described as inversions of a deductive syllogism. If we call the major premise of a syllogism in the first figure Rule, its minor premise Case, and its conclusion Result, then abduction may be said to be the inference of a Case from a Result and a Rule, while induction may be said to be the inference of a Rule from a Case and a Result.
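The three inversions just described can be laid out side by side. The layout below is a reconstruction from the description in the text, not a reproduction of Peirce’s own table; `\therefore` assumes the amssymb package.

```latex
\[
\begin{array}{lll}
\textbf{Deduction} & \textbf{Induction} & \textbf{Abduction (Hypothesis)} \\
\text{Rule}   & \text{Case}   & \text{Rule}   \\
\text{Case}   & \text{Result} & \text{Result} \\
\therefore\ \text{Result} & \therefore\ \text{Rule} & \therefore\ \text{Case}
\end{array}
\]
```

In each column the same three propositions appear; only the one that serves as conclusion changes. For instance, with the Rule “all the beans from this bag are white,” the Case “these beans are from this bag,” and the Result “these beans are white,” deduction concludes the Result, induction the Rule, and abduction the Case.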

Later in 1903 Peirce had come to the conclusion that the three kinds of reasoning are in fact three stages in scientific research. First comes abduction, now often also called retroduction, by which a hypothesis or conjecture that explains some surprising fact is set forth. Then comes deduction, which traces the necessary consequences of the hypothesis. Lastly comes induction, which puts those consequences to test and generalizes its conclusions.

Any inquiry is for Peirce bound to follow this pattern: abduction–deduction–induction. Each kind of inference retains its validity and modus operandi and is logically irreducible to either of the others; yet all three of them are necessary in any complete process of inquiry. Of the three methods, Peirce took deduction to be the most secure and the least fertile, while abduction is the most fertile and the least secure.

All three of these departments of critics epitomize the originality of Peirce’s contributions. The following sections deal with Peirce’s abductive, deductive, and inductive logics, respectively.

ii. Abductive Logic

The central question of abductive or retroductive logic is: is there a logic of scientific discovery? If yes, what are its justification and method?

Initially, Peirce described abduction as the inference of a Case from a Rule and a Result:

Hypothesis proceeds from Rule and Result to Case; it is the formula of the […] process by which a confused concatenation of predicates is brought into order under a synthetizing predicate. (Peirce 1883, p. 145)

Its general formula is this:

Result: S is M₁ M₂ M₃ M₄
Rule: P is M₁ M₂ M₃ M₄
Case: Therefore, S is P.

A certain number of surprising facts have been observed which call for explanation, and a single predicate embracing all of them is found which would explain them. When I notice that light manifests such-and-such complicated and surprising phenomena, and I know that ether waves exhibit those same phenomena, I conclude abductively that, if light were ether waves, it would be normal for it to manifest those phenomena. This offers rational ground for the hypothesis that light is ether waves.

In 1900, Peirce began viewing this description of abduction as inadequate. What he in 1883 had called hypothesis or abduction was actually induction about characters instead of things and is therefore better called qualitative induction (see § 2.b.iv): its leading principle is inductive and not abductive. Abduction is no longer constrained by the syllogistic framework. Most generally, it is the non-inductive process of forming an explanatory hypothesis. In Peirce’s words, abduction “is the only logical operation which introduces any new idea” (CP 5.172, 1903). Although abduction asserts its conclusions only conjecturally, it has a definite logical form. The following has become its standard, albeit not ultimately satisfactory, description since Peirce stated it in the seventh of the Harvard Lectures of 1903:

The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (CP 5.189, 1903)

This schema reveals why abduction is also called retroduction: it is reasoning that leads from a consequent of an admitted consequence to its antecedent.
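The 1903 schema can be mimicked in a few lines of code. The sketch below is only a toy model under stated assumptions: background knowledge is represented as a hypothetical list of `Conditional` pairs ("if A were true, C would be a matter of course"), and `abduce` is an illustrative name, not anything found in Peirce. It returns conjectures worth suspecting; it does not assert them.

```python
from typing import NamedTuple


class Conditional(NamedTuple):
    """Hypothetical record: if `antecedent` were true, `consequent` would be a matter of course."""
    antecedent: str
    consequent: str


def abduce(surprising_fact: str, background: list[Conditional]) -> list[str]:
    """Reason from consequent to antecedent: collect every A that would explain C.

    The result is a list of conjectures to be tested, not of established truths.
    """
    return [c.antecedent for c in background if c.consequent == surprising_fact]


# Illustrative background knowledge (invented for this sketch).
background = [
    Conditional("it rained last night", "the lawn is wet"),
    Conditional("the sprinkler ran", "the lawn is wet"),
    Conditional("it rained last night", "the street is wet"),
]

candidates = abduce("the lawn is wet", background)
print(candidates)
```

That more than one candidate can come back reflects the point that abduction is fertile but insecure: choosing among the conjectures and testing them is left to the later, deductive and inductive, stages of inquiry.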

Another description of the logical form of abduction is contained in a later, unpublished manuscript:

In the inquiry, all the possible significant circumstances of the surprising phenomenon are mustered and pondered, until a conjecture furnishes some possible Explanation of it, by which I mean a syllogism exhibiting the surprising fact as necessarily following from the circumstances of its occurrence together with the truth of the conjecture as premisses. (MS 843, p. 41, 1908)

The explaining syllogism is the inversion of the 1903 formula:

If A were true, C would be observable.
A is true.
Therefore, C is observable.

One more and hitherto unknown formulation of retroduction is found in an unpublished letter to Lady Welby:

[The] “interrogative mood” does not mean the mere idle entertainment of an idea. It means that it will be wise to go to some expense, dependent upon the advantage that would accrue from knowing that Any/Some S is M, provided that expense would render it safe to act on that assumption supposing it to be true. This is the kind of reasoning called reasoning from consequent to antecedent. For it is related to the Modus Tollens thus:

Instead of “interrogatory”, the mood of the conclusion might more accurately be called “investigand”, and be expressed as follows:

It is to be inquired whether A is not true.

The reasoning might be called “Reasoning from Surprise to Inquiry.” (Peirce to Welby, July 16, 1905, LoF, pp. 907-908)

The whole course of thought, consisting in noticing the surprising phenomenon, searching for pertinent circumstances, asking a question, forming a conjecture, remarking that the conjecture appears to explain the surprising phenomenon, and adopting the conjecture as plausible, constitutes the first, abductive stage of inquiry. Nonetheless, its crucial phase is that of forming the conjecture itself. This is often described by Peirce as an act of insight, or an instinct for guessing right, or what Galileo called il lume naturale. [That Peirce actually got the phrase from Galileo has sometimes been contested. But see the story by Victor Baker in Bellucci, Pietarinen & Stjernfelt (2014) on the “Myth of Galileo.” Baker refers to Jaime Nubiola’s finding of Peirce’s copy of Galileo’s Opere that had that phrase underlined in Peirce’s hand. That fifteen-volume edition was, at least in 2012, still to be found at the Robbins Library at the Department of Philosophy, Harvard University.]

However, to pronounce reasoning to be instinctive would amount to excluding it from the realm of logic. For logic only considers reasoning, and reasoning is a deliberate act subject to self-control. According to Peirce, abduction is an inference type based upon a logical principle. In its most abstract shape, such logical principle gives abduction its justification, and the justification of abduction is the bottom question of logical critics (EP 2, p. 443, 1908).

According to Peirce, abduction “consists in studying the facts and devising a theory to explain them” (CP 5.145, 1903). Its only justifications are that “if we are ever to understand things at all, it must be in that way,” and that “its method is the only way in which there can be any hope of attaining a rational explanation” (CP 2.777, 1902). The only justification for a hypothesis is that it might explain the facts. But in general, an inference is valid if its leading principle is an instance of a logical principle which is conducive to the acquisition of new information. Therefore, the logical leading principle of all abductions is that nature, in general, is explainable. To suppose something inexplicable is contrary to the principles of logic: such supposition only has the appearance of an explanation conducive to the acquisition of new information, but to really suppose something inexplicable is to renounce knowledge.

That nature is explicable is therefore the primary abduction underlying all possible abductions. Human powers of insight may well be justified also inductively, that is, as testified by the history of science. But abduction’s primary justification is abductive rather than inductive: if we are to acquire new knowledge at all, sooner or later we must reason abductively. [See Bellucci & Pietarinen 2014; Burks 1946; Fann 1970; Kapitan 1992; Kapitan 1997; Paavola 2004 for further details on Peirce’s theory of abduction or retroduction, and Ma & Pietarinen 2015 for a dynamic logic approach to Peirce’s interrogative abduction.]

Of these three stages of reasoning, abduction is the most fertile but the least secure. For this reason, Peirce affirms that abduction is the principal kind of reasoning in which, after logical critics has pronounced it valid, it remains to be inquired whether and how it is advantageous. To carry out such tasks pertains to the third branch of logic, methodeutic, which is discussed in § 2.c below.

iii. Deductive Logic

The works of George Boole and Augustus De Morgan provided the essential backdrop for Peirce’s development of deductive logic. Benjamin Peirce’s Linear Associative Algebra (B. Peirce 1870) also influenced his son’s early development of the algebraic logic of relatives. Peirce’s dissatisfaction with how Boole represented syllogisms as algebraic equations led him to develop new algebraic approaches to logic, which he did by combining Boole’s calculus (Boole 1847, 1854) with De Morgan’s treatment of relations (De Morgan 1847, 1860).

Some of Peirce’s most important contributions to the development of modern logic are highlighted below.

1867: In the paper “An Improvement in Boole’s Calculus of Logic” published in the Proceedings of the American Academy of Arts and Sciences (Peirce 1867), Peirce subscripted all operation symbols with a comma to differentiate the logical from the arithmetical interpretation. He also made Boole’s union operator inclusive rather than exclusive (he was anticipated in this by Jevons 1864). Peirce became aware of the limitations of Boole’s algebraic logic, such as that it cannot express particular categorical propositions (Some X is Y), so it fails to properly represent quantification.

1870: Peirce’s development of De Morgan’s theory of relations is fully set out in his paper “Description of a Notation for the Logic of Relatives, resulting from an Amplification of the Conceptions of Boole’s Calculus of Logic” (Peirce 1870), which was communicated to the American Academy of Arts and Sciences in January 1870. In this paper, Peirce combines De Morgan’s theory with Boole’s calculus. The result is a logical algebra equivalent in expressive power to first-order predicate logic without identity. Peirce’s paper introduces a number of innovations. Among them is a new process of logical differentiation, explained in Welsh (2012, pp. 166-180). Peirce also introduces the copula of inclusion, -<, later also termed the sign of illation and also written in a cursive form. Inclusion is for Peirce a wider and logically simpler concept than that of equality. This difference marks another important departure from Boole’s mathematical algebra towards new types of logical algebras. Beginning with C. I. Lewis’s (1918) comments on Peirce’s algebra, the literature has discussed whether in this 1870 paper what Peirce calls relative terms are to be equated with relations (see Merrill 1997 for a summary and further references). It appears that at the very least they are what verbs and phrases express linguistically, such as lover of___, whatever is a lover of____, or buyer of____for____from____. Here blanks stand for nouns that are required to complete the expressions. Beginning in 1882, Peirce comes to define relative terms as classes of ordered pairs (later understood as being relations). He denotes such terms by rhemas with blanks for expressions standing for subjects, such as ____is a lover of____, whatever____is a lover of____ and so forth. Of note is that the algebra of Peirce’s 1870 paper is able to express various forms of quantification, although the term “quantifier” and its modern conception were to emerge in his works only later, after the early 1880s.

1880a: The long paper “On the Algebra of Logic” (Peirce 1880a), which was published in the American Journal of Mathematics, introduces a number of further developments of which the following six are listed here:

(1) The copula, expressed as a binary relation between classes of propositions, P_i -< C_i, is now understood to express the notion of semantic consequence, namely that “every state of things in which a proposition of the class P_i is true is a state of things in which the corresponding propositions of the class C_i are true” (W4, p. 166). The binary relation here is thus a truth-functional implication. Moreover, the remarks in his Logic Notebook published in W4 (p. 216) were written in the same year and appear to be the first use of the symbols v and f to denote the truth values true and false.

(2) A dash over a symbol is used to denote the negative of the symbol. A dash over the sign of illation in P_i -< C_i indicates the class complement. Constants ∞ and 0 are taken to mean the values of the possible and the impossible. An important modal component which Peirce would develop later on is thus emerging in this work.

(3) The totality of all that is possible is according to Peirce the “universe of discourse, and may be very limited,” that is, limited to that which “actually occurs,” rendering “everything which does not occur” impossible (W4, p. 170). The important idea of working with variable and restricted domains marks a clear difference not only from Frege, whose logic is well known to quantify over the entire “logical thought”, but also from Schröder, who, though also working on the algebra of logic, nonetheless rendered Peirce’s 1885 algebra of logic (see below) so as to quantify, in a Fregean fashion, over what Peirce later, in 1903, called “the whole universe of logical possibility” (MS 478, pp. 163-4).

(4) A new operation on relatives, which Peirce termed transaddition (º) is then introduced (W4, p.204). Taking two relatives, such as being lover of___ (l) and being servant of___ (s), their relative product ls denotes whatever is lover of a servant of___. Their transaddition l º s denotes whatever is not a lover of everything but servants of___, that is, it denotes a complement of the complements of the relative product of the two terms l and s.

(5) The 1880 paper is also the first in which the idea of a relative sum, which is the complement of the transaddition and which Peirce in 1882 will denote by the dagger (†), is employed. For example, l † s reads lover of everything but servants of___. Hence this 1880 paper marks a decisive move towards a theory of quantification which will see its emergence in his 1883 Note B in the Studies in Logic (see below) and which comes to be completed in his 1885 “Algebra of Logic” paper (see below).

(6) The 1880 paper also suggests a mathematical theory of lattices for the treatment of the algebra of logic (W4, pp. 183-188).

Arthur Prior (1958, 1964) showed that Peirce’s 1880 paper provides a complete basis for propositional logic.

1880b: In an unpublished manuscript (MS 378, Peirce 1880b) entitled “A Boolian Algebra with One Constant”, which as late as 1926 was still tagged “to be discarded” at the Department of Philosophy at Harvard University, Peirce reduces the number of logical operations to one constant. He states that “this notation … uses the minimum number of different signs … shows for the first time the possibility of writing both universal and particular propositions with but one copula” (W4, p. 221). Peirce’s notation was later termed the Sheffer stroke, and is also well known as the NAND operation, in Peirce’s terms the operation by which “[t]wo propositions written in a pair are considered to be both denied” (W4, p. 218). In the same manuscript, Peirce also discovers the expressive completeness of the NOR operation, today rightly recognized as the Peirce arrow.
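The claim that a single constant suffices can be checked mechanically. The following is a minimal sketch in modern Boolean terms, not Peirce's own notation; the function names are illustrative assumptions. It defines negation, conjunction and disjunction from NAND alone and from NOR alone, and compares them with the usual connectives on every assignment of truth values.

```python
def nand(a: bool, b: bool) -> bool:
    """True unless both inputs are true."""
    return not (a and b)


def nor(a: bool, b: bool) -> bool:
    """True only when both inputs are false."""
    return not (a or b)


# Definitions built from NAND alone (illustrative names).
def not_nand(a): return nand(a, a)
def and_nand(a, b): return nand(nand(a, b), nand(a, b))
def or_nand(a, b): return nand(nand(a, a), nand(b, b))


# Definitions built from NOR alone (illustrative names).
def not_nor(a): return nor(a, a)
def and_nor(a, b): return nor(nor(a, a), nor(b, b))
def or_nor(a, b): return nor(nor(a, b), nor(a, b))


def single_constant_suffices() -> bool:
    """Compare the derived connectives with the built-in ones on all inputs."""
    for a in (False, True):
        if not_nand(a) != (not a) or not_nor(a) != (not a):
            return False
        for b in (False, True):
            if and_nand(a, b) != (a and b) or or_nand(a, b) != (a or b):
                return False
            if and_nor(a, b) != (a and b) or or_nor(a, b) != (a or b):
                return False
    return True


print(single_constant_suffices())
```

Since negation together with conjunction (or disjunction) already expresses every truth function, recovering them from one operation is enough to establish expressive completeness for that operation.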

1881: “On the Logic of Number” (Peirce 1881), published in American Journal of Mathematics and read before the National Academy of Sciences, was noted by Gerrit Mannoury (1909, pp. 51, 78) to be the first successful axiomatization of natural numbers. Shields (1981/2012) has shown Peirce’s axiom system to be equivalent to the better-known systems of Dedekind (1888) and Peano (1889). Peirce’s paper formulates, presumably for the first time, the notions of partial and total linear orders, recursive definitions for arithmetical operations, and the general definition of cardinal numbers in terms of ordinals. The paper also provides a purely cardinal definition of a finite set (Dedekind-finite) by checking whether De Morgan’s syllogism of transposed quantity is valid. [The syllogism of transposed quantity is expressed in the following mode of inference: Every Texan kills a Texan; Nobody is killed by but one person; Hence, every Texan is killed by a Texan.] Peirce then derives in this paper the latter property of finiteness from the ordinal one. Doing the converse assumes the axiom of choice.
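The link between the syllogism of transposed quantity and finiteness can be illustrated by brute force. The sketch below rests on a modelling assumption that is not Peirce's own formulation: "x kills y" is treated as a function f from a set to itself, "nobody is killed by but one person" as injectivity of f, and "every Texan is killed by a Texan" as surjectivity. The function names are hypothetical.

```python
from itertools import product


def is_injective(f: tuple) -> bool:
    """No two individuals are mapped to the same victim."""
    return len(set(f)) == len(f)


def is_surjective(f: tuple, n: int) -> bool:
    """Every individual is the victim of somebody."""
    return set(f) == set(range(n))


def transposed_quantity_is_valid(n: int) -> bool:
    """Check every self-map of {0, ..., n-1}: injective implies surjective."""
    for f in product(range(n), repeat=n):  # f[x] is the individual x kills
        if is_injective(f) and not is_surjective(f, n):
            return False
    return True


for size in range(1, 6):
    print(size, transposed_quantity_is_valid(size))
```

On an infinite collection the inference fails: the successor map on the natural numbers is injective, yet 0 is the successor of nothing. This is why validity of the syllogism can serve as a purely cardinal test of finiteness.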

During 1881-2, Peirce edited a book, published in 1883 and entitled Studies in Logic by Members of the Johns Hopkins University (Peirce 1883). It contained significant graduate work by his students Benjamin I. Gilman, Christine Ladd(-Franklin), Allan Marquand and Oscar Howard Mitchell. Peirce contributed to the volume a paper “A Theory of Probable Inference”, together with Note A, “A Limited Universe of Marks”, and Note B, “The Logic of Relatives.” Some developments in Mitchell’s paper as well as in Note B are worth highlighting.

Mitchell’s “On a New Algebra of Logic” was hailed by his teacher as “one of the greatest contributions that the whole history of logic can show” (MS 492; LoF, p. 225). Peirce attributed to Mitchell two major discoveries: first, the invention of the basic form of proof transformation and second, the interpretation of quantifiers in multiple dimensions, one of which is time. The former is similar to the resolution rule in logic programming, and consists of a series of insertions (by adding to premises) and erasures (by elimination of consequents). In Peirce’s words, “the passage from a premiss or premisses … to a necessary conclusion in the manner to which is alone usually called necessary reasoning, can always be reached by adding to the stated antecedents and subtracting from stated consequents, being understood that if an antecedent be itself a conditional proposition, its antecedent is of the nature of a consequent” (MS 905; LoF, pp. 731-732). The latter discovery—universes in multiple dimensions—has its correlate in the idea of interpreted domains and in the modern notion of temporal logics and many-sorted quantification. It can also be seen as a development of new languages that take the role of indices in the quantifiers to be mappings from contexts to values in universes of discourse. Having in mind Mitchell’s pioneering idea of logical dimensions, Peirce goes on to mention that the study of Mitchell’s paper was for him necessary in order to break “ground in the gamma [modal logic] part of the subject” of existential graphs (MS 467; LoF, p. 332; see below). Years later, Peirce defines the term “dimension” in the Dictionary of Philosophy and Psychology by noting that it is

an element or respect of extension of a logical universe of such a nature that the same term which is individual in one such element of extension is not so in another. Thus, we may consider different persons as individual in one respect, while they may be divisible in respect to time, and in respect to different admissible hypothetical states of things, etc. This is to be widely distinguished from different universes, as, for example, of things and of characters, where any given individual belonging to one cannot belong to another. The conception of a multidimensional logical universe is one of the fecund conceptions which exact logic owes to O. H. Mitchell. Schröder, in his then second volume, where he is far below himself in many respects, pronounces this conception ‘untenable’. But a doctrine which has, as a matter of fact, been held by Mitchell, Peirce, and others, on apparently cogent grounds, without meeting any attempt at refutation in about twenty years, may be regarded as being, for the present, at any rate, tenable enough to be held. (DPP 2, p. 27)

Mitchell develops, for the first time, the idea and the notation for existential and universal quantifiers, and notices that it is from alternations of these quantifiers that logic derives its expressive power. Peirce testifies in the same dictionary entry that placing Σ and Π in alternating orders “was probably first introduced by O. H. Mitchell in his epoch-making paper” (DPP 2, p. 650). However, being limited to monadic predicates, Mitchell’s language was deprived of some expressive power.

Having supervised and perused Mitchell’s paper, in Note B of the Studies in Logic, Peirce generalizes the groundwork Mitchell had laid on the theory of quantification. His theory of relatives adds indices as individual variables to the operators Σ and Π to denote individual objects. Relative products and relative sums are then defined as (lb)_ij = Σ_x (l)_ix (b)_xj and (l † b)_ij = Π_x {(l)_ix + (b)_xj}, thus becoming species of existential and universal quantification: the lover of a benefactor is “a particular combination, because it implies the existence of something loved by its relate and a benefactor of its correlate.” The lover of everything but benefactors is “universal, because it implies the non-existence of anything except what is either loved by its relate or a benefactor of its correlate” (Peirce 1883, p. 189). Peirce had already had the relative sum at his disposal and the idea of it expressing the non-existence of exceptions naturally led to its dual of the existential quantification. Towards the end of Note B Peirce writes something is a lover of something as Σ_i Σ_j l_ij, everything is a lover of something as Π_i Σ_j l_ij, there is something which stands to something in the relation of loving everything except benefactors of it as Σ_i Σ_k Π_j (l_ij + b_jk), and so on. Taking α to denote accuser to___of___, ε excuser to___of___, and π preferrer to___of___, Π_i Σ_j Σ_k (α)_ijk (ε_jki + π_kij) means that “having taken any individual i whatever, it is always possible so to select two, j and k, that i is an accuser to j of k, and also is either excused by j to k or is something to which j is preferred by k” (Peirce 1883, p. 201).
The phrasing Peirce uses here (such as “having taken any individual”, “it is always possible so to select”) is indicative of a new semantic treatment of quantifiers and sequences of quantifiers which he goes on to pursue further in later papers, and which Hilpinen (1982), Hintikka (1996) and Pietarinen (2006) have shown to agree with game-theoretic semantics. Interestingly, Peirce’s examples are all stated in prenex normal form, which highlights the idea of sequences of dependent quantifiers. Peirce’s quantifiers bind variables ranging over interpreted domains. In this 1883 paper, he provides the basic inference rules, such as Σ_iΠ_j -< Π_jΣ_i, for manipulating the strings of quantifiers. The language is not inductively defined, it lacks notation for functions, and it uses neither constants nor an equality sign, but in other respects it coincides with that of first-order predicate calculus.
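The definitions of the relative product and the relative sum can be read off directly as existential and universal quantification over a finite universe. The sketch below is an illustration under stated assumptions: relatives are modelled as sets of ordered pairs, the universe and the sample relatives are invented, and the function names are hypothetical. It also checks, by exhausting all relations on a three-element universe, that an existential followed by a universal quantifier implies the universal followed by the existential.

```python
from itertools import product

UNIVERSE = ("ann", "bob", "cy")          # invented sample universe
LOVES = {("ann", "bob")}                 # l: ann is a lover of bob
BENEFITS = {("bob", "cy")}               # b: bob is a benefactor of cy


def relative_product(l, b, universe):
    """(lb)_ij = Σ_x (l)_ix (b)_xj: some x is loved by i and benefits j."""
    return {(i, j) for i, j in product(universe, repeat=2)
            if any((i, x) in l and (x, j) in b for x in universe)}


def relative_sum(l, b, universe):
    """(l † b)_ij = Π_x {(l)_ix + (b)_xj}: every x is loved by i or benefits j."""
    return {(i, j) for i, j in product(universe, repeat=2)
            if all((i, x) in l or (x, j) in b for x in universe)}


def complement(r, universe):
    """All ordered pairs of the universe that are not in r."""
    return set(product(universe, repeat=2)) - r


def quantifier_rule_holds(universe):
    """Check that Σ_i Π_j r_ij implies Π_j Σ_i r_ij for every relation r."""
    pairs = list(product(universe, repeat=2))
    for bits in product((False, True), repeat=len(pairs)):
        r = {pair for pair, bit in zip(pairs, bits) if bit}
        some_all = any(all((i, j) in r for j in universe) for i in universe)
        all_some = all(any((i, j) in r for i in universe) for j in universe)
        if some_all and not all_some:
            return False
    return True


print(relative_product(LOVES, BENEFITS, UNIVERSE))
print(quantifier_rule_holds(UNIVERSE))
```

In this encoding the relative sum is the complement of the relative product of the complements, which is the De Morgan duality between the two quantifiers.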

Alfred Tarski’s summary concerning Peirce’s contributions to the logical theory of relatives is illuminating:

[t]he title of creator of the theory of relations was reserved for C. S. Peirce. In several papers published between 1870 and 1882, he introduced and made precise all the fundamental concepts of the theory of relations and formulated and established its fundamental laws. Thus Peirce laid the foundation for the theory of relations as a deductive discipline; moreover he initiated the discussion of more profound problems in this domain. In particular, his investigations made it clear that a large part of the theory of relations can be presented as a calculus which is formally much like the calculus of classes developed by G. Boole and W. S. Jevons, but which greatly exceeds it in richness of expression and is therefore incomparably more interesting from the deductive point of view. (Tarski 1941, p. 73)

However, it is his 1885 theory of quantification that Peirce calculated to settle the problems of deductive logic and logical analysis in a way that decidedly brought him beyond the algebraic approach to the logic of relatives.

1885: Peirce’s logic of quantifiers comes to a full blossom in his paper, written in summer 1884, “On the Algebra of Logic: A Contribution to the Philosophy of Notation”, and published in the American Journal of Mathematics in the following year (Peirce 1885). This massive paper defies any condensed exposition; but in summary, it contains Peirce’s “five icons of algebra” as a system of natural deduction based on introduction and elimination rules. Peirce had repeatedly stated that his having supervised and examined Mitchell’s paper was essential in order to arrive at the idea of these two basic operations. There is an abundant use of truth-functional propositions and an anticipation of the truth-table method to test tautologies. One of the examples comes close to the tableaux method, later proposed by Evert Beth and Jaakko Hintikka, that spells out a systematic search for counter-models by deriving contradictions from the negations of the formula to be proved. In order “to find whether a formula is necessarily true,” he says, “substitute f and v for the letters and see whether it can be supposed false by any such assignment of values” (Peirce 1885, p. 224; Pietarinen 2006; Anellis 2012a).
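The test quoted above, substituting f and v for the letters and looking for a falsifying assignment, is in effect an exhaustive truth-table search. A minimal sketch follows; the helper names are hypothetical, and the first sample formula is the one commonly called Peirce's law.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth-functional implication."""
    return (not a) or b


def is_necessarily_true(formula, letters) -> bool:
    """Try every assignment of f (False) and v (True) to the letters."""
    for values in product((False, True), repeat=len(letters)):
        assignment = dict(zip(letters, values))
        if not formula(assignment):
            return False  # the formula can be supposed false
    return True


def peirces_law(v):
    """((x -< y) -< x) -< x"""
    return implies(implies(implies(v["x"], v["y"]), v["x"]), v["x"])


def converse_swap(v):
    """(x -< y) -< (y -< x), which is not necessarily true."""
    return implies(implies(v["x"], v["y"]), implies(v["y"], v["x"]))


print(is_necessarily_true(peirces_law, ["x", "y"]))
print(is_necessarily_true(converse_swap, ["x", "y"]))
```

The second formula fails when x is false and y is true, which is the kind of counter-assignment the method is designed to find.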

When he moves on to the first-order (“first-intentional”) logic, Peirce seeks to devise a notation that is as iconic as possible, building on his semiotic insight that the more iconic a notation is, the better suited it would be for logical analysis. He starts by using “Σ for some, suggesting a sum, and Π for all, suggesting a product” (1885, p. 180). Once again, Peirce credits Mitchell, now for the method of separating the “quantifying part”—which he later termed the “Hopkinsian” to honour its place of discovery (MS 515, 1902)—from the pure Boolean expression: the latter refers to an individual by its use of indices (like pronouns in language) while the former states what that individual is. The quantifying operators are, however, “only similar to a sum and product,…because the individuals of the universe may be denumerable” (1885, p. 180). Peirce’s consideration illustrates similar lines of thought as those that prompted Löwenheim to formulate his famous 1915 theorem: if a first-order sentence has a model then it has also a countable model, or generally, models for sets of formulas being of some cardinality imply models of some other infinite cardinality (Badesa 2004). (Associating infinite products and sums with conjunctions and disjunctions was what Wittgenstein took to be his own biggest mistake in logic.) The 1885 paper continues introducing rules for quantifier manipulation, including “putting the Σs to the left, as far as possible” (1885, p. 182), which is a prelude to the idea of Skolem normal forms. One could say that it is the sequences of quantifiers, especially those of dependent quantifiers, that contribute to a linear logic notation as being maximally iconic, and that it is the prenex and Skolem normal forms that bring out maximal analyticity which logical icons exploit. The 1885 paper then presents many examples drawn from natural language to be analyzed logically with this new notation. 
The paper also extensively deals with issues having to do with the representation of mathematical notions such as one-to-one correspondence and identity in the second-intentional logic, developed in the third part of the paper, and in which variables range over relations. There is an early attempt at axiomatizing set theory as well as some profound philosophical consideration on the possibility of developing a “method for the discovery of methods in mathematics,” which is to be based on these new approaches that aim at formulating a general theory of deductive logic.

Thanks to the volumes that have appeared in the Chronological Edition of the Writings between 1982 and 2010, and which have covered Peirce’s work up to 1892, these earlier phases of Peirce’s deductive logic are now relatively well understood. But the research from that point on has been hampered by the unavailability of systematic editions concerning Peirce’s later logical writings. Yet the mid-1890s mark only the beginnings of a new and by far the most productive era in Peirce’s logical investigations, which were to last until the last months of his life. This situation has by no means been adequately reflected in the secondary literature.

Although Peirce would continue his investigations on the algebra of logic throughout his life, the algebraic element would no longer assume a central position in his overall oeuvre:

In 1895 Schröder published the third huge volume of his logic, which consisted mainly of a vast elaboration in detail of the logical algebra of my Note B. That I never considered that algebra to be a great masterpiece is sufficiently shown by my giving my exposition of it no other title than “Note B.” The perusal of Schröder’s book convinced me that the algebra was not what was wanted, and in the Monist for January 1897 I produced a system of graphs which I now term Entitative Graphs. I shortly after abandoned that and took up Existential Graphs. (MS 467; LoF, p. 332)

Although it was Schröder’s elaboration that was to influence the works of the early model theorists such as Löwenheim and Skolem (Brady 2000), it was Peirce’s and Mitchell’s works that germinated the concept of first-order statements being true in a model (Pietarinen 2006; Bellucci & Pietarinen 2015a). Moreover, Peirce’s incessant hunt for new logical notations and methods was much more ambitious and philosophical than his early algebraic investigations revealed.

What was to take the place of algebra were the ideas that emerged from diagrammatic, iconic and topological considerations on logical representation and reasoning. These considerations were at first prompted by logical analogues to algebraic invariants in chemistry first developed by Peirce’s Johns Hopkins colleague J. J. Sylvester (1878) and investigated in Kempe (1886). Peirce was initially fascinated by the analogy in which a chemical atom is like a relative “in having a definite number of loose ends or “unsaturated bonds”, corresponding to the blanks of the relative” (CP 3.469, 1897). But the continual search for better and better notations for the overall purposes of logical analysis would also reveal the reasons why Peirce had to overcome this analogy between logic and chemistry.

Peirce’s theory of Existential Graphs (EGs), first conceived in summer 1896 and developed in subsequent years (for example Peirce 1897, 1906), was in part motivated by his need to respond to the expressive insufficiency and lack of analytic power of the systems described in his Note B, which he later termed the algebra of dyadic (dual) relatives, and in the 1885 general (universal) algebra of logic. The analytic power comes from the idea of subsuming what the algebraic operations do when composing concepts under one mode of composition. This composition of concepts is effected in the theory of EGs by the device of ligatures. A ligature is a complex line, composed of what Peirce terms the lines of identities, which connects various parts and areas of the graphs: [See e.g. Zeman 1964; Roberts 1973; Shin 2002; Dipert 2006; Pietarinen 2005, 2006, 2011, 2015a.]

Fig. 1

Fig. 2

Fig. 3

The meaning of these lines is that the two or more descriptions apply to the same thing. For example, in Figure 2 there is a horizontal line attached to the predicate term “is obedient.” It means that “something exists which is obedient.” There is also another line which connects to the predicate term “is a catholic,” and that composition means that “something exists which is a catholic”, which is equivalent to the graph-instance in Figure 3. Since in Figure 1 these two lines are in fact connected by a continuous line, the graph-instance in Figure 1 means that “there exists a catholic which is obedient,” that is, “there exists an obedient catholic.” Ligatures, representing continuous connections composed of two or more lines of identities, stand for quantification, identity and predication, all in one go.

These EGs are drawn on a sheet of assertion that represents what the modeller knows or what mutually has been agreed upon to be the case by those who undertake the investigation of logic. The sheet thus represents the universe of discourse. The graph that is drawn on the sheet puts forth an assertion, true or false, that there is something in the universe to which it applies. This is the reason why Peirce terms these graphs existential. Drawing a circle around the graph, or alternatively, shading the area on which the graph-instance rests, means that nothing exists of the sort of description intended. In Figure 4, the assertion “something is a catholic” is denied by drawing an oval around it and thus severing that assertion from the sheet of assertion:

Fig. 4

The graph-instance depicted in Figure 4 thus means that “something exists that is not catholic.”

Peirce aimed at a diagrammatic syntax that would use a minimal number of logical signs but at the same time be maximally expressive and as analytic as possible. His ovals, for instance, have different notational functions: “The first office which the ovals fulfill is that of negation. […] The second office of the ovals is that of associating the conjunctions of terms. […] This is the office of parentheses in algebra” (MS 430, pp. 54-56, 1902). The ovals are thus not only the diagrammatic counterpart to negation but also serve to represent the compositionality of a graph-formula. He held (MS 430, 1902; MS 670, 1911) that a notation that does not separate the sign of truth-function from the representation of its scope is more analytic than some other notation, such as that of an ordinary “symbolic” language, where such a separation is needed to force a one-dimensional notation. The role of ovals as denials is in fact a derived function from more primitive considerations of inclusion and implication (Bellucci & Pietarinen 2015; MS 300, 1908).

As to expressivity, Peirce had already recognized that the notion of dependent quantification was essential in any system expressive enough to serve the purposes of logical analysis of any assertions. The nested system of ovals in EGs effectuates this in a natural way, much in contrast to algebras that resort to an explicit use of parentheses and other punctuation devices. For example, the graph in Figure 5 means that “Every Catholic adores some woman.” The graph in Figure 6 means that “Some woman is adored by every Catholic.” Peirce notes that the latter asserts more since it states that all Catholics adore the same woman, whereas the former allows different Catholics to adore different women.

Fig. 5

Fig. 6

The graph in Figure 7 means that “anything whatever is unloved by something that benefits it,” that is, “everything is benefitted by something or other that does not love it”:

Fig. 7

Lastly, Figure 8 provides an example of a very complex graph taken from MS 504 (1898):

Fig. 8

Peirce provided the meaning in natural language this way:

Every being unless he worships some being who does not create all beings either does not believe any being (unless it be not a woman) to be any mother of a creator of all beings or else he praises that woman to every being unless to a person whom he does not think he can induce to become anything unless it be a non-praiser of that woman to every being.

It is on the level of semantics where the power of dependent quantification comes to the fore. Peirce carried the semantics out in terms of defining the basics of what today is recognized as two-player zero-sum semantic games. For Peirce these games take place between the Graphist/Utterer and the Grapheus/Interpreter. [Sometimes, especially in Peirce’s model-building games, these roles split so that the Grapheus and the Interpreter are playing separate roles, see Pietarinen 2013.] Peirce’s semantic games were not limited to EGs; he applied the same idea also to interpret quantificational expressions and connectives in his general algebra of logic.
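How such a game brings out dependent quantification can be sketched for the two sentences read off Figures 5 and 6. The code below is a simplified modern reconstruction, not Peirce's own formulation: the neutral role names "defender" and "opponent", the tuple encoding of formulas, and the sample model are all assumptions made for illustration. The defender of the assertion picks a witness for "some", the opponent picks a challenge for "every", and the sentence counts as true when the defender can always win.

```python
def defender_wins(formula, domain, env=None):
    """True when the defender has a winning strategy in the evaluation game."""
    env = {} if env is None else env
    kind = formula[0]
    if kind == "some":        # the defender chooses an individual
        _, var, body = formula
        return any(defender_wins(body, domain, {**env, var: d}) for d in domain)
    if kind == "every":       # the opponent chooses an individual
        _, var, body = formula
        return all(defender_wins(body, domain, {**env, var: d}) for d in domain)
    if kind == "atom":        # the play ends; check the chosen individuals
        return formula[1](env)
    raise ValueError(f"unknown formula kind: {kind}")


# Invented model: each Catholic adores a different woman.
CATHOLICS = {"c1", "c2"}
WOMEN = {"w1", "w2"}
ADORES = {("c1", "w1"), ("c2", "w2")}
DOMAIN = sorted(CATHOLICS | WOMEN)


def adores_a_woman(env):
    """If x is a Catholic, then y is a woman whom x adores."""
    x, y = env["x"], env["y"]
    return x not in CATHOLICS or (y in WOMEN and (x, y) in ADORES)


def woman_adored_by(env):
    """y is a woman, and if x is a Catholic then x adores y."""
    x, y = env["x"], env["y"]
    return y in WOMEN and (x not in CATHOLICS or (x, y) in ADORES)


EVERY_CATHOLIC_ADORES_SOME_WOMAN = ("every", "x", ("some", "y", ("atom", adores_a_woman)))
SOME_WOMAN_ADORED_BY_EVERY_CATHOLIC = ("some", "y", ("every", "x", ("atom", woman_adored_by)))

print(defender_wins(EVERY_CATHOLIC_ADORES_SOME_WOMAN, DOMAIN))
print(defender_wins(SOME_WOMAN_ADORED_BY_EVERY_CATHOLIC, DOMAIN))
```

In the first sentence the defender moves after the opponent and may tailor the choice of woman to the Catholic already selected; in the second the defender must commit first, which is why that sentence asserts more and fails in this model.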

It speaks to the superiority of EGs over algebraic systems that in them deduction, following Mitchell’s work, is reduced to a minimum number of permissive operations. Peirce termed these operations illative rules of transformation, and in effect they consist only of two: insertions (permissions to draw a graph-instance on the sheet of assertion) and erasures (permissions to erase a graph-instance from the sheet). More precisely, the oddly enclosed areas of graphs (areas within an odd number of enclosures) permit inserting any graph on that area, while evenly enclosed areas permit erasing any graph from that area. A copy of a graph-instance is permitted to be pasted on that same area or any area deeper within the same nest of enclosures (the rule of iteration), and a copy thus iterated is permitted to be erased (the converse rule of deiteration). An interpretational corollary is that the double enclosure with no intervening graphs in the middle area can be inserted and erased at will.
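The corollary about double enclosures can be illustrated for graphs built only from ovals and juxtaposition. The sketch below uses an assumed encoding that is not found in Peirce: an area is a list of items, a string is a propositional letter, and a pair ("cut", area) is an oval around an area. Juxtaposition is read as conjunction and an oval as denial of its contents.

```python
from itertools import product


def holds(area, valuation):
    """Everything scribed on an area is asserted together."""
    return all(item_holds(item, valuation) for item in area)


def item_holds(item, valuation):
    """A letter is looked up; an oval denies what it encloses."""
    if isinstance(item, str):
        return valuation[item]
    _, inner = item
    return not holds(inner, valuation)


def erase_double_cuts(area):
    """Remove every oval whose sole content is another oval."""
    result = []
    for item in area:
        if isinstance(item, str):
            result.append(item)
            continue
        _, inner = item
        inner = erase_double_cuts(inner)
        if len(inner) == 1 and not isinstance(inner[0], str):
            # Two ovals with an empty middle area: keep only the innermost contents.
            result.extend(inner[0][1])
        else:
            result.append(("cut", inner))
    return result


def same_meaning(area_a, area_b, letters):
    """Compare two areas under every valuation of the given letters."""
    for values in product((False, True), repeat=len(letters)):
        valuation = dict(zip(letters, values))
        if holds(area_a, valuation) != holds(area_b, valuation):
            return False
    return True


doubly_enclosed_q = [("cut", [("cut", ["q"])])]
print(erase_double_cuts(doubly_enclosed_q))
print(same_meaning(doubly_enclosed_q, ["q"], ["q"]))
```

A graph such as [("cut", ["p", ("cut", ["q"])])], which reads "if p then q", is left unchanged, since its outer oval contains the letter p in addition to the inner oval.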

A more detailed exposition of these illative rules of transformation would need to show their application to quantificational expressions, namely applying insertions and erasures to ligatures. A flavor of such proofs is given by inspecting Figures 1, 2 and 3: an application of a permissible erasure on the line of identity in Figure 1 yields the graph-instance in Figure 2, and another application of a permissible erasure on the upper part of the graph-instance in Figure 2 yields the graph-instance depicted in Figure 3. Thus what is represented in Figure 2 is a logical consequence of the graph-instance in Figure 1, and what is represented in Figure 3 is a logical consequence of the graph-instance given in Figure 2.

Roberts (1973) was the first to prove that these transformation rules, first given by Peirce in 1898, form a semantically complete system of deduction. Roberts did not mention, however, that Peirce had demonstrated their soundness in 1898 and again in 1903 and that he had argued for their completeness in terms of what he termed the “perfect archegetic rules of transformation” in the unpublished parts of the Syllabus for the Lowell Lectures that Peirce delivered in 1903.

The polarity of the outermost ends or portions of ligatures determines whether the quantification is existential (that end or portion resting on even/positive areas) or universal (if it rests on odd/negative area). Unlike in the Tarski-type semantics, but just as what happens in game-theoretic semantics, the preferred rule of interpretation of the graphs is what Peirce termed “endoporeutic”: one looks for the outermost portions of ligatures on the sheet of assertions first, assigns semantic values to that part, and then proceeds inwards into the areas enclosed with ovals. In non-modal contexts, ligatures are not well-formed graphs because they may cross the enclosures.

The diagrammatic nature of EGs consists in the iconic relationship between forms of relations exhibited in the diagrams and the real relations in the universe of discourse. Peirce was convinced that, since these graphical systems exploit a proper diagrammatic syntax, they—together with any of their extensions that would be introduced to cover modalities, non-declarative expressions, speech acts, and so forth—can express any assertion, however intricate. Guided by the precepts laid out by the diagrammatic forms of expression, and together with the simple illative permissions by which deductive inference proceeds, the conclusions from premises can be “read before one’s eyes”; these graphs present what Peirce believed is a “moving picture of the action of the mind in thought” (MS 298; LoF, p. 655; late 1906-1907).

If upon one lantern-slide there be shown the premisses of a theorem as expressed in these graphs and then upon other slides the successive results of the different transformations of those graphs; and if these slides in their proper order be successively exhibited, we should have in them a veritable moving picture of the mind in reasoning. (MS 905; LoF, p. 723; late 1907-1908)

The theory of EGs that uses only the notation of ovals and the spatial notion of juxtaposition of graphs is termed by Peirce the Alpha part of the EGs, and it corresponds to propositional logic. The extension of the alpha part with ligatures and rhemas (also termed spots by Peirce) gives rise to the Beta part, and it corresponds to fragments of first-order predicate calculus. What Peirce termed the Gamma part was a boutique of a number of developments, including various modalities such as metaphysical, epistemic and temporal modalities, as well as extensions of such graphs with ligatures. In Peirce’s writings, there are developments of graphical systems for higher-order logics and abstraction (Peirce’s “logic of potentials”), the logic of collections, and investigation of meta-logical expressions that use the language of graphs to talk about notions and properties of the graphs in that language (Peirce’s “graphs of graphs”). He mentions late in 1911 that the Delta part would also need to be added, most likely because of the ever-expanding systems that had been mushrooming in the Gamma part.

Peirce’s further contributions to deductive logic. While the development of the theory of the logic of existential graphs was his chief contribution, Peirce’s other contributions to the development of modern logic were numerous. In the Logic Notebook (1909) he defined a number of operations for three-valued logic and gave semantics for them in terms of defining truth-tables for such new connectives (Fisch & Turquette 1966). In these systems, which he called triadic logic, the third value is “the limit” between “true” and “not true,” and it applies to what Lane (1999) has identified as boundary-propositions: in Peirce’s terms, boundary-propositions have “a lower mode of being” which can “neither be determinately P, nor determinately not-P,” but are “at the limit between P and not P” (MS 399, p. 344r, 1909). Peirce defined several connectives to realize this idea in alternative ways, including four one-place connectives which were later reinvented as strong negation, two Post negations and the Tertium function, as well as six two-place connectives, including one that pertains to the logic of ordinary discourse.
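The four one-place connectives named above can be written out as lookup tables. The sketch below follows the standard modern definitions of strong negation, the two cyclic Post negations and the Tertium function; that these tables coincide in every detail with the ones in Peirce's notebook is an assumption, and the value names V, L and F as well as the function names are chosen here for illustration.

```python
V, L, F = "V", "L", "F"   # true, the limit, false
VALUES = (V, L, F)


def strong_negation(x):
    """Swap true and false; the limit value stays fixed."""
    return {V: F, L: L, F: V}[x]


def post_negation(x):
    """Cycle the values one step: V to L, L to F, F to V."""
    return {V: L, L: F, F: V}[x]


def post_negation_inverse(x):
    """Cycle the values one step in the opposite direction."""
    return {V: F, F: L, L: V}[x]


def tertium(x):
    """Send every value to the limit."""
    return L


for name, op in [("strong", strong_negation), ("post", post_negation),
                 ("post inverse", post_negation_inverse), ("tertium", tertium)]:
    print(name, [op(x) for x in VALUES])
```

Strong negation is its own inverse, whereas a Post negation returns to its starting value only after three applications, which is one way to see that the two kinds of negation are distinct operations.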

Generally, Peirce divided deduction in two: on the one hand, deduction is either necessary or probable (deductive reasoning about probabilities), and on the other hand, deduction is either corollarial or theorematic. Corollarial deduction is reasoning “where it is only necessary to imagine any case in which the premisses are true in order to perceive immediately that the conclusion holds in that case.” Theorematic deduction “is deduction in which it is necessary to experiment in the imagination upon the image of the premiss in order from the result of such experiment to make corollarial deductions to the truth of the conclusion” (MS L 75, 1902). He considered the theorematic/corollarial distinction his first real discovery in the philosophy of mathematics. Theorematic deductions can be of different kinds and degrees of complexity, and he took the classification of various types of theorematic deductions to be of the utmost value in the theory of logic (MS 617; MS 201; Peirce 1908). Stjernfelt (2014) proposes a new classification of theorematic inferences. Hintikka (1980) has argued that reasoning is theorematic if it increases the number of layers of quantifiers, and that an argument is the more theorematic the more new individuals are used in it (see also Ketner 1985; Zeman 1986; Hoffmann 2010).

Zooming into some of the details of Peirce’s systems of logic, including those of diagrammatic logics, one finds a treasury of developments the meaning of which is only beginning to unravel over a century later (Bellucci, Pietarinen & Stjernfelt 2014). In 1886, Peirce suggested in a letter to his former student Allan Marquand, who had designed mechanical logic machines for syllogistic reasoning, that “it is by no means hopeless to expect to make a machine for really difficult problems. But you would have to proceed step by step. I think electricity would be the best thing to rely on” (L 269, Peirce to Marquand, 30 December, 1886; W5, p. 422). He then showed how switching circuits can be connected serially and in parallel, noting that these two configurations correspond to multiplication (algebraic product as logical conjunction) and addition (algebraic sum as logical disjunction) in logic. In addition to the idea of real logical machines running on electricity, Peirce was also very interested in the philosophical question of whether living intelligence is required in performing deductive reasoning, an issue of continuing relevance to A.I. and to the prospects of automated theorem proving. In 1902 he developed two notational systems with sixteen binary connectives to map out all of the possible truth functions of the binary propositional calculus (Clark 1997; Zellweger 1997). According to Max Fisch, “No other logician compares with Peirce in attention to systems of notation and to sign-creation” (Fisch 1982, p. 132). Peirce’s work on these notational systems foresaw geometrical structures of logic, including spaces revealed by the study of the geometry of negation and other operators. Based on Peirce’s conceptual and sign-theoretic considerations, an apparatus for displaying and performing a complete set of the sixteen binary connectives in a two-valued propositional logic was patented in the U.S.A. in 1981 by Shea Zellweger.
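Both the switching-circuit observation and the count of sixteen connectives can be reproduced in a few lines. The sketch below is a modern illustration with hypothetical function names: a closed switch is modelled as True, a series connection conducts only when both switches are closed, a parallel connection conducts when at least one is closed, and the sixteen connectives are obtained by listing every possible output column for the four input rows.

```python
from itertools import product

INPUT_ROWS = list(product((False, True), repeat=2))  # the four input pairs


def series(a: bool, b: bool) -> bool:
    """Two switches in series conduct only if both are closed."""
    return a and b


def parallel(a: bool, b: bool) -> bool:
    """Two switches in parallel conduct if at least one is closed."""
    return a or b


def all_binary_connectives():
    """Return every truth function of two arguments as a table."""
    return [dict(zip(INPUT_ROWS, outputs))
            for outputs in product((False, True), repeat=len(INPUT_ROWS))]


def table_of(connective):
    """Tabulate a two-place Boolean function over the four input rows."""
    return {row: connective(*row) for row in INPUT_ROWS}


connectives = all_binary_connectives()
print(len(connectives))
print(table_of(series) in connectives, table_of(parallel) in connectives)
```

Each of the four input rows can independently receive one of two outputs, which gives 2 to the power 4, that is sixteen, distinct tables; the series and parallel circuits realize two of them.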

Peirce also worked on early forms of topology (Havenel 2010), including studies on what might be recognized as rudimentary versions of homologies and knots, in his attempts to find pathways not only to logical issues but also to questions in philosophy of mathematics (Murphey 1961; Moore 2010).

Moreover, his diagrammatic systems of modal logic included suggestions for defining several types of multi-modal logics in terms of tinctures of areas of graphs. Tinctures enable logic to assert, among others, modalities including necessities and metaphysical possibilities, and so call for changes in the nature of how the corresponding logics behave, including the identification of individuals at the presence of multiple universes of discourses. He defined epistemic operators in terms of subjective possibilities, which, just as in contemporary epistemic logic, are epistemic possibilities defined as duals of knowledge operators. He analyzed the meaning of identities between actual and possible objects in quantified multi-modal logics. As an example, the two graphs given in Figures 9 and 10 that he presented in a 1906 draft of the Prolegomena paper (MS 292) illustrate the nature of the interplay between epistemic modalities and quantification.

Fig. 9

Fig. 10

The graph in Figure 9 is read “There is a man who is loved by one woman and loves a woman known by the Graphist to be another.” The reason is this. In the equivalent graph depicted in Figure 10 the woman who loves is denoted by the name A, and the woman who is loved is denoted by the name B. The shaded area is a tincture that refers to the modality of subjective possibility. Thus the graph in Figure 10 means that it is subjectively impossible, by which Peirce means that “it is contrary to what is known by the Graphist” (= the modeller of the graph), that A should be B. In other words, the woman who loves and the woman who is loved (whom the graph does not assert to be otherwise known to the Graphist) are known by the Graphist not to be the same person.
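As a rough gloss in present-day notation, the reading of the two graphs might be written as follows. The symbols are assumptions introduced here and are not Peirce’s own: K_G stands for “the Graphist knows that”, ◇_G for subjective possibility, and the predicate names are chosen for illustration. The first line states the duality of subjective possibility and knowledge, the second the content of the shaded area in Figure 10, and the third the reading of Figure 9.

```latex
\begin{gather*}
\Diamond_G\,\varphi \;\equiv\; \neg K_G \neg \varphi \\
\neg \Diamond_G (A = B) \;\equiv\; K_G (A \neq B) \\
\exists x\,\exists A\,\exists B\,\bigl(\mathrm{Man}(x) \wedge \mathrm{Woman}(A) \wedge \mathrm{Woman}(B)
  \wedge \mathrm{Loves}(A, x) \wedge \mathrm{Loves}(x, B) \wedge K_G(A \neq B)\bigr)
\end{gather*}
```

In the third line the quantifiers bind variables that occur inside the scope of the knowledge operator, which is the kind of interplay between quantification and epistemic modality that the graphs are meant to display.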

Peirce’s work highlights the philosophical significance of ideas that were rediscovered later and largely after the mid-twentieth century, though often in different clothes: in Peirce’s largely unpublished works one finds him addressing such topics as multi-modal logics and possible-worlds semantics, quantification into modal contexts, cross-world identities (in MS 490 he termed these special relations connecting objects in different possible worlds; for references, see Pietarinen 2005), cumulative and branching quantifiers (the latter being related to independence-friendly logic, see Pietarinen 2015b), as well as what later on became known as “Peirce’s Puzzle” (Dekker 2001; Hintikka 2011; Pietarinen 2015b), namely the question of the meaning of indefinites in conditional sentences, which Peirce himself analyzed in quantified modal extensions of EGs.

Thus, far from merely anticipating later discoveries, Peirce’s logic in general places what later came to be explored in the fields of philosophical logic, formal semantics and pragmatics, philosophy of logic, mathematics, mind and language, cognitive and computing sciences, and history and philosophy of science within a systematic logico-semeiotic perspective. From time to time, his ideas even surpass stagnant contemporary discussions, especially in the philosophy of logic and mathematics. [See for example Bellucci, Pietarinen & Stjernfelt 2014; Lupher & Adajian 2015; Sowa 2006; Zalamea 2012a; Zalamea 2012b; PM. For further details on Peirce’s deductive logic, see the collection of Houser and others, eds. 1997. Hilpinen 2004 provides a useful overview.]

iv. Inductive Logic

In 1865 (W1, pp. 263-64) Peirce defines induction as inference from Case and Result to Rule. Its general form is:

Case: M₁, M₂, M₃, M₄ are S
Result: M₁, M₂, M₃, M₄ are P
Rule: Therefore, all S are P.

A certain number of objects (M₁, M₂, M₃, M₄), known to belong to a certain class (S), possess a certain character (P); therefore, it can be inferred inductively that the whole class S possesses that character. I notice that neat, swine, sheep, and deer, which I know are cloven-hoofed, are herbivores. Therefore, I infer inductively that all cloven-hoofed animals are herbivores.
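The inference from Case and Result to Rule can be sketched as a simple check over a sample. This is a minimal illustration, not Peirce’s formulation: the function name `induce_rule` and the placeholder object `"m5"` are assumptions introduced here, and the animal data merely restate the example above.

```python
def induce_rule(sample, in_class, has_character):
    """Return True if the sample licenses the (fallible) rule 'all S are P'.

    Case:   every sampled object belongs to the class S.
    Result: every sampled object has the character P.
    """
    case = all(in_class(m) for m in sample)
    result = all(has_character(m) for m in sample)
    return bool(sample) and case and result

# Data restating the example in the text.
cloven_hoofed = {"neat", "swine", "sheep", "deer"}
herbivores = {"neat", "swine", "sheep", "deer"}
sample = ["neat", "swine", "sheep", "deer"]

rule_holds = induce_rule(sample, cloven_hoofed.__contains__, herbivores.__contains__)
print(rule_holds)  # True

# A hypothetical further object "m5" that is S but not P withdraws the rule.
cloven_hoofed.add("m5")
revised = induce_rule(sample + ["m5"], cloven_hoofed.__contains__, herbivores.__contains__)
print(revised)  # False
```

The second call shows that the conclusion is provisional: a single counter-instance in an enlarged sample is enough to retract the rule.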

Later, Peirce came to divide induction into three principal kinds. Crude induction is the lowest form of induction, based upon the common practice of generalizing about future events on the ground of previous experience. For example, “No instance of a genuine power of clairvoyance has ever been established: So I presume there is no such thing”; “cancer is incurable, because every known case has proved to be so.” Its general form is “All observed As are B. Therefore, all As are B.” It is the weakest form of inductive reasoning in terms of security. Qualitative induction is the intermediate kind in terms of security. It is what Peirce had earlier called hypothetical reasoning or abduction. It consists in testing a hypothesis by sampling the possible predications that may be made on the basis of it (CP 7.216). Qualitative induction is thus reasoning that tests hypotheses already formulated. It should not be confused with abduction proper, which is reasoning that originates new hypotheses. Quantitative induction is the highest form of induction in terms of security. It investigates the real probability that a member of a certain class will have a certain character. Its procedure consists in finding a representative sample of the class and noting the proportion of its members that possess the character P. Then, the inference is drawn that the proportion holds for the whole class. Its logical form is:

S₁, S₂, S₃, S₄, and so forth, are taken at random from the Ms.
The proportion p of S₁, S₂, S₃, S₄, and so forth, is P.
Hence, probably and approximately, the same proportion p of the Ms are P.

The inversion of a quantitative induction gives us a statistical deduction, whose form is

The proportion p of the Ms are P.
S₁, S₂, S₃, S₄, and so forth, are taken at random from the Ms.
Hence, the proportion p of them is P.
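The two forms can be put side by side in a short simulation. This is a sketch under stated assumptions: the population is made-up data in which 30% of the Ms are P, the function names are hypothetical, and the fixed seed serves only to make the run repeatable.

```python
import random

def quantitative_induction(population, has_p, sample_size, rng):
    """From a random sample of the Ms, estimate the proportion of Ms that are P."""
    sample = rng.sample(population, sample_size)
    return sum(1 for s in sample if has_p(s)) / sample_size

def statistical_deduction(p, sample_size):
    """Given the proportion p of the Ms that are P, the expected number of P's
    in a random sample of the given size."""
    return p * sample_size

rng = random.Random(0)                        # fixed seed (assumption, for repeatability)
population = [i < 300 for i in range(1000)]   # made-up data: 300 of 1000 Ms are P
estimate = quantitative_induction(population, lambda m: m, 200, rng)
expected_hits = statistical_deduction(0.3, 200)

print(estimate)        # an approximation to 0.3, not an exact recovery
print(expected_hits)   # about 60 of the 200 sampled Ms
```

The induction runs from the sample to the class and yields only an approximate proportion; the deduction runs from the class to the sample and states what proportion to expect there.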

Although crude, qualitative, and quantitative induction are different in kind, their justification is, according to Peirce, the same:

The validity of Induction consists in the fact it proceeds according to a method which though it may give provisional results that are incorrect will yet if steadily pursued, eventually correct any such error. […] all Induction possesses this kind of validity, and […] no Induction possesses any other kind that is more than a further determination of this kind. (MS 293, 1907)

The validity rests upon induction being self-corrective: in the long run induction is bound to lead us ever closer to the correct representation of reality. Its validity is therefore linked to esse in futuro, to the possibility of self-correction of the very method itself. Any actual induction that is performed may well be wrong or partly wrong, but it remains valid because its leading principle is valid, that is, is conducive to truth in the long run.
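The idea of self-correction in the long run can be illustrated, though not proved, by tracking how a sampled proportion behaves as draws accumulate. The sketch assumes a simulated setting with a fixed true proportion (`TRUE_P`, a made-up value) and independent random draws; these are assumptions of the illustration, not premises of Peirce’s argument.

```python
import random

def running_estimates(true_p, checkpoints, rng):
    """Track the observed proportion of P's as random draws accumulate."""
    hits = 0
    estimates = {}
    for n in range(1, max(checkpoints) + 1):
        if rng.random() < true_p:   # one more randomly drawn M turns out to be P
            hits += 1
        if n in checkpoints:
            estimates[n] = hits / n
    return estimates

rng = random.Random(1)   # fixed seed (assumption, for repeatability)
TRUE_P = 0.3             # made-up value standing in for the real proportion
estimates = running_estimates(TRUE_P, {10, 100, 1000, 10000}, rng)

# Print the absolute error of the provisional estimate at each checkpoint.
for n in sorted(estimates):
    print(n, round(abs(estimates[n] - TRUE_P), 4))
```

Any single provisional estimate may be off, but continuing the same method tends to shrink the error, which is the sense in which the method, rather than any one result, carries the validity.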

Peirce’s polemical target was a theory that would make the validity of induction rest upon some principles of uniformity or regularity in nature. According to Peirce, that was how John S. Mill and Philodemus of Gadara (ca. 110–ca. 30 B.C.E.) attempted, unsoundly, to justify induction. Of the several objections that Peirce raised from time to time against this way of justifying induction, one is worth reporting. Mill argues that a universe without any regularity is imaginable, and that in that universe inductions would be invalid. But the absence of uniformity, that is, the absence among certain objects S of the character P, is itself a uniformity. No universe is imaginable in which induction is not valid. According to Peirce, “even if nature were not uniform, induction would be sure to find it out, so long as inductive reasoning could be performed at all” (CP 2.775).

Cheng 1969, Goudge 1946, Merrill 1975 and Forster 1989 provide further details on Peirce’s inductive logic.

c. Methodeutic

The third branch of Peirce’s logic is methodeutic, which he also called speculative rhetoric. He defined it to be “the study of the proper way of arranging and conducting an inquiry” (MS 606, p. 17), depicting it as being “not so exact in its conclusions as is critical logic” (MS L 75, 1902) and as involving “certain psychological principles” (MS 633, 1909). But it nevertheless is a theoretical study and not an art. Methodeutic is based upon critics, and considers not what is admissible (logical validity) but what is advantageous (logical economy). It is a “theoretical study of advantages” (MS L 75, 1902).

Abduction is of special interest to methodeutic, because abduction is the only mode of inference that can initiate a scientific hypothesis. But being justifiable is not a sufficient property of good hypotheses:

Any hypothesis which explains the facts is justified critically. But among justifiable hypotheses we have to select that one which is suitable for being tested by experiment. (MS L 75, 1902)

Among critically equivalent hypotheses (that is, hypotheses that explain the facts), one should be able to select for testing those that are capable of experimental verification. [Being capable of experimental verification is to be conceived, in Peirce’s philosophy of science, in a wide sense that includes mental experimentation and imaginative activities in our thoughts (Pietarinen & Bellucci 2015b). It is not the same thing as the empirical verification criterion of the positivists, which Peirce often criticized.] This is the core of Peirce’s philosophy of pragmati(ci)sm, which teaches that the whole meaning of a hypothesis is in its conceivable practical (that is, experienceable) effects; pragmaticism therefore is “nothing else than the question of the logic of abduction” (CP 5.196, 1903).

In turn, among “pragmatistically” equivalent hypotheses (that is, hypotheses that are capable of experimental verification) one should select those that in the sense of Peirce’s economy of research are the cheapest ones. His argument for the economic character of methodeutic is roughly as follows: the logical validity of abduction presupposes that nature be in principle explainable. This means that to discover is simply to expedite an event that would sooner or later occur. Therefore, the real service of a logic of abduction is of the nature of an economy. Economy itself depends on three factors: cost (of money, time, energy, thought), the value of the hypothesis itself, and its effects upon other projects and hypotheses (MS L 75, 1902; MS 690, CP 7.164-231, 1901).
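The successive selections described above (hypotheses that explain the facts, then those capable of experimental verification, then the most economical) can be sketched as a filter followed by a ranking. Everything specific here is an assumption made for illustration: the `Hypothesis` fields, the numeric units, and above all the scoring rule `value + spillover - cost`, which is a hypothetical stand-in and not a formula given by Peirce.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    name: str
    explains_facts: bool   # critically justified
    testable: bool         # capable of experimental verification (in the wide sense)
    cost: float            # money, time, energy, thought (arbitrary units)
    value: float           # worth of the hypothesis itself (arbitrary units)
    spillover: float       # effect upon other projects and hypotheses

def select_for_testing(candidates):
    """Keep hypotheses that explain the facts and are testable, then rank them
    by a hypothetical net-benefit score (an assumption of this sketch)."""
    admissible = [h for h in candidates if h.explains_facts and h.testable]
    return sorted(admissible,
                  key=lambda h: h.value + h.spillover - h.cost,
                  reverse=True)

# Made-up candidates for illustration only.
candidates = [
    Hypothesis("h1", True, True, cost=5.0, value=8.0, spillover=1.0),
    Hypothesis("h2", True, False, cost=1.0, value=9.0, spillover=3.0),
    Hypothesis("h3", True, True, cost=1.0, value=6.0, spillover=2.0),
    Hypothesis("h4", False, True, cost=0.5, value=2.0, spillover=0.0),
]
ranked = [h.name for h in select_for_testing(candidates)]
print(ranked)  # ['h3', 'h1']
```

The filter corresponds to the critical and pragmatistic requirements, while the ranking stands in for the economic consideration; how the three factors should actually be weighed is left open by the text.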

Although primarily concerned with abduction, methodeutic also has an interest in deduction and induction. Theorematic deductions (see §2.b.iii) manifest peculiar logical steps that are abductive rather than deductive. In order to overcome the lack of critical instruments for the investigation of those steps, Peirce emphasizes the need to have an inventory and logical classification of valuable steps in the history of mathematics which would become part of a methodeutic of necessary reasoning (Peirce 1908, MS 200-201).

Peirce also considered the study of the properties of different logical and mathematical notations and symbolisms as belonging to the department of methodeutic. In this respect, he coined the maxim of the ethics of terminology and of notation:

The person who introduces a conception into science has both the right and the duty of prescribing a terminology and a notation for it; and his terminology and notation should be followed except so far as it may prove positively and seriously disadvantageous to the progress of science. If a slight modification is sufficient to remove the objection, a much greater one should be avoided. (MS 530, 1902)

Induction too has its methodological side. The methods of the three classes of inductions are all based on “samples,” and they all presuppose that the samples are representative of the class from which they are sampled: methodeutic should therefore teach methods of producing fair samplings. His own experimental work is exemplary in that it develops new statistical methods to ascertain that truly randomized samples are achieved and that fully blinded testing conditions are secured. He emphasizes the method of predesignation, which prescribes that the characters with respect to which the class is sampled are to be chosen beforehand, so that the sampler is not influenced by any agreement among the members of the sample (see Goudge 1946).

Other things Peirce considers to pertain to methodeutic include the principles of definition, the methods of classification in general, and the doctrine of the clearness of ideas.

Peirce’s logic, conceived as semeiotic, characterizes a broad philosophical, methodological and scientific area of investigation. Although the present article has exposed a number of developments in Peirce’s studies in deductive logic, the deductive part is only a fraction of the wider project of semeiotic, the theory and philosophy of signs, and the logic of science. From a contemporary perspective, deductive, formal, and mathematical logic may have become the mainstay of logic as such, but for Peirce other areas of logic, such as speculative grammar and the critics and methodeutic of abduction and induction, are at least as important as the deductive part of logic.

3. Peirce’s Logic in Historical Perspective

Peirce’s algebraic work in formal logic influenced Ernst Schröder (1841–1902), who drew heavily upon Peirce’s work in the three volumes of his Vorlesungen über die Algebra der Logik (Schröder 1890–1905). Peirce also successfully initiated a school in logic during his Johns Hopkins period (1879–1884), whose most evident manifestation is the richness and originality of the papers contained in the Studies in Logic (Peirce 1883). For the main part of his career, Peirce had been in contact and correspondence with the most prominent logicians, mathematicians and scientists of the time, and his works appeared in leading scientific journals and proceedings.

All these facts notwithstanding, the reception of Peirce’s deductive logic has been strangely erratic, even in the early days. Especially in his later period (1892–1914), Peirce worked virtually alone in an adverse environment and without much intellectual and material support. It is true that the recognition of his contributions has suffered from a long-term unavailability of his vast Nachlass of over 100,000 surviving pages of manuscripts and correspondence. In some cases at least, the explanation may be found in the unprecedented technical and mathematical standard and rigor characterizing his work. But what is certainly a chief reason behind the general neglect of Peirce’s logic is the rise, at the end of the 19th century, of what has later been named the Frege-Russell tradition in logic.

The historiography of logic seems to have accepted the idea, initially promoted by Bertrand Russell and subsequently canonized by historian of logic Jean van Heijenoort (1912–1986), of a “Fregean revolution” in logic. In this narrative, modern mathematical logic (also deceptively called symbolic logic) has replaced traditional or Aristotelian logic. According to such a picture, the work of the “algebraists” (including Boole, De Morgan, Peirce and Schröder) belongs to the pre-Fregean logical paradigm.

Anellis (2012b) identified seven features of such a “Fregean” revolution: (1) A propositional calculus with a truth-functional definition of connectives, especially the conditional. (2) Decomposition of propositions into function and argument instead of into subject and predicate. (3) A quantification theory, based on a system of axioms and inference rules. (4) Definitions of infinite sequence and natural number in terms of logical notions (that is, the logicization of mathematics). (5) Presentation and clarification of the concept of a formal system. (6) Relevance and use of logic for philosophical investigations (especially for philosophy of language). (7) Separating singular propositions, such as “Socrates is mortal,” from universal propositions such as “All Greeks are mortal.” All these characteristics, Anellis argued, can be found in Peirce’s work, which therefore falls within the parameters of van Heijenoort’s conception of the Fregean revolution and the definition of mathematical logic. One also needs to remember that there are many characteristics of Peirce’s logic and philosophy of logic, vitally important to his logical vision, that either add to, modify or reject those that have been taken to typify the Fregean tradition. What may be ill-named as a Fregean revolution is found in a different, and perhaps more penetrating and consequential, shape in Peirce’s work.

Peirce and Frege discovered quantificational theory around the same time (1879–1883). Frege’s work was at the time largely ignored. Russell credited Frege a posteriori with having founded modern logic in the Begriffsschrift (Frege 1879). However, while Frege’s notation was hardly ever used, the Peirce-Schröder notation was largely adopted by others. The important results of Löwenheim and Skolem at the beginning of the 20th century were presented in the Peirce-Schröder system without any trace of influence by Frege or Russell. Peano’s use of the existential and universal quantifiers derives from Schröder and Peirce, not from Frege. Unlike Frege, Peirce recognized the utmost importance of dependent quantifiers and experimented with that idea in various ways in the algebra of logic and in existential graphs, and proposed new systems and dimensions of quantification that involve independent quantification (MS 430). Peirce’s overall influence upon the development of modern logic was considerable, though its nature and scope remained ill-understood for a long time (Putnam 1982; Dipert 1995; Pietarinen 2015a).

Peirce’s philosophy of logic had no better fate. Aside from Josiah Royce and especially Lady Victoria Welby, with whom Peirce corresponded on the logic of signs and semiotics during 1903-1910, Peirce’s radical idea of “logic as semeiotic” passed largely unnoticed. In the 1930s Charles Morris took, misleadingly, Peirce’s trivium of speculative grammar, critics and methodeutic to correspond to the division of the study of language into syntax, semantics, and pragmatics (Morris 1938, pp. 21-22). Carnap (1942) adopted Morris’ trichotomy and made it popular. Peirce’s philosophy of signs has since been studied by semioticians, led by the pioneering explorations by Roman Jakobson and Umberto Eco (see Eco 1975; Jakobson 1977; Eco 1984). Other aspects of Peirce’s philosophy of logic, such as the distinction between corollarial and theorematic deduction, his ideas on diagrammatic reasoning, and the evolution of new logical notations and meanings, are gaining the interest not only of logicians and historians of logic but also of philosophers of science, cognitive scientists, and many scholars, scientists, artists and practitioners looking for ways to overcome the boundaries of narrow conceptions of logic, reasoning and representation, as well as the outdated 20th-century scientific methodologies that have characterized their respective fields. [See for example the 2014 Peirce Centennial Conference at Lowell as well as the Applying Peirce Conference series at Helsinki in 2007 and 2014, which have brought together scholars and scientists interested in Peirce’s thought in virtually any field of science.]

From the perspectives of the history and philosophy of modern logic, it may not be entirely right to talk in strict terms about the two traditions in logic, namely the algebraic and the symbolic. On the one hand, Peirce’s line of work in the algebra of logic led to the invention of a spectrum of methods in the semantic and model-theoretic tradition, while the logic that Schröder, for example, preferred was to quantify over the entire universe and was thus at bottom a universalist one, thereby sharing the same preference as Frege. On the other hand, Peirce’s continuous search for new notations for the purposes of logical analysis and representation made what others may have considered to be the subject of symbolic notations really the subject of diagrams and icons. Algebraic notations were for Peirce iconic and often even very graphically so. What mattered to him was to remain clear about the significations of logical signs. Logical signs were to be interpreted in proper contexts and according to the purposes of investigation at hand. Thus, Peirce’s philosophy of logic stands in stark contrast to purely formal, mathematical and proof-theoretic approaches to logic, which do not care so much for signification. Peirce should accordingly be counted in the pragmatic, rather than just the semantic, tradition in philosophy of logic and language (Pietarinen 2006; Tiercelin 1991).

The famous van Heijenoort–Hintikka distinction between “logic as calculus” and “logic as a universal medium” is nonetheless instructive here (van Heijenoort 1967; Hintikka 1997; Peckhaus 2004). According to the former view of logic as calculus, methods and languages are many, they are reinterpretable according to the context and purposes at hand, and they admit of many and varying universes as well as modal and intensional considerations. The latter, universalist position means, in contrast, that there is one logic to “rule them all,” and so our thought is bounded by what that logic can express. Peirce fits squarely into the former camp. Here again it is not that all who worked on the algebra of logic would be members of that same camp (Schröder is a counterexample), or that all of those who in the literature have been tagged as formalists would share the universalist presuppositions (David Hilbert may serve as another kind of a counterexample). It may be one of the lessons of Peirce’s pragmaticism and the methodological pluralism which he exercised in his logic that one does not fix in advance what may in the future be considered to fall within the scope of logic.

4. References and Further Reading

Peirce’s works
1867. An Improvement in Boole’s Calculus of Logic. Proceedings of the American Academy of Arts and Sciences 7, pp. 249-261.
1870. Description of a Notation for the Logic of Relatives. Memoirs of the American Academy of Arts and Sciences 9, pp. 317-378.
1880a. On the Algebra of Logic. American Journal of Mathematics 3, pp. 15–57.
1881. On the Logic of Number. American Journal of Mathematics 4, pp. 85-95.
1883 (ed.). Studies in Logic by Members of the Johns Hopkins University. Boston: Little, Brown, and Co.
1885. On the Algebra of Logic. A Contribution to the Philosophy of Notation. American Journal of Mathematics 7, pp. 180–202.
1897. The Logic of Relatives. The Monist 7, pp. 161–217.
1901-1902. Entries in Dictionary of Philosophy and Psychology, 3 vols, edited by Baldwin, James Mark. Cited as DPP followed by volume and page number.
1906. Prolegomena to an Apology for Pragmaticism. The Monist 16, pp. 492–546.
1908. Some Amazing Mazes. The Monist 18 (3), pp. 416-464.
1931–1966. The Collected Papers of Charles S. Peirce, 8 vols., ed. by Hartshorne, C, Weiss, P. and Burks, A. W. Cambridge: Harvard University Press. Cited as CP followed by volume and paragraph number.
1967. Manuscripts in the Houghton Library of Harvard University, as identified by Richard Robin, “Annotated Catalogue of the Papers of Charles S. Peirce,” Amherst: University of Massachusetts Press, 1967, and in “The Peirce Papers: A supplementary catalogue,” Transactions of the C. S. Peirce Society 7 (1971): 37–57. Cited as MS followed by manuscript number and, when available, page number.
1976. The New Elements of Mathematics by Charles S. Peirce, 4 vols., ed. by Eisele, C. The Hague: Mouton. Cited as NEM followed by volume and page number.
1982 – …. Writings of Charles S. Peirce: A Chronological Edition, 7 vols., ed. by Moore, E. C., Kloesel, C. J. W. et al. Bloomington: Indiana University Press. Cited as W followed by volume and page number.
2010. Philosophy of Mathematics: Selected Writings, ed. by M. E. Moore, Bloomington and Indianapolis, IN: Indiana University Press. Cited as PM.
2015. Logic of the Future. Peirce’s Writings on Existential Graphs, ed. by A.-V. Pietarinen, Bloomington: Indiana University Press. Cited as LoF.
Other works
Anellis, I. 2012a. Peirce’s Truth-Functional Analysis and the Origin of the Truth Table. History and Philosophy of Logic 33, pp. 37–41.
Anellis, I. 2012b. How Peircean was the ‘Fregean’ Revolution in Logic? arXiv:1201.0353.
Badesa, C. 2004. The Birth of Model Theory: Löwenheim’s Theorem in the Frame of the Theory of Relatives, Princeton: Princeton University Press.
Bellucci, F. & Pietarinen, A.-V. 2014. New Light on Peirce’s Concept of Retroduction and Scientific Reasoning. International Studies in the Philosophy of Science 28(2), pp. 1-21.
Bellucci, F. & Pietarinen, A.-V. 2015. Existential Graphs as an Instrument of Logical Analysis. Part I: Alpha, to appear.
Bellucci, F., Pietarinen, A.-V. & Stjernfelt, F. eds. 2014. Peirce: 5 Questions. VIP/Automatic Press.
Boole, G. 1847. The Mathematical Analysis of Logic. Cambridge: Macmillan, Barclay, & Macmillan.
Boole, G. 1854. An Investigation of the Laws of Thought. Cambridge: Walton & Maberly.
Brady, G. 2000. From Peirce to Skolem. Amsterdam: Elsevier Science.
Brent, B. 1987. Charles S. Peirce. Logic and the Classification of the Sciences, Kingston/Montreal: McGill-Queen’s University Press.
Burch, R. W. 2011. Peirce’s 10, 28, and 66 Sign-Types: The Simplest Mathematics. Semiotica 184, pp. 93–98.
Burks, A. W. 1946. Peirce’s Theory of Abduction. Philosophy of Science 13, pp. 301-306.
Carnap, R. 1942. Introduction to Semantics, Cambridge, Mass: MIT Press.
Chauviré, Ch. 1994. Logique et Grammaire Pure. Propositions, Sujets et Prédicats Chez Peirce. Histoire Epistémologie Langage 16, pp. 137–175.
Cheng, C.-Y. 1969. Peirce’s and Lewis’s Theories of Induction, The Hague: Martinus Nijhoff.
Clark, G. 1997. New Light on Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser and others 1997, pp. 304-333.
Dedekind, R. 1888. Was sind und was sollen die Zahlen. Braunschweig: Vieweg.
Dekker, Paul 2001. Dynamics and Pragmatics of ‘Peirce’s Puzzle’, Journal of Semantics 18, pp. 211-241.
De Morgan, A. 1847. Formal Logic. London: Taylor and Walton.
De Morgan, A. 1860. On the Syllogism IV; and on the Logic of Relations. Transactions of the Cambridge Philosophical Society 10, pp. 331-358.
Dipert, R. 1995. Peirce’s Underestimated Place in the History of Logic: A Response to Quine. In Ketner, K. L. ed. Peirce and Contemporary Thought. New York: Fordham University Press, pp. 32-58.
Dipert, R. 2006. Peirce’s Deductive Logic: Its Development, Influence, and Philosophical Significance. In: Misak, C. (ed.). The Cambridge Companion to Peirce. Cambridge: Cambridge University Press, pp. 287-324.
Eco, U. 1975. Trattato di semiotica generale. Milano: Bompiani.
Eco, U. 1984. Semiotica e filosofia del linguaggio. Torino: Einaudi.
Fann, K. T. 1970. Peirce’s Theory of Abduction. The Hague: Martinus Nijhoff.
Ferriani, M. 1987. Peirce’s Analysis of the Proposition: Grammatical and Logical Aspects. In Ferriani, M. & Buzzetti, D. (eds.), Speculative grammar, universal grammar and philosophical analysis of language. Amsterdam: Benjamins, pp. 149-172.
Fisch, M. H. 1982. The Range of Peirce’s Relevance, The Monist 65, pp. 123-141. Reprinted in Fisch 1986, pp. 422-448.
Fisch, M. H. 1986. Peirce, Semeiotic and Pragmatism. Ed. by K. L. Ketner and C. J. W. Kloesel, Bloomington: Indiana University Press.
Fisch, M. H. & Turquette, A. 1966. Peirce’s Triadic Logic. Transactions of the Charles S. Peirce Society 2, pp. 71-85.
Forster, P. 1989. Peirce on the Progress and Authority of Science. Transactions of the Charles S. Peirce Society 25, pp. 421–452.
Frege, G. 1879. Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Louis Nebert.
Goudge, T. 1946. Peirce’s Treatment of Induction. Philosophy of Science 7, pp. 56-68.
Haack, S. 1993. Peirce and Logicism: Notes Towards an Exposition. Transactions of the Charles S. Peirce Society 29, pp. 33–56.
Havenel, J. 2010. Peirce’s Topological Concepts. In Moore 2010, pp. 283-322.
Hilpinen, R. 1982. On C. S. Peirce’s Theory of the Proposition: Peirce as a Precursor of Game-Theoretical Semantics. The Monist 65, pp. 182-188.
Hilpinen, R. 1992. On Peirce’s Philosophical Logic: Propositions and Their Objects. Transactions of the Charles S. Peirce Society 28, pp. 467–488.
Hilpinen, R. 2004. Peirce’s Logic. In Gabbay, D. M. & Woods, J. eds. Handbook of the History of Logic. Vol. 3: The Rise of Modern Logic From Leibniz to Frege. Amsterdam: Elsevier North-Holland, pp. 611-658.
Hintikka, J. 1980. C. S. Peirce’s ‘First Real Discovery’ and Its Contemporary Relevance. Monist 63, pp. 304-315.
Hintikka, J. 1996. The Place of C. S. Peirce in the History of Logical Theory. In Brunning, J. & Forster, P. eds. The Rule of Reason: The Philosophy of Charles Sanders Peirce, Toronto: University of Toronto Press, pp. 13–33.
Hintikka, J. 1997. Lingua Universalis vs. Calculus Ratiocinator: An Ultimate Presupposition of Twentieth Century Philosophy. Dordrecht: Kluwer.
Hintikka, J. 2011. What the bald man can tell us. In: Biletzky, A. (ed.) Hues of Philosophy: Essays in Memory of Ruth Manor. London: College Publications.
Hoffmann, M. 2010. Theoric Transformations. Transactions of the Charles S. Peirce Society 46, pp. 570–590.
Houser, N. 1993. On ‘Peirce and Logicism’: A Response to Haack. Transactions of the Charles S. Peirce Society 29, pp. 57–67.
Houser, N., Roberts, D., Van Evra, J. eds. 1997. Studies in the Logic of Charles S. Peirce. Bloomington and Indianapolis: Indiana University Press.
Jakobson, R. 1977. A Few Remarks on Peirce, Pathfinder in the Science of Language. MLN 92, pp. 1026–1032.
Kapitan, T. 1992. Peirce and the Autonomy of Abductive Reasoning. Erkenntnis 37, pp. 1–26.
Kapitan, T. 1997. Peirce and the Structure of Abductive Inference. In Houser and others eds. 1997, pp. 477-496.
Kempe, A. B. 1886. A Memoir on the Theory of Mathematical Form. Philosophical Transactions of the Royal Society of London 177, pp. 1-70.
Ketner, K. L. 1985. How Hintikka Misunderstood Peirce’s Account of Theorematic Reasoning. Transactions of the Charles S. Peirce Society 21, pp. 407–418.
Lane, R. 1999. Peirce’s Triadic Logic Revisited. Transactions of the Charles S. Peirce Society 35, pp. 284–311.
Lewis, C. I. 1918. A Survey of Symbolic Logic. Berkeley: University of California Press.
Lupher, T. and Adajian, T. eds. 2015. Philosophy of Logic: 5 Questions. Copenhagen: Automatic Press.
Ma, Minghui & Pietarinen, A.-V. 2015. A dynamic approach to Peirce’s interrogative construal of abductive reasoning, IFCoLog Journal of Logics and their Applications, in press.
Mannoury, G. 1909. Methodologisches und Philosophisches zur Elementar-Mathematik. Haarlem: P. Visser.
Merrill, D. D. 1997. Relations and Quantification in Peirce’s Logic, 1870-1885. In Houser and others eds. 1997, pp. 158-172.
Merrill, G. H. 1975. Peirce on Probability and Induction. Transactions of the Charles S. Peirce Society 11, pp. 90–109.
Moore, M. ed. 2010. New Essays on Peirce’s Mathematical Philosophy. Chicago: Open Court.
Morris, C. W. 1938. Foundations of the Theory of Signs. In Morris, C. 1971. Writings on the General Theory of Signs. The Hague: Mouton.
Murphey, M. G. 1961. The Development of Peirce’s Philosophy, Cambridge, Mass: Harvard University Press, 2nd ed. 1993, Indianapolis: Hackett.
Paavola, S. 2004. Abduction as a Logic and Methodology of Discovery: The Importance of Strategies. Foundations of Science 9, pp. 267–283.
Peano, G. 1889. Arithmetices Principia. Nova Methodo Exposita. Torino: Bocca.
Peckhaus, V. 2004. Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited. History and Philosophy of Logic 25, pp. 3-14.
Peirce, B. 1870. Linear Associative Algebra. Lithographed ed., Washington D.C.
Pietarinen, A.-V. 2005. Compositionality, Relevance and Peirce’s Logic of Existential Graphs. Axiomathes 15, pp. 513-540.
Pietarinen, A.-V. 2006. Signs of Logic: Peircean Themes on the Philosophy of Language, Games and Communication. Dordrecht: Springer.
Pietarinen, A.-V. 2011. Existential Graphs: What a Diagrammatic Logic of Cognition Might Look Like. History and Philosophy of Logic 32, pp. 265–281.
Pietarinen, A.-V. 2013. Logical and Linguistic Games from Peirce to Grice to Hintikka (with comments by J. Hintikka). Teorema 33, pp. 121-136.
Pietarinen, A.-V. 2015a. Exploring the Beta Quadrant. Synthese 192, pp. 941-970.
Pietarinen, A.-V. ed. 2015b. Two Papers on Existential Graphs by Charles S. Peirce: 1. Recent Developments of Existential Graphs and their Consequences for Logic (MS 498, 499, 490, S-36, 1906), 2. Assurance through Reasoning (MS 669, 670, 1911). Synthese 192, pp. 881-922.
Pietarinen, A.-V. & Bellucci, F. 2015a. Habits of Reasoning: On the Grammar and Critics of Logical Habits. In D. E. West & M. Anderson eds. Consensus on Peirce’s Concept of Habit: Before and Beyond Consciousness. Dordrecht: Springer.
Pietarinen, A.-V. & Bellucci, F. 2015b. The Iconic Moment: Towards a Peircean Theory of Diagrammatic Imagination. In J. Redmond, A. N. Fernàndez, O. Pombo eds. Epistemology, Knowledge, and the Impact of Interaction, Dordrecht: Springer.
Prior, A. N. 1958. Peirce’s Axioms for Propositional Calculus. The Journal of Symbolic Logic 23, pp. 135–136.
Prior, A. N. 1964. The Algebra of the Copula. In Moore, E. & Robin, R. eds. Studies in the Philosophy of Charles Sanders Peirce. Amherst: The University of Massachusetts Press, pp. 79-94.
Putnam, H. 1982. Peirce the Logician. Historia Mathematica 9, pp. 290-301.
Roberts, D. D. 1973. The Existential Graphs of Charles S. Peirce. The Hague-Paris: Mouton.
Schröder, E. 1890-1905. Vorlesungen über die Algebra der Logik, 3 vols. Leipzig: Teubner.
Shields, P. 1981. Charles S. Peirce on the Logic of Number, 2nd ed. Boston: Docent Press, 2012.
Shin, S.-J. 2002. The Iconic Logic of Peirce’s Graphs. Cambridge, MA: MIT Press.
Short, T. L. 2007. Peirce’s Theory of Signs. Cambridge: Cambridge University Press.
Sowa, J. 2006. Peirce’s Contributions to the 21st Century, 14th International Conference on Conceptual Structures, Aalborg, Denmark, July 16-21, Lecture Notes in Computer Science, 4068, pp. 54-69.
Stjernfelt, F. 2014. Natural Propositions. The Actuality of Peirce’s Doctrine of Dicisigns, Boston: Docent Press.
Sylvester, J. J. 1878. On an Application of the New Atomic Theory to the Graphical Representation of the Invariants and Covariants of Binary Quantics. American Journal of Mathematics 1, pp. 64-104.
Tarski, A. 1941. On the Calculus of Relations. Journal of Symbolic Logic 6, pp. 73-89.
Tiercelin, C. 1991. Peirce’s Semiotic Version of The Semantic Tradition in Formal Logic. In New Inquiries into Meaning and Truth, N. Cooper & P. Engel eds. Harvester Wheatsheaf: St. Martin’s Press, pp. 187–210.
Van Heijenoort, J. 1967. Logic as Calculus and Logic as Language. Synthese 17, pp. 324–330.
Walsh, A. 2012. Relations between Logic and Mathematics in the Work of Benjamin and Charles S. Peirce. Boston: Docent Press.
Weiss, P. & Burks, A. 1945. Peirce’s Sixty-Six Signs. The Journal of Philosophy 42, pp. 383–388.
Zalamea, F. 2012a. Synthetic Philosophy of Contemporary Mathematics, Urbanomic.
Zalamea, F. 2012b. Peirce’s Logic of Continuity: A Mathematical and Conceptual Approach, Docent Press.
Zellweger, S. 1997. Untapped Potential in Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser et al. eds. 1997, pp. 334-386.
Zeman, J. 1964. The Graphical Logic of Charles S. Peirce, Ph.D. dissertation, University of Chicago.
Zeman, J. 1986. Peirce’s Philosophy of Logic. Transactions of the Charles S. Peirce Society 22, pp. 1-22.

Author Information

Francesco Bellucci
Email: bellucci.francesco@gmail.com
Tallinn University of Technology
Estonia

and

Ahti-Veikko Pietarinen
Email: ahti-veikko.pietarinen@ttu.ee
Tallinn University of Technology
Estonia

Thomas S. Kuhn (1922—1996)

Thomas Samuel Kuhn, although trained as a physicist at Harvard University, became an historian and philosopher of science through the support of Harvard’s president, James Conant. In 1962, Kuhn’s renowned The Structure of Scientific Revolutions (Structure) helped to inaugurate a revolution—the 1960s historiographic revolution—by providing a new image of science. For Kuhn, scientific revolutions involved paradigm shifts that punctuated periods of stasis or normal science. Towards the end of his career, however, Kuhn underwent a paradigm shift of his own—from a historical philosophy of science to an evolutionary one.

In this article, Kuhn’s philosophy of science is reconstructed chronologically. To that end, the following questions are entertained: What was Kuhn’s early life and career? What was the road towards Structure? What is Structure? Why did Kuhn revise Structure? What was the road Kuhn took after Structure? At the heart of the answers to these questions is the person of Kuhn himself, especially the intellectual and social context in which he practiced his trade. This chronological reconstruction of Kuhn’s philosophy begins with his work in the 1950s on physical theory in the Lowell lectures and on the Copernican revolution and ends with his work in the 1990s on an evolutionary philosophy of science. Rather than present Kuhn’s philosophy as a finished product, this approach endeavors to capture it in the process of its formation so as to represent it accurately and faithfully.

Early Life and Career
The Road to Structure
The Structure of Scientific Revolutions
The Road after Structure
Conclusion
References and Further Reading
1. Kuhn’s Work
2. Secondary Sources

1. Early Life and Career

Kuhn was born in Cincinnati, Ohio, on 18 July 1922. He was the first of two children born to Samuel L. and Minette (née Stroock) Kuhn, with a brother Roger born several years later. His father was a native Cincinnatian and his mother a native New Yorker. Kuhn’s father, Sam, was a hydraulic engineer, trained at Harvard University and at Massachusetts Institute of Technology (MIT) prior to World War I. He entered the war, and served in the Army Corps of Engineers. After leaving the armed services, Sam returned to Cincinnati for several years before moving to New York to help his recently widowed mother Setty (née Swartz) Kuhn. Kuhn’s mother, Minette, was a liberally educated person who came from an affluent family.

Kuhn’s early education reflected the family’s liberal progressiveness. In 1927, Kuhn began schooling at the progressive Lincoln School in Manhattan. His early education taught him to think independently, but by his own admission, there was little content to the thinking. He remembered that by the second grade, for instance, he was unable to read proficiently, much to the consternation of his parents.

Beginning in the sixth grade, Kuhn’s family moved to Croton-on-Hudson, a small town about fifty miles from Manhattan, and the adolescent Kuhn attended the progressive Hessian Hills School. According to Kuhn the school was staffed by left-oriented radical teachers, who taught the students pacifism. When he left the school after the ninth grade, Kuhn felt he was a bright and independent thinker. After spending an uninspired year at the preparatory school Solebury in Pennsylvania, Kuhn spent his last two years of high school at the Yale-preparatory Taft School in Watertown, Connecticut. He graduated third in his class of 105 students and was inducted into the National Honor Society. He also received the prestigious Rensselaer Alumni Association Medal.

Kuhn matriculated at Harvard College in the fall of 1940, following in his father’s and uncles’ footsteps. At Harvard, he acquired a better sense of himself socially by participating in various organizations. During his first year, Kuhn took a yearlong philosophy course. In the first semester, he studied Plato and Aristotle, while in the second semester he studied Descartes, Spinoza, Hume, and Kant. He intended to take additional philosophy courses but could not find time. He did, however, attend several of George Sarton’s lectures on the history of science, but he found them boring.

At Harvard, Kuhn agonized over majoring in either physics or mathematics. After seeking his father’s counsel, he chose physics because of career opportunities. Interestingly, the attraction of physics or mathematics was their problem-solving traditions. In the fall of his sophomore year, the Japanese attacked Pearl Harbor and Kuhn expedited his undergraduate education by going to summer school. The physics department focused on teaching predominantly electronics, and Kuhn followed suit.

Kuhn underwent another radical transformation, also during his sophomore year. Although he was trained as a pacifist, the atrocities perpetrated in Europe during World War II, especially by Hitler, horrified him. Kuhn experienced a crisis, since he was unable to defend pacifism reasonably. The outcome was that he became an interventionist, which was the position of many at Harvard, especially its president, Conant. The episode left a lasting impact upon him. In a Harvard Crimson editorial, Kuhn supported Conant’s effort to militarize the universities in the United States. The editorial came to the attention of the administration, and eventually Conant and Kuhn met.

In the spring of 1943, Kuhn graduated summa cum laude from Harvard College with an S.B. After graduation, he worked for the Radio Research Laboratory located in Harvard’s biology building. He conducted research on radar counter technology, under John van Vleck’s supervision. The job procured for Kuhn a deferment from the draft. After a year, he requested a transfer to England and then to the continent, where he worked in association with the U.S. Office of Scientific Research and Development. The trip was Kuhn’s first abroad and he felt invigorated by the experience. However, Kuhn realized that he did not like radar work, which led him to reconsider whether he wanted to continue as a physicist. But these doubts did not dampen his enthusiasm for or belief in science. During this time, Kuhn had the opportunity to read what he wanted; he read in the philosophy of science, including authors such as Bertrand Russell, P.W. Bridgman, Rudolf Carnap, and Philipp Frank.

After V-E Day in 1945, Kuhn returned to Harvard. As the war abated with the dropping of atomic bombs on Japan, Kuhn activated an earlier acceptance into graduate school and began studies in the physics department. Although Kuhn persuaded the department to permit him to take philosophy courses during his first year, he again chose the pragmatic course and focused on physics. In 1946, Kuhn passed the general examinations and received a master’s degree in physics. He then began dissertation research on theoretical solid-state physics, under the direction of van Vleck. In 1949, Harvard awarded Kuhn a doctorate in physics.

Although Kuhn had high regard for science, especially physics, he was unfulfilled as a physicist and continually harbored doubts during graduate school about a career in physics. He had chosen both a dissertation topic and an advisor to expedite obtaining a degree. But, he was to find direction for his career through Conant’s invitation in 1947 to help prepare a historical case-based course on science for upper-level undergraduates. Kuhn accepted the invitation to be one of two assistants for Conant’s course. He undertook a project investigating the origins of seventeenth-century mechanics, a project that would transform his image of science.

That transformation came, as Kuhn recounted later, on a summer day in 1947 as he struggled to understand Aristotle’s idea of motion in Physics. The problem was that Kuhn tried to make sense of Aristotle’s idea of motion using Newtonian assumptions and categories of motion. Once he realized that he had to read Aristotle’s Physics using assumptions and categories contemporary to when the Greek philosopher wrote it, suddenly Aristotle’s idea of motion made sense.

After this experience, Kuhn realized that he wanted to be a philosopher of science by doing history of science. His interest was not strictly history of science but philosophy, for he felt that philosophy was the way to truth and truth was what he was after. To achieve that goal, Kuhn asked Conant to sponsor him as a junior fellow in the Harvard Society of Fellows. Harvard initiated the society to provide promising young scholars freedom from teaching for three years to develop a scholarly program. Kuhn’s colleagues stimulated him professionally, especially a senior fellow by the name of Willard Quine. At the time, Quine was publishing his critique on the distinction between the analytic and the synthetic, which Kuhn found reassuring for his own thinking.

Kuhn began as a fellow in the fall of 1948, which provided him the opportunity to retool as a historian of science. Kuhn took advantage of the opportunity and read widely over the next year and a half in the humanities and sciences. Just prior to his appointment as a fellow, Kuhn was also undergoing psychoanalysis. This experience allowed him to see other people’s perspectives and contributed to his approach for conducting historical research.

2. The Road to Structure

a. The Lowell Lectures

In 1950, the trustee of the Lowell Institute, Ralph Lowell, invited Kuhn to deliver the 1951 Lowell lectures. In these lectures, Kuhn outlined a conception of science in contrast to the traditional philosophy of science’s conception in which facts are slowly accumulated and stockpiled in textbooks. Kuhn began by assuring his audience that he, as a once practicing scientist, believed that science produces useful and cumulative knowledge of the world, but that traditional analysis of science distorts the process by which scientific knowledge develops. He went on to inform the audience that the history of science could be instructive for identifying the process by which creative science advances, rather than focusing on the finished product promulgated in textbooks. Because textbooks only state the immutable scientific laws and marshal forth the experimental evidence to support those laws, they cover over the creative process that leads to the laws in the first place.

Kuhn then presented an alternative historical approach to scientific methodology. He claimed that the traditional position in which Galileo rejected Aristotle’s physics because of Galileo’s experiments is a fallacy. Rather, Galileo rejected Aristotelianism as an entire system. In other words, Galileo’s evidence was necessary but not sufficient; rather, the Aristotelian system was under evaluation, which also included its logic. Next, Kuhn proposed an alternative image of science based on the new approach to the history of science. He introduced the notion of conceptual frameworks, and drew from psychology to defend the advancement of science through scientists’ predispositions. These predispositions allow scientists to negotiate a professional world and to learn from their experiences. Moreover, they are important in organizing the scientist’s professional world and scientists do not dispense with them easily. Change in them represents a foundational alteration in a professional world.

Kuhn argued that although logic is important for deriving meaning and for managing and manipulating knowledge, scientific language, being a natural language, outstrips such formalization. He thereby turned the tables on an important tool for the traditional analysis of science. By revealing the limitations of logical analysis, he showed that logic is necessary but insufficient for justifying scientific knowledge. Logic, then, cannot guarantee the traditional image of science as the progressive accumulation of scientific facts. Kuhn next examined logical analysis in terms of language and meaning. His position was that language is a way of dissecting the professional world in which scientists operate. But there is always ambiguity or overlap in the meaning of terms as that world is dissected. Certainly, scientists attempt to increase the precision of their terms but not to the point that they can eliminate ambiguity. Kuhn concluded by distinguishing between creative and textbook science.

In the same year of the Lowell lectures, Harvard appointed Kuhn as an instructor and the following year as an assistant professor. Kuhn’s primary teaching duty was in the general education curriculum, where he taught Natural Sciences 4 along with Leonard Nash. He also taught courses in the history of science. And, it was during this time that Kuhn developed a course on the history of cosmology. Kuhn utilized course preparation for scholarly writing projects. For example, he handed out draft chapters of The Copernican Revolution to his classes.

A part of Kuhn’s motivation for developing a new image of science was the misconceptions of science held by the public. He blamed its misconceptions on introductory courses that stressed the textbook image of science as a fixed body of facts. After discussing this state of affairs with friends and Conant, Kuhn provided students with a more accurate image of science. The key to that image, claimed Kuhn, was science’s history, which displays the creative and dynamic nature of science.

b. The Copernican Revolution

In The Copernican Revolution, Kuhn claimed he had identified an important feature of the revolution, which previous scholars had missed: its plurality. What Kuhn meant by plurality was that scientists have philosophical and even religious commitments, which are important for the justification of scientific knowledge. This stance was anathema to traditional philosophers of science, who believed that such commitments played little—if any—role in the justification of scientific knowledge and relegated them to the discovery process.

Kuhn began reconstruction of the Copernican revolution by establishing the genuine scientific character of ancient cosmological conceptual schemes, especially the two-sphere cosmology composed of an inner sphere for the earth and an outer sphere for the heavens. For Kuhn, conceptual schemes exhibit three important features. They are comprehensive in terms of scientific predictions, there is no final proof for them, and they are derived from other schemes. Finally, to be successful conceptual schemes must perform logical and psychological functions. The logical function is expressed in explanatory terms, while the psychological function in existential terms. Although the logical function of the two-sphere cosmology continued to be problematic, its psychological function afforded adherents a comprehensive worldview that included even religious elements.

The major logical problem with the two-sphere cosmology was the movement and positions of the planets. The conceptual scheme Ptolemy developed in the second century guided research for the next millennium. But problems surfaced with the scheme, and Ptolemy’s successors could only correct it so far with ad hoc modifications. Kuhn asked at this point in the narrative why the Ptolemaic system, given its imperfection, was not overthrown sooner. The answer, for Kuhn, depended on a distinction between the logical and psychological dimensions of scientific revolutions. According to Kuhn, there are logically different conceptual schemes that can organize and account for observations. The difference among these schemes is their predictive power. Consequently, if an observation is made that is not compatible with a prediction, the scheme must be replaced. But before change can occur, there is also the psychological dimension to a revolution.

Copernicus had to overcome not only the logical dimension of the Ptolemaic system but also its psychological dimension. Aristotle had established this latter dimension by wedding the two-sphere cosmology to a philosophical system. Through the Aristotelian notion of motion among the earthly and heavenly spheres, the inner sphere was connected and depended on the outer sphere. The ability to presage future events linked astronomy to astrology. Such an alliance, according to Kuhn, provided a formidable obstacle to change of any kind.

But change began to take place, albeit slowly. From Aristotle to Ptolemy, a sharp distinction arose between the psychological dimensions of cosmology and the mathematical precision of astronomy. By Ptolemy’s time, astronomy was less concerned with the psychological dimensions of data interpretation and more with the accuracy of theoretical prediction. To some extent, this aided Copernicus, since whether the earth moved could be determined by theoretical analysis of the empirical data. But still, the earth as center of the universe gave existential consolation to people. The strands of the Copernican revolution, then, included not only astronomical concerns but also theological, economic, and social ones. Besides the Scholastic tradition, with its impetus theory of motion, other factors also paved the way for the Copernican revolution, including the Protestant revolution, navigation for oceanic voyages, calendar reform, and Renaissance humanism and Neoplatonism.

Copernicus, according to Kuhn, was the immediate inheritor of Aristotelian-Ptolemaic cosmological tradition and, except for the position of the earth, was closer to that tradition than to modern astronomy. For Kuhn, De Revolutionibus precipitated a revolution and was not the revolution itself. Although the problem Copernicus addressed was the same as for his predecessors, that is, planetary motion, his solution was to revise the mathematical model for that motion by making the earth a planet that moves around the sun. Essentially, Copernicus maintained the Aristotelian-Ptolemaic universe but exchanged the sun for the earth, as the universe’s center. Although Copernicus had eliminated major epicycles, he still used minor ones and the accuracy of planetary position was no better than Ptolemy’s. Kuhn concluded that Copernicus did not really solve the problem of planetary motion.

Initially, according to Kuhn, there were only a few supporters of Copernicus’ cosmology. Although the majority of astronomers accepted the mathematical harmonies of De Revolutionibus after its publication in 1543, they rejected or ignored its cosmology. Tycho Brahe, for example, although relying on Copernican harmonies to explain astronomical data, proposed a system in which the earth was still the universe’s center. Essentially, it was a compromise between ancient cosmology and Copernican mathematical astronomy. However, Brahe recorded accurate and precise astronomical observations, which helped to compel others towards Copernicanism—particularly Johannes Kepler, who used its mathematical precision to solve the planetary motion problem. The final player Kuhn considered in the revolution was Galileo, who, Kuhn claimed, provided through telescopic observations not proof of but rather propaganda for Copernicanism.

Although astronomers achieved consensus during the seventeenth century, Copernicanism still faced serious resistance from Christianity. The Copernican revolution was completed with the Newtonian universe, which had an impact not only on astronomy but also on other sciences and even non-sciences. For instance, Newton’s universe changed the nature of God to that of a clockmaker. For Kuhn, Newtonianism’s impact on disciplines other than astronomy was an example of its fruitfulness. Scientific progress, concluded Kuhn, is not the linear process, as championed by traditional philosophers of science, in which scientific facts are stockpiled in a warehouse. Rather, it is the repeated destruction and replacement of scientific theories.

The professional reviews of The Copernican Revolution signaled Kuhn’s acceptance into the philosophical and historical communities. His reconstruction of the revolution was considered for the most part scientifically accurate and methodologically appropriate. Reviewers considered integration of the science and the social an advance over other histories that ignored these dimensions of the historical narrative. Although philosophers appreciated the historical dimension of Kuhn’s study, they found its analysis imprecise according to their standards. Overall, both the historical and philosophical communities expressed no major objections to the image of science that animated Kuhn’s narrative.

Kuhn’s reconstruction of the Copernican revolution portrayed a radically different image of science than that of traditional philosophers of science. Justification of scientific knowledge was not simply a logical or objective affair but also included non-logical or subjective factors. According to Kuhn, scientific progress is not a clear-sighted linear process aimed directly at the truth. Rather, there are contingencies that can divert and forestall the progress of science. Moreover, Copernicus’ revolution changed the way astronomers and non-astronomers viewed the world. This change in perceiving the world was the result of new sets of challenges, new techniques, and a new hermeneutics for interpreting data.

Besides differing from traditional philosophers of science, Kuhn’s image of science put him at odds with Whig historians of science. These historians underrated ancient cosmologies by degrading them to myth or religious belief. Such a move was often a rhetorical ploy on the part of the victors to enhance the status of the current scientific theory. Only by showing how Aristotelian-Ptolemaic geocentric astronomy was authentic science could Kuhn argue for the radical transformation (revolution) that Copernican heliocentric astronomy invoked. Kuhn also asserted that Copernicus’ theory was not accepted simply for its predictive ability, since it was not as accurate as the original conceptual scheme, but because of non-empirical factors, such as the simplicity of Copernicus’ system, in which certain ad hoc modifications for accounting for the orbits of various planets were eliminated.

In 1956, Harvard denied Kuhn tenure because the tenure committee felt his book on the Copernican revolution was too popular in its approach and analysis. A friend of Kuhn knew Steven Pepper, who was chair of the philosophy department at the University of California at Berkeley. Kuhn’s friend told Pepper that Kuhn was looking for an academic position. Pepper’s department was searching for someone to establish a program in the history and philosophy of science. Berkeley eventually offered Kuhn a position in the philosophy department and later asked if he also wanted an appointment in the history department. Kuhn accepted both positions and joined the Berkeley faculty as an assistant professor.

Kuhn found in Stanley Cavell, a member of the philosophy department, a soulmate to replace Nash. Kuhn had met Cavell earlier while they were both fellows at Harvard. Cavell was an ethicist and aesthetician, whom Kuhn found intellectually stimulating. He introduced Kuhn to Wittgenstein’s notion of language games. Besides Cavell, Kuhn developed a professional relationship with Paul Feyerabend, who was also working on the notion of incommensurability.

In 1958, Berkeley promoted Kuhn to associate professor and granted him tenure. Moreover, having completed several historical projects, he was ready to return to the philosophical issues that first attracted him to the history of science. Beginning in the fall of 1958, he spent a year as a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford, California. What struck Kuhn about the relationships among behavioral and social scientists was their inability to agree on the fundamental problems and practices of their discipline. Although natural scientists do not necessarily have the right answers to their questions, there is an agreement over fundamentals. This difference between natural and social scientists eventually led Kuhn to formulate the paradigm concept.

c. The Last Mile to Structure

Although The Copernican Revolution represented a significant advance in Kuhn’s articulation of a revolutionary theory of science, several issues still needed attention. What was missing from Kuhn’s reconstruction of the Copernican revolution was an understanding of how scientists function on a daily basis, when an impending revolution is not looming. That understanding emerged gradually during the last mile on the road to Structure in terms of three papers written from the mid-fifties to the early sixties.

In the first paper, ‘The function of measurement in modern physical science’, Kuhn challenged the belief that if scientists cannot measure a phenomenon then their knowledge of it is inadequate or not scientific. Part of the reason for Kuhn’s concern over measurement in science was its textbook tradition, which he believed perpetuates a misleading myth about measurement. Kuhn compared the textbook presentation of measurement to a machine in which scientists feed laws and theories along with initial conditions into the machine’s hopper at the top, turn a handle on the side representing logical and mathematical operations, and then collect numerical predictions exiting the machine’s chute in the front. Scientists finally compare experimental observations to theoretical predictions. These measurements serve as a test of the theory, which is the confirmation function of measurement.

Kuhn claimed that the above function is not why measurements are reported in textbooks; rather, measurements are reported to give the reader an idea of what the professional community believes is reasonable agreement between theoretical predictions and experimental observations. Reasonable agreement, however, depends upon approximate, not exact, agreement between theory and data and differs from one science to the next. Moreover, external criteria do not exist for determining reasonableness. For Kuhn, the actual function of normal measurement in science is found in its journal articles. That function is neither the invention of novel theories nor the confirmation of older ones. Discovery and exploratory measurements in science instead are rare. The reason is that changes in theories, which require discovery or confirmation, occur during revolutions, which are also quite rare. Once a revolution occurs, moreover, the new theory only exhibits potential for ordering and explaining natural phenomena. The function of normal measurement is to tighten reasonable agreement between novel theoretical predictions and experimental observations.

The textbook tradition is also misleading in terms of normal measurement’s effects. It claims that theories must conform to quantitative facts. Such facts are not the given but the expected and the scientist’s task is to obtain them. This obligation to obtain the expected quantitative fact is often the incentive for developing novel technology. Moreover, a well-developed theoretical system is required for meaningful measurement in science. Besides the function of normal or expected measurement, Kuhn also examined the function of extraordinary measurement, which pertains to unexpected results. It is this latter type of measurement that exhibits the discovery and confirmatory functions. When normal scientific practice results consistently in unexpected anomalies, this leads to crisis, and extraordinary measurement often helps to resolve it. Crisis then leads to the invention of new theories. Again, extraordinary measurement plays a critical role in this process. Theory invention in response to quantitative anomalies leads to decisive measurements for judging a novel theory’s adequacy, whereas qualitative anomalies generally lead to ad hoc modifications of theories. Extraordinary measurement allows scientists to choose among competing theories.

Kuhn was moving closer towards a notion of normal science through an analysis of normal measurement, in contrast to extraordinary measurement, in science. His conception of science continued to distance him from traditional philosophers of science. But the notion of normal measurement was not as robust as he needed. Importantly, Kuhn was changing the agenda for philosophy of science from the justification of scientific theories as finished products in textbooks to the dynamic process by which theories are tested and assimilated into the professional literature. A robust notion of normal science was the revolutionary concept he needed to overturn the traditional image of science as an accumulated body of facts.

With the introduction of normal and extraordinary measurement, the step towards the notions of normal and extraordinary science in Kuhn’s revolutionary image of science was imminent. Kuhn worked out those notions in The Essential Tension. He began by addressing the notion that creative thinking in science presupposes a particular conception of science, one in which science advances through unbridled imagination and divergent thinking—which involves identifying multiple avenues by which to solve a problem and determining which one works best. Kuhn acknowledged that such thinking is responsible for some scientific progress, but he proposed that convergent thinking—which limits itself to well-defined, often logical, steps for solving a problem—is also an important means of progress. While revolutions, which depend on divergent thinking, are an obvious means for scientific progress, Kuhn insisted that few scientists consciously design revolutionary experiments. Rather, most scientists engage in normal research, which represents convergent thinking. But occasionally scientists may break with the tradition of normal science and replace it with a new tradition. Science, as a profession, is both traditional and iconoclastic, and the tension between them often creates a space in which to practice it.

Next, Kuhn utilized the term paradigm, while discussing the pedagogical advantages of convergent thinking—especially as displayed in science textbooks. Whereas textbooks in other disciplines include the methodological and conceptual conflicts prevalent within the discipline, science textbooks do not. Rather, science education is the transmission of a tradition that guides the activities of practitioners. In science education, students are taught not to evaluate the tradition but to accept it.

Progress within normal research projects represents attempts to bring theory and observation into closer agreement and to extend a theory’s scope to new phenomena. Given the convergent and tradition-bound nature of science education and of scientific practice, how can normal research be a means for the generation of revolutionary knowledge and technology? According to Kuhn, a mature science provides the background that allows practitioners to identify non-trivial problems or anomalies with a paradigm. In other words, without mature science there can be no revolution.

Kuhn continued to develop the notion of normal research and its convergent thinking in ‘The function of dogma in scientific research’. He began with the traditional image of science as an objective and critical enterprise. Although this is the ideal, the reality is that often scientists already know what to expect from their investigations of natural phenomena. If the expected is not forthcoming, then scientists must struggle to find conformity between what they expect and what they observe, which textbooks encode as dogmas. Dogmas are critical for the practice of normal science and for advancement in it because they define the puzzles for the profession and stipulate the criteria for their solution.

Kuhn next expanded the range of paradigms to embrace scientific practice in general, rather than simply as a model for research. Specifically, paradigms include not only a community’s previous scientific achievements but also its theoretical concepts, the experimental techniques and protocols, and even the natural entities. In short, they are the community’s body of beliefs or foundations. Paradigms are also open-ended in terms of solving problems. Moreover, they are exclusive in their nature, in that there is only one paradigm per mature science. Finally, they are not permanent fixtures of the scientific landscape, for eventually paradigms are replaceable. Importantly, for Kuhn, when a paradigm replaces another the two paradigms are radically different.

Having done paradigmatic spadework, Kuhn then discussed the notion of normal scientific research. The process of matching paradigm and nature includes extending and applying the paradigm to expected but also unexpected parts of nature. This does not necessarily mean discovering the unknown so much as explaining the known. Although the dogma paper is only a fragment of the solution to problems associated with the traditional image of science, the complete solution was soon to appear in Structure.

3. The Structure of Scientific Revolutions

In July 1961, Kuhn completed a draft of Structure; and in 1962, it was published as the final monograph in the second volume of Neurath’s International Encyclopedia of Unified Science. Charles Morris was instrumental in its publication and Carnap served as its editor. Structure was not a single publishing event in 1962; rather, it covered the years from 1962 to 1970. After its publication, Kuhn was engrossed for the rest of the sixties addressing criticisms directed to the ideas contained in it, especially the paradigm concept. During this time, he continued to develop and refine his new image of science. The endpoint was a second edition of Structure that appeared in 1970. The text of the revised edition, however, remained essentially unaltered and only a ‘Postscript—1969’ was added in which Kuhn addressed his critics.

What Kuhn proposed in Structure was a new image of science. That image differed radically from the traditional one. The difference hinged on a shift from a logical analysis and an explanation of scientific knowledge as finished product to a historical narration and description of scientific practices by which a community of practitioners produces scientific knowledge. In short, it was a shift from the subject (the product) to the verb (to produce).

According to the traditional image, science is a repository of accumulated facts, discovered by individuals at specific periods in history. One of the central tasks of traditional historians, given this image of science, was to answer questions about who discovered what and when. Even though the task seemed straightforward, many historians found it difficult and doubted whether these were the right kinds of questions to ask concerning science’s historical record. The historiographic revolution in the study of science changed the sorts of questions historians asked by revising the underlying assumptions concerning the approach to reading the historical record. Rather than reading history backwards and imposing current ideas and values on the past, texts are read within their historical context thereby maintaining their integrity. The historiographic revolution also had implications for how to analyze and understand science philosophically. The goal of Structure, declared Kuhn, was to cash out those implications.

The structure of scientific development, according to Kuhn, may be illustrated schematically, as follows: pre-paradigm science → normal science → extraordinary science → new normal science. The step from pre-paradigm science to normal science involves consensus of the community around a single paradigm, where no prior consensus existed. This is the step required for transitioning from immature to mature science. The step from normal science to extraordinary science includes the community’s recognition that the reigning paradigm is unable to account for accumulating anomalies. A crisis ensues, and community practitioners engage in extraordinary science to resolve its anomalies. A scientific revolution occurs with crisis resolution. Once a community selects a new paradigm, it discards the old one and another period of new normal science follows. The revolution or paradigm shift is now complete, and the cycle from normal science to new normal science through revolution is free to occur again.

For Kuhn, the origin of a scientific discipline begins with the identification of a natural phenomenon, which members of the discipline investigate experimentally and attempt to explain theoretically. But, each member of that nascent discipline is at cross-purposes with other members; for each member often represents a school working from different foundations. Scientists, operating under these conditions, share few, if any, theoretical concepts, experimental techniques, or phenomenal entities. Rather, each school is in competition for monetary and social resources and for the allegiance of the professional guild. An outcome of this lack of consensus is that all facts seem equally relevant to the problem(s) at hand and fact gathering itself is often a random activity. There is then a proliferation of facts and hence little progress in solving the problem(s) under these conditions. Kuhn called this state pre-paradigm or immature science, which is non-directed and flexible, providing a community of practitioners little guidance.

To achieve the status of a science, a discipline must reach consensus with respect to a single paradigm. This is realized when, during the competition involved in pre-paradigm science, one school makes a stunning achievement that catches the professional community’s attention. The candidate paradigm elicits the community’s confidence that the problems are solvable with precision and in detail. The community’s confidence in a paradigm to guide research is the basis for the conversion of its members, who now commit to it. After paradigm consensus, Kuhn claimed that scientists are in the position to commence with the practice of normal science. The prerequisite of normal science then includes a commitment to a shared paradigm that defines the rules and standards by which to practice science. Whereas pre-paradigm science is non-directed and flexible, normal or paradigm science is highly directed and rigid. Because of its directedness and rigidity, normal scientists are able to make the progress they do.

The paradigm concept loomed large in Kuhn’s new image of science. He defined the concept in terms of the community’s concrete achievements, such as Newtonian mechanics, which the professional can commonly recognize but cannot fully describe or explain. A paradigm is certainly not just a set of rules or algorithms by which scientists blindly practice their trade. In fact, there is no easy way to abstract a paradigm’s essence or to define its features exhaustively. Moreover, a paradigm defines a family resemblance, à la Wittgenstein, of problems and procedures for solving problems that are part of a single research tradition.

Although scientists rely, at times, on rules to guide research, these rules do not precede paradigms. Importantly, Kuhn was not claiming that rules are unnecessary for guiding research but rather that they are not always sufficient, either pedagogically or professionally. Kuhn compared the paradigm concept to Polanyi’s notion of tacit knowledge, in which knowledge production depends on the investigator’s acquisition of skills that do not reduce to methodological rules and protocols.

As noted above, Newtonian mechanics represents an example of a Kuhnian paradigm. The three laws of motion comprising it provided the scientific community with the resources to investigate natural phenomena in terms of both precision and predictability. In terms of precision, Newtonian mechanics allowed physicists to measure and explain accurately—with clockwork exactitude—the motion not only of celestial but also terrestrial bodies. With respect to prediction, physicists used the Newtonian paradigm to determine the potential movement of heavenly and earthly bodies. Thus, Newtonian mechanics qua paradigm equipped physicists with the ability to explain and manipulate natural phenomena. In sum, it became a way of viewing the world.

According to Kuhn, a paradigm allows scientists to ignore concerns over a discipline’s fundamentals and to concentrate on solving its puzzles—as the Newtonian paradigm permitted physicists to do for several centuries. It not only guides scientists in terms of identifying soluble puzzles, but it also prevents scientists from tackling insoluble ones. Kuhn compared paradigms to maps that guide and direct the community’s investigations. Only when a paradigm guides the community’s activities is scientific advancement as cumulative progress possible.

The activity of practitioners engaged in normal science is paradigm articulation and extension to new areas. Indeed, the Newtonian paradigm was adapted even for medicine. When a new paradigm is established, it solves only a few critical problems that faced the community. But, it does offer the promise for solving many more problems. Much of normal science involves mopping up, in which the community forces nature into a conceptually rigid framework—the paradigm. Rather than being dull and routine, however, such activity, according to Kuhn, is exciting and rewarding and requires practitioners who are creative and resourceful.

Normal scientists are not out to make new discoveries or to invent new theories outside the paradigm’s aegis. Rather, they are involved in using the paradigm to understand nature precisely and in detail. From the experimental end of this task, normal scientists go to great pains to increase the precision and reliability of their measurements and facts. They are also involved in closing the gap between observations and theoretical predictions, and they attempt to clarify ambiguities left over from the paradigm’s initial adoption. They also strive to extend the scope of the paradigm by including phenomena not heretofore investigated. Much of this activity requires exploratory investigation, in which normal scientists make discoveries that are novel but anticipated vis-à-vis the paradigm. To solve these experimental puzzles often requires considerable technological ingenuity and innovation on the part of the scientific community. As Kuhn notes, Atwood’s machine, developed almost a century after Newton, is a good illustration of this.

Besides experimental puzzles, there are also the theoretical puzzles of normal science, which obviously mirror the types of experimental puzzles. Normal scientists conduct theoretical analyses to enhance the match between theoretical predictions and experimental observations, especially in terms of increasing the paradigm’s precision and scope. Again, just as experimental ingenuity is required so is theoretical ingenuity to explain natural phenomena successfully.

Normal science, according to Kuhn, is puzzle-solving activity, and its practitioners are puzzle solvers and not paradigm testers. The paradigm’s power over a community of practitioners is that it can transform seemingly insoluble problems into soluble ones through the practitioner’s ingenuity and skill. Besides the assured solution, Kuhn’s paradigm concept also involved rules of the puzzle-solving game not in a narrow sense of algorithms but in a broad sense of viewpoints or preconceptions. Besides these rules of the game, as it were, there are also metaphysical commitments, which inform the community as to the types of natural entities, and methodological commitments, which inform the community as to kinds of laws and explanations. Although rules are often necessary for normal scientific research, they are not always required. Normal science can proceed in the absence of such rules.

Although scientists engaged in normal science do not intentionally attempt to make unexpected discoveries, such discoveries do occur. Paradigms are imperfect and rifts in the match between paradigm and nature are inevitable. For Kuhn, discoveries not only occur in terms of new facts but there is also invention in terms of novel theories. Both discoveries of new facts and invention of novel theories begin with anomalies, which are violations of paradigm expectations during the practice of normal science. Anomalies can lead to unexpected discoveries. For Kuhn, unexpected discoveries involve complex processes that include the intertwining of both new facts and novel theories. Facts and theories go hand-in-hand, for such discoveries cannot be made by simple inspection. Because discoveries depend upon the intertwining of observations and theories, the discovery process takes time for the conceptual integration of the novel with the known. Moreover, that process is complicated by the fact that novelties are often resisted due to prior expectations. Because of allegiance to a paradigm, scientists are loath to abandon it simply because of an anomaly or even several anomalies. In other words, anomalies are generally not counter-instances that falsify a paradigm.

Just as anomalies are critical for discovery of new facts or phenomena, so they are essential for the invention of novel theories. Although facts and theories are intertwined, the emergence of novel theories is the outcome of a crisis. The crisis is the result of the paradigm’s breakdown or inability to provide solutions to its anomalies. The community then begins to harbor questions about the ability of the paradigm to guide research, which has a profound impact upon it. The chief characteristic of a crisis is the proliferation of theories. As members of a community in crisis attempt to resolve its anomalies, they offer more and varied theories. Interestingly, anomalies that are responsible for the crisis may not necessarily be new since they may have been present all along. This helps to explain why anomalies lead to a period of crisis in the first place. The paradigm promised resolution of them but was unable to fulfill its promise. The overall effect is a return to a situation very similar to pre-paradigm science.

Closure of a crisis occurs in one of three possible ways, according to Kuhn. First, on occasion the paradigm is sufficiently robust to resolve the anomalies and to restore normal science practice. Second, even the most radical methods are unable to resolve the anomalies. Under these circumstances, the community tables them until future investigation and analysis. Third, the crisis is resolved with the replacement of the old paradigm by a new one but only after a period of extraordinary science.

Kuhn stressed that the initial response of a community in crisis is not to abandon its paradigm. Rather, its members make every effort to salvage it through ad hoc modifications until the anomalies can be resolved, either theoretically or experimentally. The reason for this strong allegiance, claimed Kuhn, is that a community must first have an alternative candidate to take the original paradigm’s place. For science, at least normal science, is possible only with a paradigm, and to reject it without a substitute is to reject science itself, which reflects poorly on the community and not on the paradigm. Moreover, a community does not reject a paradigm simply because of a fissure in the paradigm-nature fit. Kuhn’s aim was to reject a naïve Popperian falsificationism in which single counter-instances are sufficient to reject a theory. In fact, he reversed the tables and contended that counter-instances are essential for the practice of vibrant normal science. Although the goal of normal science is not necessarily to generate counter-instances, normal science practice does provide the occasion for their possible occurrence. Normal science, then, serves as an opportunity for scientific revolutions. If there are no counter-instances, reasoned Kuhn, scientific development comes to a halt.

The transition from normal science through crisis to extraordinary science involves two key events. First, the paradigm’s boundaries become blurred when faced with recalcitrant anomalies; and, second, its rules are relaxed leading to proliferation of theories and ultimately to the emergence of a new paradigm. Often relaxing the rules allows practitioners to see exactly where the problem is and how to solve it. This state has tremendous impact upon a community’s practitioners, similar to that during pre-paradigm science. Extraordinary scientists, according to Kuhn, behave erratically—because scientists are trained under a paradigm to be puzzle-solvers, not paradigm-testers. In other words, they are not trained to do extraordinary science and must learn as they go. For Kuhn, this type of behavior is more open to psychological than logical analysis. Moreover, during periods of extraordinary science practitioners may even examine the discipline’s philosophical foundations. To that end, they analyze their assumptions in order to loosen the old paradigm’s grip on the community and to suggest alternative approaches to the generation of a new paradigm.

Although the process of extraordinary science is convoluted and complex, a replacement paradigm may emerge suddenly. Often the source of its inspiration is rooted in the practice of extraordinary science itself, in terms of the interconnections among various anomalies. Finally, whereas normal science is a cumulative process, adding one paradigm achievement to the next, extraordinary science is not; rather, it is like—using Herbert Butterfield’s analogy—grabbing hold of a stick’s other end. That other end of the stick is a scientific revolution.

The transition from extraordinary science to a new normal science represents a scientific revolution. According to Kuhn, a scientific revolution is a non-cumulative episode in which a newer paradigm replaces an older one, either partially or completely. It can come in two sizes: a major revolution such as the shift from a geocentric universe to a heliocentric universe or a minor revolution such as the discovery of X-rays or oxygen. But whether big or small, all revolutions have the same structure: generation of a crisis through irresolvable anomalies and establishment of a new paradigm that resolves the crisis-producing anomalies.

Because of the extreme positions taken by participants in a revolution, opposing camps often become galvanized in their positions, and communication between them breaks down and discourse fails. The ultimate source for the establishment of a new paradigm during a crisis is community consensus, that is, when enough community members are convinced by persuasion and not simply by empirical evidence or logical analysis. Moreover, to accept the new paradigm, community practitioners must be assured that there is no chance for the old paradigm to solve its anomalies.

Persuasion loomed large in Kuhn’s scientific revolutions because the new paradigm solves the anomalies the old paradigm could not. Thus, the two paradigms are radically different from each other, often with little overlap between them. For Kuhn, a community can only accept the new paradigm if it considers the old one wrong. The radical difference between old and new paradigms, such that the old cannot be derived from the new, is the basis of the incommensurability thesis. In essence, there is no common measure or standard for the two paradigms. This is evident, claimed Kuhn, when looking at the meaning of theoretical terms. Although the terms from an older paradigm can be compared to those of a newer one, the older terms must be transformed with respect to the newer ones. But, there is a serious problem with restating the old paradigm in transformed terms. The older, transformed paradigm may have some utility, for example pedagogically, but a community cannot use it to guide its research. Like a fossil, it reminds the community of its history but it can no longer direct its future.

The establishment of a new paradigm resolves a scientific revolution and issues forth a new period of normal science. With its establishment, Kuhn’s new image of a mature science comes full circle. Only after a period of intense competition among rival paradigms does the community choose a new paradigm and scientists once again become puzzle-solvers rather than paradigm-testers. The resolution of a scientific revolution is not a straightforward process that depends only upon reason or evidence. Part of the problem is that proponents of competing paradigms cannot agree on the relevant evidence or proof or even on the relevant anomalies that require resolution, since their paradigms are incommensurable.

Another factor that leads to difficulties in resolving scientific revolutions is that communication among members in crisis is only partial. This results from the new paradigm borrowing theoretical terms, concepts, and laboratory protocols from the old paradigm. Although the two paradigms share borrowed vocabulary and technology, the new paradigm gives them new meanings and uses. The net result is that members of competing paradigms talk past one another. Moreover, the change in paradigms is not a gradual process in which different parts of the paradigm are changed piecemeal; rather, the change must occur as a whole and suddenly. Convincing scientists to make such a wholesale transformation takes time.

How then does one segment of the community convince or persuade another to switch paradigms? Members who worked for decades under the old paradigm may never accept the new paradigm. Rather, it is often the younger members who accept the new paradigm through something like a religious conversion. According to Kuhn, faith is the basis for conversion, especially faith in the potential of the new paradigm to solve future puzzles. By invoking the terms conversion and faith, Kuhn was not implying that arguments and reason are unimportant in a paradigm shift. Indeed, the most common reason for accepting a new paradigm is that it solves the anomalies the old paradigm could not. However, Kuhn’s point was that argument and reason alone are insufficient. Aesthetic or subjective factors also play an important role in a paradigm shift, since the new paradigm solves only a few, but critical, anomalies. These factors weigh heavily in the shift initially by reassuring community members that the new paradigm represents the discipline’s future.

From the resolution of revolutions, Kuhn made several important philosophical points concerning the principles of verification and falsification. As Kuhn acknowledged, philosophers no longer search for absolute verification, since no theory can be tested exhaustively; rather, they calculate the probability of a theory’s verification. According to probabilistic verification, all imaginable theories must be compared with one another vis-à-vis the available data. The problem in terms of Kuhn’s new image of science is that a theory is tested with respect to a given paradigm, and such a restriction precludes access to every imaginable theory. Moreover, Kuhn rejected falsifying instances because no paradigm resolves every problem facing a community. Under these conditions, no paradigm would ever be accepted. For Kuhn, the process of verification and falsification must include imprecision associated with theory-fact fit.

An interesting feature of scientific revolutions, according to Kuhn, is their invisibility. What he meant by this is that in the process of writing textbooks, popular scientific essays, and even philosophy of science, the path to the current paradigm is sanitized to make it appear as if it was in some sense born mature. Disguising a paradigm’s history is an outcome of a belief about scientific knowledge, which considers it as invariable and its accumulation as linear. This disguising serves the winner of the crisis by establishing its authority, especially as a pedagogical aid for indoctrinating students into a community of practitioners. Another important effect of a revolution, related to a paradigm shift, is a shift in the community’s image of science. The change in science’s image should be no surprise, since the prevailing paradigm defines science. Change that paradigm and science itself changes, at least how to practice it. In other words, the shift in science’s image is a result of a change in the community’s standards for what constitutes its puzzles and its puzzles’ solutions. Finally, revolutions transform scientists from practitioners of normal science, who are puzzle-solvers, to practitioners of extraordinary science, who are paradigm-testers. Besides transforming science, revolutions also transform the world that scientists inhabit and investigate.

One of the major impacts of a scientific revolution is a change of the world in which scientists practice their trade. Kuhn’s world-changes thesis, as it has become known, is certainly one of his most radical and controversial ideas, besides the associated incommensurability thesis. The issue is how far the change goes ontologically, or whether it is simply an epistemological ploy to reinforce the comprehensive effects of scientific revolutions. In other words, does the world really change or simply the worldview, that is, one’s perspective on or perception of the world? For Kuhn, the answer relied not on a logical or even a philosophical but rather a psychological analysis of the change.

Kuhn analyzed the changes in worldview by analogizing it to a gestalt switch, for example, duck-rabbit. Although the gestalt analogy is suggestive, it is limited to only perceptual changes and says little about the role of previous experience in such transformations. Previous experience is important because it influences what a scientist sees when making an observation. Moreover, with a gestalt switch, the person can stand above or outside of it acknowledging with certainty that one sees now a duck or now a rabbit. Such an independent perspective, which eventually is an authoritarian stance, is not available to the community of practitioners; there is no answer sheet, as it were. Because the community’s access to the world is limited by what it can observe, any change in what is observed has important consequences for the nature of what is observed, that is, the change has ontological significance.

Thus, for Kuhn, the change a revolution brings about is more than simply seeing or observing a different world; it also involves working in a different world. The perceptual transformation is more than reinterpretation of data. For data are not stable; they too change during a paradigm shift. Data interpretation is a function of normal science, while data transformation is a function of extraordinary science. That transformation is often a result of intuitions. Moreover, besides a change in data, revolutions change the relationships among data. Although traditional western philosophy has searched for three centuries for stable theory-neutral data or observations to justify theories, that search has been in vain. Sensory experience occurs through a paradigm of some sort, argued Kuhn, as do even articulations of that experience. Hence, no one can step outside a paradigm to make an observation; it is simply impossible given the limits of human physiology.

Kuhn then took on the nature of scientific progress. For normal science, progress is cumulative in that the solutions to puzzles form a repository of information and knowledge about the world. This progress is the result of the direction a paradigm provides a community of practitioners. Importantly, the progress achieved through normal science, in terms of the information and knowledge, is used to educate the next generation of scientists and to manipulate the world for human welfare. Scientific revolutions change all that. For Kuhn, revolutionary progress is not cumulative but non-cumulative.

What, then, does a community of practitioners gain by going through a revolution or paradigm shift? Has it made any kind of progress in its rejection of a previous paradigm and the fruit that paradigm yielded? Of course, the victors of the revolution are going to claim that progress was made after the revolution. To do otherwise would be to admit that they were wrong. Rather, advocates of the new normal science are going to do everything they can to ensure that their winning paradigm is seen as pushing forward a better understanding of the world. The progress achieved through a revolution is two-fold, according to Kuhn. The first is the successful solution of anomalies that a previous paradigm could not solve. The second is the promise to solve additional problems or puzzles that arise from these anomalies.

But has the community gotten closer to the truth, that is, the notion of verisimilitude, by going through a revolution? According to Kuhn the answer is no. For Kuhn, progress in science is not directed activity towards some goal like truth. Rather, scientific progress is evolutionary. Just as natural selection operates during biological evolution in the emergence of a new species, so community selection during a scientific revolution functions similarly in the emergence of a new theory. And, just as species are adapted to their environments, so theories are adapted to the world. Kuhn had no answer to the question why this should be other than the world and the community that investigates it exhibit unique features. What these features are, Kuhn did not know, but he concluded that the new image of science he had proposed would resolve, like a new paradigm after a scientific revolution, these problems. He invited the next generation of philosophers of science to join him in a new philosophy of science incommensurate with its predecessor.

The reaction to Kuhn’s Structure was at first congenial, especially among historians of science, but within a few years it turned critical, particularly among philosophers. Critics charged him with irrationalism and epistemic relativism. Although he felt the reviews of Structure were good, his chief concerns were the tags of irrationalism and relativism, at least a pernicious kind of relativism. Kuhn believed the charges were inaccurate, arising simply because he maintained that science does not progress toward a predetermined goal. Rather, as in evolutionary change, one theory replaces another because it offers a better fit between theory and nature vis-à-vis its competitors. Moreover, he believed that Darwinian evolution was the correct framework for discussing science’s progress, but he felt no one took it seriously.

On 13 July 1965, Kuhn participated in an International Colloquium in the Philosophy of Science, held at Bedford College in London. The colloquium was organized jointly by the British Society for the Philosophy of Science and by the London School of Economics and Political Science. Kuhn delivered the initial paper comparing his and Karl Popper’s conceptions of the growth of scientific knowledge. John Watkins then delivered a paper criticizing Kuhn’s notion of normal science, with Popper chairing the session. Popper also presented a paper criticizing Kuhn, as did several other members of the philosophy of science community, including Stephen Toulmin, L. Pearce Williams, and Margaret Masterman, who identified twenty-one senses of Kuhn’s use of paradigm in Structure. Masterman concluded her paper inviting others to join in clarifying Kuhn’s paradigm concept.

Kuhn himself took up Masterman’s challenge and clarified the paradigm concept in the second edition of Structure, particularly in its ‘Postscript—1969’. To that end, he divided paradigm into disciplinary matrix and exemplars. The former represents the milieu of the professional practice, consisting of symbolic generalizations, models, and values, while the latter represents solutions to concrete problems that a community accepts as paradigmatic. In other words, exemplars serve as templates for solving problems or puzzles facing the scientific community and thereby for advancing the community’s scientific knowledge. For Kuhn, scientific knowledge is not localized simply within theories and rules; rather, it is localized within exemplars. The basis for an exemplar to function in puzzle solving is the scientist’s ability to see the similarity between a previously solved puzzle and a currently unsolved one.

In the early sixties, van Vleck invited Kuhn to direct a project collecting materials on the history of quantum mechanics. In August 1960, Hunter Dupree, Charles Kittel, Kuhn, John Wheeler, and Harry Woolf met in Berkeley to discuss the project’s organization. Wheeler next met with Richard Shryock, and a joint committee of the American Physical Society and the American Philosophical Society on the History of Theoretical Physics in the Twentieth Century was formed to sponsor and develop the project. The project lasted for three years, with the first and last years of the project conducted in Berkeley and the middle year in Europe. The National Science Foundation funded the project.

The project led to a publication, by John Heilbron and Kuhn, on the origins of the Bohr atom. They provided a revisionist narrative of Bohr’s path to the quantized atom, beginning with his 1911 doctoral dissertation and concluding with his 1913 three-part paper on atomic structure. The intrigue of this historical study was that within a six-week period in mid-1912 Bohr went from little interest in models of the atom to producing a quantized model of Ernest Rutherford’s atom and applying that model to several perplexing problems. The authors explored Bohr’s sudden interest in atomic models. They proposed that his interest stemmed from specific problems, which guided Bohr in terms of both his reading and research toward the potential of the atomic structure for solving them. The solutions to those problems resulted from what Heilbron and Kuhn called a February 1913 transformation in Bohr’s research. What initiated the transformation, claimed the authors, was that Bohr had read a few months earlier J.W. Nicholson’s papers on the application of Max Planck’s constant to generate an atomic model. Although Nicholson’s model was incorrect, it led Bohr in the right direction. Then in February 1913, Bohr, in a conversation with H.M. Hansen, obtained the last piece of the puzzle. After the transformation, Bohr completed the atomic model project within the year.

In 1961, besides completing a draft of Structure, Kuhn was made full professor at Berkeley, but only in the history department. Members of the philosophy department voted to deny him promotion in their department, a denial that angered and hurt Kuhn tremendously. While he was in Europe, Princeton University made Kuhn an offer to join its faculty. The university had recently inaugurated a history and philosophy of science program. The program’s chair was Charles Gillispie and its staff included John Murdoch, Hilary Putnam, and Carl Hempel. Upon returning to the United States in 1963, Kuhn visited Princeton. He decided to accept the offer and joined its faculty in 1964. He became the program’s director in 1967 and the following year Princeton appointed him the Moses Taylor Pyne Professor of History. As the sixties ended, Structure was becoming increasingly popular, especially among student radicals who believed it liberated them from the tyranny of tradition.

4. The Road after Structure

In 1979, Kuhn moved to M.I.T.’s Department of Linguistics and Philosophy. In 1983, he was appointed the Laurance S. Rockefeller Professor of Philosophy. At M.I.T., he took a linguistic turn in his thinking, reflecting his new environment, which had a major impact on his subsequent work, especially on the incommensurability thesis.

Structure’s success not only established the historiographic revolution in the study of science, whether pursued historically, philosophically, or within what came to be called the discipline of history and philosophy of science, but also supported the rise of science studies in general and specifically the sociology and anthropology of science, particularly the sociology of scientific knowledge. Kuhn rejected both of these trajectories, often attributed to Structure, in favor of what he called historical philosophy of science. As he categorized his work in The Essential Tension, he conducted either historical studies of science, with their historiographic implications, or metahistorical studies, with their philosophical implications. In other words, his scholarly work was either historical or philosophical.

a. Historical and Historiographic Studies

Kuhn’s final major historical study was on Planck’s black-body radiation theory and the origins of quantum discontinuity. The transition from classical physics—in which particles pass through intermediate energy stages—to quantum physics—in which energy change is discontinuous—is traditionally attributed to Planck’s 1900 and 1901 quantum papers. According to Kuhn, this traditional account was inaccurate and the transition was initiated by Albert Einstein’s and Paul Ehrenfest’s independent 1906 quantum papers. Kuhn’s realization of this inaccuracy was similar to the enlightenment he experienced when struggling to make sense of Aristotle’s notion of mechanical motion. His initial epiphany occurred while reading Planck’s 1895 paper on black-body radiation. Through that experience, he realized that Planck’s 1900 and 1901 quantum papers were not the initiation of a new theory of quantum discontinuity, but rather they represented Planck’s effort to derive the black-body distribution law based on classical statistical mechanics. Kuhn concluded the study with an analysis of Planck’s second black-body theory, first published in 1911, in which Planck used the notion of discontinuity to derive the second theory. Rather than the traditional position, which claimed the second theory represents a regression on Planck’s part to classical physics, Kuhn argued that it represents the first time Planck incorporated into his theoretical work a theory in which he was not completely confident.

In the black-body radiation and quantum discontinuity historical study, Kuhn did not use the notions of paradigm, normal science, anomaly, crisis, or incommensurability, which he championed in Structure. Critics, especially within the history and philosophy of science discipline, were disappointed. Kuhn bemoaned the book’s reception, even by its supporters. However, he later explored the historiographic and philosophical issues raised in Black-Body Theory with respect to Structure. The historiographic issues that the former book addressed were the same as those raised in the 1962 monograph. Specifically, he claimed that current historiography should attempt to understand previous scientific texts in terms of their contemporary context and not in terms of modern science. Kuhn’s concern was more than historical accuracy; rather, he was interested in recapturing the thought processes that lead to a change in theory. Although Structure was Kuhn’s articulation of this process for scientific change, the terminology in the monograph did not represent a straitjacket for narrating history. For Kuhn, the terminology and vocabulary used in Structure, like paradigm and normal science, were not products, such as metaphysical categories, to which a historical narrative must conform; rather, they had a different metaphysical function, as presuppositions towards an historical narrative as process. In other words, Structure’s terminology and vocabulary were tools by which to reconstruct a scientific historical narrative and not a template for articulating it.

The purpose of history of science, according to Kuhn, was not just getting the facts straight but providing philosophers of science with an accurate image of science to practice their trade. Kuhn fervently believed that the new historiography of science would prevent philosophers from engaging in the excesses and distortions prevalent within traditional philosophy of science. He envisioned history of science informing philosophy of science as historical philosophy of science rather than history and philosophy of science, since the relationship was asymmetrical.

Prior to 1950, history of science was a discipline practiced mostly by eminent scientists, who generally wrote heroic biographies or sweeping overviews of a discipline, often for pedagogical purposes. Within the past generation, historians of science, such as Alexander Koyré, Anneliese Maier, and E.J. Dijksterhuis, developed an approach to the history of science that was more than simply chronicling science’s theoretical and technical achievements. An important factor in that development was the recognition of institutional and sociological factors in the practice of science. A consequence of this historiographic revolution was the distinction between internal and external histories of science. Internal history of science is concerned with the development of the theories and methods employed by scientists. In other words, it studies the history of events, people, and ideas internal to scientific advancement. The historian as internalist attempts to climb inside the mind of scientists as they push forward the boundaries of their discipline. External history of science concentrates on the social and cultural factors that impinge on the practice of science.

For Kuhn, the distinction between internal and external histories of science mapped onto his pattern of scientific development. External or cultural and social factors are important during a scientific discipline’s initial establishment; however, once established, those factors no longer have a major impact on a community’s practice or its generation of scientific knowledge. They can have a minor impact on a mature science’s practice, such as the timing of technological innovation. Importantly for Kuhn, internal and external approaches to the history of science are not necessarily mutually exclusive but complementary.

b. Metahistorical Studies

As mentioned already, Kuhn considered himself a practitioner of both the history of science and the philosophy of science and not the history and philosophy of science, for a very practical reason. Crassly put, the goal for history is the particular while for philosophy it is the universal. Kuhn compared the differences between the two disciplines to a duck-rabbit Gestalt switch. In other words, the two disciplines are so fundamentally different in terms of their goals that the resulting images of science are incommensurable. Moreover, to see the other discipline’s image requires a conversion. For Kuhn, then, the history of science and the philosophy of science cannot be practiced at the same time but only alternately, and then with difficulty.

How then can the history of science be of use to philosophers of science? The answer for Kuhn was by providing an accurate image of science. Rejecting the covering law model for historical explanation because it reduces historians to mere social scientists, Kuhn advocated an image based on ordering of historical facts into a narrative analogous to the one he proposed for puzzle solving under the aegis of a paradigm in the physical sciences. Historians of science, as they narrate change in science, provide an image of science that reflects the process by which scientific information develops, rather than the image provided by traditional philosophers of science in which scientific knowledge is simply a product of logical verification or falsification. Kuhn insisted that the history of science and the philosophy of science remain distinct disciplines, so that historians of science can provide an image of science to correct the distortion produced by traditional philosophers of science.

According to Kuhn, the social history of science also distorts the image of science. For social historians, scientists construct rather than discover scientific knowledge. Although Kuhn was sympathetic to this type of history, he believed it created a gap between older constructions and the ones replacing them, which he challenged historians of science to fill. Besides social historians of science, Kuhn also accused sociologists of science of distorting the image of science. Although Kuhn acknowledged that factors such as interests, power, and authority, among others, are important in the production of scientific knowledge, the predominant use of them by sociologists eclipses other factors such as nature itself. The key to rectifying the distortion introduced by sociologists is to shift from a rationality of belief, that is, the reasons scientists hold specific beliefs, to a rationality of change in beliefs, that is, the reasons scientists change their beliefs. For Kuhn, a historical philosophy of science was the means for correcting these distortions of the scientific image.

Kuhn’s historical philosophy of science focused on the metahistorical issues derived from historical research, particularly scientific development and the related issues of theory choice and incommensurability. Importantly for Kuhn, theory choice and incommensurability are intimately linked. Because of incommensurability, theory choice cannot be reduced to an algorithm of objective rules but requires subjective values.

Kuhn explored scientific development using three different approaches. The first was in terms of problem versus puzzle solving. According to Kuhn, problems have no ready solution; problem solving is generally pragmatic and is the hallmark of an underdeveloped or immature science. Puzzles, on the other hand, occupy the attention of scientists involved in a developed or mature science. Although they have guaranteed solutions, the methods for solving puzzles are not assured. Scientists who solve them demonstrate their ingenuity and are rewarded by the community.

With this distinction in mind, Kuhn envisioned scientific development as the transition of a scientific discipline from an underdeveloped problem-solving state to a developed puzzle-solving one. The question then arises as to how this occurs. The answer that many took from Structure was to adopt a paradigm. However, Kuhn found this answer to be incorrect in that paradigms are not unique to the sciences. But does articulating the question in terms of puzzle-solving help? Kuhn’s answer was pragmatic, that is, keep trying different solutions until one works. In other words, philosophers of science had no exemplars by which to solve their problems.

Kuhn’s second approach to scientific development was in terms of the growth of knowledge. He proposed an alternative view to the traditional one that scientific knowledge grows by a piecemeal accumulation of facts. To shed light on the alternative view, Kuhn offered a different reconstruction of science. The central ideas of a science cohere with one another, forming a set of the central ideas or core of a particular science. Besides the core, a periphery exists, which represents an area where scientists can investigate problems associated with a research tradition without changing core ideas.

Kuhn then drew parallels between the current reconstruction of science and the earlier one in Structure. Obviously, the transition in cores from one research tradition to another is a scientific revolution. Moreover, the core represents a paradigm that defines a particular research tradition. Finally, the periphery is identified with normal science. The core then provides the means by which to practice science, and to change the core requires significant retooling that practitioners naturally resist.

Is this change in the core a growth of knowledge? To answer the question, Kuhn examined the standard account of knowledge as justified true belief. What he found problematic with the account is the amount or nature of the evidence needed to justify a belief. And this, of course, raises the issue of truth for which he had no ready solution. Ultimately, Kuhn equivocated on the question of the growth of knowledge.

Kuhn’s final approach to scientific development was through the analysis of three scientific revolutions: the shift from Aristotelian to Newtonian physics, Volta’s discovery of the electric cell, and Planck’s black-body radiation research and quantum discontinuity. From these examples, Kuhn derived three characteristics of scientific revolutions. The first was holistic in that scientific revolutions are all-or-none events. The second was the way referents change after revolutions, especially in terms of taxonomic categories. According to Kuhn, revolutions redistribute objects among these categories. The final characteristic of scientific revolutions was a change in a discipline’s analogy, metaphor, or model, which represents the connection between taxonomic categories and the world’s structure.

According to traditional philosophers of science, the objective features of a good scientific theory include accuracy, consistency, scope, simplicity, and fecundity. However, these features, when used individually as criteria for theory choice, argued Kuhn, are imprecise and often conflict with one another. Although necessary for theory choice, they are insufficient and must include the characteristics of the scientists making the choices. These characteristics involve personal experiences or biography and personality or psychological traits. In other words, not only does theory choice rely on a theory’s objective features but also on individual scientists’ subjective characteristics.

Why have traditional philosophers of science ignored or neglected subjective factors in theory choice? Part of the answer is that they confined the subjective to the context of discovery, while restricting the objective to the context of justification. Kuhn insisted that this distinction does not fit with observations of scientific practice. It is artificial, reflecting science pedagogy. But, actual scientific practice reveals that textbook presentations of theory choice are stylized, to convince students who rely on the authority of their instructors. What else can students do? Textbook science discloses only the product of science, not its process. For Kuhn, since subjective factors are present at the discovery phase of science, they should also be present at the justification phase.

According to Kuhn, objective criteria function as values, which do not dictate theory choice but rather influence it. Values help to explain scientists’ behavior, which for the traditional philosopher of science may at times appear irrational. Most importantly, values account for disagreement over theories and help to distribute risk during debates over theories. Kuhn’s position had important consequences for the philosophy of science. He maintained that critics misinterpreted his position on theory choice as subjective. For them, the term denoted a matter of taste that is not rationally discussable. But, his use of the term did involve the discussable with respect to standards. Moreover, Kuhn denied that facts are theory independent and that there is strictly a rational choice to be made. Rather, he contended scientists do not choose a theory based on objective criteria alone but are converted based on subjective values.

Finally, Kuhn discussed theory choice with respect to the incommensurability thesis. The question he entertained was what type of communication is possible among community members holding competing theories. The answer, according to Kuhn, is that communication is partial. The answer raised a second, and more important, question for Kuhn and his critics. Is good reason vis-à-vis empirical evidence available to justify theory choice, given such partial communication? The answer would be straightforward if communication were complete, but it is not. For Kuhn, this situation meant that ultimately reasonable evaluation of the empirical evidence is not compelling for theory choice; this, of course, raised the charge of irrationality, which he denied.

Kuhn identified two common misconceptions of his version of the incommensurability thesis. The first was that since two incommensurable theories cannot be stated in a common language, they cannot be compared to one another in order to choose between them. The second was that since an older theory cannot be translated into modern expression, it cannot be articulated meaningfully.

Kuhn addressed the first misconception by distinguishing between incommensurability as no common measure and as no common language. He defined the incommensurability thesis in terms of the latter rather than the former. Most theoretical terms are homophonic and can have the same meaning in two competing theories; only a handful of terms are incommensurable or untranslatable. Kuhn considered this a modest version of the incommensurability thesis, calling it local incommensurability, and claimed that it was his original intention. Although there may be no common language to compare terms that change their meaning during a scientific revolution, there is a partially common language composed of the invariant terms that do permit some semblance of comparison. Thus, Kuhn argued, the first criticism fails because, and this was his main point, an incommensurate residue remains even with a partially common language.

As for the second misconception, Kuhn claimed that critics conflate translation and interpretation. The conflation is understandable since translation often involves interpretation. Translation for Kuhn is the process by which words or phrases of one language are substituted for those of another. Interpretation, however, involves attempts to make sense of a statement or to make it intelligible. Incommensurability, then, does not mean that a theoretical term cannot be interpreted, that is, cannot be made intelligible; rather, it means that the term cannot be translated, that is, there is no equivalent for the term in the competing theoretical language. In other words, in order for the theoretical term to have meaning the scientist must go native in its use.

Kuhn introduced the notion of the lexicon and its attendant taxonomy to capture both a term’s reference and its intension or sense. In the lexicon, referring terms are interrelated with other referring terms, which is the holistic principle. The lexicon’s structure of interrelated terms resembles the world’s structure in terms of its taxonomic categories. A particular scientific community uses its lexicon to describe and explain the world in terms of this taxonomy. Members of a community or of different communities must share the same lexicon if they are to communicate fully with one another. Moreover, claimed Kuhn, if full translation is to be achieved, the two languages must share a similar structure with respect to their respective lexicons. Incommensurability, then, reflects lexicons that have different taxonomic structures by which the world is carved up and articulated.

Kuhn also addressed a problem that involves communication among communities who hold incommensurable theories, or who occupy positions across a historical divide. Kuhn noted that although lexicons can change dramatically, this does not deter members from reconstructing their past in the current lexicon’s vocabulary. Such reconstruction obviously plays an important function in the community. But the issue is that, given the incommensurable nature of theories, assessments of true and false or right and wrong are unwarranted, for which critics charged Kuhn with a relativist position—a position he was less inclined to deny.

The charge stemmed from the fact that Kuhn advocated no privileged position from which to evaluate a theory. Rather, evaluations must be made within the context of a particular lexicon, and thus evaluations are relative to the relevant lexicon. But Kuhn found the charge of relativism trivial. He acknowledged that his position on the relativity of truth and objectivity, with respect to the community’s lexicon, left him no option but to take literally world changes associated with lexical changes. But is this an idealist position? Kuhn admitted that it appears to be, but he claimed that it is an idealism like none other. On the one hand, the world is composed of the community’s lexicon, but on the other hand, preconceived ideas cannot mold it.

c. Evolutionary Philosophy of Science

From the mid-1980s to the early 1990s, Kuhn transitioned from historical philosophy of science and the paradigm concept to an evolutionary philosophy of science and the lexicon notion. To that end, he identified an alternative role for the incommensurability thesis with respect to segregating or isolating lexicons and their associated worlds from one another. Incommensurability now functioned for Kuhn as a mechanism to isolate a community’s lexicon from another’s and as a means to underpin a notion of scientific progress as the proliferation of scientific specialties. In other words, as the taxonomic structures of two lexicons become isolated from, and thereby incommensurable with, one another, according to Kuhn, a new specialty and its lexicon split off from the old or parent specialty and its lexicon. This process accounts for a notion of scientific progress as an increase in the number of scientific specialties after a revolution.

Scientific progress, then, is akin to biological speciation, argued Kuhn, with incommensurability serving as the isolation mechanism. The result is a tree-like structure with increased specialization at the tips of the branches. Finally, Kuhn’s evolutionary philosophy of science is non-teleological in the sense that science progresses not towards an ultimate truth about the world but simply away from a lexicon that cannot be used to solve its anomalies to one that can. However, he still articulated incommensurability in terms of no common language, with its attendant problems involving the notion of meaning, and did not transform it fully with respect to an evolutionary philosophy of science.

Kuhn was working out an evolutionary philosophy of science in a proposed book, Words and Worlds: An Evolutionary View of Scientific Development. He divided it into three parts, with three chapters in each. In the first part, Kuhn framed the problem associated with the incommensurability thesis and addressed the difficulties of accessing past scientific achievements. In the first chapter, he presented an evolutionary view of scientific development. Without an Archimedean platform to guide theory assessment, Kuhn proposed a comparative method for assessing theoretical changes. The method forbids assessment of theories in isolation and methodological solipsism. In the next chapter, he discussed the problems associated with examining past historical studies in science. Based on several historical cases, he claimed that anomalies in older scientific texts could be understood only through an interpretative process involving an ethnographic or a hermeneutical reading. He had now laid the groundwork for examining the incommensurability thesis. In the third chapter, Kuhn discussed the changes of word-meanings as changes in a taxonomy embedded in a lexicon, an apparatus of a language’s referring terms. The result of these changes was an untranslatable gap between two incommensurable theories. Finally, the lexical terms referring to objects change as scientific specialties proliferate.

In the book’s second part, Kuhn continued to explore the nature of a community’s lexicon, which he explicated in terms of taxonomic categories. These categories are grouped as contrast sets and no overlap of categories exists within the same contrast set, which Kuhn called the no-overlap principle. The principle prohibits two terms from referring to the same objects unless the terms are related to one another as species to genus. Moreover, the properties of the categories are reflected in the properties of their names. A term’s meaning then is a function of its taxonomic category, and this restriction is the origin of untranslatability. In the first chapter of this part, Kuhn discussed the nature of substances in terms of sortal predicates. This move allowed Kuhn to introduce plasticity into the lexicon’s usage. Moreover, the differentiating set is not strictly conventional but relies on the world to which the different sets connect. In the next chapter, Kuhn extended the lexicon notion to artifacts, abstractions, and theoretical entities.

In the final chapter of the second part, Kuhn specified the means by which community members acquire a lexicon. First, they must already possess a vocabulary about physical entities and forces. Next, definitions play little, if any, role in learning new terms; rather, those terms are acquired through ostensive examples, especially through problem solving and laboratory demonstrations. Third, a single example is inadequate to learn the meaning of a term; rather, multiple examples are required. Next, acquisition of a new term within a statement also requires acquisition of other new terms within that statement. And lastly, students can acquire the terms of a lexicon through different pedagogical routes.

In the book’s concluding part, Kuhn discussed what occurs during a change in the lexicon and the implications for scientific development. In chapter seven, he examined the means by which lexicons change and the repercussions such change has for communication among communities with different lexicons. Moreover, he explored the role of arguments in lexical change. In the subsequent chapter, Kuhn identified the type of progress achieved with changes in lexicons. He maintained that progress is not the type that aims at a specific goal but rather is instrumental. In the final chapter, he broached the issues of relativism and realism not in traditional terms of truth and objectivity but rather with respect to the capability of making a statement. Statements from incommensurable theories that cannot be translated are ultimately ineffable. They can be neither true nor false but their capability of being stated is relative to the community’s history.

In sum, the book’s aim was certainly to address the philosophical issues left over from Structure, but more importantly, it was to resolve the problems generated by a historical philosophy of science. Although others were also responsible for its creation, Kuhn assumed responsibility for resolving the problems; and the sine qua non for resolving them was the incommensurability thesis. For Kuhn, the thesis was required more than ever to defend rationality from the post-modern development of the strong program.

5. Conclusion

In May 1990, a conference (or, as Hempel called it, a Kuhnfest) was held in Kuhn’s honor at MIT, sponsored by the Sloan Foundation and organized by Paul Horwich and Judith Thomson. The conference speakers included Jed Buchwald, Nancy Cartwright, John Earman, Michael Friedman, Ian Hacking, John Heilbron, Ernan McMullin, N. M. Swerdlow, and Norton Wise. The papers reflected Kuhn’s impact on the history and the philosophy of science. Hempel made a special appearance on the last day, followed by Kuhn’s remarks on the conference papers. As he approached the podium after Hempel’s remarks, before a standing-room-only audience, Kuhn was visibly moved by the outpouring of professional appreciation for his contributions to a discipline that he cherished, from members whom he truly respected.

Kuhn retired from teaching in 1991 and became an emeritus professor at MIT. During his career, he received numerous awards and accolades. He was the recipient of honorary degrees from around a dozen academic institutions, such as the University of Chicago, Columbia University, the University of Padua, and the University of Notre Dame. He was elected a member of the National Academy of Sciences, the most prestigious society for U.S. scientists, and was an honorary life member of the New York Academy of Sciences and a corresponding fellow of the British Academy. He was president of the History of Science Society from 1968 to 1970 and the society awarded him its highest honor, the Sarton Medal, in 1982. Kuhn was also the recipient in 1977 of the Howard T. Behrman Award for distinguished achievement in the humanities and in 1983 of the celebrated John Desmond Bernal award. Kuhn died on 17 June 1996 in Cambridge, Massachusetts, after suffering for two years from cancer of the throat and bronchial tubes.

6. References and Further Reading

a. Kuhn’s Work

Kuhn Papers, MIT MC 240, Institute Archives and Special Collections, MIT Libraries, Cambridge, MA.
Kuhn, T. S. (1957) The Copernican Revolution: Planetary Astronomy in the Development of Western Thought. Cambridge, MA: Harvard University Press.
Kuhn, T. S. (1962) The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press.
Kuhn, T. S. (1963) ‘The function of dogma in scientific research’, in A.C. Crombie, ed. Scientific Change: Historical Studies in the Intellectual, Social and Technical Conditions for Scientific Discovery and Technical Invention, From Antiquity to the Present. New York: Basic Books, pp. 347-69.
Kuhn, T. S., Heilbron, J. L., Forman, P. and Allen, L. (1967) Sources for History of Quantum Physics: An Inventory and Report. Philadelphia, PA: American Philosophical Society.
Heilbron, J. L., and Kuhn, T. S. (1969) ‘The genesis of the Bohr atom’. Historical Studies in the Physical Sciences, 1, 211-90.
Kuhn, T. S. (1970) The Structure of Scientific Revolutions (2nd edition). Chicago, IL: University of Chicago Press.
Kuhn, T. S. (1977) The Essential Tension: Selected Studies in Science Tradition and Change. Chicago: University of Chicago Press.
Kuhn, T. S. (1987) Black-Body Theory and the Quantum Discontinuity, 1894-1912 (revised edition). Chicago: University of Chicago Press.
Kuhn, T. S. (1990) ‘Dubbing and redubbing: the vulnerability of rigid designation’, in C.W. Savage, ed. Scientific Theories. Minneapolis, MN: University of Minnesota Press, pp. 298-318.
Kuhn, T. S. (2000) The Road since Structure: Philosophical Essays, 1970-1993, with an Autobiographical Interview. Chicago: University of Chicago Press.
- Contains a comprehensive interview with Kuhn covering his life and work.

b. Secondary Sources

Andersen, H. (2001) On Kuhn. Belmont, CA: Wadsworth Publishing.
- A general introduction to Kuhn and his philosophy.
Andersen, H., Barker, P. and Chen, X. (2006) The Cognitive Structure of Scientific Revolutions. New York: Cambridge University Press.
Barnes, B. (1982) T.S. Kuhn and Social Science. London: Macmillan Press.
- Discusses the impact of Kuhn’s philosophy for the sociology of science.
Bernardoni, J. (2009) Knowing Nature without Mirrors: Thomas Kuhn’s Antirepresentationalist Objectivity. Saarbrücken, DE: VDM Verlag Dr. Müller.
Bird, A. (2000) Thomas Kuhn. Princeton, NJ: Princeton University Press.
- A critical introduction to Kuhn’s philosophy of science.
Bird, A. (2012) ‘The Structure of Scientific Revolutions and its significance: an essay review of the fiftieth anniversary edition’. British Journal for the Philosophy of Science, 63, 859-83.
Buchwald, J. Z. and Smith, G. E. (1997) ‘Thomas S. Kuhn, 1922-1996’. Philosophy of Science, 64, 361-76.
D’Agostino, F. (2010) Naturalizing Epistemology: Thomas Kuhn and the Essential Tension. New York: Palgrave Macmillan.
Davidson, K. (2006) The Death of Truth: Thomas S. Kuhn and the Evolution of Ideas. New York: Oxford University Press.
Favretti, R. R., Sandri, G. and Scazzieri, R., eds. (1999) Incommensurability and Translation: Kuhnian Perspectives on Scientific Communication and Theory Change. Northampton, MA: Edward Elgar.
Fuller, S. (2000) Thomas Kuhn: A Philosophical History of Our Times. Chicago, IL: University of Chicago Press.
- A revisionist account of Kuhn as a foot soldier in Conant’s agenda to educate the public about science.
Fuller, S. (2004) Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
Gattei, S. (2008) Thomas Kuhn’s ‘Linguistic Turn’ and the Legacy of Logical Empiricism: Incommensurability, Rationality and the Search for Truth. Burlington, VT: Ashgate.
Gutting, G., ed. (1980) Paradigms and Revolutions: Appraisals and Applications of Thomas Kuhn’s Philosophy of Science. Notre Dame, IN: University of Notre Dame Press.
- A collection of articles addressing Kuhn’s philosophy of science.
Heilbron, J. L. (1998) ‘Thomas Samuel Kuhn’. Isis, 89, 505-15.
Horgan, J. (1991) ‘Reluctant revolutionary’. Scientific American, 264, 40-9.
- Is based on an interview with Kuhn about his philosophy.
Hufbauer, K. (2012) ‘From student of physics to historian of science: TS Kuhn’s education and early career, 1940–1958’. Physics in Perspective, 14, 421-70.
- A detailed reconstruction of Kuhn’s education and early career at Harvard.
Horwich, P., ed. (1993) World Changes: Thomas Kuhn and the Nature of Science. Cambridge, MA: MIT Press.
- The published papers from the 1990 Kuhnfest.
Hoyningen-Huene, P. (1993) Reconstructing Scientific Revolutions: Thomas S. Kuhn’s Philosophy of Science. Chicago, IL: University of Chicago Press.
Hoyningen-Huene, P. and Sankey, H., eds. (2001) Incommensurability and Related Matters. Boston, MA: Kluwer.
Hung, E. H. -C. (2006) Beyond Kuhn: Scientific Explanation, Theory Structure, Incommensurability, and Physical Necessity. Burlington, VT: Ashgate.
Kindi, V. and Arabatzis T., eds. (2012) Kuhn’s The Structure of Scientific Revolutions Revisited. New York: Routledge.
- A collection of essays examining the impact of Structure on contemporary philosophy of science.
Kuukkanen, J. M. (2008) Meaning Changes: A Study of Thomas Kuhn’s Philosophy. Saarbrücken, DE: VDM Verlag Dr. Müller.
Lakatos, I. and Musgrave, A., eds. (1970) Criticism and the Growth of Knowledge. Cambridge, U.K.: Cambridge University Press.
- The published papers from the 1965 London colloquium.
Marcum, J.A. (2015) Thomas Kuhn’s Revolutions: A Historical and an Evolutionary Philosophy of Science? London: Bloomsbury.
Nickles, T., ed. (2003) Thomas Kuhn. Cambridge, UK: Cambridge University Press.
Onkware, K. (2010) Thomas Kuhn and Scientific Progression: Investigation on Kuhn’s Account of How Science Progresses. Saarbrücken, DE: Lambert Academic Publishing.
Preston, J. M. (2008) Kuhn’s The Structure of Scientific Revolutions: A Reader’s Guide. London: Continuum.
Ruse, M. (1999) The Darwinian Revolution: Science Red in Tooth and Claw (2nd edition). Chicago, IL: University of Chicago Press.
Sankey, H. (1994) The Incommensurability Thesis. London: Ashgate.
Sardar, Z. (2000) Thomas Kuhn and the Science Wars. New York: Totem Books.
Sharrock, W., and Read, R. (2002) Kuhn: Philosopher of Scientific Revolution. Cambridge, UK: Polity.
Sigurdsson, S. (1990) ‘The nature of scientific knowledge: an interview with Thomas Kuhn’. Harvard Science Review, Winter issue, 18-25.
Suppe, F., ed. (1977) The Structure of Scientific Theories (2nd edition). Urbana, IL: University of Illinois Press.
Swerdlow, N.M. (2013) ‘Thomas S. Kuhn, 1922-1996’. Biographical Memoir, National Academy of Sciences USA. http://www.nasonline.org/publications/biographical-memoirs/memoir-pdfs/kuhn-thomas.pdf.
Torres, J. M., ed. (2010) On Kuhn’s Philosophy and Its Legacy. Faculdade de Ciências da Universidade de Lisboa.
von Dietze, E. (2001) Paradigms Explained: Rethinking Thomas Kuhn’s Philosophy of Science. Westport, CT: Praeger.
Wade, N. (1977) ‘Thomas S. Kuhn: revolutionary theorist of science’. Science, 197, 143-5.
Wang, X. (2007) Incommensurability and Cross-Language Communication. Burlington, VT: Ashgate.
Wray, K. B. (2011) Kuhn’s Evolutionary Social Epistemology. New York: Cambridge University Press.

Author Information

James A. Marcum
Email: James_Marcum@baylor.edu
Baylor University
U. S. A.

Morality and Cognitive Science

What do we know about how people make moral judgments? And what should moral philosophers do with this knowledge? This article addresses the cognitive science of moral judgment. It reviews important empirical findings and discusses how philosophers have reacted to them.

Several trends have dominated the cognitive science of morality in the early 21st century. One is a move away from strict opposition between biological and cultural explanations of morality’s origin, toward a hybrid account in which culture greatly modifies an underlying common biological core. Another is the fading of strictly rationalist accounts in favor of those that recognize an important role for unconscious or heuristic judgments. Along with this has come expanded interest in the psychology of reasoning errors within the moral domains. Another trend is the recognition that moral judgment interacts in complex ways with judgment in other domains; rather than being caused by judgments about intention or free will, moral judgment may partly influence them. Finally, new technology and neuroscientific techniques have led to novel discoveries about the functional organization of the moral brain and the roles that neurotransmitters play in moral judgment.

Philosophers have responded to these developments in a variety of ways. Some deny that the cognitive science of moral judgment has any relevance to philosophical reflection on how we ought to live our lives, or on what is morally right to do. One argument to this end follows the traditional is/ought distinction and insists that we cannot generate a moral ought from any psychological is. Another argument insists that the study of morality is autonomous from scientific inquiry, because moral deliberation is essentially first-personal and not subject to any third-personal empirical correction.

Other philosophers argue that the cognitive science of moral judgment may have significant revisionary consequences for our best moral theories. Some make an epistemic argument: if moral judgment aims at discovering moral truth, then psychological findings can expose when our judgments are unreliable, like faulty scientific instruments. Other philosophers focus on non-epistemic factors, such as the need for internal consistency within moral judgment, the importance of conscious reflection, or the centrality of intersubjective justification. Certain cognitive scientific findings might require a new approach to these features of morality.

The first half of this article (sections 1 to 4) surveys the cognitive science literature, describing key experimental findings and psychological theories in the moral domain. The second half (sections 5 to 10) discusses how philosophers have reacted to these findings: some have sought to immunize moral inquiry from empirical revision, while others have enthusiastically taken up psychological tools to make new moral arguments.

Note that the focus of this article is on moral judgment. See the article “Moral Character” for discussion of the relationship between cognitive science and moral character.

Biological and Cultural Origins of Moral Judgment
The Psychology of Moral Reasoning
Interaction between Moral Judgment and Other Cognitive Domains
The Neuroanatomy of Moral Judgment
What Do Moral Philosophers Think of Cognitive Science?
Moral Cognition and the Is/Ought Distinction
1. Semantic Is/Ought
2. Non-semantic Is/Ought
The Autonomy of Morality
Moral Cognition and Moral Epistemology
Non-epistemic Approaches
Objections and Alternatives
References and Further Reading

1. Biological and Cultural Origins of Moral Judgment

One key empirical question is this: are moral judgments rooted in innate factors or are they fully explained by acquired cultural traits? During the 20th century, scientists tended to adopt extreme positions on this question. The psychologist B. F. Skinner (1971) saw moral rules as socially conditioned patterns of behavior; given the right reinforcement, people could be led to judge virtually anything morally right or wrong. The biologist E. O. Wilson (1975), in contrast, argued that nearly all of human morality could be understood via the application of evolutionary biology. Around the early 21st century, however, most researchers on moral judgment began to favor a hybrid model, allowing roles for both biological and cultural factors.

There is evidence that at least the precursors of moral judgment are present in humans at birth, suggesting an evolutionary component. In a widely cited study, Kiley Hamlin and colleagues examined the social preferences of pre-verbal infants (Hamlin, Wynn, and Bloom 2007). The babies, seated on their parents’ laps, watched a puppet show in which a character trying to climb a hill was helped up by one puppet, but pushed back down by another puppet. Offered the opportunity to reach out and grasp one of the two puppets, babies showed a preference for the helping puppet over the hindering puppet. This sort of preference is not yet full-fledged moral judgment, but it is perhaps the best we can do in assessing the social responses of humans before the onset of language, and it suggests that however human morality comes about, it builds upon innate preferences for pro-social behavior.

A further piece of evidence comes from the work of theorists Leda Cosmides and John Tooby (1992), who argue that the minds of human adults display an evolutionary specialization for moral judgment. The Wason Selection Task is an extremely well-established paradigm in the psychology of reasoning, which shows that most people make persistent and predictable mistakes in evaluating abstract inferences. A series of studies by Cosmides, Tooby, and their colleagues shows that people do much better on a form of this task when it is restricted to violations of social norms. So, for instance, rather than being asked to evaluate an abstract rule linking numbers and colors, people were asked to evaluate a rule prohibiting those below a certain age from consuming alcohol. Participants in these studies made the normal mistakes when looking for violations of abstract rules, but made fewer mistakes in detecting violations of social rules. According to Cosmides and Tooby, these results suggest that moral judgment evolved as a domain-specific capacity, rather than as an application of domain-general reasoning. If this is right, then there must be at least an innate core to moral judgments.
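
The logic of the selection task can be made concrete with a small sketch. For a rule of the form "if P then Q", the only cards worth turning over are those showing P (the hidden side might be not-Q) and those showing not-Q (the hidden side might be P). The card faces, the even-number/red rule, and the age threshold of 21 below are illustrative assumptions, not the materials used in the studies cited above.

```python
def cards_to_turn(visible_faces, shows_p, shows_not_q):
    """Return the faces that could hide a violation of 'if P then Q'.

    A card can falsify the rule only if it shows P or shows not-Q.
    """
    return [face for face in visible_faces if shows_p(face) or shows_not_q(face)]


# Abstract version (hypothetical rule): "If a card shows an even number,
# its other side is red."
abstract = cards_to_turn(
    ["8", "3", "red", "brown"],
    shows_p=lambda f: f.isdigit() and int(f) % 2 == 0,
    shows_not_q=lambda f: not f.isdigit() and f != "red",
)

# Social version (hypothetical threshold): "If a person is drinking beer,
# they must be at least 21."
social = cards_to_turn(
    ["drinking beer", "drinking soda", "age 25", "age 16"],
    shows_p=lambda f: f == "drinking beer",
    shows_not_q=lambda f: f.startswith("age ") and int(f.split()[1]) < 21,
)

print(abstract)
print(social)
```

Both versions have the same logical structure, so the same function picks out the correct cards in each. The finding reported above is that people more often select the correct pair when the rule is framed as a social norm.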

Perhaps the most influential hybrid model of innate and cultural factors in moral judgment research is the linguistic analogy research program (Dwyer 2006; Hauser 2006; Mikhail 2011). This approach is explicitly modeled on Noam Chomsky’s (1965) generative linguistics. According to Chomsky, the capacity for language production and some basic structural parameters for functioning grammar are innate, but the enormous diversity of human languages comes about through myriad cultural settings and prunings within the evolutionarily allowed range of possible grammars. By analogy, then, moral grammar suggests that the capacity of making moral judgments is innate, along with some basic constraints on the substance of the moral domain—but within this evolutionarily enabled space, culture works to select and prune distinct local moralities.

The psychologist Jonathan Haidt (2012) has highlighted the role of cultural difference in the scientific study of morality through his influential Moral Foundations account. According to Haidt, all documented moral beliefs can be classified as fitting within a handful of moral sub-domains, such as harm-aversion, justice (in distribution of resources), respect for authority, and purity. Haidt argues that moral differences between cultures reflect differences in emphasis on these foundations. In fact, he suggests, industrialized western cultures appear to have emphasized the harm-aversion and justice foundations almost exclusively, unlike many other world cultures. Yet Haidt also insists upon a role for biology in explaining moral judgment; he sees each of the foundations as rooted in a particular evolutionary origin. Haidt’s foundations account remains quite controversial, but it is a prominent example of contemporary scientists’ focus on avoiding polarized answers to the biology versus culture question.

The rest of this article mostly sets aside further discussion of the distal—evolutionary or cultural—explanation of moral judgment. Instead it focuses on proximal cognitive explanations for particular moral judgments. This is because the ultimate aim is to consider the philosophical significance of moral cognitive science, whereas the moral philosophical uptake of debates over evolution is discussed elsewhere. Interested readers can see the articles on “Evolutionary Ethics” and “Moral Relativism.”

2. The Psychology of Moral Reasoning

One crucial question is whether moral judgments arise from conscious reasoning and reflection, or are triggered by unconscious and immediate impulses. In the 1970s and 1980s, research in moral psychology was dominated by the work of Lawrence Kohlberg (1971), who advocated a strongly rationalist conception of moral judgment. According to Kohlberg, mature moral judgment demonstrates a concern with reasoning through highly abstract social rules. Kohlberg asked his participants (boys and young men) to express opinions about ambiguous moral dilemmas and then explain the reasons behind their conclusions. Kohlberg took it for granted that his participants were engaged in some form of reasoning; what he wanted to find out was the nature and quality of this reasoning, which he claimed emerged through a series of developmental stages. This approach came under increasing criticism in the 1980s, particularly through the work of feminist scholar Carol Gilligan (1982), who exposed the trouble caused by Kohlberg’s reliance on exclusively male research participants. See the article “Moral Development” for further discussion of that issue.

Since the turn of the twenty-first century, psychologists have placed much less emphasis on the role of reasoning in moral judgment. A turning point was the publication of Jonathan Haidt’s paper, “The Emotional Dog and Its Rational Tail” (Haidt 2001). There Haidt discusses a phenomenon he calls “moral dumbfounding.” He presented his participants with provocative stories, such as a brother and sister who engage in deliberate incest, or a man who eats his (accidentally) dead dog. Asked to explain their moral condemnation of these characters, participants cited reasons that seem to be ruled out by the description of the stories. For instance, participants said that the incestuous siblings might create a child with dangerous birth defects, though the original story made clear that they took extraordinary precautions to avoid conception. When reminded of these details, participants did not withdraw their moral judgments; instead they said things like, “I don’t know why it’s wrong, it just is.” According to Haidt, studies of moral dumbfounding confirm a pattern of evidence that people do not really reason their way to moral conclusions. Moral judgment, Haidt argues, arrives in spontaneous and unreflective flashes, and reasoning is only something done post hoc, to rationalize the already-made judgments.

Many scientists and philosophers have written about the evidence Haidt discusses. A few take extremist positions, absolutely upholding the old rationalist tradition or firmly insisting that reasoning plays no role at all. (Haidt himself has slightly softened his anti-rationalism in later publications.) But it is probably right to say that the dominant view in moral psychology is a hybrid model. Many of our everyday moral judgments do arise in sudden flashes, without any need for explicit reasoning. But when faced with new and difficult moral situations, and sometimes even when confronted with explicit arguments against our existing beliefs, we are able to reason our way toward new moral judgments.

The dispute between rationalists and their antagonists is primarily one about procedure: are moral judgments produced through conscious thought or unconscious psychological mechanisms? There is a related but distinct set of issues concerning the quality of moral judgment. However moral judgment works, explicitly or implicitly, is it rational? Is the procedure that produces our moral judgments a reliable procedure? (It is important to note that a mental procedure might be unconscious and still be rational. Many of our simple mathematical calculations are accomplished unconsciously, but that alone does not keep them from counting as rational.)

There is considerable evidence suggesting that our moral judgments are unreliable (and so arguably irrational). Some of this evidence comes from experiments imported from other domains of psychology, especially from behavioral economics. Perhaps most famous is the Asian disease case first employed by the psychologists Amos Tversky and Daniel Kahneman in the early 1980s. Here is the text they gave to some of their participants:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat this disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is a 1/3 probability that 600 will be saved, and 2/3 probability that no people will be saved. (Tversky and Kahneman 1981, 453)

Other participants read a modified form of this scenario, with the first option being that 400 people will die and the second option being that there is a 1/3 probability that nobody will die, and a 2/3 probability that 600 people will die. Notice that this is a change only in wording: the first option is the same either way, whether you describe it as 200 people being saved or 400 dying, and the second option gives the same probabilities of outcomes in either description. Notice also that, in terms of probability-expected outcome, the two programs are mathematically equivalent (for expected values, certain survival of 1/3 of people is equivalent to a 1/3 chance of all people surviving). Yet participants responded strongly to the way the choices are described; those who read the original phrasing preferred Program A three to one, while those who read the other phrasing showed almost precisely the opposite preference. Apparently, describing the choice in terms of saving makes people prefer the certain outcome (Program A) while describing it in terms of dying makes people prefer the chancy outcome (Program B).
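
The equivalence claimed here can be checked with a few lines of arithmetic. The sketch below converts both framings into expected numbers of survivors, using only the figures given in the scenario; the function name and the dictionary labels are illustrative choices.

```python
from fractions import Fraction

TOTAL_AT_RISK = 600  # people expected to die if nothing is done


def expected_survivors(outcomes):
    """Sum probability * survivors over (probability, survivors) pairs."""
    return sum(p * survivors for p, survivors in outcomes)


programs = {
    # "Saved" frame, as quoted above.
    "A (200 saved)": [(Fraction(1), 200)],
    "B (1/3 chance 600 saved)": [(Fraction(1, 3), 600), (Fraction(2, 3), 0)],
    # "Die" frame, converted to survivors as TOTAL_AT_RISK minus deaths.
    "A' (400 die)": [(Fraction(1), TOTAL_AT_RISK - 400)],
    "B' (1/3 chance nobody dies)": [
        (Fraction(1, 3), TOTAL_AT_RISK - 0),
        (Fraction(2, 3), TOTAL_AT_RISK - 600),
    ],
}

expected = {name: expected_survivors(o) for name, o in programs.items()}
for name, value in expected.items():
    print(f"{name}: expected survivors = {value}")
```

All four descriptions yield the same expected number of survivors, so the reversal in preferences cannot be explained by a difference in expected outcomes.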

This study is one of the most famous in the literature on framing effects, which shows that people’s judgments are affected by merely verbal differences in how a set of options is presented. Framing effects have been shown in many types of decision, especially in economics, but when the outcomes concern the deaths of large numbers of people it is clear that they are of moral significance. Many studies have shown that people’s moral judgments can be manipulated by mere changes in verbal description, without (apparently) any difference in features that matter to morality (see Sinnott-Armstrong 2008 for other examples).

Partly on this basis, some theorists have advocated a heuristics and biases approach to moral psychology. A cognitive bias is a systematic defect in how we think about a particular domain, where some feature influences our thinking in a way that it should not. Some framing effects apparently trigger cognitive biases; the saving/dying frame exposes a bias toward risk-aversion in save frames and a bias toward risk-taking in dying frames. (Note that this leaves open whether either is the correct response; the point is that the mere difference of frame shouldn’t affect our responses, so at least one of the divergent responses must be mistaken.)

In the psychology of (non-moral) reasoning, a heuristic is a sort of mental short-cut, a way of skipping lengthy or computationally demanding explicit reasoning. For instance, if you want to know which of several similar objects is the heaviest, you could assume that it is the largest. Heuristics are valuable because they save time and are usually correct—but in certain types of cases an otherwise valuable heuristic will make predictable errors (some small objects are actually denser and so heavier than large objects). Perhaps some of our simple moral rules (“do no harm”) are heuristics of this sort—right most of the time, but unreliable in certain cases (perhaps it is sometimes necessary to harm people in emergencies). Some theorists (for example, Sunstein 2005) argue that large sectors of our ordinary moral judgments can be shown to exhibit heuristics and biases. If this is right, most moral judgment is systematically unreliable.
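
The size heuristic from this paragraph can be sketched in code to show both why it usually works and where it predictably fails. The objects, volumes, and densities below are hypothetical values chosen for illustration.

```python
def size_heuristic(objects):
    """Guess the heaviest object by picking the largest one (the shortcut)."""
    return max(objects, key=lambda o: o["volume_cm3"])


def heaviest(objects):
    """Find the heaviest object by computing mass = volume * density."""
    return max(objects, key=lambda o: o["volume_cm3"] * o["density_g_cm3"])


# Similar objects (same material): the shortcut gives the right answer.
same_material = [
    {"name": "small wooden block", "volume_cm3": 100, "density_g_cm3": 0.6},
    {"name": "large wooden block", "volume_cm3": 500, "density_g_cm3": 0.6},
]

# Mixed materials: a small dense object outweighs a large light one.
mixed = [
    {"name": "foam block", "volume_cm3": 1000, "density_g_cm3": 0.03},
    {"name": "lead weight", "volume_cm3": 100, "density_g_cm3": 11.3},
]

for label, objects in [("same material", same_material), ("mixed", mixed)]:
    guess = size_heuristic(objects)["name"]
    truth = heaviest(objects)["name"]
    print(f"{label}: heuristic says {guess!r}, actual heaviest is {truth!r}")
```

The shortcut skips the multiplication entirely, which is what makes it cheap, and it errs only in the identifiable class of cases where density varies. The suggestion in the text is that simple moral rules may behave the same way.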

A closely related type of research shows that moral judgment is affected not only by irrelevant features within the questions (such as phrasing) but also by completely accidental features of the environment in which we make judgments. To take a very simple example: if you sit down to record your judgments about morally dubious activities, the severity of your reaction will depend partly on the cleanliness of the table (Schnall et al. 2008). You are likely to give a harsher assessment of the bad behavior if the table around you is sticky and covered in pizza boxes than if it is nice and clean. Similarly, you are likely to render more negative moral judgments if you have been drinking bitter liquids (Eskine, Kacinik, and Prinz 2011), or handling dirty money (Yang et al. 2013). Watching a funny movie will make you temporarily less condemning of violence (Strohminger, Lewis, and Meyer 2011).

Some factors affecting moral judgment are internal to the judge rather than the environment. If you’ve been given a post-hypnotic suggestion to feel queasy whenever you hear a certain completely innocuous word, you will probably make more negative moral judgments about characters in stories containing that trigger word (Wheatley and Haidt 2005). Such effects are not restricted to laboratories; one Israeli study showed that parole boards are more likely to be lenient immediately after meals than when hungry (Danziger, Levav, and Avnaim-Pesso 2011).

Taken together, these studies appear to show that moral judgment is affected by factors that are morally irrelevant. The degree of wrongness of an act does not depend on which words are used to describe it, or the cleanliness of the desk at which it is considered, or whether the thinker has eaten recently. There seems to be strong evidence, then, that moral judgments are at least somewhat unreliable. Moral judgment is not always of high quality. The extent of this problem, and what difference it might make to philosophical morality, is a matter that is discussed later in the article.

3. Interaction between Moral Judgment and Other Cognitive Domains

Setting aside for now the reliability of moral judgment, there are other questions we can ask about its production, especially about its causal structure. Psychologically speaking, what factors go into producing a particular moral judgment about a particular case? Are these the same factors that moral philosophers invoke when formally analyzing ethical decisions? Empirical research appears to suggest otherwise.

One example of this phenomenon concerns intention. Most philosophers have assumed that in order for something done by a person to be evaluable as morally right or wrong, the person must have acted intentionally (at least in ordinary cases, setting aside negligence). If you trip me on purpose, that is wrong, but if you trip me by accident, that is merely unfortunate. Following this intuitive point, we might think that assessment of intentionality is causally prior to assessment of morality. That is, when I go to evaluate a potentially morally important situation, I first work out whether the person involved acted intentionally, and then use this judgment as an input to working out whether what they have done is morally wrong. Hence, in this simple model, the causal structure of moral judgment places it downstream from intentionality judgment.

But empirical evidence suggests that this simple model is mistaken. A very well-known set of studies concerning the side-effect effect (also known as the Knobe effect, for its discoverer Joshua Knobe) appears to show that the causal relationship between moral judgment and intention-assessment is much more complicated. Participants were asked to read short stories like the following:

The vice-president of a company went to the chairman of the board and said, “We are thinking of starting a new program. It will help us increase profits, but it will also harm the environment.” The chairman of the board answered, “I don’t care at all about harming the environment. I just want to make as much profit as I can. Let’s start the new program.” They started the new program. Sure enough, the environment was harmed. (Knobe 2003, 191)

Other participants read the same story, except that the program would instead help, rather than harm, the environment as a side effect. Both groups of participants were asked whether the executive intentionally brought about the side effect. Strikingly, when the side effect was a morally bad one (harming the environment), 82% of participants thought it was brought about intentionally, but when the side effect was morally good (helping the environment) 77% of participants thought it was not brought about intentionally.

There is a large literature offering many different accounts of the side-effect effect (which has been repeatedly experimentally replicated). One plausible account is this: people sometimes make moral judgments prior to assessing intentionality. Rather than intentionality-assessment always being an input to moral judgment, sometimes moral judgment feeds input to assessing intentionality. A side effect judged wrong is more likely to be judged intentional than one judged morally right. If this is right, then the simple model of the causal structure of moral judgment cannot be quite correct—the causal relation between intentionality-assessment and moral judgment is not unidirectional.

Other studies have shown similar complexity in how moral judgment relates to other philosophical concepts. Philosophers have often thought that questions about causal determinism and free will are conceptually prior to attributing moral responsibility. That is, whether or not we can hold people morally responsible depends in some way on what we say about freedom of the will. Some philosophers hold that determinism is compatible with moral responsibility and others deny this, but both groups start from thinking about the metaphysics of agency and move toward morality. Yet experimental research suggests that the causal structure of moral judgment works somewhat differently (Nichols and Knobe 2007). People asked to judge whether an agent can be morally responsible in a deterministic universe seem to base their decision in part on how strongly they morally evaluate the agent’s actions. Scenarios involving especially vivid and egregious moral violations tend to produce greater endorsement of compatibilism than more abstract versions. The interpretation of these studies is highly controversial, but at minimum they seem to cast doubt on a simple causal model placing judgments about free will prior to moral judgment.

A similar pattern holds for judgments about the true self. Some moral philosophers hold that morally responsible action is action resulting from desires or commitments that are a part of one’s true or deep self, rather than momentary impulses or external influences. If this view reflects how moral judgments are made, then we should expect people to first assess whether a given action results from an agent’s true self and then render moral judgment about those that do. But it turns out that moral judgment provides input to true self assessments. Participants in one experiment (Newman, Bloom, and Knobe 2014) were asked to consider a preacher who explicitly denounced homosexuality while secretly engaging in gay relationships. Asked to determine which of these behaviors reflected the preacher’s true self, participants first employed their own moral judgment; those disposed to see homosexuality as morally wrong thought the preacher’s words demonstrated his true self, while those disposed to accept homosexuality thought the preacher’s relationships came from his true self. The implication is that moral judgment is sometimes a causal antecedent of other types of judgments, including those that philosophers have thought conceptually prior to assessing morality.

4. The Neuroanatomy of Moral Judgment

Physiological approaches to the study of moral judgment have taken on an increasingly important role. Employing neuroscientific and psychopharmacological research techniques, these studies help to illuminate the functional organization of moral judgment by revealing the brain areas implicated in its exercise.

An especially central concern in this literature is the role of emotion in moral judgment. An early influential study by Jorge Moll and colleagues (2005) demonstrated selective activity for moral judgment in a network of brain areas generally understood to be central to emotional processing. This work employed functional magnetic resonance imaging (fMRI), the workhorse of modern neuroscience, in which a powerful magnet is used to produce a visual representation of relative levels of cellular energy used in various brain areas. Employing fMRI allows researchers to get a near-real-time representation of the brain’s activities while the conscious subject makes judgments about morally important scenarios.

One extremely influential fMRI study of moral judgment was conducted by philosopher-neuroscientist Joshua D. Greene and colleagues. They compared brain activity in people making deontological moral judgments with brain activity in people making utilitarian moral judgments. (To oversimplify: a utilitarian moral judgment is one primarily attentive to the consequences of a decision, even allowing the deliberate sacrifice of an innocent to save a larger number of others. Deontological moral judgment is harder to define, but for our purposes means moral judgment that responds to factors other than consequences, such as the individual rights of someone who might be sacrificed to save a greater number. See “Ethics.”)

In a series of empirical studies and philosophical papers, Greene has argued that his results show that utilitarian moral judgment correlates with activity in cognitive or rational brain areas, while deontological moral judgment correlates with activity in emotional areas (Greene 2008). (He has since softened this view a bit, conceding that both types of moral judgment allow some form of emotional processing. He now holds that deontological emotions are a type that trigger automatic behavioral responses, whereas utilitarian emotions are flexible prompts to deliberation (Greene 2014).) According to Greene, learning these psychological facts gives us reason to distrust our deontological judgments; in effect, his is a neuroscience-fueled argument on behalf of utilitarianism. This argument is at the center of a still very spirited debate. Whatever its outcome, Greene’s research program has had an undeniable influence on moral psychology; his scenarios (which derive from philosophical thought experiments by Philippa Foot and Judith Jarvis Thomson) have been adopted as standard across much of the discipline, and the growth of moral neuroimaging owes much to his project.

Alongside neuroimaging, lesion study is one of the central techniques of neuroscience. Recruiting research participants who have pre-existing localized brain damage (often due to physical trauma or stroke) allows scientists to infer the function of a brain area from the behavioral consequences of its damage. For example, participants with damage to the ventromedial prefrontal cortex, who have persistent difficulties with social and emotional processing, were tested on dilemmas similar to those used by Greene (Koenigs et al. 2007). These patients showed a greater tendency toward utilitarian judgments than did healthy controls. Similar lesion studies have since found a range of different results, so the empirical debate remains unsettled, but the technique continues to be important.

Two newer neuroscientific research techniques have begun to play important roles in moral psychology. Transcranial magnetic stimulation (TMS) uses blasts of electromagnetism to suppress or heighten activity in a brain region. In effect, this allows researchers to (temporarily) alter healthy brains and correlate this alteration with behavioral effects. For instance, one study (Young et al. 2010) used TMS to suppress activity in the right temporoparietal junction, an area associated with assessing the mental states of others (see “Theory of Mind”). After the TMS treatment, participants’ moral judgments showed less sensitivity to whether characters in a dilemma acted intentionally or accidentally. Another technique, transcranial direct current stimulation (tDCS), has been shown to increase compliance with social norms when applied to the right lateral prefrontal cortex (Ruff, Ugazio, and Fehr 2013).

Finally, it is possible to study the brain not only at the gross structural scale, but also by examining its chemical operations. Psychopharmacology is the study of the cognitive and behavioral effects of chemical alteration of the brain. In particular, the levels of neurotransmitters, which regulate brain activity in a number of ways, can be manipulated by introducing pharmaceuticals. For example, participants’ readiness to make utilitarian moral judgments can be altered by administration of the drugs propranolol (Terbeck et al. 2013) and citalopram (Crockett et al. 2010).

5. What Do Moral Philosophers Think of Cognitive Science?

So far this article has described the existing science of moral judgment: what we have learned empirically about the causal processes by which moral judgments are produced. The rest of the article discusses the philosophical application of this science. Moral philosophers try to answer substantive ethical questions: what are the most valuable goals we could pursue? How should we resolve conflicts among these goals? Are there ways we should not act even if doing so would promote the best outcome? What is the shape of a good human life and how could we acquire it? How is a just society organized? And so on.

What cognitive science provides is causal information about how we typically respond to these kinds of questions. But philosophers disagree about what to make of this information. Should we ever change our answers to ethical questions on the basis of what we learn about their causal origins? Is it ever reasonable (or even rationally mandatory) to abandon a confident moral belief because of a newly learned fact about how one came to believe it?

We now consider several prominent responses to these questions. We start with views that deny much or any role for cognitive science in moral philosophy. We then look at views that assign to cognitive science a primarily negative role, in disqualifying or diminishing the plausibility of certain moral beliefs. We conclude by examining views that see the findings of cognitive science as playing a positive role in shaping moral theory.

6. Moral Cognition and the Is/Ought Distinction

We must start with the famous is/ought distinction. Often attributed to Hume (see “Hume”), the distinction is a commonsensical point about the difference between descriptive claims that characterize how things are (for example, “the puppy was sleeping”) and prescriptive claims that assert how things should or should not be (for instance, “it was very wrong of you to kick that sleeping puppy”). These are widely taken to be two different types of claims, and there is a lot to be said about the relationship between them. For our purposes, we may gloss it as follows: people often make mistakes when they act as if an ought-claim follows immediately and without further argument from an is-claim. The point is not (necessarily) that it is always a mistake to draw an ought-statement as the conclusion of an is-statement, just that the relationship between them is messy and it is easy to get confused. Some philosophers do assert the much stronger claim that ought-statements can never be validly drawn from is-statements, but this is not what everyone means when the issue is raised.

For our purposes, we are interested in how the is/ought distinction might matter to applying cognitive science to moral philosophy. Cognitive scientific findings are is-type claims; they describe facts about how our minds actually do work, not about how they ought to work. Yet the kind of cognitive scientific claims of interest here are claims about moral cognition: is-claims about the origin of ought-claims. Not surprisingly then, if the is/ought distinction tends to mark moments of high risk for confusion, we should expect this to be one of those moments. Some philosophers have argued that attempts to change moral beliefs on the basis of cognitive scientific findings are indeed confusions of this sort.

a. Semantic Is/Ought

In the mid-twentieth century it was popular to understand the is/ought distinction as a point about moral semantics. That is, the distinction pointed to a difference in the implicit logic of two uses of language. Descriptive statements (is-statements) are, logically speaking, used to attribute properties to objects; “the puppy is sleeping” just means that the sleeping-property attaches to the puppy-object. But prescriptive statements (ought-statements) do not have this logical structure. Their surface descriptive grammar disguises an imperative or expressive logic. So “it was very wrong of you to kick that sleeping puppy” is not really attributing the property of wrongness to your action of puppy-kicking. Rather, the statement means something like “don’t kick sleeping puppies!” or even “kicking sleeping puppies? Boo!” Or perhaps “I do not like it when sleeping puppies are kicked and I want you to not like it as well.” (See “Ethical Expressivism.”)

If this analysis of the semantics of moral statements is right, then we can easily see why it is a mistake to draw ought-conclusions from is-premises. Logically speaking, simple imperatives and expressives do not follow from simple declaratives. If you agree with “the puppy was sleeping” yet refuse to accept the imperative “don’t kick sleeping puppies!” you haven’t made a logical mistake. You haven’t made a logical mistake even if you also agree with “kicking sleeping puppies causes them to suffer.” The point here isn’t about the moral substance of animal cruelty—the point is about the logical relationship between two types of language use. There is no purely logical compulsion to accept any simple imperative or expressive on the basis of any descriptive claim, because the types of language do not participate in the right sort of logical relationship.

Interestingly, this sort of argument has not played much of a role in the debate about moral cognitive science, though seemingly it could. After all, cognitive scientific claims are, logically speaking, descriptive claims, so we could challenge their logical relevance to assessing any imperative or expressive claims. But, historically speaking, the rise of moral cognitive science came mostly after the height of this sort of semantic argument. Understanding the is/ought distinction in semantic terms like these had begun to fade from philosophical prominence by the 1970s, and especially by the 1980s when modern moral cognitive science (arguably) began. Contemporary moral expressivists are often eager to explain how we can legitimately draw inferences from is to ought despite their underlying semantic theory.

It is possible to see the simultaneous decline of simple expressivism and the rise of moral cognitive science as more than mere coincidence. Some historians of philosophy see the discipline as having pivoted from the linguistic turn of the late 19th and early 20th century to the cognitive turn (or conceptual turn) of the late 20th century. Philosophy of language, while still very important, has receded from its position at the center of every philosophical inquiry. In its place, at least in some areas of philosophy, is a growing interest in naturalism and consilience with scientific inquiry. Rather than approaching philosophy via the words we use, theorists often now approach it through the concepts in our minds, concepts which are in principle amenable to scientific study. As philosophers became more interested in the empirical study of their topics, they were more likely to encourage and collaborate with psychologists. This has certainly contributed to the growth of moral cognitive science since the 1990s.

b. Non-semantic Is/Ought

Setting aside semantic issues, we still have a puzzle about the is/ought distinction. How do we get to an ought conclusion from an is premise? The idea that prescriptive and descriptive claims are different types of claim retains its intuitive plausibility. Some philosophers have argued that scientific findings cannot lead us to (rationally) change our moral beliefs because science issues only descriptive claims. It is something like a category mistake to allow our beliefs in a prescriptive domain to depend crucially upon claims within a descriptive domain. More precisely, it is a mistake to revise your prescriptive moral beliefs because of some purely descriptive fact, even if it is a fact about those beliefs.

Of course, the idea here cannot be that it is always wrong to update moral beliefs on the basis of new scientific information. Imagine that you are a demolition crew chief and you are just about to press the trigger to implode an old factory. Suddenly one of your crew members shouts, “Wait, look at the thermal monitor! There’s a heat signature inside the factory—that’s probably a person! You shouldn’t press the trigger!” It would be extremely unfitting for you to reply that whether or not you should press the trigger cannot depend on the findings of scientific contraptions like thermal monitors.

What this example shows is that the is/ought problem, if it is a problem, is obviously not about excluding all empirical information from moral deliberation. But it is important to note that the scientific instrument itself does not tell us what we ought to do. We cannot just read off moral conclusions from descriptive scientific facts. We need some sort of bridging premise, something that connects the purely descriptive claim to a prescriptive claim. In the demolition crew case, it is easy to see what this bridging premise might be, something like: “if pressing a trigger will cause the violent death of an innocent person, then you should not press the trigger.” In ordinary moral interactions we often leave bridging premises implicit—it is obvious to everyone in this scenario that the presence of an innocent human implies the wrongness of going ahead with implosion.

But there is a risk in leaving our bridging premises implicit. Sometimes people seem to be relying upon implicit bridging premises that are not mutually agreed on, or that may not make any sense at all. Consider: “You shouldn’t give that man any spare change. He is a Templar.” Here you might guess that the speaker thinks Templars are bad people and do not deserve help, but you might not be sure—why would your interlocutor even care about Templars? And you are unlikely to agree with this premise anyway, so unless the person makes their anti-Templar views explicit and justifies them, you do not have much reason to follow their moral advice. Another example: “The tiles in my kitchen are purple, so it’s fine for you to let those babies drown.” It is actually hard to interpret this utterance as something other than a joke or metaphor. If someone tried to press it seriously, we would certainly demand to be informed of the bridging premise between linoleum color and nautical infanticide, and we would be skeptical that anything plausible might be provided.

So far, so simple. Now consider: “Brain area B is extremely active whenever you judge it morally wrong to cheat on your taxes. So it is morally wrong to cheat on your taxes.” What should we make of this claim? The apparent implicit bridging premise is: If brain area B gets excited when you judge it wrong to X, then it is wrong to X. But this is a very strange bridging premise. It does not make reference to any internal features of tax-cheating that might explain its wrongness. In fact, the premise appears to suggest that an act can be wrong simply because someone thinks it is wrong and some physical activity is going on inside that person’s body. This is not the sort of thing we normally offer to get to moral conclusions, and it is not immediately clear why anyone would find it convincing. Perhaps, as we said, this is an example of how easy it is to get confused about is and ought claims.

Two points come out here. First, some attempts at drawing moral conclusions from cognitive science involve implicit bridging premises that fall apart when anyone attempts to make them explicit. This is often true of popular science journalism, in which some new psychological finding is claimed to prove that such-and-such moral belief is mistaken. At times, philosophers have accused their psychologist or philosopher opponents of doing something similar. According to Berker (2009), Joshua Greene’s neuroscience-based debunking of deontology (discussed above) lacks a convincing bridging premise. Berker suggests that Greene avoids being explicit about this premise, because when it is made explicit it is either a traditional moral argument which does not use cognitive science to motivate its conclusion or it employs cognitive scientific claims but does not lead to any plausible moral conclusion. Hence, Berker says, the neuroscience is normatively insignificant; it does not play a role in any plausible bridging premise to a moral conclusion.

Of course, even if this is right, then it implies only that Greene’s argument fails. But Berker and other philosophers have expressed doubt that any cognitive science-based argument could generate a moral conclusion. They suggest that exciting newspaper headlines and book subtitles (for example, How Science Can Determine Human Values (Harris 2010)) trade on our leaving their bridging premises implicit and unchallenged. For—this is the second point—if there were a successful bridging principle, it would be a very unusual one. Why, we want to know, could the fact that such-and-such causal process precedes or accompanies a moral judgment give us reason to change our view about that moral judgment? Coincident causal processes do not appear to be morally relevant. What your brain is doing while you are making moral judgments seems to be in the same category as the color of your kitchen tiles—why should it matter?

Of course, the fact that causal processes are not typically used in bridging premises does not show us that they could not. But it does perhaps give us reason to be suspicious, and to insist that the burden of proof is on anyone who wishes to present such an argument. They must explain to us why their use of cognitive science is normatively significant. A later section considers different attempts to meet this burden of proof. First, though, consider an argument that it can never be met, even in principle.

7. The Autonomy of Morality

Some philosophers hold that it is a mistake to try to draw moral conclusions from psychological findings because doing so misunderstands the nature of moral deliberation. According to these philosophers, moral deliberation is essentially first-personal, while cognitive science can give us only third-personal forms of information about ourselves. When you are trying to decide what to do, morally speaking, you are looking for reasons that relate your options to the values you uphold. I have moral reason not to kick puppies because I recognize value in puppies (or at least the non-suffering of puppies). Psychological claims about your brain or psychological apparatus might be interesting to someone else observing you, but they are beside the point of first-personal moral deliberation about how to act.

Here is how Ronald Dworkin makes this point. He asks us to imagine learning some new psychological fact about why we have certain beliefs concerning justice. Suppose that the evidence suggests our justice beliefs are caused by self-interested psychological processes. Then:

It will be said that it is unreasonable for you still to think that justice requires anything, one way or the other. But why is that unreasonable? Your opinion is one about justice, not about your own psychological processes . . . You lack a normative connection between the bleak psychology and any conclusion about justice, or any other conclusion about how you should vote or act. (Dworkin 1996, 124–125)

The idea here is not just that beliefs about morality and beliefs about psychology are about different topics. Part of the point is that morality is special—it is not just another subject of study alongside psychology and sociology and economics. Some philosophers put this point in terms of an a priori / a posteriori distinction: morality is something that we can work out entirely in our minds, without needing to go and do experiments (though of course we might need empirical information to apply moral principles once we have figured them out). Notice that when philosophers debate moral principles with one another, they do not typically conduct or make reference to empirical studies. They think hard about putative principles, come up with test cases that generate intuitive verdicts, and then think hard again about how to modify principles to make them fit to these verdicts. The process does not require empirical input, so there is no basis for empirical psychology to become involved.

This view is sometimes called the autonomy of morality (Fried 1978; Nagel 1978). It holds that, in the end, the arbiter of our moral judgments will be our moral judgments—not anything else. The only way you can get a moral conclusion from a psychological finding is to provide the normative connection that Dworkin refers to. So, for instance: if you believe that it is morally wrong to vote in a way triggered by a selfish psychological process, then finding out that your intending to vote for the Egalitarian Party is somehow selfishly motivated could give you a reason to reconsider your vote. But notice that this still crucially depends upon a moral judgment—the judgment that it is wrong to vote in a way triggered by selfish psychological processes. There is no way to get entirely out of the business of making moral judgments; psychological claims are morally inert unless accompanied by explicitly moral claims.

The point here can be made in weaker and stronger forms. The weaker form is simply this: empirical psychology cannot generate moral conclusions entirely on its own. This weaker form is accepted by most philosophers; few deny that we need at least some moral premises in order to get moral conclusions. Notice though that even the weaker form casts doubt on the idea that psychology might serve as an Archimedean arbiter of moral disagreement. Once we’ve conceded that we must rely upon moral premises to get results from psychological premises, we cannot claim that psychology is a value-neutral platform from which to settle matters of moral controversy.

The stronger form of the autonomy of morality claim holds that a psychological approach to morality fundamentally misunderstands the topic. Moral judgment, in this view, is about taking agential responsibility for our own values and actions. Re-describing these in psychological terms, so that our commitments are just various causal levers, represents an abdication of this responsibility. We should instead maintain a focus on thinking through the moral reasons for and against our positions, leaving psychology to the psychologists.

Few philosophers completely accept the strongest form of the autonomy of morality. That is, most philosophers agree that there are at least some ways psychological genealogy could change the moral reasons we take ourselves to have. But it will be helpful for us to keep this strong form in mind as a sort of null hypothesis. As we now turn to various theoretical arguments in support of a role for cognitive science in moral theory, we can understand each as a way of addressing the strong autonomy of morality challenge. They are arguments demonstrating that psychological investigation is important to our understanding of our moral commitments.

8. Moral Cognition and Moral Epistemology

Many moral philosophers think about their moral judgments (or intuitions) as pieces of evidence in constructing moral theories. Rival moral principles, such as those constituting deontology and consequentialism, are tested by seeing if they get it right on certain important cases.

Take the well-known example of the Trolley Problem. An out-of-control trolley is rumbling toward several innocents trapped on a track. You can divert the trolley, but only by sending it onto a side track where it will kill one person. Most people think it is morally permissible to divert the trolley in this case. Now imagine that the trolley cannot be diverted—it can be stopped only by physically pushing a large person into its path, causing an early crash. Most people do not think this is morally permitted. If we treat these two intuitive reactions as moral evidence, then we can affirm that how an individual is killed makes a difference to the rightfulness of sacrificing a smaller number of lives to save a larger number. Apparently, it is permissible to indirectly sacrifice an innocent as a side-effect of diverting a threat, but not permissible to directly use an innocent as a means to stop the threat. This seems like preliminary evidence against a moral theory that says that outcomes are the only things that matter morally, since the outcomes are identical in these two cases.

This sort of reasoning is at the center of how most philosophers practice normative ethics. Moral principles are proposed to account for intuitive reactions to cases. The best principles are (all else equal) those that cohere with the largest number of important intuitions. A philosopher who wishes to challenge a principle will construct a clever counterexample: a case where it just seems obvious that it is wrong to do X, but the targeted principle allows us to do X in the case. Proponents of the principle must now (a) show that their principle has been misapplied and actually gives a different verdict about the case; (b) accept that the principle has gone wrong but alter it to give a better answer; (c) bite the bullet and insist that even if the principle seems to have gone wrong here, it is still trustworthy because it is right in so many other cases; or (d) explain away the problematic intuition, by showing that the test case is underdescribed or somehow unfair, or that the intuitive reaction itself likely results from confusion. If all of this sounds a bit like the testing of scientific hypotheses against experimental data, that is no accident. The philosopher John Rawls (1971) explicitly modeled this approach, which he called “reflective equilibrium” (see “Rawls”), on hypothesis testing in science.

There are various ways to understand what reflective equilibrium aims at doing. In a widely accepted interpretation, reflective equilibrium aims at discovering the substantive truths of ethics. In this understanding, those moral principles supported in reflective equilibrium are the ones most likely to be the moral truth. (How to interpret truth in the moral domain is left aside in this article, but see “Metaethics.”) Intuitions about test cases are evidence for moral truth in much the same way that scientific observations are evidence for empirical truth. In science, our confidence in a particular theory depends on whether it gets evidential support from repeated observations, and in moral philosophy (in this conception) our confidence in a particular ethical theory depends on whether it gets evidential confirmation from multiple intuitions.

This parallel allows us to see one way in which the psychology of moral judgment might be relevant to moral philosophy. When we are testing a scientific theory, our trust in any experimental observation depends on our confidence in the scientific instruments used to generate it. If we come to doubt the reliability of our instruments, then we should doubt the observations we get from them, and so should doubt the theories they appear to support. What, then, if we come to doubt the reliability of our instruments in moral philosophy? Of course, philosophers do not use microscopes or mass spectrometers. Our instruments are nothing more than our own minds or, more precisely, our mental abilities to understand situations and apply moral concepts to them. Could our moral minds be unreliable in the way that microscopes can be unreliable?

This is certainly not an idle worry, since we know that we make consistent mental mistakes in other domains. Think about persistent optical illusions or predictably bad mathematical reasoning. There is an enormous literature in cognitive science showing that we make consistent mistakes in domains other than morality. Against this background, it would be a surprise if our moral intuitions turned out not to be full of mistakes.

In earlier sections we discussed psychological evidence showing systematic deficits in moral reasoning. We saw, for instance, that people’s moral judgments are affected by the verbal framing in which test cases are presented (save versus die) and by the cleanliness of their immediate environment. If the readings of a scientific instrument appeared to be affected by environmental factors that had nothing to do with what the instrument was supposed to measure, then we would rightly doubt the trustworthiness of readings obtained from that instrument. In moral philosophy, a mere difference in verbal framing, or the dirtiness of the desk you are now sitting at, certainly does not seem like the kind of thing that matters to whether or not a particular act is permissible. So it seems that our moral intuitions, like defective lab equipment, sometimes cannot be trusted.

Note how this argument relates to earlier claims about the autonomy of morality or the is/ought distinction. Here no one is claiming that cognitive science tells us directly which moral judgments are right and which wrong. All cognitive science can do is show us that particular intuitions are affected by certain causal factors—it cannot tell us which causal factors count as distorting and which are acceptable. It is up to us, as moral judges, to determine that differences of verbal framing (save/die) or desk cleanliness do not lead to genuine moral differences. Of course, we do not have to think very hard to decide that these causal factors are morally irrelevant—but the point remains that we are still making moral judgments, even very easy ones, and not getting our verdict directly from cognitive science.

Many proponents of a role for cognitive science in morality are willing to concede this much: in the end, any revision of our moral judgments will be authorized only by some other moral judgments, not by the science itself. But, they will now point out, there are some revisions of our moral judgments that we ought to make and that we are able to make only because of input from cognitive science. We all agree that our moral judgments should not be affected by the cleanliness of our environment, and the science is unnecessary for our agreeing on this. But we would not know that our moral judgments are affected by environmental cleanliness without cognitive science. So, in this sense at least, improving the quality of our moral judgments does seem to require the use of cognitive science. Put more positively: paying attention to the cognitive science of morality can allow us to realize that some seemingly reliable intuitions are not in fact reliable. Once these are set aside like broken microscopes, we can have greater confidence that the theories we build from the remainder will capture the moral truth. (See, for example, Sinnott-Armstrong 2008; Mason 2011.)

This argument is one way of showing the relevance of cognitive science to morality. Note that it is an epistemic debunking argument. The argument undermines certain moral intuitions as default sources of evidential justification. Cognitive science plays a debunking role by exposing the unreliable causal origins of our intuitions. Philosophers who employ debunking arguments like this do so with a range of aims. Some psychological debunking arguments are narrowly targeted, trying to show that a few particular moral intuitions are unjustified. Often this is meant to be a move within normative theory, weakening a set of principles the philosopher rejects. For instance, if intuitions supporting deontological moral theory are undermined, then we may end up dismissing deontology. Other times, philosophers intend to aim psychological debunking arguments much more widely. If it can be shown that all moral intuitions are undermined in this way, then we have grounds for skepticism about moral judgment. [See “Moral Epistemology.”] The plausibility of all these arguments remains hotly debated. But if any of them should turn out to be convincing, then we have a clear demonstration of how cognitive science can matter to moral theory.

9. Non-epistemic Approaches

Epistemic debunking arguments presuppose that morality is best understood as an epistemic domain. That is, these arguments assume that there is (or could be) a set of moral truths, and that the aim of moral judgment is to track these moral truths. What if we do not accept this conception of the moral domain? What if we do not expect there to be moral truths, or do not think that moral intuition aims at tracking any such truth? In that case, should we care about the cognitive science of morality?

a. Consistency in Moral Reasoning

Obviously the answer to this question will depend on what we think the moral domain is, if it is not an epistemic domain. One common view, often associated with contemporary Kantians, is that morality concerns consistency in practical reasoning. Though we do not aim to uncover independent moral truths, we can still say that some moral beliefs are better than others, because some moral beliefs are better at cohering with the way in which we conceive of ourselves, or what we think we have most reason to do. In this understanding, a bad moral belief is not bad because it is mistaken about the moral facts (there are no moral facts) but is bad because it does not fit well with our other normative beliefs. The assumption here is that we want to be coherent, in that being a rational agent means aiming at consistency in the beliefs that ground one’s actions. [See “Moral Epistemology.”]

In this conception of the moral domain, cognitive science is useful because it can show us when we have unwittingly stumbled into inconsistency in our moral beliefs. A very simple example comes from the universalizability condition on moral judgment. It is incoherent to render different moral judgments about two people doing the exact same thing in the exact same circumstances. This is because morality (unlike taste, for instance) aims at universal prescription. If it is wrong for you to steal from me, then it is wrong for me to steal from you. (Assuming you and I are similarly situated—if I am an unjustly rich oligarch and you are a starving orphan, maybe things are different. But then this moral difference is due to different features of our respective situations. The point of universal prescription is to deny that there can be moral differences when situations are similar.) A person who explicitly denied that moral requirements applied equally to herself as to other people would not seem to really have gotten the point of morality. We would not take such a person very seriously when she complained of others violating her moral rights, if she claimed to be able to ignore those same rights in others.

We all know that we fail at universalizing our moral judgments sometimes; we all suffer moments of weakness where we try to make excuses for our own moral failings, excuses we would not permit to others. But some psychological research suggests that we may fail in this way far more often than we realize. Nadelhoffer and Feltz (2008) found that people make different judgments about moral dilemmas when they imagine themselves in the dilemma than when imagining others in the same dilemma. Presumably most people would not explicitly agree that there is such a moral difference, but they can be led into endorsing differing standards depending on whether they are presented with a me versus someone else framing of the case. This is an unconscious failure of universalization, but it is still an inconsistency. If we aim at being consistent in our practical reasoning, we should want to be alerted to unconscious inconsistencies, so that we might get a start on correcting them. And in cases like this one, we do not have introspective access to the fact that we are inconsistent, but we can learn it from cognitive science.

Note how this argument parallels the epistemic one. The claim is, again, not that cognitive science tells us what counts as a good moral judgment. Rather cognitive science reveals to us features of the moral judgments we make, and we must then use moral reasoning to decide whether these features are problematic. Here the claim is that inconsistency in moral judgments is bad, because it undermines our aim to be coherent rational agents. We do not get that claim from cognitive science, but there are some cases where we could not apply it without the self-knowledge we gain from cognitive science. Hence the relevance of cognitive science to morality as aimed at consistency.

b. Rational Agency

There is another way in which cognitive science can matter to the coherent rational agency conception of morality. Some findings in cognitive science may threaten the intelligibility of this conception altogether. Recall, from section 2, the psychologist Jonathan Haidt’s work on moral dumbfounding; people appear to spontaneously invent justifications for their intuitive moral verdicts, and stick with these verdicts even after the justifications are shown to fail. If pressed hard enough, people will admit they simply do not know why they came to the verdicts, but hold to them nevertheless. In Haidt’s interpretation, these findings show that moral judgment happens almost entirely unconsciously, with conscious moral reasoning mostly a post hoc epiphenomenon.

If Haidt is right, point out Jeanette Kennett and Cordelia Fine (2009), then this poses a serious problem for the ideal of moral agency. For us to count as moral agents, there needs to be the right sort of connection between our conscious reasoning and our responses to the world. A robot or a simple animal can react, but a rational agent is one that can critically reflect upon her reasons for action and come to a deliberative conclusion about what she ought to do. Yet if we are morally dumbfounded in the way Haidt suggests, then our conscious moral reasoning may lack the appropriate connection to our moral reactions. We think that we know why we judge and act as we do, but actually the reasons we consciously endorse are mere post hoc confabulations.

In the end, Kennett and Fine argue that Haidt’s findings do not actually lead to this unwelcome conclusion. They suggest that he has misinterpreted what the experiments show, and that there is a more plausible interpretation that preserves the possibility of conscious moral agency. Note that responding in this way concedes that cognitive science might be relevant to assessing the status of our moral judgments. The dispute here is only over what the experiments show, not over what the implications would be if they showed a particular thing. This leaves the door open for further empirical research on conscious moral agency.

One possible approach is the selective application of Haidt’s argument. If it could be shown that certain moral judgments—those about a particular topic or sub-domain of morality—are especially prone to moral dumbfounding, then we might have the basis for disqualifying them from inclusion in reflective moral theory. This seems, at times, to be the approach adopted by Joshua Greene (see section 4) in his psychological attack on deontology. According to Greene (2014), deontological intuitions are of a psychological type distinctively disconnected from conscious reflection and should accordingly be distrusted. Many philosophers dispute Greene’s claims (see, for instance, Kahane 2012), but this debate itself shows the richness of engagement between ethics and cognitive science.

c. Intersubjective Justification

There is one further way in which cognitive science may have relevance to moral theory. In this last conception, morality is essentially concerned with intersubjective justification. Rather than trying to discover independent moral truths, my moral judgments aim at determining when and how my preferences can be seen as reasonable by other people. A defective moral judgment, in this conception, is one that reflects only my own personal idiosyncrasies and so will not be acceptable to others. For instance, perhaps I have an intuitive negative reaction to people who dress their dogs in sweaters even when it is not cold. If I come to appreciate that my revulsion at this practice is not widely shared, and that other people cannot see any justification for it, then I may conclude that it is not properly a moral judgment at all. It may be a matter of personal taste, but it cannot be a moral judgment if it has no chance of being intersubjectively justified.

Sometimes we can discover introspectively that our putative moral judgments are actually not intersubjectively justifiable, just by thinking carefully about what justifications we can or cannot offer. But there may be other instances in which we cannot discover this introspectively, and where cognitive science may help. This is especially so when we have unknowingly confabulated plausible-sounding justifications in order to make our preferences appear more compelling than they are (Rini 2013). For example, suppose that I have come to believe that a particular charity is the most deserving of our donations, and I am now trying to convince you to agree. You point out that other charities seem to be at least as effective, but I insist. By coincidence, the next day I participate in a psychological study of color association. The psychologists’ instruments notice that I react very positively to the color forest green—and then I remember that my favorite charity’s logo is a deep forest green. If this turns out to be the explanation for why I argued for the charity, then I should doubt that I have provided an intersubjective justification. Maybe my fondness for the charity’s logo is an okay reason for me to make a personal choice (if I am otherwise indifferent), but it certainly is not a reason for you to agree. Now that I am aware of this psychological influence, I should consider the possibility that I have merely confabulated the reasons I offered to you.

10. Objections and Alternatives

The preceding sections have focused on negative implications of the cognitive science of moral judgment. We have seen ways in which learning about a particular moral judgment’s psychological origins might lead us to disqualify it, or at least reduce our confidence in it. This final section briefly considers some objections to drawing such negative implications, and also discusses more positive proposals for the relationship between cognitive science and moral philosophy.

a. Explanation and Justification

One objection to disqualifying a moral judgment on cognitive scientific grounds is that this involves confusion between reasons of explanation and justification. The explanatory reason for the fact that I judge X to be immoral could be any number of psychological factors. But my justifying reason for the judgment is unlikely to be identical with the explanatory reason. Consider my judgment that it is wrong to tease dogs with treats that will not be provided. Perhaps the explanatory reason for my believing this is that my childhood dog bit me when I refused to share a sandwich. But this is not how I justify my judgment—the justifying reason I have is that dogs suffer when led to form unfulfilled expectations, and the suffering of animals is a moral bad. As long as this is a good justifying reason, then the explanatory reason does not really matter. So, runs the objection, those who disqualify moral judgments on cognitive scientific grounds are looking at the wrong thing—they should be asking about whether the judgment is justified, not why (psychologically speaking) it was made (Kamm 1998; van Roojen 1999).

One problem with this objection is that it assumes we have a basis for affirming the justifying reasons for a judgment, and that this basis is unaffected by cognitive scientific investigation. Obviously if we had oracular certainty that judgment X is correct, then we should not worry about how we came to make the judgment. But in moral theory we rarely (if ever) have such certainty. As discussed earlier (see section 8), our justification for trusting a particular judgment is often dependent upon how well it coheres with other judgments and general principles. So if a cognitive scientific finding showed that some dubious psychological process is responsible for many of our moral judgments, their ability to justify one another may be in question. To see the point, consider the maximal case: suppose you learned that all of your moral judgments were affected by a chemical placed in the water supply by the government. Would this knowledge not give you some reason to second-guess your moral judgments? If that is right, then it seems that our justifying reasons for holding to a judgment can be sensitive to at least some discoveries about the explanatory reasons for it. (For related arguments, see Street 2006 and Joyce 2006.)

b. The Expertise Defense

Another objection claims to protect the judgments used in moral theory-making even while allowing the in-principle relevance of cognitive scientific findings. The claim is this: cognitive science uses research subjects who are not experts in making moral judgments. But moral philosophers have years of training at drawing careful distinctions, and also typically have much more time than research subjects to think carefully about their judgments. So even if ordinary participants in cognitive science studies make mistakes due to psychological quirks, we should not assume that the judgments of experts will resemble those of non-experts. We do not doubt the competence of expert mathematicians simply because the rest of us make arithmetic mistakes (Ludwig 2007). So, the objection runs, if it is plausible to think of moral philosophers as experts, then moral philosophers can continue to rely upon their judgments whatever the cognitive science says about the judgments of non-experts.

Is this expertise defense plausible? One major problem is that it does not appear to be well supported by empirical evidence. In a few studies (Schwitzgebel and Cushman 2012; Tobia, Buckwalter, and Stich 2013), people with doctorates in moral philosophy have been subjected to the same psychological tests as non-expert subjects and appear to make similar mistakes. There is some dispute about how to interpret these studies (Rini 2015), but if they hold up then it will be hard to defend the moral judgments of philosophers on grounds of expertise.

c. The Regress Challenge

A final objection comes in the form of a regress challenge. Henry Sidgwick first made the point, in his Methods of Ethics, that it would be self-defeating to attempt to debunk judgments on the grounds of their causal origins. The debunking itself would rely on some judgments for its plausibility, and we would then be led down an infinite regress in querying the causal origins of these judgments, and causal origins of the judgments responsible for our judgments about those first origins, and so on. Sidgwick seems to be discussing general moral skepticism, but a variant of this argument presents a regress challenge to even selective cognitive scientific debunking of particular moral judgments. According to the objection, once we have opened the door to debunking, we will be drawn into an inescapable spiral of producing and challenging judgments about the moral trustworthiness of various causal origins. Perhaps, then, we should not start on this project at all.

This objection is limited in effect; it applies most obviously to epistemic forms of cognitive scientific debunking. In non-epistemic conceptions of the aims of moral judgment, it may be possible to ignore some of the objection’s force. The objection is also dependent upon certain empirical assumptions about the interdependence of the causal origins driving various moral judgments. But if sustained, the regress challenge for epistemic debunking seems significant.

d. Positive Alternatives

Finally, we might consider an alternative take on the relationship between moral judgment and cognitive science. Unlike most of the approaches discussed above, this one is positive rather than negative. The idea is this: if cognitive science can reveal to us that we already unconsciously accept certain moral principles, and if these fit with the judgments we think we have good reason to continue to hold, then cognitive science may be able to contribute to the construction of moral theory. Cognitive science might help you to explicitly articulate a moral principle that you already accepted implicitly (Mikhail 2011; Kahane 2013). In a sense, this is simply scientific assistance to the traditional philosophical project of making explicit the moral commitments we already hold—the method of reflective equilibrium developed by Rawls and employed by most contemporary ethicists. In this view, the use of cognitive science is likely to be less revolutionary, but still quite important. Though negative approaches have received most discussion, the positive approach seems to be an interesting direction for future research.

11. References and Further Reading

Berker, Selim. 2009. “The Normative Insignificance of Neuroscience.” Philosophy & Public Affairs 37 (4): 293–329. doi:10.1111/j.1088-4963.2009.01164.x.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Cosmides, Leda, and John Tooby. 1992. “Cognitive Adaptations for Social Exchange.” In The Adapted Mind: Evolutionary Psychology and the Generation of Culture, edited by Jerome H. Barkow, Leda Cosmides, and John Tooby, 163–228. New York: Oxford University Press.
Crockett, Molly J., Luke Clark, Marc D. Hauser, and Trevor W. Robbins. 2010. “Serotonin Selectively Influences Moral Judgment and Behavior through Effects on Harm Aversion.” Proceedings of the National Academy of Sciences of the United States of America 107 (40): 17433–38. doi:10.1073/pnas.1009396107.
Danziger, Shai, Jonathan Levav, and Liora Avnaim-Pesso. 2011. “Extraneous Factors in Judicial Decisions.” Proceedings of the National Academy of Sciences 108 (17): 6889–92. doi:10.1073/pnas.1018033108.
Dworkin, Ronald. 1996. “Objectivity and Truth: You’d Better Believe It.” Philosophy & Public Affairs 25 (2): 87–139.
Dwyer, Susan. 2006. “How Good Is the Linguistic Analogy?” In The Innate Mind: Culture and Cognition, edited by Peter Carruthers, Stephen Laurence, and Stephen Stich. Oxford: Oxford University Press.
Eskine, Kendall J, Natalie A Kacinik, and Jesse J Prinz. 2011. “A Bad Taste in the Mouth: Gustatory Disgust Influences Moral Judgment.” Psychological Science 22 (3): 295–99. doi:10.1177/0956797611398497.
Fried, Charles. 1978. “Biology and Ethics: Normative Implications.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, 187–97. Berkeley, CA: University of California Press.
Gilligan, Carol. 1982. In a Different Voice: Psychological Theory and Women’s Development. Cambridge, MA: Harvard University Press.
Greene, Joshua D. 2008. “The Secret Joke of Kant’s Soul.” In Moral Psychology, Vol. 3. The Neuroscience of Morality: Emotion, Brain Disorders, and Development, edited by Walter Sinnott-Armstrong, 35–80. Cambridge, MA: MIT Press.
Greene, Joshua D. 2014. “Beyond Point-and-Shoot Morality: Why Cognitive (Neuro)Science Matters for Ethics.” Ethics 124 (4): 695–726. doi:10.1086/675875.
Haidt, Jonathan. 2001. “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment.” Psychological Review 108 (4): 814–34.
Haidt, Jonathan. 2012. The Righteous Mind: Why Good People Are Divided by Politics and Religion. 1st ed. New York: Pantheon.
Hamlin, J. Kiley, Karen Wynn, and Paul Bloom. 2007. “Social Evaluation by Preverbal Infants.” Nature 450 (7169): 557–59. doi:10.1038/nature06288.
Harris, Sam. 2010. The Moral Landscape: How Science Can Determine Human Values. 1st ed. New York: Free Press.
Hauser, Marc D. 2006. “The Liver and the Moral Organ.” Social Cognitive and Affective Neuroscience 1 (3): 214–20. doi:10.1093/scan/nsl026.
Joyce, Richard. 2006. The Evolution of Morality. 1st ed. MIT Press.
Kahane, Guy. 2012. “On the Wrong Track: Process and Content in Moral Psychology.” Mind & Language 27 (5): 519–45. doi:10.1111/mila.12001.
Kahane, Guy. 2013. “The Armchair and the Trolley: An Argument for Experimental Ethics.” Philosophical Studies 162 (2): 421–45. doi:10.1007/s11098-011-9775-5.
Kamm, F. M. 1998. “Moral Intuitions, Cognitive Psychology, and the Harming-Versus-Not-Aiding Distinction.” Ethics 108 (3): 463–88.
Kennett, Jeanette, and Cordelia Fine. 2009. “Will the Real Moral Judgment Please Stand up? The Implications of Social Intuitionist Models of Cognition for Meta-Ethics and Moral Psychology.” Ethical Theory and Moral Practice 12 (1): 77–96.
Knobe, Joshua. 2003. “Intentional Action and Side Effects in Ordinary Language.” Analysis 63 (279): 190–94. doi:10.1111/1467-8284.00419.
Koenigs, Michael, Liane Young, Ralph Adolphs, Daniel Tranel, Fiery Cushman, Marc Hauser, and Antonio Damasio. 2007. “Damage to the Prefrontal Cortex Increases Utilitarian Moral Judgements.” Nature 446 (7138): 908–11. doi:10.1038/nature05631.
Kohlberg, Lawrence. 1971. “From ‘Is’ to ‘Ought’: How to Commit the Naturalistic Fallacy and Get Away with It in the Study of Moral Development.” In Cognitive Development and Epistemology, edited by Theodore Mischel. New York: Academic Press.
Ludwig, Kirk. 2007. “The Epistemology of Thought Experiments: First Person versus Third Person Approaches.” Midwest Studies In Philosophy 31 (1): 128–59. doi:10.1111/j.1475-4975.2007.00160.x.
Mason, Kelby. 2011. “Moral Psychology And Moral Intuition: A Pox On All Your Houses.” Australasian Journal of Philosophy 89 (3): 441–58. doi:10.1080/00048402.2010.506515.
Mikhail, John. 2011. Elements of Moral Cognition: Rawls’ Linguistic Analogy and the Cognitive Science of Moral and Legal Judgment. 3rd ed. Cambridge: Cambridge University Press.
Moll, Jorge, Roland Zahn, Ricardo de Oliveira-Souza, Frank Krueger, and Jordan Grafman. 2005. “The Neural Basis of Human Moral Cognition.” Nature Reviews Neuroscience 6 (10): 799–809. doi:10.1038/nrn1768.
Nadelhoffer, Thomas, and Adam Feltz. 2008. “The Actor–Observer Bias and Moral Intuitions: Adding Fuel to Sinnott-Armstrong’s Fire.” Neuroethics 1 (2): 133–44.
Nagel, Thomas. 1978. “Ethics as an Autonomous Theoretical Subject.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, edited by Gunther S. Stent, 198–205. Berkeley, CA: University of California Press.
Newman, George E., Paul Bloom, and Joshua Knobe. 2014. “Value Judgments and the True Self.” Personality and Social Psychology Bulletin 40 (2): 203–16. doi:10.1177/0146167213508791.
Nichols, Shaun, and Joshua Knobe. 2007. “Moral Responsibility and Determinism: The Cognitive Science of Folk Intuitions.” Noûs 41 (4): 663–85. doi:10.1111/j.1468-0068.2007.00666.x.
Rawls, John. 1971. A Theory of Justice. 1st ed. Cambridge, MA: Harvard University Press.
Rini, Regina A. 2013. “Making Psychology Normatively Significant.” The Journal of Ethics 17 (3): 257–74. doi:10.1007/s10892-013-9145-y.
Rini, Regina A. 2015. “How Not to Test for Philosophical Expertise.” Synthese 192 (2): 431–52.
Ruff, C. C., G. Ugazio, and E. Fehr. 2013. “Changing Social Norm Compliance with Noninvasive Brain Stimulation.” Science 342 (6157): 482–84. doi:10.1126/science.1241399.
Schnall, Simone, Jonathan Haidt, Gerald L. Clore, and Alexander H. Jordan. 2008. “Disgust as Embodied Moral Judgment.” Personality & Social Psychology Bulletin 34 (8): 1096–1109. doi:10.1177/0146167208317771.
Schwitzgebel, Eric, and Fiery Cushman. 2012. “Expertise in Moral Reasoning? Order Effects on Moral Judgment in Professional Philosophers and Non-Philosophers.” Mind and Language 27 (2): 135–53.
Sinnott-Armstrong, Walter. 2008. “Framing Moral Intuition.” In Moral Psychology, Vol. 2. The Cognitive Science of Morality: Intuition and Diversity, 47–76. Cambridge, MA: MIT Press.
Skinner, B. F. 1971. Beyond Freedom and Dignity. New York: Knopf.
Street, Sharon. 2006. “A Darwinian Dilemma for Realist Theories of Value.” Philosophical Studies 127 (1): 109–66.
Strohminger, Nina, Richard L Lewis, and David E Meyer. 2011. “Divergent Effects of Different Positive Emotions on Moral Judgment.” Cognition 119 (2): 295–300. doi:10.1016/j.cognition.2010.12.012.
Sunstein, Cass R. 2005. “Moral Heuristics.” Behavioral and Brain Sciences 28 (4): 531–42.
Terbeck, Sylvia, Guy Kahane, Sarah McTavish, Julian Savulescu, Neil Levy, Miles Hewstone, and Philip Cowen. 2013. “Beta Adrenergic Blockade Reduces Utilitarian Judgement.” Biological Psychology 92 (2): 323–28.
Tobia, Kevin, Wesley Buckwalter, and Stephen Stich. 2013. “Moral Intuitions: Are Philosophers Experts?” Philosophical Psychology 26 (5): 629–38. doi:10.1080/09515089.2012.696327.
Tversky, A., and D. Kahneman. 1981. “The Framing of Decisions and the Psychology of Choice.” Science 211 (4481): 453–58. doi:10.1126/science.7455683.
Van Roojen, Mark. 1999. “Reflective Moral Equilibrium and Psychological Theory.” Ethics 109 (4): 846–57.
Wheatley, Thalia, and Jonathan Haidt. 2005. “Hypnotic Disgust Makes Moral Judgments More Severe.” Psychological Science 16 (10): 780–84. doi:10.1111/j.1467-9280.2005.01614.x.
Wilson, E. O. 1975. Sociobiology: The New Synthesis. Cambridge, MA: Harvard University Press.
Yang, Qing, Xiaochang Wu, Xinyue Zhou, Nicole L. Mead, Kathleen D. Vohs, and Roy F. Baumeister. 2013. “Diverging Effects of Clean versus Dirty Money on Attitudes, Values, and Interpersonal Behavior.” Journal of Personality and Social Psychology 104 (3): 473–89. doi:10.1037/a0030596.
Young, Liane, Joan Albert Camprodon, Marc Hauser, Alvaro Pascual-Leone, and Rebecca Saxe. 2010. “Disruption of the Right Temporoparietal Junction with Transcranial Magnetic Stimulation Reduces the Role of Beliefs in Moral Judgments.” Proceedings of the National Academy of Sciences 107 (15): 6753–58. doi:10.1073/pnas.0914826107.

Author Information

Regina A. Rini
Email: gina.rini@nyu.edu
New York University
U. S. A.

Theories of Religious Diversity

Religious diversity is the fact that there are significant differences in religious belief and practice. It has always been recognized by people outside the smallest and most isolated communities. But since early modern times, increasing information from travel, publishing, and emigration has forced thoughtful people to reflect more deeply on religious diversity. Roughly, pluralistic approaches to religious diversity say that, within bounds, one religion is as good as any other. In contrast, exclusivist approaches say that only one religion is uniquely valuable. Finally, inclusivist theories try to steer a middle course by agreeing with exclusivism that one religion has the most value while also agreeing with pluralism that others still have significant religious value.

What values are at issue? Literature since 1950 focuses on the truth or rationality of religious teachings, the veridicality (conformity with reality) of religious experiences, salvific efficacy (the ability to deliver whatever cure religion should provide), and alleged directedness towards one and the same ultimate religious object.

The exclusivist-inclusivist-pluralist trichotomy has become standard since the 1980s. Unfortunately, it is often used with some mix of the above values in mind, leaving it unclear exactly which values are pertinent. While this trichotomy is sometimes thought of in terms of general attitudes that a religious person may have towards other religions—approximately the attitudes of rejection, limited openness, and wide acceptance respectively—in this article they figure as theories concerning the facts of religious diversity. “Religious pluralism” in some contexts means an informed, tolerant, and appreciative or sympathetic view of the various religions. In other contexts, “religious pluralism” is a normative principle requiring that peoples of all or most religions should be treated the same. In this article, “religious pluralism” refers to a theory about the diversity of religions. Finally, some authors use “descriptive religious pluralism” to mean what is here called “religious diversity,” calling “normative religious pluralism” views that are here called varieties of “religious pluralism.” While the trichotomy has been repeatedly challenged, it is still widely used, and can be precisely defined in various ways.

1. Facts and Theories of Religious Diversity
a. History
b. Theories and Associations
2. Religious Pluralism
3. Exclusivism
4. Inclusivism
5. References and Further Reading

1. Facts and Theories of Religious Diversity

Scholars distinguish seven aspects of religious traditions: the doctrinal and philosophical, the mythic and narrative, the ethical and legal, the ritual and practical, the experiential and emotional, the social and organizational, and the material and artistic. (Smart 1996) Religious traditions differ along all these dimensions. These are the undisputed facts of religious diversity. Some authors, usually ones who wish to celebrate these facts, call them “religious pluralism,” but this entry reserves this label for a family of theories about the facts of religious diversity.

It is arguably the doctrinal and philosophical aspects of a religion which are foundational, in that the other aspects can only be understood in light of them. Particularly central to any religion’s teaching are its diagnosis of the fundamental problem facing human beings and its suggested cure, a way to positively and permanently resolve this problem. (Prothero 2010, 13-6; Yandell 1999, 16-9)

a. History

Scholarly study of a wide range of religions, and comparison and evaluation of them, was to a large extent pioneered by Christian missionaries in the nineteenth century seeking to understand those whom they sought to convert. This led to both the questioning and the defense of various “exclusivist” traditional Christian claims. (Netland 2001, 23-54) Theories of religious diversity have largely been driven by attacks on and defenses of such claims, and discussions continue within the realm of Christian theology. (Kärkkäinen 2003; Netland 2001) The most famous of these has been the view (held by some Christians) that all non-Christians are doomed to an eternity of conscious suffering in hell. (Hick 1982 ch. 2; see section 3c below)

All the theories discussed in this article are ways that (usually religious) people regard other religions, but here we discuss them abstractly, without descending much into the details of how they would be worked out in the teachings and practices of any one religion. Such would be the work of a religiously embedded and committed theology of religious diversity, not of a general philosophy of religious diversity.

b. Theories and Associations

Many people associate any sort of pluralist theory of religious diversity with a number of arguably good qualities. These qualities include but are not limited to: being humble, reasonable, kind, broad-minded, open-minded, informed, cosmopolitan, modern, properly appreciative of difference, non-bigoted, tolerant, being opposed to proselytizing (attempts to convince those outside the religion to convert to it), anti-colonialist, and anti-imperialist. In contrast, any non-pluralist theory of religious diversity is associated with many arguably bad qualities. These negative qualities include but are not limited to: being arrogant, unreasonable, mean, narrow-minded, closed-minded, uninformed, provincial, out of date, xenophobic, bigoted, intolerant, in favor of proselytism, colonialist, and imperialist.

These, however, are mere associations; there seem to be no obvious entailments between the theories of religious diversity and the above qualities. In principle, it would seem that an exclusivist or inclusivist may have all or most of the good qualities, and one who accepts a theory of religious pluralism may have all or most of the bad qualities. These connections between theory and character – which are believed by some to provide practical arguments for or objections to various theories – need to be argued for. But it is very rare for a scholar to go beyond merely assuming or asserting some sort of causal connection between the various theories about religious diversity and the above virtues and vices.

2. Religious Pluralism

A theory of religious pluralism says that all religions of some kind are the same in some valuable respect(s). While this is compatible with some religion being the best in some other respect(s), the theorists using this label have in mind that many religions are equal regarding the central value(s) of religion. (Legenhausen 2009)

The term “religious pluralism” is almost always used for a theory asserting positive value for many or most religions. But one may talk also of “negative religious pluralism” in which most or all religions have little or no positive value and are equal in this respect. This would be the view of many naturalists, who hold that all religions are the product of human imagination, and fail to have most or all of the values claimed for them. (Byrne 2004; Feuerbach 1967)

a. Naive Pluralisms

Though naive pluralisms are not common amongst scholars in relevant fields, they are important to mention because they are entertained by many people as they begin to reflect on religious diversity.

An uninformed person, noting certain commonalities of religious belief and practice, may suppose that all religions are the same, namely, that there are no significant differences between religious traditions. This naive pluralism is refuted by accurate information on religious differences. (Prothero 2010)

A common form of negative pluralism may be called “verificationist pluralism.” This is the view that all religious claims are meaningless, and as a result are incapable of rational evaluation. This is because they cannot be empirically verified, that is, their truth or falsity is not known by way of observational evidence.

There are three serious problems with verificationist pluralism. First, some religious claims can be empirically confirmed or disconfirmed. For example, people have empirically disconfirmed claims that Jesus would visibly return to rule the earth from Jerusalem in 1974, or that magical “ghost shirts” will protect the wearer from bullets, or that saying a certain mantra three times will protect one from highway robbers. Second, the claim that meaningfulness requires the possibility of empirical verification has little to recommend it, and is self-refuting (that is, the claim itself is not empirically verifiable). (Peterson et al. 2013, 268-72) Third, religions differ in how much, if at all, they make empirically verifiable claims, so it is unclear that all religions will be equal in making meaningless claims.

While there are other sorts of negative naive pluralism, we shall concentrate on positive kinds here, as most of the scholarly literature focuses on those.

Some forms of naive pluralism suppose that all religions will turn out to be complementary. One idea is that all religions would turn out to be parts of one whole (either one religion or at least one conglomeration of religions). This unified consistency may be hoped for in terms of truth, or in terms of practice. With truth, the problem is that it is hard to see how the core claims of the religions could all be true. For instance, some religions teach that the ultimate reality (the most real, only real, or primary thing) is ineffable (such that no human concept can apply to it). But others teach that the ultimate reality is a perfect self, a being capable of knowledge, will, and intentional action.

What about the religions’ practices – are they all complementary? Some practices seem compatible, such as church attendance and mindfulness meditation. On the other hand, others seem to make little or no sense outside the context of the home religion, and others are simply incompatible. What sense, for instance, would it make for a Zen Buddhist to undergo the Catholic rites of confession and penance? Or what sense would it make for an Orthodox Jew, whose religion teaches him to “be fruitful and multiply,” to employ the Buddhist practice of viewing corpses at a burial ground so as to expunge the unwanted liability of sexual desire? Nor can he be fruitful and multiply while living as a celibate Buddhist monk. Dabblers and hobbyists freely stitch together unique quilts of religious beliefs and practices, but such constructions seem to make little sense once a believer has accepted any particular religion. Many religious claims will be logically incompatible with the accepted diagnosis, and many religious practices will be useless or counter-productive when it comes to getting what one believes to be the cure.

Another way in which pluralism can be naive is the common assumption that absolutely all religions are good in significant ways, for example, by improving their adherents’ lives, facilitating interaction with God, or leading to eternal salvation. However, a person who assumes this is probably only thinking of large, respectable, and historically important religions. It is not hard to find religions or “religious cults” which would not plausibly be thought of as good in the way(s) that a pluralist has in mind. For example, a religious group may function only to satisfy the desires of its founder, discourage the worship of God, encourage the sexual abuse of children, or lead to the damnation of its members.

Carefully worked out theories of religious pluralism often sound all-inclusive. However, they nearly always have at least one criterion for excluding religions as inferior in the aspect(s) they focus on. A difficulty for any pluralist theory is how to restrict the group of equally good religions without losing the appearance of being all-accepting or wholly non-judgmental. A common strategy here is to simply ignore disreputable religious traditions, only discussing the prestigious ones.

b. Core Pluralisms

An improvement upon naive pluralism acknowledges differences in all the aspects of religions, but separates peripheral from core differences. A core pluralist claims that all religions of some kind share a common core, and that this is really what matters about the religions; their equal value is found in this common core. If the core is true teachings, they’ll all be true (to some degree). If the core is veridical experiences, all religions will enable ways to perceive whatever the objects of religious experience are. If the core is salvifically effective practice, then all will be equal in that each is equally well a means to obtaining the cure. Given that any core pluralism inevitably downplays the other non-core elements of the religions, this approach has also been called “reductive pluralism.” (Legenhausen 2006)

The most influential recent proponent of a version of core pluralism has been Huston Smith (b. 1919). In his view, the common core of religions is a tiered worldview. This encompasses the idea that physical reality, the terrestrial plane, is contained within and controlled by a more real intermediate plane (that is, the subtle, animic, or psychic plane) which is in turn contained and controlled by the celestial plane. This celestial plane is a personal God. Beyond this is infinite, unlimited Being (also called “Absolute Truth,” “the True Reality,” “the Absolute,” “God”). Given that it is ineffable, this Being is neither a god, nor the God of monotheism. It is more real than all that comes from it. The various “planes” are not distinct from it, and it is the ultimate object of all desire, and the deepest reality within each human self. Some experience this Being as if it were a god, but the most able gain a non-conceptual awareness of it in its ineffable glory. Smith holds that in former ages, and among primitive peoples now, such a worldview is near universal. It is only modern people who are blinded by the misunderstanding that science reveals all, who have forgotten it. (Smith 1992, 2003 ch. 3) The highest level in some sense is the human “Spirit,” the deep self which underlies the self of ordinary experience. Appropriating Hindu, Buddhist, and Christian language, Smith says that this “spirit is the Atman that is Brahman, the Buddha-nature that appears when our finite selves get out of its way, my istigkeit (is-ness) which…we see is God’s is-ness.” (Smith 2003 ch. 3, 3-4)

Such an outlook, often called “the perennial philosophy” or “traditionalism,” owes much to nineteenth and twentieth century occult literature, and to neo-Platonism and its early modern revivals. (Sedgwick 2004) Like traditional religions, it too offers a diagnosis of the human condition and a cure. It offers a fall from primordial spirituality into modern spiritual poverty, cured by adopting the outlook sketched above. Most importantly, it offers a chance to discover the deep self as Being. A muted ally in this was the influential religious scholar Mircea Eliade (1907-86), whose work focused on comparing mythologies, and on what he viewed as an important, primitive religious outlook, which separates things into the sacred and the profane.

This “perennial philosophy” appeals to many present-day people, particularly those who, like Smith, have moved from a childhood religious faith, in his case, Christianity, to a more naturalistic and, hence, atheistic outlook. Such an outlook is commonly perceived as meaningless, hopeless, and devoid of value. (Smith 2003, 2012)

Dissenters are found among historians of religion, who deny that there is and has always been a common core in all of the world’s religions. Others dissent because they accept the incompatible diagnosis and cure taught by some other religion, such as the ones found in Islam or Christianity. (Legenhausen 2002) Those who believe the ultimate reality to be a unique god object to Smith’s view that the ultimate reality is ineffable, and so not, in itself, a god. Others find it excessive that Smith accepts other traditional doctrines, such as that Plato’s Forms are not only real but alive, that in dreaming the “subtle body” leaves the body and roams free in the intermediate realm, belief in siddhis (supernatural powers gained by meditation), possession by spirits, psychic phenomena, and so on.

This sort of core pluralism was propounded by some members of the Theosophical Society such as co-founder Helena Petrovna Blavatsky (1831-91) in her widely read The Secret Doctrine (1888), and by French convert to Sufi Islam René Guénon (1886-1951) and those influenced by him, such as the eclectic Swiss-German writer Frithjof Schuon (1907-98). This sort of pluralism, following Guénon and Schuon, has been championed by Iranian philosopher Seyyed Hossein Nasr (b. 1933) and English convert to Islam, Martin Lings (1909-2005) who was a biographer of Muhammad. (Legenhausen 2002)

While Smith’s view rests on belief in an impersonal Ultimate, other versions of core pluralism rest upon monotheism. Thus, the Hindu intellectual Ram Mohun Roy (1772/4-1833) held that Hinduism, Islam, and Christianity, when understood in their original, non-idolatrous and non-superstitious ways, all teach the important truths about God and humankind, enabling humans to love and serve God. Roy, however, always retained his Hindu and Brahmin identities. (Sharma 1999, ch. 2) He was what we now call a “pluralistic Hindu” (and most Hindus would add that he was also an unorthodox Hindu).

Swedish philosopher Mikael Stenmark explores what he calls the “some-are-equally-right view” about religious diversity, and discusses a version of it on which Judaism, Christianity, and Islam are held to possess equal amounts of religiously important truths. (Stenmark 2009) He does not advocate this view, but explores it as an alternative to exclusivism, inclusivism, and Hickian identist pluralism. Stenmark views it as most similar to identist pluralism (see 2e below). But Stenmark’s “some-are-equally-right view” can also be seen as a form of core pluralism, the core being truths about the one God and our need for relationship with God. On this view, “all” the religions are right to the same degree, that is, all versions of monotheism (or perhaps, ethical monotheisms, or Abrahamic monotheisms). This account is narrower than “pluralism” is usually thought to be, but it is arguably a version of it.

c. Hindu Pluralisms

The tradition now called “Hinduism” is and always has been very internally diverse. In modern times, it tries to equalize other religions in the same ways it equalizes the apparently contrary claims and practices internal to it. While elements within it have been sectarian and exclusivistic, modern Hindu thought is usually pluralistic. Furthermore, Hindu thought has shifted in modern times from a scriptural to an experiential emphasis. (Long 2011) Still, some Hindus object to various kinds of pluralism. (Morales 2008)

Within the pluralistic mainstream of Hinduism, a popular slogan is that “all religions are true,” but this may be an expression of almost any sort of positive religious pluralism. Moreover, some influential modern Hindu leaders have adopted a complicated rhetoric of “universal religion,” which often assumes some sort of religious pluralism. (Sharma 1999, ch. 6)

At bare minimum, the slogan that “all religions are true” means that all religions are in some way directed towards one Truth, where this is understood as an ultimate reality. Thus, it has been observed that identist religious pluralism (see 2e below) “is essentially a Hindu position,” and closely resembles Advaita Vedanta thought. (Long 2005) The slogan may also imply that all religions feature veridical experience of that one object, by way of a non-cognitive, immediate awareness. (Sharma 1990) This modern Hindu outlook has proven difficult to formulate in any clear way. One prominent scholar argues that the “neo-Hindu” position on religious diversity (that is, modern Hindu pluralism) is not the view that all religions are equal, one, true, or the same. Instead, it is the view that all religions are “valid,” meaning that they have some degree of (some kind of) value. (Sharma 1979) But if there is no one clear modern Hindu pluralism, it remains that various modern Indian thinkers have held to versions of core or identist pluralism.

Paradoxically, such pluralism is often expressed along with claims that Hinduism is greatly superior in various ways to other religions. (Datta 2005; Morales 2008) It has been argued that whether and how a Hindu embraces a theory of religious pluralism will depend crucially on what she takes “Hinduism” to be. (Long 2005)

d. Ultimist Pluralisms

Building on the speculative metaphysics of Alfred North Whitehead’s (1861–1947) Process and Reality (1929), and work by his student Charles Hartshorne (1897-2000), theologians John Cobb and David Ray Griffin have advocated what the latter calls “deep,” “differential,” and “complementary” pluralism – what is here described as “ultimist.” (Cooper 2006, ch. 7; Griffin 2005b)

Cobb and Griffin assume that there is no supernatural intervention (any miraculous interruption of the ordinary course of nature) by God or other beings. This, it is hoped, rules out anyone having grounds for believing any particular religion to be the uniquely best religion. (Griffin 2005a) They do, however, take seriously at least many of the unusual religious experiences people report. They hypothesize that some religious mystics really do perceive (without using ordinary sense organs) some “ultimate” (that is, something regarded as ultimate). Thus, in experiencing what they call “Emptiness” or the Dharmakaya (truth body), Mahayana Buddhists really do perceive what Cobb calls Creativity (or Being Itself), as do Advaita Vedanta Hindus when they perceive Nirguna Brahman (Brahman without qualities). Other Buddhists experience Amida Buddha, while Christians experience Christ, and Jews Yahweh, Hindus Isvara, and Muslims Allah. All such religious mystics really perceive a personal, supreme God, understood panentheistically, as being “in” the cosmos, akin to how a soul is in a body. Yet other religious mystics perceive the Cosmos, that is, the totality of finite things (the “body” of the World Soul).

These three – Creativity, God, Cosmos – are such that none could exist without the others. Further, it is really Creativity that is ultimate, and it is “embodied in” and does not exist apart from or as an object in addition to God and the Cosmos. Sometimes God and Cosmos are described as aspects of Creativity. The underlying metaphysics here is that of process philosophy, in which events are the basic or fundamental units of reality. On such a metaphysics, any apparent substance (being, entity) turns out to be one or more events or processes. Even God, the greatest concrete, actual being in this philosophy is, in the end, an all-encompassing “unity of experience,” and is to be understood as a process of “Creative-Responsive love.” (Griffin 2005b; Cooper 2006, ch. 7)

All the major religions, then, are really oriented towards, and involve the experience of some reality regarded as “ultimate” (Creativity, God, or Cosmos). It is also allowed that each major religion really does deliver the cure it claims to (for example, salvation and heaven, Nirvana, Moksha), and is entitled to operate by its own moral and epistemic values. Further, it respects and does not try to eliminate all these differences, and so makes genuine dialogue between members of the religions possible. Finally, Cobb and Griffin emphasize that this approach does not endorse any unreasonable form of relativism and, as such, allows one to remain distinctively Christian or Buddhist and so forth. They hope that each religion can, while remaining distinct, begin to construct “global” theologies, influenced by the truths and values of other religions. In all these ways, they argue that their ultimist pluralism is superior to other pluralisms.

This view has not been widely accepted because the Process theology and philosophy on which it is based has not been widely accepted.

One may object that the above proposal is counter to the equalizing spirit of pluralism. Griffin and Cobb seem to attribute the deepest insight to those who think the ultimate reality is an impersonal, indescribable non-thing. In their view, those who confess experience of Emptiness, Nirguna Brahman, or the One (of Neoplatonism) behold the ultimate reality (Creativity) as it really is, in contrast to monotheists or cosmos-focused religionists, who latch on to what are limited aspects of Creativity. But these monotheists and cosmos-worshipers each take their object to be ultimate, and would deny the existence of any further back entity or non-entity, that is, of Creativity. It would seem that for a Christian, for example, to accept this ultimist pluralism, she will have to reinterpret what many Christians will regard as a core commitment, namely, that the ultimate reality is personal. Even a Mahayana Buddhist may have a lot of adjusting to do, if she is to admit that believers in a personal God really do experience the greatest entity, and something which is not separate from Emptiness. Wouldn’t this be to attribute more reality to God than she’s willing to? And how can the ultimist pluralist demand such changes?

A similar pluralism is advanced by Japanese Zen scholar Masao Abe (1915-2006). He applies the Mahayana doctrine of the “three bodies” of the Buddha to other religions. In Mahayana Buddhism, the ultimate reality, a formless but active non-thing, is Emptiness, or the Truth Body (Dharmakaya). This in some sense manifests as, acts as, and is not different from a host of Enjoyment Bodies (Sambhogakaya), each of which is a Buddha outside of space and time, a historical Buddha now escaped from samsara and dwelling in a Buddha-realm. The historical Buddha, the man Gautama, is, in this doctrine, a Transformation Body (or Apparitional Body, Nirmanakaya) of one of these, as are other Buddhas in time and space. In some sense these three are one; however, the Truth Body manifests or acts as various Enjoyment Bodies, which in turn manifest or act as various Transformation Bodies. The latter two classes of beings, but not the first, may be described as “personal.”

As to religious diversity, Abe suggests that we view the dynamic activity Emptiness (also called “Openness”) as ultimate, and as manifesting as various “Gods,” that is various monotheistic deities, and “Lords,” which are human religious teachers, whether manifestations of a god, as in the case of Jesus, or just pre-eminent servants of a god, as with Moses or Muhammad.

It is a mistake, Abe holds, to regard any god as ultimate, and monotheists must revise their understanding as above, if true inter-religious dialogue and peace are to be achievable. Equally, Advaita Vedanta Hindus must let go of their insistence on Nirguna Brahman as ultimate. It is a mistake to think that the ultimate is any substantial, self-identical thing, even an ineffable one. (Abe 1985)

Must the Mahayanist make any significant revision to accept the proposed “threefold reality” of Emptiness, gods, and lords? Presumably not, as she already believes in levels of truth and levels of reality. At the highest level there is only Emptiness, the ultimate. The gods and lords will stay at the “provisional” levels of truth and reality, the levels which a fully enlightened person, as it were, sees beyond.

Abe’s views have been criticized by other scholars as misunderstanding some other religions’ claims, and as privileging Mahayana Buddhist doctrines, insofar as he understands these doctrines as being truer than others. It can be argued that Abe is an inclusivist, maintaining that Buddhism is the best religion, rather than a true pluralist. (Burton 2010)

e. Identist Pluralisms

In much religious studies, theology, and philosophy of religion literature of the 1980s through the 2000s, the term “religious pluralism” means the theory of philosopher John Hick. (Hick 2004; Quinn and Meeker 2000) Hick’s approach is original, thorough, and informed by a broad and deep knowledge of many religions. His theory is at least superficially clear, and is rooted in his own spiritual journey. It attracted widespread discussion and criticism, and Hick has engaged in a spirited debate with all comers. It is here described as “Identist” pluralism because his theory claims that people in all the major religions interact with one and the same transcendent reality, variously called “God,” “the Real,” and “the Ultimate Reality.”

Hick viewed religious belief as rationally defensible, and held that one may be rational in trusting the veridicality of one’s religious experiences. However, he thought that it is arbitrary and indefensible to hold that only one’s own experiences or the experiences of those in one’s group are veridical, while those of people in other religions are not. Subjectively, those other people have similar grounds for belief. These ideas, and the fact that religious belief is strongly correlated with birthplace, convinced him that the facts of religious diversity pose irresolvable problems for any exclusivist or inclusivist view, leaving only some form of pluralism as a viable option. (Hick 1997)

Starting as a traditional, non-pluralistic Christian, Hick attended religious meetings and studied with people of other religions. As a result, he became convinced that basically the same thing was going on with these other religious followers as with Christians, namely, people responding in culturally conditioned but transformative ways to one and the same Real or Ultimate Reality. In his earlier writings, monotheistic concerns seem important. How could a perfect being fail to be available to all people in all the religions? Later on, Hick firmly settled on the view that this Real should be thought of as ineffable. Appropriating Immanuel Kant’s distinction between phenomena, how things appear, and noumena, things in themselves, Hick postulated that the Real is ineffable and is not directly experienced by anyone. However, he maintained that people in the religions interact with it indirectly, by way of various personae and impersonae, personal and impersonal appearances or phenomena of the Real. In other words, this Ultimate Reality, due to the various qualities of human minds, appears to various people as personae, such as God, the Trinity, Allah, Vishnu, and also as impersonae such as Emptiness, Nirvana, Nirguna Brahman, and the Dao. These objects of religious experience are mind-dependent, in that they depend for their existence, in part, on people with certain religious backgrounds. By contrast, the Real in itself, that is, the Ultimate Reality as it intrinsically is, is never experienced by anyone, but is only hypothesized as overall the most reasonable explanation of the facts of religious diversity.

Among these purported facts, for Hick, is that the great religions equally well facilitate the ethical transformation of their adherents, what Hick calls a transformation from self-centeredness to other-centeredness and Reality-centeredness. (Sometimes, however, Hick makes the weaker claim that we’re unable to pick any religion as best at effecting this transformation.) This transformation, Hick theorizes, is really the point of religion. All religions, then, are equal in that they are responses to the ineffable Ultimate Reality which equally well—or for all we can tell equally well—bring about an ethical improvement in humans, away from self-centeredness and towards other humans and the Ultimate Reality.

Hick realizes the incoherence of dubbing all religions “true,” for they have core teachings that conflict, and most religions are not shy about pointing out such conflicts. We could loosely paraphrase Hick’s view as being that all religions are false, not that all their teachings are false (for there is much ethical and practical agreement among them), but rather that their core claims about the main object of religious experience and pursuit equally contain falsehoods. Monotheists, after all, take the ultimate being to be a personal god while others, variously called ultimists, absolutists, or monists, hold the ultimate to be impersonal, such as the Dao, Emptiness, Nirguna Brahman, and so forth. These, Hick holds, are all mistaken; the Ultimate reality is neither personal nor impersonal. (Hick 1995, ch. 3) To say that it is either, Hick realizes, would be to hand an epistemic victory to either the monotheists or the absolutists (ultimists). This, he will not do.

Instead, Hick downgrades the importance of true belief to religion. Though not true, doctrines such as the Trinity or the Incarnation, he argues, may be interpreted to have “mythological truth,” that is, a tendency to influence people towards getting what Hick postulates is the cure offered by the religions, the ethical transformation described above.

Hick doesn’t argue for the salvific or cure-delivering equality of all religions. Rather, he only argues for the equality of what he calls “post-axial” religions – major religious traditions which have arisen since around 800 B.C.E. (Hick 2004, ch. 2-4; Meeker 2003)

Hick’s identist religious pluralism has been objected to as thoroughly as any recent theory in philosophy. Here we can survey only a few of the criticisms that have been made. (For others see Hick 2004, Introduction.)

Many have objected that Hick’s pluralism is not merely a theory about religions, but is itself a competitor in the field, offering a diagnosis and cure which disagrees with those of the world religions. It is hard to see, then, how his theory enables one to be, as Hick claimed to be, a “pluralistic Christian,” given that one has replaced the diagnosis and cure of Christianity with those of Hick’s pluralism. In reply, Hick urges that his claims are not themselves religious, but are rather about religious matters, and are, as such, philosophical.

Hick’s claim that no human concept applies to the Ultimate Reality has been criticized by many, who’ve pointed out that Hick applies these concepts to it: being one, being ultimate, being a partial cause of the impersonae and personae, and being ineffable. Moreover, it seems a necessary truth that if the concept personal doesn’t apply to the Real, then the concept non-personal must apply to it. (King 2008; Rowe 1999; Yandell 1999) In response, Hick concedes that some concepts, “formal” ones, can be applied to the Real, while “substantial” ones cannot. He switches to the term “transcategorial,” points out historical versions of this thesis, and urges that the Real simply is not in the domain of entities to which concepts like personal and non-personal apply. His critics, he argues, are merely asserting without reason that there cannot be a transcategorial reality. (Hick 2000, 2004)

As to Hick’s idea that the correlation of birthplace and religious belief somehow undermines the rationality of religious belief, it has been pointed out that religious pluralism too is correlated with birthplace. In response to his claims that non-pluralistic religious believers are being arrogant, irrational, or arbitrary in believing that one religion (theirs) is the most true one, it has been pointed out that Hick too, as a religious pluralist, holds views which are inconsistent with many or most religions, seemingly preferring his own insights or experiences to theirs, which would, by Hick’s own lights, be just as arrogant, irrational, or arbitrary. (O’Connor 1999; King 2008; Bogardus 2013)

Others object that given the transcategoriality or ineffability of the Real, even with the above qualifications, there is no reason to think that interaction with the Real should be ethically beneficial, or that it should have any connection at all to any religious value. (Netland 2001)

Others object that Hick’s pluralism requires arbitrarily reinterpreting religious language non-literally, and usually as having to do with morality, contrary to what most proponents of those religions believe. (Yandell 2013)

Again, it has been objected that Hick, contrary to many religions, downgrades religious practice and belief as inessential to a religion, the only important features of a religion being that it is a response to the Ultimate Reality and that it fosters the ethical transformation noted above. Further, Hick presupposes the correctness of recent socially “liberal” ethics, for example, “sexual liberation,” and thus rules out as inessential to any religion any conflicting ethical demands. (Legenhausen 2006)

Other objections have been centered on the status of Hick’s personae. If, for example, in his view Allah, Vishnu, and Yahweh are all real and distinct, is Hick thereby committed to polytheism? Or are those gods mere fictions? (Hasker 2011) At first Hick evades the issue of polytheism by describing his theory not as a kind of “polytheism,” but rather as “poly-something.” He then suggests that two views of the personae are compatible with his theory: that they are mental projections, or that they are real, but angel- or deva-like beings, intermediaries and not really gods. Finally, Hick revises his view: the monotheistic gods people experience are mental projections in response to the Real, and not real selves, but since religious people really do encounter great selves in religious experience, we should posit personal intermediaries between humans and the Real, with whom religious people interact. Perhaps these are the angels, devas, and heavenly Buddhas of the religions, great but nonetheless finite beings. Thus Christians, for example, imagine that in prayer they interact with the ultimate, a monotheistic god, but in fact they interact with angels, and perhaps different Christians with different angels. It is a mistake, he now holds, to suppose that the personae (that is, Vishnu, Yahweh, Allah, and so on) are angel-like selves. This is not compatible with his thesis that Vishnu and others are phenomena of the Real, that is, culturally conditioned ways that the Real appears to us. (Hick 2011)

A less developed identist pluralism is explored by Peter Byrne. (1995, 2004) All the major religions are equal in that they (1) refer to and facilitate cognitive contact with a single, transcendent reality, (2) each offers a similarly moral- and eternal-oriented “cure,” and (3) each includes revisable and incomplete accounts of this transcendent reality. It has been objected that this theory is not promising because it is hard to see how we could ever have sufficient evidence for some of its claims, while others are implausible in light of the evidence we do have. (Mawson 2005)

f. Expedient Means Pluralism

Historically, Buddhist thought about other religions has almost never been pluralistic. (Burton 2010; Kiblinger 2005; and section 4c below) But in modern times, some have constructed a novel and distinctively Buddhist pluralism using the Mahayana doctrine of “expedient means” (Sanskrit: upaya). The classic discussion of this is in the Lotus Sutra (before 255 C.E.), which argues that previous versions of Buddhist teaching were mere expedient means, that is, non-truths taught because in his great wisdom, the Buddha knew that at its then immature stage, humanity would be aided only by those teachings. This was a polemic against non-Mahayana versions of Buddhist dharma (teaching). Now that the time is right, the truth may be told, that is, Mahayana doctrine, superseding the old. However, more recently, it has been argued that all religious doctrines, even Mahayana ones, are expedient means, helpful non-truths, ladders to be kicked away upon attainment of the cure, here understood as a non-cognitive awareness of the ultimate reality. (Burton 2010)

3. Exclusivism

a. Terminological Problems

The term “exclusivist” was originally a polemical term, chosen in part for its negative connotations. Some have urged that it be replaced by the more neutral terms “particularism” or “restrictivism.” (Netland 2001, 46; Kärkkäinen 2003, 80-1) This article retains the common term because it is widespread and many have adopted the label for their own theory of religious diversity.

In this article “exclusivism” about religious diversity denies any form of pluralism; it denies that all religions, or all “major ones,” are the same in some important respect. Insofar as a religion claims to possess a diagnosis of the fundamental problem facing humans and a cure, that is, a way to permanently and positively resolve this problem, it will then assume that other, incompatible diagnoses and cures are incorrect. Because of this, arguably exclusivism (or inclusivism, see section 4 below) is a default view in religious traditions. Thus, for example, the earliest Buddhist and Christian sources prominently feature staunch criticisms of various rival teachings and practices as, respectively, false and useless or harmful. (Netland 2001; Burton 2010)

Some philosophers, going against the much-discussed identist pluralism of John Hick (see 2e above), use “exclusivism” to mean reasonable and informed religious belief which is not pluralist. (O’Connor 1999) This “exclusivism” is compatible with both exclusivism and inclusivism as those terms are used in this article. It is difficult to make a fully clear distinction between exclusivist and inclusivist approaches. The basic idea is that the inclusivist grants more of the values in question to religions other than the single best religion – more truth, more salvific efficacy, more veridical experience of the objects of religious experience, more genuine moral transformation, and so forth.

Finally, because of their fit with many traditional religious beliefs and commitments, sometimes exclusivism and inclusivism are considered as two varieties of “confessionalism,” views on which “one religion is…true and…we must view other religions in the light of that fact.” (Byrne 2004, 203)

b. Naive Exclusivisms

An exclusivist stance is often signaled by the claim that there is “only one true religion.” Other religions, then, are “false.” A naive person may infer from this that no claim, or no central claim of any other religion is true, but all such are false. This position cannot be self-consistently maintained. Consider the claim that the cosmos was intentionally made. An informed Christian must concede that Jews and Muslims too believe this, and that they teach it as a central doctrine. Thus, if central Christian teachings are true, then so is at least one central teaching of these two rival religions.

Another naive exclusivist view which is rejected by most theorists is that all who are not full-fledged members of the best religion fail to get the cure. For example, all non-Christians go to hell, or all non-Buddhists fail to gain Nirvana, or to make progress towards it. Theorists nearly always loosen the requirement with regard to what they view as the one most true and/or most effective religion. Thus, Christian exclusivists usually allow that those who die as babies, the severely mentally handicapped, or friends of God who lived before Christian times may avoid hell and attain heaven despite their not being fully-fledged, believing and practicing Christians. (Dupuis 2001; Meeker 2003; section 3c below) Similarly, Buddhists usually allow that a person may gain positive karma, and so a better rebirth, by the practice of various other religions, helping her to advance, life by life, towards getting the cure by means of the distinctive Buddhist beliefs and practices.

While such naive exclusivist positions are rarely expounded by scholars, they frequently appear in the work of pluralists and inclusivists, held up as unfortunate, harmful, and unreasonable theories which are in urgent need of replacement.

c. Christian Exclusivisms

Early bishop Ignatius of Antioch (c. 35-107) writes that “if any follow a schismatic [that is, the founder of a religious group outside of the bishop-ruled catholic mainstream] they will not inherit the Kingdom of God.” (Letter of Ignatius to the Philadelphians 3:3) Leading catholic theologian Origen of Alexandria (c. 186-255) wrote: “outside the Church no one is saved.” (Dupuis 2001, 86-7) Yet Origen also held, at least tentatively, that eventually all rational beings will be saved.

Thus, the slogan that there is no salvation outside the church (Latin: Extra ecclesiam nulla salus) was meant to communicate at bare minimum the uniqueness of the Christian church as God’s instrument of salvation since the resurrection of Jesus. The slogan was nearly always, in the first three Christian centuries, wielded in the context of disputes with “heretical” Christian groups, the point being that one can’t be saved through membership in such groups. (Dupuis 2001, 86-9)

However, what about Jews, pagans, unbaptized babies, or people who never have a chance to hear the Christian message? After catholic Christianity became the official religion of the empire (c. 381), it was usually assumed that the message had been preached throughout the world, leaving all adult non-Christians without excuse. Thus, Augustine of Hippo (354-430) and Fulgentius of Ruspe (468-533) interpreted the slogan as implying that all non-Christians are damned, because they bear the guilt of “original sin” stemming from the sin of Adam, which has not been, as it were, washed away by baptism. (Dupuis 2001, 91-2)

Water baptism, from the beginning, had been the initiation rite into Christianity, but it was still unclear what church membership strictly required. Some theorized, for instance, that a “baptism of blood,” that is, martyrdom, would be enough to save unbaptized catechumens. Later theologians added a “baptism of desire,” which was either a desire to be baptized or the inclination to form such a desire, either way enough to secure saving membership in the church. In the first case, a person who is killed in an accident on her way to be baptized would nonetheless be in the church. In the second, even a virtuous pagan might be a church member. This “baptism of desire” was officially affirmed by the Roman Catholic Council of Trent in 1547.

With the split of the catholic movement into Roman Catholic and Eastern Orthodox branches, “the church” was understood in Western contexts to be specifically the Roman Catholic church. Thus, famously, in a papal bull of 1302, called by its first words Unam Sanctam (that is, “One Holy”), Pope Boniface VIII (r. 1294-1303) declared that outside the Roman Catholic church, “there is neither salvation nor remission of sins,” and “it is altogether necessary to salvation for every human creature to be subject to” the pope. (Plantinga 1999, 124-5; Neuner and Dupuis 2001, 305) Note that this might still be interpreted with or without the various non-standard ways to obtain church membership mentioned above. The context of this statement was not a discussion of the fate of non-Christians, but rather a political struggle between the pope and the king of France.

In the Decree for the Copts of the General Council of Florence (1442), a papal bull issued by Pope Eugene IV (r. 1431-47), for the first time in an official Roman Catholic doctrinal document the slogan was asserted not only with respect to heretics and schismatics, but also concerning Jews and pagans. (Neuner and Dupuis 2001, 309-10) It also seems to close the door to non-standard routes to church membership, saying that “never was anyone, conceived by a man and a woman, liberated from the devil’s dominion except by faith in our Lord Jesus Christ.” (Tanner 1990, 575) Non-Catholics will “go into the everlasting fire…unless they are joined to the catholic church before the end of their lives…nobody can be saved, no matter how much he has given away in alms and even if he has shed his blood in the name of Christ, unless he has persevered in the bosom and the unity of the catholic church.” (Tanner 1990, 578)

This exclusivistic or “rigorist” way of understanding the slogan, on which only the Roman Catholic church could provide the “cure” needed by all humans, was the most common Catholic stance on religious diversity until the mid-nineteenth century. But some had always held on to theories about ways into the church other than water baptism, and since the European discovery of the New World it had become clear that the gospel had not been preached to the whole world, and many held that such pagans were non-culpably ignorant of the gospel. This view was affirmed by Pope Pius IX (r. 1846-78) in his Singulari Quadam (1854): “outside the Apostolic Roman Church no one can be saved…On the other hand…those who live in ignorance of the true religion, if such ignorance be invincible, are not subject to any guilt in this matter before the eyes of the Lord.” (Neuner and Dupuis 2001, 311)

Nineteenth-century popes condemned Enlightenment-inspired theories of religious pluralism about truth and salvation, then called “indifferentism,” it being, allegedly, indifferent which major religion one chose, since all were of equal value. At the same time, they argued that many people who are outside the one church cannot be blamed for this, and so will not be condemned by God.

Such views are consistent with exclusivism in the sense that Roman Catholic Christianity is the one divinely provided and so most effective instrument of salvation, as well as the most true religion, and the “true religion” in the sense that any claim which contradicts its official teaching is false. Letters by Pius XII (r. 1939-58) declared that “by an unconscious desire and longing” non-Catholics may enjoy a saving relationship with the church. (Dupuis 2001, 127-9) Whether these non-Catholics are thought to be in the church by a non-standard means, or whether they are said to be not in the church “in reality” but only “in desire,” it was held that they were saved by God’s grace. (Neuner and Dupuis 2001, 329)

Since the Vatican II council (1962-5), many Catholic theologians have embraced what most philosophers will consider some form of inclusivism rather than a suitably qualified exclusivism, with a minority opting for some sort of pluralism. (On the majority inclusivism, see section 4b below.) The impetus for this change came from statements of that council (their Latin titles: Lumen Gentium, Ad Gentes, Nostra Aetate, Gaudium et Spes, Heilsoptimismus), which are in various ways positive towards non-Catholics. One asserts not merely the possibility, but the actuality of salvation for those who are inculpably ignorant of the gospel but who seek God and try to follow his will as expressed through their own conscience. Another, without saying that people may be saved through membership in them, affirms various positive values in other religions, including true teachings, which serve as divinely ordained preparations for reception of the gospel. Catholics are exhorted to patient, friendly dialogue with members of other religions. (Dupuis 2001, 161-5) Some Catholic theologians have seen the seeds or even the basic elements of inclusivism in these statements, while others view them as within the orbit of a suitably articulated exclusivism. (Dupuis 2001, 165-170) A key area of disagreement is whether or not these imply that a person may be saved by means of their participation in some other religion. Still other Catholic theologians have found these moves to be positive but not nearly different enough from the more pessimistic sort of exclusivism. Such theologians, prominently Hans Küng (b. 1928) and Paul Knitter (b. 1939), have formulated various pluralist theories. (Kärkkäinen 2003, 197-204, 309-17)

Protestant versions of exclusivism can be at least as strict as Augustine’s. Recently called “restrictivism,” this position insists that explicit knowledge of the gospel of Jesus Christ is necessary for salvation, and there is no hope for those who die without having heard the gospel. (McDermott and Netland 2014, 148) But these are sometimes tempered with loopholes such as: a universal chance to hear the gospel at or after death, a free pass to people who die before the “age of accountability,” or the view that less was required to be saved in pre-Christian times. Another view which is taken by Bible-oriented evangelical Protestants allows the possibility of non-Christians receiving saving grace, but is firmly agnostic as to whether this actually occurs, and if it does, how often, because of the paucity of relevant biblical statements. (McDermott and Netland 2014) Other Protestants choose forms of inclusivism similar to Rahner’s (see 4b below).

4. Inclusivism

a. Terminological Problems

On the one hand, it is difficult to consistently distinguish inclusivism from exclusivism, because the latter nearly always concedes some significant value to other religions. “Inclusivism” for some authors just means a friendlier or more open-minded exclusivism. On the other hand, many theorists want to adopt the friendly and broad-minded sounding label “pluralism” for their theory, even though they clearly hold that one religion is uniquely valuable. For example, both Christians and Buddhists have adopted religious-diversity-celebrating rhetoric while clearly denying anything described above in this article as a kind of “pluralism” about religious diversity. (Dupuis 2001; Burton 2010)

b. Abrahamic Inclusivisms

Historically, Jewish intellectuals have usually adopted an inclusivist rather than an exclusivist view about other religions. A typical Rabbinic view is that although non-Jews may be reconciled to God, and thus gain life in the world to come, by keeping a lesser covenant which God has made with them, still Jews enjoy a better covenant with God. Beginning in the late twentieth century, however, some Jewish thinkers have argued for pluralism along the lines of various Christian authors, revising traditional Jewish theology. (Cohn-Sherbok 2005)

Since the latter twentieth century many Roman Catholic theologians have explored non-exclusivist options. As explained above (section 3c) a major impetus for this has been statements issued by the latest official council (Vatican II, 1962-5). One goes so far as to say that “the Holy Spirit offers to all [humans] the possibility of being associated, in a way known to God, with the Paschal Mystery [that is, the saving death and resurrection of Jesus].” (Gaudium et Spes 22, quoted in Dupuis 2001, 162) Some Catholic theologians see the groundwork or beginning in these documents for an inclusivist theory, on which other religions have saving value.

Influential German theologian Karl Rahner (1904-84), in his essay “Christianity and the Non-Christian Religions,” argues that before people encounter Christianity, other religions may be the divinely appointed means of their salvation. Insofar as they in good conscience practice what is good in their religion, people in other religions receive God’s grace and are “anonymous Christians,” people who are being saved through Christ, though they do not realize it. All Christians believe that some were saved before Christianity, through Judaism. So too at least some other religions must still be means for salvation, though not necessarily to the same degree, for God wills the salvation of all humankind. But these lesser ways should and eventually will give way to Christianity, the truest religion, intended for all humankind. (Plantinga 1999, 288-303)

Subsequent papal statements have moved cautiously in Rahner’s direction, affirming the work of the Holy Spirit not only in the people in other religions, but also in those religions themselves, so that in the practice of what is good in those religions, people may respond to God’s grace and be saved, unbeknownst to them, by Christ. Nonetheless, the Roman Catholic church remains the unique divine instrument; no one is saved without some positive relation to it. (Dupuis 2001, 170-9; Neuner and Dupuis 2001, 350-1)

Although many traditional Protestant Christians hold some form of exclusivism, others favor an inclusivism much like Rahner’s. (Peterson et al. 2013, 333-40) Theologically liberal Protestants most often hold on to some form of religious pluralism.

As a relative latecomer which has always acknowledged the legitimacy of previous prophets, including Abraham, Moses, and Jesus, while proclaiming its prophet to be the greatest and last, Islam has, like Judaism, tended towards inclusivist views of other religions. The traditional Islamic perspective is that while in one sense “Islam” was initiated by Muhammad (570-632 CE), “Islam” in the sense of submission to God was taught by all prior prophets, and so their followers were truly Muslims, that is, truly submitted to God. Still, given that Muhammad is the seal of the prophets, his teachings and practices should, and some day will, supersede all previous ones. Recent Islamic thinkers have independently come to conclusions parallel to those of Rahner, while critiquing various pluralist theories as entailing the sin of unbelief (kufr), the rejection of Islam.

It is a matter of dispute whether certain famous Sufi Muslims such as Rumi (1207-73) and Ibn ‘Arabi (1165-1240) have held to some form of religious pluralism. (Legenhausen 2006, 2009)

c. Buddhist Inclusivisms

While there have been Buddhist teachers and movements who have been exclusivists, in general Buddhism has been inclusivist. Buddhism has long been very doctrinally diverse, and many schools of Buddhism argue that theirs is the truest teaching or the best practice, while other versions of the dharma are less true or less conducive to getting the cure, and have now been superseded. It has been typical also for Buddhist thinkers to hold that at best, the same is true of other religious traditions. (Burton 2010) On the other hand, some religions’ teachings are simply false and their practices are unhelpful; the contents of their prescribed beliefs and practices matter.

Some Buddhist texts teach that there can be a solitary Buddha (pratyekabuddha), a person who has gained enlightenment by his own efforts, independently of Buddhist teaching. Such a person is outside of the tradition, yet obtains the cure taught by the tradition. This is an inclusivist view about getting the cure, and about central religious truths. There are even cases of “Buddhists seeking to turn devotees of other religious traditions into ‘anonymous Buddhists’ who worship Buddhist deities without realizing that this is the case.” (Burton 2010, 11)

d. Plural Salvations Inclusivism

Forming his views by way of a detailed critique of various core and identist pluralist theories, Baptist theologian S. Mark Heim (b. 1950) proposes what he calls “a true religious pluralism,” which is nonetheless best understood as a version of inclusivism, as it allows its proponent to maintain the superiority of her own religion. (Heim 1995, 7)

Heim notes that pluralists like Hick insist on one true goal or “salvation” which is achieved by all the equally valuable religions, a goal which is proposed by the pluralist and which differs from those proposed by most of those religions. Heim suggests that we should instead assume that other religions both pursue and achieve real and distinct religious “salvations” (goals or ends). For instance, as an inclusivist Christian, Heim holds that Buddhists really do attain Nirvana. But doesn’t Christian tradition demand that each person eventually either achieves fellowship or union with God, or is irrevocably damned? Heim suggests that those who attain Nirvana would be, from a Christian perspective, either a subgroup of the saved or of the damned, depending on just what, metaphysically, is actually going on with such people. (Heim 1995, 163) This is consistent with the Christian thinking that the end pursued by Christians is in fact better than all others; thus, heaven is better than Nirvana. However, God has ordained Nirvana as a goal suitable for some non-Christians to both pursue and attain. In this and in a later book Heim asserts that such a plurality of ordained religious goals is implied by the doctrine of the Trinity.

It is far from clear that Heim is correct that this stance will be consistent with the claims of the “home” religion. Importantly, he construes the various religious goals as “experiences” obtaining in this life and continuing beyond. (Heim 1995, 154-5, 161) This is an important qualifier, as various religious goals clearly presuppose contrary claims. For example, in Theravada Buddhism, one must realize that there is no self, whereas in Advaita Vedanta Hinduism one must gain awareness that one’s true self is none other than the ultimate reality, Brahman. Similarly, in Christianity, one must realize that one’s self is a sinner in need of God’s grace. It is impossible that all three experiences are veridical. But Heim’s theory does not require them to be, but only that they occur and may be plausibly thought of as fulfilling to those who have them.

Heim strenuously objects to pluralist theories that they impose uniformity on the various religions. However, his theory seems to depend crucially on the existence of many human problems, each of which may be solved by participation in some religion or other. In contrast, each of the various religions claims to have discerned the one fundamental problem facing humans, namely, the problem from which other problems derive. In the terms explained above, a religion claims to have a diagnosis (section 1 above). This seems incompatible with Heim’s agnosticism about which, if any, of the diagnosed problems is the fundamental one. If a religion cures only a shallow, derivative human problem, leaving the deeper problem intact, then what it offers would not deserve the name “salvation,” for it would leave those who achieve it still in need of the cure. (Peterson et al. 2013, 333) For instance, if Theravada Buddhism is correct that humans are trapped in the cycle of rebirth by craving and ignorance, even if one goes to a heavenly realm upon death, such as envisaged by non-Buddhist religions, one is still trapped in samsara, in this realm of suffering, albeit at a higher tier. How can a Theravada Buddhist accept that such a heavenly next life is a good and final end for non-Buddhists? Again, if a Christian diagnosis is correct, that humans are alienated from and need to be reconciled to God, yet some manage to attain Nirvana, they would still lack the cure, for it is no part of Nirvana that one is reconciled to God.

5. References and Further Reading

Abe, Masao. “A Dynamic Unity in Religious Pluralism: A Proposal from the Buddhist Point of View.” The Experience of Religious Diversity. Ed. John Hick and Hasan Askari. Brookfield, Connecticut: Gower Publishing Company, 1985. 167-227. Partially reprinted in Readings in Philosophy of Religion: East Meets West. Ed. Andrew Eshleman. Malden, Massachusetts: Blackwell, 2008. 395-404.
- Presents an ultimist pluralism modeled on the Mahayana Buddhist “three bodies of the Buddha” doctrine.
Bogardus, Tomas. “The Problem of Contingency for Religious Belief,” Faith and Philosophy 30.4 (2013): 371-92.
- Rebuts sophisticated arguments by Hick, Kitcher, and others, that you cannot know that some religious claim is true because had you been born in another place or time, you would not have believed that claim.
Burton, David. “A Buddhist Perspective.” The Oxford Handbook of Religious Diversity. New York: Oxford University Press, 2010. 321-36.
- Surveys Buddhist views on religious diversity.
Byrne, Peter. Prolegomena to Religious Pluralism: Reference and Realism in Religion. New York: St. Martin’s Press, 1995.
- Explores in depth without endorsing an identist religious pluralism.
Byrne, Peter. “It is not Reasonable to Believe that Only One Religion is True.” Contemporary Debates in Philosophy of Religion. Ed. Michael Peterson and Raymond VanArragon. Malden, MA: Blackwell, 2004. 201-10.
- Argues for an identist pluralism and against “confessionalism” (either inclusivism or exclusivism).
Cohn-Sherbok, Dan. “Judaism and Other Faiths.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005. 119-32.
- An overview of traditional, inclusivist Jewish views of other religions, then arguing that Jewish theology should be revised to accommodate an identist pluralism.
Cooper, John W. Panentheism: The Other God of the Philosophers: From Plato to the Present. Grand Rapids, Michigan: Baker Academic, 2006.
- Chapter 7 is an accessible introduction to the metaphysics and theology of Whitehead and Hartshorne, without which the ultimist pluralism of Cobb and Griffin can’t be understood.
Datta, Narendra [Swami Vivekananda]. “Hinduism.” The Penguin Swami Vivekananda Reader. Ed. Makarand Paranjape. New Delhi: Penguin Books India, 2005 [1893]. 43-55.
- One of several hit speeches given at the first World Parliament of Religions in Chicago in 1893, asserting that all religions share one object and goal, although Hinduism is more tolerant, peaceful, and flexible than other traditions.
de Cea, Abraham Vélez. “A Cross-cultural and Buddhist-Friendly Interpretation of the Typology Exclusivism-Inclusivism-Pluralism.” Sophia 50 (2011): 453-80.
- An attempt to clarify and expand the common trichotomy, adding a fourth category, “pluralistic inclusivism.”
Dupuis, Jacques. Toward a Christian Theology of Religious Pluralism. Maryknoll, New York: Orbis Books, 2001 [1997].
- Survey of the long evolution of Roman Catholic thought on religious diversity, arguing for an inclusivist theory.
Feuerbach, Ludwig. Lectures on the Essence of Religion. Translated by Ralph Manheim. New York: Harper and Row, 1967 [1851].
- A naturalistic, humanistic, atheistic critique of belief in God as a product of human desire and imagination; a form of negative pluralism.
Griffin, David Ray. “Religious Pluralism: Generic, Identist, Deep.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005a. 3-38.
- Surveys sophisticated recent pluralist theories by Hick, Smith, Knitter, Cobb, and criticisms of these. Argues for the superiority of his own ultimist (“deep”) religious pluralism.
Griffin, David Ray. “John Cobb’s Whiteheadian Complementary Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005b. 39-66.
- Presentation of ultimist pluralism as developed by Cobb and Griffin.
Hasker, William. “The Many Gods of Hick and Mavrodes,” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 186-98.
- Critical discussion of Hick’s views on the personae and impersonae of religious experience, which are supposed to be manifestations of the Ultimate Reality.
Heim, S. Mark. Salvations: Truth and Difference in Religion. Maryknoll, New York: Orbis, 1995.
- Critiques various pluralistic theories as insufficiently respectful of the real differences between religions and proposes a plural salvations inclusivism.
Hick, John. God Has Many Names. Philadelphia: The Westminster Press, 1982 [1980].
- A short book written at a crucial juncture in Hick’s thinking about religious diversity; probably the best place to start in understanding Hick’s views.
Hick, John. The Rainbow of Faiths [U.S. title: A Christian Theology of Religions: The Rainbow of Faiths]. London: SCM Press, 1995.
- A short and popular exposition and development of his mature views as expounded in his 1989 An Interpretation; mostly written in the form of imagined dialogues.
Hick, John. “The Epistemological Challenge of Religious Pluralism.” Faith and Philosophy 14.3 (1997): 277-86. Reprinted in Hick 2010, 25-36.
- Argues that religious exclusivism and inclusivism face devastating epistemological problems; see Hick 2010 for his exchanges with some leading Christian philosophers about this piece.
Hick, John. “Ineffability,” Religious Studies 36.1 (2000): 35-46. Reprinted in Hick 2010, ch. 3.
- Replies to criticisms of the ineffability or transcategoriality of “the Real” by Rowe and Insole.
Hick, John. An Interpretation of Religion: Human Responses to the Transcendent, 2nd ed. New Haven, Connecticut: Yale University Press, 2004 [1989].
- The main exposition of what is widely considered the best-developed pluralist theory (esp. ch. 14-16), espousing the practical equality of “post-axial” religions (ch. 2-4). Its long introduction summarizes his replies to many critics.
Hick, John, ed. Dialogues in the Philosophy of Religion. New York: Palgrave-MacMillan, 2010 [2001].
- Reprints and continues Hick’s exchanges in the late 1990s with a number of prominent philosophers and theologians.
Hick, John. “Response to Hasker.” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 199-201.
- Hick clarifies his claims regarding the personae and impersonae by means of which people interact with the Ultimate Reality.
Ignatius of Antioch, “The Letter of Ignatius to the Philadelphians,” in The Apostolic Fathers: Greek Texts and English Translations, 3rd ed. Ed. Michael W. Holmes. Grand Rapids, Michigan: BakerAcademic, 2007. 236-47.
- Early Christian writing, probably from the first half of the second century, in which the bishop says that followers of schismatic leaders will not be saved.
Kärkkäinen, Veli-Matti. An Introduction to the Theology of Religions. Biblical, Historical, and Contemporary Perspectives. Downers Grove, Illinois: InterVarsity Press, 2003.
- Wide-ranging discussion of Christian responses to religious diversity from biblical times up till the present, valuable for its summaries of ancient, early modern, and recent theological sources.
King, Nathan. “Religious Diversity and its Challenges to Religious Belief.” Philosophy Compass 3/4 (2008): 830-53.
- Lucid survey of varieties of exclusivism, inclusivism, the identist pluralism of John Hick, and epistemological difficulties arising from disagreements about religious matters.
Kiblinger, Kristin. Buddhist Inclusivism: Attitudes Towards Religious Others. Burlington, Vermont: Ashgate, 2005.
- Sympathetic description and criticism of Buddhist inclusivism by a non-Buddhist scholar.
Legenhausen, Hajj Muhammad [Gary Carl]. “Why I am not a Traditionalist.” 2002.
- Online overview of traditionalist core pluralism and critique from the perspective of Shia Islam.
Legenhausen, Hajj Muhammad [Gary Carl]. “A Muslim’s Proposal: Non-Reductive Religious Pluralism.” 2006.
- Insightful online article classifying theories of religious pluralism and arguing for what the author calls “non-reductive pluralism” (here described as an example of Abrahamic Inclusivism, 4b above) by a philosopher who is an American convert to Shia Islam.
Legenhausen, Hajj Muhammad [Gary Carl]. “On the Plurality of Religious Pluralisms.” International Journal of Hekmat 1 (2009): 6-42.
- The most comprehensive classification of varieties of theories of religious pluralism.
Long, Jeffery D. “Anekanta Vedanta: Towards a Deep Hindu Religious Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005. 130-57.
- Exploration of ultimist (“deep”) religious pluralism by a scholar who is an American convert to Hinduism; argues that whether or not “Hinduism” is pluralistic or inclusivist depends on whether it is understood as Vedic tradition, Indian tradition, or Sanatana Dharma [eternal religion].
Long, Jeffery D. “Universalism in Hinduism.” Religion Compass 5/6 (2011): 214-23.
- Survey of historical and recent pluralist theories in Hinduism (here called versions of “universalism”) and criticisms thereof.
Mawson, T.J. “‘Byrne’s’ religious pluralism.” International Journal for Philosophy of Religion 58.1 (2005): 37-54.
- Negative critique of the identist pluralism of Peter Byrne.
McDermott, Gerald and Netland, Harold. A Trinitarian Theology of Religions: An Evangelical Proposal. New York: Oxford University Press, 2014.
- A recent evangelical Protestant version of exclusivism, embedded in a Christian theology of religions.
Meeker, Kevin. “Exclusivism, Pluralism, and Anarchy.” God Matters: Readings in the Philosophy of Religion. Ed. Raymond Martin and Christopher Bernard. New York: Pearson, 2003. 524-35.
- Shows how versions of exclusivism, inclusivism, and pluralism can be viewed on a continuum, as excluding more or fewer religions; contrasts Hick’s “altruistic pluralism” with a more open-minded but less plausible “anarchic pluralism.”
Morales, Frank [Sri Dharma Pravartaka Acharya]. Radical Universalism: Does Hinduism Teach That All Religions Are the Same? New Delhi: Voice of India, 2008.
- American-born Hindu teacher and scholar argues that the pluralism of Datta [Vivekananda] and Chattopadhyay [Ramakrishna] is incoherent, inconsistent with the facts of religious diversity, foreign to Hinduism, relativistic, intolerant, destructive of Hinduism, and based on misinterpretations of Hindu scriptures.
Netland, Harold. Encountering Religious Pluralism: The Challenge to Christian Faith and Mission. Downers Grove, Illinois: InterVarsity Press, 2001.
- A defense of Christian “particularism” (compatible with the descriptions “exclusivism” or “inclusivism” in this article) by an evangelical theologian, with summaries of earlier missionary literature and criticisms of pluralist theories.
Neuner, Josef and Dupuis, Jacques, eds. The Christian Faith in the Doctrinal Documents of the Catholic Church, Seventh Revised and Enlarged Edition. Bangalore: St. Peter’s Seminary, 2001.
- Collection of Roman Catholic primary sources, including documents relating to the uniqueness of Catholicism, the idea that there is no salvation outside the church, and views on non-Catholic Christianity and non-Christian religions.
O’Connor, Timothy. “Religious Pluralism.” Reason for the Hope Within. Ed. Michael Murray. Grand Rapids, Michigan: Eerdmans, 1999. 165-81.
- Defends “exclusivism” (rejection of any pluralism) against arguments that it is arbitrary, arrogant, or irrational, and argues that Hickian identist pluralism is incoherent.
Peterson, Michael, et al. Reason and Religious Belief, 5th ed. New York: Oxford University Press, 2013.
- Leading philosophy of religion textbook with excellent chapter (14) on pluralism, exclusivism, and inclusivism.
Plantinga, Cornelius, ed. Christianity and Plurality: Classic and Contemporary Readings. Malden, Massachusetts: Blackwell, 1999.
- Collection of Christian documents concerning religious diversity, starting with the Bible and ending with a statement by Pope John Paul II.
Prothero, Stephen. God is Not One. The Eight Rival Religions that Run the World – and Why Their Differences Matter. New York: HarperOne, 2010.
- Introductory overview of eight religious traditions which aims to undermine “the new orthodoxy” of naive or core pluralism.
Quinn, Philip L. and Kevin Meeker, eds. The Philosophical Challenge of Religious Diversity. New York: Oxford University Press, 2000.
- Important anthology of philosophical pieces, largely consisting of attacks on and defenses of Hick’s identist pluralism.
Rowe, William. “Religious Pluralism.” Religious Studies 35.2 (1999): 139-50.
- Argues that Hick’s central claim that the Ultimate Reality is ineffable is incoherent.
Schmidt-Leukel, Perry. “Exclusivism, Inclusivism, Pluralism: The Tripolar Typology – Clarified and Reaffirmed.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005, 13-27.
- Catalogues and responds to the many objections various authors have given to the standard trichotomy, and precisely defines it in terms of giving knowledge sufficient to give people the cure which religion offers.
Sedgwick, Mark. Against the Modern World: Traditionalism and the Secret Intellectual History of the Twentieth Century. New York: Oxford University Press, 2004.
- Intellectual history of “traditionalism” or “perennialism” (core pluralism).
Sharma, Arvind. “All religions are – equal? one? true? same?: a critical examination of some formulations of the Neo-Hindu position.” Philosophy East and West 29.1 (January 1979): 59-72.
- An attempt to clarify the sort of pluralism popular in Hinduism since Datta.
Sharma, Arvind. A Hindu Perspective on the Philosophy of Religion. London: MacMillan, 1990.
- Chapter 9 is a basic introduction to the pluralistic orientation of modern Hinduism, interacting with a few western scholars (W.A. Christian, W.C. Smith, and Hick).
Sharma, Arvind. The Concept of Universal Religion in Modern Hindu Thought. New York: St. Martin’s Press, 1999.
- Surveys the views of leading 19th and 20th century Hindu intellectuals on the theme of “universal religion,” which can be an (alleged) fact or an unrealized ideal; some versions of religious pluralism are discussed.
Smart, Ninian. Dimensions of the Sacred: An Anatomy of the World’s Beliefs. Berkeley, California: University of California Press, 1996.
- Presents a seven-fold analysis of the different aspects of religious traditions.
Smith, Huston. Forgotten Truth: The Common Vision of the World’s Religions. San Francisco: HarperSanFrancisco, 1992 [1976].
- Thorough presentation of a core pluralism, as a part of what he calls “perennial philosophy,” or “traditionalism.”
Smith, Huston. Beyond the Postmodern Mind: The Place of Meaning in a Global Civilization, Updated and Revised. Wheaton, Illinois: Quest Books, 2003 [1982].
- Further exposition of Smith’s “perennial philosophy,” put in the context of his diagnoses of the historical mistakes of “Modern” and “Postmodern” thinking, and his practical suggestions for the future.
Smith, Huston. “No Wasted Journey: A Theological Autobiography.” The Huston Smith Reader. Ed. Huston Smith and Jeffrey Paine. Berkeley: University of California Press, 2012, 3-12.
- Smith explains his journey from Christian to religious naturalist to eclectic seeker to perennialist core pluralist.
Stenmark, Mikael. “Religious Pluralism and the Some-Are-Equally-Right View.” European Journal for Philosophy of Religion 2 (2009): 21-35.
- Articulates the position named in the title as an alternative to the exclusivism-inclusivism-pluralism trichotomy, in part motivated by what he calls “the problem of emptiness” for Hick’s pluralism – roughly, that his (nearly) inconceivable Real is irrelevant to any religious concerns.
Tanner, Norman, ed. Decrees of the Ecumenical Councils, Volume One: Nicea I to Lateran V. Washington, D.C.: Georgetown University Press, 1990.
- First of two volumes with the official original language texts and English translations of all twenty-one official councils recognized by the Roman Catholic church; the “Bull of union with the Copts” from the council of Florence (1442) expresses an exclusivist stance.
Yandell, Keith. Philosophy of Religion: A Contemporary Introduction. New York: Routledge, 1999.
- Focuses on the differences between monotheistic religions, Advaita Vedanta Hinduism, Jainism, and Theravada Buddhism, with a thorough critique of Hick’s pluralism (ch. 6).
Yandell, Keith. “Has Normative Religious Pluralism a Rationale?” Can Only One Religion Be True? Paul Knitter and Harold Netland in Dialogue. Ed. Robert B. Stewart. Minneapolis: Fortress Press, 2013, 163-79.
- Spirited attack on pluralist theories as poorly motivated and inconsistent with traditional religious beliefs.

Author Information

Dale Tuggy
Email: filosofer@gmail.com
State University of New York at Fredonia
U. S. A.

Adolf Lindenbaum (1904-1941)

Adolf Lindenbaum was a Polish mathematician and logician who worked in topology, set theory, metalogic, general metamathematics and the foundations of mathematics. He represented an attitude typical of the Polish Mathematical School, consisting of using all admissible methods, independently of whether they were finitary. For example, the axiom of choice was freely applied, but on the other hand, proofs omitting this axiom were welcomed. In set theory, Lindenbaum and Tarski posed an important conjecture that the generalized continuum hypothesis entails the axiom of choice. Among the most important metalogical and metamathematical results obtained by Lindenbaum are the following: every system of propositional calculus has an at most denumerably infinite normal matrix; the construction of the so–called Lindenbaum algebra; and the maximalization theorem.

Lindenbaum studied mathematics under Wacław Sierpiński, Stefan Mazurkiewicz and Kazimierz Kuratowski in Warsaw. As part of the Lvov–Warsaw School, formed by a powerful Polish group of analytic philosophers, Lindenbaum belonged to the Polish mathematical school and the Warsaw school of logic. He began his career as a topologist, and his doctoral dissertation, written under Sierpiński, was devoted to properties of point–sets. Then in the mid–1920s he switched to logic and joined the Warsaw School of Logic that was established by Jan Łukasiewicz and Stanisław Leśniewski after World War I. Lindenbaum was a close friend and collaborator of Alfred Tarski.

Biography
A General Outline of Lindenbaum’s Scientific Career and His Views
Lindenbaum and Set Theory
Lindenbaum and Logical Calculi
Lindenbaum and General Metamathematics
Final Remarks
References and Further Reading

1. Biography

Adolf Lindenbaum was born into an assimilated (that is, polonized), wealthy Jewish family in Warsaw on June 12, 1904. He received his secondary education at M. Kreczmar’s Gymnasium in Warsaw from 1915 to 1922, and then studied mathematics at Warsaw University from 1922 to 1926. There his teachers included Kazimierz Kuratowski, Stefan Mazurkiewicz and Wacław Sierpiński in mathematics as well as Stanisław Leśniewski and Jan Łukasiewicz in logic. Lindenbaum also attended Alfred Tarski’s courses on cardinal numbers and elementary mathematics, Tadeusz Kotarbiński’s course in logic, and some classes in the humanities. The latter included history of philosophy, Władysław Tatarkiewicz’s special class on Kant, Tatarkiewicz’s course on French art, a course in psychology by Władysław Witwicki, a course on linguistics by Karol Appel, courses on literature by Józef Ujejski, including one on Adam Mickiewicz, the most important Polish national poet, and a course on the history and culture of Palestine by Moses Schorr.

Sierpiński supervised Lindenbaum’s Ph.D. dissertation entitled O Własnościach Mnogości Punktowych (On Properties of Point–Sets). The thesis was defended in 1928, and Lindenbaum received the title of Doctor of Philosophy. In 1934 he presented his Habilitation thesis, based on several published papers, to the Faculty of Mathematics and Natural Sciences at Warsaw University and obtained the degree of Docent (a person who could lecture). This resulted in his appointment as adjunct professor at the Philosophical Seminar at the same faculty. The Philosophical Seminar, an independent unit at the Faculty of Mathematics and Natural Sciences directed for years by Łukasiewicz, was in effect the Logical Seminar. Lindenbaum lectured on various mathematical and logical topics from 1935 to 1939. His courses, for example, concerned the following topics: “On New Investigations into the Foundations of Mathematics and the Mathematical Foundations of other Disciplines”, “On Superposition of Functions” and “Selected Topics from Metrology and from the Theory of Functions”. He stood little chance of being promoted to an academic position higher than docent because of the anti–Semitic policy in Polish universities after 1935, his involvement in the communist movement, and the shortage of university positions at the time. He was also a tutor at the Scientific Circle of Jewish Students at Warsaw University, established after the exclusion (in the 1890s) of the Jews from general students’ associations existing in Poland.

Lindenbaum was a typical, good–looking bon vivant. He married a beautiful Jewish woman, Janina Hosiasson, who also was a logician and who successfully worked on induction and confirmation. She and Adolf became separated at the beginning of World War II (as Janina Hosiasson informed Alfred Tarski in one of her letters in early 1941, a letter that is in the Tarski Archive in U.C. Berkeley’s Bancroft Library).

Lindenbaum was a declared leftist. He belonged to the Polish Communist Party (KPP) until its dissolution by Stalin in 1938; he was also an activist in intelligentsia circles. In 1936 Lindenbaum signed a petition demanding that Carl von Ossietzky, a journalist imprisoned by the Nazis, be awarded the Nobel Peace Prize, and he protested against the massacre of workers in Lvov in 1936. Mrs. Janina Kotarbiński, the wife of Tadeusz Kotarbiński, who was a close friend of the Lindenbaums, reported the following story:

It happened that I and Antoni Pański [also a philosopher – J. W.] visited the Lindenbaums in their apartment. I noticed on Dolek’s [the diminutive of Adolf – J. W.] desk the Short Philosophical Dictionary [the dictionary written by Pavel Yudin and Mark Rozental and published in Russian in the Soviet Union at the beginning of the 1930s. The authors represented the Stalinist version of Marxism. The Short Philosophical Dictionary became a symbol of the orthodox Marxist ideology – J. W.]. The book was opened at the entry ‘Dialectical Contradiction’. I was surprised, and on our way back I asked Pański why Dolek, such a clever person, read such stupidities about the concept of contradiction. Pański answered that Dolek believed in every word of this book.

She added, however, that Lindenbaum had a considerable interest in philosophy without any kind of dogmatism. He was ready for an open discussion on any philosophical issue. Lindenbaum was strongly interested in literature and art, and was famous as a passionate climber.

After the outbreak of World War II, Lindenbaum immediately realized that for his own safety he should escape before the German army took Warsaw. His Jewish origin was probably not the only reason behind this decision. More importantly, it was obvious that the Germans possessed lists of Polish communists and other critics of Hitler’s regime. Lindenbaum was particularly afraid of the consequences of the support he had given to Ossietzky, because such actions were strongly, even furiously, criticized in Germany. Although Lindenbaum could have emigrated to the West or obtained a position in Moscow as a scientist, he decided to remain in Poland, because he had hope in the socialist future of the country. The Lindenbaums left Warsaw on September 6, 1939 and went to Vilna (presently Vilnius); this city, formerly in Poland, became the capital of Lithuania after September 17, 1939. Lithuania was an independent country in 1918–1939 with Kaunas as its capital; in 1939–1941 Lithuania was formally independent, but entirely dependent on the Soviet Union. Germany occupied it after attacking the Soviet Union on June 22, 1941. Janina remained in Vilnius, but Adolf moved to Bialystok, a city occupied by the Soviet Army after its invasion of Poland on September 17; he probably expected this part of Poland to become the center of the future Polish communist state. Lindenbaum was appointed as a docent in the Bialystok Pedagogical Institute established by the Soviet authorities, and he taught mathematics there. The German–Soviet War started on June 22, 1941, and the Germans soon reached Bialystok. The reasons why Lindenbaum did not leave the city are unknown. In September 1941, he was arrested by the Gestapo, transported to Vilnius, and killed in Ponary (the site of many massacres, particularly of Jews, in 1941–1944) near the Lithuanian capital. Janina Hosiasson–Lindenbaum was murdered in Vilnius in 1942. The exact dates of the deaths of the Lindenbaums remain unknown.

See Mostowski–Marczewski 1971, Surma 1982, and Zygmunt–Purdy 2014 for more biographical data on Lindenbaum.

2. A General Outline of Lindenbaum’s Scientific Career and His Views

Lindenbaum began his scientific career as a topologist, but he soon converted to logic and the foundations of mathematics. He and Tarski (three years older) became friends in the early 1920s, and the latter steered Lindenbaum toward mathematical logic and the foundations of mathematics. They shared not only scientific interests, but also a negative attitude to any version of religion, various leftist political ideas, and a love of mountains and literature. Both were also very sensitive to their fate as secular Jews and to their aspiration to be assimilated into and accepted by Polish society; neither of them, however, was successful in this last respect. At the beginning of his scientific career, Lindenbaum was very active in the Student Mathematical Scientific Circle as well as in the Student Philosophical Circle, and he successfully promoted logic among his colleagues in both groups. In particular, he delivered several lectures on logical problems at the meetings of both circles. He was a mentor in logic to students in Warsaw. For instance, he wrote a section on logic in the Mathematical-Physical Study: Information Book for Newcomers, published in 1926, in which he reported how logic was taught in Warsaw. It is a very interesting document which shows how strong logic was in Warsaw in the mid–1920s. It is striking that Principia Mathematica was recommended as a textbook for advanced students. Lindenbaum’s brilliant personality, charming style of life and powerful mathematical skills fascinated the Warsaw scientific community. Not surprisingly, he was commonly considered one of the most gifted Polish mathematicians of his generation. Tarski (1949, p. XII) described Lindenbaum as “a man of unusual intelligence”. Mostowski once called Lindenbaum the most lucid mind in the foundations of mathematics.
Legendary stories told by Lindenbaum’s friends and colleagues document many cases of theorems discovered by him but proved by someone else, as he had no time to complete his ideas. Yet the list of his scientific contributions is quite long; it comprises more than 40 papers, abstracts and reviews (see Surma 1982 for Lindenbaum’s bibliography), mostly published in German and French. Some of Lindenbaum’s papers were co–authored, in particular with Tarski and Andrzej Mostowski. Lindenbaum and Tarski worked on the book Theorie der Eindeutigen Abbildungen. It was announced as volume 8 of the series “Mathematical Monographs” to be published in 1938. This information is given on the back cover of S. Saks, Theory of the Integral, Monografie Matematyczne, Warszawa–Lwów 1937 (co–published by G. E. Stechert, New York). Saks’ book appeared as volume 7 of this series. This suggests that the book by Lindenbaum and Tarski was close to completion. Lindenbaum was also well regarded internationally. Leading logicians and mathematicians, including Wilhelm Ackermann, Friedrich Bachmann, Abraham Fraenkel, Andrei Kolmogoroff and Arnold Schmidt, reviewed his writings. Lindenbaum actively participated in the Polish Mathematical Congresses and in the International Congress of Scientific Philosophy held in Paris in 1935.

Lindenbaum belonged to three scientific schools. One affiliation was with the Polish Mathematical School with Sierpiński, Mazurkiewicz and Kuratowski as its leaders in Warsaw. The second affiliation was with the Warsaw School of Logic (Łukasiewicz, Leśniewski and Tarski, the last of whom joined the School’s leading figures in the 1920s; see Woleński 1995 for a general presentation of logic in Poland in the interwar period). The third affiliation was with the Lvov–Warsaw School (Kazimierz Twardowski and his disciples from Lvov, in particular, Łukasiewicz, Leśniewski and Kotarbiński).

The second affiliation is perhaps the most important. The Warsaw School of Logic was a “child” of mathematicians and philosophers. Łukasiewicz and Leśniewski, the main figures in this group, were philosophers by training. Nevertheless, they became professors at the Faculty of Mathematics and Natural Sciences at Warsaw University and were active in the mathematical environment. Zygmunt Janiszewski, another founding father of the Polish Mathematical School, who died prematurely before Lindenbaum entered the university, developed the so–called Janiszewski program, a very ambitious plan for the development of mathematics in Poland that attributed crucial significance to logic and the foundations of mathematics. The Fundamenta Mathematicae, a journal established by Janiszewski and serving from its inception as an official scientific journal of the Polish Mathematical School, published many papers by logicians. It is noteworthy that Mazurkiewicz, Sierpiński, Leśniewski, and Łukasiewicz (two professional mathematicians and two logicians originating from philosophy) formed the Editorial Board of the Fundamenta. It was important for the subsequent rapid development of logic in Warsaw that the mathematical milieu accepted philosophers as professional teachers of students of mathematics.

This double heritage, philosophical as well as mathematical, determined the scientific ideology of Warsaw logicians. One point is particularly important in this context: the Polish Mathematical School did not assume any specific philosophical standpoint concerning the nature of mathematics. Methodologically speaking, all fruitful mathematical methods, particularly those coming from set theory, could be used in logical investigations, provided that they did not lead to contradictions. The last statement should be understood in the following way. Proofs of consistency are clearly important and required; however, even before Gödel’s second incompleteness theorem (roughly speaking, that the consistency of arithmetic cannot be proved in arithmetic itself) was announced, Polish mathematicians maintained that if a theory of given concepts is known from experience to be contradiction–free, it can be safely used in mathematical investigations, including logical research. As a consequence, the Polish Mathematical School did not subscribe to logicism, formalism or intuitionism, even though these were widely regarded as the main foundational currents in the philosophy of mathematics in 1900–1930. On the other hand, Polish logicians worked on many problems suggested by Russell, Hilbert and Brouwer, the main exponents of the schools just mentioned. Moreover, there was sometimes a tension between the private, so to speak, philosophical views of some Polish logicians and their research practices. For instance, Tarski, influenced by Leśniewski’s and Kotarbiński’s nominalism, expressed explicit sympathy with this view, but on the other hand, he did not hesitate to use higher set theory and inaccessible cardinals in the foundations of mathematics.

There is practically nothing known about Lindenbaum’s philosophical views concerning mathematics. Clearly, his inclinations to dialectical materialism had no influence on his philosophy of mathematics and foundational views. In fact, he shared the general attitude of the Polish Mathematical School mentioned above. Lindenbaum published some papers directly related to general foundational problems (for example, Lindenbaum 1930, Lindenbaum 1931) in which he recommended the use of mathematics in logical investigations without any hesitation about employing infinitary methods. He pointed out that such methods were present even in elementary arithmetic. He also published reviews of works by Polish radical nominalists, such as Leon Chwistek and Władysław Hetper, but he entirely abstained from philosophical comments. Two papers, Lindenbaum 1936 and Lindenbaum–Tarski 1934–1935, deserve particular mention. The former paper is Lindenbaum’s contribution to the Paris Congress in 1935. It concerns the formal simplicity of concepts. Although Lindenbaum points out that this question arises in many fields, he does not offer any general definition of simplicity. Lindenbaum distinguishes seven relevant problems concerning simplicity: (a) of systems of concepts (terms); (b) of propositions and their systems; (c) of inference rules; (d) of proofs; (e) of definitions and constructions; (f) of deductive theories; (g) of formal languages. He restricts his further considerations to (a). The idea is that counting the letters occurring in a given concept can tell us about the simplicity of the term. Lindenbaum follows Leśniewski in this respect. Two points are interesting. First, Lindenbaum assumes the simple theory of types. This suggests that he preferred a more elementary construction whenever it was possible and adequate for a given problem. This attitude was also very popular among Warsaw mathematicians and logicians.
Second, Lindenbaum observes, obviously under Tarski’s influence, that simplicity has not only a syntactic dimension (in Carnap’s sense), but it should also be considered semantically.

The paper Lindenbaum–Tarski 1934–1935 basically concerns some metamathematical problems about the limitations of the means of expression (expressive power, to use present terminology) in deductive theories. The authors claim that all relations between objects (individuals, classes, relations, and so forth) expressible by purely logical means remain invariant under an arbitrary one–one mapping of the “world” (that is, the collection of all individuals) onto itself. Moreover, this invariance is logically provable. This idea was more fully developed by Tarski in his paper on logical concepts (see Tarski 1986). In a sense, the understanding of logical concepts as invariant under all one–one mappings has affinities with the Erlangen Program in the foundations of geometry formulated by Felix Klein (Tarski stressed this point). Lindenbaum and Tarski proved a general theorem justifying the intuitive explanation of logical relations as invariant under one–one mappings. The consequences of this approach to logical concepts for the philosophy of logic are far–reaching. In particular, the Lindenbaum–Tarski definition of logical concepts motivates the theorem that logic does not distinguish any extralogical concept (roughly speaking, what can be proved in logic about an extralogical item, for instance, an individual, can be proved about any other individual). Consequently, logical theorems are true in all possible worlds (models). Thus, the definition in question naturally leads to seeing logic as invariant with respect to any specific content.
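
The invariance condition can be illustrated on a very small scale. The following Python sketch is not taken from Lindenbaum and Tarski's paper; it assumes a finite "world" of three individuals, considers only binary relations between individuals, and uses a hypothetical function name, invariant_binary_relations. It enumerates every binary relation on the world and keeps those that are mapped onto themselves by every one–one mapping of the world onto itself.

```python
from itertools import permutations, product


def invariant_binary_relations(domain):
    """Return every binary relation on `domain` that is preserved by all
    one-one mappings (permutations) of `domain` onto itself.

    Illustrative helper only; the name and the finite setting are assumptions.
    """
    elems = list(domain)
    pairs = list(product(elems, repeat=2))
    # Each permutation is stored as a dict sending an element to its image.
    perms = [dict(zip(elems, image)) for image in permutations(elems)]
    found = []
    # Enumerate all subsets of domain x domain via bitmasks.
    for mask in range(2 ** len(pairs)):
        rel = frozenset(p for i, p in enumerate(pairs) if (mask >> i) & 1)
        # The relation is invariant when its image under each permutation
        # coincides with the relation itself.
        if all(frozenset((f[a], f[b]) for a, b in rel) == rel for f in perms):
            found.append(rel)
    return found


relations = invariant_binary_relations(range(3))
for rel in sorted(relations, key=len):
    print(sorted(rel))
```

In this toy setting only four of the 512 binary relations survive: the empty relation, identity, diversity (non-identity), and the universal relation. None of them singles out a particular individual, which is the point of the remark that logic does not distinguish any extralogical item.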

Lindenbaum’s works that are related to logic and the foundations of mathematics concern set theory and logical calculi, including their metalogical properties. This article deals with general set theory in section 3, while sections 4 and 5 are devoted to logical matters. Section 3 skips special topics, including those belonging to other mathematical fields; the borderline between general and special set theory is somewhat arbitrary. Some of Lindenbaum’s results are also mentioned without entering into formal details.

Lindenbaum’s results in set theory were achieved in collaboration with other authors, particularly Tarski and Mostowski. In the 1920s and 1930s Łukasiewicz conducted a seminar in mathematical logic. Its participants were a group of young logicians including, in addition to Lindenbaum and Tarski, Stanisław Jaśkowski, Andrzej Mostowski, Jerzy Słupecki, Bolesław Sobociński and Mordechaj Wajsberg. This seminar soon became a factory of new results in mathematical logic, and its participants worked on problems collaboratively. Lindenbaum’s results about logical calculi, as Łukasiewicz explicitly says, were achieved at this seminar. Lindenbaum frequently stated theorems, usually without proofs. His most important results were mentioned by others; some of them are to be found in Łukasiewicz–Tarski 1930 (referred to in this article as Ł–T1930).

3. Lindenbaum and Set Theory

In 1926, Lindenbaum and Tarski published the joint paper “Communication sur les Recherches de la Théorie des Ensembles” (Lindenbaum–Tarski 1926; the abbreviation L–T1926 is used in further references). This paper is very compact. In 30 pages the authors announced many results in set theory and its applications that they had achieved within the “last few years.” More particularly, the theorems and definitions concerned cardinal and ordinal numbers, the relations between them, and the theory of one–one mappings. The results were stated without proofs. The authors noted that proofs and further developments would appear in subsequent writings. (Perhaps the already mentioned monograph Theorie der Eindeutigen Abbildungen was intended as a continuation of L–T1926; several related results are contained in Tarski 1949; Lindenbaum is mentioned in the Preface to this book as a person particularly effective in conducting research on cardinal numbers.) The spirit of the Polish Mathematical School is evident in this paper. The purely mathematical text is interrupted by historical and methodological comments; Polish mathematicians considered (and they still do) such remarks as a very important feature of mathematical prose. Due to the role of the axiom of choice (AC) in set theory and its controversial nature, results obtained without use of this axiom are grouped in a section different from the sections containing the theorems based on AC. The paper lists 102 theorems or lemmas about cardinal numbers, 5 theorems or lemmas about properties of one–one mappings, 16 theorems or lemmas about order types, 4 theorems on the arithmetic of ordinal numbers, and 19 theorems or lemmas on point–sets. In many cases, the investigations by Lindenbaum and Tarski continue the earlier works and achievements by Cantor, Dedekind, Bernstein, Fraenkel, Hartogs, Korselt, König, Lebesgue, von Neumann, Russell and Whitehead, Schröder, Zermelo, Banach, Kuratowski, Leśniewski and Sierpiński. In a sense, L–T1926 can serve as a very important historical report on the state of the art in set theory and its foundational problems in the mid–1920s.

Perhaps the most important result (theorem 94) announced in L–T1926 concerns the generalized continuum hypothesis (GCH) and AC. Lindenbaum posed the problem of how AC (one of the main focuses of the Polish Mathematical School) is related to Cantor’s hypothesis on alephs (the name of GCH used in L–T1926). Theorem 94 states that GCH entails AC. This result was proved by Sierpiński in 1947 (see Sierpiński 1965, pp. 43–44, Moore 1982, pp. 215–217 for a brief survey). The search for equivalents of AC became one of the Polish mathematical specialties de la maison. Lindenbaum (L–T1926, theorem 82(L)) claimed that AC is equivalent to the assertion that for arbitrary cardinal numbers m and n, m ≤* n or n ≤* m (the symbol ≤* expresses the relation between cardinal numbers m and n such that either m = 0 or every set of power n is the sum of m mutually disjoint non–empty sets). This theorem was finally proved by Sierpiński in 1949 (see Sierpiński 1965, pp. 435–436). L–T1926 also presents the material on the Cantor–Bernstein theorem (CBT; Lindenbaum and Tarski used the label “the Schröder–Bernstein theorem”), which says that for any cardinal numbers m, n, if m ≤ n and n ≤ m, then m = n (see Hinkis 2013 for a very detailed historical exposition). In particular, Lindenbaum proposed some equivalents of CBT also in terms of order–types of one–one mappings. One of the equivalents of CBT is the following proposition: for any order–types α, β, γ, δ, if α = β + γ and γ = α + δ, then α = γ (via one–one mappings: if a well-ordered set X is similar to a segment of an ordered set B, and B is similar to a residue of A, then both sets A and B are similar). Another interesting result related to the Bernstein Division Theorem (BDT) says that for any natural number k and any cardinal numbers m, n, if km = kn, then m = n. L–T1926 claims that Lindenbaum proved BDT in its full generality and did so without AC. Yet there is a slight historical controversy concerning the scope of Bernstein’s original proof (see Hinkis 2013, p. 139). Leaving this question aside, there is no doubt that L–T1926 played an important role in the development of the foundations of set theory.

Finally, consider the paper by Lindenbaum and Mostowski (Lindenbaum–Mostowski 1938) on the independence of AC from other axioms of Zermelo–Fraenkel set theory (ZF). Abraham Fraenkel claimed that he proved the independence of AC (in its standard version postulating the existence of a choice set for any family of non–empty and disjoint sets) and its two equivalents (every set can be ordered; if for every family of finite and mutually disjoint sets, a choice set exists, then there exists a choice set for every countable family of sets with mutually disjoint sets as elements) from ZF. Lindenbaum and Mostowski remark that Fraenkel’s investigations, though of great value, cannot be regarded as fully successful “because a dangerous confusion of metamathematical and mathematical notions is inherent in it”. More specifically, Fraenkel’s notions of model and function are obscure. Lindenbaum and Mostowski propose to understand the concept of function in a semantic manner, that is, as determined by the formula “a set satisfies a propositional function”. Moreover, they point out an error in Fraenkel’s proof consisting in his treatment of permutations. Lindenbaum and Mostowski propose a modification of Fraenkel’s construction in order to correct his proof. This is done by improving the axiomatization with respect to the axioms of separation, replacement and infinity. This step allows for proving that other equivalents of AC are also not derivable in ZF. The authors observe that ZF remains (relatively) consistent if we add the axiom “there exists an infinite set having not–sets as elements”. Moreover, the results cannot be obtained in the system in which sets are admitted as the only elements. These results were later completed by Mostowski. The models of the improved ZF with Urelements (items not being sets) are called the Fraenkel–Mostowski models. Perhaps a particularly interesting feature of Lindenbaum–Mostowski 1938 consists in the conscious use of semantic methods in the metamathematics of set theory.

4. Lindenbaum and Logical Calculi

This section reviews Lindenbaum’s results in the metalogic of propositional calculus. They were announced at Łukasiewicz’s seminar in mathematical logic in 1926–1930. Łukasiewicz initiated a research program devoted to propositional (sentential) logic, classical as well as many–valued (the results are collected in Ł–T1930). Łukasiewicz’s group pursued two main directions of investigation. First, constructions of sentential logic as axiomatic systems were undertaken. Łukasiewicz and his students looked for independent and possibly economic axioms (economic in the sense of having the minimal number of axioms consisting of the smallest number of symbols). Second, Warsaw logicians, mostly Tarski, invented and systematized the basic metalogical tools for investigations of propositional calculus. Lindenbaum, contrary to the majority of the members of Łukasiewicz’s seminar, had no interest in proposing new axiomatizations of logic. His activities belonged entirely to metalogic. More specifically, Lindenbaum contributed to the matrix method, that is, to investigating sentential logic via logical matrices, which are generalizations of the well–known truth–tables.

Abstractly speaking (note that the level of abstraction is higher in Ł–T1930), a logical matrix is an ordered quadruple M = [U, U’, f, g] such that U and U’ are two arbitrary sets (in order to exclude trivial cases, both sets are assumed to be non–empty), U has at least two elements, U’ ⊂ U, f is a binary function, and g is a unary function. Both functions are defined on U and take values from U. The intended interpretation is as follows: U – the set of logical values, U’ – the set of designated logical values. In the case of two–valued logic (1 – truth, 0 – falsehood), U = {1, 0} and U’ = {1}. Assume that L is a language of sentential logic. If A ∈ L, then f(A), g(A) ∈ U. Intuitively speaking, f and g are valuation functions defined for formulas of L taking values from U; if language is not taken into account, M is an algebra of (logical or other) values. Write v(A) for “v is a logical value of A in M”. The matrix M is normal provided that if v(A) ∈ U’ and v(B) ∈ U, then v(A, B) ∈ U. Roughly speaking, values of compound formulas always belong to U, independently of whether their constituents are valued by designated values or other (undesignated) values. If A ∈ L and for every w, w(A) = 1, then A is a tautology in M (A is verified by M). We can consider g as the function corresponding to negation and f as the counterpart of implication. Thus, v(¬A) = 1, if v(A) = 0, v(¬A) = 0, if v(A) = 1; v(A ⇒ B) = 1, if v(A) = 0 or v(B) = 1, otherwise v(A ⇒ B) = 0. These equalities show that truth–tables for implication and negation are special cases of logical matrices in the abstract sense.
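The two–valued case can be made concrete with a short Python sketch. It is only an illustration and not a construction taken from Ł–T1930: the tuple encoding of formulas and the names `value` and `is_tautology` are assumptions introduced here, and 1 is taken as the only designated value.

```python
from itertools import product

# Two-valued matrix: U = {0, 1}; the designated values are U' = {1}.
U = (0, 1)
DESIGNATED = {1}

def f(x, y):
    """Counterpart of implication: 0 only when x = 1 and y = 0."""
    return 0 if (x == 1 and y == 0) else 1

def g(x):
    """Counterpart of negation."""
    return 1 - x

# Hypothetical encoding: a variable is a string, ("not", A) is a
# negation and ("imp", A, B) is an implication.
def variables(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(variables(part) for part in formula[1:]))

def value(formula, valuation):
    """Value of a formula in the matrix under an assignment to its variables."""
    if isinstance(formula, str):
        return valuation[formula]
    if formula[0] == "not":
        return g(value(formula[1], valuation))
    if formula[0] == "imp":
        return f(value(formula[1], valuation), value(formula[2], valuation))
    raise ValueError("unknown connective: " + repr(formula[0]))

def is_tautology(formula):
    """A formula is verified by the matrix if every assignment gives a designated value."""
    names = sorted(variables(formula))
    return all(
        value(formula, dict(zip(names, vals))) in DESIGNATED
        for vals in product(U, repeat=len(names))
    )

peirce = ("imp", ("imp", ("imp", "p", "q"), "p"), "p")
print(is_tautology(peirce))             # True
print(is_tautology(("imp", "p", "q")))  # False
```

Replacing `U`, `DESIGNATED`, `f` and `g` by other finite sets and operations gives a checker for other finite matrices, which is the sense in which matrices generalize truth–tables.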

Lindenbaum established several important theorems connecting propositional calculi with logical matrices (they are listed in a different order than the theorems appear in Ł–T1930; this paper contains no proofs). Let the symbol LOG_ℵ₀ refer to a many–valued logic with a denumerably infinite set of logical values. Łukasiewicz defined matrices for such a system. Lindenbaum (theorem 16 in Ł–T1930) established that LOG_ℵ₀ can be characterized by a matrix in which U’ = {1}, functions f and g satisfy the conditions f(x, y) = min(1, 1 – x + y), g(x) = 1 – x and U is an arbitrary infinite set of numbers which satisfies the condition 0 ≤ x ≤ 1 for any element of U and is closed under the operations f and g. Lindenbaum also proved (theorem 19 in Ł–T1930) the following logico–arithmetical result (the converse of a result obtained earlier by Łukasiewicz): for 2 ≤ m < ℵ₀ and 2 ≤ n < ℵ₀, we have the equivalence LOG_m ⊆ LOG_n if and only if n – 1 is a divisor of m – 1. The next theorem, established by Lindenbaum for n = 3 and generalized by Tarski, says that if n – 1 is a prime number, then there are only two systems, L (the entire language) and LOG₂, which contain LOG_n as a proper part. Lindenbaum also proved (Ł–T1930, theorem 23) that every logic LOG_n is axiomatizable for any 1 ≤ n < ℵ₀.
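The arithmetic behind the inclusion result can be illustrated with finite Łukasiewicz matrices. The sketch below assumes the usual value set {0, 1/(n – 1), ..., 1} for the n–valued matrix, which is not spelled out above, and the helper names are hypothetical. It checks only the easy direction: when n – 1 divides m – 1, the n–valued value set is a subset of the m–valued one closed under f and g, so every tautology of the m–valued matrix is also a tautology of the n–valued one.

```python
from fractions import Fraction
from itertools import product

def f(x, y):
    """Lukasiewicz implication: min(1, 1 - x + y)."""
    return min(Fraction(1), 1 - x + y)

def g(x):
    """Lukasiewicz negation: 1 - x."""
    return 1 - x

def values(n):
    """Assumed value set of the n-valued matrix: {0, 1/(n-1), ..., 1}."""
    return {Fraction(k, n - 1) for k in range(n)}

def is_closed(U):
    """Check that U is closed under the operations f and g."""
    return all(g(x) in U for x in U) and all(
        f(x, y) in U for x, y in product(U, repeat=2)
    )

def is_submatrix(n, m):
    """True when the n-valued value set is contained in the m-valued one."""
    return values(n) <= values(m)

def is_tautology(formula, n):
    """formula is a one-argument function built from f and g; 1 is designated."""
    return all(formula(x) == 1 for x in values(n))

# Containment of value sets matches the divisibility condition.
for n, m in product(range(2, 8), repeat=2):
    assert is_submatrix(n, m) == ((m - 1) % (n - 1) == 0)

# (not-p -> p) -> p holds in the two-valued matrix but fails at 1/2.
clavius = lambda x: f(f(g(x), x), x)
print(is_tautology(clavius, 2), is_tautology(clavius, 3))  # True False
```

Exact rational arithmetic via `Fraction` is used so that closure and membership tests are not disturbed by floating–point rounding.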

Although the above results are general, they were mainly directed at reporting facts on many–valued logic. Lindenbaum also obtained results on arbitrary sentential calculi and their matrices. The definition of M (plus the definition of a sentential calculus as closed by the consequence operation) implies that the set of tautologies in a matrix, that is, LOG(M), provided that M is normal, is a system. Lindenbaum announced (theorem 3 in Ł–T1930) that every sentential calculus has at most a denumerably infinite normal matrix. This theorem was proved by Jerzy Łoś (see Łoś 1949; this work contains the first systematic treatment of logical matrices). This last work originated from systematic investigations on matrix semantics for propositional calculi (see Wójcicki 1989 for an extensive report on this field of logical research; several important results are also described in Pogorzelski 1994). The following historical speculation can illustrate the importance of theorem 3. When Heyting formalized intuitionistic sentential logic (ISC) in 1930, the question of finding a normal matrix for this logic became important. Gödel showed (in 1932) that no finite matrix (he used the term “realization”) verifies all and only the theorems of ISC. By Lindenbaum’s result (Gödel did not refer to it), there exists a denumerably infinite normal matrix for ISC. Jaśkowski constructed it in 1936. He certainly must have known Ł–T1930, but he did not refer to it. This story makes a nice example of how influences could interplay in looking for an adequate matrix for intuitionistic propositional logic; but we have no accessible evidence that Lindenbaum’s theorem actually inspired Jaśkowski.

5. Lindenbaum and General Metamathematics

Several of Lindenbaum’s contributions to general metamathematics are mentioned in Tarski 1956 (referred to as T1956 below). The two most important results achieved by Lindenbaum are his construction of the so–called Lindenbaum algebra (LIA) and the maximalization theorem LMT (frequently called the Lindenbaum Lemma). Lindenbaum observed that formulas can be the elements of logical matrices (see Surma 1967, Surma 1973, Surma 1982 for the reconstruction of Lindenbaum’s path to LIA). Then he, as well as other logicians (Łoś, for instance), generalized this idea for arbitrary languages. LIA is presented in this article for classical sentential calculus. Let L be a formal language with ¬, ∧, ∨, ⇒, ⇔ as connectives. Formulas as such, that is, variables and their well–formed strings, do not constitute an algebra. The symbol [A] refers to the Lindenbaum class of formulas with respect to A. We further stipulate that B ∈ [A] if and only if ⊢ A ⇔ B (this step gives a congruence in L) and then define –[A] as [¬A], [A] ∩ [B] as [A ∧ B], [A] ∪ [B] as [A ∨ B], [A] ⊆ [B] as [A ⇒ B], and [A ⇔ B] as [A] = [B]. If we denote the set of classes of formulas produced by the defined congruence by the symbol L_[≈], the structure ⟨L_[≈], –, ∩, ∪, ⊆, =⟩ is a Boolean algebra of formulas, that is, LIA for sentential logic. An interesting feature of this construction is that the building blocks for LIA come from language. This justifies the use of the same symbols for propositional connectives in the object language (the language of propositional calculus) and the metalanguage (the language of LIA). Lindenbaum’s construction of algebras has important applications in algebraic proofs of metalogical theorems (see Surma 1967, Rasiowa–Sikorski 1970, Surma 1973, Rasiowa 1974, Surma 1982, Zygmunt–Purdy 2014), including the completeness theorem for classical logic as well as many non–classical systems.
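For a language with finitely many variables the construction can be carried out by brute force. The following Python sketch is a simplification resting on stated assumptions: it uses two variables, p and q, and identifies the class [A] with the truth table of A, which is legitimate for classical sentential calculus because, by its completeness, A ⇔ B is provable exactly when A and B have the same truth table. The tuple encoding of formulas and all function names are hypothetical.

```python
from itertools import product

VARS = ("p", "q")
ROWS = list(product((False, True), repeat=len(VARS)))

def evaluate(formula, row):
    """Classical truth value of a formula on one row of the truth table."""
    if isinstance(formula, str):
        return row[VARS.index(formula)]
    op, *args = formula
    vals = [evaluate(a, row) for a in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "imp":
        return (not vals[0]) or vals[1]
    if op == "iff":
        return vals[0] == vals[1]
    raise ValueError("unknown connective: " + repr(op))

def cls(formula):
    """Stand-in for [A]: formulas with equal truth tables are provably equivalent."""
    return tuple(evaluate(formula, row) for row in ROWS)

# Operations on classes, defined without reference to representatives.
def complement(a):
    return tuple(not x for x in a)

def meet(a, b):
    return tuple(x and y for x, y in zip(a, b))

def join(a, b):
    return tuple(x or y for x, y in zip(a, b))

def leq(a, b):
    return all((not x) or y for x, y in zip(a, b))

def lindenbaum_classes():
    """Close the variables under the connectives and collect one formula per class."""
    reps = {cls(v): v for v in VARS}
    changed = True
    while changed:
        changed = False
        current = list(reps.values())
        candidates = [("not", a) for a in current]
        candidates += [(op, a, b) for op in ("and", "or")
                       for a in current for b in current]
        for c in candidates:
            if cls(c) not in reps:
                reps[cls(c)] = c
                changed = True
    return reps

print(len(lindenbaum_classes()))  # 16

A, B = ("imp", "p", "q"), ("or", "q", ("not", "p"))
assert cls(("and", A, B)) == meet(cls(A), cls(B))
assert cls(("not", A)) == complement(cls(A))
assert leq(cls(A), cls(B)) == all(cls(("imp", A, B)))
```

The sixteen classes form the free Boolean algebra on two generators; the final assertions check, on one pair of formulas, that the operations on classes agree with the connectives applied to representatives.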

There is a controversy concerning the origin of LIA. According to Surma 1967, p. 128, Lindenbaum presented his idea at the 1st Polish Mathematical Congress in 1927. This fact, however, is only known from the Polish oral tradition. Significant information can be found in Rasiowa–Sikorski 1970, pp. 245–246, footnote 1. The first published mention was made by McKinsey with reference to Tarski’s oral communication in 1941. However, Tarski complained that the discovery of LIA should be credited to him (it is reported in the mentioned footnote in Rasiowa–Sikorski 1970). Since Tarski’s historical claim concerning LIA is known, some authors use the label “the Lindenbaum–Tarski algebra”.

LMT is perhaps the most important result achieved by Lindenbaum. Its formulation is very simple: every consistent formal system has a maximal consistent extension. The theorem was probably inspired by the concept of Post–completeness, well known in Łukasiewicz’s group. This theorem is mentioned several times in T1956; its proof is to be found on pp. 98–100. An important feature of LMT consists in the fact that it is not intuitionistically provable (although we know that maximal extensions exist, there is no general method of their construction) and it requires infinitistic methods (it is, however, weaker than AC; see below). T1956 points out many applications of LMT, for instance, that classical logic is the only consistent and complete extension of intuitionistic logic (Tarski). The great career of LMT began after Henkin had used it in his proof of the completeness of first–order logic. This proof combines LIA and the construction of maximal consistent sets of formulas. Henkin’s method was adapted for proving the completeness property of many logical systems. LMT is effectively equivalent to the Stone ultrafilter theorem. One can prove (see Surma 1968) that AC implies the Gödel–Malcev completeness theorem, and the latter entails LMT. The standard version of LMT works for countable languages (see Łoś 1955 for LMT for uncountable languages) and a compact consequence operation (omitting the property of compactness is not particularly significant). On the other hand, if LMT is strengthened by an additional assumption concerning individual constants or by admitting uncountable languages, it becomes provably equivalent to AC (see also Gazzari 2014). These results help in placing LMT on the scale of infinitary methods in metamathematics.
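The idea usually given for the proof of LMT in the countable case is to enumerate all formulas and add each one in turn whenever the result remains consistent. The Python sketch below imitates this on a finite fragment and is not Lindenbaum’s or Tarski’s own proof: it assumes classical propositional logic over a fixed finite list of variables, tests consistency by searching for a satisfying valuation (which agrees with deductive consistency for this logic), and runs through a finite enumeration instead of an infinite one. The encoding and the names `consistent` and `lindenbaum_extension` are hypothetical.

```python
from itertools import product

def evaluate(formula, valuation):
    """Classical truth value; a variable is a string, compounds are tuples."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    vals = [evaluate(a, valuation) for a in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "imp":
        return (not vals[0]) or vals[1]
    raise ValueError("unknown connective: " + repr(op))

def consistent(formulas, names):
    """A finite set is treated as consistent when some valuation satisfies all of it."""
    return any(
        all(evaluate(fm, dict(zip(names, vals))) for fm in formulas)
        for vals in product((False, True), repeat=len(names))
    )

def lindenbaum_extension(X, enumeration, names):
    """Go through the enumeration, keeping each formula that preserves consistency."""
    extension = list(X)
    for A in enumeration:
        if consistent(extension + [A], names):
            extension.append(A)
    return extension

names = ("p", "q")
X = [("or", "p", "q")]
order1 = ["p", ("not", "p"), "q", ("not", "q")]
order2 = [("not", "p"), "p", "q", ("not", "q")]

print(lindenbaum_extension(X, order1, names))
# [('or', 'p', 'q'), 'p', 'q']
print(lindenbaum_extension(X, order2, names))
# [('or', 'p', 'q'), ('not', 'p'), 'q']
```

The two runs differ only in the order of enumeration and produce different extensions of the same set, which illustrates that a maximal consistent extension is in general not unique. In the infinite case no such step-by-step decision procedure is available in general, which is the non-constructive aspect noted above.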

The program of reverse mathematics (see Simpson 2009) gives a more general perspective in this respect. The symbol RCA₀ refers to Peano arithmetic minus the full induction scheme (it is restricted to Σ⁰₁ formulas) plus the recursive comprehension scheme; it is a relatively weak subsystem of second–order arithmetic. LMT is equivalent over RCA₀ to the following propositions: the weak König lemma, the Gödel–Malcev completeness theorem, the Gödel compactness theorem, the completeness theorem for propositional logic for countable languages and the compactness theorem for propositional logic for countable languages. Since the weak König lemma is a mathematical (not metalogical) result, its equivalence with LMT exactly characterizes the mathematical content of the latter. If LMT is associated with the consequence operation admitting the rule of substitution, we obtain the so–called relative Lindenbaum extensions (see Pogorzelski 1994, p. 318; this theorem was proved by Asser in 1959). Roughly speaking, if we take a consistent set X and a formula A such that A ∉ X, we have two consistent extensions of X, namely X ∪ {A} and X ∪ {¬A}. Clearly, by the standard LMT, there exist at least two different maximally consistent extensions. The problem of how many such relative Lindenbaum sets are associated with a given consistent set X has no unique solution.

Here is a list of some other of Lindenbaum’s metamathematical results (see Tarski 1956, 32, 33, 36, 71, 297, 307, 338 for the technical details):

the number of all deductive systems is equal to 2^ℵ₀;
the number of all axiomatizable systems is equal to ℵ₀;
the condition which must be satisfied in order for the sum of a deductive system to be a deductive system;
structural type of a theory;
theorems of degrees of completeness;
atomic (atomistic, according to an older terminology) Boolean algebra;
independence of primitive concepts in mathematical systems.

The last three results in the above list were achieved jointly by Lindenbaum and Tarski (see Tarski–Lindenbaum 1927), and other results inspired Tarski in his metamathematical investigations. Moreover, Tarski credited Lindenbaum with pointing out the role of set–theoretical methods in metamathematical investigations (T1956, p. 75).

6. Final Remarks

Helena Rasiowa (in Rasiowa 1974, p. v) says that the introduction of the Lindenbaum–Tarski algebra became “one of the turning points in algebraic study of logic”. This tradition was continued, systematized and conceptually unified by Tarski himself as well as by his American students, particularly J. C. C. McKinsey, Bjarni Jónsson and Don Pigozzi, and by Polish logicians, notably Jerzy Łoś, Helena Rasiowa and Roman Sikorski. Rasiowa–Sikorski 1970 can be considered the magnum opus in this direction. A similar role should be attributed to LMT (not properly called the Lindenbaum Lemma, because its actual importance exceeds the fact of being an auxiliary device for proving other results) as a mark of the mathematical content of tools used in metamathematics. Thus, Adolf Lindenbaum appears as one of the main masters in developing mathematics for metamathematics. His results on logical matrices opened a new stage in metalogical investigations concerning propositional calculus.

7. References and Further Reading

Gazzari, R. 2014, “Direct Proofs of Lindenbaum Conditionals”, Logica Universalis 8, Issue 3–4, 321–343.
Hinkis, A. 2013, Proofs of the Cantor-Bernstein Theorem. A Mathematical Excursion, Basel: Birkhäuser.
Lindenbaum, A. 1930, “Remarques sur une question de la méthode mathématique”, Fundamenta Mathematicae 15, 313–321.
Lindenbaum, A. 1931, “Bemerkungen zu den vorhergehenden “Bemerkungen” des Herrn J. v. Neumann”, Fundamenta Mathematicae 17, 335–336.
Lindenbaum, A. 1936, “Sur la simplicité formelle des notions”, in Actes du Congrès International de Philosophie Scientifique, VII Logique (Actualités Scientifiques et Industrielles 394), 29–38.
Lindenbaum, A.–Mostowski, A. 1938, “Über die Unabhängigkeit des Auswahlaxioms und einiger seiner Folgerungen”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 31, 27–32; Eng. tr. in A. Mostowski, Foundational Studies. Selected Works, Vol. II, Warszawa–Amsterdam: PWN–Polish Scientific Publishers – North–Holland Publishing Company, 70–74.
Lindenbaum, A.–Tarski, A. 1926, “Communication sur les recherches de la théorie des ensembles”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 19, 299–330; repr. in A. Tarski, Collected Papers, vol. 1, 1921–1934, Basel: Birkhäuser 1986, 171–204.
Lindenbaum, A.–Tarski, A. 1934–1935, “Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien”, Ergebnisse eines mathematischen Kolloquiums 7, 15–22; Eng. tr. in Tarski 1956, 384–392.
Łoś, J. 1949, O matrycach logicznych (On Logical Matrices), Wrocław: Wrocławskie Towarzystwo Naukowe.
Łoś, J. 1955, “The Algebraic Treatment of the Methodology of Elementary Deductive Systems”, Studia Logica 2, 151–212.
Łukasiewicz, J.–Tarski, A. 1930, “Untersuchungen über den Aussagenkalkül”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 23, 30–50; Eng. tr. in Tarski 1956, 38–59.
Marczewski, E.–Mostowski, A. 1971, “Lindenbaum Adolf (1904–1941)”, Polski Słownik Biograficzny 17, 364b–365b.
Moore, G. H. 1982, Zermelo’s Axiom of Choice. Its Origins, Development, and Influence, New York – Heidelberg – Berlin: Springer Verlag.
Pogorzelski, W. 1994, Notions and Theories of Elementary Formal Logic, Białystok: Warsaw University – Białystok Branch.
Rasiowa, H. 1974, An Algebraic Approach to Non-Classical Logics, Amsterdam – Warszawa: PWN–Polish Scientific Publishers – North–Holland Publishing Company.
Rasiowa, H.–Sikorski, R. 1970, The Mathematics of Metamathematics, Warszawa: PWN–Polish Scientific Publishers.
Sierpiński, W. 1965, Cardinal and Ordinal Numbers, Warszawa: PWN – Polish Scientific Publishers.
Simpson, S. G. 2009, Subsystems of Second Order Arithmetic, Cambridge: Cambridge University Press.
Surma, S. J. 1967, “History of Logical Applications of the Method of Lindenbaum’s Algebra”, Analele Universităţii Bucureşti – Acta Logica 10, 127–138.
Surma, S. J. 1968, “Some Metamathematical Equivalents of the Axiom of Choice”, Prace z Logiki 3, 71–80.
Surma, S. J. 1973, “The Concept of Lindenbaum Algebra and Its Genesis”, in Studies in the History of Mathematical Logic, ed. by S. J. Surma, Ossolineum, Wrocław, 239–253.
Surma, S. J. 1982, “On the Origins and Subsequent Applications of the Concept of Lindenbaum Algebra”, in Logic, Methodology and Philosophy of Science VI. Proceedings of the Sixth International Congress of Logic, Methodology and Philosophy of Science, Hannover 1979, ed. by L. J. Cohen, J. Łoś, H. Pfeiffer and K.–P. Podewski, Warszawa – Amsterdam: PWN Polish Scientific Publishers – North–Holland Publishing Company, 719–734.
Tarski, A.–Lindenbaum, A. 1927, “Sur l’indépendance des notions primitives dans les systèmes mathématiques”, Annales de la Société Polonaise de Mathématique 7, 111–113.
Tarski, A. 1949, Cardinal Algebras, New York: Oxford University Press.
Tarski, A. 1956, Logic, Semantics, Metamathematics. Papers from 1923 to 1939, Oxford: Clarendon Press.
Tarski, A. 1986, “What are Logical Notions?”, History and Philosophy of Logic 7, 143–154.
Woleński, J. 1995, “Mathematical Logic in Poland 1900–1939: People, Circles, Institutions, Ideas”, Modern Logic V(4), 363–405; repr. in J. Woleński, Essays in the History of Logic and Logical Philosophy, Kraków: Jagiellonian University Press 1999, 59–84.
Wójcicki, R. 1989, Theory of Logical Calculi. Basic Theory of Consequence Operations, Dordrecht: Kluwer Academic Publishers.
Zygmunt, J.–Purdy, R. 2014, “Adolf Lindenbaum: Notes on His Life with Bibliography and Selected References”, Logica Universalis 8, Issue 3–4, 285–320.

Author Information

Jan Woleński
Email: wolenski@if.uj.edu.pl
University of Technology, Management and Information
Poland

Institution Theory

Institution theory is a very general mathematical study of formal logical systems—with emphasis on semantics—that is not committed to any particular concrete logical system. This is based upon a mathematical definition for the informal notion of logical system, called institution, which includes both syntax and semantics as well as the relationship between them. Because of its very high level of abstraction, this definition accommodates not only well-established logical systems but also very unconventional ones; and moreover it has served and it may serve as a template for defining new ones.

Some critics hold that the abstraction power of institutions is excessive, allowing for examples that can hardly be recognised as logical systems. Institution theory is nevertheless part of the universal logic trend (Béziau, 2012), which approaches logic from a relativistic, non-substantialist perspective that is quite different from the common reading of logic, both in philosophy and in the exact sciences. However, institution theory should not be regarded as opposed to the established tradition of logic, since it includes that tradition at a higher level of abstraction. In fact the real difference may occur at the level of methodology: top-down (in the case of institution theory) versus bottom-up (in the case of the conventional logic tradition). This means that, in institution theory, concepts come naturally as presumed features that a logical system might or might not exhibit, and they are defined at the most appropriate level of abstraction; hypotheses are kept as general as possible and introduced on an as-needed basis. This leads to a deeper understanding of logical phenomena that is not hindered by the largely irrelevant details of particular logical systems, but is guided by structurally clean causality. The exposition first discusses the history of institution theory and then presents its main concepts. There follows a discussion of the main contributions of institution theory, ranging from pure mathematical logic to applied computing science. A special point here is the institution-theoretic method for doing logic by translation, which means handling logical issues rather indirectly by transporting them across logical systems and solving them at the most appropriate place. After this, some extensions of mainstream institution theory are presented briefly.

History
The Concept of Institution
Institution-independent Model Theory
Logic by Translation
Contributions to Computing Science
Extensions
References and Further Reading

1. History

a. Origins

Institution theory was introduced by Joseph Goguen and Rod Burstall in the late seventies as a response to the explosion in the population of logical systems in use in formal specification theory and practice. Formal specification is a logic-based area of computer science that aims to support reliable system development through axiomatic formalisation of the structure and functionality of systems. At the time (and now even more) there was a great diversity of specification formalisms, each of them supported by a particular underlying logical system. Hence the need for a uniform approach to specification theory capable of developing those parts of the theory that are independent of the choice of a particular logical system, and thus are common to many specification formalisms. The key step was the definition of the concept of institution in (Goguen and Burstall, 1984), intended to capture formally the structural essence of logical systems beyond specific details. Since semantics plays the primary role in formal specification, institutions lean towards the semantic side of logic, known as model theory. This aspect has been a permanent source of criticism from syntactic and proof-oriented logicians and at the same time a source of celebration for semantics-oriented ones.

The concept of institution has two theoretical sources. One is the abstract model theory of Barwise (1974), and the other is the category theory of Eilenberg and Mac Lane (1945); Mac Lane (1998). While the importance of the latter in traditional areas of mathematics remains controversial, it has gained a major status in theoretical computing science (see (Goguen, 1991)) and logic. Although mathematically institutions are categorical structures, their spirit is that of abstract model theory.

b. Early Developments

The first paper on institution theory (Goguen and Burstall, 1984) introduced the main concepts and in parallel illustrated them by capturing many-sorted equational logic as an institution; at the time this was the most traditional and important specification logic. Among the basics of institution theory developed there were the Galois connection between syntax and semantics as well as various concepts and results related to the structuring of specifications and programs. The latter task has been carried much further in the influential work (Sannella and Tarlecki, 1988). Another important early development was the introduction in (Tarlecki, 1986b) of the treatment of classical connectives (conjunction, disjunction, negation, and so forth) and of quantifications in abstract institutions, and also of other logic concepts such as interpolation. That was a clear indication that institution theory may reach far beyond its original goal, that of providing a very general theoretical platform for formal specification. The work (Tarlecki, 1986c) was the first paper on institution theory that developed deep results having a logic (model theory) rather than a computing science flavour. In parallel with these developments the list of logical systems captured as institutions kept growing, with quite unconventional ones being added, a process fueled mainly by the increasing diversity of computing science logics. Very often the effort to formally capture particular logical systems as institutions has led to (re)considerations, within the respective logical setups, of some basic logical concepts, such as variable, language (or vocabulary, signature), model, sentence, and so forth. Presenting logical systems as precise and coherent mathematical objects proves to be more than a simple exercise; it has led to a new understanding of some aspects of particular logical systems.

c. Later Developments

The work (Meseguer, 1989) extended the concept of institution, which has a pronounced semantic character, to include proof-theoretic concepts, thus opening the possibility of a general institution-theoretic approach to logical calculi.

Although the conceptual infrastructure was already in place right from the beginning of institution theory, the first substantial institution-theoretic work in the direction of doing logic by translation is (Cerioli and Meseguer, 1997). Many other works in the same direction followed, most notably (Mossakowski et al., 2009), the winner of the contest of the 2nd World Congress in Universal Logic (Xi’an, China, 2007).

An important trend within institution theory is motivated less by computing science and more by model theory research. Several important model theory methods have been developed at the level of abstract institutions, and many very general yet deep results have been obtained. This has resulted in a very abstract form of model theory, often referred to as institution-independent model theory or, synonymously, institutional model theory. The monograph (Diaconescu, 2008) provides a snapshot of this dynamic area. Many of the institution-independent model theory results constitute high generalisations of well known results from conventional concrete model theory and can be used for easily obtaining corresponding results in less conventional logical systems. The same can be said for model theoretic concepts. A lot of theoretical computer science has been developed within institution theory based on the principle that formal specification and declarative programming languages should be based rigorously upon an underlying concrete institution. Based upon a large body of institution-theoretic developments, two modern specification languages have been designed by following this principle: CafeOBJ in Japan (Diaconescu and Futatsugi, 1998) and CASL in Europe (Astesiano et al., 2002). Both developments (the latter via the Hets environment (Mossakowski, 2005)) acknowledged the importance of logically heterogeneous environments, where several logical systems instead of only one are used via appropriate translations between them. For this, institution theory was able to accommodate a construction from algebraic geometry due to the French mathematician Alexandre Grothendieck, and flatten any such heterogeneous environment to a single institution (Diaconescu, 2002; Mossakowski, 2002), with the benefit of avoiding the rather big effort of a redevelopment of concepts and results for the heterogeneous situation.
Institution theory plays the core role in the OMG standard The Ontology, Integration and Interoperability (OntoIOp).

d. Notes

Although (Goguen and Burstall, 1984) may be considered as the first prominent publication from the now rather vast institution theory literature, the seminal reference of the area is considered to be (Goguen and Burstall, 1992). The large time gap between these two publications is due to a very slow editorial process. Some critics consider the term ‘institution’ uninspired since it does not convey anything about the scientific or the philosophical meaning of the concept. Goguen said that they chose this name, half jokingly, in response to the sectarianism that was taking over the specification community at the time. Around particular specification formalisms, people were building real social institutions consisting of dedicated conferences, publication forums, user groups, and so forth.

2. The Concept of Institution

An institution is a mathematical structure that can be regarded as a template for capturing logical systems mathematically. Many argue that this template is general enough to accommodate anything that may be called ‘logic’, or at least any logical system based on satisfaction between sentences and models of any kind. The concept of institution relies heavily upon category theory concepts, but in a rather elementary way. This means that most of institution theory does not involve sophisticated or advanced levels of category theory. An institution consists of four kinds of entities: signatures, sentences, models, and the satisfaction between models and sentences. All these are considered fully abstractly and axiomatically. This means the focus is on their external properties, how they relate to the other entities, rather than what they actually are or may be.

a. Signatures

When assuming a logical context the first thing to be done is to assume a collection of symbols as primitive building blocks for the syntactic constructs. In logic this is usually called language or vocabulary. In computing science this is usually called signature, and so it is in institution theory. In a signature the symbols are also usually structured, and often this gets to rather complex structures. This is especially true in the context of many modern computing science logics. But institution theory encapsulates all such information and treats signatures as fully abstract entities. In addition to this, institution theory considers that in a logical system signatures may vary. This comes from the practice of formal specification where the signatures are defined locally. Hence in any institution we have a collection of signatures rather than only one signature. However, this is not taken as a discrete collection as institution theory also considers interpretations between signatures called morphisms of signatures; these are also considered fully abstractly. The only data is that any signature morphism $\varphi$ has a source signature $\Sigma$ and a target signature $\Sigma’$; this is denoted $\varphi \,\colon\; \Sigma \rightarrow \Sigma’$ by employing the common mathematical notation of a function. Note that in concrete examples signature morphisms are not necessarily functions; they may be much more complex interpretations that preserve the respective structure of the signatures. Also given two signatures, in general nothing prevents the existence of more than one morphism between them, or the non-existence of such morphisms.

In the case of concrete institutions, that is, of examples, one has to define precisely what the signatures and their morphisms are. Let us consider the simple example of (the institution of) classical propositional logic, which we denote by $\mathit{PL}$. Here a signature is just a set $P$, traditionally referred to as a set of ‘propositional variables’. It is important that $P$, although arbitrary, is a fixed set. Another set $P’$ gives another signature. And any function $\varphi \,\colon\; P \rightarrow P’$ is a signature morphism. Note that the Boolean connectors $\wedge$, $\neg$, etc., although they contribute to the building of the propositional logic statements, are not part of the signatures.

The only property that institution theory considers for signatures and their morphisms is that when $$\varphi \,\colon\; \Sigma \rightarrow \Sigma’$$ and $$\varphi’ \,\colon\; \Sigma’ \rightarrow \Sigma”$$ there exists another signature morphism $$\varphi;\varphi’ \,\colon\; \Sigma \rightarrow \Sigma”$$ called the composition of $\varphi$ and $\varphi’$. This composition should satisfy some associativity and identity laws, hence very compactly we say that in any institution the signatures and their morphisms form a category. In our $\mathit{PL}$ example this is just the most standard category, the category $\mathit{\mathrm{SET}}$ of sets with functions as morphisms, with the composition $\varphi;\varphi’$ being just the set theoretic composition $\varphi’ \circ \varphi$, i.e. $$(\varphi;\varphi’)(x) = (\varphi’ \circ \varphi)(x) = \varphi’ (\varphi (x))$$
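The category laws for the $\mathit{PL}$ signatures can be checked on a small instance. The following is only an illustrative sketch: it assumes that finite signatures are modelled as Python sets and signature morphisms as dictionaries, and the names `compose`, `identity`, and the sample signatures are hypothetical, not part of the theory.

```python
# Sketch: PL signatures as Python sets, signature morphisms as dicts.
# All names below are illustrative assumptions.

def compose(phi, phi_prime):
    """Diagrammatic composition phi;phi_prime: apply phi first, then phi_prime."""
    return {x: phi_prime[phi[x]] for x in phi}

def identity(P):
    """Identity signature morphism on the signature P."""
    return {x: x for x in P}

# Three hypothetical signatures and composable morphisms between them.
P = {"p1", "p2"}
P1 = {"q1", "q2", "q3"}
P2 = {"r1", "r2"}
phi = {"p1": "q1", "p2": "q3"}                    # P  -> P1
phi_prime = {"q1": "r1", "q2": "r1", "q3": "r2"}  # P1 -> P2
theta = {"r1": "s", "r2": "s"}                    # P2 -> {"s"}

# (phi;phi')(x) = phi'(phi(x))
assert compose(phi, phi_prime) == {"p1": "r1", "p2": "r2"}
# Identity laws
assert compose(identity(P), phi) == phi
assert compose(phi, identity(P1)) == phi
# Associativity
assert compose(compose(phi, phi_prime), theta) == compose(phi, compose(phi_prime, theta))
```

The dictionary encoding only covers finite signatures; the abstract definition places no such restriction.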

b. Sentences

In any particular logical system once a language (signature) is assumed, we can have logical statements or sentences. The collection of sentences is dependent on the assumed language (signature). In institution theory this very basic principle is reflected by a designated function ($\mathrm{Sen}$) that maps each signature to a set (of presumed sentences). At the abstract level one does not care what this function is; it is considered fully abstractly. For each signature $\Sigma$ we just call the elements of $\mathrm{Sen}(\Sigma)$ ‘$\Sigma$-sentences’. However, the concrete institutions need to define these. For instance, given a $\mathit{PL}$ signature $P$ (which is just a set), $\mathrm{Sen}(P)$ is the set of all formulæ built from $P$ by using the connectors $\wedge$ and $\neg$. If $$P = \{ \pi_1, \pi_2 \}$$ then for example $$(\neg \pi_1) \wedge \pi_2 \in \mathrm{Sen}(P)$$ Since semantically the other propositional logic connectors such as implication $\Rightarrow$, disjunction $\vee$, etc. can be expressed in terms of $\wedge$ and $\neg$, the latter two are enough to get all the expressive power of propositional logic.

The signature morphisms reflect at the level of sentences as translation mappings. That is, for any signature morphism $\varphi \,\colon\; \Sigma \rightarrow \Sigma’$ there exists a function $$\mathrm{Sen}(\varphi) \,\colon\; \mathrm{Sen}(\Sigma)\rightarrow \mathrm{Sen}(\Sigma’)$$ For abstract institutions $\mathrm{Sen}(\varphi)$ is considered abstractly, but in concrete institutions we have to define it. Usually $\mathrm{Sen}(\varphi)$ just replaces the symbols in $\Sigma$-sentences according to $\varphi$. For example, given the only existing $\mathit{PL}$ signature morphism $$\varphi \,\colon\; P \rightarrow P’$$ where $P$ is as above and $P’ = \{ \pi \}$ then $$\mathrm{Sen}(\varphi)((\neg \pi_1)\wedge \pi_2) = (\neg \pi) \wedge \pi$$. In other situations things are not that straightforward. For instance in many sorted quantified logics the translation of quantified sentences gets a little sophisticated due to the management of the first order variables (for example to make sure that a translated variable does not clash with an existing constant of the target signature). From our example we may note easily a very simple coherence property of the sentence translation: for any composition of signature morphisms $\varphi;\theta$ we have that for each sentence $\rho$, $$\mathrm{Sen}(\varphi;\theta)(\rho) = \mathrm{Sen}(\theta)(\mathrm{Sen}(\varphi)(\rho))$$ If we add also that the identity signature morphisms (i.e. those keeping everything as it is) always get mapped to identity functions then in category theory terminology we can just say that $\mathrm{Sen}$ is a functor from the category $\mathrm{Sig}$ of the signatures to the category $\mathit{\mathrm{SET}}$ of sets and functions. At the level of abstract institutions this property of the $\mathit{PL}$ sentence translations is given as an axiom that characterises the sentence part of the general definition of institution.
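The sentence translation for $\mathit{PL}$ and its functoriality can be illustrated as follows. This is a sketch under stated assumptions: sentences are encoded as nested tuples (`"not"`, `"and"`) over variable names given as strings, and the function names `translate` and `compose` as well as the morphism `theta` are hypothetical choices made for this example.

```python
# Sketch: PL sentences as nested tuples, e.g. ("and", ("not", "pi1"), "pi2").
# Encoding and names are illustrative assumptions.

def translate(phi, rho):
    """Sen(phi): rename the propositional variables of rho along phi."""
    if isinstance(rho, str):          # a propositional variable
        return phi[rho]
    op, *args = rho                   # a compound sentence
    return (op, *(translate(phi, a) for a in args))

def compose(phi, theta):
    """Diagrammatic composition phi;theta."""
    return {x: theta[phi[x]] for x in phi}

# The morphism from the text: P = {pi1, pi2} -> P' = {pi}.
phi = {"pi1": "pi", "pi2": "pi"}
rho = ("and", ("not", "pi1"), "pi2")
assert translate(phi, rho) == ("and", ("not", "pi"), "pi")

# Functoriality: Sen(phi;theta)(rho) = Sen(theta)(Sen(phi)(rho)).
theta = {"pi": "tau"}                 # a hypothetical further morphism
assert translate(compose(phi, theta), rho) == translate(theta, translate(phi, rho))

# Identity morphisms are mapped to identity functions.
assert translate({"pi1": "pi1", "pi2": "pi2"}, rho) == rho
```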

c. Models

On the semantics shore, each signature can be interpreted as a collection of models. In general for each signature $\Sigma$ there can be several models (called $\Sigma$-models), an aspect that gives model theory its relativistic character. For instance, the models of a $\mathit{PL}$ signature $P$ are the valuations of the ‘propositional variables’ of $P$ to the truth values $0$ and $1$, which is the same as the subsets of $P$ (by retaining only the elements of $P$ that are valuated to $1$). As with signatures and sentences, at the abstract level, that of the definition of the concept of institution, the models are also considered fully abstractly. As for signatures, but unlike for sentences (of a given signature), we also consider morphisms between models. This yields the same kind of mathematical structure for the collection of the $\Sigma$-models as in the case of $\mathrm{Sig}$, that of a category. Hence let $\mathrm{Mod}(\Sigma)$ denote the category of the $\Sigma$-models and their morphisms. In the case of $\mathit{PL}$, $\mathrm{Mod}(P)$ has the particularity that given two models $M$ and $N$ there exists at most one morphism $M \rightarrow N$; this happens only when $M \subseteq N$ (here we regard the $P$-models as subsets of $P$ rather than valuations to $\{ 0,1 \}$).

The fact that in general $\mathrm{Mod}(\Sigma)$ is defined as a category rather than a set (as $\mathrm{Sen}(\Sigma)$ is), besides the morphisms aspect, has a rather subtle set-theoretic aspect. The $\Sigma$-models may be so many that they may not constitute a set anymore. In examples this is closely related to the fact that there does not exist ‘the set of all sets’, which would be a violation of one of the axioms of formal set theory. However, this does not always happen; for example in $\mathit{PL}$ the $P$-models do form a set.

Another important difference between the syntax and semantics sides of the definition of institution occurs at the level of the translations induced by the signature morphisms. This phenomenon can be best understood when looking at concrete examples, such as $\mathit{PL}$. Given a $\mathit{PL}$ signature morphism $\varphi \,\colon\; P \rightarrow P’$ the corresponding translation mapping goes from $\mathrm{Mod}(P’)$ to $\mathrm{Mod}(P)$, opposite the direction the sentence translation goes. Namely, each $P’$-model $M’$ gets reduced to the $P$-model $M’ \circ \varphi$ (here for convenience we regard $\mathit{PL}$-models as valuations rather than subsets). This is a very important feature of the semantics, called the contravariance of the reduction. The use of the name ‘reduction’ instead of ‘translation’ is in fact meant to convey this aspect. This terminological choice can be easily understood when considering $\varphi$ to be an inclusion $P \subseteq P’$: as a valuation function, $M’ \circ \varphi$ is just the restriction of $M’$ to $P$. Thus, at the general level, for any signature morphism $\varphi\,\colon\; \Sigma \rightarrow \Sigma’$ there is a model reduct functor $$\mathrm{Mod}(\varphi)\,\colon\; \mathrm{Mod}(\Sigma’)\rightarrow \mathrm{Mod}(\Sigma)$$ By taking into account that $\mathrm{Mod}(\Sigma’)$ and $\mathrm{Mod}(\Sigma)$ are categories rather than sets, $\mathrm{Mod}(\varphi)$ is a functor rather than a function. The behaviour of the model reduct functors with respect to the composition of the signature morphisms is perfectly similar to what happens in the case of the sentence translations, of course modulo the contravariance aspect: $$\mathrm{Mod}(\varphi;\varphi’)(M”) = \mathrm{Mod}(\varphi)(\mathrm{Mod}(\varphi’)(M”))$$ All these can be formulated compactly just by saying that $\mathrm{Mod}$ is a contravariant functor from $\mathrm{Sig}$ to the category $\mathit{\mathrm{CAT}}$ of categories and functors.
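The contravariant behaviour of the $\mathit{PL}$ model reducts can be made concrete with a short sketch. It assumes that valuations are encoded as dictionaries from variable names to Booleans and that only the object part of the reduct functor is modelled (model morphisms are ignored); the names `reduct` and `compose` and the sample data are hypothetical.

```python
# Sketch: PL models as valuations (dicts from variables to bool).
# Only the action of Mod(phi) on models is shown; names are assumptions.

def reduct(phi, M_prime):
    """Mod(phi): reduce a P'-valuation M' to the P-valuation M' o phi."""
    return {x: M_prime[phi[x]] for x in phi}

def compose(phi, phi_prime):
    """Diagrammatic composition phi;phi_prime."""
    return {x: phi_prime[phi[x]] for x in phi}

# When phi is an inclusion P <= P', the reduct is just the restriction to P.
inclusion = {"p": "p"}                      # {p} -> {p, q}
assert reduct(inclusion, {"p": True, "q": False}) == {"p": True}

# Contravariance: Mod(phi;phi')(M'') = Mod(phi)(Mod(phi')(M'')).
phi = {"p1": "q1", "p2": "q1"}              # P  -> P'
phi_prime = {"q1": "r", "q2": "s"}          # P' -> P''
M2 = {"r": True, "s": False}                # a P''-model
assert reduct(compose(phi, phi_prime), M2) == reduct(phi, reduct(phi_prime, M2))
```

Note that `reduct` takes a model of the target signature and returns a model of the source signature, the opposite direction to the sentence translation.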

d. Satisfaction

At the heart of the semantic concept of truth, promoted by Tarski (1944) and employed by institution theory, lies the satisfaction relation between models and sentences. For example, in $\mathit{PL}$ given a signature $P$, a $P$-model $M$ and a $P$-sentence $\rho$, we may evaluate $\rho$ for the valuation $M$ as a truth value $0$ or $1$ by applying the well known truth table of classical propositional logic inductively on the structure of $\rho$. Then we say that $M$ satisfies $\rho$, written $M \models \rho$, if and only if the evaluation of $\rho$ in $M$ yields $1$. Note that we speak about satisfaction only when the model and the sentence belong to the same signature.

Since in the definition of institutions signatures, sentences and models are fully abstract, the satisfaction of sentences by models is fully abstract too. Thus for each signature $\Sigma$ there is a satisfaction relation between $\Sigma$-models and $\Sigma$-sentences, denoted $\models_\Sigma$. This is subject to an axiom known as the Satisfaction Condition whose meaning is often informally explained as the invariance of truth with respect to the change of notation: for each signature morphism $\varphi \,\colon\; \Sigma\rightarrow \Sigma’$, each $\Sigma$-sentence $\rho$ and each $\Sigma’$-model $M’$, $$\mathrm{Mod}(\varphi)(M’) \models_\Sigma \rho \text{ if and only if } M’ \models_{\Sigma’} \mathrm{Sen}(\varphi)(\rho).$$ In concrete institutions one always has to define the satisfaction relation, commonly done by induction on the structure of the sentences (as in the case of $\mathit{PL}$). This principle, known as truth functionality, means that given a semantic context (model) the truth value of any compound sentence is determined from the truth values of its components. Truth functionality provides also the common method to establish the Satisfaction Condition in concrete institutions, by induction on the structure of the sentences. While in some cases this is an easy task ($\mathit{PL}$ is such an example) in other cases it can be highly non-trivial. Especially in the case of quantified logics the induction step corresponding to the quantifications poses some technical problems, requiring some compositionality property for the models (see (Diaconescu, 2008)).
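For $\mathit{PL}$ the Satisfaction Condition can be checked exhaustively on small instances. The sketch below is not a proof, only a brute-force test over all models of a finite target signature and a hand-picked list of sentences; the tuple encoding of sentences and the names `translate`, `reduct`, `satisfies`, `all_models`, and `satisfaction_condition_holds` are assumptions made for this illustration.

```python
from itertools import product

# Sketch: brute-force check of the Satisfaction Condition in PL.
# Sentences are nested tuples; models are dicts from variables to bool.

def translate(phi, rho):
    """Sen(phi): rename variables along phi."""
    if isinstance(rho, str):
        return phi[rho]
    op, *args = rho
    return (op, *(translate(phi, a) for a in args))

def reduct(phi, M_prime):
    """Mod(phi): the valuation M' o phi."""
    return {x: M_prime[phi[x]] for x in phi}

def satisfies(M, rho):
    """Truth-table evaluation by induction on the structure of rho."""
    if isinstance(rho, str):
        return M[rho]
    if rho[0] == "not":
        return not satisfies(M, rho[1])
    if rho[0] == "and":
        return satisfies(M, rho[1]) and satisfies(M, rho[2])
    raise ValueError(f"unknown connective: {rho[0]!r}")

def all_models(P):
    """All valuations of the finite signature P."""
    names = sorted(P)
    return [dict(zip(names, bits)) for bits in product([False, True], repeat=len(names))]

def satisfaction_condition_holds(phi, P_prime, sentences):
    """Mod(phi)(M') |= rho  iff  M' |= Sen(phi)(rho), for all M' and listed rho."""
    return all(
        satisfies(reduct(phi, M_prime), rho) == satisfies(M_prime, translate(phi, rho))
        for M_prime in all_models(P_prime)
        for rho in sentences
    )

sentences = ["pi1", ("not", "pi1"), ("and", ("not", "pi1"), "pi2"),
             ("not", ("and", "pi1", ("not", "pi2")))]
phi = {"pi1": "pi", "pi2": "pi"}            # the collapsing morphism from the text
assert satisfaction_condition_holds(phi, {"pi"}, sentences)
psi = {"pi1": "pi", "pi2": "sigma"}         # a hypothetical renaming into a larger signature
assert satisfaction_condition_holds(psi, {"pi", "sigma", "tau"}, sentences)
```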

Our presentation of the mathematical definition of the concept of institution may be summarised as follows. An institution is a tuple $(\mathrm{Sig},\mathrm{Sen},\mathrm{Mod},\models)$ where $\mathrm{Sig}$ is a category, $\mathrm{Sen}$ is a functor $\mathrm{Sig} \rightarrow \mathit{\mathrm{SET}}$, $\mathrm{Mod}$ is a contravariant functor $\mathrm{Sig} \rightarrow \mathit{\mathrm{CAT}}$, and for each signature $\Sigma$, $\models_\Sigma$ is a relation between $\Sigma$-models and $\Sigma$-sentences such that the Satisfaction Condition holds. Note that besides the Satisfaction Condition, the definition of institution includes other axioms encapsulated by the several categories and functors that are part of the definition.

e. Concrete Institutions

The institution theory literature (which includes part of the specification theory literature) contains countless examples of logical systems that have been formally captured as institutions. Among these there are first, second, higher order logics, logics with some form of partiality for the functions such as partial algebra and various dialects of order sorted algebras, non-classical logics such as intuitionistic ones, a wide diversity of modal logics, fuzzy and many valued logics, and so forth. All these institutions admit also many-sorted variants. Many examples of institutions arose on the basis of various combinations between other institutions. Several institutions related to programming look rather divorced from the common perception of what is a logical system, some of these being presented in (Sannella and Tarlecki, 2012).

f. Notes

The traditional style of doing logic has a rather global approach to signatures, they usually do not vary, and when they do this would be just extensions. By contrast, institution theory is genuinely multi signature oriented, with signature morphisms being a rather primitive concept. Moreover in concrete examples the institution-theoretic view is that these can be broader than extensions, they may rename and even collapse elements. This widening of the concept of signature morphism serves the purpose of specification theory but is also very convenient for abstraction. Having signature morphisms as a primitive concept is also crucial for the development of various important logic concepts at the abstract level, for example, quantification, interpolation, method of diagrams, saturated models, and so forth. Moreover, the generality of the concept of signature morphism in institution theory accommodates even first-order substitutions like in (Găină and Petria, 2010), or substitutions of propositional variables by compound propositions like in (Voutsadakis, 2002).

The Satisfaction Condition appeared for the first time as an axiom in (Barwise, 1974), but in a much less abstract context than institution theory. Although in logic this appears as an indisputable property of logical systems, in computing science it has been criticised as being too strong and thus preventing some logical systems, albeit very marginal ones, from being institutions. More precisely, the critics of the Satisfaction Condition argued that the implication from the right to the left would be enough. Counter-criticism to this argues that in the absence of the full Satisfaction Condition as an equivalence, almost all general institution theory results both in model theory and in computing science become impossible. In defence of the Satisfaction Condition (Goguen, 1991) argues that those counterexamples arise due to some heavy incoherence between the respective concepts of signature morphism and satisfaction relation, and that under a meaningful fix of the respective concept of signature morphism the Satisfaction Condition is rescued.

3. Institution-independent Model Theory

The institution-theoretic approach to model theory tends to be rather comprehensive; here we present some rudiments of it. Reading this section may sometimes require some technical inclination on the part of the reader.

a. The Galois Connection between Syntax and Semantics

Given a signature $\Sigma$ in an institution we let

for each $E\subseteq \mathrm{Sen}(\Sigma)$, $E^* = \{ M \in \mathrm{Mod}(\Sigma) \mid M \models_\Sigma \rho \mbox{ for each } \rho\in E \}$, and
for each $\mathcal{M}\subseteq \mathrm{Mod}(\Sigma)$, $\mathcal{M}^* = \{ \rho\in \mathrm{Sen}(\Sigma) \mid M \models_\Sigma \rho \mbox{ for each } M \in \mathcal{M} \}$.

These give what is called a Galois connection between the subsets of $\mathrm{Sen}(\Sigma)$ and those of $\mathrm{Mod}(\Sigma)$. The $*$ operators allow also for the definition of semantic consequence: given any set $E$ of $\Sigma$-sentences and any $\Sigma$-sentence $\rho$ we say that $\rho$ is a semantic consequence of $E$, written $E \models \rho$, when $\rho \in E^{**}$ (that is, every model that satisfies every sentence in $E$ satisfies $\rho$ too).
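The two $*$ operators and semantic consequence can be computed by brute force for $\mathit{PL}$ over a finite signature. The sketch below assumes that models are encoded as frozensets of the variables valuated to $1$ and sentences as nested tuples; since $\mathrm{Sen}(P)$ is infinite, the second operator is restricted to a finite, hand-picked pool of sentences. All function names are hypothetical.

```python
from itertools import combinations

# Sketch: the * operators for PL, with models as frozensets of true variables.

def satisfies(M, rho):
    if isinstance(rho, str):
        return rho in M
    if rho[0] == "not":
        return not satisfies(M, rho[1])
    if rho[0] == "and":
        return satisfies(M, rho[1]) and satisfies(M, rho[2])
    raise ValueError(f"unknown connective: {rho[0]!r}")

def all_models(P):
    """All subsets of the finite signature P."""
    items = sorted(P)
    return {frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)}

def models_of(E, P):
    """E^*: the P-models satisfying every sentence of E."""
    return {M for M in all_models(P) if all(satisfies(M, rho) for rho in E)}

def theory_of(models, pool):
    """M^* restricted to a finite pool of candidate sentences."""
    return {rho for rho in pool if all(satisfies(M, rho) for M in models)}

def entails(E, rho, P):
    """E |= rho: every model of E satisfies rho."""
    return all(satisfies(M, rho) for M in models_of(E, P))

P = {"p", "q"}
implies = ("not", ("and", "p", ("not", "q")))      # p => q, written with 'and' and 'not'
E = {"p", implies}
assert models_of(E, P) == {frozenset({"p", "q"})}
assert entails(E, "q", P)                          # modus ponens, semantically
assert not entails({implies}, "q", P)
# Enlarging E shrinks E^*, and E is contained in E^{**} (within the pool).
assert models_of(E, P) <= models_of({"p"}, P)
pool = ["p", "q", implies, ("not", "p")]
assert E <= theory_of(models_of(E, P), pool)
```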

b. Logical Connectors

The semantics of the Boolean connectors can be formally defined in institutions also by using the $*$ operators. A $\Sigma$-sentence $\rho$ is a conjunction of the $\Sigma$-sentences $\rho_1$ and $\rho_2$ when $\rho^* = \rho_1^* \cap \rho_2^*$. Or, $\rho’$ is a negation of $\rho$ when $\rho’^* = \mathrm{Mod}(\Sigma) \setminus \rho^*$. And similarly for disjunction, implication, etc. Note that, unlike their syntactic correspondents that are unique, the semantic conjunction, negation, implication, etc. are unique only up to semantic equivalence, which means that from this point of view sentences satisfied by the same models are indistinguishable. The definition of the Boolean connectors can be extended at the level of the whole institution: for example an institution has conjunctions when any two of its sentences have a conjunction, and so on.
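The semantic definitions of conjunction and negation, and their uniqueness only up to semantic equivalence, can be illustrated in $\mathit{PL}$. This is a sketch under the same kind of assumptions as before: models are frozensets of true variables, sentences are nested tuples, and the names `is_conjunction` and `is_negation` are hypothetical.

```python
from itertools import combinations

# Sketch: semantic conjunction and negation in PL via the * operator.

def satisfies(M, rho):
    if isinstance(rho, str):
        return rho in M
    if rho[0] == "not":
        return not satisfies(M, rho[1])
    if rho[0] == "and":
        return satisfies(M, rho[1]) and satisfies(M, rho[2])
    raise ValueError(f"unknown connective: {rho[0]!r}")

def all_models(P):
    items = sorted(P)
    return {frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)}

def star(rho, P):
    """rho^*: the P-models satisfying rho."""
    return {M for M in all_models(P) if satisfies(M, rho)}

def is_conjunction(rho, rho1, rho2, P):
    """rho^* equals the intersection of rho1^* and rho2^*."""
    return star(rho, P) == star(rho1, P) & star(rho2, P)

def is_negation(rho_neg, rho, P):
    """rho_neg^* equals Mod(P) minus rho^*."""
    return star(rho_neg, P) == all_models(P) - star(rho, P)

P = {"p", "q"}
# Several syntactically different sentences are conjunctions of p and q.
assert is_conjunction(("and", "p", "q"), "p", "q", P)
assert is_conjunction(("and", "q", "p"), "p", "q", P)
assert is_conjunction(("not", ("not", ("and", "p", "q"))), "p", "q", P)
assert not is_conjunction("p", "p", "q", P)
assert is_negation(("not", "p"), "p", P)
assert not is_negation("q", "p", P)
```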

c. Quantifiers

The institution theoretic treatment of quantifiers is semantic and relies crucially upon the concept of signature morphism, essentially by assimilating valuations of variables with model expansions along the extension of the signature with the respective variables. While in concrete institutions one may speak about valuations of variables $X$ into models $M$ as functions from $X$ to some underlying set theoretic carrier of $M$, in the abstract setup this is not possible due to the lack of explicit set theoretic structures. However by assimilating the variables $X$ with the signature extensions $\Sigma \rightarrow \Sigma+X$ obtained by adding them to the respective signatures, we may note that for any $\Sigma$-model $M$, there is a canonical one-to-one correspondence between the valuations $X \rightarrow M$ and the $\Sigma+X$-models $M’$ such that their reducts to $\Sigma$ are just $M$. This trick implies that variables can become part of the signatures, which breaks with the habit of traditional approaches to logic of keeping variables separated from the language (signature). The point of this separation is to avoid some clash between $X$ and the entities of $\Sigma$. However this can be achieved differently without separating variables from the signatures, which anyway from the formal perspective poses several difficulties. The idea is very simple and comes from the practice of specification languages. One has just to qualify properly the variables by their signature context. For example a variable for a signature $\Sigma$ in the institution of many sorted first order logic would be a triple $(x,s,\Sigma)$ where $x$ is the name, $s$ the sort/type, and $\Sigma$ the signature of the variable. A signature $\Sigma$ becoming part of the data defining the variables $X$ prevents, by formal set theory reasons, any clash between $\Sigma$ and $X$ when adjoining $X$ to $\Sigma$.

Hence at the level of abstract institutions a variable for a signature $\Sigma$ is just a signature morphism $\chi \,\colon\; \Sigma \rightarrow \Sigma’$. Note that in the concrete situations variables-as-signature-morphisms support concepts of variables at the same level with the signatures, hence it is rather powerful. For example if the signature allows for higher order functions, then one can have variables for those. However often the intended variables are significantly more particular than what the respective concept of signature provides. This is one of the reasons why not any signature morphism can be considered as representing a variable. Another reason is that in concrete institutions the signature morphisms representing variables are extensions, and moreover they are extensions with entities that have a proper qualification, as discussed above. A standard way to realize these restrictions abstractly is to have an abstract designated subclass $\mathcal{D}$ of signature morphisms as variables. Given $(\chi \,\colon\; \Sigma \rightarrow \Sigma’)$ in $\mathcal{D}$, a $\Sigma$-sentence $\rho$ is a universal $\chi$-quantification of a $\Sigma’$-sentence $\rho’$ when for each $\Sigma$-model $M$, $M \models_\Sigma \rho$ if and only if $M’ \models_{\Sigma’} \rho’$ for all $\Sigma’$-models $M’$ such that $M = \mathrm{Mod}(\chi)(M’)$. Existential quantification is defined similarly. It is said that the institution has universal/existential $\mathcal{D}$-quantifications when any sentence $\rho’$ as above has a universal/existential $\chi$-quantification for any $\chi\in \mathcal{D}$ as above.
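The semantic definition of quantification along a signature extension can be tried out on a toy case. The sketch below assumes, purely for illustration, that the institution is $\mathit{PL}$ and that $\chi$ is an inclusion $P \rightarrow P+X$ adding fresh propositional variables $X$ (a choice made here, not taken from the text); models are frozensets of true variables and all function names are hypothetical.

```python
from itertools import combinations

# Sketch: chi-quantification for an inclusion chi : P -> P+X in PL.
# Assumption: X is disjoint from P; the reduct of a (P+X)-model M' is M' & P.

def satisfies(M, rho):
    if isinstance(rho, str):
        return rho in M
    if rho[0] == "not":
        return not satisfies(M, rho[1])
    if rho[0] == "and":
        return satisfies(M, rho[1]) and satisfies(M, rho[2])
    raise ValueError(f"unknown connective: {rho[0]!r}")

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]

def expansions(M, X):
    """All (P+X)-models whose reduct to P is M."""
    return [M | S for S in subsets(X)]

def universal_models(P, X, rho_prime):
    """P-models all of whose expansions satisfy rho_prime."""
    return {M for M in subsets(P) if all(satisfies(N, rho_prime) for N in expansions(M, X))}

def existential_models(P, X, rho_prime):
    """P-models having at least one expansion that satisfies rho_prime."""
    return {M for M in subsets(P) if any(satisfies(N, rho_prime) for N in expansions(M, X))}

def models_of(rho, P):
    return {M for M in subsets(P) if satisfies(M, rho)}

P, X = {"p"}, {"x"}
rho_prime = ("not", ("and", "p", ("not", "x")))     # p => x, a (P+X)-sentence
# 'not p' is a universal chi-quantification of rho_prime; 'p' is not.
assert models_of(("not", "p"), P) == universal_models(P, X, rho_prime)
assert models_of("p", P) != universal_models(P, X, rho_prime)
# A tautology over P is an existential chi-quantification of rho_prime.
true_P = ("not", ("and", "p", ("not", "p")))
assert models_of(true_P, P) == existential_models(P, X, rho_prime)
```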

In logic in general and in model theory in particular it is well known that quantification by first-order variables has very good and desirable properties. This kind of quantification is captured at the abstract level by requiring that any signature morphism $(\chi \,\colon\; \Sigma \rightarrow \Sigma’)$ in $\mathcal{D}$ is representable, which essentially means that there exists a $\Sigma$-model $M_\chi$ such that there is a one-to-one correspondence between the $\Sigma’$-models $N’$ with $\mathrm{Mod}(\chi)(N’) = N$ and the $\Sigma$-model morphisms $M_\chi \rightarrow N$. The idea behind this is that in concrete situations valuations of variables can be represented by model morphisms from a ‘free model’ over the variables. In other words, when $\chi$ is a signature extension $\Sigma \rightarrow \Sigma + X$ as discussed above, $M_\chi$ is the $\Sigma$-model freely generated by $X$, in standard actual situations being a term model over $X$.

d. Diagrams

The method of diagrams is a widely pervasive method in model theory (see for example (Chang and Keisler, 1990)). Its institution-theoretic abstraction by Diaconescu (2004b) also plays an important role in many institution-independent model theory developments. In essence, the diagram of a given model $M$ represents a kind of comprehensive syntactic characterisation of $M$. The institution-theoretic definition introduced in (Diaconescu, 2004b) axiomatises a category theoretic one-to-one correspondence between the model morphisms $M \rightarrow N$ and the models satisfying an abstractly designated set of sentences $E_M$. The signature $\Sigma_M$ of $E_M$ is not the signature $\Sigma$ of the model $M$; instead there is an abstractly designated signature morphism $$\iota_{\Sigma,M} \,\colon\; \Sigma \rightarrow \Sigma_M$$ In the concrete examples in which the models have an underlying set theoretic structure, $\iota_{\Sigma,M}$ is usually the extension of $\Sigma$ with the elements of the underlying carrier of $M$ that are adjoined to $\Sigma$ as new constants.

The most typical examples come from classical first order model theory. Let $\iota_{\Sigma,M}$ be as described above, and let $M_M$ be the $\Sigma_M$-model that expands $M$ by interpreting the new constants by themselves. Then $E_M$ is the set of all atoms that are satisfied by $M_M$. This corresponds to the case when the model morphisms are those that preserve the structure of the models; if we change the concept of model morphism then the diagram should change also. For example if we consider the elementary embeddings as model morphisms then $E_M$ is much larger, consisting of all first order sentences that are satisfied by $M_M$. And so on. The narrower the class of model morphisms considered, the larger the corresponding diagrams.

The existence of institution-theoretic diagrams in concrete logical systems is a mark of coherence between the syntax (the kind of sentences involved) and the semantics (the concept of model morphism employed). For example non-hybrid modal logics lack diagrams because there is an imbalance between the syntax and the semantics (Kripke structures), in the sense that the syntax is too weak to express aspects of the semantics. On the other hand, the hybrid (modal) logics do have diagrams because they have more expressive power with respect to the Kripke frames.

e. Ultraproducts

The method of ultraproducts is a most important and remarkably powerful one in conventional model theory (Chang and Keisler, 1990). This has been realised at the abstract level of institution theory beginning with (Diaconescu, 2003), on the basis of the previously established concept of categorical ultraproduct (introduced perhaps for the first time in (Matthiessen, 1978)) and applied to the categories of models $\mathrm{Mod}(\Sigma)$ in institutions. This requires some familiarity with categorical limits and colimits. A filter $F$ on a set $I$ is a set of subsets of $I$, such that $J \cap J’ \in F$ when $J,J’ \in F$ and $J’\in F$ when $J\in F$ and $J \subseteq J’$. $F$ is an ultrafilter when in addition $F$ satisfies the property that for each $J\subseteq I$, $J \in F$ if and only if its complement $I \setminus J \not\in F$. Let us consider a family $(M_i)_{i\in I}$ of $\Sigma$-models in an institution and a filter $F$ on $I$. For any $J\in F$ let us denote the direct product of $(M_i)_{i\in J}$ by $M_J$. If $\mathrm{Mod}(\Sigma)$ has direct products then for any $J \subseteq J’$ in $F$ there is a canonical projection $$p_{J’\supseteq J} \,\colon\; M_{J’} \rightarrow M_J$$ Then any colimit $$\mu = \{ \mu_J \,\colon\; M_J \rightarrow M_F \mid J \in F \}$$ of the diagram $$\{ p_{J’\supseteq J} \mid J\subseteq J’, J\in F \}$$ is called an $F$-product of $(M_i)_{i\in I}$. When $F$ is an ultrafilter, $F$-products are called ultraproducts. Due to the uniqueness up to isomorphisms of categorical direct products and of colimits it follows that $F$-products are unique only up to isomorphisms too.
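The filter and ultrafilter conditions as stated above can be checked mechanically on a small finite index set. This sketch only covers the combinatorial definitions, not the categorical $F$-product construction; it encodes subsets of $I$ as frozensets, and the names `powerset`, `is_filter`, and `is_ultrafilter` are hypothetical. On a finite set every ultrafilter is principal, so the example is necessarily modest.

```python
from itertools import combinations

# Sketch: checking the filter and ultrafilter conditions on a finite set I.

def powerset(I):
    items = sorted(I)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]

def is_filter(F, I):
    """F is closed under binary intersections and under supersets within I."""
    closed_under_meets = all((J & K) in F for J in F for K in F)
    upward_closed = all(K in F for J in F for K in powerset(I) if J <= K)
    return closed_under_meets and upward_closed

def is_ultrafilter(F, I):
    """A filter such that J is in F iff its complement is not in F."""
    I = frozenset(I)
    return is_filter(F, I) and all((J in F) != ((I - J) in F) for J in powerset(I))

I = {0, 1, 2}
principal = {J for J in powerset(I) if 0 in J}       # all subsets containing 0
trivial = {frozenset(I)}                             # only I itself
broken = {frozenset({0, 1}), frozenset({1, 2}), frozenset(I)}

assert is_ultrafilter(principal, I)
assert is_filter(trivial, I) and not is_ultrafilter(trivial, I)
assert not is_filter(broken, I)                      # {0,1} & {1,2} = {1} is missing
```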

The foundation of the ultraproducts method in first-order model theory is constituted by a result in (Łoś, 1955) which gives a ‘preservation’ property for the satisfaction by ultraproducts of models that is common to all sentences in first-order logic. This has been highly generalised in institution theory by decomposing it into a puzzle of general preservation results across connectors (Boolean, modalities) and quantifiers. Concrete instances of this result and of some of its extensions provide for free an ultraproducts method for a variety of logical systems, including unconventional ones for which such development was otherwise difficult to envisage.

A typical application of ultraproducts is the derivation of the compactness of the semantic consequence without having to resort to a proof system and a related completeness argument, which in general is technically very difficult. An institution is said to be compact when for each semantic consequence $E \models_\Sigma \rho$, where $E$ is a set of sentences and $\rho$ is a single sentence, there exists a finite set $E_0 \subseteq E$ such that $E_0 \models_\Sigma \rho$. In the presence of the preservation of the satisfaction relation under the ultraproduct construction of models, if in addition the institution has conjunctions and negations, we get the compactness of the institution. When adjoined to the institution-theoretic generalisation of the fundamental ultraproducts result of (Łoś, 1955), this remarkably general property gives an abstract compactness result, which can be instantiated with little effort to a wide variety of concrete logical systems. The efficiency of this path to compactness has become transparent for example in (Diaconescu and Stefaneas, 2007) in the case of a wide class of quantified modal systems.

f. Interpolation and Definability

Interpolation is an important property of logical systems which has a strikingly elementary formulation but which is usually very difficult to establish. Its common semantic version sounds like this:

given a $\Sigma_1$-sentence $\rho_1$ and a $\Sigma_2$-sentence $\rho_2$, if $\rho_1 \models_{\Sigma_1 \cup \Sigma_2} \rho_2$ then there exists a $\Sigma_1 \cap \Sigma_2$-sentence $\rho_0$ (called interpolant) such that $\rho_1 \models_{\Sigma_1} \rho_0$ and $\rho_0 \models_{\Sigma_2} \rho_2$.

In other words, any semantic consequence is established by means of symbols that are common to both sentences. In institution theory interpolation is considered in a form that generalizes several aspects of its common formulation. First, the signature inclusions that appear implicitly in the formulation of interpolation are abstracted to arbitrary signature morphisms. The common formulation of interpolation then corresponds to the situation when the signature morphisms of the institution are restricted to inclusions only. However, the generalisation of interpolation to arbitrary signature morphisms allows, in concrete situations, for the consideration of signature morphisms that may rename or even collapse syntactic entities. While such an extended form of interpolation may be unusual in conventional logic, it is used in specification theory. A second generalisation of the concept of interpolation replaces individual sentences by finite sets of sentences. While in logics that have conjunction (such as classical propositional logic, first-order logic, and so forth) this makes no difference, it is very meaningful in logics lacking conjunction, such as equational or Horn clause logics. In the latter, interpolation may fail artificially due to an unrealistic single-sentence formulation. These considerations lead to the following definition of interpolation. A commuting square of signature morphisms, consisting of $\varphi_1 \,\colon\; \Sigma_0 \rightarrow \Sigma_1$, $\varphi_2 \,\colon\; \Sigma_0 \rightarrow \Sigma_2$, $\theta_1 \,\colon\; \Sigma_1 \rightarrow \Sigma’$ and $\theta_2 \,\colon\; \Sigma_2 \rightarrow \Sigma’$ such that $\varphi_1$ followed by $\theta_1$ equals $\varphi_2$ followed by $\theta_2$,

is an interpolation square when for any finite sets $E_1\subseteq\mathrm{Sen}(\Sigma_1)$ and $E_2\subseteq\mathrm{Sen}(\Sigma_2)$ such that $$\mathrm{Sen}(\theta_1)(E_1) \models_{\Sigma’} \mathrm{Sen}(\theta_2)(E_2)$$ there exists a finite set $$E_0 \subseteq\mathrm{Sen}(\Sigma_0)$$ such that $$E_1 \models_{\Sigma_1} \mathrm{Sen}(\varphi_1)(E_0)$$ and $$\mathrm{Sen}(\varphi_2)(E_0)\models_{\Sigma_2} E_2$$ Such commuting squares are meant to emulate the intersection-union of signatures from the common formulation of interpolation, with $\Sigma_0$ in the role of $\Sigma_1 \cap \Sigma_2$ and $\Sigma’$ in the role of $\Sigma_1 \cup \Sigma_2$. However in the common formulation of interpolation it is important that $\Sigma_1 \cup \Sigma_2$ is the lowest signature above $\Sigma_1$ and $\Sigma_2$. In the generalised formulation of interpolation this property appears as a category theoretic condition, that the commuting square is a pushout in the category $\mathrm{Sig}$ of the signatures of the institution. But when considering interpolation as a property of the institution as a whole, it is in general not meaningful to look for interpolation in all pushout squares. This leads to another generalisation layer in the formulation of interpolation, which restricts abstractly the range of $\varphi_1$ and $\varphi_2$ to designated subclasses of signature morphisms. Thus, given $\mathcal{L}$ and $\mathcal{R}$ subclasses of signature morphisms, the institution has $(\mathcal{L},\mathcal{R})$-interpolation when each pushout of a span $(\varphi_1,\varphi_2)$ of signature morphisms with $\varphi_1\in \mathcal{L}$ and $\varphi_2\in \mathcal{R}$ is an interpolation square.

Institution-theoretic interpolation has been established at the general level in relation to several different causes. One cause can be an axiomatizability property of the institution, as in (Diaconescu, 2004a). A typical example here is many-sorted equational logic, which has $(\mathcal{L},\mathcal{R})$-interpolation with $\mathcal{L}$ being the injective signature morphisms and $\mathcal{R}$ being all signature morphisms. Another cause can be the Robinson consistency property, as in (Găină and Popescu, 2007). An instance of this is many-sorted first-order logic, which has $(\mathcal{L},\mathcal{R})$-interpolation when either $\mathcal{L}$ or $\mathcal{R}$ consists of signature morphisms that are injective on the sorts. And yet another cause can be the existence of an adequate translation to an institution that has well established interpolation properties, as in (Diaconescu, 2012b).

Definability is one of the traditional important consequences of interpolation. The concept of definability has been approached in institution theory by Petria and Diaconescu (2006) in a rather similar way to interpolation, by abstracting the new (presumably definable) symbols for signatures to arbitrary abstract signature morphisms. The institution-theoretic study of definability in (Petria and Diaconescu, 2006) revealed that, besides interpolation, axiomatizability properties (considered in the generalised form introduced in (Diaconescu, 2004a)) may constitute a primary cause for definability.
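
The common semantic formulation of interpolation given at the start of this section can be checked by brute force in classical propositional logic. The sketch below is illustrative only: the tuple encoding of sentences and the helper names `evaluate` and `entails` are assumptions made for this example, signatures are taken to be plain sets of propositional variables, and the candidate interpolant is supplied by hand rather than computed.

```python
from itertools import product

def evaluate(sentence, valuation):
    """Truth value of a sentence encoded as nested tuples."""
    op = sentence[0]
    if op == "var":
        return valuation[sentence[1]]
    if op == "not":
        return not evaluate(sentence[1], valuation)
    if op == "and":
        return evaluate(sentence[1], valuation) and evaluate(sentence[2], valuation)
    if op == "or":
        return evaluate(sentence[1], valuation) or evaluate(sentence[2], valuation)
    raise ValueError(f"unknown connective: {op}")

def entails(premise, conclusion, signature):
    """Semantic consequence over all valuations of the given signature."""
    names = sorted(signature)
    for bits in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, bits))
        if evaluate(premise, valuation) and not evaluate(conclusion, valuation):
            return False
    return True

sigma1, sigma2 = {"p", "q"}, {"q", "r"}
rho1 = ("and", ("var", "p"), ("var", "q"))   # a sigma1-sentence
rho2 = ("or", ("var", "q"), ("var", "r"))    # a sigma2-sentence
rho0 = ("var", "q")                          # candidate over sigma1 & sigma2

print(entails(rho1, rho2, sigma1 | sigma2))  # True: the consequence to interpolate
print(entails(rho1, rho0, sigma1))           # True
print(entails(rho0, rho2, sigma2))           # True
```

Here `rho0` uses only the shared variable `q`, so it plays the role of the interpolant for this particular pair of sentences.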

g. Layered Completeness

The concept of institution can be enhanced with a proof theoretic structure (Meseguer, 1989) by adding for each signature $\Sigma$ an entailment relation $\vdash_\Sigma$ between sets of $\Sigma$-sentences and single $\Sigma$-sentences subject to the following axioms:

$E \vdash_\Sigma \rho$ when $\rho\in E$;
If $E \vdash_\Sigma \gamma$ for each $\gamma\in \Gamma$ and $\Gamma \vdash_\Sigma \rho$ then $E \vdash_\Sigma \rho$; and
For each signature morphism $\varphi \,\colon\; \Sigma \rightarrow \Sigma’$ if $E \vdash_\Sigma \rho$ then $\mathrm{Sen}(\varphi)(E) \vdash_{\Sigma’} \mathrm{Sen}(\varphi)(\rho)$.

The entailment system $\vdash$ is sound for the respective institution when $E \vdash_\Sigma \rho$ implies $E \models_\Sigma \rho$ and is complete when the reverse implication holds. These are institution-theoretic generalisations of the common concepts of soundness and completeness from logic and model theory. Soundness is an obligatory property that is also easy to establish in concrete situations. Completeness is very desirable but is usually hard to establish.

The institution-theoretic approach to completeness results is a layered one, based upon the observation that proof systems can usually be decomposed into several layers that correspond to the structure of the sentences involved, and that completeness can be developed at each layer relative to the completeness at the previous one. This means that at each layer the respective entailment system is considered fully abstract, and hence the proof rules that build the next layer come in a form that is independent of the previous layers. The total completeness result is thus obtained as a combination of smaller independent completeness results, each of them having the potential to be reused in other contexts.

For instance, let us consider the case of the completeness of Horn clause logic with equality, the fragment of first order logic with equality that restricts the sentences to those of the form $(\forall X)H \Rightarrow C$, where $H$ is a finite conjunction of (equational and relational) atoms and $C$ is a single atom. This completeness can be decomposed into three layers. At the base layer we consider the institution that has as sentences only equational $t=t’$ and relational $\pi(t_1,\dots,t_n)$ atoms. For that we have a complete proof system defined by

$\vdash t=t$ for each term $t$;
$t =t’ \vdash t’ = t$ for all terms $t,t’$;
$\{ t=t’, t’=t” \} \vdash t = t”$ for all terms $t,t’,t”$;
$\{ t_i = t’_i \mid 1\leq i\leq n \} \vdash \sigma(t_1,\dots,t_n) = \sigma(t’_1,\dots, t’_n)$ for any function symbol $\sigma$ of arity $n$; and
$\{ t_i = t’_i \mid 1\leq i\leq n \} \cup \{ \pi(t_1,\dots,t_n) \} \vdash \pi(t’_1,\dots,t’_n)$ for any relation symbol $\pi$ of arity $n$.

At the next layer we add the sentences of the form $H \Rightarrow C$, where $H$ is a finite conjunction of atoms and $C$ is a single atom, and extend the proof system with the meta-rule

$\Gamma \cup H \vdash C$ if and only if $\Gamma \vdash H \Rightarrow C$.

The crucial point here is that this step does not depend upon the previous layer, meaning that information can be completely encapsulated, allowing the addition of implications over an arbitrary institution endowed with a complete entailment system. In this way the completeness at the new layer is obtained. The final layer consists of adding universal quantification to the sentences, and a rule and a meta-rule to the proof system:

$(\forall X)\rho \vdash (\forall Y)\theta(\rho)$ for any substitution $\theta$ of variables $X$ with terms over the variables $Y$; and
$\Gamma \vdash _\Sigma (\forall X)\rho$ if and only if $\Gamma \vdash_{\Sigma+X} \rho$.

This step can be also developed independently of the previous layer, over an institution endowed with a complete entailment system considered fully abstractly. This requires however an abstract treatment of substitutions. The completeness at the final upper layer is also obtained. By instantiating now the base layer to the one of the atoms in first-order logic with equality and the final layer to universal first-order quantification we get a complete proof system for Horn clause logic with equality. But all the compound completeness results can be also used separately on different instances of the abstract institutions, thus obtaining complete proof systems in various different logical systems. For example we can obtain a complete proof system for the universal fragment of first-order logic, that is, sentences of the form $(\forall X)\rho$ where $\rho$ is a quantifier-free sentence, just by replacing the mid layer above by the proof system of propositional logic.
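
The implication layer of the example can be sketched as follows. This is only an illustration under strong simplifying assumptions: atoms are opaque strings, the base entailment defaults to plain membership (the hypothetical `membership` function) instead of the equational rules listed above, and the names `saturate` and `entails` are introduced here for the example. The meta-rule is used in the direction that reduces $\Gamma \vdash H \Rightarrow C$ to $\Gamma \cup H \vdash C$, and implications occurring in $\Gamma$ are applied by forward chaining.

```python
def membership(facts, atom):
    """Assumed base entailment: an atom follows only if it is already a fact."""
    return atom in facts

def saturate(gamma, base_entails=membership):
    """Close the atoms of gamma under the implications of gamma."""
    facts = {s for s in gamma if isinstance(s, str)}
    rules = [s for s in gamma if not isinstance(s, str)]
    changed = True
    while changed:
        changed = False
        for hypotheses, conclusion in rules:
            if conclusion not in facts and all(
                    base_entails(facts, h) for h in hypotheses):
                facts.add(conclusion)
                changed = True
    return frozenset(facts)

def entails(gamma, sentence, base_entails=membership):
    """Decide gamma |- sentence for an atom or an implication (H, C)."""
    if isinstance(sentence, str):
        return base_entails(saturate(gamma, base_entails), sentence)
    hypotheses, conclusion = sentence
    # Meta-rule: gamma |- H => C  iff  gamma together with H |- C.
    return entails(set(gamma) | set(hypotheses), conclusion, base_entails)

gamma = {
    "p",
    (frozenset({"p"}), "q"),        # p => q
    (frozenset({"q", "r"}), "s"),   # q and r => s
}
print(entails(gamma, "q"))                      # True
print(entails(gamma, "s"))                      # False
print(entails(gamma, (frozenset({"r"}), "s")))  # True
```

The `base_entails` parameter stands in for the fully abstract entailment system of the previous layer; replacing it with a different base is the sketch's counterpart of instantiating the layer over another institution.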

h. Notes

The institution-theoretic concept of diagrams introduced by Diaconescu (2004b) is significantly simpler than a previously introduced one in (Tarlecki, 1986a,c). Since (Diaconescu, 2004b) this has been used rather intensively in many institution theory works.

The many-sorted first-order logic instance of the general interpolation result of (Găină and Popescu, 2007) represents an elegant solution by means of institution theory to a question about the limits of interpolation in many-sorted first-order logic that had remained a conjecture for several years. Layered completeness was introduced by Borzyszkowski (2002) within the context of the study of complete calculi for structured specifications. The example discussed here comes from Codescu and Găină (2008). Other works on layered completeness include (Găină and Petria, 2010) (two layers) and (Găină et al., 2012) (four layers).

Other important model theory methods that have been developed at the level of abstract institutions include saturated models (Diaconescu and Petria, 2010), forcing (Găină and Petria, 2010), and omitting types (Găină, 2014). These have led to remarkably high generalisations of deep model theory results, including the downwards Löwenheim-Skolem theorem in (Găină, 2014) and the Keisler-Shelah characterisation of first-order elementary equivalence as isomorphism under ultrapowers in (Diaconescu and Petria, 2010).

4. Logic by Translation

Institution theory is very well positioned with respect to the logic-by-translation paradigm because of its perspective on logical systems as mathematical objects/structures. Concepts of structure preserving mappings between institutions, when regarded as mathematical structures, constitute mathematical formalisations for translation concepts. The importance of this idea has been recognised right from the beginnings of institution theory.

a. Morphisms and Comorphisms

Institutions as mathematical structures invite several concepts of ‘morphisms’, or structure preserving mappings between institutions. All of them define three components, corresponding to the translations of the signatures, of the sentences, and of the models. Moreover in each case an axiom that represents an invariance property of the satisfaction relation with respect to these translations is imposed.

The original structure preserving mapping between institutions has been defined by Goguen and Burstall (1992), and called just morphism of institutions. Given institutions $\mathcal{I}$ and $\mathcal{I}’$ a morphism $\mathcal{I}’ \rightarrow \mathcal{I}$ consists of

a functor $\Phi \,\colon\; \mathrm{Sig}’ \rightarrow \mathrm{Sig}$ translating $\mathcal{I}’$-signatures to $\mathcal{I}$-signatures;
for each $\mathcal{I}’$-signature $\Sigma’$ a sentence translation $\alpha_{\Sigma’} \,\colon\; \mathrm{Sen}(\Phi(\Sigma’)) \rightarrow \mathrm{Sen}'(\Sigma’)$; and
for each $\mathcal{I}’$-signature $\Sigma’$ a model translation/reduct $\beta_{\Sigma’} \,\colon\; \mathrm{Mod}'(\Sigma’) \rightarrow \mathrm{Mod}(\Phi(\Sigma’))$

such that both $\alpha$ and $\beta$ are natural transformations (this is a category theory notion, ‘naturality’ in this context meaning a coherence property of the component translations with respect to the signature morphisms) and such that the following Satisfaction/Translation Condition holds for each $\Sigma’$-model $M’$ and each $\Phi(\Sigma’)$-sentence $\rho$:
$$M’ \models’_{\Sigma’} \alpha_{\Sigma’}(\rho) \mbox{ if and only if } \beta_{\Sigma’}(M’) \models_{\Phi(\Sigma’)} \rho.$$

Institution morphisms have the flavour of ‘projecting’ from a more complex institution to a simpler one, like the following morphism from first-order to propositional logic. The functor $\Phi$ maps any first-order logic signature $\Sigma’$ to its set of sentences (each sentence being regarded as a propositional variable in a propositional logic signature), $\alpha_{\Sigma’} (\rho)$ is just the first-order sentence $\rho$, and $\beta_{\Sigma’} (M’)$ is just the propositional logic model consisting of all $\Sigma’$-sentences that hold in $M’$.

By reversing the translation of the signatures we get the concept of comorphism of institutions which has the flavour of an ‘embedding’ of a simpler institution into a more complex one. A comorphism $\mathcal{I} \rightarrow \mathcal{I}’$ consists of

a functor $\Phi \,\colon\; \mathrm{Sig} \rightarrow \mathrm{Sig}’$ translating $\mathcal{I}$-signatures to $\mathcal{I}’$-signatures;
for each $\mathcal{I}$-signature $\Sigma$ a sentence translation $\alpha_{\Sigma} \,\colon\; \mathrm{Sen}(\Sigma) \rightarrow \mathrm{Sen}'(\Phi(\Sigma))$; and
for each $\mathcal{I}$-signature $\Sigma$ a model translation/reduct $\beta_{\Sigma} \,\colon\; \mathrm{Mod}'(\Phi(\Sigma)) \rightarrow \mathrm{Mod}(\Sigma)$

such that both $\alpha$ and $\beta$ are natural transformations and the following Satisfaction/Translation Condition holds for each $\Phi(\Sigma)$-model $M’$ and each $\Sigma$-sentence $\rho$:
$$M’ \models’_{\Phi(\Sigma)} \alpha_{\Sigma}(\rho) \mbox{ if and only if } \beta_{\Sigma}(M’) \models_{\Sigma} \rho.$$
The embedding of propositional logic into first-order logic can be captured as a comorphism that interprets any set of propositional variables as a first-order signature that has only relation symbols of arity zero; note that both $\alpha$ and $\beta$ are identities.

The level of preservation of institution-theoretic structure by morphisms and comorphisms is equal, so just from the viewpoint of institutions as mathematical structures one cannot say which of these is more adequate to play the role of morphisms for the category of institutions. This means that there can be at least two categories that have institutions as their objects, both of them equally legitimate from the perspective of structure. This constitutes a good example of the idea, prevalent in category theory, that a category is best described by its morphisms rather than its objects. The conceptual symmetry between institution morphisms and comorphisms got a formal explanation by Arrais and Fiadeiro (1996), where it is shown that a categorical adjunction between the categories of the signatures of two institutions $\mathcal{I}$ and $\mathcal{I}’$ determines a canonical one-to-one correspondence between the morphisms $\mathcal{I}’ \rightarrow \mathcal{I}$ and the comorphisms $\mathcal{I} \rightarrow \mathcal{I}’$.

b. Encodings

Besides embeddings of simpler logics into more complex ones, the concept of comorphism also supports ‘encodings’ of more complex logics into simpler ones. Famous cases such as the translation of classical propositional logic into intuitionistic logic by Kolmogorov (1925) or the standard translation of modal logic into first-order logic by van Benthem (1988) can be presented as comorphisms. Most often the cost representing the difference in complexity is paid at the level of the translation of the signatures, with $\Phi$ mapping signatures to theories in the target institution. This can be explained as a comorphism $\mathcal{I} \rightarrow \mathcal{I}’^{\mathrm{th}}$ where $\mathcal{I}’^{\mathrm{th}}$ denotes the canonically defined institution of $\mathcal{I}’$-theories, that has as signatures the pairs $(\Sigma,E)$ where $\Sigma$ is an $\mathcal{I}’$-signature and $E$ a set of $\Sigma$-sentences. (Note that ‘theories’ here are mere sets of sentences rather than sets of sentences closed under some consequence relation as meant in some logical studies.) The comorphisms $\mathcal{I} \rightarrow \mathcal{I}’^{\mathrm{th}}$ are sometimes called theoroidal comorphisms. The two encodings mentioned above represent a notable exception, as they can be presented as plain rather than theoroidal comorphisms (Goguen and Roşu, 2002), in both cases the encoding cost being paid at the level of the translation of the sentences.

c. Borrowing

Comorphism-based encodings between institutions represent the main tool for the institution-theoretic approach to the logic-by-translation paradigm. Suppose we want to establish a property $P$ in an institution $\mathcal{I}$ but for various reasons this is rather difficult. Then we can look for a suitable encoding $\mathcal{I} \rightarrow \mathcal{I}’$ such that the translation of $P$ along the encoding can be established in $\mathcal{I}’$, and moreover such that we can reflect this conclusion back to $\mathcal{I}$ in the form of $P$. In this case we say that $P$ is ‘borrowed’ from $\mathcal{I}’$ along the encoding. This requires the respective encoding to satisfy some specific properties that are conducive to the borrowing process. The institution theory literature abounds in such examples, which include interpolation, definability, diagrams, saturated models, etc.

One especially important case is that of the borrowing of a consequence relation. Let $P$ be a consequence $E \models_\Sigma \rho$ to be established. If it holds then by a simple argument relying upon the Satisfaction Condition of the comorphism we have that $$\alpha_\Sigma (E) \models’_{\Phi(\Sigma)} \alpha_\Sigma (\rho)$$ holds too. If $\mathcal{I}’$ comes equipped with a complete proof system then $$\alpha_\Sigma (E) \models’_{\Phi(\Sigma)} \alpha_\Sigma (\rho)$$ may be established by proof-theoretic means. Now in order to get the conclusion back to $\mathcal{I}$ we need a reflection property for the semantic consequence, which does not hold in general. However a standard way to ensure this is to check that the comorphism is conservative, which means that for any $\Sigma$-model $M$ there exists a $\Phi(\Sigma)$-model $M’$ such that $\beta_\Sigma (M’) = M$.
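
The reflection step for a conservative comorphism can be spelled out as a short derivation. This is a sketch of the usual argument, using only the Satisfaction Condition and conservativity as stated above; $M \models_\Sigma E$ abbreviates that $M$ satisfies every sentence of $E$. Assume $$\alpha_\Sigma (E) \models’_{\Phi(\Sigma)} \alpha_\Sigma (\rho)$$ and let $M$ be any $\Sigma$-model such that $M \models_\Sigma E$. By conservativity there exists a $\Phi(\Sigma)$-model $M’$ with $\beta_\Sigma (M’) = M$. Applying the Satisfaction Condition to each sentence of $E$ gives $$\beta_\Sigma (M’) \models_\Sigma E \;\mbox{ implies }\; M’ \models’_{\Phi(\Sigma)} \alpha_\Sigma (E),$$ hence by the assumption $M’ \models’_{\Phi(\Sigma)} \alpha_\Sigma (\rho)$. Applying the Satisfaction Condition once more, now to $\rho$, gives $$M = \beta_\Sigma (M’) \models_\Sigma \rho.$$ Since $M$ was an arbitrary model of $E$, it follows that $E \models_\Sigma \rho$.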

d. Notes

The importance of comorphisms and other types of structure preserving mappings between institutions has been understood gradually, with the paper (Goguen and Roşu, 2002) presenting a systematic comparative overview of the different notions of morphisms between institutions. That paper also fixed a lot of current terminology, including for example the term ‘comorphism’.

The work (Mossakowski et al., 2005) gave the institution-theoretic answer to the contest question of the 1st World Congress on Universal Logic (Montreux, Switzerland, 2005), namely what is the identity of a logic? The answer was that it is an equivalence class of institutions under equivalence of institutions. Briefly, the concept of equivalence of institutions proposed by Mossakowski et al. (2005) is a comorphism (or alternatively a morphism) that consists of categorical equivalences at all levels.

The borrowing of a consequence relation is the foundation for formal verification by translation within the context of logic-based formal system specification. A specification based upon a source institution $\mathcal{I}$ may have very good expressivity and readability but $\mathcal{I}$ may lack adequate proof support. The solution is to shift the formal verification process across an encoding to an institution $\mathcal{I}’$ that is supported by good proof tools, such as theorem provers, proof assistants, and so forth. Within the context of the heterogeneous specification paradigm this has become a rather standard practice (Mossakowski, 2005), which is very economical since it makes good use of existing tools instead of building new ones.

5. Contributions to Computing Science

The contribution of institution theory to computing science is manifold. The most basic one is that it sets a standard style of developing new specification languages: first define a logical system captured as an institution, and then develop all the language constructs so that they are solidly and rigorously backed by corresponding mathematical entities in the underlying institution.

a. Structured Specifications

Software systems tend to be very complex, and likewise their formal specifications. One key device for the management of this complexity is a structuring mechanism for specifications that allows them to be developed in a modular way. It has been noticed that structuring systems are largely, if not completely, independent of the logical systems underlying specification formalisms. Hence the idea to study the structuring of specifications and programs at a conceptual level that abstracts away the details of logical systems and instead exploits some of their compositionality properties. This has been the original motivation for institution theory.

Given an institution, considered abstractly, a structured specification is just a term formed by applying a specific fixed set of building operators to blocks consisting of finite sets of sentences in the institution (called basic, flat or unstructured specifications). Then by induction on the structure of a specification $\mathit{SP}$ one may calculate its signature $\mathrm{Sig}(\mathit{SP})$ and its class of models $\mathrm{Mod}(\mathit{SP})$. Seminal works such as (Sannella and Tarlecki, 1988) have proposed a core set of specification-building operators consisting of renaming (translations), sum, and hiding (derivation), which can express most of the concrete structuring operations in the actual specification languages. Often an initial semantics operator is added in order to deal with initial semantics specification modules. A lot of results have been developed on this conceptual basis, providing a uniform and solid foundation for modular software development. A particularly remarkable one is the layered completeness result of Borzyszkowski (2002) that lifts a proof system (considered in the form of an abstract entailment system) from the base institution to structured specifications; under a set of general conditions expressed as properties of the base institution, the completeness of the former yields the completeness of the latter.
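
The inductive calculation of $\mathrm{Sig}(\mathit{SP})$ and $\mathrm{Mod}(\mathit{SP})$ can be sketched over propositional logic. Everything concrete below is an assumption made for illustration: signatures are finite sets of variable names related only by inclusions, models are the sets of true variables, sentences are Python predicates on models, and the classes `Basic`, `Sum` and `Hide` with the model-class clauses used here are one common reading of the operators named above, not a definition taken from the cited works. Renaming is omitted.

```python
from dataclasses import dataclass
from itertools import combinations

def all_models(sig):
    """Every model of a signature: each subset is the set of true variables."""
    items = sorted(sig)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

@dataclass(frozen=True)
class Basic:          # a flat specification: signature plus axioms
    sig: frozenset
    axioms: tuple     # predicates taking a model and returning bool

@dataclass(frozen=True)
class Sum:            # sum of two specifications
    left: object
    right: object

@dataclass(frozen=True)
class Hide:           # keep only the visible symbols (assumed to be a subset)
    body: object
    visible: frozenset

def signature(sp):
    if isinstance(sp, Basic):
        return sp.sig
    if isinstance(sp, Sum):
        return signature(sp.left) | signature(sp.right)
    if isinstance(sp, Hide):
        return sp.visible
    raise TypeError(sp)

def models(sp):
    if isinstance(sp, Basic):
        return {m for m in all_models(sp.sig) if all(ax(m) for ax in sp.axioms)}
    if isinstance(sp, Sum):
        sl, sr = signature(sp.left), signature(sp.right)
        ml, mr = models(sp.left), models(sp.right)
        # A model of the sum must reduce to a model of each component.
        return {m for m in all_models(sl | sr)
                if (m & sl) in ml and (m & sr) in mr}
    if isinstance(sp, Hide):
        # Reducts of the models of the body along the inclusion.
        return {m & sp.visible for m in models(sp.body)}
    raise TypeError(sp)

sp1 = Basic(frozenset({"p", "q"}), (lambda m: "p" not in m or "q" in m,))  # p => q
sp2 = Basic(frozenset({"p"}), (lambda m: "p" in m,))                       # p
sp = Hide(Sum(sp1, sp2), frozenset({"q"}))
print(signature(sp) == frozenset({"q"}))   # True
print(models(sp) == {frozenset({"q"})})    # True
```

In this example hiding `p` leaves a specification over `q` alone whose only model makes `q` true, although no axiom over the visible signature was ever written down.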

A later development by Diaconescu (2012a) proposes an even more abstract approach that avoids particular choices of sets of specification building operators. This abstracts the (structured) specifications as signatures in an abstract ‘upper level’ institution $\mathcal{I}’$, the relationship to the base institution $\mathcal{I}$ (representing the underlying logic) being axiomatised as a special kind of institution morphism, whose sentence translations are identities, and whose model translations/reducts are inclusions.

b. Heterogeneous Specifications

The increasing complexity of current systems has led to an understanding of the limitations of specification formalisms based upon single logical systems. Hence the emergence of heterogeneous specification environments based on multiple logical systems. A standard view of such heterogeneous logical environments, actually realised by CafeOBJ (Diaconescu and Futatsugi, 1998) and Hets (Mossakowski, 2005), is as diagrams of institutions that are linked by comorphisms. However this raises the difficult question of how to lift specification theory to the heterogeneous level. The answer comes from a category-theoretic construction that originated in algebraic geometry with Grothendieck (1963), and that can be replicated for institutions in order to ‘flatten’ a diagram of institutions to a single institution that retains all data provided by the respective diagram of institutions. This construction is known as the Grothendieck institution associated to the respective diagram. The main idea is to aggregate all institutions of the diagram into a big institution and label all the entities by their origin (node or edge in the diagram). This means that a signature of the Grothendieck institution is a pair $(i,\Sigma)$ where $i$ is a node in the diagram and $\Sigma$ is a signature in the institution at $i$. A signature morphism $(i,\Sigma) \rightarrow (i’,\Sigma’)$ in the Grothendieck institution is a pair $(u,\varphi)$ where $u$ is an edge in the diagram $i \rightarrow i’$ and $\varphi$ is a signature morphism in the institution at $i’$ from the translation of $\Sigma$ across the institution comorphism at $u$ to $\Sigma’$. The $(i,\Sigma)$-sentences are just the $\Sigma$-sentences (at $i$) and likewise the models, but the translations of both sentences and models across $(u,\varphi)$ make use of the respective translations given by the comorphism at $u$ and the local translations corresponding to $\varphi$. The Grothendieck institution satisfaction relation $\models_{(i,\Sigma)}$ is just $\models_\Sigma$ of the institution at $i$.

A series of properties required by specification theory have been gradually established for Grothendieck institutions. These results usually provide sets of sufficient (and often necessary) conditions for lifting institution-theoretic properties from the local level of the component institutions to the global level of the Grothendieck institution.

c. Ontologies

Computer science ontologies can be regarded as logic-based formal specifications. On the basis of this observation Goguen (2006) introduces the institution-theoretic trend in the theory of ontologies, which represents more or less a rephrasing of well established specification theory concepts and results in an ontology-theoretic setup. This has brought several notable gains for ontologies, such as very well developed structuring technologies and the Grothendieck institution approach to logical heterogeneity (Kutz et al., 2010).

This line of development plays the core role in the OMG standard Ontology, Modeling and Specification Integration and Interoperability (OntoIOp).

d. Logic Programming

The Herbrand theorems as foundations of the logic-programming paradigm have been developed at the level of abstract institutions by Diaconescu (2004c). This has opened the possibility to develop logic programming over non-conventional logical structures, thus providing solid foundations for the combination of logic programming and other computing paradigms. Particularly important results in this line of developments are the Herbrand-based foundations for constraint solving (Diaconescu, 2008), which show that at the denotational level constraint solving is just an instance of plain (abstract) logic programming, and logic programming for services (Ţuţu and Fiadeiro, 2015b) that extends the original approach over a concept of generalized substitution system (Ţuţu and Fiadeiro, 2015a).

e. Notes

The work on the specification language Clear (Burstall and Goguen, 1980) was very influential and preceded the institution-theoretic structuring of specifications. Another particularly influential work in this area was (Diaconescu et al., 1993).

Grothendieck institutions, preceded by an attempt by Diaconescu (1998) to lift ‘by-hand’ specification theory concepts and results to the heterogeneous level, have been originally introduced by Diaconescu (2002) but in a slightly improper way using institution morphisms at the edges of the diagram representing the heterogeneous environment; that was soon corrected by Mossakowski (2002) to comorphisms.

The treatment of variables as signature extensions outlined in Sec. 3.c plays a key role in the institution-independent study of logic programming. It is an essential feature of the institutional approach to Herbrand theorems, one that has enabled the development of some of the most fundamental semantic concepts of logic programming, like query and solution, in arbitrary institutions. Despite such mild assumptions, the logic-programming semantics of services (Ţuţu and Fiadeiro, 2015b) does not fit into the framework proposed in (Diaconescu, 2004c). This has led to an upgrade of the original approach, to Herbrand theorems over a concept of generalized substitution system (Ţuţu and Fiadeiro, 2015a) that extends institutions by allowing for direct representations of variables and substitutions, much in the spirit of context institutions (Pawlowski, 1996). The connection between the substitution-system-based and the original institution-independent approach to logic programming was studied in depth in (Ţuţu and Fiadeiro, 2015c).

6. Extensions

While non-classical logics can be captured properly as institutions, some of their finer aspects may be beyond conventional institution theory. For example, the institutions of many-valued logics handle the ternary aspect of the satisfaction relation by adjoining truth values to the sentences; in this way we get a binary satisfaction relation. While this works well with respect to most aspects of many-valued logics, it cannot handle graded (non-binary) consequences. The same goes for modal logics: the conventional concept of institution does not allow for a general semantics of modalities. These shortcomings have been overcome by extensions of the definition of institution towards non-classical aspects of logics.

a. Many-valued Truth

The many-valued extension of institution theory is very simple: just replace the binary truth values with any set $L$ of truth values. The satisfaction relation becomes a function $$\models_\Sigma \,\colon\; \mathrm{Sen}(\Sigma) \times \mathrm{Mod}(\Sigma) \rightarrow L$$ that for each sentence and model gives a truth value interpreted as a satisfaction degree. The Satisfaction Condition gets rephrased as $$(\mathrm{Mod}(\varphi)(M’) \models_\Sigma \rho) = (M’ \models_{\Sigma’} \mathrm{Sen}(\varphi)(\rho)).$$ Very often it is necessary that $L$ comes as a partial order of some kind, such as a lattice. In order to get a genuine many-valued implication one needs even more, namely that $L$ is a residuated lattice.
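
The rephrased Satisfaction Condition can be checked by brute force in a small three-valued propositional setting. The sketch below rests on assumptions that are not part of the text above: $L$ is taken to be the chain `0.0 < 0.5 < 1.0`, the connectives are interpreted by `min`, `max` and `1 - x`, signature morphisms are dictionaries renaming variables, and the names `sat`, `translate` and `reduct` are hypothetical stand-ins for $\models$, $\mathrm{Sen}(\varphi)$ and $\mathrm{Mod}(\varphi)$.

```python
from itertools import product

L = (0.0, 0.5, 1.0)   # assumed set of truth values, ordered as a chain

def sat(model, rho):
    """Satisfaction degree of a sentence (nested tuples) in a model (dict)."""
    op = rho[0]
    if op == "var":
        return model[rho[1]]
    if op == "not":
        return 1.0 - sat(model, rho[1])
    if op == "and":
        return min(sat(model, rho[1]), sat(model, rho[2]))
    if op == "or":
        return max(sat(model, rho[1]), sat(model, rho[2]))
    raise ValueError(f"unknown connective: {op}")

def translate(phi, rho):
    """Sentence translation: rename the variables of rho along phi."""
    if rho[0] == "var":
        return ("var", phi[rho[1]])
    return (rho[0],) + tuple(translate(phi, arg) for arg in rho[1:])

def reduct(phi, model_prime):
    """Model reduct: a source variable gets the value of its image."""
    return {x: model_prime[phi[x]] for x in phi}

def satisfaction_condition_holds(phi, target_sig, rho):
    """Compare both sides of the condition for every target model."""
    names = sorted(target_sig)
    for values in product(L, repeat=len(names)):
        model_prime = dict(zip(names, values))
        if sat(reduct(phi, model_prime), rho) != sat(model_prime, translate(phi, rho)):
            return False
    return True

phi = {"p": "a", "q": "a"}                         # collapses p and q onto a
rho = ("or", ("var", "p"), ("not", ("var", "q")))
print(satisfaction_condition_holds(phi, {"a", "b"}, rho))  # True
```

The morphism `phi` deliberately collapses two variables, showing that the equality of satisfaction degrees does not depend on the signature morphism being injective.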

b. Modalities

The stratified institutions of Aiguier and Diaconescu (2007) refine institutions by considering models with states. Thus each model $M$ comes equipped with a designated set $[\![ M ]\!]$, and this is subject to several coherence conditions. In the case of Kripke models $M$, $[\![ M ]\!]$ gives the set of the possible worlds.

c. Notes

The idea to extend the definition of institution to many-valued truth arose at the beginning of institution theory (Mayoh, 1985) motivated by research in data base theory; this was already mentioned in (Goguen and Burstall, 1992). Later works on many-valued institutions include (Eklund and Helgesson, 2010; Diaconescu, 2013, 2014).

7. References and Further Reading

a. Primary Sources

Răzvan Diaconescu. Institution-independent Model Theory. Birkhäuser, 2008.
Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for specification and programming. Journal of the Association for Computing Machinery, 39(1):95–146, 1992.
Joseph Goguen and Grigore Roşu. Institution morphisms. Formal Aspects of Computing, 13:274–307, 2002.
José Meseguer. General logics. In H.-D. Ebbinghaus et al., editors, Proceedings, Logic Colloquium, 1987, pages 275–329. North-Holland, 1989.
Till Mossakowski, Joseph Goguen, Răzvan Diaconescu, and Andrzej Tarlecki. What is a logic? In Jean-Yves Béziau, editor, Logica Universalis, pages 113–133. Birkhäuser, 2005.
Till Mossakowski, Răzvan Diaconescu, and Andrzej Tarlecki. What is a logic translation? Logica Universalis, 3(1):59–94, 2009.
Donald Sannella and Andrzej Tarlecki. Foundations of Algebraic Specifications and Formal Software Development. Springer, 2012.

b. Secondary Sources

Marc Aiguier and Răzvan Diaconescu. Stratified institutions and elementary homomorphisms. Information Processing Letters, 103(1):5–13, 2007.
Arrais and José L. Fiadeiro. Unifying theories in different institutions. In Magne Haveraaen, Olaf Owe, and Ole-Johan Dahl, editors, Recent Trends in Data Type Specification, volume 1130 of Lecture Notes in Computer Science, pages 81–101. Springer, 1996.
Egidio Astesiano, Michel Bidoit, Hélène Kirchner, Bernd Krieg-Brückner, Peter Mosses, Don Sannella, and Andrzej Tarlecki. CASL: The common algebraic specification language. Theoretical Computer Science, 286(2):153–196, 2002.
Tomasz Borzyszkowski. Logical systems for structured specifications. Theoretical Computer Science, 286(2):197–245, 2002.
Rod Burstall and Joseph Goguen. The semantics of Clear, a specification language. In Dines Bjorner, editor, 1979 Copenhagen Winter School on Abstract Software Specification, volume 86 of Lecture Notes in Computer Science, pages 292–332. Springer, 1980.
Maura Cerioli and José Meseguer. May I borrow your logic? (transporting logical structures along maps). Theoretical Computer Science, 173:311–347, 1997.
Mihai Codescu and Daniel Găină. Birkhoff completeness in institutions. Logica Universalis, 2(2):277–309, 2008.
Răzvan Diaconescu. Extra theory morphisms for institutions: logical semantics for multi-paradigm languages. Applied Categorical Structures, 6(4):427–453, 1998. A preliminary version appeared as JAIST Technical Report IS-RR-97-0032F in 1997.
Răzvan Diaconescu. Grothendieck institutions. Applied Categorical Structures, 10(4):383–402, 2002. Preliminary version appeared as IMAR Preprint 2-2000, ISSN 250-3638, February 2000.
Răzvan Diaconescu. Institution-independent ultraproducts. Fundamenta Informaticæ, 55(3-4):321–348, 2003.
Răzvan Diaconescu. An institution-independent proof of Craig Interpolation Theorem. Studia Logica, 77(1):59–79, 2004a.
Răzvan Diaconescu. Elementary diagrams in institutions. Journal of Logic and Computation, 14(5):651–674, 2004b.
Răzvan Diaconescu. Herbrand theorems in arbitrary institutions. Information Processing Letters, 90:29–37, 2004c.
Răzvan Diaconescu. An axiomatic approach to structuring specifications. Theoretical Computer Science, 433:20–42, 2012a.
Răzvan Diaconescu. Borrowing interpolation. Journal of Logic and Computation, 22(3):561–586, 2012b.
Răzvan Diaconescu. Institutional semantics for many-valued logics. Fuzzy Sets and Systems, 218:32–52, 2013.
Răzvan Diaconescu. Graded consequence: an institution theoretic study. Soft Computing, 18(7):1247–1267, 2014.
Răzvan Diaconescu and Kokichi Futatsugi. CafeOBJ Report: The Language, Proof Techniques, and Methodologies for Object-Oriented Algebraic Specification, volume 6 of AMAST Series in Computing. World Scientific, 1998.
Răzvan Diaconescu and Marius Petria. Saturated models in institutions. Archive for Mathematical Logic, 49(6):693–723, 2010.
Răzvan Diaconescu and Petros Stefaneas. Ultraproducts and possible worlds semantics in institutions. Theoretical Computer Science, 379(1):210–230, 2007.
Răzvan Diaconescu, Joseph Goguen, and Petros Stefaneas. Logical support for modularisation. In Gerard Huet and Gordon Plotkin, editors, Logical Environments, pages 83–130. Cambridge, 1993. Proceedings of a Workshop held in Edinburgh, Scotland, May 1991.
Patrick Eklund and Robert Helgesson. Monadic extensions of institutions. Fuzzy Sets and Systems, 161:2354–2368, 2010.
Joseph Goguen. Types as theories. In George Michael Reed, Andrew William Roscoe, and Ralph F. Wachter, editors, Topology and Category Theory in Computer Science, pages 357–390. Oxford, 1991. Proceedings of a Conference held at Oxford, June 1989.
Joseph Goguen. Data, schema, ontology and logic integration. Journal of IGPL, 13(6):685–715, 2006.
Joseph Goguen and Rod Burstall. Introducing institutions. In Edward Clarke and Dexter Kozen, editors, Proceedings, Logics of Programming Workshop, volume 164 of Lecture Notes in Computer Science, pages 221–256. Springer, 1984.
Daniel Găină. Forcing, downward Löwenheim-Skolem and omitting types theorems, institutionally. Logica Universalis, 8(3):469–498, 2014.
Daniel Găină and Marius Petria. Completeness by forcing. Journal of Logic and Computation, 20(6):1165–1186, 2010.
Daniel Găină and Andrei Popescu. An institution-independent proof of Robinson consistency theorem. Studia Logica, 85(1):41–73, 2007.
Daniel Găină, Kokichi Futatsugi, and Kazuhiro Ogata. Constructor-based logics. Journal of Universal Computer Science, 18:2204–2233, 2012.
Oliver Kutz, Till Mossakowski, and Dominik Lücke. Carnap, Goguen, and the hyperontologies – logical pluralism and heterogeneous structuring in ontology design. Logica Universalis, 4(2):255–333, 2010.
Brian Mayoh. Galleries and institutions. Technical Report DAIMI PB-191, Aarhus University, 1985.
Till Mossakowski. Comorphism-based Grothendieck logics. In K. Diks and W. Rytter, editors, Mathematical foundations of computer science, volume 2420 of Lecture Notes in Computer Science, pages 593–604. Springer, 2002.
Till Mossakowski. Heterogeneous specification and the heterogeneous tool set. Habilitation thesis, University of Bremen, 2005.
Wieslaw Pawlowski. Context institutions. In Magne Haveraaen, Olaf Owe, and Ole-Johan Dahl, editors, Recent Trends in Data Type Specification, volume 1130 of Lecture Notes in Computer Science, pages 436–457. Springer, 1996.
Marius Petria and Răzvan Diaconescu. Abstract Beth definability in institutions. Journal of Symbolic Logic, 71(3):1002–1028, 2006.
Donald Sannella and Andrzej Tarlecki. Specifications in an arbitrary institution. Information and Control, 76:165–210, 1988.
Andrzej Tarlecki. On the existence of free models in abstract algebraic institutions. Theoretical Computer Science, 37:269–304, 1986a.
Andrzej Tarlecki. Bits and pieces of the theory of institutions. In David Pitt, Samson Abramsky, Axel Poigné, and David Rydeheard, editors, Proceedings, Summer Workshop on Category Theory and Computer Programming, volume 240 of Lecture Notes in Computer Science, pages 334–360. Springer, 1986b.
Andrzej Tarlecki. Quasi-varieties in abstract algebraic institutions. Journal of Computer and System Sciences, 33(3):333–360, 1986c.
Ionuţ Ţuţu and José L. Fiadeiro. From conventional to institution-independent logic programming, 2015a.
Ionuţ Ţuţu and José L. Fiadeiro. Service-oriented logic programming, 2015b.
Ionuţ Ţuţu and José L. Fiadeiro. Revisiting the institutional approach to Herbrand’s theorem. In Algebra and Coalgebra in Computer Science, Leibniz International Proceedings in Informatics. Schloss Dagstuhl, 2015c.
George Voutsadakis. Categorical abstract algebraic logic: Algebrizable institutions. Applied Categorical Structures, 10:531–568, 2002.

c. Auxiliary Non-Institutional Sources

Jon Barwise. Axioms for abstract model theory. Annals of Mathematical Logic, 7:221–265, 1974.
Jean-Yves Béziau, editor. Universal Logic: an Anthology. Studies in Universal Logic. Springer Basel, 2012.
Chen-Chung Chang and H. Jerome Keisler. Model Theory. North Holland, Amsterdam, 1990.
Samuel Eilenberg and Saunders Mac Lane. General theory of natural equivalences. Transactions of the American Mathematical Society, 58:231–294, 1945.
Joseph Goguen. A categorical manifesto. Mathematical Structures in Computer Science, 1(1):49–67, March 1991. Also available as Programming Research Group Technical Monograph PRG–72, Oxford University, March 1989.
Alexandre Grothendieck. Catégories fibrées et descente. In Revêtements étales et groupe fondamental, Séminaire de Géométrie Algébraique du Bois-Marie 1960/61, Exposé VI. Institut des Hautes Études Scientifiques, 1963. Reprinted in Lecture Notes in Mathematics, Volume 224, Springer, 1971, pages 145–94.
Andrei N. Kolmogorov. On the principle of the excluded middle. Matematicheskii Sbornik, 32:646–667, 1925. (in Russian).
Jerzy Łoś. Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres. In Mathematical Interpretation of Formal Systems, pages 98–113. North-Holland, Amsterdam, 1955.
Saunders Mac Lane. Categories for the Working Mathematician. Springer, second edition, 1998.
Günter Matthiessen. Regular and strongly finitary structures over strongly algebroidal categories. Canadian Journal of Mathematics, 30:250–261, 1978.
Alfred Tarski. The semantic conception of truth. Philos. Phenomenological Research, 4:13–47, 1944.
Johan van Benthem. Modal Logic and Classical Logic. Humanities Press, 1988.

Author Information

Răzvan Diaconescu
Email: Razvan.Diaconescu@imar.ro
Simion Stoilow Institute of Mathematics of the Romanian Academy
Romania

Francisco Suárez (1548—1617)

Sometimes called the “Eminent Doctor” after Paul V’s designation of him as doctor eximius et pius, Francisco Suárez was the leading theological and philosophical light of Spain’s Golden Age, alongside such cultural icons as Miguel de Cervantes, Tomás Luis de Victoria, and El Greco. Although initially rejected on grounds of deficient health and intelligence when he attempted to join the rapidly growing Society of Jesus, he attained international prominence within his lifetime. He taught at the schools in Segovia, Valladolid, Rome, Alcalá, Salamanca, and finally at Coimbra, the last at Philip II’s insistence.

Not all of the attention Suárez received was positive. His Defensio fidei, published in 1613, defended a theory of political power that was widely perceived to undermine any monarch’s absolute right to rule. He explicitly permitted tyrannicide and argued that even monarchs who come to power legitimately can become tyrants and thereby lose their authority. Such views led to the book being publicly burned in London and Paris.

Suárez was clearly a scholastic in style and temperament, despite coming after the rise of humanism and living on the cusp of what is usually identified as the era of modern philosophy. His writings are sometimes said to contain the whole of scholastic philosophy because in addressing a question he surveys the full range of scholastic positions—Thomist, Scotist, nominalist, and others—before affirming one of those positions or presenting his own variant. The position he ultimately settles on is likely to be a via media.

Suárez’s greatness as a philosopher comes precisely from his magisterial weighing of all the competing positions across an extraordinarily broad range of theological and philosophical issues. The combination of broad systematicity, detailed elaboration, and thorough argumentation for his preferred view and against contrary views finds few rivals. He is a philosopher’s philosopher.

The even-handed presentation of the panoply of scholastic positions also explains why Suárez’s writing served as one of the key conduits through which medieval philosophy influenced early modern philosophy. Descartes, Leibniz, and Wolff, among others, learned scholasticism at least in part from reading Suárez, a scholasticism from which they then borrowed in developing their own philosophical theories.

Life
Writings
Thought
Legacy
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Life

Francisco Suárez was born on January 5, 1548 in Granada to Gaspar Suárez de Toledo and Antonia Vázquez de Utiel, a mere half-century after the Catholic Monarchs, Ferdinand and Isabella, wrested the city from eight centuries of Muslim control. The family was prosperous, although members of earlier generations on the maternal side ran into trouble with the Inquisition due to Jewish lineage, and several members were burned at the stake. Suárez’s uncle, Francisco Toledo de Herrera, was a prominent professor of philosophy who became the first Jesuit cardinal. His extended family included others of some note, including an archbishop cardinal and a viceroy of Peru.

Suárez was the second son of eight children. His brother Baltasar also joined the Jesuit order and was sent to the Philippines but died en route. Another brother became a priest, and three sisters joined a Jeronymite convent. Whether these facts indicate an exceptionally devout family or an attempt to mitigate some converso ancestry is less clear.

His childhood seems to have been unexceptional. He was schooled in Latin and rhetoric until thirteen, at which point he went to Salamanca to study canon law for three years. His academic performance at this point was lackluster. During his time at Salamanca he heard the legendary sermons of the Jesuit Juan Ramirez and felt the call to join the Jesuit order. In a decision that seems stunning in retrospect, the then-fledgling order rejected Suárez on grounds of insufficient intellectual aptitude. Suárez persisted, and finally, after numerous appeals, he was admitted to the order in 1564 as an indifferent, that is, provisionally accepted with the understanding that he might be refused entry into the priesthood.

At this point, Suárez’s academic abilities seem to have flowered so suddenly that some biographers attribute the flowering to the miraculous intervention of Mary, the Mother of God. His newfound academic abilities did not go unnoticed, and he was sent to study theology at the University of Salamanca, then at the height of its glory as one of the most prestigious universities and at the center of the Iberian revival of scholasticism. In 1570 he performed the “Grand Act” at Salamanca, something done only by the most gifted students. The Grand Act was a public academic exercise, resembling a medieval quodlibetal dispute, in which the student needed to be able to answer questions and resolve difficulties posed by professors and visitors. Suárez had enough of a reputation already to ensure prominent guests and a full hall. His new reputation also led his superiors to have him start teaching philosophy rather than first teaching grammar or rhetoric as was usually the case. Between 1570 and 1580, Suárez taught at several institutions around Spain: Salamanca, Segovia, Valladolid, and Avila. His earliest works come from this period.

Charges of novelty also started in this period. On Suárez’s telling, the problem was that he refused to teach by dictation from copy books but rather searched “for truth at its very roots.” The resulting controversy may have factored into his new position. In 1580 Suárez was called to Rome to join the Collegio Romano and to contribute to the development of the famous Jesuit pedagogical document, the Ratio Studiorum. The Roman College was an intellectually stimulating place to be, including such luminaries as the humanist Francisco Sanches, the theologians Robert Bellarmine and Gregory of Valencia, and the mathematician Christopher Clavius. Suárez also became a close colleague of the extraordinarily young general of the Jesuit order, Claudio Acquaviva.

In 1585 Suárez started teaching at the University of Alcalá. His seven years there were marked by strife with other theologians, including, notably, with his colleague Gabriel Vasquez, a dispute that left its mark on Jesuit philosophical history for generations. Suárez was unhappy with the distraction of all these conflicts and requested release from his position. Finally, in 1592, he was sent back to the university of his student days, Salamanca, where he wrote his best-known work, the Disputationes metaphysicae.

In 1597, the same year that Disputationes metaphysicae was published, Suárez moved yet again, this time to Coimbra in Portugal at the behest of Philip II. The first time Philip asked Suárez to move to Coimbra, in 1596, Suárez declined since he feared Coimbra’s teaching responsibilities would keep him from his writing projects. Because Spain had claimed Portugal during the 1580 Portuguese succession crisis, Suárez may also have feared the political situation, since the Portuguese were likely to be less than welcoming of a Spanish professor appointed by a Spanish king (even if the Spanish king was also Philip I of Portugal). Furthermore, Suárez would have occupied a new Jesuit chair at a Dominican university and tensions were running high between the two orders. Philip was disinclined to accept Suárez’s declination, but granted it after a personal visit. Unfortunately, the person appointed in his stead died at the end of 1596, raising the issue all over again. This time Philip insisted that Suárez move to Coimbra. An amusing consequence was that Suárez now needed to acquire a doctoral degree, since the Coimbra faculty objected to Suárez’s post without one. The Jesuit Provincial in Lisbon was happy to confer one, but this failed to satisfy. Finally, Suárez made a trip to the University of Evora in southern Portugal, where he directed a public theological debate and was rewarded with a doctorate.

Suárez stayed at Coimbra until his retirement in 1613, although this was a retirement only from teaching. Among other projects, Suárez hoped to revise an earlier set of lecture notes into a commentary on Aristotle’s De anima. The revision remained unfinished, however, at Suárez’s death on September 25, 1617.

2. Writings

By Suárez’s time, Aquinas’s Summa theologiae (henceforth, ST) had to some extent replaced Lombard’s Sentences as the standard theological textbook and subject of commentaries. Many of Suárez’s works are offered as commentaries on particular sections of ST. Suárez generally does not, however, offer line-by-line comments on Aquinas’s texts; still, the arrangement of subjects in his work mirrors that in Aquinas’s text, the questions he considers often grow out of it, and he constantly cites passages from it.

In correspondence to ST Ia (that is, the First Part), Suárez wrote De Deo uno et trino (“On God, One and Triune”), De angelis (“On Angels”), De opere sex dierum (“On the Work of the Six Days [of Creation]”), and De anima (“On the Soul”). The last work, in particular, has received significant attention from Suárez scholars and is the primary source for his psychological views. One should also note in passing that titles are apt to be a source of confusion for those less familiar with Suárez’s works, since scholars will sometimes use titles for smaller or larger divisions of his work. For example, references to De divina substantia ejusque attributis, De divina praedestinatione, and De SS. Trinitatis mysterio are all to treatises that make up the De Deo uno et trino mentioned above. Conversely, the last three works listed above (De angelis, De opere sex dierum, and De anima) are sometimes gathered under the title De Deo effectore creaturarum omnium, though they are more commonly cited individually. Listing all the variations would be tedious, but readers should note that a citation of a seemingly unfamiliar work of Suárez’s might simply be using the title of a collection of treatises or the title of a part of a larger collection.

In correspondence to ST IaIIae (that is, the First Section of the Second Part), Suárez wrote De ultimo fine hominis (“On the Ultimate End of Man”), De voluntario et involuntario (“On the Voluntary and Involuntary”), De bonitate et malitia actuum humanorum (“On the Goodness and Evil of Human Acts”), De passionibus et habitibus (“On Passions and Habits”), and De vitiis atque peccatis (“On Vices and Sins”), all published together in one volume under the title Tractatus quinque ad primam secundae D. Thomae. These works have received less scholarly attention, although they are obviously of great relevance for determining Suárez’s ethical views. De legibus seu de Deo legislatore (“On Laws or on God the Lawgiver”) also corresponds to IaIIae and has received more attention as one of Suárez’s greatest and most influential works. Seeing De legibus as a commentary on Aquinas’s ST also helps one to understand it properly. It is sometimes read as a comprehensive presentation of Suárez’s ethical views, but Suárez himself conceives of it as corresponding to questions 90-108 of ST IaIIae. Those questions, of course, constitute a small fraction of Aquinas’s ethical thought. Finally, Suárez’s massive De gratia (“On Grace”) and some of the shorter theological works he wrote in connection with the controversy De auxiliis (to be addressed later) present an extraordinarily detailed account of grace and connected theological matters.

Suárez wrote fewer works on ST IIaIIae (a fact that is grist for the scholarly mill), but De fide theologica (“On Theological Faith”), De spe (“On Hope”), De caritate (“On Charity”), and the again massive De virtute et statu religionis (“On the Virtue and State of Religion”) do correspond to it. The last work offers a defense of the Society of Jesus along with a detailed examination of its principles. These works generally have received rather little scholarly attention, with the exception of the well-known disputation on war included in De caritate. That disputation is the main source for Suárez’s just-war theory.

Corresponding to ST IIIa are De verbo incarnato (“On the Incarnate Word”), De mysteriis vitae Christi (“On the Mysteries of Christ’s Life”), De sacramentis (“On the Sacraments”), and De censuris (“On Censures”). These works have received little attention from recent philosophers, but slightly more from theologians. De mysteriis vitae Christi, in particular, is significant for including several hundred pages of discussion of questions related to the Blessed Virgin Mary. It is this work that has earned Suárez a prominent place in the history of systematic Mariology.

Not all of Suárez’s works fit into the ST framework, especially those of his works sparked by religious and political controversies of his day. The famous controversy De auxiliis raged between the Dominicans and Jesuits during Suárez’s lifetime, and he wrote in defense of the Jesuit position. The central issue was how to explain human free will on the one side and divine foreknowledge, providence, and grace on the other side in such a way that they were compatible with each other and without falling into one heresy or another. Suárez’s works dealing with these matters are: De vera intelligentia auxilii efficacis (“On the True Understanding of Efficacious Aid”), De concursu, motione et auxilio Dei (“On God’s Concurrence, Motion, and Aid”), De scientia Dei futurorum contingentium (“On God’s Knowledge of Future Contingents”), De auxilio efficaci (“On Efficacious Aid”), De libertate divinae voluntatis (“On the Freedom of the Divine Will”), De meritis mortificatis, et per poenitentiam reparatis (“On Merits Destroyed and Revived through Penance”), and De justitia qua Deus reddit praemia meritis, et poenas pro peccatis (“On the Justice by Which God Gives Rewards for Merits and Punishment for Sins”).

More political in nature is the treatise De immunitate ecclesiastica a Venetis violata (“About the Ecclesiastical Immunity Violated by the Venetians”), a work written in defense of the papal position in the dispute between the papacy and the Republic of Venice about the extent of papal jurisdiction.

Best-known of Suárez’s controversial works is his response to the English monarch James I, Defensio fidei catholicae adversus anglicanae sectae errores (“A Defense of the Catholic Faith Against the Errors of the Anglican Sect”), written at the request of the papal nuncio in Madrid. Suárez’s initial aim was to respond to James I’s arguments for requiring Catholic subjects to take an oath of loyalty. The work became much more than that, however, and offers much of interest to theorists of political power. In it, Suárez opposes the absolute right of monarchs, argues that the papacy has indirect power over temporal rulers, and, perhaps most inflammatory, argues that there are situations in which citizens may legitimately resist a tyrannical monarch to the point of tyrannicide. The work was promptly condemned by the English king and publicly burned in London. James tried, with some success, especially in France, to enlist the support of other European monarchs in condemning Suárez and the Jesuit order more generally.

Last but certainly not least, Suárez’s best-known and most influential work is his Disputationes metaphysicae (“Metaphysical Disputations”; henceforth, DM). The work is meant to cover the questions pertaining to the twelve books of Aristotle’s Metaphysics, but part of its significance lies in its not being a commentary on Aristotle’s work and not following its organization. Suárez is unhappy with Aristotle’s organization and so writes a large, two-volume work in which he sets out a comprehensive treatment of metaphysics in systematic fashion, organized into fifty-four disputations. The work is widely credited with being the first to offer a comprehensive, systematically organized metaphysics and with initiating a long tradition of such works. DM was immediately and extraordinarily popular: it was widely reprinted all over Europe and quickly became the standard work to consult on metaphysical matters.

DM is divided into two main parts. The first part deals with the object of metaphysics, the concept of being, the transcendentals (that is, the essential properties of being as such, namely, unity, truth, and goodness), and the causes of beings. The second part deals with the divisions of being, first into infinite and finite. Finite being is then divided into substance and accident, with the latter then divided into the nine categories familiar from Aristotle. The last disputation concerns beings of reason, which, strictly speaking, fall outside the scope of metaphysics on Suárez’s conception, but an understanding of which is helpful for understanding metaphysics.

Aside from its systematic scope, few readers of DM have failed to be impressed by the extraordinary erudition displayed in its discussions. Each section includes a careful cataloguing of all the different positions that have been—and might be—taken with respect to the issue under question. DM’s two volumes include thousands of citations of hundreds of authors: Christian scholastic, Muslim, Patristic, ancient, and others. This may be the feature Schopenhauer has in mind when he characterizes DM as the “authentic compendium of the whole of scholastic wisdom.”

It is worth noting at this point that Suárez comes after centuries of a continuous tradition of professional theology and philosophy and consequently inherits a formidable accumulation of distinctions, technical terminology, and the like. Standard arguments and positions are often assumed or indicated in summary fashion. The resulting work is exceptionally sophisticated and can be highly rewarding to a reader with some familiarity with that tradition, but it can also be forbidding and alien to readers not so familiar.

The situation is not helped by the fact—astonishing in light of Suárez’s caliber and influence—that very little of his work is available either in critical edition or translation. Suárez’s works suffer far fewer textual issues than many medieval works do, so the lack of critical editions impedes less than it might. A significant number of his works were published posthumously, however, and the editorial decisions of his literary executor, Baltasar Alvarez, sometimes leave something to be desired. The value of translations, of course, is evident, yet aside from significant portions of DM very little has been translated into English (or any other language, for that matter). As the preceding presentation should have made clear, of course, Suárez was awe-inspiringly prolific, so a complete translation would be a monumental task. Furthermore, in addition to the mentioned works, there are still several unedited and unpublished works.

3. Thought

The first thing to note when trying to get one’s bearings with Suárez’s thought is that he, like many other scholastic authors, is a Christian Aristotelian. His thought is so thoroughly imbued with both Christianity and Aristotelianism that it would be difficult to find a single page in his writings not containing obvious traces of both. As is well-known, Aristotle says some things incompatible with orthodox Christianity, such as that the world had no beginning, so an orthodox Christian must modify Aristotelianism to some degree. Nonetheless, one can better understand what Suárez says and why he says it once recognizing that he is committed to orthodox Christian doctrines—more particularly, Roman Catholic doctrines—and that his philosophical framework and conceptions are rooted in Aristotle.

On the Christianity side, Suárez is committed to there being a God, a God who created and sustains human beings and the world they inhabit. God is a Trinity of Father, Son, and Holy Spirit, but also One, indeed, utterly simple. God exists necessarily, is infinite, immutable, and eternal. God is perfectly good, omniscient, and omnipotent, and orders everything in the universe providentially. Nevertheless, human beings sinned, consequently requiring God’s grace to attain their salvation and their ultimate end, that is, God. A central part of this story is the Incarnation of the second member of the Trinity, in which Christ assumes a human nature in addition to his divine nature. A moment’s reflection suffices to see that these doctrines raise a host of philosophical issues; much of Suárez’s work is devoted to such issues.

On the Aristotelian side, the most notable inheritances are hylomorphism (the view that objects are composed of matter and form); the four-causes explanatory paradigm (material, formal, efficient, and final); the categorial scheme of substance and nine categories of accidents; the view that the human soul is the form of the body; and the language of ultimate ends, happiness, and virtue in ethics. As scholars of medieval philosophy well know, this Aristotelian legacy leaves considerable room for philosophical disputes, in part due to unanswered questions, in part due to answers susceptible to differing interpretations.

The three authors most frequently cited by Suárez are Aristotle, Aquinas, and Scotus, in that order. Counting citations is not generally an infallible guide to influence, but in this case the citations seem an accurate reflection of influence. Aristotle’s influence is pervasive. Suárez claims to follow Aquinas throughout, though he no doubt exaggerates his fidelity to Aquinas. (The Jesuit order of which he was a member required fairly close adherence to Aristotle in matters of philosophy and Aquinas in matters of theology.) The number of references to Scotus, however, suggests that Suárez also had great respect for his thought. How much this respect led to influence is a matter of some controversy. Detecting such influence is made more difficult by Suárez’s practice of presenting his own view as that of Aquinas (properly interpreted), even where one might doubt their harmony. Closer examination of Suárez’s views confirms that his respect for Scotus at least occasionally led to adopting broadly Scotistic views.

The enormous range of issues addressed in Suárez’s writings ensures the impossibility of surveying all of them in an encyclopedia entry. What follows is a sampling of the issues that have received at least some scholarly attention, but it should be noted that a variety of significant topics, for example, his just war theory and his psychological views, have been omitted. The order roughly follows the order of DM with several issues drawing primarily on other works appended at the end.

a. Metaphysics and Theology

There are at least two issues concerning Suárez’s conception of metaphysics that have been the source of some controversy. Both issues also relate to the perennially enticing question of whether Suárez is the last medieval or the first modern.

Suárez opens his systematically ordered Disputationes metaphysicae by asking what the object of metaphysics is. The question received much discussion in scholastic philosophy, no doubt in part because Aristotle suggests several candidates that do not look equivalent in his Metaphysics. Suárez canvasses and criticizes six answers before giving his own answer: “being insofar as it is real being” (DM 1.1.26). Real being includes both infinite being (God) and finite being, both substances and real accidents. What it excludes, notably, is beings of reason, that is, beings that cannot exist other than as objects of thought. Note the “cannot.” On Suárez’s conception, real being includes both actual being and possible being. It does not, however, include privations, for example, blindness, and other such beings of reason.

Suárez considers metaphysics a unified science in the Aristotelian sense. The function of a science is to demonstrate the properties of its object through the latter’s principles and causes (DM 1.1.27 and 1.3.1). The term “properties” here is not being used in its wide contemporary sense but rather to refer to features necessarily possessed by the members of a kind yet not essential to them. The classic example is the capacity to laugh, which scholastics generally deem a feature necessarily possessed by human beings yet do not deem an essential feature, as rationality is. The function of metaphysics, then, is to identify the properties of being insofar as it is real being, and to demonstrate their necessary possession by appeal to the principles and causes of being insofar as it is real being.

The immediate objection is that being, insofar as it is real being, has neither necessary features that can be demonstrated nor principles and causes (DM 1.1.27). Consequently, it fails to make a suitable object for metaphysics. With respect to the first part, the thought is that being insofar as it is real being is so far abstracted that it has no properties. Trees, rocks, angels, and so forth all have properties, but what property does the being common to every being have? With respect to the second part, on the view that Suárez shares, the being par excellence is God, and God has no causes. Suárez, however, denies both claims. Being insofar as it is real being may not have any properties really distinct from it, but it does have properties that are at least distinct in reason, namely, the transcendentals: unity, truth, and goodness. In a similar move, Suárez denies that a science need appeal to causes, strictly speaking. Rather, appeal to principles or causes in a looser sense is sufficient. His example is that we can demonstrate God’s unity from God’s perfection, even though the latter is not strictly speaking a cause of the former (DM 1.1.28-29).

This conception of metaphysics might seem much narrower than more recent conceptions. It might suggest that all Suárez needed to do to finish DM was to demonstrate being’s unity, truth, and goodness. As the size of DM attests, however, Suárez addresses a great many more topics. In the first place, he also includes disputations on the contraries of the transcendentals, namely distinction, falsity, and evil. Secondly, he includes one of the most thorough treatments of causation ever written on grounds (i) that even if being insofar as it is real being does not have a cause, most beings do and (ii) that all being is in some way a cause even if not itself caused. Finally, the entire second half of DM deals with the divisions of beings and includes lengthy disputations about the nature of particular kinds of beings such as substances, qualities, and so forth. In practice, then, the range of subjects discussed in DM comes much closer to the range of subjects that would be discussed in a modern metaphysics textbook.

Turning to the two matters of controversy mentioned earlier, the first question one may ask is how realist Suárez’s account of metaphysics is. An oft-told narrative has it that Aristotle and his medieval followers held metaphysics to concern the real rather than the mental, while at some point in the modern era metaphysics came to be about structures of thought (or something similarly mental) rather than about the extramental world. On the face of it, Suárez’s account seems impeccably realist. After all, metaphysics’ object is being insofar as it is real being. Despite seeming this way, Suárez has often been identified as one of the key figures in the transition to a non-realist approach to metaphysics. The room for debate comes with a later passage in which Suárez says that metaphysics’ object is “the objective concept of being as such” (DM 2.1.1). This is not obviously contradictory to the earlier statements that the object is being insofar as it is real being, but neither is it obviously in harmony. The question now becomes how Suárez understands objective concepts. If objective concepts borrow their ontological status from their objects, so to speak, then Suárez can readily be read as a realist. (And note that on Suárez’s view, objective concepts are concepts by extrinsic denomination; they are not really distinct from the objects. See, for example, DM 2.1.1 and 6.5.3.) On the other hand, if an objective concept is a mental item, then this later passage opens the door to less realist readings. Resolving this protracted dispute in the literature is obviously beyond the scope of this article.

The second question concerns the relationship of metaphysics to theology. Suárez makes it perfectly clear in the preface to DM that he adheres to the standard medieval view that metaphysics, albeit perfective of the human mind in its own right, ought to be in service of theology. He is, however, frustrated with piecemeal metaphysical discussions scattered throughout theological treatises, and so he writes DM to present a comprehensive metaphysics in the proper “order of teaching.” In this project, some detect a distinctly modern attitude and an early example of an increasing trend to see metaphysics as a separate discipline from theology. Others, however, are skeptical and think that Suárez sees metaphysics as distinct in the same way that his medieval predecessors did (namely, insofar as metaphysics proceeds by the “natural light of reason” apart from divine revelation) but in no additional way (for example, not by thinking that metaphysics should be conducted wholly autonomously).

b. Distinctions

The question of whether one item is distinct from another comes up in many contexts. One such place was already seen; namely, whether objective concepts are distinct from their objects. A naïve approach might just ask whether two items, A and B, are distinct or not. Is the human mind distinct from the brain? Are relations distinct from their relata? Is God’s mercy distinct from God’s justice? And so forth. Sustained philosophical reflection soon shows, however, that different sorts of distinctions might be posited between entities. Perhaps God’s mercy is identical to God’s justice in one sense but distinct from it in another sense. Suárez’s scholastic predecessors frequently felt the need for a distinction between distinctions, so to speak, and so posited a plethora of distinctions. Suárez also recognizes the need for different distinctions but aims to prune the list down to three basic kinds: real, modal, and of reason. He argues in DM 7 that all other putative distinctions are actually one of these three kinds.

The two most obvious kinds of distinction are real distinctions and distinctions of reason (the two kinds that already made a showing in the previous section). As might be expected, a real distinction is the sort one expects between one thing and another thing. It is crucially an extramental distinction, one whose sign is mutual separability, that is, the possibility of either thing existing without the other. Distinctions of reason are distinctions in mind only. There may be a distinction of reason between Mark Twain and Samuel Clemens but there is no real distinction. A notable context in which Suárez wishes to appeal to distinctions of reason is when discussing God’s attributes. The doctrine of divine simplicity entails that the divine attributes can only be distinct in reason. One complication for the notion of distinctions of reason is that Suárez thinks some of them have some sort of foundation in reality while others do not.

Suárez argues that in addition to real distinctions and distinctions of reason there is a third kind of distinction, the modal distinction, intermediate between the other two. The mark of a real distinction is mutual separability while the mark of a modal distinction is one-way separability. On Suárez’s view, for example, the union of form and matter cannot exist without the form and matter, but form and matter can exist without their union. If the union itself were a really distinct thing, Suárez thinks, then a further union would be needed to unite the union to the form and matter, and so an infinite regress would have started. So union is not some really distinct thing. But union is not merely distinct in reason, since the form and matter can exist without being united. Consequently, an intermediate kind of distinction is needed.

Mutual separability and one-way separability are marks or signs of the corresponding distinctions, rather than simply constituting those distinctions, because there are some important exceptions. On Suárez’s view there is only a one-way separability between God and creatures. God can exist without creatures but not vice versa. Assuming one is not a monist à la Spinoza, that requires granting a case of real distinction without mutual separability.

c. Esse and Essentia

In Being and Some Philosophers, Etienne Gilson famously distinguishes four fundamental traditions in metaphysics on the basis of how being is understood, two of which are existentialism and essentialism. Both terms have been used in an unhelpfully wide array of senses, but for Gilson an existentialism privileges existence over essence while an essentialism does the converse. In Gilson’s story, Thomas Aquinas is the hero who recognizes being as the very act of existing and metaphysics as the science of being insofar as it is being. Suárez, however, he sees as the paradigmatic essentialist in whose philosophy existence is no longer significant and for whom metaphysics becomes the science of essences. One may recall here that Suárez identifies the object of metaphysics as being insofar as it is real being, but includes both actual and merely possible being in real being. Gilson thinks, furthermore, that this essentialism gives Suárez a pivotal role in history, albeit a malignant one, since essentialism leads to a variety of further philosophical ills.

Suárez’s conception of metaphysics figures in this story, but so does his account of the distinction between esse (“being,” meaning here actual existence) and essentia (“essence,” meaning here an individual nature), a matter related to a variety of issues regarding necessary versus contingent existence, eternal truths, the possibility of Aristotelian science about contingent things, and so forth. The usual Thomist view—how precisely to understand Thomas himself is a matter of some controversy—is that there is an actual or real distinction between essentia and esse. Or, rather, there is such a distinction for created things, which exist contingently. God, however, exists necessarily and from himself (a se); in God there is no distinction between essentia and esse.

Suárez, however, rejects a real distinction between esse and essentia and, furthermore, rejects a modal distinction. This is, in part, because of his understanding of real and modal distinctions: a real distinction would suggest that an essentia could exist without esse and esse without essentia, and a modal distinction would suggest at least the former (one source of confusion in these discussions is that a Thomist real distinction is probably not the same as a Suárezian real distinction). There were some who were willing to bite one or both of those bullets, but Suárez argues at length for their unacceptability. On Suárez’s view, then, even in created beings there is only a distinction of reason between esse and essentia. Since he is committed to the Christian doctrine of creation ex nihilo, the consequence is that an uncreated essentia is absolutely nihil, while, as expected, in a created essentia its essentia and esse are really the same, albeit conceptually distinct.

Much more would need to be said, and has been said by various scholars, to establish how significant this Suárezian claim is especially with respect to Aquinas’s position or even whether it is significant at all. Inter-school polemics and differing terminologies have resulted in more heat than light on this issue.

d. Efficient and Final Causation

Suárez’s extraordinarily detailed explication of the four kinds of Aristotelian causes (material, formal, efficient, and final) has received increasing attention in the early 21st century, perhaps in part because seven of the relevant disputations have finally been published in English translation. The resulting scholarly discussions have often been tied up with questions about Suárez’s place in the history of philosophy. Is he a loyal member of the medieval guild or is he setting the stage for mechanism or modernity? Or, perhaps both? On one reading of his account of formal causation, Suárez is the tragic hero making a valiant attempt to defend substantial forms, but in the course of doing so he alters the conception of substantial forms from the traditional model, thereby inadvertently making substantial forms more susceptible to mechanist critique and dismissal. If so, then he is a loyal member of the medieval guild and yet sets the stage for the modern mechanical philosophy. Not all scholars think it is so, however; some think he actually offers remarkably persuasive arguments on behalf of a more or less traditional conception of substantial forms, arguments to which the early modern mechanists would have done well to pay more attention.

Similar issues arise with Suárez’s account of final causality and its relationship to efficient causality. Ends are that for the sake of which an agent acts, as good health is the end for the sake of which a doctor prescribes medicine and finding small invertebrates to eat is the end for the sake of which curlews push their long curved bills into mud. Aristotle—and Aquinas—famously take ends to be a kind of cause, namely, final causes. Appeal to final causes is essential for offering complete explanations. In fact, Aquinas goes so far as to say that all the other kinds of causation, including efficient causation, presuppose final causation. Without final causation there would be no efficient causation on his view. Some scholars, however, have argued that Suárez departs from Aquinas on this score and prioritizes efficient causation, perhaps even reducing final causation to efficient causation. If so, this would make Suárez look like an intermediate between Aquinas’s position and the widespread dismissal of final causation in early modern philosophy.

At first glance, one might think Suárez straightforwardly endorses a traditional picture. He divides his discussion of causation according to the Aristotelian fourfold classification, explicitly defends the status of all four causes as real causes (DM 12.3), and, most importantly, defends at length the claim that ends are real causes (DM 23.1) and argues that final causation is present in the actions of God, rational created agents, and natural agents (that is, non-rational agents such as cows and oak trees).

But a closer look reveals some grounds for those wishing to argue that Suárez emphasizes efficient causation at the expense of final causation. First, he explicitly states that the definition of “cause” applies most properly to efficient causes (DM 12.3.3). Second, when talking about efficient causation he uses the term “real motion,” but when talking about final causation he uses the term “metaphorical motion.” Third, final causality depends on efficient causality, since an end is an actual final cause only if an efficient cause acts on its behalf. Fourth, an end is a final cause only if cognized by a rational agent (see DM 23.7 and 23.10.6); this stands in contrast with Aristotle’s confidence in final causation without the thought and intention of a rational agent. Besides, Suárez devotes far more pages to efficient causation than to final causation.

Nonetheless, scholars who wish to attribute a more traditional view to Suárez can also find support from a closer look at Suárez’s text. Taking the previous paragraph’s points in turn, one may first note that the term “cause” most properly applying to efficient causes is entirely compatible with the term properly applying to final causes, and that the significance of saying that the term applies most properly in one case but not the other is not immediately obvious. Second, it is true that Suárez uses the term “metaphorical motion” when describing the motion of final causes, but he is simply following well-entrenched terminological practice. Also, when pressed, he explicitly denies that metaphorical motion is so-called because it fails to be real motion (DM 23.1.14). Third, on Suárez’s view, actual final causation does seem to depend on efficient causation, but the converse is true as well. He grants Aquinas’s point that efficient causation presupposes final causation, that efficient causes would not act were they not to have ends for the sake of which to act. At the very least, there seems to be a sort of mutual dependence. In some passages, Suárez appears also to endorse the priority of final causation, though whether his view licenses that conclusion is a more complicated matter.

The fourth issue cannot be fully addressed without drawing in a number of other philosophical issues. One may briefly note, however, that, while Suárez does demand that ends be cognized in order to final-cause, he thinks this condition is satisfied even in the case of natural agents. This is a result of his concurrentism, according to which all actions of created things also have God as a concurring agent. That is, one and the same action has two agents, at least one of which is a rational agent. Of course, this account leaves final causation in the natural realm dependent on final causation in the divine realm. But final causation in the latter realm is not unproblematic. A central scholastic assumption, which Suárez shares, is that God is never subject to causation. So how can a final cause move God? And if it cannot, then how can natural actions inherit final causes from God? Suárez is well aware of this problem (DM 23.9.1). His response is to concede that there is no final causality in the case of God’s immanent actions, but to maintain that there is no problem with saying that God’s transeunt actions, that is, actions not located in God, have final causes. Whether this answer can be made to work in light of the details of his account of metaphorical motion in DM 23.4 is a further matter. With respect to the fourth issue, there is also a historical question whether the cognition requirement represents a change just from Aristotle or also a change from Aquinas.

These questions about how to fit Suárez’s account of causation into a broader history exemplify a common approach to Suárez. His place in time ensures that it is always tempting to read him as a transitional figure, as standing between the medieval view—where the “medieval view” typically means the view of Aquinas—and the views of the early modern mechanists. Consequently, a strand running through much Suárez scholarship concerns whether he in fact holds transitional views or not.

e. Existence of God

In DM 28, Suárez argues that the best primary division of being is into infinite and finite being, a division he considers equivalent to a number of other divisions including between necessary and contingent being, essential and participated being, and uncreated and created being. It is worth noting, however, that he does not take the term “being” to be used univocally when predicated of both infinite and finite being (DM 28.3). Nor does he go to the opposite extreme and consider it equivocal. Rather, he argues that “being” is used analogously in this case in virtue of the intrinsic characters of both infinite being and finite being.

As Suárez notes, the arguments of DM 28, assuming they work, already go a long way towards establishing the existence of God, since they purport to show that there must be some uncreated being. He devotes an entire disputation, however, to the question of God’s existence. His goal in DM 29 is to prove by natural reason, without any appeal to special revelation, that God exists, a goal that he thinks can be achieved.

His optimism about the possibility of demonstrating that God exists does not result in an uncritical attitude to previous efforts to do so. He rejects, for example, versions of the ontological argument that claim God’s existence to be evident from the fact that necessarily existing is part of what it means to be God (De Deo uno et trino 1.1.1.9). Of course, so did Aquinas. Perhaps more surprising, given Suárez’s debt to Aquinas, is that he also rejects the cosmological argument from motion made by Aristotle and made famous as Aquinas’s first way (Summa theologiae Ia.2.3). This argument starts from the motion or change that we observe, claims that whatever is moved is moved by another, but that there cannot be an infinite chain of moved movers, and so concludes that there must be an unmoved mover at the foundation. A key reason for Suárez to worry about this argument is that he not only thinks the status of the Aristotelian principle that whatever is moved is moved by another uncertain, he thinks that we ourselves provide counterexamples via our free actions. Consequently, he thinks the physical cosmological argument (“physical” because motion pertains to physics) relies on a false premise.

Instead, he turns to a metaphysical version of the cosmological argument (“metaphysical” because being is the object of metaphysics). This argument starts from the observation that there are things that exist, notes that every being either is made by something else or is not (that is, is created or is uncreated), argues that not every being can be made by another being, and concludes that there must be some being that is uncreated (DM 29.1.20-40). The alternatives to the claim that not every being can be made by another being would be either that there is an infinite chain of beings, each made by a prior being, or that there is a circle of beings, each making the next one in the circle. Suárez argues that these alternatives are impossible.

Long before Hume, Suárez recognizes that this cosmological argument hardly suffices to show that there is an uncreated being that merits being called God. Multiple worries might be raised, but Suárez focuses on the observation that for all that the cosmological argument shows, there might be many uncreated beings making other beings. In response, he moves to the next stage of his argument and enlists the aid of teleological arguments, arguing that attending to the order, structure, and beauty of the world shows that there is only one uncreated being (DM 29.2). He considers a variety of objections, ranging from the claim that the order only indicates at most that there is one governor of the world to the possibility of the world having been created and governed by a committee of uncreated beings working in consensus to the possibility that our world is only one of many worlds, each with its own uncreated creator. Suárez argues that some of these objections fail, but he concedes that the teleological or a posteriori argument he is considering cannot show that there are not other worlds with their own creators.

For the final stage, then, he turns to what he calls an a priori argument (DM 29.3). Strictly speaking, there can be no a priori arguments for God’s existence on the scholastic understanding of a priori arguments. For such arguments are arguments from causes to effects and God has no causes. Suárez accepts this point, but suggests that once we have an a posteriori demonstration of a divine attribute, it is possible to demonstrate a priori further attributes from that attribute (cf. the aforementioned example of using God’s perfection to demonstrate God’s unity). Suárez then proposes to demonstrate that there can be only one uncreated being from an uncreated being’s existing necessarily and a se (from itself). The resulting stretch of arguments is complex and relies on premises whose truth is not always obvious. Suárez himself is modest about the force of the argument, granting at the start that the proposed project is difficult and noting at the end that not all of the steps are immune to evasions. He does, however, think that the whole argument taken together will have some persuasive force for a reader who is not obstinate.

f. Categories and Genera

Thanks to the Aristotelian legacy, category theory was a prominent feature of scholastic philosophical discussions. Aristotle famously enumerated ten categories but left it unclear what the justification was for listing ten rather than fewer or more categories. A project that occasioned significant interest among medieval philosophers, then, was to provide the argument that would establish ten and only ten categories; such arguments were called sufficientiae. Deference to Aristotle was not universal, however, and so other philosophers argued that there are fewer than ten categories. There was also disagreement about what the categories are classifying. Extramental objects? Words? Concepts?

In these discussions, the terms “categories” (“praedicamenta”) and “highest genera” (“generalissima” or “suprema genera”) are often used interchangeably. At least some of the time, however, Suárez distinguishes between categories and genera and says that it is the business of logicians to deal with categories, since logic is concerned with the mind’s concepts, while it is the business of metaphysics to deal with the highest genera of beings, revealing their natures and essences (DM 39.pr.1). This suggests a view in which there is a kind of correspondence between the classification of concepts and the classification of extramental beings.

Be that as it may, Suárez qua metaphysician devotes the bulk of the second half of DM to dividing finite being (God falls outside the scope of this division) into the ten highest genera and discussing each genus in turn. Substance, of course, occupies a special role in Aristotelianism, and so Suárez first divides finite being into substance and accident and discusses substance at some length. He then turns to a discussion of the division of accidents, that is, the remaining nine genera on Aristotle’s view.

The first question he considers is whether nine genera of accidents—or ten genera in total—is too many (DM 39.1). He ends up affirming Aristotle’s number but gives a somewhat deflationary spin. He concedes that a variety of intermediate genera can be devised: real accidents versus accidents that are mere modes, absolute accidents versus respective accidents, and so forth. This concession raises questions about the significance of Aristotle’s ten genera, if they are not the most basic or immediate divisions. Suárez, however, thinks Aristotle’s division is nonetheless “most apt.”

The second question concerns the sufficiency of the nine genera of accidents (DM 39.2). Suárez divides this into two issues: (i) are there genera beyond these nine? and (ii) are all nine genera distinct from each other? As usual, Suárez has great respect for his philosophical forebears and says that it would be “rash” to doubt Aristotle’s number. That said, when he discusses the sufficientiae of Aquinas and Scotus, he is obviously dissatisfied. He concludes that a proper a priori demonstration of the sufficiency of the ten highest genera cannot be given. This, he claims, should be no surprise, since a science presupposes its subject rather than demonstrating it.

But what sort of distinction is there between the genera? One would be surprised to find that a given bird belongs to two different genera, say, Ara and Tangara. Similarly, one might be surprised to find that Aristotle’s genera overlap or even coincide entirely. Yet Suárez quickly rejects the view that there is always a real distinction between items belonging to two or more of the highest genera. He is more sympathetic to the view that there is a modal distinction. There are, however, cases that keep him from accepting that view as well.

Relations, the fourth highest genus, are of special concern. There is evidence that at one point in his career Suárez took relations to be modally distinct from their foundations. It is also a view to which he gives more time and attention when he gives an explicit treatment of relations in DM 47, though ultimately he rejects the view in that disputation. Instead, he concludes that relations are only distinct in reason from their foundations. For example, if Socrates and Plato are similar to each other in virtue of each being white, then the foundation for Socrates’s relation of similarity to Plato is Socrates’s whiteness. On Suárez’s mature view, Socrates’s similarity relation is neither really nor modally distinct from Socrates’s whiteness. The relation and quality are only conceptually distinct. There is a persuasive reason for such a reductionist account of relations. If God creates a world with nothing but substances and absolute qualities, it seems the relations would ipso facto follow. No separate creative act would be needed to ensure that white Socrates and white Plato were similar.

Consequently, Suárez concludes that a distinction of reason suffices for a separate highest genus (DM 39.2.22 and 47.2.22). Sometimes there in fact is a real distinction. Suárez thinks items in the second and third genera, quantity and quality, are really distinct from substance and from each other. In the remainder of the cases, however, there is either only a modal distinction or only a distinction of reason. He does stress that the distinction of reason needs to be one with a foundation in reality, which raises challenging questions about what that foundation is. The standard example in scholastic philosophy of distinctions of reason with a foundation in reality is the distinction between God’s attributes. That example may have been dialectically effective insofar as it was widely accepted, but it is not thereby an illuminating example. In the case of action and passion, two genera only distinct in reason, Suárez seems to suggest that the foundation in reality can be comparisons to extrinsic things. He says that action is distinct from passion insofar as action is compared to the principle that acts and passion to the principle that undergoes (DM 39.2.23).

It is worth noting in passing that Suárez thinks “being” is used analogically across the nine highest genera of accidents (DM 39.2.3). Consequently, one of the key texts relevant to the controversies about Suárez’s doctrine of the analogy of being is found in his discussion of the division of accidental being.

g. Beings of Reason

A moment’s reflection shows that we not only think and talk about existent objects such as stars and oak trees, we also think and talk about non-existent objects and even objects that could not exist, such as square circles or goat-stags. But, to use one of Suárez’s examples, what is one talking about if one says that two chimeras are similar but that goat-stags and chimeras are different? Ordinarily one might think that propositions are made true by beings. The proposition that many oaks have lobate leaves is made true by the many real oak trees with lobate leaves. But what makes the proposition that goat-stags and chimeras are different true? Besides, how can a thought be directed at one non-existent object rather than another?

Suárez calls such objects of thought “beings of reason,” and he ends DM with a disputation devoted to them (DM 54). The inclusion of this disputation might come as a surprise, given the systematic nature of the work and given that Suárez has carefully defined the object of metaphysics as “real being.” Well aware of his earlier definition, Suárez opens DM 54 with a prologue defending the treatment of beings of reason. Given that our thought inevitably involves beings of reason, someone needs to give an account of them. Suárez thinks the metaphysician is best suited to the task, since beings of reason are “shadows of being” and consequently should be treated in analogy to real being. Since real being is metaphysics’ object, metaphysicians are well-positioned to give an account of beings of reason also. Suárez’s is a controversial claim; some scholastics thought beings of reason the logician’s domain. Note, too, that the analogy between real being and being of reason is a rather weak one. Not only do the analogates not fall under any unitary concept, but beings of reason do not even have in themselves anything proportional to real beings (beings of reason have nothing in themselves). Suárez, however, points out that beings of reason are thought of as having a proportion to real beings, and insists that is sufficient for a kind of analogy of proportionality (DM 54.1.10).

As is his wont, Suárez attempts to thread a middle course. There are those who deny that there are beings of reason or that beings of reason are needed in order to teach about or conceive of real beings. On the other side are those who grant beings of reason and, furthermore, claim that there is a single concept of being that includes both real beings and beings of reason. Suárez’s own view is that there are beings of reason but that the “are” in that claim does not indicate the same thing as the “are” in the sentence “there are oak trees.” There is no single concept covering both real beings and beings of reason.

When first introducing his own view of beings of reason, Suárez characterizes them as “not true real beings, since they are not capable of true and real existence” (DM 54.1.4). It is worth noting the modal force in that characterization and recalling that for Suárez real being includes both actual and possible being. What Suárez has in mind when talking about beings of reason are things that cannot exist. That class turns out to be rather motley. Mythical beasts, negations, privations (for example, blindness), and even logical concepts such as genus and species are all beings of reason.

Suárez’s explicit definition comes two paragraphs later: “a being of reason is usually, and rightly, defined as that which has being only objectively in the intellect or as that which is thought by reason as a being even though it has no entity in itself” (DM 54.1.6; italics in the original). One issue in Suárez scholarship concerns whether the two disjuncts amount to equivalent definitions or not. If Suárez also thinks that it is possible to think of non-beings in the manner of non-beings, then it would look like there can be things that satisfy the first disjunct without satisfying the second disjunct. Another issue concerns how to understand the talk of being objectively in the intellect. On the traditional interpretation, Suárez’s view is that beings of reason have a peculiar mode of being, namely, as pure objects of thinking. Consequently, their being depends on minds actually thinking about them. That interpretation can be and has been challenged, however, by a more eliminativist account that denies that “being only objectively in the intellect” is any sort of being at all. Rather, it just describes the state of affairs where an intellect has a contentful thought about something that simply fails to exist.

Most of the interest in Suárez’s account of beings of reason concerns his general characterization. DM 54, however, goes on to ask what sorts of causes, if any, beings of reason have and whether the traditional division of beings of reason into negations, privations, and relations of reason is right and sufficient. He answers that, rightly understood, it is.

Questions concerning beings of reason go back at least to the ancient Greek philosophers, but Suárez’s discussion is a landmark in its detail and systematicity. It also leaves some matters unclear, however, and of course other philosophers found things with which to disagree. The result was a train of similarly extended, sophisticated discussions of beings of reason after Suárez. Some offered what might be seen as developments of Suárez’s account, for example, Bartolomeo Mastri and Bonaventura Belluto, while others argued against Suárez and offered contrasting accounts, for example, Pedro Hurtado de Mendoza and John Punch.

h. Middle Knowledge, Grace, and Providence

As mentioned earlier, Suárez contributed to the raging debate about how to reconcile human free will with divine grace, foreknowledge, providence, and predestination that is known as the controversy De auxiliis. One of the developments for which he is best-known, albeit more in theological circles than philosophical circles, is a doctrine called Congruism. Suárez and Robert Bellarmine are the two Jesuits usually credited with formulating Congruism in detail.

To understand Congruism, it helps to step back and first look at the Molinism of which it is a species. A relatively straightforward way of reconciling human free will with the various theological doctrines in question is to provide a compatibilist account of free will, that is, an account of free will compatible with determinism. Luis de Molina, also a Jesuit, rejects that method, vigorously defending a libertarian account of free will. To be free means to be able to choose and able not to choose an option once all the prerequisites for acting have been posited. Related to this emphasis on libertarian freedom is the belief that the divine grace (whereby sinful humans can attain salvation) is not intrinsically efficacious. Rather, God’s grace is rendered efficacious by a human being’s free consent. These beliefs naturally raise questions about compatibility with traditional doctrines such as God’s foreknowledge and providential control. Molina famously proposes middle knowledge (scientia media) to show their compatibility. Middle knowledge—“middle” because it falls between the natural and free knowledge traditionally ascribed to God—is God’s prevolitional knowledge of what any possible free creature would do in any scenario. This ensures God’s foreknowledge, even of free actions, and allows God the appropriate providential control.

Suárez wholeheartedly agrees with Molina on the importance of libertarian freedom (DM 19.2-9). If, for example, God determines Peter to steal, then Peter cannot be held responsible for stealing. Suárez also agrees with the Molinist strategy of appealing to middle knowledge (De scientia Dei futurorum contingentium 2). There are, however, disagreements on this as well.

Molina argues that the reason God knows what creatures would freely do is because God’s infinitely surpassing the finite nature of creatures allows God to “super-comprehend” their natures, and thereby to know what they would do in given situations. Suárez, however, denies that any special explanation is needed for God’s middle knowledge. To know whether God can do something, we do not need to investigate God’s omnipotence. Rather, we merely need to establish that the thing is possible. Similarly, he thinks that to find out whether God can know something, we do not need to investigate God’s abilities. All we need to do is establish that the claim in question is true. If it is, then God’s omniscience ensures God’s knowledge of that claim. God knows propositions about what free creatures would do in the same way he knows any other proposition: by a simple intuition of its truth (De scientia Dei futurorum contingentium 1.8).

But Congruism’s main dispute with Molina concerns the reason for the efficaciousness of divine grace and the place of predestination. According to Molina, God bestows grace on, say, Mary and Martha, knowing that Mary will freely act so as to render the grace efficacious while Martha will not. Mary’s free acceptance is the sole extrinsic reason rendering the grace given to her efficacious, while Martha’s free failure to accept the grace is the sole extrinsic reason the grace given to her is sufficient rather than efficacious. Furthermore, God predestines Mary to salvation on the basis of his knowledge that she will freely accept his grace.

Suárez, however, attempts to carve out a position that is broadly Molinist but closer to the position of the Jesuits’ Thomist opponents. On his view, Mary’s free acceptance is not the only extrinsic reason rendering the grace efficacious. Nor does God predestine her to salvation on the basis of his middle knowledge. Rather, God antecedently elects some people to salvation but does not elect others. This election is gratuitous in the sense that it does not rest on God’s knowledge of any human merits. Having thus elected Mary, he then knows, thanks to his middle knowledge, what graces to bestow such that Mary will freely accept. If God had known that the grace given to Mary would not be such that Mary would freely accept it, then he would have given some other grace to her such that she would. In other words, the grace is efficacious not only because Mary freely accepts it, but also because of God’s antecedent decision to give whatever grace is needed to ensure Mary’s free acceptance, that is, his antecedent decision to give her a “congruous” grace (De concursu et auxilio Dei 3.14.9).

Suárez’s and Bellarmine’s Congruist version of Molinism was declared the official doctrine of the Jesuit order in 1613, supplanting Molina’s own version.

i. Natural Law and Obligation

A central question in the history of ethics goes back to Socrates in Plato’s Euthyphro: do the gods love the pious because it is pious or is it pious because the gods love it? Suárez directly addresses a variant of this question, albeit in a section whose title might not immediately reveal its philosophical significance: “Is the natural law truly a preceptive divine law?” (De legibus 2.6; henceforth, DL).

Scholastics customarily distinguish between natural and positive law. Explaining this distinction in neutral terms is made difficult by the widely varying theories of law given, but perhaps the central feature of natural law is an epistemological one. Natural law is law that is accessible to all human beings, regardless of their access to some holy text or other special revelation. As Suárez puts it, natural law “is that law which sits within the human mind in order to distinguish the fine from the wicked” (DL 1.3.9 or 10, depending on edition). Other features often associated with natural law are universality (that is, applicability across times and places) and being grounded in nature rather than in an arbitrary will. Positive law, on the other hand, is in some sense arbitrary law that is added to natural law and that is not, in principle, accessible apart from special promulgation or communication. A paradigm example is a law that requires people to drive on the right-hand side of the road. One might well think the choice between that law and the law requiring people to drive on the left is arbitrary and one certainly cannot figure out which side of the road to drive on in a given country by introspecting. It is worth noting that Suárez recognizes two species of positive law: human and divine. In some contemporary discussions, positive law is characterized as human law and natural law as divine law. Understanding the terms in that way makes a hash of Suárez’s discussion, as well as those of other scholastics.

Now we can see why natural law is of special interest with respect to the question Socrates posed to Euthyphro. All scholastic theorists of law think that there are at least some cases where laws get their obligatory force solely from a legislator. That is what positive law is like, including divine positive law. A standard example is the ceremonial law of the Old Testament, which Christians thought obligatory in one time and place but not in another. The question is whether all law is like that. Natural law is, of course, the place to look if one is wondering if there are obligations that are not grounded in a superior’s command or prohibition.

Suárez stresses that the question to be asked is whether natural law is divine law in the sense that it is grounded in God qua legislator. He deems it entirely obvious that God is the cause of natural law in some sense, since God is the creator of everything, including any nature in which natural law might be grounded (DL 2.6.2). But even if God is the cause in that sense, there is still the question whether created nature already indicates what ought to be done and what ought to be avoided or whether a further legislative act prescribing or forbidding actions is needed. Suárez calls law in the former sense indicative law and law in the latter sense preceptive law.

One position is extreme naturalism or intellectualism, which Suárez attributes to Gregory of Rimini and several others (DL 2.6.3). On this view, no legislative act on God’s part is needed. Rather, natural law simply indicates what should be done or not done on the basis of what is intrinsically good or bad. Loss of life, for example, is bad: murdering King Duncan deprives him of life, and so Macbeth ought not to stab Duncan. On the extreme naturalism espoused by Gregory, Macbeth’s duty not to murder Duncan would obtain even if God had not given the Ten Commandments and even if God had not existed at all.

On the other side is an extreme voluntarism that says that natural law consists entirely in a command or prohibition coming from God’s will, a view that Suárez attributes to William of Ockham (DL 2.6.4). On this view, what one ought or ought not to do is wholly determined by God’s legislative acts and, furthermore, God’s legislative acts are unconstrained. That is, there is no act that is intrinsically bad such that God is compelled to prohibit it or even prevented from commanding it and no act that is intrinsically good such that God is compelled to command it. Had God commanded us to murder and steal, then doing so would have been obligatory and good.

Characteristically, Suárez charts a middle course. He first agrees with the extreme voluntarists that natural law is genuinely preceptive law, and argues that for a law to be genuine law and not just law in name it must be grounded in the legislative act of a superior (DL 2.6.5-10). The obligatory force of natural law comes from God’s will. Contra Gregory of Rimini, that obligation would not be present had God not legislated or not existed at all.

But then comes the crucial qualification that ends Suárez’s agreement with extreme voluntarism: “Second, I say that this will of God—that is, this prohibition or precept—is not the whole reason for the goodness and badness that is found in observing or transgressing the natural law, but that the natural law presupposes in the acts themselves a certain necessary fineness or wickedness and adjoins to these a special obligation of divine law” (DL 2.6.11). The extreme voluntarist thinks that God is free to command as he wishes, unconstrained by the natures of things. Suárez, on the other hand, thinks that God’s commands and prohibitions are constrained by natural goodness and badness. As befits a perfect being, God prohibits some actions precisely because they are evil. Suárez thinks it absurd to suggest that there are no actions such that they are too evil for God to command or even just to permit. To this extent, then, Suárez agrees with the naturalist; the obligations of natural law are rooted in natural goodness and badness.

That Suárez attempts to chart a middle course of this sort is uncontroversial. There are, however, controversies about how much Suárez allows on the natural, pre-legislative side, as well as how what is allowed on the natural side relates to the “special obligation” resulting from God’s legislative acts. On one interpretation, Suárez’s account is incoherent, because he gives a voluntarist account of obligation and yet grants that performing an action naturally bad is sufficient for blameworthiness. Since being blameworthy requires violating an obligation, Suárez is thereby implicitly committed to saying that natural badness is sufficient to give rise to obligations. But this contradicts his voluntarist account of obligation. One way to avoid the charge of incoherence would be to understand Suárez as giving a voluntarist account of the “special obligation” imposed by natural law, and understanding that obligation as an additional obligation. In other words, the natural goodness and badness intrinsic to actions is sufficient to give rise to one sort of obligation and consequently sufficient to make agents who observe the obligations praiseworthy and agents who violate them blameworthy. Natural law then adds a further obligation to that natural obligation in the same way that human legislators might add further obligation by prohibiting what is already morally prohibited. As Suárez notes, there is nothing incoherent about adding one obligation to another (DL 2.6.12).

Noting that Suárez uses both the terms “duties” (“debita”) and “obligations” (“obligationes”), an alternative strategy is to interpret him as giving a voluntarist account of all obligation while granting that natural goodness and badness give rise to duties. In other words, natural goodness and badness would be sufficient to give rational agents duties to act in certain ways and not in other ways, but none of this would count as genuine obligation. Obligation, on this view, is a peculiar sort of force that arises from someone with legitimate authority issuing a command or prohibition to some subject to that authority. A task for defenders of this interpretation is to spell out what duties and obligations are, such that being subject to a duty is not sufficient for being under obligation.

j. Political Authority

One could hold the view that what gives some individuals political power over other people is that God bestowed such authority on them directly. Suárez rejects that view. He insists that men are by nature free and subject to no one (DL 3.1.1). (I shall use the term “men” in this section, since Suárez does not grant the same natural liberty to women and children. See DL 3.2.3.) Europeans had, of course, stumbled into the Americas shortly before Suárez’s time and so one of the questions that exercised his contemporaries concerned the standing of Native Americans, especially in light of Aristotle’s infamous claim that some human beings are natural slaves. Suárez has no use for the suggestion that Native Americans are natural slaves. Men are naturally disposed to be free and being free is one of their perfections; suggesting that all the people in some region or other happen to have been born “monsters,” that is, with defective natures, is incredible.

Suárez does, however, think that some rulers have legitimate authority over men. Where does that authority come from? The short answer is that political communities are needed and so men consent to join together in such communities, and in a political community the power to govern and to look after the common good of the community must be vested in an authority. Suárez gives two primary reasons why political communities are needed (DL 3.1.3). First, he agrees with Aristotle that human beings are social animals that desire to live in community. The most natural community is a family, but this is an imperfect community, insufficient to include within it all the skills and knowledge needed for life. Consequently, multiple families need to join together in a perfect (that is, complete) community. Second, anticipating Hobbes’ famous point, Suárez notes that individual families not joined together in a political community would be unlikely to remain at peace and would have no means of averting or avenging wrongs (see also Defensio fidei catholicae 3.1.4).

A lively question in Suárez’s day was whether human beings in the state of innocence would have lived in political communities or whether such communities only became necessary after the Fall with its introduction of sin. The second reason given above, in particular, would not seem to apply in the state of innocence. Suárez, however, is confident that political communities would have formed even in the state of innocence, had it continued (De opere sex dierum 5.7). Human beings are social animals, whether in the state of innocence or not. Furthermore, Suárez thinks that even in the state of innocence some people would surpass others in virtue and knowledge. So even though joining together in a political community might not be necessary for mere survival in that state, it would, nonetheless, be most useful for the sharing of knowledge and for encouragement to greater virtue.

That political community requires some authority to govern it Suárez takes as well-nigh self-evident (DL 3.1.4-5). He does note, however, that one reason a governing authority is needed is because each individual of a political community looks after his or her own good. Individuals’ goods, however, sometimes conflict with the common good. Consequently, a government looking after the common good is needed. Following Aristotle, Suárez thinks that a monarchy best fills the role of governing authority (DL 3.4.1). He does not think that monarchy is dictated by natural law, however, and grants that other forms of government, including democracy, may be “good and useful.” Which form of government to adopt is left to human choice.

A political community does not result merely from a group of families living in proximity, even if they consequently become familiar friends. The formation of a political union requires an “explicit or tacit pact” to help each other and a subordination of the families and individuals to some governing authority (De opere sex dierum 5.7.3; cf. DL 3.2.4). The details of how this works are subject to scholarly dispute. Some commentators argue that on Suárez’s view, the community’s consent creates a political community but does not directly cause obligation to a political authority. Others argue that the consent does directly cause the obligation and authority.

An important feature of Suárez’s view is that political power does not just reside in the community initially. It always remains there. As he puts it, “after that power has been transferred to some individual person, even if it has been passed on to a number of people through various successions or elections, it is still always regarded as possessed immediately by the community” (DL 3.4.8). Suárez is, of course, aware that the needed stability of political communities would be in question if communities could withdraw their transfer of power to the government at every whim. So even though in some sense the power always remains in the community, Suárez argues that the transferred power may not ordinarily be withdrawn (DL 3.4.6). Suárez recognizes exceptions, however. Should the government become tyrannical, the door may be opened to legitimate revolt and even tyrannicide (Defensio fidei catholicae 6.4 and De charitate 13.8). This is the doctrine that gained Suárez the ire of James I of England.

4. Legacy

Although largely unknown among non-specialists (at least in Anglo-American philosophy), Suárez’s influence has never been in doubt among historians of early modern philosophy. There are difficulties with establishing the extent of his influence. First, many of the canonical early modern figures seldom cite their sources. Descartes is perhaps best-known for this, but citations are hardly abundant in any of the other extrascholastic early moderns such as Spinoza, Malebranche, and Locke. The absence of citations is, of course, especially striking in comparison to the texts of Suárez and his fellow scholastics, which are replete with them. Second, encountering an idea or term in a modern text that looks very much like something in Suárez is insufficient to establish Suárezian influence, since there were hundreds of other scholastic theologians and philosophers, many of them also quite influential and many, especially Suárez’s Jesuit confrères, saying things more or less similar to what Suárez is saying and making use of the same terms and distinctions standard in scholastic circles.

Consequently, there is substantial debate about just how indebted Descartes, for example, is to Suárez. Some historians emphasize that his philosophical formation would have occurred in his education at the Jesuit La Flèche and that he himself writes in a letter that the first seeds of everything he learned came from the Jesuits. Other historians, however, point out that in a different letter he claims to remember only two Jesuits, Antonio Rubio and Francisco de Toledo, and that he generally does not seem to have been a very attentive reader.

However much uncertainty there may be about the extent of Suárez’s influence, it is certain that Hobbes, Descartes, Malebranche, Leibniz, and Berkeley all mention Suárez explicitly at least once and say a variety of things that might well be thought to borrow from or be inspired by Suárez. Wolff, too, cites Suárez and thought highly of him, to the extent that Wolff has been characterized as begotten of Suárez. Insofar as there is a path from Wolff to Kant, there might, then, reasonably be thought a path from Suárez to Kant as well.

The story of influence just told is the one most frequently encountered. It omits, however, Suárez’s main influence. Due to the vagaries of academic fashion and dubious historiographies, large swaths of early modern philosophy receive virtually no attention today. Yet Suárez was most influential in these neglected realms, which saw the rise of a Suárezian school of philosophy.

Suárez and Gabriel Vasquez were seen as rival fathers of Jesuit theology and philosophy, leading to near endless discussion of Suárez’s views by early modern Jesuits. Pedro Hurtado de Mendoza and Rodrigo de Arriaga, for example, discuss Suárez extensively in their own work (in fact, they are sometimes treated as faithful Suárezians, although that is a mistake). It must be remembered, too, that the Jesuits were active worldwide, leading to a remarkably wide dissemination of Suárez’s views. His influence was perhaps most profound in the early modern universities of Latin America, but Jesuit missionaries also spread his work to Africa and Asia, even starting a Chinese translation of DM in the seventeenth century.

Outside the Jesuit order, Suárez’s stature in Scotism—at its height in early modern Europe—is also striking. Texts by Scotist authors often include frequent and detailed discussion of Suárez’s views. His influence even transcended the main religious division of modern Western Europe. Protestant scholastics such as Francis Turretin and David Hollaz, departing from Luther’s contempt for scholasticism, borrowed freely from Suárez. Suárez is sometimes described as providing the received metaphysics for seventeenth-century Lutheran universities.

Contemporary readers usually come to Suárez via the canonical early modern philosophers such as Descartes and Leibniz, but noting these additional lines of influence does two things. In the first place, it helps provide a more accurate picture of the extent of Suárez’s influence. But, second, it also gives some indication of how many philosophers and theologians identified in Suárez a thinker of the highest caliber, with whose work it was worth engaging at length.

5. References and Further Reading

a. Primary Sources

Suárez, Francisco. Opera omnia. Paris: Louis Vivès, 1856-78.
- This standard edition is the most readily available and is freely accessible online. It is not, however, a critical edition and does not include quite all of Suárez’s works. All major philosophical works are included. No English translation of a complete work has been published. Significant portions of some works, especially Disputationes metaphysicae (DM), have been published, however, and additional translations in various stages of polish are available online.

b. Secondary Sources

This abbreviated bibliography focuses on English works published in the last several years.

Doyle, John P. Collected Studies on Francisco Suárez, S. J. (1548-1617). Edited by Victor M. Salas. Leuven: Leuven University Press, 2010.
- Doyle has perhaps done more than anyone else for Suárez studies in the U.S.A. This is a collection of essays drawn from forty years of work. The theme that receives the most attention is Suárez’s conception of metaphysics and being, but the volume also includes several papers on Suárez’s account of law and human rights.
Fichter, Joseph H. Man of Spain: Francis Suárez. New York: MacMillan, 1940.
- Still the standard English biography of Suárez, although it is rather hagiographical by contemporary standards.
Freddoso, Alfred J. “God’s General Concurrence with Secondary Causes: Why Conservation Is Not Enough.” Philosophical Perspectives 5 (1991): 553-85.
- A superb account of Suárez’s arguments in DM 22 against mere conservationism, that is, his arguments for God’s constant concurrence with the actions of created things.
Freddoso, Alfred J. “Introduction: Suárez on Metaphysical Inquiry, Efficient Causality, and Divine Action.” On Creation, Conservation, and Concurrence: Metaphysical Disputations 20-22, xi-cxxi. By Francisco Suárez. South Bend: St. Augustine’s Press, 2002.
- The focus is on Suárez’s account of creation, conservation, and concurrence, but the incisive introduction to Suárez’s metaphysics in general and account of efficient causation more particularly makes this an especially valuable essay.
Gracia, Jorge J. E. “Suárez’s Conception of Metaphysics: A Step in the Direction of Mentalism?” American Catholic Philosophical Quarterly 65.3 (1991): 287-309.
- Argues for a realist interpretation of Suárez’s account of metaphysics.
Gracia, Jorge J. E. “Francisco Suárez: The Man in History.” American Catholic Philosophical Quarterly 65.3 (1991): 259-66.
- A brief, accessible introduction to Suárez, setting him in his historical context.
Heider, Daniel. Universals in Second Scholasticism: A Comparative Study with Focus on the Theories of Francisco Suárez S. J. (1548-1617), Joao Poinsot O. P. (1589-1644), and Bartolomeo Mastri da Meldola O. F. M. Conv. (1602-1673)/Bonaventura Belluto O. F. M. Conv. (1600-1676). New York: John Benjamins Publishing Company, 2014.
- The title is an accurate guide. Exemplary historical scholarship, but challenging reading for those unfamiliar with scholastic terminology.
Hill, Benjamin, and Henrik Lagerlund, eds. The Philosophy of Francisco Suárez. Oxford: Oxford University Press, 2012.
- One of the first volumes that a student of Suárez should turn to; includes papers on a variety of topics.
Novotný, Daniel D. Ens rationis from Suárez to Caramuel: A Study in Scholasticism of the Baroque Era. New York: Fordham University Press, 2013.
- A fine study of Suárez’s account of beings of reason, followed by discussions of the accounts offered subsequently by Hurtado, Mastri/Belluto, and Caramuel. Novotný’s work is marked by both erudite historical scholarship and keen philosophical analysis.
Penner, Sydney. “Suárez on the Reduction of Categorical Relations.” Philosophers’ Imprint 13 (2013): 1-24.
- Argues that Suárez gives a realist but reductionist account of relations, albeit with some problematic results.
Perler, Dominik. “Suárez on Consciousness.” Vivarium 52.3-4 (2014): 261-86.
- An illuminating examination of Suárez’s account of our access to our own acts of perception and thinking. Looks at his distinction between first-order sensory consciousness and second-order intellectual consciousness, as well as what explains the unity of consciousness.
Salas, Victor and Robert Fastiggi, eds. A Companion to Francisco Suárez. Brill’s Companions to the Christian Tradition 53. Leiden: Brill, 2015.
- Covers a number of areas that are neglected by the Schwartz and Hill/Lagerlund collections.
Schwartz, Daniel, ed. Interpreting Suárez: Critical Essays. Cambridge: Cambridge University Press, 2012.
- An excellent set of essays on Suárez, treating a selection of key topics.
Shields, Christopher. “Virtual Presence: Psychic Mereology in Francisco Suárez.” In Partitioning the Soul: Ancient, Medieval, and Early Modern Debates, edited by K. Corcilius and D. Perler, 199-219. Berlin: W. de Gruyter, 2014.
- Examines Suárez’s account of the soul and its parts and what the talk of parts comes to.
Shields, Christopher and Daniel Schwartz. “Francisco Suárez.” Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. 2014. Accessed 30 Mar. 2015.
- A fine survey of Suárez’s life and philosophy that covers some topics neglected in the present entry.

Author Information

Sydney Penner
Email: sfp@sydneypenner.ca
Asbury University
U. S. A.

Scientific Representation

To many philosophers, science is intended to represent reality. For example, some philosophers of science would say that Newton’s theory of gravity uses the theoretical terms ‘center of mass’ and ‘gravitational force’ in order to represent how a solar system of planets behaves: the changing positions and velocities of the planets, but not their color changes. It is very difficult to give a precise account of what scientific representation is. Broadly, though, scientific representation is the important and useful relationship that holds between scientific sources (for example, models, theories, and data models) and their targets (for example, real-world systems and theoretical objects). There is a long history within philosophy of describing the nature of the representational relationship between concepts and their objects, but the discussion of scientific representation started in 20th-century philosophy of science.

There are a number of different questions one can ask when thinking about scientific representation. The question which has received the most attention, and which will receive the most attention here, is what might be called (following Callender and Cohen 2006, 68) the “constitution question” of scientific representation: “In virtue of what is there representation between scientific sources and their targets?” This has been answered in a wide variety of ways, with some arguing that it is a structural identity or similarity which ensures representation and others arguing that there is only a pragmatic relationship. Other questions about scientific representation relate more specifically to the ways in which representations are used in science. These questions are more typically asked directly about certain sorts of representational objects, especially scientific models, as well as from the perspective of sociology of science.

Substantive Accounts
Deflationary and Pragmatic Accounts
Model-Based Representation
Sociology of Science
References and Further Reading

1. Substantive Accounts

Scientific representation became a rising topic of interest with the development of the semantic view of theories which was itself developed partly as a response to the syntactic view of theories. Briefly, on the syntactic view, theoretical terms are defined in virtue of relationships of equivalence with observational entities (Suppe 1974). This was done through the creation of a first-order predicate calculus which contained a number of logical operators as well as two sets of terms, one set filled with theoretical terms and the other with observational terms. Each theoretical term was defined in terms of a correspondence rule linking it directly to an observational term. In this way a theoretical term such as ‘mass’ was given an explicit observational definition. This definition used only phenomenal or physical terms [such as “Drop the ball from the tower in this way” and “observe the time until the ball hits the ground”] plus logical terminology [such as ‘there is’ and ‘if…then’]; but the definition did not use any other theoretical terms [such as ‘gravitational force’ or ‘center of mass’]. The logical language also included a number of axioms, which were relations between theoretical terms. These axioms were understood as the scientific laws, since they showed relationships that held among the theoretical terms. Given this purely syntactic relationship between theory and observed phenomena, there was no need to give any more detailed account of the representation relationship that held between them. The correspondence rule syntactically related the theory with observations.

The details of the rejection of the syntactic view are beyond the scope of this article, but suffice it to say that this view of the structure of theories was widely rejected. With this rejection came a different account of the structure of theories, what is often called the semantic view. Since there was no longer any direct syntactic relationship between theory and observation, it became of interest to explain what relationship does hold between theories and observations, and ultimately the world.

Before examining the accounts of scientific representation that arose to explain this relationship, we should get a basic sense of the semantic view of theories. The common feature of the semantic approaches to scientific theories was the claim that theories should not be thought of as a set of axioms together with defined syntactic correspondences between theory and observation. Instead, theories are “extralinguistic entities which may be described or characterised by a number of different linguistic formulations” (Suppe 1974, 221). That is to say, theories are not tied to a single formulation or even to a particular logical language. Instead, theories are thought of as being a set of related models. This is better understood through Bas van Fraassen’s (1980) example.

Van Fraassen (1980, 41-43) asks us to consider a set of axioms which are constituents of a theory which will be called T₁:

A0 There is at least one line.

A1 For any two lines, there is at most one point that lies on both.

A2 For any two points, there is exactly one line that lies on both.

A3 On every line there lie at least two points.

A4 There are only finitely many points.

In Figure 1, we can see a model which shows that T₁ is consistent, since each of the axioms is satisfied by this model.

Figure 1 – Model of Consistency of T₁

Notice this is just one model which shows the consistency of T₁, since there are other models which could be constructed to satisfy the axioms, like van Fraassen’s Seven Point Geometry (1980, 42). Note that what is meant here by ‘model’ is whatever “satisfies the axioms of a theory” (van Fraassen, 1980, 43). Another, perhaps more intuitive, way of expressing this is that a model for a theory T is any structure which would make T true if that structure were the entirety of the universe. For example, if Figure 1 were the entirety of the universe, then clearly T₁ would be true. Notice also that, on the semantic view, the axioms themselves are not central in understanding the theory. Instead, what is important in understanding a theory is understanding the set of models which are each truth-makers for that theory, insofar as they satisfy the theory.
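
The idea that a model is whatever satisfies the axioms can be made concrete with a small check. The following is a minimal sketch, assuming the Seven Point Geometry is encoded in the usual way as seven points and seven three-point lines; the numeric labels and the function name `check_axioms` are illustrative choices, not anything found in van Fraassen's text.

```python
from itertools import combinations

# Seven Point Geometry, encoded as 7 points and 7 three-point lines.
# The numeric labels are an arbitrary choice made for this sketch.
POINTS = {1, 2, 3, 4, 5, 6, 7}
LINES = [frozenset(line) for line in (
    (1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
)]


def check_axioms(points, lines):
    """Report, axiom by axiom, whether a finite structure satisfies T1."""
    return {
        # A0: there is at least one line.
        "A0": len(lines) >= 1,
        # A1: any two lines share at most one point.
        "A1": all(len(a & b) <= 1 for a, b in combinations(lines, 2)),
        # A2: any two points lie together on exactly one line.
        "A2": all(
            sum(1 for line in lines if p in line and q in line) == 1
            for p, q in combinations(sorted(points), 2)
        ),
        # A3: every line carries at least two points.
        "A3": all(len(line) >= 2 for line in lines),
        # A4: a Python set is always finite, so this holds for anything encoded this way.
        "A4": True,
    }


seven_point = check_axioms(POINTS, LINES)
# A single line carrying a single point is not a model of T1: it violates A3.
degenerate = check_axioms({1}, [frozenset({1})])
print(seven_point)
print(degenerate)
```

Any other finite structure can be passed to the same function, which is one way of seeing that the theory is characterized by the whole family of structures that pass the check rather than by any single one of them.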

This account of the structure of theories can be applied to an actual scientific theory, like classical mechanics. Here, following Ronald Giere (1988, 78-79), we can take up the example of the idealized simple systems in physics. These are, he argues, models for the theory of classical mechanics. For example, the simple harmonic oscillator is a model which is a truth-maker for (part of) classical mechanics. The simple harmonic oscillator can be described as a machine: “a linear oscillator with a linear restoring force and no others” (Giere 1988, 79); or mathematically: F = -kx. This model, were it the entirety of the universe, would make classical mechanics true.
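
One way to read the claim that the simple harmonic oscillator is a truth-maker for (part of) classical mechanics is that its trajectory satisfies the relevant law at every moment. The sketch below checks this numerically for the idealized oscillator, using the standard solution x(t) = A cos(ωt) with ω = √(k/m). The parameter values, step size, and function names are assumptions made only for illustration and are not drawn from Giere.

```python
import math

# Hypothetical parameter values chosen only for illustration.
k = 4.0          # spring constant
m = 1.0          # mass
amplitude = 1.0  # initial displacement, released from rest
omega = math.sqrt(k / m)


def position(t):
    """Displacement of the idealized oscillator at time t."""
    return amplitude * math.cos(omega * t)


def acceleration(t, h=1e-3):
    """Central-difference estimate of the second time derivative of position."""
    return (position(t + h) - 2 * position(t) + position(t - h)) / h**2


def law_residual(t):
    """m*a - F at time t, where F = -k*x; close to zero if the law holds."""
    return m * acceleration(t) - (-k * position(t))


# Largest violation of m*a = -k*x over a sample of times between 0 and 10.
max_residual = max(abs(law_residual(0.1 * i)) for i in range(100))
print(max_residual)
```

The residual is not exactly zero only because the acceleration is estimated numerically; in the idealized model itself the law holds exactly.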

The targets of theoretical models on the semantic view are not always real world systems. On some views, there is at least one other set of models which serve as the targets for theoretical models. These are variably called empirical substructures (van Fraassen 1980) or data models. These are ways of structuring the empirical data, typically with some mathematical or algebraic method. When scientists gather and describe empirical data, they tend to think of and describe it in an already partially structured way. Part of this structure is the result of the way in which scientists measure the phenomena while being particularly attentive to certain features (and ignoring or downplaying others). Another part of this structuring is due to the patterns seen in the data which are in need of explanation. On some views, most notably van Fraassen’s (1980), the empirical model is the phenomenon which is being represented. That is to say, there is no further representational relationship holding between data models and the world, at least as scientific practice is concerned (for a discussion of this, see Brading and Landry 2006). Others argue that the relationship between theoretical models and data models is only one of a number of interesting representational relationships to be described, which set themselves up in a hierarchical structure (French and Ladyman 1999, 112-114).

With this semantic account of the structure of scientific theories in place, there arose an interest in giving an account of the representational relationship. The views which arose with the semantic view of theories are here called “substantive,” because they all attempt to give an account of the representational relationship which looks to substantive features of the source and target. Another way of putting this (following Knuuttila 2005) is to say that the substantive accounts of representation seek to explain representation as a dyadic relationship which holds between only the source and the target. As will be discussed below, this is different from the deflationary and pragmatic accounts which view scientific representation as at least a triadic relationship insofar as they add an agent to the relationship. There are two major classifications of substantive accounts of the representational relationship. The first are the structuralist views which are divided into three main types: isomorphism, partial isomorphism, and homomorphism. The second category is the similarity views.

a. Structuralist Views

Generically, the structuralist views claim that scientific representation occurs in virtue of what might be called “mapping” relationships that hold between the structure of the source and the structure of the target, i.e. the parts of the theoretical models point to the parts of the data models.

i. Isomorphism

Isomorphism holds between two objects provided that there is a bijective function (that is, a function that is both injective, or one-to-one, and surjective, or onto) between the source and the target. Formally, suppose there are two sets, set A and set B. Set A is isomorphic to set B (and vice versa) if and only if there is a function, call it f, which could be constructed between A and B which would take each member of set A and map it to one and only one member of set B, such that each member of set B is mapped to by exactly one member of set A. When the objects in question carry relations as well as members, the function must also preserve those relations; for bare sets, a bijection suffices.

To make the point clearer, let us suppose that set A is full of the capital letters of the English alphabet and set B is full of the natural numbers 1 through 26. We could create a function which, when given a letter of the alphabet, will output a number. Let’s make the function easy to understand and let f(A) = 1, f(B) = 2, and so on, according to typical alphabetical order. This function is bijective because each letter is mapped to one and only one number and every number (1 through 26) is being picked out by one and only one letter. Notice that since we can draw a bijective function from the letters to the numbers, we can also create one from the numbers to the letters: most simply, let f’(1) = A, f’(2) = B, and so on. Of course, there is nothing apart from the ease of our understanding which requires that we link A and 1, since we could have linked 1 with any letter and vice versa.
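
The letter-and-number example can be written out directly. This is a minimal sketch; the helper `is_bijection` and the alternative mapping `f_alt` are names introduced here for illustration.

```python
import string

letters = string.ascii_uppercase   # set A: the capital letters 'A'..'Z'
numbers = range(1, 27)             # set B: the natural numbers 1..26

# f(A) = 1, f(B) = 2, and so on, in alphabetical order.
f = {letter: n for n, letter in enumerate(letters, start=1)}


def is_bijection(mapping, domain, codomain):
    """True if mapping sends every member of domain to a distinct member of
    codomain and every member of codomain is hit."""
    defined_everywhere = set(mapping) == set(domain)
    onto = set(mapping.values()) == set(codomain)
    one_to_one = len(set(mapping.values())) == len(mapping)
    return defined_everywhere and onto and one_to_one


# Because f is bijective, it can be reversed: f'(1) = A, f'(2) = B, and so on.
f_inverse = {n: letter for letter, n in f.items()}

# Nothing forces the pairing of A with 1; reversing the order works equally well.
f_alt = {letter: 27 - f[letter] for letter in letters}

print(is_bijection(f, letters, numbers))
print(is_bijection(f_inverse, numbers, letters))
print(is_bijection(f_alt, letters, numbers))
```

A mapping that sent two letters to the same number, or left some number unused, would fail the check and so would not establish an isomorphism between the two sets.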

Isomorphism has frequently been used to explain representation (see van Fraassen 1980, Brading and Landry 2006). Since theories, on the semantic view, are a group of related models, there is a certain sort of structure that each of these models has. Most of the time, they are thought of as mathematical models though they need not be only mathematical as long as they have a structure. Van Fraassen also identifies what he calls “appearances,” which he defines as “the structures which can be described in experimental and measurement reports” (1980, 64). So, the appearances are the measurable, observable structures which are being represented (the targets of the representation). On van Fraassen’s account (1980), a theory will be successfully representational provided that there is an isomorphic relationship between the empirical substructures (the sources) and the appearances (as targets), and an isomorphic relationship between the theoretical models (as sources) and the empirical substructures (as targets). (Or at any rate, this is how he has commonly been interpreted (see Ladyman, Bueno, Suárez, and van Fraassen 2010)). As described by Mauricio Suárez (2003, 228), this isomorphism between the models shows that there is an identity that holds between the “relational framework of the structures” of the source and the target. And it is this relational framework of structures which is being maintained.

So, on the isomorphism view of scientific representation, some scientific theory represents some target phenomena in virtue of a bijective mapping between the structures of a theory and data, and a bijective mapping between the data and the phenomena. Notice that on the isomorphic view, the bijections which account for representation are external to the theoretical language. That is to say that the relationship that holds between the theory and the phenomena is not internal to the language in which the theory is presented. This is an important feature of this account because it allows for a mapping between very different kinds of structures. Presumably the (mainly mathematical) structures of theories are quite different from the structures of data models and are certainly very different from the structures of the phenomena (because the phenomena are not themselves mathematical entities). However, since the functions are external, we can create a function which will map these very different types of structures to one another.

ii. Partial Isomorphism

Isomorphism has much to recommend it, especially when focusing on those theories which are expressed mathematically. This is especially true in more mathematically-driven fields like physics. It seems that the mathematical models in physics are representing the structure that holds between various real world phenomena. For example, F=ma represents the way in which certain features of an object (its mass, the rate at which it is being accelerated) correspond to another quantity (the net force acting on it). However, many philosophers (for example, Cartwright 1983 and Cartwright, Shomar, and Suárez 1995) have pointed out that there are cases where a theory or model truly represents some phenomena, even though there are features of the phenomena which do not have any corresponding structure in the theory or model, due to abstraction or idealization.

Take a rather simple example, the billiard ball model of a gas (French and Ladyman 1999). Drawing on Mary Hesse’s (1966) important work on models, French and Ladyman argue that there are certain features of the model which are taken to be representative, for example, the mass and the velocity of the billiard balls represent the mass and velocity of gas atoms. There are also certain features of the billiard balls which are non-representational, for example, the colors of the balls. Most importantly, though, as a critique of isomorphism, there are typically also some undetermined features of the balls. That is to say, for some of the features of the model, it is unknown whether they are representational or not. (For a more detailed scientific example, see Cartwright, Shomar, and Suárez 1995).

To respond to problems of this sort, many (French and Ladyman 1999; Bueno 1997; French 2003; da Costa and French 2003) have argued for partial isomorphism. The basic idea is that there are partial structures of a theory for which we can define three sets of members for some relation. The first set will be those members which do have the relevant relation, the second set will be those members which do not have that relation, and the third will be those members for which it is unknown whether or not they have that relation. It is possible to think of each of these sets of individuals as being a relation itself (since a relation, semantically speaking, is extensionally defined), and so we could draw a bijective function between these relations. But, as long as the third relation (the third set of individuals for which it was unknown whether or not they had the relation) is not empty, then the isomorphism will be only partial because there are some relations for which we are unsure whether or not they hold in the target.

As a more concrete example, consider the billiard ball and atom example from above. In order for there to be a partial isomorphism between the two, we must be able to identify two partial structures of each system, that is, a partial structure of the billiard ball model and a partial structure of the gas atoms. Between these partial structures, there must be a bijective function which maps relations of the model to relations of the gas-system. For example, the velocity of the billiard balls will be mapped to the velocity of respective atoms. There must be a second function which maps those non-representational relations of the model to features of the gas-system which are not being represented. For example, a non-representative feature of the model, like the color of the billiard balls, will be mapped to some feature of the system which is not being represented, like the non-color of the atoms. All the same, this will still remain partial because there will be certain relations that the model has which are unknown (or undefined) in relationship to the gas-system.
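
The three-way split of a relation, and the requirement that a mapping respect the first two parts, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single binary relation (a hypothetical "collides with"), three invented billiard balls and three invented atoms, and the function names are introduced here. It is not the formal apparatus of da Costa and French.

```python
from itertools import product


def partial_relation(domain, holds, fails):
    """Split all ordered pairs over domain into (R1, R2, R3): pairs that have
    the relation, pairs that lack it, and pairs for which this is unknown."""
    r1, r2 = set(holds), set(fails)
    if r1 & r2:
        raise ValueError("a pair cannot both have and lack the relation")
    r3 = set(product(domain, repeat=2)) - r1 - r2
    return r1, r2, r3


def is_partial_isomorphism(f, source_domain, target_domain, source_rel, target_rel):
    """True if f is a bijection that carries R1 onto R1 and R2 onto R2."""
    bijective = (
        set(f) == set(source_domain)
        and set(f.values()) == set(target_domain)
        and len(set(f.values())) == len(f)
    )
    if not bijective:
        return False

    def image(pairs):
        return {(f[x], f[y]) for x, y in pairs}

    (s1, s2, _), (t1, t2, _) = source_rel, target_rel
    return image(s1) == t1 and image(s2) == t2


balls = ["b1", "b2", "b3"]
atoms = ["a1", "a2", "a3"]

# Hypothetical data: one known collision, no self-collisions, the rest unknown.
ball_rel = partial_relation(balls, holds={("b1", "b2")}, fails={(b, b) for b in balls})
atom_rel = partial_relation(atoms, holds={("a1", "a2")}, fails={(a, a) for a in atoms})

f = dict(zip(balls, atoms))
f_scrambled = {"b1": "a3", "b2": "a2", "b3": "a1"}

print(is_partial_isomorphism(f, balls, atoms, ball_rel, atom_rel))
print(is_partial_isomorphism(f_scrambled, balls, atoms, ball_rel, atom_rel))
print(len(ball_rel[2]))  # number of undetermined pairs
```

Because the third component is not empty, the match between model and gas system established by `f` is only partial: it says nothing about the pairs whose status is unknown.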

iii. Homomorphism

Homomorphism, defended by Bartels (2006), is more general than isomorphism insofar as all isomorphisms are homomorphisms, but not all homomorphisms are isomorphisms. Homomorphisms still rely on a function being drawn between two sets, but they do not require that the function be bijective; that is, the function need not be one-to-one or onto. So, this means that not every relation and part of the theory must map on to one and only one relation or part of the target systems. Additionally, this permits that there be parts and relations in the target system which are unmapped. Homomorphisms allow for a great deal of flexibility with regard to misrepresentations.
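
The contrast with isomorphism can be seen in a small case. The sketch below uses the standard algebraic notion of a relation-preserving map, which is an assumption about what is meant here and leaves out the further details of Bartels's account; the structures and names are invented for illustration.

```python
def is_homomorphism(h, source_rel, target_rel):
    """True if every related pair in the source is sent to a related pair in the target."""
    return all((h[x], h[y]) in target_rel for x, y in source_rel)


# Source structure (think: parts of a model) with one binary relation.
source_rel = {("a", "b"), ("a", "c")}

# Target structure with its own binary relation.
target_domain = {1, 2, 3}
target_rel = {(1, 2), (2, 3)}

# h sends both b and c to 2, and nothing at all to 3.
h = {"a": 1, "b": 2, "c": 2}

preserves_structure = is_homomorphism(h, source_rel, target_rel)
one_to_one = len(set(h.values())) == len(h)
onto = set(h.values()) == target_domain

print(preserves_structure, one_to_one, onto)
```

Here the map preserves the relation even though it is neither one-to-one nor onto, which is exactly the flexibility the homomorphism view trades on: distinct parts of the source may share an image, and parts of the target may go unmapped.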

b. Similarity

Isomorphism (and the other -morphisms) places a fairly strict requirement on the relevant constitutive features of representation, which are on these views structural. But, as Giere points out (1988, 80-81), this is often not the relevant relationship. Oftentimes, scientists are working with theories or models which are valuable not for their salient structural features, but rather for some other reason. For example, when modeling the behavior of water flowing through pipes, scientists often model the water as a continuous fluid, even though it is actually a collection of discrete molecules (Giere 2004). Here, the representational value of the model is not between the structure of the model and the structure of the world (since water is structurally not continuous, but rather a collection of discrete molecules). Instead, the relevant representational value comes from a more general relationship which holds between the behavior of the modeled and real world systems. Giere suggests that what is needed is “a weaker interpretation of the relationship between model and real system” (1988, 81). His suggestion is that we explain representation in virtue of similarity. On his account a model will represent some real world system insofar as it is similar to the real world system. Notice that this is a much weaker account of representation than the structural accounts, since similarity includes structural similarities, and so encompasses isomorphism, partial isomorphism, and homomorphism.

Of course, if we try hard enough, we can notice similarities between any two objects. For example, any two material objects are similar at least insofar as they are each material. Thus, Giere suggests that an account of scientific representation which appeals to similarity requires an “implicit” (or explicit) “specification of relevant respects and degrees” (1988, 81). Respects indicate the relevant parts and ways in which the model is taken to be representative. Perhaps it is some dynamical relationship expressed in an equation; perhaps it is some physical similarity that exists between some tangible model and some target object (for example, a plastic model of a benzene ring); perhaps it is the way in which two parts of a model are able to interact with one another, which shows how two objects in the target system might interact (like the relevant behavior of the model of water flowing through pipes). Claims about the respects of similarity are limited only by what scientists know or take to be the case about the model and the target system. For example, a scientist could not claim that there was a similarity between the color of a benzene model and a benzene ring since benzene rings have no color. Similarly, a scientist could not claim that there is similarity between the color of a mathematical model and the color of a species of bacteria since a mathematical model does not have any color. Notice that it is insufficient to merely specify the respects in which a model is similar, since similarity comes in degrees. There is a whole spectrum of degrees of similarity on which any particular similarity can fall. A source can be anywhere from an extremely vague approximation of its target to being nearly identical to its target (what Giere calls “exact” (1988, 93)) and everywhere in between.

Giere’s own example is that, “The positions and velocities of the earth and moon in the earth-moon system are very close to those of a two-particle Newtonian model with an inverse square central force” (1988, 80). Here, the relevant respects are the position and velocity of the earth and moon. The relevant degree is that the positions and velocities in the earth-moon system are “very close” to the two-particle Newtonian model. These respects and degrees thus give us an account of how we should think of the similarity between the model and the target system.

Giere uses similarity to describe the relationship between models and the real-world systems they represent, and sometimes between different models (one model may be a generalization of another, and so on). Theories themselves are constituted by a set of these models as well as some hypotheses that link the models to the real world which define the respect and degree of the similarity between the models and their targets.

More recently, Michael Weisberg (2013) has argued for a similarity account of representation. In brief, his view argues that two sets of things be distinguished in both source and target: the attributes and the mechanisms. In distinguishing these sets, an equation can be written in which the common attributes and mechanisms can be thought of as the intersection of the attributes of the model and of the target system, and the intersection of the mechanisms of the model and target system. The dissimilarities can also be identified in a similar fashion. He adds some terms to these sets which are weighting terms and functions. These allow the users to indicate which similarities are more important than others. Rewriting the equation as a ratio between similarities and dissimilarities will result in a method by which we can make comparative judgments about different models. In this way, we will be able to say, for example, that one model is more or less similar than another.
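
A weighted comparison of this kind can be sketched in a few lines. The following is a simplified illustration and should not be read as Weisberg's exact formulation: the ratio form (shared features over shared plus unshared features), the use of set size as the weighting function, the parameter names, and the example feature sets are all assumptions made here.

```python
def similarity(model_attrs, target_attrs, model_mechs, target_mechs,
               f=len, theta=1.0, rho=1.0, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Weighted share of attributes and mechanisms that model and target have in common.

    f weights a set of features (here simply its size); the Greek-letter
    parameters let a user say which matches and mismatches matter more.
    """
    shared = (theta * f(model_attrs & target_attrs)
              + rho * f(model_mechs & target_mechs))
    unshared = (alpha * f(model_attrs - target_attrs)
                + beta * f(model_mechs - target_mechs)
                + gamma * f(target_attrs - model_attrs)
                + delta * f(target_mechs - model_mechs))
    total = shared + unshared
    return shared / total if total else 0.0


# Hypothetical feature sets for a gas and two candidate models of it.
gas_attrs = {"mass", "velocity", "charge"}
gas_mechs = {"elastic collision"}

score_a = similarity({"mass", "velocity", "color"}, gas_attrs,
                     {"elastic collision"}, gas_mechs)
score_b = similarity({"color"}, gas_attrs, set(), gas_mechs)

print(score_a, score_b)
```

With equal weights, the first model scores higher than the second, which is the sort of comparative judgment the account is meant to license. Changing the weights changes the ranking only insofar as the user declares some similarities or dissimilarities more important than others.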

c. Critiques of Substantive Accounts

While similarity and isomorphism continue to have some support in the contemporary literature (especially in modified versions; see below, section 3c), the versions described above have faced serious criticisms. One of the most common arguments against the substantive views is that they are unable to handle misrepresentations (Suárez 2003, 233-235; Frigg 2006, 51). Many models in science do not accurately reflect the world, and, in fact, the model is often viewed as particularly useful because of (not in spite of) the misrepresentations. Nancy Cartwright (1983) has famously argued for a fictional account of modelling and made this case for the laws of physics. Others have shown that similar things are true in other scientific domains (Weisberg 2007a). When the theories are intentionally inaccurate, there will be difficulty in explaining the way in which these theories are representational (as scientists and philosophers often take them to be), with reference to isomorphism or similarity.

Suárez (2003, 235-237) has also argued that similarity and isomorphism are each neither necessary nor sufficient for representation. Consider first isomorphism. It must be the case that it is not necessary for representation, given that scientists often take certain theories to be representative of their real-world targets even though there is no isomorphic relationship between the theory and the target system. The same is true of similarity. Using his example, suppose that there is an artist painting an ocean view, using some blue and green paints. This painting has all sorts of similarities to the ocean view she is representing: for instance, the painting and the ocean are on the same relative side of the moon, are both in her line of vision at time t, share certain colors, and so forth. But which ones are relevant to its being representative and which are more contingent is up to the discretion of the agent who takes it to be representative of the ocean view in certain respects (as Giere argued). But if this is the case, then it turns out that A represents B if and only if A and B are similar in those respects in which A represents B. This ultimately leaves representation unexplained.

Supposing we can give some account of salience or attention or some other socially-based response to this first problem (which seems possible), we are left with the problem that plenty of salient similarities are non-representational. Suárez makes this point with Picasso’s Guernica (2003, 236). The bull, crying mother, eye, knife, and so forth, are all similar to certain real-world objects. But the painting is not a representation of these other things. It is representing some of the horrible atrocities of Franco.

Suárez also argues that both similarity and isomorphism are insufficient for representation. Consider the first, similarity. Take any given manufactured item, for example, an Acer C720 Chromebook, a computer which is similar to many other computers (hundreds of thousands). Notice that the fact of its similarity is insufficient to make it represent any of the other computers. Even if we add in Giere’s requirement that there be hypotheses which define the respects and degrees of the similarity, the insufficiency will remain. In fact, it seems as though there are hypotheses which define the relevant respects and degrees of similarity between the computers: Acer’s engineers and quality control have made sure that the production of these computers will result in similar computers. All the same, even with these hypotheses which give respects and degrees, we would not want to say that any given computer represents the others.

The non-sufficiency problem holds for isomorphism as well. Suppose someone were to write down some equation which had various constants and variables, and expressed certain relationships that held between the parts of the equation. Suppose now that, against all odds, this equation turns out to be isomorphic to some real-world system, say, that it describes the relationship between rising water temperatures and the reproduction rate of some species of fish which is native to mountain streams in the Colorado Rockies. To many, it appears to be counterintuitive to think that representations could happen accidentally. However, if isomorphism is sufficient for representation, then we would have to admit that the randomly composed equation does represent this fish species, even if no one ever uses or even recognizes the isomorphic relationship.

There are other arguments against these views in general, an important one being that they lack the right logical properties. Drawing on the work of Goodman (1976), both Suárez (2003, 232-233) and Roman Frigg (2006, 54) argue that representation has certain logical properties which are not shared by similarity or isomorphism. Representation is non-symmetric, so when some A represents B, it does not follow that B represents A. Representation is non-transitive: if A represents B and B represents C, it does not follow that A represents C. It’s also non-reflexive: A does not represent itself. Since isomorphism is reflexive, transitive, and symmetric, and similarity is reflexive and symmetric, they do not have the properties required to account for representation.
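
The mismatch in logical properties can be checked on small finite examples. This is a minimal sketch with invented data: "similar" is modelled as sharing at least one feature, "isomorphic" as having the same number of members (which is what isomorphism amounts to for bare finite sets), and "represents" as a hand-picked relation among a photograph, a map, and a city.

```python
def is_reflexive(rel, domain):
    """Every member of the domain is related to itself."""
    return all((x, x) in rel for x in domain)


def is_symmetric(rel):
    """Whenever x is related to y, y is related to x."""
    return all((y, x) in rel for x, y in rel)


def is_transitive(rel):
    """Whenever x is related to y and y to z, x is related to z."""
    return all((x, z) in rel for x, y in rel for w, z in rel if y == w)


def properties(rel, domain):
    return is_reflexive(rel, domain), is_symmetric(rel), is_transitive(rel)


# Similarity: two objects are similar if they share at least one feature.
features = {"x": {1, 2}, "y": {2, 3}, "z": {3, 4}}
similar = {(a, b) for a in features for b in features if features[a] & features[b]}

# Isomorphism between bare finite sets: same number of members.
sets = {"A": {1, 2}, "B": {"a", "b"}, "C": {0}}
isomorphic = {(a, b) for a in sets for b in sets if len(sets[a]) == len(sets[b])}

# Representation: a photo represents a map, and the map represents a city.
things = {"photo", "map", "city"}
represents = {("photo", "map"), ("map", "city")}

print(properties(similar, features))
print(properties(isomorphic, sets))
print(properties(represents, things))
```

In this toy case isomorphism has all three properties, similarity has the first two, and the representation relation has none of them, which mirrors the structure of the argument from Suárez and Frigg.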

There are replies to these arguments on behalf of the substantive views. First, there is a general question about whether or not we are justified in making inferences from representation in art to representation in science. As was discussed above, many of the criticisms against substantive views draw examples from the domain of art (for example, Suárez (2003) uses many examples of paintings and draws upon Goodman (1976), which discusses representation in art). But it should not be taken as given that what holds in art must translate to science. In fact, in many cases, the practices in art seem to be quite different from the practices in science. As Bueno and French say, “After all, what do paintings—in particular those that are given as counter-examples to our approach, which are drawn from abstract art—really have to do with scientific representation?” (2011, 879).

Following Anjan Chakravartty (2009), Otávio Bueno and French (2011) argue that something like similarity or partial isomorphism is, in fact, necessary for successful representation in science. If there were no similarity or isomorphism at all, the successful use of models “would be nothing short of a miracle” (885). That is to say, while similarity or partial isomorphism might not be the whole story, they are at least part of the story. Using the aforementioned example of Picasso’s Guernica, they note that “there has to be some partial isomorphism between the marks on the canvass and specific objects in the world in order for our understanding of what Guernica represents to get off the ground” (885).

Replies have been made to the other arguments as well. Bueno and French (2011) argue that their account of partial isomorphism can meet all of the criticisms raised by Suárez (2003) and Frigg (2006). Adam Toon (2012) discusses some of the ways in which supporters of a similarity account of representation might respond to criticisms. Bartels (2006) defends the homomorphism account against these criticisms.

2. Deflationary and Pragmatic Accounts

If, as these scholars have argued, these substantive views will not work to explain scientific representation, what will? Suárez (2015) argues that what is needed instead is a deflationary account. A deflationary account claims “that there is no substantive property or relation at stake” (37) in debates about scientific representation. Deflationary accounts are typically marked by a couple of features. First, a deflationary account will deny that there are any necessary and sufficient conditions of scientific representation, or if there are, they will lack any explanatory value with regard to the nature of scientific representation. Second, these accounts will typically view representation as a relationship which is deeply tied to scientific practice. As Suárez puts it, “it is impossible, on a deflationary account, for the concept of representation in any area of science to be at variance with the norms that govern representational practice in that area…representation in that area, if anything at all, is nothing but that practice” (2015, 38).

Already we can see that these views will be quite different from the substantive views. Each of the substantive views gave necessary and sufficient conditions for representation. There was also a distinct way in which those views were detached from scientific practice, since whether something was representational had little to do with whether or not it was accepted by scientists as representational and more to do with the features of the source and target. In each case, it was a relationship that was entirely accounted for by features of the theory or model and the target system. As Knuuttila (2005) describes it, these were all dyadic (two-place) accounts insofar as the relationship held between only two things. The deflationary accounts take a markedly different direction by moving to at least a triadic (three-place) account of representation.

In some cases, the views that have developed have followed the general lead of many deflationary views in giving a central role to the work of an agent in representation. These views do not qualify as deflationary, given that they still give necessary and sufficient conditions of representation. Given the importance of the role of agents and aims, we might call these views pragmatic. Although pragmatic and deflationary views are importantly distinct in their aims, they share many common threads and in many cases, the views could be reinterpreted as deflationary or pragmatic with little effort. As such, they will be grouped together in this section.

a. DDI

The earliest deflationary account of representation was R. I. G. Hughes’ DDI Account (1997). The DDI Account consists of three parts: denotation, demonstration, and interpretation. Denotation is the way in which a model or theory can reference, symbolize, or otherwise act as a stand-in for the target system. The sort of denotation being invoked by Hughes is broad enough to include the denotation of concrete particulars (for example, a model of the solar system will denote particular planets), the denotation of specific types (for example, Bohr’s theory models not just this hydrogen atom, but all hydrogen atoms), and the denotation of a model of some global theory (for example, this particular model is “represented as a quantum system” (S331)). In each case, the model denotes something else; it stands in for some particular concrete object, some type of theoretical object, or some type of dynamical system.

We might think this relationship sufficient for representation, since the fact that scientists treat certain objects or parts of models as being stand-ins or symbols for some target system seems to answer the question of the relationship between a model and the world. Hughes, though, thinks that in order to understand scientific representation, we need to examine how it is actually used in scientific practice. This requires additional steps of analysis. The second part of Hughes’ DDI Account is demonstration. This is a feature by which models “contain resources which enable us to demonstrate the results we are interested in” (S332). That is, models are typically “representations-as,” meaning not only do scientists represent some target object or system, but they also represent it in a certain way with certain features made to be salient. The nature of this salience is such that it allows users to draw certain types of conclusions and make certain predictions, both novel and not. This is demonstration in the sense that the models are the vehicles through which (or in which) these insights can be drawn or demonstrated, physically, geometrically, mathematically, and so forth. This requires that they be workable or used in certain ways.

The final part of the DDI Account is interpretation. It is insufficient that the models demonstrate some particular insight. The insight must be interpreted in terms of the target system. That is to say, scientists can use the models as vehicles of the demonstration, but in doing so, part of the representational process as defended in the DDI Account is that scientists interpret the demonstrated insights or results not as features of the model, but rather as features which apply to the target system (or at least, the way scientists are thinking of the target system).

In summary, with denotation, we are moving in thought from some target system to a model. We take a model or its parts to stand in or symbolize some target system or object. In demonstration, we use the model as a vehicle to come to certain insights, predictions, or results with regard to the relationship that holds internal to the model. It is in interpretation that we move from the model back to the world, taking the results or insights gained through use of the model to be about the target system or object in the world.

b. Inferential

i. Suárez

After criticizing the substantive accounts in his (2003), Suárez (2004) developed his own account of representation which focused centrally on inference and inferential capacities, what he calls an inferential conception of representation. As he describes it there, this account involves two parts. The first part is what he calls representational force. Representational force is defined as “the capacity of a source to lead a competent and informed user to a consideration of the target” (2004, 768). Representational force can exist for a number of reasons. One way to get representational force is to repeatedly use the source as a representation of the target. Another way is in virtue of intended representational uses, that is, in virtue of the intention of the creator or author of some source viewed within the context of a broader scientific community. Oftentimes, the representational force will occur as a combination of the two. It is also a contextual property, insofar as it requires that the agent using the source has the relevant contextual knowledge to be able to go from the source to the (correct/intended) target.

So, for example, in the upper left-hand corner of my word processor is a little blueish square with a smaller white square and a small dark circle inside of it (it is supposed to be an image of a floppy disk). This has representational force insofar as it allows me to go from the source (the image of the floppy disk) to the target (a means of saving the document which I am currently writing). In this case, the representational force exists in virtue of both the intended representational uses (the creators of this word processor surely intend this symbol to stand in for this activity) as well as repeated uses (I am part of a society which has, in the past, repeatedly used an image of a floppy disk to get to this target, not only in this program but in many others as well). It is also contextual: someone who had never used computers would not have the requisite knowledge to be able to use the icon correctly.

This is part of the story for Suárez, but in order to have scientific representation there must be something more than mere representational force. On his view, scientific representations are subject to a sort of objectivity which does not necessarily exist for other representations, such as the save icon in the example above. The objectivity is not meant to indicate that there is somehow an independent representational relationship that exists in the world when scientists are engaged in scientific representation. Instead, the objectivity is present insofar as representations are constrained in various ways by the relevant features of the target system which is being represented. That is, because there is some real feature which scientists are intentionally trying to represent in their scientific models and theories, the representation cannot be arbitrary but must respond to these relevant features. So the constraints are themselves objective, but this does not commit Suárez to identifying some reified relationship that holds between sources and targets.

According to Suárez, if we are going to get this objectivity in representations, we must turn to a second feature: the capacity of a source to allow for surrogate reasoning. This second feature requires that informed and competent agents be led to draw specific inferences regarding the target. These inferences can be the result of “any type of reasoning…as long as [the source] is the vehicle of the reasoning that leads an agent to draw inferences regarding [the target]” (2004, 773). Suárez’s point here is that not only does the source lead the agent to the target, but also it leads the agent to think about the target in a particular way, coming to particular insights and inferences about the target by way of the source.

More recently, Suárez (2010) has argued that this second feature, the capacity for surrogate reasoning, typically requires that three things be in place. First, the source must have internal structure such that certain relations between parts can be identified and examined. Secondly, when examining the parts of the source, scientists must do so in terms of the target’s parts. Finally, there must be a set of norms defined by the scientific practice which define and limit which inferences are “correct” or intended. It is in virtue of these norms of the practice that an agent will be able to draw the relevant and intended inferences, making the representation a part of that particular scientific practice. Of course, he takes his view to be deflationary, so these are not to be understood as necessary and sufficient conditions of the capacity for surrogate reasoning, but rather features which are frequently in place.

Consider an example of a mathematical model, for example the Lotka-Volterra equations. The model is supposed to be representational of predator-prey relationships. Part of this is Suárez’s representational force: the fact that competent agents will be led to consider predator-prey relationships when considering the source. However, as Suárez notes, this is insufficient for scientific representation because in science the terms interact in a non-arbitrary way. To account for this, he argues that there is another feature of the model, which is the capacity to allow for surrogate reasoning. In this case, that means that individuals who examine or manipulate the model in terms of its parts (the multiple variables) will be able to draw certain inferences about the nature of real-world interactions between predators and prey (the parts of the target system). These insights will occur in part due to the nature of the model as well as the norms of scientific practice, which means that the inferences will be non-arbitrarily related to the real-world phenomena and will allow us to recognize certain specified inferences of scientific interest.
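
As an illustration, the model in its standard textbook form can be written as a pair of coupled differential equations. The notation below is a common convention assumed here for the sake of the example; the symbols are not drawn from Suárez’s discussion.

```latex
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta x y, \\
\frac{dy}{dt} &= \delta x y - \gamma y.
\end{aligned}
```

On this convention, x is the size of the prey population, y is the size of the predator population, α is the prey growth rate, β is the rate of predation, δ is the rate at which predation contributes to predator growth, and γ is the predator death rate. Setting both derivatives to zero gives, besides the trivial solution x = y = 0, the equilibrium x = γ/δ and y = α/β. An agent who derives this result within the model and then reads it as a claim about when real predator and prey populations hold steady is doing the kind of surrogate reasoning described above: the source is the vehicle of the inference, and the conclusion concerns the target.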

ii. Contessa

Suárez’s inferential account has been further developed by Gabriele Contessa (2007, 2011). He is explicit in his claim that the interpretational view he is defending is not a deflationary account, but is rather a substantive version of the inferential account insofar as he takes the account to give necessary and sufficient conditions of representation. All the same, the account he defends is clearly pragmatic in nature. Contessa begins by noting an important distinction he has drawn from Suárez’s work, that of the difference between three types of representation. The first is mere denotation, in which some (arbitrarily) chosen sign is taken to stand for some object. He gives the example of the logo of the London Underground denoting the actual system of trains and tracks.

The second sort of representation is what Contessa calls “epistemic representation” (2007, 52). An epistemic representation is one which allows surrogate reasoning of the sort described by Suárez. The London Underground logo does not have this feature since no one would be able to use it to figure out how to navigate. A map of the London Underground, on the other hand, would have this feature insofar as it could be used by an agent to draw these sorts of inferences.

The final sort of representation is what he calls “faithful epistemic representation” (2007, 54-55). Whether or not a representation is faithful is a matter of degree, so something will be a completely faithful epistemic representation provided all of the valid inferences which can be drawn about the target using the source as a vehicle will also be sound. Notice this does not require that a model user be able to draw every possible inference about the target, but rather that the inferences licensed by the source that are drawn will be sound inferences (both following from the source and true of the target). In this sense, a map of the London Underground produced yesterday will be more faithful than one produced in the 1930s.

Using this framework, Contessa goes on to describe a scientific model as an epistemic representation of features of particular target systems (56). The scientific model will be representational for a user when she interprets the source in terms of the target. He remains open to there being multiple sorts of interpretation which are relevant, but suggests that the most common sort of interpretation is “analytic,” which functions quite similarly to an isomorphism in which every part and relation of the source is interpreted as denoting one and only one part and relation in the target (and all of the target’s parts and relations are denoted by some part or relation from the source).

Of course, given that this is determined by the agent’s use, it is not necessary that the agent believe that her interpretation is actually the case about the system. Here is where Contessa draws on the distinction of faithfulness. Since models are often misrepresentations and idealizations as has been discussed above, they need not be completely faithful in order to be useful. This is not the end of the story, though, because the circumstances also play an important role in understanding whether or not something is a scientific representation.

c. Agent-Based Versions of Substantive Accounts

In light of some of the insights of Suárez and others, many of the views described above as substantive views were altered and updated to more explicitly and centrally make reference to the role of an agent, making them what could be called agent-centered approaches. Most important, given their role in the substantive views described above, are the recent advances made by van Fraassen and Giere.

i. Agent-Based Isomorphism

The view of isomorphism commonly attributed to van Fraassen, which was described above, was the one drawn from his 1980 book, The Scientific Image. More recently, van Fraassen (2008) has presented an altered account of representation, which places much more emphasis on the role of an agent. Van Fraassen notes that while some reference to an agent was a part of his earlier views (Ladyman, Bueno, Suárez, and van Fraassen, 2010), Suárez’s important work on deflationary accounts was influential in the development of the view he defends (2008, 7, 25-26; 2010).

He begins his account by looking primarily to the way in which a representation is used, saying that a source’s being representative of some target “depends largely, and sometimes only” on the way in which the source is being used (2008, 23). Though he does not take himself to be offering any substantive theory of representation, he does call this the Hauptsatz or primary claim of his account of representation: “There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so” (2008, 23). Van Fraassen notices that this places some restrictions on what can possibly be representational. Mental images are limited, because they are not made or used in some way. That is to say, we do not give our mental states representational roles. Similarly, there is no such thing as a representation produced naturally. What it is to be a representation is to be taken or used as a representation, and this is not something that happens spontaneously without the influence of an agent.

Van Fraassen also notices an important distinction in two ways of representing: representation of and representation as. When scientists take or use some source to be representational, they take it to be a representation of some target. This target can change based on context, and sometimes scientists might not even use the source to be a representation at all. Consider van Fraassen’s example: we can use a graph to represent the growth of bacterial colonies under certain conditions, and so the graph will be a representation of bacterial growth (2008, 27). But we could also use that graph to represent other phenomena, perhaps the acceleration of an object as it is dropped from some height. Part of what this captures is the way in which our perspectives can change the way in which we are representing a particular appearance. Thus, by using a source in some distinct way, we can represent some particular appearance of some particular phenomenon.

In intentionally using a source as a representation, scientists do not only make it a representation of something, but they also represent it in a certain light, making certain features salient. This is what van Fraassen calls representation as. Two representations can be of the same target, but might represent that target as something different. Van Fraassen offers an example: everything that has a heart also has a kidney, but representing some organism as having a heart does not mean the same thing as representing it as having kidneys (2008, 27). Similarly, we might represent the growth of bacteria mentioned above as an example of a certain sort of growth model or as the worsening of some infection as it is seen as part of a disease process.

Of course, all of this is very general, which van Fraassen acknowledges. However, in a true deflationary spirit, he notes that there is no good way of getting more specific about scientific representation since it has “variable polyadicity: for every such specification we add there will be another one” (2008, 29). Nonetheless, he still maintains that the link between a good or useful representation and phenomena requires a similarity in structure. As it stands, then, there is still an appeal to isomorphism present in his account: “A model can (be used to) represent a given phenomenon accurately only if it has a substructure isomorphic to that phenomenon” (2008, 309). Just as before, we have an account of representation which relies on isomorphism between the structure of the theoretical models and the (structure of the) phenomena. All the same, this is still a markedly different view from his earlier view described above. No longer is it the isomorphism or structural relationship alone which is representational. Now, on van Fraassen’s view, what makes a model representational is the fact that a scientific community uses it or takes it to be representational.

ii. Agent-Based Similarity

Ian Hacking (1983) has famously argued that, in philosophical discussions of the role and activity of science, too much emphasis is put on representation. Instead, he suggests that much of what is done in science is intervening, and this concept of intervention is key to understanding the reality with which science is engaged. All the same, he still thinks that science can and does represent. Representation, on his account, is a human activity which exhibits itself in a number of different styles. It is people who make representations, and typically, they do so in terms of a likeness, which he takes to be a basic concept. Representation in terms of likeness, he thinks, is essential to being human, and he even speculates that it may have played a role in human development, as many think language did. In creating a likeness, though, he argues that there is no analyzable relation being made. Instead, “[likeness] creates the terms in a relation…First there is representation, and then there is ‘real’” (139). Representation on his view is not interested in being true or false, since the representation precedes the real.

Giere (2004, 2010) has also made pragmatics more central and explicit to his account of scientific representation. He claims that in attempting to understand representation in science we should not begin with some independent two-place relationship, which substantially exists in the world. Instead, we should begin with the activity of representing. If we are going to view this activity as a relationship, it will have more than two places. He proposes a four-place relation: “S uses X to represent W for purposes P” (2004, 743). Here, S will be some agent broadly construed, such that it could be some individual scientist, or less specifically some group of scientists. X is any representational object, including models, graphs, words, photographs, computational models, and theories. W is some aspect or feature of the world and P are the aims and goals of the representational activity; that is, the reasons why the scientist is using the source to represent the target. Giere identifies a number of different potential purposes of representation. These include things like learning what something is actually like, but are fairly contextual and depend upon the question being asked. So the way in which something is modeled might change depending on the purposes of the representation (2004, 749-750).

Giere is still working from what should be considered a semantic conception of theories, in which a theory is a set of models which are created according to a set of principles and certain specific conditions. The principles are what we might otherwise think of as being empirical laws, but he does not conceive of them as having empirical truth. Instead, by thinking of them as principles by which scientists can form models, it is these scientists who construct and use the models who make particular the otherwise general and idealized principles. On this view, then, it is the models which are representational and will link up to the empirical world.

There are many ways a scientist can use a model to represent the world, on Giere’s view, but the most important way remains similarity. Giere is quick to note that this does not mean that we need to think of the representational relationship as some objective or substantive relationship in the world. Instead, the scientist who uses the model does the representing and she will often do this in virtue of picking out certain salient features of a model which are similar to the target system. In doing so, the scientist specifies the relevant aspects and degrees of similarity which she is using in her act of representation.

One of the advantages of this updated version of the similarity view is the wide range of models which can be effectively representational on this account (Giere 2010). Giere gives an example of a time when he saw a nuclear physicist treat a pencil as a model of a beam of protons, explaining how the beam could be polarized. It is in virtue of the invoked similarity between the pencil and a beam of protons, that is, the fact that the physicist specifically used a relevant similarity, that he was able to use it to represent the beam of protons. By noting the importance of the role of the agent, Giere is better able to explain the representation which occurs across the whole range of scientific representations.

A similar yet importantly distinct account of representation as similarity is defended by Paul Teller (2001). Teller argues that we should abandon what he called the perfect model model, in which we take scientists to model in a way that is perfectly correspondent with the real world targets. Instead, he thinks that models are rarely, if ever, perfect matches for their targets. This does not mean that models are not representations. He argues that models represent their targets in virtue of similarity, though he denies that any general account of similarity can be given. What makes something a similarity depends deeply upon the circumstances at hand including the interests of the model user.

d. Gricean

One way to ‘deflate’ the problem of scientific representation is to claim that there is no special problem for scientific representation, and instead argue that we should understand the question of scientific representation as part of the already widely discussed literature on representation in general. This is the project taken up by Craig Callender and Jonathan Cohen (2006). According to their view, representation in many different fields (art, science, language, and so forth) can be explained by more fundamental representations, which are common to each of the fields.

To explain this, they appeal to what they call “General Griceanism”, which takes its general framework from the insights of Paul Grice. On their General Gricean view, the representational nature of scientific objects will be explained in terms of something more fundamentally representational. The more fundamentally representational objects in this case are mental states. This, in effect, pushes the hard philosophical problem back a stage, since some account must be given with regard to the representational nature of mental states. They remain uncommitted to any particular account of the representational nature of mental states, leaving that as something to be argued about in philosophy of mind. All the same, they mention a few popular candidates: functional role theories, informational theories, and teleological theories.

There are, on their view, significant advantages to taking this General Gricean viewpoint. For one, it has a certain sort of simplicity to it. By explaining all representation in terms of the fundamental representations of mental states, we do not need to give wildly different explanations as to why a scientific model represents its target and why, for example, a green light represents ‘go’ to a driver. Each occurs because, in virtue of what the scientist or “hearer” knows, a certain mental state will be activated which contains with it the relevant representational content.

They can also explain the reasons why similarity or isomorphism will be commonly used (though non-necessary) since these are strong pragmatic tools in helping to better bring about the relevant mental state with its representational content. This is, as they argue, clearly one of the reasons why people from Michigan use an upturned left hand to help them explain the relative location of their hometown: the upturned left hand is similar in shape to Michigan. The reasons why similarity is a useful tool here are identical to the reasons why similarity would be useful in scientific contexts: it will make the relevant instance of communication more effective, meaning that the hearer (or user of a model or scientific representation) will be better able to arrive at the relevant mental states which represent the target system.

In short, their view is that while there might be a general philosophical problem of representation, there is not anything special about scientific practice that makes its stake in this problem any different from any other field or the general problem. Of course, as they amusingly note, this passes the buck to the more fundamental question: “Once one has paid the admittedly hefty one-time fee of supplying a metaphysics of representation for mental states, further instances of representation become extremely cheap” (71).

e. Critiques of Deflationary and Pragmatic Accounts

These deflationary and pragmatic accounts of representation have not avoided criticisms of their own. Many of these criticisms are presented as part of the defense of one of the views over another. For example, Contessa (2011) argues against a purely denotational account of scientific representation, such as the one seen in Callender and Cohen (2006). As he says, “Whereas denotation seems to be a necessary condition for epistemic representation, it does not, however, seem to be a sufficient condition” (2011, 125). As Contessa argues, it is insufficient merely to be able to stipulate a denotational relationship to have the sort of representation which is useful to scientists. For example, we might use any given equation (for example, F=ma) to denote the relationship which holds between the size of predator and prey populations. But, while this equation could successfully denote this relationship, it will not be of much use to scientists because they will not be able to draw many insights about the predator-prey relationship. Therefore, Contessa argues, while denotation is a necessary condition of representation, it cannot alone be the whole story. In addition, he suggests the need for interpretation in terms of the target, as described above.

Matthias Frisch (2015, 296-304) has raised a worry which he addressed specifically to van Fraassen’s (2008) account, but which is applicable to many of the pragmatic and deflationary accounts described above. The worry is that if we take van Fraassen’s Hauptsatz (“There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so” (2008, 23)) literally, then it seems to be impossible that some models can represent. Taking Frisch’s example, say we wanted to construct a quantum mechanical model of a macroscopic body of water. To do this, “we would have to solve the Schrödinger equation for on the order of 10²⁵ variables—something that is simply impossible to do in practice” (297). But if this is so, then it turns out that the Schrödinger equation cannot be used to represent a macroscopic body of water: since we could never use the equation in this way, it is not representational in this way. Notice that this concern applies to other pragmatic and deflationary accounts: if we are unable to make inferences or interpret the source in terms of the target (which, given the complexity here, it seems we would not be able to do), then it will also fail to be representational on these other accounts. But this leads to a fairly strong conclusion that we can only use a model to represent a system once we have actually applied the model to that system. For example, the Lotka-Volterra model seems to only represent those systems for which scientists have used it; it does not represent all predator-prey relationships, in general.

Frisch (2015, 301-304) does not think that this argument is ultimately fatal to the pragmatic accounts. He argues that because there are constraints on the use of models which are part of the scientific practice, there is a sense in which the Lotka-Volterra model, for example, represents all predator-prey relationships (even though it has not yet been used in this way). There is no problem in extending models “horizontally,” that is, to other instances which are in the same domain of validity of the model. There is, Frisch argues, a problem in extending models “vertically,” that is, using a model to represent some phenomenon which is outside the domain of validity. This can be seen in the quantum mechanics example from above since we do not have any practice in place to use Schrödinger equations to describe macroscopic bodies of water. So, he claims, van Fraassen’s view (and, by extension, the other pragmatic and deflationary views) must be committed to an anti-foundationalism (a view that the sciences cannot be reduced to one foundational theory) that denies that the models of quantum mechanics can adequately represent macroscopic phenomena. Of course, the anti-foundationalist commitments might be viewed as a desirable feature of these views, rather than a flaw, depending upon other commitments.

Another important critique which applies more generically to a number of these deflationary and pragmatic views comes from Chakravartty (2009). As described above, many of those who argue for a deflationary or pragmatic account of representation offer their view as an alternative to the substantive accounts. That is to say, they deny that scientific representation is adequately described by the substantive accounts and do not merely add to these accounts, but rather reject them and offer their deflationary or pragmatic account instead. Chakravartty argues that this is a mistaken move. We should not think of deflationary or pragmatic accounts as alternatives to the substantive accounts, but rather as complements. On the deflationary or pragmatic accounts, representation occurs when inferences can be made about the target in virtue of the source. But, “how, one might wonder, could such practices be facilitated successfully, were it not for some sort of similarity between the representation and the thing it represents—is it a miracle?” (201). That is to say, the very function which proponents of the deflationary or pragmatic accounts take to be the central explainer of scientific representation seems to require some sort of similarity or isomorphism (Bueno and French 2011). On Chakravartty’s view, the pragmatic or deflationary accounts go too far in eliminating the role for some substantive feature. In doing so, they leave an important part of scientific representation behind.

3. Model-Based Representation

The question of scientific representation has received important attention in the context of scientific modeling. There is a vast literature on models, and much of it is at least tangentially related to the questions of representation. An examination of this literature provides an opportunity to see other sorts of insights with regard to representation and the relationship between the world and representational objects.

a. Models as Representations (and More)

Much of the literature on models focuses on the various roles of models within scientific practice, both representational and others. In an influential volume on models, Margaret Morrison and Mary Morgan (1999) use a number of examples of models to defend the view that models are partially independent from theories and data and function as instruments of scientific investigation. We can learn from models due to their representational features. Morrison and Morgan start out by focusing on the construction of models. Models, on their account, are constructed by combining and mixing a range of disparate elements. Some of the elements will be theoretical and some will be empirical, that is, from the data or phenomena. Thus far, this view is mostly in line with what has been discussed in the above sections. What makes models unique in their construction is that they often involve other outside elements. Sometimes these are stories (ways of explaining some unexpected data which are not part of a theory); other times they are a sort of structure which is imposed onto the data. These other elements, they argue, give models a sort of partial independence or autonomy. This is true even when the outside elements are not as obviously present, for example when a model is an idealized, simplified, or approximated version of a theory. This independence is crucial if we are to use models to help understand both theories and data, as we often do.

According to Morrison and Morgan, models function like tools or instruments for a number of purposes. There are three main classifications of the uses of models. The first is in interacting with theories: models can be used to explore a theory or to make usable a theory which is otherwise unusable. They can also be used to help understand and explore areas for which we do not yet have a theory. Other times, the models are themselves the objects of experimentation. The second classification of the use of models is in measurement: models serve not only as a way of structuring and presenting measurements, but can also function directly as instruments of measurement. The third is in designing and creating technology.

Models are not valuable only insofar as they have these functions. Models, Morrison and Morgan argue, are also importantly representational. Their representational value relies in part on the way in which they are constructed with both the theory and the data or phenomena. Models can represent theories, data, or can be representational instruments which mediate between data and theory. Whatever the case, representation, on their view, is not taken to be some mirroring or direct correspondence between the model and its representational target. Instead, “a representation is seen as a kind of rendering – a partial representation that either abstracts from, or translates into another form, the real nature of the system or theory, or one that is capable of embodying only a portion of a system” (1999, 27). Sometimes models can be used to represent non-existent or otherwise inaccessible theories, as they claim is the case with simulations.

The final role of models described by Morrison and Morgan is the way that models afford the possibility of learning. Sometimes the learning comes in the construction of the model. Most frequently, though, we can learn using models by using and manipulating them. In doing so, we can learn because the models have the other features already described: the wide range of sources for construction, the functions, and their status as representations. Oftentimes, the learning takes place internal to the model. In these cases, the model serves as what they call a representative rather than a representation. With representatives, the insights we can gain from manipulating the model are all about the model itself. But in doing so, we come to a place from which we can better understand other systems, both real-world systems and other systems. Other times, we take the world into the model and then manipulate the world inside the model, as a sort of experiment.

Daniela Bailer-Jones (2003, 2009) defends a slightly different but related account of the representational nature of models. On her account, models entail certain propositions about the target of the model. As propositions, they are subject to being true or false. One way of thinking about the representation of models is to say that models are representational insofar as their entailed propositions are true. However, this cannot be exactly right, since, as was mentioned above, models oftentimes intentionally entail false propositions. Since models are about those aspects of a phenomenon which are selected, they will fail to say things about other aspects of a phenomenon. In some cases, the propositions entailed may be true for one aspect but false for another. This calls for the role of model users, who decide what function the model has, the ways in which and degree to which the model can be inaccurate, and which aspects of the phenomenon are actually being represented. In sum, on her view, models are representational in part due to their entailed propositions, but also due to the role of the model users.

Tarja Knuuttila (2005, 2011) has argued that in thinking about models too much emphasis has been placed on their representational features, even in accounting for their epistemic value. Following and expanding on Morrison and Morgan (1999), she argues that we should think of models as being material epistemic artefacts, that is, “intentionally constructed things that are materialized in some medium and used in our epistemic endeavors in a multitude of ways” (2005, 1266). The key to their epistemic functioning is to be found in their constrained and experimentable nature. Models, according to this account, are constrained by their construction in such a way that they make certain scientific problems more accessible and amenable to systematic treatment. This is one of the main roles of idealizations, simplifications, and approximations. On the other hand, the representational means used also impose their own constraints on modeling. The representational modes and media through which models are constructed (for example, diagrams, pictures, scale models, symbols, language) all afford and limit scientific reasoning in their different ways. When considered in this respect, Knuuttila argues, we can see that models have far more than mere representational capacities: they are themselves the targets of experimentation and can be thought of as creating a sort of conceptual parallel reality.

b. Model-Building

In addressing model-building, Weisberg (2007b) and Godfrey-Smith (2006) both take up the idea that the characteristic way in which models are constructed is indirect. This comes about in a three-step process in which a scientist first constructs a model, then analyzes and refines the model, and finally examines the relationship between the model and the world (Weisberg 2007b, 209). Models are used and understood by scientists with “construals” of the model (Godfrey-Smith 2006). The construal, on Weisberg’s account, is made of four parts. The first is an assignment, which maps various parts of the model onto the phenomena being investigated. The second part of a construal is the scope, which tells us which aspects of the phenomena are being modeled. The final two parts of the construal are each fidelity criteria. One of these is the dynamical fidelity criteria, which identify a sort of error tolerance for the predictions of the model. The other is the representational fidelity criteria, which give standards for understanding whether the model gives the right predictions for the right reasons, that is, whether or not the model is linking up to the causal structure which explains the aspects of the phenomenon being modeled.
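The four parts of a construal can be pictured as a simple record. The following is only an illustrative sketch: the class name `Construal`, its field names, and the predator-prey values are hypothetical choices made here, not notation taken from Weisberg or Godfrey-Smith. Only the dynamical fidelity criterion is made executable, as a numerical error tolerance; representational fidelity is left as a free-text standard, since whether a model tracks the right causal structure is not something a few lines of code can check.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Construal:
    """Hypothetical record of the four parts of a construal."""
    assignment: Dict[str, str]        # part of the model -> feature of the phenomenon
    scope: List[str]                  # aspects of the phenomenon being modeled
    dynamical_tolerance: float        # allowed error in the model's predictions
    representational_fidelity: str    # stated standard, not checked by this sketch

    def meets_dynamical_fidelity(self, predicted: Sequence[float],
                                 observed: Sequence[float]) -> bool:
        """True if every prediction lies within the tolerance of its observation."""
        if len(predicted) != len(observed):
            raise ValueError("predicted and observed must have the same length")
        return all(abs(p - o) <= self.dynamical_tolerance
                   for p, o in zip(predicted, observed))


# Made-up example: a construal for a predator-prey model.
construal = Construal(
    assignment={"x": "prey population", "y": "predator population"},
    scope=["population sizes over time"],
    dynamical_tolerance=5.0,
    representational_fidelity=(
        "growth and predation terms should track the actual causes "
        "of population change"
    ),
)

within_tolerance = construal.meets_dynamical_fidelity(
    [100.0, 120.0, 90.0],   # model predictions (invented numbers)
    [103.0, 118.0, 94.0],   # observations (invented numbers)
)
print(within_tolerance)
```

The sketch brings out one point from the paragraph above: a model can satisfy the dynamical criterion while the representational criterion remains an open, separately assessed question.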

This strategy of model-based science is contrasted with a different sort of strategy, what Weisberg calls abstract direct representation. Abstract direct representation is the strategy of science in which study of the world is unmediated by models. He gives the example of Mendeleev’s development of the periodic table of elements. This process did not begin with a hypothetical abstract model which is refined and then used representationally (as Weisberg thinks the process of model-based science proceeds). Instead, this process starts with the phenomena and abstracts away to more general features. This distinction between modelling and abstract direct representation underlines the possibility that not all scientific representations need be achieved in the same way.

There are some worries about Weisberg’s understanding of the process of model-making (Knuuttila and Loettgers, in press). Through a close examination of the development of the Lotka-Volterra model, Knuuttila and Loettgers argue that the process of model-building often begins with certain sorts of templates, or characteristic ways of modeling some phenomena, typically adopted from other fields. Such already familiar modeling methods and forms offer the modeler a sort of scaffolding upon which they can imagine and describe the target system. They also argue that another distinct feature of model-making is its outcome-orientation. That is, in developing a model, a scientist will typically do so with an eye to the anticipated insights or features of the target system that they wish to represent. Thus, on their view, the modeler pays close attention to the target system or empirical questions in all stages of the development of the model (not just at the end, as Weisberg suggests).

c. Idealization

One of the important discussions that has developed primarily in the literature on models concerns idealization. Weisberg (2007a) argues that there are three different kinds of idealization, which he generically describes as “the intentional introduction of distortion into scientific theories” (639). The first kind of idealization he calls Galilean idealization. This is the sort of idealization in which a theory or model is intentionally distorted so as to make the theory or model simpler, in order to render it computationally tractable. This sort of idealization occurs when scientists ignore certain features of a system or theory, not because they are playing no role in what actually happens, but rather because including them makes the application of the theory or model so complex that they cannot gain traction on the problem. By removing these complexities, scientists distort their model (because it lacks complexities which reflect the target). But after gaining some initial computational tractability, they can slowly reintroduce the complexities and thus remove the distortions.

The second type of idealization is what Weisberg calls minimalist idealization. In a minimalist idealization, the only features that are carried into the model or theory are those causal features which make a difference to the outcomes. So, if some feature of a target can be left behind without losing predictive power, a minimalist idealization will leave that feature behind. As an example, Weisberg notes that when explaining Boyle’s law, it is often assumed that there are no collisions between gas molecules. This is, in fact, false, since collisions between gas molecules are known to take place in low-pressure gases. But, “low pressure gases behave as if there were no collisions” (2007a, 643). So, since these collisions do not make any difference to our understanding of this system, scientists can (and do) leave this fact behind.

Notice that this is distinct from Galilean idealization insofar as minimalist idealizations leave certain features out of their theories or models because they make no difference to the relevant tasks or goals at hand. Galilean idealization, on the other hand, leaves certain features out even when they do make a difference, simply because leaving them in would make the model more complex and less tractable.
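The idea that an omitted feature can make no practical difference in a restricted regime can be illustrated numerically. The sketch below is not Weisberg’s example. It compares the ideal gas law with the van der Waals equation, which is swapped in here as a stand-in for a less idealized model; its corrections concern molecular size and intermolecular attraction rather than collisions as such. The constants for nitrogen are approximate textbook values and should be treated as assumptions.

```python
R = 0.083145  # gas constant in L*bar/(mol*K)

# Approximate textbook van der Waals constants for nitrogen (assumed values).
A_N2 = 1.370   # L^2*bar/mol^2, attraction correction
B_N2 = 0.0387  # L/mol, molecular-volume correction


def ideal_pressure(molar_volume, temperature):
    """Pressure in bar from the ideal gas law (the idealized model)."""
    return R * temperature / molar_volume


def van_der_waals_pressure(molar_volume, temperature, a=A_N2, b=B_N2):
    """Pressure in bar from the van der Waals equation (the less idealized model)."""
    return R * temperature / (molar_volume - b) - a / molar_volume ** 2


def relative_gap(molar_volume, temperature=300.0):
    """Relative disagreement between the two models at a given molar volume."""
    ideal = ideal_pressure(molar_volume, temperature)
    detailed = van_der_waals_pressure(molar_volume, temperature)
    return abs(ideal - detailed) / detailed


dilute_gap = relative_gap(25.0)   # roughly 1 bar: a low-pressure gas
dense_gap = relative_gap(0.25)    # roughly 100 bar: a compressed gas

print(f"low pressure:  {dilute_gap:.4%}")
print(f"high pressure: {dense_gap:.4%}")
```

At low pressure the two models nearly coincide, so the extra terms can be dropped without loss for that task. At high pressure the gap grows, and leaving the terms out would be a distortion accepted for tractability, which is closer to the Galilean case.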

The final sort of idealization described by Weisberg is what he calls multiple-models idealization. This is the practice of using a number of different, often incompatible models to represent or understand some phenomenon. In this case, none of the models by itself is capable of accurately modeling the relevant target system. All the same, each of the models is good at representing certain features of the target system. Thus, by using not just a single model but rather this group of models, each of which is distorted, scientists can get a better sense of the target system. Weisberg offers a helpful example: the National Weather Service uses a number of different models in making its weather forecasts. Each of the models used represents the target in a different way, each being inaccurate in some way or another. It is the use of all of these models that permits forecasts of higher accuracy since attempts to make a single model have resulted in less accurate predictions.
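A toy calculation shows how several individually distorted models can jointly do better than any one of them. Everything below is invented for illustration: the observations, the three hypothetical forecast models, and their errors have nothing to do with actual National Weather Service models. The numbers are chosen so that the models err in different directions, which is what lets the errors partly cancel; an ensemble average is not guaranteed to beat its best member in general.

```python
def mean_absolute_error(forecast, observed):
    """Average size of the forecast error, ignoring its sign."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


# Invented temperatures for four days.
observed = [10.0, 12.0, 15.0, 11.0]

# Three hypothetical models, each distorted in its own way.
forecasts = {
    "runs_warm": [12.0, 14.0, 17.0, 13.0],   # always 2.0 too high
    "runs_cold": [8.5, 10.5, 13.5, 9.5],     # always 1.5 too low
    "uneven": [10.5, 11.0, 14.0, 12.5],      # errors vary from day to day
}

# Ensemble forecast: the day-by-day average of the three models.
ensemble = [sum(day) / len(day) for day in zip(*forecasts.values())]

individual_errors = {
    name: mean_absolute_error(values, observed)
    for name, values in forecasts.items()
}
ensemble_error = mean_absolute_error(ensemble, observed)

for name, error in individual_errors.items():
    print(f"{name}: {error:.2f}")
print(f"ensemble: {ensemble_error:.2f}")
```

In this constructed case the averaged forecast has a smaller error than each of its members, even though no single model is accurate on its own.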

4. Sociology of Science

a. Representation and Scientific Practice

Many important insights on the nature of scientific representation have come not from the philosophy of science but rather from thinkers who would typically be considered part of the field of sociology of science. The work in this field can serve both as a source of insight on the nature of representation in scientific practice and as a challenge to the primarily epistemically-oriented insights from the philosophy of science. Michael Lynch and Steve Woolgar (1990) edited an important collection of papers on scientific representation in practice written from the perspective of sociology of science called Representation in Scientific Practice. More recently (2014), Lynch and Woolgar edited another collection with Catelijne Coopmans and Janet Vertesi. Treating representation from the perspective of sociology of science involves asking a different sort of question than the one so far addressed in this article. Instead of asking about the constitution of scientific representation, sociologists of science are more interested in a different question, “What do the participants, in this case, treat as representation?” (Lynch and Woolgar 1990, 11). In the introduction to the 1990 volume, Lynch and Woolgar provide a general overview of some of the important insights from this perspective.

Since sociology of science treats scientific practice as its object of inquiry, it is keen to describe precisely how representations are actually used by scientists. They note the importance of “the heterogeneity of representational order” (2). That is, there is a wide range of devices which are representational as well as a wide range of ways in which the representations are used and in which they are useful. Importantly, sociologists are often interested in discussing more than merely the epistemic or informational role and use of representations, viewing them as significantly social, contextualized, and otherwise embedded in a complex set of activities and practices. Sociologists of science attempt to pay attention to the whole gamut of representations and representational uses to better understand precisely the role they play within scientific investigation.

Another important insight, Lynch and Woolgar note, is that the relation between representations is not to be thought of as directional in the sense that the representations move from or towards some “originary reality” (8). Instead, any directionality of representations is to be thought of as “movement of an assembly line” (8). That is to say, representational practice must be seen as constructing not only a representation, but also (re)constructing a phenomenon in a way so that it can be represented. This is something that can be seen in much of the literature from sociologists of science, including the work of Latour (see below).

In paying close attention to the way representations are actually used, some sociologists of science note settings in which there are “discrepancies between representations of practice and the practices using (and composing) such representations” (9). These discrepancies and other problems encountered in the actual practice of science allow for improvisation and creativity which can help advance the particular domains of which they are a part. Sociologists of science are interested in studying this creativity, not only for its productivity in science, but also as an interesting phenomenon in its own right.

b. Circulating Reference

A particularly telling example of these insights, especially from the philosophical point of view, is provided in Bruno Latour’s “Circulating Reference” (Latour 1999). Latour’s photo-philosophical case study is based on the work of a group of scientists who were examining the relationship between a savannah and forest ecosystem. At the end of their project, they collectively published a paper on their findings which included a figure of the interaction between the ecosystems, detailing the change in soil composition among other features. Latour asks how it is that this abstract drawing, which takes a perspective no individual could possibly have had and which ignores so many of the features of the ecosystem, can be about that stretch of land. That is to say, here we have a drawing, something made of ink and paper, and there we have the forest-savannah ecosystem; how is it that the former can be about the latter?

Latour’s method of answering takes the form of a strikingly well-written case study, structured around pictures from the expedition which represent the process he describes. He looks carefully at all the details and steps by which the scientists got from the expedition to the figure in the paper. What happens, says Latour, is that there is a series of steps through which the scientists abstract from the world in some intentional fashion. In doing so, they maintain some relevant feature of the world, but they are simultaneously constructing the phenomena they are studying. In the process, the representations produced also become more abstract.

An example will make this clearer. At one stage in the process, soil samples are collected from a vertical stretch of ground. These samples are transferred into a device which allows the whole vertical stretch of earth to be viewed synoptically. In taking the sample, the scientist has already begun to construct–already this particular bit of dirt is taken to be representative of the dirt for a much wider area of land. Once the soil has been collected, various features of the soil are maintained through intentional actions on the part of the scientist. For example, the scientists will label the soil as being of a certain sort of consistency. The scientists then use a clever device in which there are pinholes in a tool which has the various Munsell colors and numbers, which is itself a construction with a long history. In looking through the pinholes, the scientist can abstract away from the dirt sample itself, taking, in some sense, only the color (which is done in virtue of a construction of numbers associated with particular colors). Something has clearly been lost, namely, the full materiality of the dirt. But something has also been gained, in this case, a number which corresponds to the color of the dirt; some usable, manipulable data.

Latour’s essay carefully describes many of these transitions from the savannah-forest system to the published figure. As he claims, it is this series of transitions (which each involve abstraction and construction due to the intentional decisions of a scientist) which ensures that the figure at the end references or represents the savannah-forest system. There is not a single gap between the figure and the world which must be accounted for by some representational relation. Instead, on his account, there is a large series of gaps, each of which is crossed by a scientist’s actions in abstracting and maintaining, constructing and discovering. This series, he thinks, can be extended infinitely in either direction. By abstracting further from the already quite-abstract figure, certain hypotheses might be suggested, which would result in a return to the savannah-forest system, to gather data which might be more basic than the data already gathered. On his view, there is no such thing as “the world” which is the most basic thing-in-itself. Nor is there any most-abstracted element.

c. Critiques of Sociology of Science

While these insights from the sociology of science literature have been both sources of support and criticism for the philosophical literature, they have also been subject to criticisms. One important criticism comes from Giere’s (1994) review of Lynch and Woolgar’s (1990) Representation in Scientific Practice. Giere’s primary target is the extremely constructivist nature of the sociology of science literature. The constructivist approach claims that science is socially constructed: that is, science is filled with socially-dependent knowledge and aimed at understanding socially-constructed objects. There is no such thing, on this view, as a non-constructed world to be understood by scientists, and therefore, no such world to be represented. The attempt to explain representation in this framework results in a “no representation theory of representation” (Giere 1994, 115). But, Giere thinks there is a straightforward counter-slogan to a view of this sort: “no representation without representation” (115). That is to say that if there is nothing ‘out there’ in the world being represented, it cannot be that this is an instance of representation. This is not to reject the importance of paying attention to the role of the practices and representational devices in particular case studies. All the same, Giere argues that if we want a general account of scientific representation, “we must also go beyond the historical cases” (119). Put otherwise, the sociology of science perspective is an important part of explaining scientific representation, but this work by itself leaves representation unexplained.

Knuuttila (2014) takes up a similar line of criticism. While she places great importance on the insights of sociologists of science, she thinks that many of their views have developed with a false target in mind. Many sociologists of science place their views as a contrast to a traditional philosophical view of science as something which perfectly represents the world. The alternative, they suggest, is their constructivist approach, as described above. However, Knuuttila argues, this motivation runs into problems. First, when they select certain practices to investigate rather than others, by what criterion are they distinguishing this practice as representational? In doing so, they seem to be relying on some traditional account of representation to delineate the cases of interest. Further, it seems that these studies do not show that representation is a defunct concept and that we are bound to a purely constructivist account of science. Instead, “these cases actually reveal…what a complicated phenomenon scientific representation is…and give us clues as to how, through the laborious art of representing, scientists are seeking and gaining new knowledge” (Knuuttila 2014, 304). We need not think that just because there is no perfect representation of the world, that there is therefore no world to be represented. Their insights could equally contribute to an intermediate view in which we reject this perfect-representation view of science, but still maintain that science is giving us knowledge of the real world. That is, we can simultaneously deny that representations “are some kind of transparent imprints of reality with a single determinable relationship to their targets” while still affirming that the “artificial features of scientific representations…result from well-motivated epistemic strategies that in fact enable scientists to know more about their objects” (Knuuttila 2014, 304).

5. References and Further Reading

Bailer-Jones, D. (2003). When Scientific Models Represent. International Studies in the Philosophy of Science 17: 59-74.
- Representation in models is linked to entailed propositions and pragmatics.
Bailer-Jones, D. (2009). Scientific Models in Philosophy of Science. Pittsburgh: University of Pittsburgh Press.
- Extended discussion of the history of philosophy of models and defends an account of models.
Bartels, A. (2006). Defending the Structural Concept of Representation. Theoria 55: 7-19.
- Homomorphism as an account of representation.
Brading, K. and E. Landry. (2006). Scientific Structuralism: Presentation and Representation. Philosophy of Science 73: 571-581.
- Structuralism as a strong methodological approach in science.
Bueno, O. (1997). Empirical Adequacy: A Partial Structures Approach. Studies in History and Philosophy of Science 28: 585-610.
- Representation as partial isomorphism.
Bueno, O. and S. French. (2011). How Theories Represent. British Journal for the Philosophy of Science 62: 857-894.
- Representation as partial isomorphism; replies to arguments against partial isomorphism.
Callender, C. and C. Cohen. (2006). There Is No Special Problem About Scientific Representation. Theoria 21: 67-85.
- Scientific representation is to be explained like other problems of representation.
Cartwright, N. (1983). How the Laws of Physics Lie. New York: Oxford University Press.
- Distortion and idealization in the laws of physics.
Cartwright, N., T. Shomar, and M. Suárez. (1995). The Tool Box of Science: Tools for the Building of Models with a Superconductivity Example. Poznan Studies in the Philosophy of the Sciences and the Humanities 44: 137-149.
- Models are not only theory-driven, but also phenomena-driven.
Chakravartty, A. (2009). Informational versus Functional Theories of Scientific Representation. Synthese 172: 197-213.
- Argues that substantive and pragmatic accounts are complementary.
Contessa, Gabriele. (2007). Scientific Representation, Interpretation, and Surrogative Reasoning. Philosophy of Science 74: 48-68.
- A substantive inferential account of representation.
Contessa, Gabriele. (2011). Scientific Models and Representation. In S. French and J. Saatsi (eds.) The Bloomsbury Companion to the Philosophy of Science, pp. 120-137. New York: Bloomsbury Academic.
- A substantive inferential account of representation.
Coopmans, C., J. Vertesi, M. Lynch, and S. Woolgar (eds.). (2014). Representation in Scientific Practice Revisited. Cambridge: MIT Press.
- Reexamines the question of representation in scientific practice in contemporary sociology of science.
da Costa, N. and S. French. (2003). Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning. Oxford: Oxford University Press.
- Representation as partial isomorphism.
French, S. (2003). A Model-Theoretic Account of Representation (Or, I Don’t Know Much about Art… but I Know It Involves Isomorphism). Philosophy of Science 70: 1472-1483.
- Representation as partial isomorphism.
French, S. and J. Ladyman. (1999). Reinflating the Semantic Approach. International Studies in the Philosophy of Science 13: 103-119.
- Representation as partial isomorphism.
Frigg, R. (2006). Scientific Representation and the Semantic View of Theories. Theoria 55: 49-65.
- Criticisms of the semantic approaches to representation.
Frisch, M. (2015). Users, Structures, and Representation. British Journal for the Philosophy of Science 66: 285-306.
- Discusses van Fraassen’s (2008); presents and responds to a criticism.
Giere, R. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.
- A semantic account of theories and representation as similarity.
Giere, R. (1994). No Representation without Representation. Biology and Philosophy 9: 113-120.
- Argues that an appeal only to practice will leave scientific representation unexplained.
Giere, R. (2004). How Models Are Used to Represent Reality. Philosophy of Science 71: 742-752.
- Representation as similarity with input of agents.
Giere, R. (2010). An Agent-Based Conception of Models and Scientific Representation. Synthese 172: 269-281.
- Representation as similarity with input of agents.
Godfrey-Smith, P. (2006). The Strategy of Model-Based Science. Biology and Philosophy 21: 725-740.
- Discusses the use of models as being classified by a particular strategy in science.
Goodman, N. (1976). Languages of Art. Indianapolis: Hackett.
- Argues for an account of representation in art which has been influential to accounts of scientific representation.
Hacking, I. (1983). Representing and Intervening. New York: Cambridge University Press.
- Representation in terms of likeness, argues for the importance of intervention.
Hesse, M. (1966). Models and Analogies in Science. Notre Dame, Indiana: University of Notre Dame Press.
- Argues that models are central to scientific practice.
Hughes, R.I.G. (1997). Models and Representation. Philosophy of Science 64: S325-S336.
- The DDI account of representation.
Knuuttila, T. (2005). Models, Representation, and Mediation. Philosophy of Science 72: 1260-1271.
- Models as epistemic tools.
Knuuttila, T. (2011). Modelling and Representing: An Artefactual Approach to Model-Based Representation. Studies in History and Philosophy of Science 42: 262-271.
- Relates representation to representationalism, and expands the notion of models as epistemic tools.
Knuuttila, T. (2014). Reflexivity, Representation, and the Possibility of Constructivist Realism. In M. C. Galavotti, S. Hartmann, M. Weber, W. Gonzalez, D. Dieks, and T. Uebel (eds.) New Directions in the Philosophy of Science, pp. 297-312. Dordrecht, Netherlands: Springer.
- Criticizes sociology of science accounts and argues that they are compatible with philosophical accounts.
Knuuttila, T. and A. Loettgers. (in press). Modelling as Indirect Representation? The Lotka-Volterra Model Revisited. British Journal for the Philosophy of Science.
- Focuses on the interdisciplinary, historical and empirical aspects of model construction.
Latour, B. (1999). Circulating Reference. In Pandora’s Hope. Cambridge: Harvard University Press.
- Traces and discusses the steps from some phenomena to a representation.
Ladyman, J., O. Bueno, M. Suárez, B.C. van Fraassen. (2011). Scientific Representation: A Long Journey from Pragmatics to Pragmatics. Metascience 20: 417-442.
- Discussion of van Fraassen’s (2008), with replies by van Fraassen.
Lynch, M. and S. Woolgar (eds.). (1990). Representation in Scientific Practice. Cambridge: MIT Press.
- Collection of essays on scientific representation by sociologists of science.
Morgan, M. and M. Morrison (eds.). (1999). Models as Mediators: Perspectives on Natural and Social Science. New York: Cambridge University Press.
- Collection of essays on the uses and nature of models.
Suárez, M. (2003). Scientific Representation: Against Similarity and Isomorphism. International Studies in the Philosophy of Science 17: 225-244.
- Criticisms of similarity and isomorphism.
Suárez, M. (2004). An Inferential Conception of Scientific Representation. Philosophy of Science 71: 767-779.
- Inferential account of representation.
Suárez, M. (2010). Scientific Representation. Philosophy Compass 5: 91-101.
- Gives a brief overview of accounts of representation.
Suárez, M. (2015). Deflationary Representation, Inference, and Practice. Studies in History and Philosophy of Science 49: 36-47.
- Discusses deflationary accounts and argues that both his inferential account and the DDI account are deflationary.
Suppe, F. (1974). The Structure of Scientific Theories. Urbana: University of Illinois Press.
- Criticizes the syntactic view and introduces a semantic conception of theories.
Teller, P. (2001). Twilight of the Perfect Model Model. Erkenntnis 55: 393-415.
- Argues for a deflationary account of similarity as representation.
Toon, A. (2012). Similarity and Scientific Representation. International Studies in the Philosophy of Science 26: 241-257.
- Explores responses to criticisms on behalf of similarity.
van Fraassen, B. C. (1980). The Scientific Image. New York: Oxford University Press.
- Representation as isomorphism (among other things).
van Fraassen, B. C. (2008). Scientific Representation: Paradoxes of Perspective. New York: Oxford University Press.
- Representation as isomorphism with important role for agents (among other things).
Weisberg, M. (2007a). Three Kinds of Idealization. Journal of Philosophy 104: 639-659.
- Three different sorts of idealization.
Weisberg, M. (2007b). Who Is a Modeler? British Journal for the Philosophy of Science 58: 207-233.
- Models are indirect representations, a strategy which is distinct from abstract direct representation.
Weisberg, M. (2013). Simulation and Similarity: Using Models to Understand the World. New York: Oxford University Press.
- Representation as similarity.
Winther, R. (2015). The Structure of Scientific Theories. In E.N. Zalta (ed.), Stanford Encyclopedia of Philosophy. (Spring 2015 Edition). http://plato.stanford.edu/
- Detailed discussion of accounts of the structure and representation of scientific theories with extensive bibliography.

Author Information

Brandon Boesch
Email: boeschb@email.sc.edu
University of South Carolina
U. S. A.

Olympe de Gouges (1748-1793)

“Woman has the right to mount the scaffold; she must equally have the right to mount the rostrum” wrote Olympe de Gouges in 1791 in the best known of her writings The Rights of Woman (often referenced as The Declaration of the Rights of Woman and the Female Citizen), two years before she would be the third woman beheaded during France’s Reign of Terror. The only woman executed for her political writings during the French Revolution, she refused to toe the revolutionary party line in France that was calling for Louis XVI’s death (particularly evident in her pamphlet Les Trois Urnes, ou le Salut de la Patrie [The Three Ballot Boxes, or the Welfare of the Nation], 1793). Simone de Beauvoir recognizes her, in Le Deuxième sexe [The Second Sex] (1949 [1953]), as one of the few women in history who “protested against their harsh destiny.” Favorably described by commentators alternately as a stateswoman, a femme philosophe, an artist, a political analyst, and an activist, she can be considered all of these but not without some qualification. While contradictions abound in her writings, she never wavered in her belief in the right to free speech and in its role in social and political critique.

On the death of her husband after a brief unhappy marriage, she moved to Paris, where, with the support of a wealthy admirer, she began a life’s work focused wholly on mounting the rostrum denied to women. Defying social convention in every direction and molding a life evocative of feminism of a much later age, Gouges spent her adult life advocating for victims of unjust systems, helping to create a public conversation on women’s rights and the economically disadvantaged, and attempting to bring taboo social issues to the theatrical stage and to the larger social discourse. There is disagreement on whether she participated in or even occasionally hosted some of the literary and philosophic salons of the day. Her biographer Olivier Blanc suggests that she did; historian John R. Cole (2011) turns up no evidence. Still, she was recognized among the fashionable and intellectual elite in Paris, and she was well-versed in the main themes of the most influential thinkers of her day, at least for a time. Her name appears in the Almanach des addresses (a kind of social registry) from 1774 to 1784. Despite harsh criticism for using her voice in the political arena and thus challenging deeply entrenched gender norms (“having forgotten the virtues that belong to her sex,” wrote Pierre Gaspard Chaumette, then-President of the Paris Commune, warning other politically active women of Gouges’s fate), she was certainly the author of 40 plays (12 survive), two novels and close to 70 political pamphlets.

While not a philosopher in any strict sense, she deserves attention for her morally astute analysis of women’s condition in society, for her re-imagining of the intersection of gender and political engagement, for her conception of civic virtue and her pacifist stance, and for her advocacy of selfhood for women, blacks, and children (especially in their right to know their origins). She was among the first to demand the emancipation of slaves, the rights of women (including divorce and unwed motherhood), and the protection of orphans, the poor, the unemployed, the aged and the illegitimate. She had a talent for emulating those she admired, including especially Rousseau but also Condorcet, Voltaire, and the playwright Beaumarchais.

Early Life
Intellectual Pursuits
Relevance and Legacy
References and Further Reading

1. Early Life

Details are limited. Born Marie Gouze in Montauban, France in 1748 to petit-bourgeois parents Anne Olympe Moisset Gouze, a maidservant, and her second husband, Pierre Gouze, a butcher, Marie grew up speaking Occitan (the dialect of the region). She was possibly the illegitimate daughter of Jean-Jacques Le Franc de Caix (the Marquis de Pompignan), himself a man of letters and a playwright (among whose claims to fame is an accusation of plagiarism by Voltaire). In her semi-autobiographical Mémoire de Madame de Valmont sur l’ingratitude et la cruauté de la famille de Flaucourt (Memoir of Mme de Valmont) (1788), Gouges publishes letters purported to be transcriptions from Pompignan taking pains to distance himself from Valmont/Gouges. These letters stop short of unequivocal denial of his paternity.

She was married at 16 or 17 in 1765, unhappily, to Louis Aubrey (an associate of Pierre’s), with whom she had a son (also) Pierre, and by the age of 18 or 19 was widowed. Denouncing marriage due to her recent past experience and disguising her widowhood (which would have given her a modicum of social and legal status), she adopted her mother’s middle name and the more aristocratic-sounding “de Gouges” and moved to Paris. Literate (schooled likely by Ursuline nuns in Montauban) but not particularly well-read, she spent the next decade informing herself on intellectual and political matters and integrating into Parisian society, supported by Jacques Biétrix de Rozières, a wealthy weapons merchant, whom she may have met in Montauban shortly after the death of her husband or through her married sister, Jeanne Raynart. Biétrix secured her circumstances until the decline in his family’s resources in 1788. The year of her first published work is 1778, and it marks the end of her first decade in Paris.

She had begun to write in earnest around 1784. A literary compatriot and admirer of hers was Louis-Sébastien Mercier (1740-1814), with whom she shared many political views, including clemency for the King and a general abhorrence of violence. Mercier helped her navigate the tricky internal politics of the Comédie Française—the prestigious national theater of France—assisting her to publish several of her plays, and to stage a handful. Charlotte Jeanne Béraud de la Haye de Riou, Marchioness of Montesson (1739-1806), wife of the Duke of Orleans, a playwright herself and a woman of much influence and wealth, was among a list of other friends who came to her aid.

With little formal education and as a woman boldly unconventional, once she began her life of letters, her detractors were eager to find fault. She was often accused of being illiterate, yet her familiarity with Molière, Paine, Diderot, Rousseau, Voltaire, and many others, the breadth of her interests, and the speed with which she replied to published criticism, all attest to the unlikelihood of the accusation. As French was not her native tongue and since her circumstances permitted, she maintained secretaries for most of her literary career.

2. Intellectual Pursuits

a. Literary

All of her plays and novels carry the theme of her life’s work: indignation at injustice. Her literary pursuits began with playwriting. Gouges wrote as many as 40 plays, as inventoried at her arrest. Twelve of those plays survived, and four found the requisite influential, wealthy, mostly male backing needed for their staging. Ten were published. While many of the plays by the dozen women playwrights that had been staged at the Comédie Française were published anonymously or under male pseudonyms, those playwrights who were successful on stage in their own names (most notably Julie Candeille) stuck to themes seen as suitable to their gender. Gouges broke with this tradition—publishing under her own name and pushing the boundaries of what was deemed appropriate subject matter for women playwrights—and withstood the consequences. Reviews of her early productions were mixed—some fairly favorable, others patronizing and condescending or skeptical of her authorship. Those of her plays read by the Comédie Française were often ridiculed by the actors themselves. Her later plays, more strongly political and controversial, were met with outright sarcasm and hostility by some reviewers: “[t]o write a good play, one needs a beard” wrote one critic.

Her first play, written in 1785, never produced for the stage, but published the following year, L’Homme Généreux [The Generous Man] explored the political powerlessness of women through representation of a socially privileged man’s struggle with sexual desire. The play also shines a light on the injustice of imprisonment for debt. Le Mariage Inattendu de Chérubin [Cherubin’s Unexpected Marriage] (1786), one of the several homages of the time written after Beaumarchais’ critically acclaimed Le Mariage de Figaro [The Marriage of Figaro] (1784), is a sequel to Figaro. Intent to rape is a theme in this play as it was in L’Homme Généreux; a privileged husband’s misplaced lust brings damage to the family, while the suffering of the victim is given significant attention. Gouges’s first staged production was originally titled Zamore et Mirza; ou L’Heureux Naufrage [Zamore and Mirza; or, The Happy Shipwreck] (1788). Written in 1784 and later revised, it was finally performed in 1789 under the title L’Esclavage de Nègres, ou l’Heureux naufrage [Black Slavery; or the Happy Shipwreck]. Accepted by the Comédie Française when submitted anonymously in 1785, it was then shelved for four years once the identity (and gender) of the playwright was confirmed. Winning praise from abolitionist groups, it was the first French play to focus on the inhumanity of slavery; it is, not surprisingly, also the first to feature the first-person perspective of the slave. It saw three performances before it was shut down by sabotaging actors and protests organized by enraged French colonists who, deeply reliant on the slave trade, hired hecklers to wreak havoc on the production. Gouges fought back through the press, through her social and literary connections, and through the National Assembly. Understanding that her gender was connected to her lack of success, she called for a second national theatre dedicated solely to women’s productions, and she called for reforms within the Comédie Française itself.

Le Philosophe corrigé, ou le cocu supposé [The Philosopher Chastised, or the Supposed Cuckold] (1787) represents women, despite expectations, as capable of agency around their own sexual desire, and of uniting and supporting each other, giving voice to the phenomenon that Simone de Beauvoir would address much later in The Second Sex with the observation that “women do not say ‘we’.” The play also depicts a male, the titular husband, as capable of acquiring moral knowledge through an evaluation of emotional response. Sympathy for his inexperienced wife and, later, an innocent baby, gives him insights he uses for moral reflection, a theme found in David Hume (1711-1776), Josiah Royce (1855-1916), and much modern feminist ethics. When French theatre in the decade of the Revolution turned to lighter vaudevillian fare, Gouges tried her hand at light comedy; La Bienfaisance récompensée ou la vertu couronnée [Beneficence Rewarded, or Virtue Crowned] (1788), a one-act comedy, portrayed the then-current Duc d’Orléans, an anti-royalist, as a doer of good.

But most of Gouges’s playwriting was dedicated to crafting principally dramatic—if melodramatic by today’s standards—pieces, responding to the issues of the day. And she had a unique voice on many matters. Molière chez Ninon ou le siècle des grands hommes [Molière at Ninon’s, or the Century of Great Men] (1788), for instance, challenges the double standard between the sexes by depicting the famous, fiercely independent, literary courtesan, Ninon de Lenclos (1620-1705), as a noble person positively influencing the male intellectuals in her circle, including Molière, all of whom are present to honor a visit by another notable intellectual, Queen Christina of Sweden (1626-1689).

Playwriting for Gouges was a political activity. In addition to slavery, she highlighted divorce, the marriageability of priests and nuns, girls forcibly sent to convents, the scandal of imprisonment for debt, and the sexual double-standard, as social issues, some repeatedly. Such activism was not unheard-of on the stage, but Gouges carried it to new heights. By 1790, her writings had become more explicitly political. She had three plays published: Les Démocrates et les Aristocrates; ou le Curieux du Champ de Mars [The Democrats and the Aristocrats], a satire of political extremists on both sides; La Nécessité du Divorce [The Necessity of Divorce], again illustrating the powerlessness of women trapped in marriage, and written simultaneously with a debate on the topic in the National Assembly (France would be the first Western country to legalize divorce two years later); and Le Couvent ou les Vœux Forcés [The Convent, or Vows Compelled]. The Convent was her second play to see the light of day, and her greatest success. In the year of its publication it saw approximately 80 performances. Highlighting the political impotence of women, it illuminated the injustice of the Church’s complicity in male relatives’ right to force females into convents against their will.

Gaining momentum on the political front, her next play to be produced for the stage was Mirabeau aux Champs-Elysées [Mirabeau in the Elysian Fields] (1791). Depicting the Enlightenment philosophers Baron de Montesquieu (1689-1755), Voltaire (1694-1778), and Rousseau, along with Benjamin Franklin (1706-1790) and Madame Deshoulières (1638-1694), Marquise de Sévigné (1626-1696), and Ninon de Lenclos (the latter three major female influences in France during the Enlightenment), the play awakens numerous notable historical figures to welcome Mirabeau (1749-1791) as a hero to the afterlife, in effect honoring his stance as a supporter of constitutional monarchy. Most notable, perhaps, is the appearance of the three women as worthy of a place of honor and a voice, platforms they use to assert, among other things, that the success of the Revolution pivots on the inclusion of women. (This is also the year Gouges wrote The Rights of Woman, discussed separately below.)

In 1792, one year before her trial and execution, she worked on two plays: the unfinished La France Sauvée, ou le Tyran détrôné [France Preserved, or the Tyrant Dethroned] and the completed L’Entrée de Dumourier [sic] à Bruxelles [Dumouriez’s Entry into Brussels]. The former, confiscated at her arrest, was used as proof of sedition at her trial because of its sympathetic depiction of Marie-Antoinette, even as Gouges used it to demonstrate support for her own case. The latter depicted General Dumouriez’s defense of the Revolution against foreign anti-royalists, assisted by male and female warriors, and challenged her own privileging of aristocracy by suggesting that commoners were the true nobles. It was also the fourth and final play to be staged during Gouges’s lifetime.

The publication of Memoir of Madame de Valmont (1788) ironically begins, rather than summarizes, her political career. This fictionalized self-examination grappled with idealized father figures and fragmented selves, and served to package and compartmentalize her pre-Parisian life and move her forward wholly into a literary existence. Using a version of her (Gouges’s) personal story to make a political point, the narrator of the story sees clearly how gender works to constrain women. Mme de Valmont’s father’s refusal to acknowledge paternity raises issues of legitimacy for Valmont with financial and social repercussions. While he also expressed no patience for women’s forays into traditional male arenas such as writing, the narrator’s solution is to call for women to stop undermining other women and work to support each other. Rejection of the symbolic paternal voice of the culture has political power, and the Memoir presents an 18th-century illustration of making the personal political—a vivid theme in 20th-century feminism.

Gouges’s second novel Le Prince Philosophe (The Philosopher Prince), also written in 1788 but not published until 1792, reconceives monarchical rule, positing the best kind of ruler as one who would prefer not to rule, and proposing that all rule must be founded on the obligation not to take life. Scholars are divided on whether she maintained her monarchist stance throughout her life. Her literary output and her pamphleteering often suggest some version of a monarchy as her default position. However, Azoulay (2009) suggests that, at least in Gouges’s later writings (and perhaps in this second novel as well), her supposed attention to the monarchy is rather an attention to the preservation of the state and to the injustice of taking a life—namely, the King’s. Gouges’s depiction of the cause of the friction between the sexes in this novel is commentary on gender relations that here appears more conservative than will later be the case. The male characters still hesitate to share the reins with women. Yet, women are encouraged to develop reason rather than charm, and one female character submits a carefully drawn up plan advocating education and job opportunities for females, eventually winning a small victory by receiving permission to run a women’s academy.

Gabrielle Verdier (1994) notes several distinguishing features of Gouges’s plays: (1) young women have active roles to play, (2) women of any age have agency, (3) female rivalry is absent, (4) mature women are protectors, benefactors, and mentors, and (5) the abuses that women experience are inevitably tied to larger social injustices. While not prepared to offer up a fully formed theory of oppression, Gouges is readying the space where that work can be done.

In all of her writings, both literary and political, one finds an unflinching self-confidence and a desire for justice. What Lisa Beckstrand (2009) refers to as the “theme of the global family” also runs through Gouges’s literary work. Familial obligations dominate and are responses to the inadequacies of the state. The plight of the illegitimate child, the unmarried mother, the poor, the commoner (at least by 1792), the orphan, the unemployed, the slave, even the King when he is most vulnerable, are all brought to light, with family connection and sympathy for the most disadvantaged as the pivotal plot points. Women characters regularly displace men at center stage. It is women, unified with each other and winning the recognition of men, that most characterizes what Gouges conveys in her work. She is the first to bring several taboo issues to the stage, divorce and slavery among them. Especially prolific in the four years from 1789 to her death in 1793, her political passion, labeled conservative by many, is still astonishing for its persistence in a culture that too often worked strenuously to stifle women’s voices.

b. Political

The impact of Olympe de Gouges’s political activism is commemorated by her inclusion as the only French woman on revolutionary and abolitionist Abbé Henri Grégoire’s (1750-1831) list of “all those men [sic] who have had the courage to plead the cause” of abolition. The list is included in the introduction to his On the Cultural Achievements of Negroes (1808), which was written as a counter to Thomas Jefferson’s less admiring look at race in Notes on the State of Virginia (1782). As with so much that came to prominence with modern feminism, indignation at injustice must have started for Gouges with her own marginalization as a woman, but it shifted to the external world with a recognition of the inhumanity of slavery. While she was not an immediatist like some in the next generation of abolitionists such as William Lloyd Garrison (1805-1879) in the U.S., abolitionist thought permeated her understanding of the world and of herself as a writer, and soon grounded her thinking on women. The French Revolution itself transformed Gouges’s thinking further when the rights of citizens, despite pleas from the Girondins, were not applied to the female citizen. In fact, female political participation of all kinds was formally banned by the French National Assembly in 1793, after one of several uprisings led by women. Gouges’s political tenacity showed itself most virulently in prison where she “mounted the rostrum” at least two final times, smuggling out pamphlets that condemned prison conditions and that accepted—indeed recklessly demanded—responsibility for her ideas, challenging how the rights of freedom of speech were embodied in the new Constitution.

Her path from social nonconformist, to political activist and reformer, to martyr was one untrodden by women. Many of the most influential eighteenth century intellectuals—with a few exceptions—were convinced women did not have the intellectual capacity for politics. Gouges challenged that perception, while also problematizing it by writing hastily, sometimes dictating straight to the printer. While her uneven education opened her to ridicule, it gave her a critical affinity for Rousseauean ideas, as will be discussed below. Both playwriting and her productivity as a pamphleteer gained her celebrity which she used as a podium for her advocacy of the marginalized, and for drawing attention to the importance of the preservation of the state.

Gouges’s formal petitions to the National Assembly, and her public calls for governmental and social reform through the press and through her pamphleteering (common in France), went far beyond The Rights of Woman of 1791 (see next section) and her stance on slavery. Among her many calls for change in this medium were: (as mentioned above) a demand for a national theatre dedicated to the works of women; a voluntary tax system (one of the few demands she published anonymously and which saw implementation the following year); state-sponsored working-groups for the unemployed; social services for widows, the elderly and orphans; civil rights for illegitimate children and unmarried mothers; suppression of the dowry system; regulation of prostitution; sanitation; rights of divorce; rights to marriage for priests and nuns; people’s juries for criminal trials; and the abolition of the death penalty. She petitioned the National Assembly on a number of occasions on these and other matters. Whether or not related to her efforts, the National Assembly did pass laws in 1792 giving illegitimate children some of the civil rights for which she fought and granting women the right to divorce, even while women remained legal nonentities overall.

She is frequently touted as a (sometimes the) founder of modern feminism for her unrelenting advocacy for women’s rights in writing and in action. While the historical record is more complicated than that, in historian John R. Cole’s accounting (2011), “she published on current affairs and public policy more often and more boldly than any other woman. . . and [s]he made a more formal and sweeping demand for the extension of full civil and political rights to women than any prior person, male or female, French or foreign” (231). Her call for women to identify as women and band together in support of each other can also be considered a contribution to the Revolution and to the concept of citizenship, and remains today an important focus for modern feminism. Others in France were also proposing feminist ideas, although none as actively and comprehensively as Gouges: most notably, François Poulain de la Barre (1647-1725) in the 17th century and Marie Madeleine Jodin (1741-1790), Louise de Keralio-Robert (1758?-1822), Nicolas de Condorcet (1743-1794), and Etta Palm d’Aelders (Dutch, 1732-1799), in the 18th century. The latter two petitioned the National Assembly unsuccessfully in 1790 to ensure legal rights for women.

Gouges’s writings share many themes with what would become classic texts in the feminist movement of the 20th century: independence of mind and body, access to political rights and political voice, education, and elimination of the sexual double standard. Her awareness of these themes springs from her experience as a woman, solidified by her unhappy early marriage, her unapologetic and ostensibly scandalous first years in Paris, through to her sometimes-thwarted, oft-derided, attempts at participation in cultural, literary and political realms. She experienced firsthand how the rights of the citizen were denied women. Her early history, her frustration at being denied, or dismissed as, a voice in the public sphere, and the ridicule she withstood, aimed at her gender, gave shape to insights emblematic of much later feminist theory and concretized for her an understanding of the link between the public and private realms. Gouges contributed markedly to the depth and breadth of the discourse on women’s rights in late 18th-century France, and on the plight of the underprivileged in general. As Cole summarizes: “she tried to rally other women behind a radical extension of liberty and equality into domestic relationships . . . and she [advocated for] the extension of rights to free persons of color and free blacks and to vindicate the full humanity of slaves in the Caribbean colonies” (231).

Perhaps most indicative of Gouges’s political courage and intellectual self-reliance was the stance which led to her death. Her decision to continue to publish works deemed seditious even as the danger of arrest grew shows courage and commitment to her advocacy of the less fortunate and exemplifies her self-definition as a political activist. She was the only woman executed for sedition during the Reign of Terror (1793-1794). That fact, along with comments such as those of Pierre Chaumette: “[r]emember the shameless Olympe de Gouges, . . . who abandoned the cares of her household to involve herself in the republic, and whose head fell under the avenging blade of the laws. Is it for women to make motions? Is it for women to put themselves at the head of our armies?” (Andress, 234), suggests that her outspoken views were greeted with increased hostility because of her gender. In May or June of 1793 her poster The Three Urns [or Ballot Boxes] appeared, calling for a referendum to let the people decide the form the new government should take. Proposing three forms of government: republic, federalist or constitutional monarchy, the essay was interpreted as a defense of the monarchy and used as justification for her arrest in September. Her continuing preference for a constitutional monarchy was likely propelled in part by her disappointment with the Revolution, but more specifically by her opposition to the death penalty and her general humanitarian inclinations. She appears to have had no elemental dispute with monarchy per se, problematizing any philosophical understanding of her commitment to human rights. The Rights of Woman, for instance, is dedicated to the Queen—as a woman, but presumably because she is the Queen.

c. The Rights of Woman (1791)

By far her most well-known and distinctly feminist work, The Rights of Woman (1791) was written as a response to The Declaration of the Rights of Man and of the Citizen, written in 1789 but officially the preamble to the French Constitution as of September 1791. Despite women’s participation in the Revolution and regardless of sympathies within the National Assembly, that document was the death knell for any hopes of inclusion of women’s rights under the “Rights of Man.” The patriarchal understanding of female virtue and sexual difference held sway, supported by Rousseau’s perspective on gender relations, perpetuating the view that women’s nurturing abilities and responsibilities negated political participation. Political passivity was itself seen as a feminine responsibility.

The Rights of Woman appeared originally as a pamphlet printed with five parts: 1) the dedication to the Queen, 2) a preamble addressed to “Man,” 3) the Articles of the Declaration, 4) a baffling description of a disagreement about a fare between herself and a cab driver, and 5) a critique of the marriage contract, modeled on Rousseau’s Social Contract. It most often appears (at least in English translation) without the fourth part (Cole, 2011). Forceful and sarcastic in tone and militant in spirit, its third section takes up each of the seventeen Articles of the Preamble to the French Constitution in turn and highlights the glaring omission of the female citizen within each article. Meant to be a document ensuring universal rights, the Declaration of the Rights of Man and of the Citizen is exposed thereby as anything but. The immediacy of the implications of the Revolution finally fully awakened Gouges to the ramifications of being denied equal rights, but her entire oeuvre was aiming in this direction. Gouges wrote a document that highlights her personal contradictions (her own monarchist leanings as they hinder full autonomy most obviously), while bringing piercing illumination to contradictions in the French Constitution. Despite the lack of attention Gouges’s pamphlet received at the time, her greatest contribution to modern political discourse is the highlighting of the inadequacy of attempts at universality during the Enlightenment. The demands contained within the original document assert the universality of “Man” while denying the specificity required for “Woman,” therefore collapsing, at least logically, by its own efforts. Alert to the powerlessness of women and the injustice such a condition implies, Article IV of The Rights of Woman, for instance, particularly calls for protection from tyranny, as “liberty and justice” demand; that is, as nature and reason demand in personal as well as political terms. To harmonize this document with her devotion to the monarchy for most of her political career takes significant effort.

For Gouges, the most important expression of liberty was the right to free speech; she had been exercising that right for almost a decade. Access to the rostrum required more than an early version of “add women and stir.” While Gouges’s Rights is rife with such pluralizing (extending any right of Man to Woman as well), there is also a clear acknowledgement that blind application of universal principles is insufficient for the pursuit of equality. Article XI, for example, demands the right of women to name the father of their children. The peculiarity of the need for this right on the part of women stands out because of its specificity and demonstrates the contradictions created by blindness to gender. The citizen of the French Revolution, the idealistic universal, is the free white adult male, leaving in his wake many injustices peculiar to individuals excluded from that “universal.” The Rights of Woman unapologetically highlights that problem.

The Enlightenment presumption of the “natural rights” of the citizen (as in “inalienable rights” in the U.S. Declaration of Independence) is in direct contradiction to the equally firmly held belief in natural sexual differences; both are so-called “founding principles of nature.” While Gouges is not fully aware of the implications of this conflict, she holds unequivocally in The Rights of Woman that those natural rights do indeed grant equality to all, just as the French Declaration states but does not intend. The rights such equality implies need to be recognized as having a more far-reaching application; if rights are natural and if these rights are somehow inherent in bodies, then all bodies are deserving of such rights, regardless of any particularities, like gender or color.

Marriage, as the center for political exploitation, is thoroughly lambasted in the postamble, Part 5, to The Rights of Woman. Gouges describes marriage as the “tomb of trust and love,” and the place of “perpetual tyranny.” The primary site of institutionalized inequality, marriage creates the conditions for the development of women’s unreliability and capacity for deception. Just as Mary Wollstonecraft (1759-1797) does in A Vindication of the Rights of Woman (1792), Gouges points to female artifice and weakness as a consequence of woman’s powerless place in this legalized sexual union. Gouges, much like Wollstonecraft, attempts to combat societal deficiencies: the vicious circle which neglects the education of its females and then offers their narrower interests as the reason for the refusal of full citizenship. Both, however, see the resulting fact of women’s corruption and weak-mindedness as a major source of the problems of society, but herein also lies the solution. Borrowing from Rousseau, Gouges proposes a “social contract” as replacement for traditional marriage, reformulating his social contract with a focus that obliterates his gendered conception of the citizen and creates the conditions for both parties to flourish. In the “Form for a Social Contract Between Man and Woman,” Gouges offers a kind of civil union based on equality, which will create the “moral means of achieving the perfection of a happy government!” The state is a (reconceived) marriage writ large, for Gouges. What ails government are fixed social hierarchies impossible to maintain. What heals a government is an equal balance of powers and a shared virtue (consistent with her continuing approval of a constitutional monarchy). Marriages are to be voluntary unions by equal rights-bearing partners who hold property and children mutually and dispose of the same by agreement. All children produced during this union have the right to their mother’s and father’s name, “from whatever bed they come.”

Gouges remained a monarchist almost to the end, and this, together with her lack of formal education, suggests that when she wrote this document she possessed less than full comprehension of what we now view as the discourse on universal human rights. That said, the production of this document has influenced exactly that conversation, and thus her presence in the list of historical figures who matter philosophically has to be acknowledged. Even Wollstonecraft, in her Vindication of 1792, does not call for the complete reinvention of women as political selves, as does Gouges in 1791.

d. Philosophical

As with most Enlightenment thinking, a natural rights tradition (although not any kind of comprehensive theory of natural rights) can be found in Gouges’s views on the origins of citizenship and rights for women and blacks. By 1791, she is arguing that equality is natural; it “had only to be recognized.” It appears that Gouges did not see any contradiction in her royalist leanings; nevertheless, she may no longer have been a monarchist by this point. She held that the human mind has no sex, an idea traceable in the modern era as far back as Poulain de la Barre (1673); men and women are equally human, therefore capable of the same thoughts. While her lack of education precluded the use of any systematic methodology, the consistency of her advocacy for the powerless, for pacifism, and (eventually) for the universal application of moral and legal rights is of great merit, and remains, if not based in rigorous philosophical analysis, yet philosophically astute. Her writings, both literary and political, point in directions contemporary feminist philosophy traversed for much of the twentieth century and beyond. She presciently foreshadows the “masculine universal” of liberal democracy identified by much contemporary feminist thought. She rejected the perceptions of sexual difference used to drive women out of the political arena, while she advocated for women’s “special interests.” Echoing Plato’s attention to gender in The Republic, she saw natural differences between genders, but not of a kind relevant to the tasks of the citizens of the state. Despite appearing after the French constitution had been decreed and constitutionally frozen, Gouges’s The Rights of Woman was aimed at expanding, even supplanting, the official French Declaration. Focusing on women as human and thus equal, but with pregnancy and motherhood as special differences, Gouges seemed comfortable with the resulting conceptual dissonance.

On the heels of The Rights of Woman, she published The Philosopher Prince (1792), a novel where ideas in the realm of political philosophy (perhaps influenced by the historical events of the previous three years) are most on display. With the marriage contract from The Rights of Woman as a template, she unpacks reasons for the lack of solidarity between the sexes; she depicts women living in a mythical society where education becomes a requirement for civic virtue; access to reason is necessary so that women grow up equal to men and engaged in public life. Azoulay (2009) gives the most scholarly attention to date to this novel, arguing that it provides evidence that Gouges was a monarchist only insofar as monarchy was the best means to preserve the nation. “Gouges did not seek the preservation of the monarchy but rather the revival of the kingdom” (43).

Earlier, in 1789, the pamphlet Le Bonheur primitif de l’homme [The Original Happiness of Man] also gives hints of a political philosophy. Gouges imagines a society where women are granted an education and encouraged in the development of their agency. Individual happiness depends on collective happiness, but collective happiness comes from the natural qualities found within families (Beckstrand’s “theme of the global family”). While agreeing with Rousseau that civilization corrupts, she parts ways with him on the education of females. Although she is never directly critical of Rousseau, the implication here is that the corruptibility of civilization can be countered only if we raise “Sophie” within a nurturing egalitarian family with as much freedom and natural exploration as Rousseau proposes for “Emile.” The happiness of all requires, among many other things, that “[g]irls will go to the fields and guard the animals” (tr. Harth, 1992, 222). This pamphlet also contains her call for a national theatre for women.

We find fragments of a larger philosophical perspective wherever we look. The female as subject rather than object, especially in political discourse, is among the most important and prevalent. Her understanding of the value of her own voice creates an understanding of self that challenges gender norms head on, withstands all public criticism, and refuses to collapse under the weight of taboo. A nascent moral philosophy can be unearthed by considering her lifelong attention to the plight of the disadvantaged. An early example is the postface to the second printing of her first play, written prior to its first staging, but after a long battle with the Comédie Française to have it staged. Réflexions sur les hommes nègres [Reflections on Black Men] raises questions about personhood and race. Gouges identifies race as a social construct insofar as slavery condemns blacks to being bought and sold “like cows at market.” She is horrified at what privileged men will do in the name of profit. “It is only color” that differentiates the African from the European. And, difference in color is simply the beauty of nature. If “they are animals, are we not like they?” There is explicit criticism here of a binary way of seeing the world.

An atheist, she critiques religion—particularly Catholicism—by focusing on its oppressiveness, especially towards women. Religion should not prohibit one from listening to reason or encourage one to be “deaf to nature.” The celibacy of priests and nuns lays the ground for corruption and plays a role in religion’s oppressiveness as well, she proclaimed. Throughout her writings, respect for the individual appears more vividly than Enlightenment philosophers generally could conceive, grounds her pacifism, inspires her attention to children, and underscores her political vision. And, in part, through her reverence for Rousseau, she sees problems with the separation, both devastating in its implications in practice and invigorating in its theoretical possibilities, between the private and the public spheres.

i. Gouges and Rousseau

The writings of Jean-Jacques Rousseau (1712-1778) were a major influence on the French Revolution, as was the then-recent success of the American Revolution (1776). Among Gouges’s lost plays is one titled Les Rêveries de Rousseau, la Mort de Jean-Jacques à Ermenonville [Reveries of Rousseau, the Death of Jean-Jacques at Ermenonville] (1791); she was an ardent admirer, calling him her “spiritual father.” While it is clear Rousseau’s philosophy as a systematizable whole was ignored by the revolutionaries, his idea that the power of the government should come from “the consent of the governed” inspired the overthrow of an absolutist monarchy. His advocacy of rule by the general will helped inspire the French to shed their monarchist allegiances and take to the streets. His theory of education for boys promoted non-interference and encouraged conditions that would allow nature to take its course. The claim that this was the most direct route to virtue and would produce the best kind of self was aimed at the man a boy like Emile would become, and it was taken seriously by men and women alike in France and beyond.

Gouges described herself as a “pupil of pure nature,” embracing a Rousseauian perspective on education while imposing on it her own perspective on gender. The education Rousseau proposed for girls was mind-numbingly stifling; they were to be raised to understand they were “made for man’s delight.”

“They must be subject, all their lives, to the most constant and severe restraint, which is that of decorum: it is, therefore, necessary to accustom them early to such confinement, that it may not afterwards cost them too dear; and to the suppression of their caprices, that they may the more readily submit to the will of others” (Emile, or On Education, V:1297).

Rousseau claimed virtues for the male arose most purely when the individual was not constrained by civilization. He proposed that boys turn out best when left to themselves. He advocated freedom for Emile. But when it came to females, a system of cultural constraints was necessary in order to ensure the properly compliant nature for a companion for Emile. “[A]mong men, opinion is the tomb of virtue; among women it is the throne” (Emile, V:1278). Universal principles and the “masculine virtues” applied only to those dominant men. In terms of gender, Rousseau was influencing the Revolution in just the way Gouges was finding fault with it.

Gouges agreed with Rousseau’s understanding of how education of the citizenry could transform society. Yet, seeing well beyond Rousseau in terms of gender, she proclaimed in The Rights of Woman that the failure of society to educate its women was the “sole cause” of the corruption of government. Gouges, in a number of documents, as has been noted, anticipated contemporary feminist philosophy’s claim that modern liberal democracy operated with a deficient notion of the universal because it lacked inclusiveness. Historically, woman is seen as the complementary and contrasting counterbalance to man; if man is a political animal, woman is a domestic one. While the prominent French revolutionaries worked to apply Rousseauian themes to, among other things, the exclusion of women from political participation, Gouges fought to raise awareness of what is largely the misapplication of Rousseau’s social critique while never naming Rousseau. According to her, if social systems are human-made and they tend to cause the evils of the world because they interfere with nature, then just as it is for males, so must it be for females. Despite his intentions, Rousseau’s egalitarian vision had immense feminist implications. Men’s tyranny over women, for Gouges, is clearly contrary to nature, not in sync with it. Her use of “social contract” in the postscript to the Declaration is a direct appropriation of Rousseau. Her social contract proclaims that the right in marriage to equal property and parental and inheritance rights is the only way to build a society of “perfect harmony.” As with Beauvoir a century and a half later, Gouges calls women to take responsibility for their condition, and demand equality. She conceived of it in the masculine, and applied it to herself. She accepted Rousseau’s understanding of “nature” and his discourse on rights. She wrote The Original Happiness of Man in 1789 as a demonstration of her debt to Rousseau. There, she acknowledges her lack of formal education (as she often did in her writings), suggesting that she could see some things more clearly because of that deficit (she is “at once placed and displaced in this enlightened age . . .”). Her freedom comes from her lack of constraint, originating, in part, in her lack of formal education. The artificial constraints she encounters are unjustly thrust upon her by her society.

Rousseau’s condemnation of class distinctions also spoke directly to Gouges’s experience. But he did not question the right of the sovereign over the governed and Gouges, despite her monarchism, does at times do so with vigor. (When that sovereign is in one’s own household–all the more need for vigor.) For instance, in her pamphlet containing a proposal for a female national guard (Sera-t-il Roi, ne le sera-t-il pas? [Will he, or will he not, be King?], 1791), she criticizes a sovereign who would ask its citizenry to go to war, suggesting that such a request contradicts the very essence of citizenship. Being a citizen requires one to honor one’s relationship to the state, not to deliberately put that relationship in jeopardy. At the center of an understanding of political life should be a commitment not to take life–that is, to preserve the polis as a whole.

3. Relevance and Legacy

In addition to the political activism just mentioned, Gouges foreshadowed Henry David Thoreau (1817-1862), Mahatma Gandhi (1869-1948) and Martin Luther King, Jr. (1929-1968), by calling for disobedience to obviously unjust laws. Her argument for protections for the deposed French king comes, not so much from her royalist tendencies, but from her understanding of the “global family” and from her pacifism, as well as from her understanding of the separability of sovereign power from the individual who inhabits that power. Once sovereignty is removed, the individual, she believed, is no longer synonymous with that figurehead. Putting to death the man who held that title but has since relinquished it is a miscarriage of justice.

Through the several articles of the third part of The Rights of Woman she helped to problematize the notion of universalizability as a moral good, and has helped to formulate and to popularize the notion that the word ‘man’ when used generically may be problematic.

Gouges critiqued the principle of equality touted in France because it gave no attention to whom it left out, and she worked to claim the rightful place of women and slaves within its protection. She moved the Querelle des Femmes (“the woman question”), which had its origin in France in the Middle Ages and pivoted around what role(s) women should rightly play in society, out of the abstract and into the political arena. She moved the discussion of slavery from an abstract distant one (an issue for the colonies only) literally to center stage and specifically highlighted the moral irrelevance of color. The “color of man is nuanced,” she wrote; and she asked how, if blonds are not superior to brunettes, and mulattos not superior to Negroes, whites can be any different. Color cannot be a criterion for dehumanization. Her challenge to traditional binaries wherever she found them may be the culminating arc of her work and where we can find our greatest debt to her. Her fictional characters all strain against the straitjackets of their identities: strong women vie for their independence in conversation with sympathetic men rather than pitting themselves against each other in rivalry over men; men and women bond together to right some significant wrong; women seek strength in other women and unify to accomplish morally worthy goals; men often relinquish their arbitrary right to power over women to work in tandem to accomplish just goals. Men are depicted applauding women’s success. Her political pamphlets demonstrate her commitment to an overhaul of society. Revolutions, she insisted, could not succeed without the inclusion of women. And, since blacks demonstrated their humanity with every step and every breath in her plays, their enslavement was an indictment of French society.

Although Gouges was silenced for a time by history, scholarship on her has increased steadily since the publication of Olivier Blanc’s first biography in 1981 (no English translation; supplanted by his 2003 publication). A singular figure in the French Revolution and a founding influence on the direction of women’s and human rights, Gouges is an important historical figure because of her resistance to gendered social norms and her insistence on the revolutionary nature of the application of women’s rights. If not herself a philosopher, she had the stamina and the intellect to shape ideas that have been and continue to be philosophically relevant and valuable. Her ability to attain status and power and the public rostrum despite her background and her gender is astonishing. Her refusal to be silent in the face of injustices, both personal and social, contains the roots of her legacy.

4. References and Further Reading

a. Extant Works by Olympe de Gouges (in French)

Théâtre politique I, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1992)–includes Le Couvent, Mirabeau aux Champs-Elysées, and L’Entrée de Dumourier à Bruxelles.
Oeuvres complètes: Théâtre, Félix-Marcel Castan, ed. Montauban: Cocagne, 1993 (comprises the twelve extant plays, including two in manuscript).
Théâtre politique II, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1993)—comprises L’Homme généreux, Les Démocrates et les aristocrates, La Nécessité du divorce, La France sauvée ou le tyran détrôné, and Le Prélat d’autrefois, ou Sophie et Saint-Elme.
Ecrits politiques, 1788-1791, volume 1, preface by Olivier Blanc (Paris: Côté-femmes éditions, 1993).
Action héroïque d’une françoise, ou La France sauvée par les femmes. Paris: Guillaume Junior, 1789.
L’entrée de Dumouriez à Bruxelles, ou Les vivandiers. gallica.bnf.fr.
Ecrits politiques, 1792-1793, volume 2, preface by Blanc (Paris: Côté-femmes éditions, 1993).
Mémoire de Madame de Valmont, 1788, roman (Paris: Côté-femmes éditions, 1995).
La France sauvée, ou Le tyran détrôné. //gallica.bnf.fr.
Mon dernier mot à mes chers amis. //gallica.bnf.fr.
Oeuvres, ed. Benoite Groult. Paris: Mercure de France, 1986.
Repentir de Madame de Gouges. 1791, //gallica.bnf.fr.
Les fantômes de l’opinion publique, 1791? //gallica.bnf.fr.
Pour sauver la patrie, il faut respecter les trois ordres : c’est le seul moyen de conciliation qui nous reste. (1789). //gallica.bnf.fr.
Le cri du sage, par une femme. (1789 ?). //gallica.bnf.fr.
Dialogue allégorique entre la France et la vérité, dédié aux états généraux. 1789. //gallica.bnf.fr.

b. On-line English Translations of Gouges’s Original Works

On-going translation project by Clarissa Palmer: Available at www.olympedegouges.eu.
The Rights of Woman (1791) [titled here Declaration of the Rights of Women and the Female Citizen, 1791]: www.fordham.edu/halsall/mod/1791degouge1.asp.
Transcript of her trial: chnm.gmu.edu/revolution/d/488/.
“Reflections on Negroes” (trans. Sylvie Molta) www.uga.edu/slavery/texts/literary_works/reflections.pdf.
“Response to the American Champion” (trans. Maryann DeJulio) www.uga.edu/slavery/texts/literary_works/reponseenglish.pdf.
Additional material: www.uga.edu/slavery/texts/other_works.htm#1789.

c. Secondary Sources in English (except Blanc)

Andress, David. The Terror: the Merciless War for Freedom in Revolutionary France. New York: Farrar, Straus and Giroux, 2006.
Azoulay, Ariella. “The Absent Philosopher-Prince: Thinking Political Philosophy with Olympe de Gouges.” Radical Philosophy 158 (Nov/Dec 2009).
Beckstrand, Lisa. Deviant Women of the French Revolution and the Rise of Feminism. Verlag: Associated University Presses, 2009.
Beauvoir, Simone de. The Second Sex. Trans. H. M. Parshley. Vintage Books (Random House) (1989) [1952].
Blanc, Olivier. Une Femme de Libertés: Olympe de Gouges. Syros: Alternatives, 1989.
Blanc, Olivier. Marie-Olympe de Gouges: Une Humaniste à la fin du XVIIIe siècle. Paris: Editions René Viénet, 2003.
Brown, Gregory S. “The Self-Fashionings of Olympe de Gouges, 1784-1789.” Eighteenth-Century Studies, 34(3) (2001), 383-401.
Cole, John. Between the Queen and the Cabby. Montreal, Quebec, Canada: McGill-Queen’s University Press, 2011 (contains the only full length translation of all five parts of The Rights of Woman).
Diamond, Marie Josephine. “The Revolutionary Rhetoric of Olympe de Gouges.” Feminist Issues, 14(1) (1994), 3-23.
Fraisse, Genevieve, and Michelle Perrot, eds. A History of Women in the West, Vol. 4: Emerging Feminism from Revolution to World War. Cambridge, MA: Harvard University Press, 1993.
Garrett, Aaron. “Human Nature.” Cambridge History of Eighteenth-century Philosophy, Volume 1, ed. Knud Haakonssen. New York, NY: Cambridge University Press, 2006. 160-233.
Green, Karen. A History of Women’s Political Thought in Europe, 1700-1800. New York, NY: Cambridge University Press, 2014. (Especially Chapter 9: “Anticipating and experiencing the revolution in France,” 203-234.)
Groult, Benoîte, ed. (French) “Olympe de Gouges: la première féministe moderne,” in Olympe de Gouges: Oeuvres. Paris: Mercure de France, 1988.
Harth, Erica. Cartesian Women: Versions and Subversions of Rational Discourse in the Old Regime. Ithaca, NY: Cornell University Press, 1992.
Levy, Darline Gay, Harriet Branson Applewhite and Mary Durham Johnson. Women in Revolutionary Paris: 1789-1795: Selected Documents Translated with Notes and Commentary. Urbana: University of Illinois Press, 1979.
Mattos, Rudy Frederic de. The Discourse of Women Writers in the French Revolution: Olympe de Gouges and Constance de Salm. 2007. Unpublished dissertation. University of Texas at Austin Electronic Theses and Dissertations.
Maclean, Marie. “Revolution and Opposition: Olympe de Gouges and the Déclaration des droits de la femme.” In Literature and Revolution. Ed. David Beven. Amsterdam: Rodopi, 1989.
Melzer, Sara E. and Leslie W. Rabine, eds. Rebel Daughters: Women and the French Revolution. New York: Oxford University Press, USA, 1992.
Miller, Christopher L. The French Atlantic Triangle: Literature and Culture of the Slave Trade. Durham, NC: Duke University Press, 2008.
Monedas, Mary Cecilia. “Neglected Texts of Olympe de Gouges, Pamphleteer of the French Revolution of 1789,” Advances in the History of Rhetoric 1.1 (1996). 43-54.
Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994.
Mousset, Sophie. Women’s Rights and the French Revolution: a Biography of Olympe de Gouges. New Brunswick, NJ: Transaction Publishers, 2007.
Nielson, Wendy C. “Staging Rousseau’s Republic: French Revolutionary Festivals and Olympe de Gouges.” Eighteenth-Century: Theory and Interpretation, 43(3) (Fall 2003), 265-85.
Nielson, Wendy C. Women Warriors in Romantic Drama. Lanham, MD: The University of Delaware Press, 2013.
O’Neill, Eileen. “Early Modern Women Philosophers and the History of Philosophy.” Hypatia 20(3) (2005), 185-197.
Scott, Joan Wallach. “French Feminists and the Rights of ‘Man’: Olympe de Gouges’s Declarations.” History Workshop 28 (1989), 1-21.
Scott, Joan Wallach. Only Paradoxes to Offer: French Feminists and the Rights of Man. Cambridge, MA: Harvard University Press, 1996.
Sherman, Carol L. Reading Olympe de Gouges. New York, NY: Palgrave Macmillan, 2013.
Spencer, Samia I., ed. French Women and the Age of Enlightenment. Bloomington, IN: Indiana University Press, 1984.
Trouille, Mary Seidman. “Eighteenth-Century Amazons of the Pen: Stéphanie de Genlis & Olympe de Gouges.” Femmes Savantes et Femmes d’Esprit: Women Intellectuals of the French Eighteenth Century. Eds. Roland Bonnel and Catherine Rubinger. New York: Peter Lang, 1994.
Trouille, Mary Seidman. Sexual Politics in the Enlightenment: Women Writers Read Rousseau. Albany: State U of New York Press, 1997.
Vanpée, Janie. “La Déclaration des Droits de la Femme: Olympe de Gouges’s Re-Writing of La Déclaration des Droits de l’Homme.” In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 55-80.
Verdier, Gabrielle. “From Reform to Revolution: The Social Theater of Olympe de Gouges.” In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 189-224.

Author Information

Joan Woolfrey
Email: jwoolfrey@wcupa.edu
West Chester University of Pennsylvania
U. S. A.

Kant: Philosophy of Mind

Immanuel Kant (1724-1804) was one of the most important philosophers of the Enlightenment Period (c. 1650-1800) in Western European history. This encyclopedia article focuses on Kant’s views in the philosophy of mind, which undergird much of his epistemology and metaphysics. In particular, it focuses on metaphysical and epistemological doctrines forming the core of Kant’s mature philosophy, as presented in the Critique of Pure Reason (CPR) of 1781/87 and elsewhere.

There are certain aspects of Kant’s project in the CPR that should be very familiar to anyone versed in the debates of seventeenth century European philosophy. For example, Kant argues, like Locke and Hume before him, that the boundaries of substantive human knowledge stop at experience, and thus that we must be extraordinarily circumspect concerning any claim made about what reality is like independent of all possible human experience. But, like Descartes and Leibniz, Kant thinks that central parts of human knowledge nevertheless exhibit characteristics of necessity and universality, and that, contrary to Hume’s skeptical arguments, there is good reason to think so.

Kant carries out a ‘critique’ of pure reason in order to show its nature and limits, thereby curbing the pretensions of various metaphysical systems articulated on the basis that reason alone allows us to scrutinize the depths of reality. But Kant also argues that the legitimate domain of reason is more extensive and more substantive than previous empiricist critiques had allowed. In this way Kant salvages (or attempts to) much of the prevailing Enlightenment conception of reason as an organ for knowledge of the world.

This article discusses Kant’s theory of cognition, including his views of the various mental faculties that make cognition possible. It distinguishes between different conceptions of consciousness at the basis of this theory of cognition and explains and discusses Kant’s criticisms of the prevailing rationalist conception of mind, popular in Germany at the time.

1. Kant’s Theory of Cognition
  a. Mental Faculties and Mental Representation
    i. Sensibility, Understanding, and Reason
    ii. Imagination and Judgment
  b. Mental Processing
2. Consciousness
3. Concepts and Perception
4. Rational Psychology and Self-Knowledge
5. Summary
6. References and Further Reading
  a. Kant’s Works in English
  b. Secondary Sources

1. Kant’s Theory of Cognition

Kant is primarily interested in investigating the mind for epistemological reasons. One of the goals of his mature “critical” philosophy is articulating the conditions under which our scientific knowledge, including mathematics and natural science, is possible. Achieving this goal requires, in Kant’s estimation, a critique of the manner in which rational beings like ourselves gain such knowledge, so that we might distinguish those forms of inquiry that are legitimate, such as natural science, from those that are illegitimate, such as rationalist metaphysics. This critique proceeds via an examination of those features of the mind relevant to the acquisition of knowledge. This examination amounts to a survey of the conditions for “cognition” [Erkenntnis], or the mind’s relation to an object. Although there is some controversy about the best way to understand Kant’s use of this term, this article will understand it as involving relation to a possible object of experience, and as being a necessary condition for positive substantive knowledge (Wissen). Thus to understand Kant’s critical philosophy, we need to understand his conception of the mind.

a. Mental Faculties and Mental Representation

Kant characterizes the mind along two fundamental axes – first by the various kinds of powers which it possesses and second by the results of exercising those powers.

At the most basic explanatory level, Kant conceives of the mind as constituted by two fundamental capacities [Fähigkeiten], or powers, which he labels “receptivity” [Receptivität] and “spontaneity” [Spontaneität]. Receptivity, as the name suggests, constitutes the mind’s capacity to be affected by something, whether itself or something else. In other words, the mind’s receptive power essentially requires some external prompt to engage in producing “representations” [Vorstellungen], which are best thought of as discrete mental events or states, of which the mind is aware, or in virtue of which the mind is aware of something else (it is controversial whether representations are objects of ultimate awareness or are merely a vehicle for such awareness). In contrast, the power of spontaneity needs no such prompt. It is able to initiate its activity from itself, without any external trigger.

These two capacities of the mind are the basis for all (human) mental behavior. Kant thus construes all mental activity either in terms of its resulting from affection (receptivity) or from the mind’s self-prompted activity (spontaneity). From these two very general aspects of the mind Kant then derives three further basic faculties or “powers” [Vermögen], termed by Kant “sensibility” [Sinnlichkeit], “understanding” [Verstand], and “reason” [Vernunft]. These faculties characterize specific cognitive powers. These powers cannot be reduced to any of the others, and each is assigned a particular, cognitive task.

i. Sensibility, Understanding, and Reason

Kant distinguishes the three fundamental mental faculties from one another in two ways. First, he construes sensibility as the specific manner in which human beings, as well as other animals, are receptive. This is in contrast with the faculties of understanding and reason, which are forms of spontaneity belonging to human beings, or to all rational beings. Second, Kant distinguishes the faculties by their output. All of the mental faculties produce representations. We can see these distinctions at work in what is generally called the “stepladder” [Stufenleiter] passage from the Transcendental Dialectic of Kant’s major work, the Critique of Pure Reason (1781/7). This is one of the few places in the entire Kantian corpus where Kant explicitly discusses the meanings of and relations between his technical terms, and defines and classifies varieties of representation.

The genus is representation (representatio) in general. Under it stand representations with consciousness (perceptio). A perception [Wahrnehmung], that relates solely to a subject as a modification of its state, is sensation (sensatio). An objective perception is cognition (cognitio). This is either intuition or concept (intuitus vel conceptus). The first relates immediately to the object and is singular; the second is mediate, conveyed by a mark, which can be common to many things. A concept is either an empirical or a pure concept, and the pure concept, insofar as it has its origin solely in the understanding (not in a pure image of sensibility), is called notio. A concept made up of notions, which goes beyond the possibility of experience, is an idea or a concept of reason. (A320/B376–7).

As Kant’s discussion here indicates, the category of representation contains sensations [Empfindungen], intuitions [Anschauungen], and concepts [Begriffe]. Sensibility is the faculty that provides sensory representations. Sensibility generates representations based on being affected either by entities distinct from the subject or by the subject herself. This is in contrast to the faculty of understanding, which generates conceptual representations spontaneously – i.e. without advertence to affection. Reason is that spontaneous faculty by which special sorts of concepts, which Kant calls ‘ideas’ or ‘notions’, may be generated, and whose objects could never be met with in “experience,” which Kant defines as perceptions connected by fundamental concepts. Some of reason’s ideas include those concerning God and the soul.

Kant claims that all the representations generated via sensibility are structured by two “forms” of intuition—space and time—and that all sensory aspects of our experience are their “matter” (A20/B34). The simplest way of understanding what Kant means by “form” here is that anything one might experience will have either spatial features, such as extension, shape, and location, or temporal features, such as being successive or simultaneous. So the formal element of an empirical intuition, or sense perception, will always be either spatial or temporal. Meanwhile, the material element is always sensory (in the sense of determining the phenomenal or “what it is like” character of experience) and tied either to one or more of the five senses or the feelings of pleasure and displeasure.

Kant ties the two forms of intuition to two distinct spheres or domains, the “inner” and the “outer.” The domain of outer intuition concerns the spatial world of material objects while the domain of inner intuition concerns temporally ordered states of mind. Space is thus the form of “outer sense” while time is the form of “inner sense” (A22/B37; cf. An 7:154). In the Transcendental Aesthetic, Kant is primarily concerned with “pure” [rein] intuition, or intuition absent any sensation, and often only speaks in passing of the sense perception of physical bodies (for example A20–1/B35). However, Kant more clearly links the five senses with intuition in his 1798 work Anthropology from a Pragmatic Point of View, in the section entitled “On the Five Senses.”

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the imagination…But the senses, on the other hand, are divided into outer and inner sense (sensus internus); the first is where the human body is affected by physical things, the second is where the human body is affected by the mind (An 7:153).

Kant characterizes intuition generally in terms of two characteristics—namely immediacy [Unmittelbarkeit] and particularity [Einzelheit] (cf. A19/B33, A68/B93; JL 9:91). This is in contrast to the mediacy and generality [Allgemeinheit] characteristic of conceptual representation (A68/B93; JL 9:91).

Kant contrasts the particularity of intuition with the generality of concepts in the “stepladder” passage. Specifically, Kant says a concept is related to its object via “a mark, which can be common to many things” (A320/B377). This suggests that intuition, in contrast to concepts, puts a subject in cognitive contact with features of an object that are unique to particular objects and are not had by other objects. Some debate whether the immediacy of intuition is compatible with an intuition’s relating to an object by means of marks, or whether relation by means of marks entails mediacy and, thus, that only concepts relate to objects by means of marks. See Smit (2000) for discussion. Spatio-temporal properties seem like excellent candidates for such features, as no two objects of experience can have the very same spatio-temporal location (B327-8). But perhaps any non-repeatable, non-universal feature of a perceived object will do. For relevant discussion see Smit (2000); Grüne (2009), 50, 66-70.

Though Kant’s discussion of intuition suggests that it is a form of perceptual experience, this might seem to clash with his distinction between “experience” [Erfahrung] and “intuition” [Anschauung]. In part, this is a terminological issue. Kant’s notion of an “experience” is typically quite a bit narrower than our contemporary English usage of the term. Kant actually equates, at several points, “experience” with “empirical cognition” (B166, A176/B218, A189/B234), which is incompatible with experience being falsidical in any way. He also gives indications that experience, in his sense, is not something had by a single subject. See, for example, his claim that there is only one experience (A230/B282-3).

Kant also distinguishes intuition from “perception” [Wahrnehmung], which he characterizes as the conscious apprehension of the content of an intuition (Pr 4:300; cf. A99, A119-20, B162, and B202-3). “Experience,” in Kant’s sense, is then construed as a set of perceptions that are connected via fundamental concepts that Kant entitles the “categories.” As he puts it, “Experience is cognition through connected perceptions [durch verknüpfte Wahrnehmungen]” (B161; cf. B218; Pr 4:300).

Empirical intuition, perception, and experience, in Kant’s usage of these terms, all denote kinds of “experience” as we use the term in contemporary English. At its most primitive level, empirical intuition presents some feature of the world to the mind in a sensory manner. Empirical intuition does so in such a way that the intuition’s subject is in a position to distinguish that feature from others. A perception, in Kant’s sense, requires awareness of the basis by which the feature is different from other things. Kant uses the term in a variety of ways, however—JL 9:64-5, for instance—so there is some controversy surrounding the proper understanding of this term. One has a perception, in Kant’s sense, when one can not only discriminate one thing from another, or between the parts of a single thing, based on a sensory apprehension of it, but can also articulate exactly which features of the object or objects distinguish it from others. For instance, one can say it is green rather than red, or that it occupies this spatial location rather than that one. Intuition thus allows for the discrimination of distinct objects via an awareness of their features, while perception allows for an awareness of what specifically distinguishes an object from others. “Experience,” in Kant’s sense, is even further up the cognitive ladder (see JL 9:64-5), insofar as it indicates an awareness of features, such as the substantiality of a thing, its causal relations with other beings, and its mereological features, that is, its part-whole dependence relations.

Kant thus believes that the capacity to cognitively ascend from mere discriminatory awareness of one’s environment (intuition), to an awareness of those features by means of which one discriminates (perception), and finally to an awareness of the objects which ground these features (experience), depends on the kinds of mental processes of which the subject is capable.

Before turning to the issue of mental processing, which figures centrally in Kant’s overall critical project, there are two further faculties of the mind that are worth discussing: the faculties of judgment and imagination. These faculties are not obviously as fundamental as the faculties of sensibility, understanding, and reason, but they nevertheless play a central role in Kant’s thinking about the structure of the mind and its contributions to our experience of the world.

ii. Imagination and Judgment

Kant links the faculty of imagination closely to sensibility. For example, in his Anthropology he says,

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the power of imagination. The first is the faculty of intuition in the presence of an object, the second is intuition even without the presence of an object. (An 7:153; cf. 7:167; B151; LM 29:881; LM 28:449, 673)

The contrast Kant makes here is not entirely obvious, but includes at least the difference between cases of occurrent sensory experience of a perceived object—seeing the brown table before you—and cases of sensory recollection of a previously perceived object—visually imagining the brown table that was once in front of you. Kant makes this clearer in the process of further distinguishing between different kinds of imagination.

The power of imagination (facultas imaginandi), as a faculty of intuition without the presence of the object, is either productive, that is, a faculty of the original presentation [Darstellung] of the object (exhibitio originaria), which thus precedes experience; or reproductive, a faculty of the derivative presentation of the object (exhibitio derivativa), which brings back to mind an empirical intuition that it had previously (An 7:167).

So, in the operation of productive imagination, one brings to mind a sensory experience that is not itself based on any object previously so experienced. This is not to say the productive imagination is totally creative. Kant explicitly denies (An 7:167) that the productive imagination has the power to generate wholly novel sensory experience. It could not, in a person born blind, produce the phenomenal quality associated with the experience of seeing a red object, for example. If the productive imagination is instrumental in producing sensory fictions, the reproductive imagination is instrumental in producing sensory experiences of previously perceived objects.

Imagination thus plays a central role in empirical cognition by serving as the basis for both memory and the creative arts. In addition, it plays a kind of mediating role between the faculties of sensibility and understanding. Kant calls this mediating role a “transcendental function” of the imagination (A124). It mediates and transcends by being tied in its functioning to both faculties. On one hand, it produces sensible representations, and is thus connected to sensibility. On the other hand, it is not a purely passive faculty but rather engages in the activity of bringing together various representations, as does memory, for example. Kant explicitly connects understanding with this kind of active mental processing.

Kant also goes so far as to claim that the activity of imagination is a necessary part of what makes perception, in his technical sense of a string of connected, conscious sensory experiences, possible (A120, note). Though Kant’s view concerning the exact role of imagination in sensory experience is contested, two points emerge as central. First, Kant believes imagination plays a crucial role in the generation of complex sensory representations of an object (see Sellars (1978) for an influential example of this interpretation). It is imagination that makes it possible to have a sensory experience of a complex, three-dimensional, and geometric figure whose identity remains constant even as it is subject to translations and rotations in space. Second, Kant regards imagination’s mediating role between sensibility and understanding as crucial for at least some kinds of concept application (see Guyer (1987) and Pendlebury (1995) for further discussion). This mediating role involves what Kant calls the “schematization” of a concept and an additional mental faculty, that of judgment.

Kant defines the faculty of judgment as “the capacity to subsume under rules, that is, to distinguish whether something falls under a given rule” (A132/B171). However, he spends comparatively little time discussing this faculty in the first Critique. There, it seems to be discussed as an extension of the understanding in that it applies concepts to empirical objects. It is not until the third Critique—Kant’s 1790 Critique of Judgment—that Kant distinguishes judgment as an independent faculty with a special role. There Kant specifies two different ways it might function (CJ 5:179; cf. CJ (First Introduction) 20:211).

In one, judgment subsumes given objects under concepts, which are themselves already given. This role appears identical to the role he assigns judgment in the Critique of Pure Reason. The basic idea is that judgment functions to assign an intuited object—a dog—to the correct concept—such as domestic animals. This concept is presumed to be one already possessed by the subject. In this activity, the faculty overlaps with the role Kant singles out for imagination in the section of the first Critique entitled ‘On the Schematism of the Pure Concepts of the Understanding.’ Both are conceived of here in terms of the ultimate functioning of understanding, since it is understanding that generates concepts.

The second role for the faculty of judgment, and what seems to make it a distinctive faculty in its own right, is that of finding a concept under which to “subsume” experienced objects. This is called judgment’s “reflecting” role (CJ 5:179). Here, the subject exercises judgment in generating an appropriate concept for what is given by intuition (CJ (First Introduction) 20:211-13; JL 9:94–95; for discussion see Longuenesse (1998), 163–166 and 195–197; Ginsborg (2006)).

In addition to the generation of empirical concepts, Kant also describes reflective judgment as responsible for scientific inquiry. It must sort and classify objects in nature into a hierarchical taxonomy of genus/species relationships. Kant also utilizes the notion of reflective judgment to unify the otherwise seemingly unrelated topics of the Critique of Judgment—aesthetic judgments and teleological judgments concerning the order of nature.

Thus far, the discussion of Kant’s view of the mind has focused primarily on the various mental faculties and their corresponding representational output. Both the faculty of imagination and that of judgment operate on representations given from sensibility and understanding. In general, Kant conceives of the mind’s activity in terms of different methods of “processing” representations.

b. Mental Processing

Kant’s term for mental processing is “combination” [Verbindung], and the form of combination with which he is primarily concerned is what he calls “synthesis.” Kant characterizes synthesis as that activity by which understanding “runs through” and “gathers together” representations given to it by sensibility in order to form concepts, judgments, and ultimately, for any cognition to take place at all (A77-8/B102-3). Synthesis is not something people are typically aware of doing. As Kant says, it is “a blind though indispensable function of the soul…of which we are only seldom even conscious” (A78/B103).

Synthesis is carried out by the unitary subject of representation upon representations either given to the subject by sensibility or produced by the subject through thought. Intellectual synthesis occurs when synthesis is used on representations and forms the content of a concept or judgment. When carried out by the imagination on material provided by sensibility, it is called “figurative” synthesis (B150-1). In the Critique of Pure Reason, Kant is primarily concerned with synthesis performed on representations provided by sensibility, and he discusses three central kinds of synthesis—apprehension, reproduction (or imagination), and recognition (or conceptualization) (A98-110/B159-61). Though Kant discusses these forms of synthesis as if they were discrete types of mental acts, it seems that the first two forms must occur together, while the third only may occur as well (compare Brook (1997); Allais (2009)).

One of the central topics of debate in the interpretation of Kant’s views on synthesis is whether Kant endorses conceptualism. Roughly, conceptualism claims the capacity for conscious sensory experience of the objective world depends, at least in part, on the repertoire of concepts possessed by the experiencing subject, insofar as those concepts are exercised in acts of synthesis by understanding.

Kant typically contrasts synthesis with other ways in which representations might be related, most importantly, by association (for example B139-40). Association is primarily a passive process by which the mind comes to connect representations due to repeated exposure of the subject to certain kinds of regularities. One might, for example, associate thoughts of chicken soup with thoughts of being ill, if one only had chicken soup when one was ill. In contrast, synthesis is a fundamentally active process that depends upon the mind’s spontaneity and is the means by which genuine judgment is possible.

Consider, for example, the difference between the merely associative transition between holding a stone and feeling its weight compared to the judgment that the stone is heavy (B142). The association of holding the stone and feeling its weight is not yet a judgment about the stone, but a kind of involuntary connection between two states of oneself. In contrast, thinking the stone is heavy moves beyond associating two feelings to a thought about how things are objectively, independent of one’s own mental states (Pereboom (1995), Pereboom (2006)). One of Kant’s most important points concerning mental processing is that association cannot explain the possibility of objective judgment. What is required, he says, is a theory of mental processing by an active subject capable of acts of synthesis.

Several of the important differences between synthesis and association can be summarized as follows (Pereboom (1995), 4-7):

The source of synthesis is to be found in a subject, and the subject is distinct from its states.
Synthesis can employ a priori concepts, concepts independent of experience, as modes of processing representations, whereas association never does.
Synthesis is the product of a causally active subject. It is produced by a cause that is realized in the subject’s faculty, either the imagination or the understanding.

Kant’s conception of synthesis and judgment is tied to his conception of “consciousness” [Bewußtsein] and “self-consciousness” [Selbstbewußtsein]. However, both notions require some significant unpacking.

2. Consciousness

The notion of consciousness [Bewußtsein] plays an important role in Kant’s philosophy. There are, however, several different senses of “consciousness” in play in Kant’s work, not all of which line up with contemporary philosophical usage. Below, several of Kant’s most central notions and their differences from and relations to contemporary usage are explained.

a. Phenomenal Consciousness

Philosophical discussions of consciousness typically focus on phenomenal consciousness, or “what it is like” to have a conscious experience of a particular kind, such as seeing the color red or smelling a rose. Such qualitative features of consciousness have been of major concern to philosophers of the late 20th Century. However, the metaphysical issue of phenomenal consciousness is almost entirely ignored by Kant, perhaps because he is unconcerned with problems stemming from commitments to naturalism or physicalism. He seems to attribute all qualitative characteristics of consciousness to sensation and what he calls “feeling” [Gefühl] (CJ 5:206). Kant distinguishes between sensation and feeling in terms of an objective/subjective distinction. Sensations indicate or present features of objects, distinct from the subject. Feelings, by contrast, present only states of the subject to consciousness. Kant’s typical examples of such feelings include pain and pleasure (B66-7; CJ 5:189, 203-6).

Kant clearly assigns a cognitive role to sensation and allows that it is “through sensation” that we cognitively relate to objects given in sensibility (A20/B34). Despite that, he does not focus in any substantive or systematic way on the phenomenal aspects of sensory consciousness, nor does he focus on how exactly they aid in cognition of the empirical world.

b. Discrimination and Differentiation

The central notion of “consciousness” with which Kant is concerned is that of discrimination or differentiation. This is the same conception of consciousness mostly used in Kant’s time, particularly by his major predecessors Gottfried Wilhelm Leibniz (1646–1716) and Christian Wolff (1679-1754), and Kant gives little indication that he departs from their general practice.

According to Kant, any time a subject can discriminate one thing from another, the subject is, or can be, conscious of that one thing (An 7:136-8). Representations which allow for discrimination and differentiation are “clear” [klar]. Representations which allow not only for the differentiation of one thing from others (such as differentiating one person’s face from another’s), but also the differentiation of parts of the thing so discriminated (such as differentiating the different parts of a person’s face) are called “distinct” [deutlich].

Kant does seem to deny the Leibniz-Wolff tradition’s claim that clarity can simply be equated with consciousness (B414-15, note). Primarily, he seems motivated to allow that one’s discriminatory capacities may outrun one’s capacity for memory or even the explicit articulation of that which is discriminated. In such cases, one does not have a fully clear representation.

Kant’s conception of “obscure” [dunkel] representation is that it allows the subject to discriminate differentially between aspects of her environment without any explicit awareness of how she does so. This connects him with the Leibniz-Wolff tradition of recognizing the existence of unconscious representations (An 7:135-7). Kant says the majority of representations that people appeal to in order to explain the complex, discriminatory behaviors of living organisms are “obscure” in a technical sense. Likening the mind to a map, Kant goes so far as to say,

The field of sensuous intuitions and sensations of which we are not conscious, even though we can undoubtedly conclude that we have them; that is, obscure representations in the human being (and thus also in animals), is immense. Clear representations, on the other hand, contain only infinitely few points of this field which lie open to consciousness; so that as it were only a few places on the vast map of our mind are illuminated. (An 7:135)

Thus, we have no direct or non-inferential awareness of obscure representations; they must instead be posited to explain our fine-grained, differential, and discriminatory capacities. They constitute the majority of the mental representations with which the mind busies itself.

Though Kant does not make it explicit in his discussion of discrimination and consciousness, it is clear that he takes the capacity to discriminate between objects and parts of objects to be ultimately based on sensory representation of those objects. His views on consciousness as differential discrimination intersect with his views on phenomenal consciousness. Because humans are receptive through their sensibility, the ultimate basis on which we differentially discriminate between objects must be sensory. Thus, though Kant seems to take for granted the fact that conscious beings are in states with a particular phenomenal character, it must be the clarity and distinctness of this character that allows a conscious subject to differentially discriminate between the various elements of her environment (see Kant’s discussion of aesthetic perfection in the 1800 Jäsche Logic, 9:33-9 for relevant discussion).

c. Self-Consciousness

As the discussion of unconscious representation indicates, Kant believes we are not directly aware of most of our representations. They are nevertheless, to some degree, conscious, because they allow differential discrimination of elements from the subject’s environment. Kant thinks the process of making a representation clear, or fully conscious, requires a higher-order representation of the relevant representation. In other words, it requires that someone can have representations based on representations. As Kant says, “consciousness is really the representation that another representation is in me” (JL 9:33). Because this higher-order representation is one of another representation in the subject, Kant’s position here suggests that consciousness requires at least the capacity for self-consciousness. This position is reinforced by Kant’s famous claim in the Transcendental Deduction of the Critique of Pure Reason:

The I think must be able to accompany all my representations; for otherwise something would be represented in me that could not be thought at all, which is as much as to say that the representation would either be impossible or else at least would be nothing for me. (B131-2; emphasis in the original)

Kant might give the impression here of saying that for representation to be possible for a subject, the subject must possess the capacity for self-ascribing her representations. If so, then representation, and thus the capacity for conscious representation would depend on the capacity for self-consciousness. Because Kant ties the capacity for self-consciousness to spontaneity (B132, 137, 423) and restricts spontaneity to the class of rational beings, the demand for self-ascription would seem to deny that any non-rational animal (for example, dogs, cats, and birds), could have phenomenal or discriminatory consciousness.

However, there is little evidence to show that Kant endorses the self-ascription condition. Instead, he distinguishes between two distinct modes in which one is aware of oneself and one’s representations—inner sense and apperception (see Ameriks (2000) for extensive discussion). Only the latter form of awareness seems to demand a capacity for self-ascription.

i. Inner Sense

Inner sense is, according to Kant, the means by which we are aware of alterations in our own state. Hence all moods, feelings, and sensations, including such basic alterations as pleasure and pain, are the proper subject matter of inner sense. Kant argues that all sensations, feelings, and those representations attributable to a subject must ultimately occur in inner sense and conform to its form—time (A22-3/B37; A34/B51).

Thus, to be aware of something in inner sense is to be minimally, phenomenally conscious, at least in the case of awareness of sensations and feelings. To say a subject is aware of her own states via inner sense is to say that she has a temporally ordered series of mental states, and is phenomenally conscious of each, though she may not be conscious of the series as a whole. This could still count as a kind of self-awareness, as when an animal is aware of being in pain. But it is not an awareness of the subject as a self. Kant himself indicates such a position in a letter to his friend and former student Marcus Herz in 1789.

[Representations] could still (I consider myself as an animal) carry on their play in an orderly fashion, as connected according to empirical laws of association, and thus they could even have influence on my feeling and desire, without my being aware of my own existence [meines Daseins unbewußt] (assuming that I am even conscious of each individual representation, but not of their relation to the unity of representation of their object, by means of the synthetic unity of their apperception). This might be so without my cognizing the slightest thing thereby, not even what my own condition is (C 11:52, May 26, 1789).

Hence, according to Kant, one may be aware of one’s representations via inner sense, but one is not and cannot, through inner sense alone, be aware of oneself as the subject of those representations. That requires what Kant, following Leibniz (1996), calls “apperception”.

ii. Apperception

Kant uses the term “apperception” to denote the capacity for the awareness of some state or modification of one’s self as a state. For one capable of apperception, there is a difference between feeling pain, and thus having an inner sense of it, and apperceiving that one is in pain, and thus ascribing, or being able to ascribe, a certain property or state of mind to one’s self. For example, while a non-apperceptive animal is aware of its own pain and its awareness is partially explanatory of its behavior, such as avoidance, Kant construes the animal as incapable of making any self-attribution of its pain. Kant thinks of such a mind as incapable of construing itself as a subject of states, and it is thus unable to construe itself as persisting through changes of those states. This is not necessarily to say an animal incapable of apperception lacks any subject or self. But, at the very least, such an animal would be incapable of conceiving or representing itself in this way (see Naragon (1990); McLear (2011)).

Kant considers the capacity for apperception as importantly tied to the capacity to represent objects as complexes of properties attributable to a single underlying entity (for example, an apple as a subject of the complex of the properties red and round). Kant’s argument for this connection is notorious both for its complexity and for its obscurity. The next sub-section will give an overview, though not an exhaustive discussion, of some of Kant’s most important points concerning these matters, as they relate to the issue of apperception.

d. Unity of Consciousness and the Categories

In order to better understand Kant’s views on apperception and unity of consciousness, one must step back and look at the wider context of the argument in which he situates these views. One of the core projects of Kant’s most famous work, the Critique of Pure Reason, is to provide an argument for the legitimacy of a priori knowledge of the natural world. Though Kant’s conception of the a priori is complex, Kant shares one central aspect of his view with his German rationalist predecessors (for example Leibniz (1996), preface), that we have knowledge of universal and necessary truths concerning aspects of the empirical world (B4-5). Those truths include one saying every event in the empirical world has a cause (B231). This tradition tended to explain the possession of knowledge of such universal and necessary truths by appeal to innate concepts which could be analyzed to yield the relevant truths. Kant importantly departs from the rationalist tradition, arguing that not all knowledge of universal and necessary truths is acquired via the analysis of concepts (B14-18). Instead, he says there are some “synthetic” a priori truths that are known on the basis of something other than conceptual analysis. Thus, according to Kant, the activity of pure reason achieves relatively little on its own. All of our ampliative knowledge (knowledge that can’t be directly deduced) that is also necessary and universal consists in what Kant calls “synthetic a priori” judgments or propositions. He then pursues the central question: how is knowledge of such synthetic a priori propositions possible?

Kant’s basic answer to the question of synthetic a priori knowledge involves what he calls the “Copernican Turn.” According to the “Copernican Turn,” the objects of human knowledge must “conform” to the basic faculties of human knowledge—the forms of intuition (space and time) and the forms of thought (the categories).

Kant thus engages in a two-part strategy for explaining the possibility of such synthetic a priori knowledge. The first part consists of arguing that the pure forms of intuition provide the basis for our synthetic a priori knowledge of mathematical truths. Mathematical knowledge is synthetic because it goes beyond mere conceptual analysis to deal with the structure of, or our representation of, space itself. It is a priori because the structure of space is accessible to us as it is merely the form of our intuition and not a real mind-independent thing.

In addition to the representation of space and time, Kant also thinks that possession of a particular, privileged set of a priori concepts is necessary for knowledge of the empirical world. But this raises a problem. How can an a priori concept, which is not itself derived from any particular experience, be nevertheless legitimately applicable to objects of experience? Even more difficult, it is not the mere possibility of applying a priori concepts to objects of experience that worries Kant, for this could just be a matter of pure luck. Kant wants more than mere possibility; he wants to show that a privileged set of a priori concepts applies necessarily and universally to all objects of experience and does so in a way that people can know independently of experience.

This brings us to the second part of Kant’s argument, which is directly relevant for understanding Kant’s views on the importance of apperception. Not only must objects of knowledge conform to the forms of intuition, they also must conform to the most basic concepts (or categories) governing our capacity for thought. Kant’s strategy shows how a priori concepts legitimately apply to their objects by being partly constitutive of the objects of representation. This contrasts with the traditional view, according to which the objects of representation were the source or explanatory ground of our concepts (B, xvii-xix). Now, exactly what this means is deeply contested, in part because it is rather unclear what Kant intends by his doctrine of Transcendental Idealism. Does Kant intend that the objects of representation are themselves nothing other than representations? This would be a form of phenomenalism similar to that offered by Berkeley. Kant, however, seems to want to deny that his view is similar to Berkeley’s, asserting instead that the objects of representation exist independently of the mind, and that it is only the way that they are represented that is mind-dependent (A92/B125; compare Pr 4:288-94).

Kant’s strategy for validating the legitimacy of the a priori categories proceeds by way of a “transcendental argument.” It takes the conditions necessary for consciousness of the identity of oneself as the subject of different self-attributed mental states and ties them together with those necessary for grounding the possibility of representing an object distinct from oneself, of which various properties may be predicated. In this sense, Kant argues that the intellectual representations of subject and object stand and fall together. Kant thus denies the possibility of a self-conscious subject who could conceptualize and self-ascribe her representations but whose representations could not represent law-governed objects in space, and thus the material world or ‘nature’ as the subject conceives of it.

Though Kant’s views regarding the unity of the subject are contested, there are several points which can be made fairly clearly. First, Kant conceives of all specific, intellectual activity, including the most basic instances of discursive thought, as requiring what he calls the “original unity of apperception” (B132). This unity, as original, is not itself brought about by some mental act of combining representations, but, as Kant says, is “what makes the concept of combination possible” (B131). It is itself the ground of the “possibility of the understanding” (B131).

Second, the original unity of apperception requires whatever form of self-consciousness characteristically relates to the “I think.” As Kant famously says, “the I think must be able to accompany all my representations” (B131). Moreover, the “I think” essentially involves activity on the part of the subject—it is an expression of the subject’s free activity or “spontaneity” (B132). This means that, according to Kant, only beings capable of spontaneous activity—self-initiated activity that is ultimately traced to causes outside the reach of natural causal laws—are going to be capable of thought in the sense with which Kant is concerned.

Third, and related to the previous point, Kant seems to deny that a subject could attain the kind of representational unity characteristic of thought if her only resources were aggregative methods. Kant makes this point later in the Critique when he says, “representations that are distributed among different beings (for instance, the individual words of a verse) never constitute a whole thought (a verse)” (A 352). William James provides a vivid articulation of the idea: “Take a sentence of a dozen words, and take twelve men and tell to each one word. Then stand the men in a row or jam them in a bunch, and let each think of his word as intently as he will; nowhere will there be a consciousness of the whole sentence” (James (1890), 160). Kant construes consciousness as the “holding-together” of the various components of a thought. He does so in a manner that seems radically opposed to any conception of unitary thought which tries to explain it in terms of some train or succession of its components (Pr 4:304; see Kitcher (2010); Engstrom (2013) for contrasting treatments of this issue).

The exact content of Kant’s argument for the connection between subject and object in the Transcendental Deduction is highly disputed, and it is likely no single reconstruction of the argument can capture all the points Kant supports in the Deduction. At least one strand of Kant’s argument in the first half of the Deduction, sometimes termed his “argument from above,” focuses on Kant’s denial that the unity of the subject and its powers of representational combination could be accounted for by a merely associationist (or Humean) conception of mental combination (see A119; Carl (1989); Pereboom (1995)). Kant argues as follows (see Pereboom (2009)):

(1) I am conscious of the identity of myself as the subject of different self-attributions of mental states.
(2) I am not directly conscious of the identity of this subject of different self-attributions of mental states.
(3) If (1) and (2) are true, then this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states.
(4) Therefore, this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states. (1, 2, 3)
(5) If (4) is true, then my mental states indeed have this particular kind of unity.
(6) This particular kind of unity of my mental states cannot be accounted for by association. (5)
(7) If (6) is true, then this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts.
(8) Therefore, this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts. (6, 7)
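The inferential skeleton of this reconstruction can be checked mechanically. The following is a minimal Lean 4 sketch, not part of Pereboom’s or Kant’s own presentation. The propositional letters (C, D, I, U, A, S) and the theorem name are labels assumed here for illustration, and premise (6) is treated as a premise in its own right rather than as something derived. The sketch shows only that the argument’s form is valid; it says nothing about whether the premises are true.

```lean
-- Assumed labels (hypothetical, introduced only for this sketch):
--   C : I am conscious of my identity as subject of my self-attributions
--   D : I am directly conscious of that identity
--   I : the consciousness of identity is accounted for indirectly, via unity
--   U : my mental states have the relevant kind of unity
--   A : that unity can be accounted for by association
--   S : that unity is accounted for by synthesis by a priori concepts
theorem argumentFromAbove
    (C D I U A S : Prop)
    (h1 : C)            -- premise (1)
    (h2 : ¬D)           -- premise (2)
    (h3 : C ∧ ¬D → I)   -- premise (3)
    (h5 : I → U)        -- premise (5)
    (h6 : ¬A)           -- premise (6), taken as given in this sketch
    (h7 : ¬A → S) :     -- premise (7)
    I ∧ U ∧ S :=
  -- step (4): modus ponens on (3), using (1) and (2)
  have h4 : I := h3 ⟨h1, h2⟩
  -- (5) applied to (4) gives the unity; (7) applied to (6) gives conclusion (8)
  ⟨h4, h5 h4, h7 h6⟩
```

On this rendering the philosophical weight falls on premises (3), (6), and (7), since the remaining steps are simple applications of modus ponens.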

Premise (1) says that I am aware of myself as the subject of different states (or at least able to be so aware). For example, right now I might be hungry as well as sleepy. Previously, I was sleepy and slightly bored. Premise (2) claims I have no immediate or direct awareness of the being which has all of these states. In Kant’s terms, I lack any intuition of the subject of such self-ascribed states, instead having intuition only of the states themselves. Nevertheless, I am aware of all these states as related to a subject (it is I who am bored, hungry, sleepy), and it is in virtue of these connections that I can call one and all of these states mine. Hence, as premise (3) argues, there must be some unity to my mental states which accounts for my (indirect) awareness of their unity. My representations must have some basis on which they go together, and it is the basis for their ‘togetherness’ that explains how I can consider them, one and all, to be mine. Premises (4) and (5) unpack this point, and premise (6) argues that association could not account for such unity (the theory of association was articulated in a particularly influential form by David Hume (Hume (1888), Hume (2007)), and the reader should look to the article on Hume for relevant background discussion).

Kant’s point, in premise (6) of the above argument, is that forces of association acting on mental representations, whether impressions or ideas, cannot account either for the experience of a train of representations as mine or for the “togetherness” of those representations, whether as a single thought or as a series of inferences. Hume argues we have no impression and thus no ensuing idea of an empirical self (Hume (1888), I.iv.6). Kant also accepts this point when he says, “the empirical consciousness that accompanies different representations is by itself dispersed and without relation to the identity of the subject” (B133). By this, Kant means that when we introspect in inner sense, all we ever get are particular mental states, such as boredom, happiness, particular thoughts. We lack any intuition of a subject of those mental states. Hume concludes that the idea of a persisting self which grounds all of these mental states as its subject must be fictitious. Kant disagrees. His contrasting view takes the mineness and togetherness of one’s introspectible mental states as data needing explanation. Because an associative, psychological theory like Hume’s cannot explain these features of first-person consciousness (see Hume (1888), III. Appendix), we need to find another theory, such as Kant’s theory of mental synthesis.

Recall that, prior to the argument of the Transcendental Deduction, Kant links the operations of synthesis to possession of a set of a priori concepts, or categories, not derived from experience. Hence, in arguing that synthesis is required to explain the mineness and togetherness of one’s mental states, and by linking synthesis to the application of the categories, Kant argues we could not have the experience of the mineness and togetherness of our mental states without applying the categories.

While this argument is only half of Kant’s argument in the first part of the Deduction, it shows how tightly Kant took the connection to be between the capacities for spontaneity, synthesis and apperception, and the legitimacy of the categories. The other half, by the way, consists of an “argument from below,” which discerns the conditions necessary for the representation of unitary objects (see Pereboom (1995), (2009)). According to Kant, there is only one possible explanation of one’s apperceptive awareness of one’s psychological states as one’s own and of all states being related to one another. As the subject of such states, one possesses a spontaneous power for synthesizing one’s representations according to general principles or rules, the content of which is given by pure a priori concepts—the categories. The fact that the categories play such a fundamental role in the generation of self-conscious psychological states is thus a powerful argument demonstrating their legitimacy.

Given that Kant leverages certain aspects of our capacity for self-knowledge in his argument for the legitimacy of the categories, the extent to which he argues for radical limits on our capacity for self-knowledge may be surprising. In the final section, Kant’s arguments concerning our capacity for a priori knowledge of the self and its fundamental features will be made clear. First, however, the next section will look at one of the central debates in the interpretation of Kant: the role of concepts in perceptual experience.

3. Concepts and Perception

During the discussion of synthesis above, conceptualism was characterized as claiming there is a dependence relation between a subject having conscious sensory experience of an objective world and the repertoire of concepts possessed by the subject and exercised by her faculty of understanding.

As a first pass at sharpening this formulation, understand conceptualism as a thesis consisting of two claims: (i) sense experience has correctness conditions determined by the ‘content’ of the experience, and (ii) the content of an experience is a structured entity whose components are concepts.

a. Content and Correctness

An important background assumption governing the conceptualism debate construes mental states as related to the world cognitively, as opposed to merely causally, if and only if they possess correctness conditions. That which determines the correctness condition for a state is that state’s content (see Siegel (2010), (2011); Schellenberg (2011)).

Suppose, for example, that an experience E has the following content C:

C: That cup is white.

This content determines a correctness condition V:

V: S’s experience E is correct if and only if the cup visually presented to the subject as the content of the demonstrative is white and the content C corresponds to how things seem to the subject to be visually presented.

Here, the content of the experiential state functions much like the content of a belief state to determine whether the experience, like the belief, is or is not correct.

A state’s possession of content thus determines a correctness condition, through which the state can be construed as mapping, mirroring, or otherwise tracking aspects of the subject’s environment.

There are reasons for questioning whether Kant endorses the content assumption articulated above. Kant seems to deny several claims integral to it. First, in various places he explicitly denies that intuition, or the deliverances of the senses more generally, are the kind of thing which could be correct or incorrect (A293–4/B350; An §11 7:146; compare LL 24:83ff, 103, 720ff, 825ff). Second, Kant’s conception of representational content requires an act of mental unification (Pr 4:304; compare JL §17 9:101; LL 24:928), something which Kant explicitly denies is present in an intuition (B129-30; compare B176-7). This is not to deny that Kant uses a notion of “content,” in some other sense, but rather only that he fails to use it in the sense required by interpretations endorsing the content assumption (see Tolley (2014), (2013)). Finally, Kant’s “modal” condition of cognition, that it provides a demonstration of what is really actual rather than merely logically possible, seems to preclude an endorsement of the content assumption (B, xxvii, note; compare Chignell (2014)). However, for the purposes of understanding the conceptualism debate, assume Kant does endorse the content assumption. The question then is how to understand the nature of the content so understood.

b. Conceptual Content

In addition to the content assumption, conceptualism is defined as committed to a conception of intuition’s content being completely composed of concepts. Against this, Clinton Tolley (Tolley (2013), Tolley (2014)) has argued that the immediacy/mediacy distinction between intuition and concept entails a difference in the content of intuition and concept.

If we understand by ‘content’…a representation’s particular relation to an object…then it is clear that we should conclude that Kant accepts non-conceptual content. This is because Kant accepts that intuitions put us in a representational relation to objects that is distinct in kind from the relation that pertains to concepts. I argued, furthermore, that this is the meaning that Kant himself assigns to the term ‘content’. (Tolley (2013), 128)

Insofar as Kant often speaks of the ‘content’ [Inhalt] of a representation as consisting of a particular kind of relation to an object (Tolley (2013), 112; compare B83, B87), Tolley’s proposal thus gives ground for a simple and straightforward argument for a non-conceptualist reading of Kant. However, it does not necessarily prove that the content of what Kant calls an intuition is not something that would be construed by others as conceptual, in a wider sense of that term. For example, both pure—that, this—and complex demonstrative expressions—that color, this person—have conceptual form, and have been proposed as appropriate for capturing the content of experience (McDowell (1996), ch. 3; for discussion see Heck (2000)). Demonstratives are not, in Kant’s terms, ‘conceptual’ since they do not exhibit the requisite generality which, according to Kant, all conceptual representation must.

c. Conceptualism and Synthesis

If it isn’t textually plausible to understand the content of an intuition in conceptual terms, at least as Kant understands the notion of a concept, then what would it mean to say that Kant endorses conceptualism with regard to experience? The most plausible interpretation, endorsed by a wide variety of interpreters, reads Kant as arguing that the generation of an intuition, whether pure or sensory, depends at least in part on the activity of the understanding. On this way of carving things, conceptualism does not consist in the narrow claim that intuitions have concepts as contents or components. Instead, it consists in the broader claim that the occurrence of an intuition depends at least in part on the discursive activity of understanding. The specific activity of understanding is that which Kant calls ‘synthesis,’ the “running through, and gathering together” of representations (A99).

The conceptualist further argues that taking intuitions as generated via acts of synthesis, which are directed by or otherwise dependent upon conceptual capacities, provides some basis for the claim that whatever correctness conditions might be had by intuition must accord with the conceptual synthesis which generated them. This arguably fits well with Kant’s much quoted claim,

The same function that gives unity to the different representations in a judgment also gives unity to the mere synthesis of different representations in an intuition, which, expressed generally, is called the pure concept of understanding. (A79/B104-5)

The link between intuition, synthesis in accordance with concepts, and relation to an object is made even clearer by Kant’s claim in §17 of the B-edition Transcendental Deduction:

Understanding is, generally speaking, the faculty of cognitions. These consist in the determinate relation of given representations to an object. An object, however, is that in the concept of which the manifold of a given intuition is united. (B137; emphasis in the original)

However else we are to understand this passage, Kant here indicates that the unity of an intuition necessary for it to stand as a cognition of an object requires a synthesis by the concept “object.” In other words, cognition of an object requires that intuition be unified by an act or acts of the understanding.

According to the conceptualist interpretation, one must understand the notion of a representation’s content as a relation to an object, which in turn depends on a conceptually guided synthesis. So we can revise our initial definition of conceptualism to read it as claiming (i) the content of an intuition is a kind of relation to an object, (ii) the relation to an object depends on a synthesis directed in accordance with concepts, and (iii) synthesis in accordance with concepts sets correctness conditions for the intuition’s representation of a mind-independent object.

d. Objections to Conceptualism

At the heart of non-conceptualist readings of Kant stands the denial that mental acts of synthesis carried out by understanding are necessary for the occurrence of cognitive mental states of the type which Kant designates by the term “intuition” [Anschauung]. Though it is controversial as to what might be considered the “natural” or “default” reading of Kant’s mature critical philosophy, there are at least four considerations which lend strong support to a non-conceptualist interpretation of Kant’s mature work.

First, Kant repeatedly and forcefully states that in cognition there is a strict division of cognitive labor—objects are given by sensibility and thought via understanding:

Objects are given to us by means of sensibility, and it alone yields us intuitions; they are thought through the understanding, and from the understanding arise concepts (A19/B33; compare A50/B74, A51/B75–6, A271/B327).

As Robert Hanna has argued, when Kant discusses the dependence of intuition on conceptual judgment in the Analytic of Concepts, he specifically talks about cognition rather than what others would consider to be perceptual experience (Hanna (2005), 265-7).

Second, Kant characterizes the representational capacities characteristic of sensibility as more primitive than those characteristic of understanding, or reason, and he characterizes those capacities as a plausible part of what humans share with the rest of the animal kingdom (Kant connects the possession of a faculty of sensibility to animal nature in various places, for example A546/B574, A802/B830; An 7:196). For example, Kant’s distinction between the faculties of sensibility and understanding seems intended to capture the difference between the “sub-rational” powers of the mind that are shared with non-human animals and the “rational or higher-level cognitive powers” that are special to human beings (Hanna (2005), 249; compare Allais (2009); McLear (2011)).

If one were to deny that, according to Kant, sensibility alone is capable of producing mental states cognitive in character, then it would seem that any animal which lacks a faculty of understanding would thereby lack any capacity for genuinely perceptual experience. The mental lives of non-rational animals would thus, at best, consist of non-cognitive sensory states causally correlated with changes in the animal’s environment. Aside from offering an unappealing and implausible characterization of animals’ cognitive capacities, this reading also faces textual hurdles (for relevant discussion of some of the issues in contemporary cognitive ethology see Bermúdez (2003); Lurz (2009); Andrews (2014), as well as the papers in Lurz (2011)). Kant is on record in various places as saying that animals have sensory representations of their environment (CPJ 5:464; LM 28:449; compare An 7:212), that they have intuitions (LL 24:702), and that they are acquainted with objects though they do not cognize them (JL 9:64–5) (see Naragon (1990); Allais (2009); McLear (2011)).

Hence, if Kant’s position is that synthetic acts carried out by the understanding are necessary for the cognitive standing of a mental state, then Kant is contradicting fundamental elements of his own position in crediting intuitions or their possibility to non-rational animals.

Third, any position which regards perceptual experience as dependent upon acts of synthesis carried out by the understanding would presumably also construe the ‘pure’ intuitions of space and time as dependent upon acts of synthesis (see Longuenesse (1998), ch. 9; Griffith (2012)). However, Kant’s discussion of space, and, analogously, time, in the third and fourth arguments (fourth and fifth in the case of time) of the Metaphysical Exposition of Space in the Transcendental Aesthetic seems incompatible with such a proposed relation of dependence.

Kant’s point in the third and fourth arguments of the Metaphysical Exposition of space and time is that no finite intellect could grasp the extent and nature of space as an infinite whole via a synthetic process involving movement from representation of a part to representation of the whole. If the unity of the forms of intuition were itself something dependent upon intellectual activity, then this unity would necessarily involve the discursive, though not necessarily conceptual, running through and gathering together of a given multiplicity (presumably of different locations or moments) into a combined whole. Kant believes this is characteristic of synthesis generally (A99).

But Kant’s arguments in the Metaphysical Expositions require that the fundamental basis of the representation of space and time does not proceed from a grasp of the multiplicative features of an intuited particular to the whole with those features. Instead, the form of pure intuition constitutes a representational whole that is prior to that of its component parts (compare CJ 5:407-8, 409).

Hence, Kant’s position is that the pure intuitions of space and time possess a unity wholly different from that given by the discursive unity of understanding (whether in conceptual judgment or the intellectual with imaginative synthesis of intuited objects). The unity of aesthetic representation—characterized by forms of space and time—has a structure in which the representational parts depend upon the whole. The unity of discursive representation—representation where the activity of understanding is involved—has a structure in which the representational whole depends upon its parts (see McLear (2015)).

Finally, there has been extensive discussion on the non-conceptuality of intuition in the secondary literature on Kant’s philosophy of mathematics. For example, Michael Friedman has argued that the expressive limitations of prevailing logic in Kant’s time required the postulation of intuition as a form of singular, non-conceptual representation (Friedman (1992), ch. 2; Anderson (2005); Sutherland (2008)). In contrast to Friedman’s view, Charles Parsons and Emily Carson argued that the immediacy of intuition, both pure and empirical, should be construed in a ‘phenomenological’ manner. Space in particular is understood on their interpretation as an original, non-conceptual representation, which Kant takes to be necessary for the demonstration of the real possibility of constructed, mathematical objects as required for geometric knowledge (Parsons (1964); Parsons (1992); Carson (1997); Carson (1999); compare Hanna (2002). For a general overview of related issues in Kant’s philosophy of mathematics, see Shabel (2006) and the works cited therein at p. 107, note 29.)

Ultimately, however, there are difficulties assessing whether Kant’s philosophy of mathematics can have relevance for the conceptualism debate. It is not obvious whether the claim that intuition must be non-conceptual in accounting for mathematical knowledge is incompatible with the claim that intuitions themselves are dependent upon a conceptually-guided synthesis.

The non-conceptualist reading is clearly committed to allowing that sensibility alone provides, perhaps in a very primitive manner, objective representation of the empirical world. Sensibility is construed as an independent cognitive faculty, which humans share with non-rational animals, and which is the jumping-off point for more sophisticated, conceptual representation of empirical reality.

The next and final section looks at Kant’s views regarding the nature and limits of self-knowledge and the ramifications of this for traditional rationalist views of the self.

4. Rational Psychology and Self-Knowledge

Kant discusses the nature and limits of our self-knowledge most extensively in the first Critique, in a section of the Transcendental Dialectic called the “Paralogisms of Pure Reason.” Here, Kant is concerned to criticize the claims of what he calls “rational psychology.” Specifically, he is concerned about the claim that we can have substantive, metaphysical knowledge of the nature of the subject, based purely on an analysis of the concept of the thinking self. As Kant typically puts it:

I think is thus the sole text of rational psychology, from which it is to develop its entire wisdom…because the least empirical predicate would corrupt the rational purity and independence of the science from all experience. (A343/B401)

There are four “Paralogisms.” Each argument is presented as a syllogism, consisting of two premises and a conclusion. According to Kant, each argument is guilty of an equivocation on a term common to the premises, such that the argument is invalid. Kant’s aim, in his discussion of each Paralogism, is to diagnose the equivocation, and explain why the rational psychologist’s argument ultimately fails. In so doing, Kant provides a great deal of information about his own views concerning the mind (see Ameriks (2000) for extensive discussion). The argument of the first Paralogism concerns knowledge of the self as substance; the second, the simplicity of the self; the third, the numerical identity of the self; and the fourth, knowledge of the self versus knowledge of things in space.

a. Substantiality (A348-51/B410-11)

Kant presents the rationalist’s argument in the First Paralogism as follows:

What cannot be thought otherwise than as subject does not exist otherwise than as subject, and is therefore substance.
Now a thinking being, considered merely as such, cannot be thought of as other than a subject.
Therefore, a thinking being also exists only as such a thing, i.e., as substance.

Kant’s presentation of the argument is rather compressed. In more explicit form we can put it as follows (see Proops (2010)):

(1) All entities that cannot be thought otherwise than as subjects are entities that cannot exist otherwise than as subjects, and therefore are substances. (All M are P)
(2) All entities that are thinking beings are entities that cannot be thought otherwise than as subjects. (All S are M)
(3) Therefore, all entities that are thinking beings are entities that cannot exist otherwise than as subjects, and therefore are substances. (All S are P)

The relevant equivocation is in the term that occupies the ‘M’ place in the argument: “entities that cannot be thought otherwise than as subjects.” Kant specifically locates the ambiguity in the use of the term “thought” [Das Denken], which in the first premise, he claims, concerns an object in general, and thus something that could be given in a possible intuition. In the second premise, the use of “thought” is supposed to apply only to a feature of thought and, thus, not to an object of a possible intuition (B411-12).

While it isn’t obvious what Kant means by this claim, it could be that Kant takes the first premise to make a claim about the objects of thought: that they exist as independent subjects or bearers of properties and cannot be conceived of as anything else. This is thus a metaphysical claim about what kinds of objects could really exist, which explains Kant’s reference to an “object in general” that could be given in intuition.

In contrast, premise (2) makes a merely logical claim concerning the role of the representation ‘I’ in a possible judgment. Kant says one cannot use this representation in any position other than that of the subject. For example, while I can make the claim “I am tall,” it would make no sense to claim “the tall is I.”
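The structure of the equivocation described above can be made explicit in a short Lean 4 sketch. It is an illustration under stated assumptions, not Kant’s or Proops’s own formalization: the predicate names S, M, M₁, M₂, P and both theorem names are hypothetical labels, with M₁ standing for the metaphysical reading of the middle term used in the first premise and M₂ for the merely logical reading used in the second.

```lean
-- The valid form (Barbara): All M are P, all S are M, so all S are P.
theorem barbara {α : Type} (S M P : α → Prop)
    (major : ∀ x, M x → P x)   -- All M are P
    (minor : ∀ x, S x → M x) : -- All S are M
    ∀ x, S x → P x :=
  fun x hS => major x (minor x hS)

-- With the middle term disambiguated, the two premises no longer share
-- a term. In this sketch the conclusion is reached only by adding a
-- bridging premise that takes us from the logical reading (M₂) to the
-- metaphysical reading (M₁).
theorem paralogismWithBridge {α : Type} (S M₁ M₂ P : α → Prop)
    (major  : ∀ x, M₁ x → P x)    -- premise (1), metaphysical reading
    (minor  : ∀ x, S x → M₂ x)    -- premise (2), logical reading
    (bridge : ∀ x, M₂ x → M₁ x) : -- extra assumption, stated in neither premise
    ∀ x, S x → P x :=
  fun x hS => major x (bridge x (minor x hS))
```

The second theorem shows what the syllogism would need in order to go through once its middle term is read in two different ways: a step from the logical role of a representation to the metaphysical status of what it represents.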

Against the rational psychologist, Kant argues that one cannot make any legitimate inference from the conditions under which the representation ‘I’ may be thought, or employed in a judgment, to the status of the ‘I’ as a metaphysical subject of properties. Kant makes this point explicit when he says,

The first syllogism of transcendental psychology imposes on us an only allegedly new insight when it passes off the constant logical subject of thinking as the cognition of a real subject of inherence, with which we do not and cannot have the least acquaintance, because consciousness is the one single thing that makes all representations into thoughts, and in which, therefore, as in the transcendental subject, our perceptions must be encountered; and apart from this logical significance of the I, we have no acquaintance with the subject in itself that grounds this I as a substratum, just as it grounds all thoughts. (A350)

Kant thus argues that one should differentiate between different conceptions of “substance” and the role they play in thoughts concerning the world.

Substance₀:

x is a substance₀ if and only if the representation of x cannot be used as a predicate in a categorical judgment

Substance₁:

x is a substance₁ if and only if its existence is such that it can never inhere, or exist, in anything else (B288, 407)

The first conception of substance is merely logical or grammatical. The second conception is explicitly metaphysical. Finally, there is an even more metaphysically demanding usage of “substance” that Kant employs.

(Empirical) Substance₂:

x is a substance₂ if and only if it is a substance₁ that persists at every moment (A144/B183, A182)

According to Kant, the rational psychologist attempts to move from claims about substance₀ to the more robustly metaphysical claims characteristic of conceptions and uses of substance₁ and substance₂. However, without further substantive assumptions, which go beyond anything given in an analysis of the first-person concept, no legitimate inference can be made from our notion of a substance₀ to either of the other conceptions of substance.

Because Kant denies that humans have any intuition, empirical or otherwise, of themselves as subjects, they cannot come to have any knowledge of whether they are substances in the sense of either substance₁ or substance₂. At least they cannot do so by reflecting on the conditions of thinking of themselves using first-person concepts. No amount of introspection or reflection on the content of the first-person concept will yield such knowledge.

b. Simplicity (A351-61/B407-8)

Kant’s discussion of the proposed metaphysical simplicity of the subject largely depends on points he made in the previous Paralogism concerning its proposed substantiality. Kant articulates the Second Paralogism as follows:

The subject, whose action can never be regarded as the concurrence of many acting things, is simple. (All A is B)
The self is such a subject. (C is A)
Therefore, the self is simple. (C is B)

Here, the equivocation concerns the notion of a “subject.” Kant’s point, as with the previous Paralogism, is that, from the fact that one’s first-person representation of the self is always a grammatical or logical subject, nothing follows concerning the metaphysical status of that representation’s referent.

Of perhaps greater interest in this discussion of the Paralogism of simplicity is Kant’s analysis of what he calls the “Achilles of all dialectical inferences” (A351). According to the Achilles argument, the soul or mind is known to be a simple, unitary substance, because only such a substance could think unitary thoughts. Called the “unity claim” (see Brook (1997)), it says:

(UC):

If a multiplicity of representations are to form a single representation, they must be contained in the absolute unity of the thinking substance. (A352)

Against UC, Kant argues that there is no reason to think the structure of a thought, as a complex of representations, isn’t mirrored in the complex structure of an entity that thinks thoughts. UC is not analytic, which is to say that there is no contradiction entailed by its negation. UC also fails to be a synthetic a priori claim, in that it follows neither from the nature of intuition’s forms, nor from categories. Hence, UC could only be shown to be true empirically, and because people have no empirical intuition of the self, people have no basis for thinking that UC must be true (A353).

Kant here makes a point similar to one found in contemporary functionalist accounts of the mind (see Meerbote (1991); Brook (1997)). Mental functions, including the unity of conscious thought, are consistent with a variety of different media in which those functions are realized. Kant says there is no contradiction in thinking that a plurality of substances might succeed in generating a single, unified thought. Hence, we cannot know that the mind is such that it must be simple in nature.

c. Numerical Identity (A361-66/B408)

Kant articulates the Third Paralogism as follows:

What is conscious of the numerical identity of its Self in different times, is, to that extent, a person. (All C is P)
Now, the soul is conscious of the numerical identity of its Self in different times. (S is C)
Therefore, the soul is a person. (S is P)

Rational psychologists’ interest in establishing the personality of the soul or mind stems from the importance of proving that not only would the mind persist after the destruction of its body, but also that this mind would be the same person as before, not just some sort of bare consciousness or worse (for example, existing only as a “bare monad”).

Kant here makes two main points. First, the rational psychologist cannot infer from the sameness of the first-person representation (the “I think”) across applications of it in judgment to any conclusion concerning the sameness of the metaphysical subject referred to by that representation. Kant thus again makes a functionalist point. The medium in which a series of representational states inheres may change over time, and there is no contradiction in conceiving of a series of representations as being transferred from one substance to another (A363-4, note).

Second, Kant argues that we can be confident of the soul’s possession of personality by virtue of apperception’s persistence. The relevant notion of “personality” here distinguishes between a rational being and an animal. While the persistence of apperception (the persistence of the “I think” as being able to attach to all of one’s representations) does not provide an apperceiving subject with any insight into the true metaphysical nature of the mind, it does provide evidence of the soul’s possession of an understanding. Animals, by contrast, do not possess an understanding but, at best, according to Kant, only an analogue thereof. As Kant says in the Anthropology,

That man can have the I among his representations elevates him infinitely above all other living beings on earth. He is thereby a person […] that is, by rank and worth a completely distinct being from things that are the same as reason-less animals with which one can do as one pleases. (An 7:127, §1)

Hence, so long as a soul possesses the capacity for apperception, it will signal the possession of an understanding, and thus serves to distinguish the human soul from that of an animal (see Dyck (2010), 120).

d. Relation to Objects in Space (A366-80/B409)

Finally, the Fourth Paralogism concerns the relation between awareness of one’s own mind and one’s awareness of other objects distinct from oneself. Thus, it also deals with one’s mind and awareness of space. Kant describes the Fourth Paralogism as follows:

What can be only causally inferred is never certain. (All I is not C)
The existence of outer objects can only be causally inferred, not immediately perceived by us. (O is I)
Therefore, we can never be certain of the existence of outer objects. (O is not C)

Kant locates the damaging ambiguity in the conception of “outer” objects. This is puzzling because it doesn’t play the relevant role as middle term in the syllogism. But Kant is quite clear that this is where the ambiguity lies and distinguishes between two distinct senses of the “outer” or “external”:

Transcendentally Outer/External:

A separate existence, in and of itself.

Empirically Outer/External:

An existence in space.

Kant’s point here is that all appearances in space are empirically external to the subject who perceives or thinks about them, while nevertheless being transcendentally internal. Such spatial appearances do not have an entirely independent metaphysical nature, because their spatial features depend at least in part on our forms of intuition.

Kant then uses this distinction not only to argue against the assumption of the rational psychologist that the mind is better known than any object in space (famously argued by Descartes), but also against those forms of external world skepticism championed by Descartes and Berkeley. Kant identifies Berkeley with what he calls “dogmatic idealism” and Descartes with what he calls “problematic idealism” (A377). He defines them thus:

Problematic Idealism:

We cannot be certain of the existence of any material body.

Dogmatic Idealism:

We can be certain that no material body exists – the notion of a body is self-contradictory.

Kant brings two arguments to bear against the rational psychologist’s assumption about the immediacy of our self-knowledge, as well as these two forms of skepticism, with mixed results. The two arguments are from “immediacy” and “imagination.”

i. The Immediacy Argument

In an extended passage in the Fourth Paralogism (A370-1) Kant makes the following argument:

External objects (bodies) are merely appearances, hence also nothing other than a species of my representations, whose objects are something only through these representations, but are nothing separated from them. Thus external things exist as well as my self, and indeed both exist on the immediate testimony of my self-consciousness, only with this difference: the representation of my Self, as the thinking subject, is related merely to inner sense, but the representations that designate extended beings are also related to outer sense. I am no more necessitated to draw inferences in respect of the reality of external objects than I am in regard to the reality of the objects of my inner sense (my thoughts), for in both cases they are nothing but representations, the immediate perception (consciousness) of which is at the same time a sufficient proof of their reality. (A370-1)

It helps to understand the argument as follows:

Rational Psychology (RP) privileges awareness of the subject and its states over awareness of non-subjective states.
But, transcendental idealism entails that people are aware of both subjective and objective states, as they appear, in the same way—via a form of intuition.
So, either both kinds of awareness are immediate or they are both mediate.
Because awareness of subjective states is obviously immediate, awareness of objective states must also be immediate.
Therefore, we are immediately aware of the states or properties of physical objects.

Here, Kant displays what he takes to be an advantage of Transcendental Idealism. Because both inner and outer sense depend on intuition, there is nothing special about inner intuition that privileges it over outer intuition. Both are, as intuitions, immediate presentations of objects, at least as they appear. Unfortunately, Kant never makes clear what he means by the term “immediate” [unmittelbar]. This issue is much contested (see Smit (2000)). At the very least, he means to signal that awareness in intuition is not mediated by any explicit or conscious inference, as when he says that the transcendental idealist “grants to matter, as appearance, a reality which need not be inferred, but is immediately perceived” (A371).

It is not obvious that an external world skeptic would find this argument convincing, as part of the grip of such skepticism relies on the convincing point that things could seem to one just as they currently are, even if there really is no external world causing one’s experiences. This may just beg the question against Kant (particularly premise (2) of the above argument). Certainly, Kant seems to think that his arguments for the existence of pure intuitions of space and time in the Transcendental Aesthetic lend some weight to his position. Thus, Kant is not so much arguing for Transcendental Idealism here as explaining some of the further benefits that come when the position is adopted. He does, however, present at least one further argument against the skeptical objection articulated above—the argument from imagination.

ii. The Argument from Imagination

Kant’s attempt to respond to the skeptical worry that things might appear to be outside us while not actually existing outside us appeals to the role imagination would have to play to make such a possibility plausible (A373-4; compare Anthropology, 7:167-8).

This material or real entity, however, this Something that is to be intuited in space, necessarily presupposes perception, and it cannot be invented by any power of imagination or produced independently of perception, which indicates the reality of something in space. Thus sensation is that which designates a reality in space and time, according to whether it is related to the one or the other mode of sensible intuition.

What follows is a reconstruction of this argument.

If problematic idealism is correct, then it is possible for one to have never perceived any spatial object but only to have imagined doing so.
But imagination cannot fabricate—it can only re-fabricate.
So, if one has sensory experience of outer spatial objects, then one must have had at least one successful perception of an external spatial object.
Therefore, it is certain that an extended spatial world exists.

Kant’s idea here is that the imagination is too limited to generate the various qualities that people experience as instantiated in external physical objects. Hence, it would not be possible to simply imagine an external physical world without having been originally exposed to the qualities instantiated in the physical world. Ergo, the physical world must exist. Even Descartes seems to agree with this, noting in Meditation I that “[certain simple kinds of qualities] are as it were the real colours from which we form all the images of things, whether true or false, that occur in our thought” (Descartes (1984), 13-14). Though Descartes goes on to doubt our capacity to know even such basic qualities given the possible existence of an evil deceiver, it is notable that the deceiver must be something other than ourselves, in order to account for all the richness and variety of what we experience (however, see Meditation VI (Descartes (1984), 54), where Descartes wonders whether there could be some hidden faculty in ourselves producing all of our ideas).

Unfortunately, it isn’t clear that the argument from imagination gets Kant the conclusion he wants, for all that it shows is that there was at one time a physical world, which affected one’s senses and provided the material for one’s sense experiences. This might be enough to show that one has not always been radically deceived, but it is not enough to show that one is not currently being radically deceived. Even worse, it isn’t even clear that a physical world must exist to generate the requisite material for the imagination. Perhaps all that is needed is something distinct from the subject, something which is capable of generating in it the requisite sensory experiences, whether or not they are veridical. This conclusion is thus compatible with that “something” being Descartes’s evil demon, or in contemporary epistemology, with the subject’s being a brain in a vat. Hence, it is not obvious that Kant’s argument succeeds in refuting the skeptic. Even to the extent that it does refute the skeptic, the argument still does not show that there is a physical world, as opposed merely to the existence of something distinct from the subject.

e. Lessons of the Paralogisms

Beyond the specific arguments of the Paralogisms and their conclusions, they present us with two central tenets of Kant’s conception of the mind. First, we cannot move from claims concerning the character or role of the first-person representation to claims concerning the nature of the referent of that representation. This is a key part of his criticism of rational psychology. Second, people do not have privileged access to themselves as compared with things outside them. Both the self (or its states) and external objects are on par with respect to intuition. This also means that they only have access to themselves as they appear, and not as they fundamentally, metaphysically, are (compare B157). Hence, according to Kant, self-awareness, just as much as awareness of anything distinct from the self, is conditioned by sensibility. Intellectual access to selves in apperception, Kant argues, does not reveal anything about one’s metaphysical nature, in the sense of the kind of thing that must exist to realize the various cognitive powers that Kant describes as characteristic of a being capable of apperception—a spontaneous understanding or intellect.

5. Summary

Kant’s conception of the mind, his distinction between sensory and intellectual faculties, his functionalism, his conception of mental content, and his work on the nature of the subject/object distinction, were all hugely influential. His work immediately inspired the German Idealist movement. He also became central to emerging ideas concerning the epistemology of science in the late 19th and early 20th centuries, in what became known as the “Neo-Kantian” movement in central and southern Germany. Though Anglophone interest in Kant ebbed somewhat in the early 20th century, his conception of the mind and criticisms of rationalist psychology were again influential mid-century via the work of “analytic” Kantians such as P.F. Strawson, Jonathan Bennett, and Wilfrid Sellars. In the early 21st century Kant’s work on the mind remains a touchstone for philosophical investigation, especially in the work of those influenced by Strawson or Sellars, such as Quassim Cassam, John McDowell, and Christopher Peacocke.

6. References and Further Reading

Quotations from Kant’s work are from the German edition of Kant’s works, the Akademie Ausgabe, with the first Critique cited by the standard A/B edition pagination, and the other works by volume and page. English translations belong to the author of this article, though he has regularly consulted, and in most cases closely followed, translations from the Cambridge Editions. Specific texts are abbreviated as follows:

An: Anthropology from a Pragmatic Point of View
C: Correspondence
CPR: Critique of Pure Reason
CJ: Critique of Judgment
JL: Jäsche Logic
LA: Lectures on Anthropology
LL: Lectures on Logic
LM: Lectures on Metaphysics
Pr: Prolegomena to any Future Metaphysics

a. Kant’s Works in English

The most used scholarly English translations of Kant’s work are published by Cambridge University Press as the Cambridge Editions of the Works of Immanuel Kant. The following are from that collection and contain some of Kant’s most important and influential writings.

Correspondence, ed. Arnulf Zweig. Cambridge: Cambridge University Press, 1999.
Critique of Pure Reason, trans. Paul Guyer and Allen Wood. Cambridge: Cambridge University Press, 1998.
Critique of the Power of Judgment, trans. Paul Guyer and Eric Matthews. Cambridge: Cambridge University Press, 2000.
History, Anthropology, and Education, eds. Günter Zöller and Robert Louden. Cambridge: Cambridge University Press, 2007.
Lectures on Anthropology, ed. and trans. Allen W. Wood and Robert B. Louden. Cambridge: Cambridge University Press, 2012.
Lectures on Logic, trans. J. Michael Young. Cambridge: Cambridge University Press, 1992.
Lectures on Metaphysics, ed. and trans. Karl Ameriks and Steve Naragon. Cambridge: Cambridge University Press, 2001.
Practical Philosophy, ed. Mary Gregor. Cambridge: Cambridge University Press, 1996.
Theoretical Philosophy 1755-1770, ed. David Walford. Cambridge: Cambridge University Press, 2002.
Theoretical Philosophy after 1781, eds. Henry Allison and Peter Heath. Cambridge: Cambridge University Press, 2002.

b. Secondary Sources

Allais, Lucy. 2009. “Kant, Non-Conceptual Content and the Representation of Space.” Journal of the History of Philosophy 47 (3): 383–413.
Allison, Henry E. 2004. Kant’s Transcendental Idealism: Revised and Enlarged. New Haven: Yale University Press.
Ameriks, Karl. 2000. Kant and the Fate of Autonomy: Problems in the Appropriation of the Critical Philosophy. Cambridge: Cambridge University Press.
Anderson, R Lanier. 2005. “Neo-Kantianism and the Roots of Anti-Psychologism.” British Journal for the History of Philosophy 13 (2): 287–323.
Andrews, Kristin. 2014. The Animal Mind: An Introduction to the Philosophy of Animal Cognition. London: Routledge.
Bennett, Jonathan. 1966. Kant’s Analytic. Cambridge: Cambridge University Press.
Bennett, Jonathan. 1974. Kant’s Dialectic. Cambridge: Cambridge University Press.
Bermúdez, José Luis. 2003. “Ascribing Thoughts to Non-Linguistic Creatures.” Facta Philosophica 5 (2): 313–34.
Brook, Andrew. 1997. Kant and the Mind. Cambridge: Cambridge University Press.
Buroker, Jill Vance. 2006. Kant’s Critique of Pure Reason: An Introduction. Cambridge: Cambridge University Press.
Carl, Wolfgang. 1989. “Kant’s First Drafts of the Deduction of the Categories.” In Kant’s Transcendental Deductions, edited by Eckart Förster, 3–20. Stanford: Stanford University Press.
Carson, Emily. 1997. “Kant on Intuition and Geometry.” Canadian Journal of Philosophy 27 (4): 489–512.
Carson, Emily. 1999. “Kant on the Method of Mathematics.” Journal of the History of Philosophy 37 (4): 629–52.
Caygill, Howard. 1995. A Kant Dictionary. Vol. 121. London: Blackwell.
Chignell, Andrew. 2014. “Modal Motivations for Noumenal Ignorance: Knowledge, Cognition, and Coherence.” Kant-Studien 105 (4): 573–97.
Descartes, Rene. 1984. The Philosophical Writings of Descartes. Edited by John Cottingham, Robert Stoothoff, and Dugald Murdoch. Vol. 2. Cambridge: Cambridge University Press.
Dicker, Georges. 2004. Kant’s Theory of Knowledge: An Analytical Introduction. Oxford: Oxford University Press.
Dyck, Corey W. 2010. “The Aeneas Argument: Personality and Immortality in Kant’s Third Paralogism.” In Kant Yearbook, edited by Dietmar Heidemann, 95–122.
Engstrom, Stephen. 2013. “Unity of Apperception.” Studi Kantiani 26: 37–54.
Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge, MA: Harvard University Press.
Gardner, Sebastian. 1999. Kant and the Critique of Pure Reason. London: Routledge.
Ginsborg, Hannah. 2006. “Kant and the Problem of Experience.” Philosophical Topics 34 (1 and 2): 59–106.
Griffith, Aaron M. 2012. “Perception and the Categories: A Conceptualist Reading of Kant’s Critique of Pure Reason.” European Journal of Philosophy 20 (2): 193–222.
Grüne, Stefanie. 2009. Blinde Anschauung. Vittorio Klostermann.
Guyer, Paul. 1987. Kant and the Claims of Knowledge. Cambridge: Cambridge University Press.
Guyer, Paul. 2014. Kant. London: Routledge.
Hanna, Robert. 2002. “Mathematics for Humans: Kant’s Philosophy of Arithmetic Revisited.” European Journal of Philosophy 10 (3): 328–52.
Hanna, Robert. 2005. “Kant and Nonconceptual Content.” European Journal of Philosophy 13 (2): 247–90.
Heck, Richard G. 2000. “Nonconceptual Content and the ‘Space of Reasons’.” The Philosophical Review 109 (4): 483–523.
Hume, David. 1888. A Treatise of Human Nature. Edited by L A Selby-Bigge. Oxford: Clarendon Press.
Hume, David. 2007. An Enquiry Concerning Human Understanding. Edited by Peter Millican. Oxford: Oxford University Press.
James, William. 1890. The Principles of Psychology. New York: Holt.
Keller, Pierre. 1998. Kant and the Demands of Self-Consciousness. Cambridge: Cambridge University Press.
Kitcher, Patricia. 1993. Kant’s Transcendental Psychology. New York: Oxford University Press.
Kitcher, Patricia. 2010. Kant’s Thinker. New York: Oxford University Press.
Leibniz, Gottfried Wilhelm Freiherr. 1996. New Essays on Human Understanding. Edited by Jonathan Bennett and Peter Remnant. Cambridge: Cambridge University Press.
Longuenesse, Béatrice. 1998. Kant and the Capacity to Judge. Princeton: Princeton University Press.
Lurz, Robert W. 2011. Mindreading Animals: The Debate over What Animals Know About Other Minds. Cambridge, MA: MIT Press.
Lurz, Robert W., ed. 2009. The Philosophy of Animal Minds. Cambridge: Cambridge University Press.
McDowell, John. 1996. Mind and World: With a New Introduction. Cambridge, MA: Harvard University Press.
McLear, Colin. 2011. “Kant on Animal Consciousness.” Philosophers’ Imprint 11 (15): 1–16.
McLear, Colin. 2015. “Two Kinds of Unity in the Critique of Pure Reason.” Journal of the History of Philosophy 53 (1): 79–110.
Meerbote, Ralf. 1991. “Kant’s Functionalism.” In Historical Foundations of Cognitive Science, edited by J-C Smith, 161–87. Dordrecht: Kluwer Academic Publishers.
Naragon, Steve. 1990. “Kant on Descartes and the Brutes.” Kant-Studien 81 (1): 1–23.
Parsons, Charles. 1964. “Infinity and Kant’s Conception of the ‘Possibility of Experience’.” The Philosophical Review 73 (2): 182–97.
Parsons, Charles. 1992. “The Transcendental Aesthetic.” In The Cambridge Companion to Kant, edited by Paul Guyer, 62–100. Cambridge: Cambridge University Press.
Paton, H J. 1936. Kant’s Metaphysic of Experience. Vol. 1 & 2. London: G. Allen & Unwin, Ltd.
Pendlebury, Michael. 1995. “Making Sense of Kant’s Schematism.” Philosophy and Phenomenological Research 55 (4): 777–97.
Pereboom, Derk. 1995. “Determinism Al Dente.” Noûs 29: 21–45.
Pereboom, Derk. 2006. “Kant’s Metaphysical and Transcendental Deductions.” In A Companion to Kant, edited by Graham Bird, 154–68. Blackwell Publishing.
Pereboom, Derk. 2009. “Kant’s Transcendental Arguments.” Stanford Encyclopedia of Philosophy.
Proops, Ian. 2010. “Kant’s First Paralogism.” The Philosophical Review 119 (4): 449.
Schellenberg, S. 2011. “Perceptual Content Defended.” Noûs 45 (4): 714–50.
Sellars, Wilfrid. 1956. “Empiricism and the Philosophy of Mind.” Minnesota Studies in the Philosophy of Science 1: 253–329.
Sellars, Wilfrid. 1968. Science and Metaphysics: Variations on Kantian Themes. London: Routledge & Kegan Paul.
Sellars, Wilfrid. 1978. “Berkeley and Descartes: Reflections on the Theory of Ideas.” In Studies in Perception, edited by P K Machamer and R G Turnbull, 259–311. Columbus: Ohio University Press.
Shabel, Lisa. 2006. “Kant’s Philosophy of Mathematics.” In The Cambridge Companion to Kant and the Critique of Pure Reason, edited by Paul Guyer, 94–128.
Siegel, Susanna. 2010. “The Contents of Perception.” In The Stanford Encyclopedia of Philosophy, edited by Edward N Zalta.
Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
Smit, Houston. 2000. “Kant on Marks and the Immediacy of Intuition.” The Philosophical Review 109 (2): 235–66.
Strawson, Peter Frederick. 1966. The Bounds of Sense. London: Routledge.
Strawson, Peter Frederick. 1970. “Imagination and Perception.” In Experience and Theory, edited by Lawrence Foster and Joe William Swanson. Amherst: University of Massachusetts Press.
Sutherland, Daniel. 2008. “Arithmetic from Kant to Frege: Numbers, Pure Units, and the Limits of Conceptual Representation.” In Kant and Philosophy of Science Today, edited by Michela Massimi, 135–64. Cambridge: Cambridge University Press.
Tolley, Clinton. 2013. “The Non-Conceptuality of the Content of Intuitions: A New Approach.” Kantian Review 18 (01): 107–36.
Tolley, Clinton. 2014. “Kant on the Content of Cognition.” European Journal of Philosophy 22 (2): 200–228.
Van Cleve, James. 1999. Problems from Kant. Oxford: Oxford University Press.
Wood, Allen W. 2005. Kant. Oxford: Blackwell Publishing.

Author Information

Colin McLear
Email: mclear@unl.edu
University of Nebraska
U. S. A.

Lucius Annaeus Seneca (c. 4 B.C.E.—65 C.E.)

The ancient Roman philosopher Seneca was a Stoic who adopted and argued largely from within the framework he inherited from his Stoic predecessors. His Letters to Lucilius have long been widely read Stoic texts. Seneca’s texts have many aims: he writes to exhort readers to philosophy, to encourage them to continue study, to articulate his philosophical position, to defend Stoicism against opponents, to portray a philosophical life, and much more. Seneca also writes to criticize the social practices and values of his fellow Romans. He rejects and criticizes, among other things, the idea that death is an evil, that wealth is a good, that political power is valuable, that anger is justified. In Seneca’s philosophical texts, one finds a Stoic who attempts to live in accordance with the conclusions he reaches through philosophy. Though Seneca admits to falling short of this goal personally, his efforts have long been one of the attractions (though some have found these to be distractions) of his philosophical works.

Lucius Annaeus Seneca was born in Cordoba during the reign of Augustus. Because of his birth to a provincial nobleman of low rank, Seneca was quite removed from the workings of the powerful Roman elite, yet the course of his life would come to be shaped by his relationships—sometimes inimical, sometimes friendly—with the early Julio-Claudian Emperors. He was exiled by Claudius and then recalled. He was friend and tutor to Nero. This relationship itself eventually soured, and Seneca, under orders from Nero, committed suicide in 65 C.E.

Someone familiar with Seneca exclusively as a philosopher is likely to be shocked by the details of his personal life. How, one may wonder, should Seneca’s argument that poverty is not an evil be understood in light of the fact that Seneca was one of the wealthiest men in the world? And how should Seneca’s commitment to and claims about the value of living philosophically be understood in light of the fact that Seneca’s own life was riddled with controversy and intrigue? On the other hand, one familiar with Seneca’s life may well meet with wonder the philosophical positions to be found in his philosophical works. How, one may ask, could the person who had positioned himself as the advisor to the young and impressionable (ex hypothesi) Princeps of Rome be the same person who upholds the private life as superior to the public? How could a man whose life story seems impossible for any but the most flexible character be the author of texts upholding the value of integrity and self-mastery as against mastery by one’s circumstances? These and many other questions make a clear view of Seneca difficult. This article attempts to provide a general sense of Seneca’s life and works that can serve as a starting point for understanding Seneca’s legacy. The aim here is primarily to bring the difficulties into view, rather than to resolve them.

Life, Political Career, and Death
Works and Thought
References and Further Reading
1. Texts and Translations
2. Secondary Literature

1. Life, Political Career, and Death

Although the general outline of Seneca’s life is known, that many details remain unknown is surprising given both Seneca’s fame during his lifetime and the volume of his writing. On many points of detail about his life, scholars must take into consideration the available sources, some of which are from centuries after Seneca’s death and others of which are hostile to his writings, and reconstruct a plausible account. Seneca’s birth is one of many such examples. Seneca was born in Cordoba, Spain. His father, Seneca the Elder, was a member of the Roman nobility whose family had immigrated to Spain. Seneca spent his earliest years with his mother Helvia at the family estates in Cordoba while his father was away in Rome. We do not know with certainty the year of Seneca’s birth, but the evidence from Seneca’s scant references to his own life suggests that he was born no earlier than 8 B.C.E. and no later than 1 B.C.E. Though some uncertainty is inescapable unless new evidence is discovered, the most common estimate for his birth is 4 B.C.E.

Seneca’s father, also Lucius Annaeus Seneca (the Elder), was a Roman nobleman of the equestrian class. The Elder’s enthusiasm for Roman politics and his enthusiasm for his two older sons’ potential in Roman society are plain in his Controversiae. Also plain is his insistence that the path for his middle son, our Seneca, was to be the normal cursus honorum (course of offices) and not the life of philosophical study. Seneca the Younger thus came to Rome very early, likely by age 5, to begin his training for Roman public life. Seneca’s early education is likely to have been typical of Roman elites at the time—focusing on language (both Greek and Latin) and traditional texts. Though his father would have been eligible for certain Roman offices, he seems instead to have devoted himself to forwarding the careers of his two oldest sons, Annaeus Novatus (later named Gallio upon adoption by L. Junius Gallio) and our Seneca. The Elder Seneca did not pressure his youngest son, Marcus Annaeus Mela, eventual father of Lucan, to pursue a political career.

Little is known with certainty about Seneca’s early life, particularly his personal life. Seneca presents himself in his philosophical works in a way that conceals personal details; in some cases, however, those he gives can provide helpful insight. His references, for example, to his former teachers—Attalus the Stoic, Fabianus the Sextian, and others—give some indication of his advanced training in philosophy and rhetoric. Scholars have found these references to his training, though sparse, crucial for understanding Seneca’s particular philosophical approach. Seneca does not, however, say enough about his personal experiences in Rome to help scholars in developing a robust biography. Further complicating matters is the fact that while Seneca is mentioned in histories from the ancient world, including those of Tacitus and Cassius Dio and the biographies of Suetonius, Seneca’s life as a whole is nowhere a topic of sustained focus.

We know that Seneca’s political career had a slow beginning. By the time Gaius (Caligula) Caesar died in 41 C.E., Seneca (now roughly 45 years old) had not yet advanced to the rank of Praetor, a rank for which he would have been eligible many years earlier. Seneca’s delayed progress or delayed entrance into the cursus honorum has been a matter of much research and speculation and has been explained by one or more of the following: Seneca’s recurring bouts of poor health, because of which he is thought to have spent a number of years in Egypt; his increasing interest in a philosophical, rather than public, life; his emerging reputation as a rhetorical talent; the tumultuous political environment during the time from Sejanus’ rise and fall until the ascension of Claudius in 41. Whatever the explanation, and whatever Seneca’s political ambitions may have been, they were stalled when, in 41, he was exiled by Claudius to the island of Corsica, where he would remain until 49.

Although Seneca’s guilt is not clearly attested in our sources, he was charged and convicted before the Senate for committing adultery with Julia Livilla, the sister of Gaius Caesar. Seneca tells us in the Consolation to Polybius (13.2) that he had been convicted and sentenced to death by the Senate but that Claudius had spared his life. Claudius’ intervention, perhaps, along with some other uncertainties about the case, suggests that the case against Seneca was, despite the Senate’s ruling, not decisive. The historian Cassius Dio (60.8.4, and Griffin, 32) argues that Seneca was essentially a casualty in an attempt by Messalina, Claudius’ wife, to be rid of Julia Livilla. On the other hand, Seneca was clearly a friend of Julia’s family. Her sister, Agrippina the Younger, would later be instrumental in reviving Seneca’s political career. Whatever the case, the occasion of Seneca’s exile marks the beginning of his involvement with the imperial family, which guides the course of his life thereafter.

Seneca’s exile ended with the help of Agrippina the Younger, now wife of Claudius, in 49 C.E. Upon Seneca’s return to Rome, he became the tutor of Agrippina’s son, the young Nero. Seneca’s role in Roman politics after his recall in 49 was largely unconventional. He was at first known as the ‘tutor’ (magister) of Nero and later became (along with Burrus) an influential advisor and speech-writer. In our records he is variously referred to as Nero’s ‘friend’ (amicus) and tutor. Neither of these titles had historically been associated with much political power, but it seems that Seneca likely played an important role in governing Rome, at least in the early years of Nero’s rule. It is difficult to know just which actions were taken on Seneca’s advice and which were not, though some ancient sources credit Seneca with the good policies and blame Burrus for the bad ones. Whatever the details of Seneca’s contribution, the first five years of Nero’s reign—the ‘quinquennium Neronis’—have been noted for their successes. Here again, though, historians are divided on whether the successes of the first five years of Nero’s reign were genuine or merely successes in public relations, for which Seneca would have been well suited. As Nero matured, though, he began to rely less and less on Seneca’s advice. Eventually, Seneca was named as an associate in the failed Pisonian Conspiracy to overthrow Nero. In 65 C.E., Seneca was sentenced by Nero to commit suicide.

The circumstances of Seneca’s death are reported at length in Tacitus’ Annals (XV.60 ff.) and with less detail by both Cassius Dio and Suetonius. Indeed, Seneca’s death has been a topic of great intrigue and disagreement. Upon receiving word of his sentence, Seneca is reported to have acted calmly. He cut his wrists and legs to let his blood drain, but this proved ineffective because of his frail condition. He then took hemlock, which was also ineffective because of his poor circulation. He was then placed in a bath to improve his circulation and finally suffocated from the steam. As he had specified in his will, he was cremated without ceremony.

The setting and circumstances of Seneca’s death serve as a window into the difficulties of understanding the relation between his life and philosophical work. On the one hand his death seems to be modeled on that of Socrates in Plato’s Phaedo. His last moments are tranquil. He is described as being calm upon receiving the judgment of Nero and then meeting his death, which, it seems, was preceded by dinner and conversation with his wife, Paulina, and friends. During the ordeal itself, he attempts to calm his friends by telling them to follow the “imago” (“pattern” or “image”) of his life. Seneca here likely means the image of a philosophical life that he has crafted in his works. But that picture of his life does not always fit comfortably with the rest of what we learn from our sources. Tacitus’ account of his death illustrates this. For while Seneca’s demeanor and actions remind us of Socrates’ death, the life that precedes this end bears little similarity to Socrates’. Seneca seems to have crafted a philosophical death, but in a context of great political intrigue. Whereas Socrates dies, at least partly, for his refusal to become involved in Athenian political affairs, Seneca dies, also at least partly, for the failure of his political maneuvers. Seneca seems to have known the sentence of death was coming. He may well have been involved, as alleged, in the Pisonian conspiracy. After his account of Seneca’s death, Tacitus reports a rumor that after the assassination of Nero, Piso was also to be put to death, and Seneca installed as princeps. Tacitus reports that Seneca is rumored to have known of this plan.

2. Works and Thought

Despite Seneca’s turbulent political career, he managed to produce and publish a great deal. His most famous and widely read works are his Letters to Lucilius. The Letters contain much that is of interest to students of Stoicism in general and have served for many as an entry point into Stoic philosophy. The Letters also show something of how Seneca thought philosophical principles could shape how one lives. In addition to the Letters, many other philosophical works—collected under the title ‘Dialogi’—survive. These treatises, some of which are incomplete, include three Consolations (Consolation to Marcia, Consolation to Helvia, Consolation to Polybius) and philosophical treatises on specific questions, topics, or themes (On Anger, On Mercy, On Leisure, On the Constancy of the Wise Person, On Providence, On Benefits). Seneca’s extended work, the Natural Questions, investigates various meteorological phenomena from the point of view of Stoic natural philosophy. In addition to his philosophical works, eight of Seneca’s tragedies survive along with a work that satirizes the deification of Claudius (The Apocolocyntosis or ‘Pumpkinification’ of Claudius). It is known that Seneca wrote many other works that have been lost, including the public speeches that he wrote for Nero.

a. Seneca and Stoicism

Seneca’s philosophical outlook is best understood in terms of his particular circumstances. He, like many Roman philosophers of his time, was more interested in moral philosophy than in the other two branches of philosophy (dialectic, or logic, and physics) that had become standard in Hellenistic thinking about the parts of philosophy. While Seneca is clearly well-trained and widely read in all parts of philosophy, he chooses to focus on moral philosophy in his texts. With the exception of the Natural Questions, which is devoted entirely to the branch of philosophy called ‘physics’ (a branch that included natural philosophy as well as theology), much of Seneca’s work focuses on ethical matters. As with other philosophers of his time, Seneca’s focus in moral philosophy has a clear practical emphasis. While discussions of theory and theoretical controversies abound in Seneca’s Letters and other works, his focus is consistently on how his theory—Stoicism—can be brought to bear on living one’s life. Seneca emphasizes the importance of this in Letter 89, where he encourages Lucilius (the addressee of the Letters) to indulge his wish to study logic so long as he refers everything that he learns to living a good life.

Seneca clearly sees himself as a Stoic. He commonly refers to the Stoic school as ‘ours’ and does much to defend the Stoics against certain Peripatetic and Epicurean attacks. Still, he is willing to disagree with the Stoics about certain matters in which he thinks a clearer or better argument is available. In Letter 33, for example, Seneca claims that he follows the teachings of the Stoics, but points out that the people who have discovered important truths in the past are not his masters (domini), but rather his guides (duces). Elsewhere, in his On Leisure, Seneca makes a similar point that he accepts the views of Zeno and Chrysippus (two early leaders of the Stoa) not just because Zeno or Chrysippus taught them, but because the arguments themselves lead to those positions.

He is also willing to make some concessions to the main adversary—the Epicurean. Seneca’s stance, especially toward Epicurus, has led readers to think that Seneca is best described as ‘eclectic’ rather than Stoic. His willingness to draw upon the philosophy of Epicurus, Plato, and others has seemed to some to betray the softness of his commitment to Stoicism. Seneca’s reply to this charge can be found in the passages from Letter 33 and On Leisure above. His focus is on the truth. He believes that, in some cases, the Epicurean or the Aristotelian has hit upon the truth. He is happy to acknowledge this to Lucilius and his readers but is nonetheless ready to point out that they have arrived at the truth for the wrong reasons. His treatise On Leisure illustrates this point. The question is whether the wise person ought to engage in public life or instead retire to pursue the work of retirement, which includes philosophical study. The Epicurean view is that the wise person will not engage in public life unless something interferes. The Stoic view is that the wise person will engage in public life unless something interferes. Seneca, though, argues that the importance of the projects of one’s private life (including the study of philosophy) can, in fact, trump the requirement to enter public life, even according to the Stoic view. This, he argues, shows that the pursuit of philosophical study and avoidance of public life are, in fact, recommended by the Stoics. The Epicureans’ overt call to avoid public life is mistaken, Seneca argues, because it assumes that a life devoted to politics cannot be harmonious with the philosophical life. Seneca concedes that in the actual world, as it is now, that is true, but points out that circumstances can change. In a world where public service would produce greater benefit to mankind than private, philosophical, work, a wise person would engage in the former.

Certain affinities between Seneca and his most famous fellow Roman philosophers, Marcus Aurelius and Epictetus, are commonly noted. All are concerned with the importance of living a philosophical life. All are, in the works that survive, more concerned with ethics than other branches of philosophy. These generalizations are accurate, but they obscure some features of Seneca’s philosophical works that distinguish him from these Roman Stoics. In particular, Seneca’s philosophical works were written for publication. In contrast, Epictetus did not write anything, and Marcus wrote for himself; Seneca, though, intended that his works be read, and they were read widely during and after his lifetime.

A related and in some ways more significant feature of Seneca’s authorship is his decision not only to write for an audience, but to do so in Latin rather than Greek. In the generations both before and after Seneca, Greek remained the language of philosophical discourse. Two notable exceptions to this pattern are the Epicurean Lucretius’ epic poem De Rerum Natura (On the Nature of Things), and the philosophical works of Marcus Tullius Cicero. The efforts of Lucretius and Cicero to bring philosophy to Latin and to prove that Latin is sufficient for the task (a regular theme in Cicero’s works) largely failed. Seneca, however, does not seem to have had a goal of bringing philosophy to Latin. He has little interest, as Cicero did, in demonstrating that Latin could accommodate the Greek technical vocabulary. This has made Seneca’s texts less useful for those seeking to trace the history of particular terms or concepts through Classical and Hellenistic philosophy. On the other hand, Seneca’s approach makes it clear that he is not concerned with matters of concordance or with establishing or maintaining a particular paradigm of philosophical exposition. Seneca is, instead, doing philosophy in Latin (Inwood, 2005).

Though Seneca distinguishes himself from his peers in some respects, he nonetheless professes his allegiance to Stoicism. His commitment to the school can be seen most clearly in his frequent return to a number of core Stoic positions—particularly the positions defended in Stoic moral philosophy. The Stoic view of morality is distinguished from other Hellenistic and Classical philosophical schools by its commitment to the idea that an individual has absolute authority over her happiness. The Stoics reject the Aristotelian idea that one’s happiness (eudaimonia) is at least in part determined by things outside one’s control. Seneca stands with the Stoics in rejecting this view of happiness. He frequently returns to this theme in different contexts and emphasizes the importance of knowing what things are in one’s power and what things are not. Seneca agrees with the Stoics that virtue is sufficient for happiness. One’s virtue, unlike one’s circumstances, is within one’s power.

Knowledge of one’s nature is importantly connected, in Stoicism, with one’s knowledge of nature generally. Seneca often appeals to the importance of understanding nature in his works. He recommends, for example, that one who is setting off on a voyage say to himself that he will arrive at his destination unless something interferes. This statement is taken to reflect the understanding that whether one’s actions unfold as one wishes is not entirely within one’s control. Thus, Seneca urges that it would be a mistake to say “I will arrive at my destination.” Such a plan ignores the fact that many ships do not reach their destinations. The more one understands the nature of things, the more one understands what is in one’s power and what is not.

Indeed, the Stoics emphasize that to live well one must live according to nature. In Seneca’s texts, this emphasis provides the background for criticism of his culture and fellow Romans. To follow nature or live according to nature requires that one abandon many practices and values that have been taken up through acculturation. Seneca’s return throughout his philosophical writings to the dangers of public life, of crowds, and of social excesses relies on this point—that much of society is corrupt. To live as the mob supposes one should live is to stray from nature. Seneca notes, in Letter 46, that reason demands one live in accordance with one’s own nature, but this nature can be led astray.

b. Philosophical Substance and Literary Talent

Seneca’s literary talent was unmatched during his lifetime. His style appealed immediately to his Roman audience. Writing a generation after Seneca, Quintilian notes in his Institutiones that early in his career Seneca’s works were the only works being read. Quintilian’s treatment of Seneca’s texts is telling. In cataloguing the texts of other authors, he systematically omits Seneca’s contributions to each genre. Seneca’s works are given their own treatment because they are difficult to read judiciously. Quintilian praises Seneca’s works but recommends that advanced training be completed prior to reading them.

With some modifications, this advice has been upheld by modern readers of Seneca. While he is often rated a philosophical amateur, no scholar would venture a similar claim about his literary talents. This realization, however, has led scholars of Seneca’s philosophical positions to take more care to understand the literary aims and constraints of his work. By all accounts, even from as early as Tacitus and Quintilian, Seneca’s prose style was both original and quite popular. His originality extends beyond the style of his sentences all the way to the organization of his philosophical treatises. He everywhere prefers a style of philosophical writing that more closely resembles conversation.

Seneca’s literary genius confronts readers of his text with a difficulty. Those interested in Seneca’s philosophy cannot simply ignore aspects of genre, style, and so on. For Seneca, these are importantly connected. Often the philosophical message of a treatise or letter is entangled with the norms of the genre in which he is working. At the same time, Seneca often presses against such norms to enlarge or bring into focus certain philosophical points. He claims, for example, that philosophical discourse can be appropriately undertaken as a conversation (Letter 75.1-2). To a great extent, Seneca’s philosophical texts reflect this preference: straightforward exposition is rare in his works. More frequently, his addressee is made to interrupt a point by asking a question or posing a challenge. In some cases, though, the demands of philosophical exposition require setting aside the genre’s norms. Seneca blames Lucilius, for example, in Letter 95 for its length and technical detail. This interplay between style and substance requires great care in interpreting Seneca’s philosophical achievements.

Seneca’s literary talents further complicate interpreting his philosophical works when one considers his controversial career. In some cases a careful interpretation of his work cannot ignore the immediate political context. The Apocolocyntosis, a scathing attack on Claudius, has clear political and public aims (though little of philosophical interest). His Consolation to Helvia, written to his mother during his exile, may well have been intended as a defense and request for recall. Similarly, he once mentions (Polybius 13.2) his trial and conviction, perhaps in an effort to remind Claudius of his innocence. These references to his own life, though rare, alert readers to the fact that his treatises may be constructed with many goals in mind: philosophical, but also personal, political, and literary. One can, for example, see the intermingling of aims in the opening passages of On Mercy, where Seneca praises Nero’s virtues. The praise of Nero’s character has both a philosophical and political goal: to encourage careful thinking about the importance, for a ruler, of cultivating mercy and to exhort the ruler of Rome to have mercy on those who may be thought to have wronged him.

c. The Letters to Lucilius

The Letters to Lucilius are Seneca’s most widely read and influential texts. The Letters contain much that is of interest to philosophers and to non-philosophers alike. 124 Letters have survived, divided into 20 books. It is likely that not all of the Letters have been preserved. The interpretation of Seneca’s Letters has been a matter of much disagreement among scholars.

The Letters themselves contain a wide variety of material ranging from apparently mundane discussions (for example, the dangers of crowds and public baths) to advanced technical discussions of Stoic theory. Seneca often makes use of something in everyday life to steer discussion to an ethical question or to some piece of moral advice. An over-arching interpretation of the Letters as a literary and philosophical work has eluded consensus among scholars. Still, a number of features of the Letters stand out as helpful for their interpretation. First, many groups of letters deal with common themes. Letters 5-10, for example, deal broadly with questions about living a philosophical life. Letters 94-5, the longest two letters of the work, deal with a technical question about the role of rules in moral reasoning. These are but two examples. There are few, if any, Letters the themes of which do not find echoes in others. Second, there is a noted trend as the letters progress toward longer, more technical, and more substantive philosophical discussions. This feature suggests that the Letters, aside from the apparently disparate themes and discussions along the way, also aim to demonstrate a philosophical education.

This aim is apparent early in the Letters. Seneca urges Lucilius in the first letter against the fault of wasting his time carelessly. In the second letter, he advises Lucilius on the correct approach to reading philosophical texts. In the fifth letter, he applauds Lucilius for persistence in his philosophical study but warns him to remain focused on the goal of philosophical study—that is, moral improvement—rather than the goal of many to simply make a show of philosophical talent. Seneca’s advice about philosophy—both how and what to study and how to apply it to one’s life—continues throughout the Letters. Scholars have long noted the apparent improvement of Lucilius as the Letters progress as evidence that Seneca means not simply to discuss philosophical progress but also to illustrate what it is like. The Lucilius of the early letters is not very sophisticated: the reader is made to suppose he is in the habit of requesting from Seneca pithy philosophical maxims to memorize. In Letter 33, Seneca chastises him for this and discontinues the practice of ending his letters with maxims. Later, in Letter 82, Seneca reports that he is happy with Lucilius’ progress. The later Letters also show Lucilius asking, apparently, more and more technical and difficult philosophical questions. Indeed, the later letters are, on the whole, considerably more philosophically rich than the early ones.

While Lucilius’ progress is arguably a theme that unites the letters, it is a theme that allows the philosophical discussions included in them to vary considerably. No one argument or position is systematically defended or articulated throughout the Letters as a whole. Instead, philosophical discussions are more localized, sometimes occupying the space of one letter, other times spanning a group of three or four. Sometimes a question addressed in one letter is picked up again much later. One can find in Seneca’s Letters various discussions of, to name a few, friendship, death, fate, poverty, moral theory, virtue, the good, argument, and much else. In all of his discussions, Seneca emphasizes the importance of being critical both of oneself and one’s way of living and of the received views, both popular and philosophical.

A brief account of the work’s first letter, though scarcely sufficient as a general introduction to the Letters, gives some indication of Seneca’s approach. The letter begins with some advice to Lucilius. He is to continue his efforts in devoting time to philosophical study. The theme of the Letter is just this—that too much time is wasted on worldly pursuits. Time flies, and as we delay what matters, life runs past. This theme is common in Latin literature: famous phrases like “tempus fugit” (from Vergil) and “carpe diem” (Horace) illustrate this. Seneca’s discussion of this offers no new philosophical insight. Still, as the letter continues, the philosophical point comes into view. The advice about wasting time generalizes to one’s life as a whole. To let one’s time slip away is to let oneself be occupied with things that are not really important. Seneca confesses that though he, too, wastes time, he has come to recognize when he is doing so. He counts this as progress and advises that Lucilius do what he can to keep what is really his.

As is typical of the Letters, this letter has Stoicism in view but does not heavy-handedly address or engage in Stoic theory. As a Stoic, Seneca is committed to the view that much of what one does in life is of little value. One’s day-to-day business contributes nothing to living a good life, unless one is considering the manner of his or her life. Seneca’s proposal that one should waste little, and be aware of what one is wasting, points to the Stoic view. What matters is acting virtuously, and this requires reflection on one’s actions. This is the first step to living well.

d. Anger, Grief, and the Therapy of Emotions

A defining principle of Stoicism is the claim that the mind is wholly rational. Platonists and Aristotelians, by contrast, posited a mind composed of both rational and non-rational parts. According to the Platonic/Aristotelian account of human psychology, emotions such as anger and fear could be explained by appeal to the non-rational parts of the mind, but on the Stoic view of the mind, no similar appeal can be made: Stoic theory suggests no non-rational aspects of the mind. The whole—unitary—mind is implicated in its actions. This feature of the Stoic theory has important implications for both its account of and its evaluation of emotions.

The Stoics view emotions as irrational movements of the mind. Since there are no non-rational parts of the mind, the Stoics understand a movement to be ‘irrational’ when it is contrary to right reason. Anger is a state in which one is not guided by correct reasoning. Fear is a state in which one is not guided by correct reasoning. And so on. Hence, emotions are states of mind that are contrary to right reason. One who is not angry would think and act differently than one who is. At least in the case of the perfect moral agent, these actions—that is, of someone who is not angry—would be fully guided by correct reasoning. The Stoics explain that the emotions arise when one assents to certain kinds of false statements about the world. Consider the following judgments one may make in response to having one’s car stolen:

S1: My car has been stolen.

S2: It is bad to have one’s car stolen.

S3: It is appropriate to respond to having one’s car stolen in an emotional way.

In an ordinary case, the Stoics claim, one’s episode of anger can be explained by appeal to these three propositions. One first encounters some state of affairs, articulates it, and assents to it—S1. One often goes on to form a secondary articulation, along the lines of S2, about the goodness or badness of this state of affairs. If one assents to this statement, one often continues to react in a way that somehow corresponds to the judgment reflected in S2. ‘S3’ is not exactly what one assents to. Instead, S3 is meant to capture something about the angry person’s response. Consider, for example, that an angry person might well scream “in anger” or do some violence to his surroundings or the like. The analysis of anger is meant to capture (via S3) this feature of anger (and other emotions).

According to Stoic theory, judgments of the form S2 and S3 are nearly always false. The Stoics hold that the only good is virtue and that the only evil is vice. All else is indifferent. According to this theory of value, having one’s car stolen is not bad; thus S2 is false. Similarly, since nothing bad has happened, the course of action sanctioned by S2 and S3 is illegitimate. No emotional response is appropriate.

Seneca devotes much of his philosophical work to advancing these aspects of Stoicism. The chief concern behind the Stoic theory of emotions and the theory of value is that until one removes such false beliefs about value, one will not succeed in living a happy life. It is with this that Seneca concerns himself in his philosophical work. He aims, for example, in On Anger to help his readers avoid becoming angry, and offers what little advice there is to help those who are angry stop being so. In the Consolations, he is concerned with helping his readers avoid the life-shattering effects of grief. Elsewhere, Seneca works to help people let go of their fear of death.

In his Consolations in particular, as well as in his treatise On Anger and other works, Seneca is clearly more often concerned with helping people avoid experiencing emotions. As a Stoic, he is committed to the idea that emotional experiences involve false judgments. Still, Seneca does not typically concern himself with explicating the theory itself. While our reports from Greek doxographers and from Cicero preserve the outlines of the theory, Seneca feels no need to repeat it. One noteworthy exception to this is Seneca’s On Anger. Here (in Book II.1.4) Seneca explains the structure of an emotional experience. His explanation attempts to show that anger is voluntary despite the fact that one cannot entirely control the way things appear.

Seneca’s strategy is to explain anger in terms of three ‘movements’. The first movement, he says, is involuntary. It is the moment when the mind articulates some state of affairs—that ‘having my car stolen is a bad thing’. This may correspond, in some cases, with an elevated heart rate, a sinking feeling in one’s stomach, or the like. This initial experience is, Seneca claims, beyond one’s immediate control, but it is not anger. To be angry, one must “assent” to the proposition. That is, one must sanction the assertion that “such and such is a bad thing.” Once the assent is given, one is angry.

In distinguishing the first, involuntary, movement of anger from anger itself, Seneca seems to be responding (or reporting his source’s response) to an objection to the Stoic view. The Stoics claim that the wise person—the Sage—will not become angry (or experience any emotion) but cannot deny that the Sage will, for example, flinch at the loud bark of a dog or the sudden loud clap of thunder. Why, the objector may say, would the Sage flinch? To flinch is to assent to the proposition that something bad has happened. By separating the involuntary from the voluntary, Seneca answers this criticism.

While Seneca occasionally addresses theoretical matters in this way, he more commonly focuses on an issue—in this case, the emotions—from a different perspective. Seneca largely favors discussing issues from the perspective of the person who is making moral progress, rather than from the perspective of the wise person. This stands in contrast to the focus of other surviving Stoic texts which tend to focus on the morally perfect agent—the ‘Sage’—and her qualities. Those texts often characterize the Sage in a way that sets her very much apart from normal human beings. Seneca’s concern, however, is with the circumstances of those who are aspiring to be and do better.

This orientation can be seen very clearly in passages or whole works (like On Anger, Consolation to Marcia, and others) where he aims to help those who are imperiled by emotions. The aim of these works is not to point out that the Sage does not experience anger or grief, nor is the aim even primarily to say why the Sage does not experience these emotions. Instead, the aim is to appeal to those who are not wise and to offer them advice, informed of course by Stoic theory, to help them re-orient their thinking about their circumstances. In On Anger, for example, Seneca advises that an angry person look in the mirror. Clearly, this person will not find a Sage in the mirror. Instead, Seneca thinks, he will find something in his appearance that does not resonate well with his thinking about himself. Elsewhere, Seneca advises that the person who is grieving consider the difference an audience makes. When one finds that one grieves more in the presence of an audience, Seneca thinks this will force one to reflect on what the grief is really about. Is one’s grief, in other words, directed at the one who is gone or at oneself? These kinds of strategies for dealing with emotions are, in any case, very far removed from arguments about the value of the emotions and still further removed from theoretical accounts of the nature of the emotions. Seneca is convinced that the Stoic view is right, and he finds support for this conclusion in less theoretical, and more practical, aspects of human life.

e. Natural Philosophy

The received view of the Roman Stoics according to which the Romans were only concerned with ethics must be put aside in Seneca’s case. The opening lines of the Natural Questions articulate a view about the importance of physics that shows Seneca to be a clear exception. The very existence of the Natural Questions, one of Seneca’s longest philosophical treatises, shows this as well. He notes that “the difference between philosophy and other areas of study is as great as the difference, within philosophy itself, between the branch concerned with humans and the one concerned with the gods” (Praef.1, Hine, trans.). Seneca’s reference here to the branch concerned with the gods is a standard characterization of the ‘physics’, one of the three Hellenistic divisions of philosophy that Seneca inherits. For the Stoics, the study of physics, or natural philosophy, included the study of the divine. In Letter 88, Seneca claims that the liberal arts, here noted as the ‘other areas of study’, are only important insofar as they prepare the mind for philosophical study. Seneca’s claim at the beginning of the Natural Questions, then, suggests that all philosophical study ultimately aims at understanding of the gods. Even the “branch concerned with humans” (that is, ethics) has an aim beyond itself. According to the Stoic view, full moral progress requires a complete understanding of the nature of the divine. Seneca’s claims here, and elsewhere in the Natural Questions, suggest that he embraces the full range of Stoic philosophy despite the fact that most of his philosophical attention is devoted to matters central to the ‘branch concerned with humans.’

The outlines of Stoic physics are well documented in early sources. The Stoics are materialists, compatibilists, and theists. In the most general sense, the Stoics hold that the cosmos is entirely composed of matter but that certain forms of matter (fire, aether) are endowed with creative capacity. The human being’s mind is itself a composition of these elements. According to the Stoic view, the cosmos is a mind writ large, in the sense that the movements and developments in nature at the cosmic level are the result of guiding intelligence. For this reason, the Stoics regard “god,” “nature,” “fate,” “providence,” as roughly equivalent expressions. All refer to the active and creative element in the cosmos. To live according to nature ultimately requires that one come to adopt, or understand, the natural world from this cosmic perspective.

The surviving portions of Seneca’s Natural Questions are a survey of various meteorological phenomena undertaken in light of the broader Stoic understanding of the nature of the cosmos. Though the discussions are often narrowly focused on particular meteorological phenomena and their explanation, Seneca occasionally pauses to take a wider view. He considers, for example, the role that reflective surfaces (mirrors) play—and are supposed to play—in moral improvement (I.17 ff.). He explains the Stoic view that reason is the same for both gods and humans (Praef. 14). In a discussion of the cause of lightning (II.45), Seneca points to the Stoic view that “Jupiter,” “Providence,” “Fate,” and so on are all names for the active, divine element that shapes the universe.

The Natural Questions is an unfinished work. Passages like those above suggest that Seneca may have been revising or finishing the work with the aim of more carefully connecting his findings about meteorological phenomena to Stoic physics. They also suggest that, at least in some moments, Seneca may have been interested in providing a Stoic alternative to Lucretius’ explanation of many of the same phenomena in De Rerum Natura. The Stoic claim that the happenings of the natural world are guided by reason stands in stark contrast to the Epicurean view, articulated by Lucretius, that the world is generated and organized by chance.

f. Non-philosophical Works

Seneca wrote much besides his philosophical texts; however, much of his work has been lost. Lost are all of his speeches, including those he penned for Nero. Also lost are some philosophical treatises, though some fragments survive from a treatise on marriage. The surviving non-philosophical works include the Apocolocyntosis, a work satirizing the deification of Claudius, and eight tragedies: Agamemnon, Hercules Furens, Medea, Thyestes, Oedipus, Phaedra, Phoenissae, and Troades. Scholars have long disagreed about the relation between Seneca’s philosophical prose and his tragic poetry. At one end of the spectrum, some ancient sources regarded the author of the tragedies as a different Seneca altogether. While there is agreement now that our Seneca authored the tragedies, the relation between these works and his philosophical treatises is less agreed upon. On the one hand, the tragedies are clearly concerned with many Stoic themes that Seneca addresses in his philosophical works. Despite this point of intersection, though, the tragedies do not seem to say the same about these themes. The most striking theme in this regard is the attention in the tragedies to the role of anger and other emotions. While the philosophical works (especially On Anger) attempt to persuade the reader to avoid becoming angry, the tragedies sometimes seem to elicit our sympathies for those who are angry and acting in anger. Similarly, as one commentator notes, the tragedies are rife with Stoic pronouncements (for example, “follow nature” Phaedra, 481) that are put forward in a manner inconsistent with the Stoic principles to which they give voice.

The Phaedra illustrates the second phenomenon quite clearly. The title character, wife of Theseus, has fallen in love with her stepson, Hippolytus. After a failed effort to overcome her feelings for the boy, Phaedra’s cause of seducing Hippolytus is taken up by the Nurse, who agrees to help in order to prevent Phaedra’s suicide. The Nurse urges Hippolytus to “follow nature” as his guide. The Stoic imperative to follow nature is ordinarily understood as an injunction to live a life according to reason, to be virtuous, and to shun the circumstances of fortune. Here, though, the Nurse employs the phrase to encourage Hippolytus to do what most people do—namely, to pursue the pleasures of sex (Wilson, 2010). Hippolytus himself in this play seems, initially at least, to come closest to the Stoic ideal. In a long passage in Act II, he explains his love for the countryside and mountaintops, places in which he can be truly free from anger and other passions and from the vices that corrupt those who spend their time in society. Yet his peace comes at the price of seclusion and for the wrong reasons. The would-be sage seeks the isolation of the woods because of his hatred for all women. He notes that whether his hatred stems from “reason, nature, or passion” (567), it pleases him to hate them all.

The focus in the tragedies on the destructive force of emotions (especially anger) is plain. As one commentator notes, anger guides the action in all of Seneca’s plays (Wilson, 2010). In the Phaedra, Theseus’ anger at his son leads him to seek Hippolytus’ death. (Phaedra, whose advances were rejected by Hippolytus, has lied to her husband, accusing Hippolytus of raping her). In the Medea, Medea’s anger at Jason leads her to murder her own children. In the Thyestes, Atreus’ anger leads him to murder Thyestes’ children and feed them to him. While these portrayals of emotion forge a connection between the tragedies and the prose works, what that connection is remains unclear. How, for example, should one understand the significance of Phaedra’s, “What can reason do? Passion, passion rules!” (trans., Wilson) given Seneca’s claim elsewhere (On Anger II.1.4) that passions are voluntary?

Scholars have taken a number of positions on these issues. Some have argued that there is no connection between the tragedies and the philosophical works, while others have sought to show that the tragedies contain important philosophical lessons. Arguments of the latter kind are varied. Some have held that the tragedies are meant to illustrate the destructive influence of passions; others have argued that the tragedies should be read in light of Seneca’s Stoic metaphysics. These scholars emphasize the role of fate, providence, and divination in the tragedies. Finally, one scholar has argued that the guiding philosophical concern in the tragedies is epistemological (Staley, 2010). On this view, Seneca’s tragedies offer a kind of ‘clarification’ of the cognitive processes of those who are under the sway of passions.

Whatever relation they are ultimately thought to have to his philosophical works, Seneca’s tragedies, his Apocolocyntosis, and his lost speeches serve to alert readers of his philosophical works to his literary talent. Scholars have rarely attempted a full account of all his works undertaken with the aim of clarifying or even producing an account of Seneca the author. The difficulty of such an undertaking suggests that caution is needed in assuming that Seneca is primarily a philosopher. Seneca appears to have been comfortable writing in many genres. His comfort, moreover, provides a further clue that Seneca’s life was either plagued by or fortunate in (depending on how one sees it) his constant contact with both philosophy and with the politics and culture of Rome.

g. Criticism and Influence

Both Seneca’s life and his works have been targets of criticism since his own lifetime, during which, of course, he was charged with and convicted of both adultery and conspiracy. Though the evidence in neither of these cases is clearly decisive, they added to the growing criticism that Seneca’s way of life undermined his philosophical message. This criticism gained more traction from the fact that Seneca, who writes that poverty is not an evil, was one of the wealthiest men in the world. This criticism of Seneca was first made publicly by Publius Suilius, a political enemy of Seneca who was, according to Tacitus, angered by Nero’s revival of a law against pleading for money. Suilius, it seems, believed that this revival resulted from Seneca’s influence. Tacitus reports that Suilius taunted Seneca publicly, reminding the Roman elites of Seneca’s affair with Julia Livilla, and most importantly, asking the following question of his fellow Romans: “By what kind of wisdom or maxims of philosophy had Seneca within four years of royal favour amassed three hundred million sesterces?” (Tacitus, Annals XIII.42, Church & Brodribb, trans.). Although little independent evidence exists to confirm Suilius’ claim about the extent of Seneca’s wealth or how he acquired it, this passage from Tacitus’ Annals has served as a source for many readers of Seneca since its publication. The result is that Seneca’s political enemy has in a way won the battle of public opinion. Scholars have noted that some caution is needed in evaluating this charge against Seneca, but the fact that Seneca was very wealthy and at the same time wrote that one should be content with what one has—and that poverty is, in itself, no evil—has been a lasting criticism.

This example points to a broader line of criticism: that Seneca is inconsistent. His wealth and his pronouncements about the value of poverty are but one example. To this can be added his praise of the philosophical life together with his recurrent involvement in Roman politics. Seneca is made, in Tacitus, to plead his case for retirement before Nero, yet Seneca is clearly (in both the Consolation to Helvia and to Polybius) eager to return to Rome during his period of exile. Seneca seems, then, to have little but praise for the philosophical life withdrawn from the business of Rome, yet cannot fully embrace that life himself. In his On Mercy, Seneca encourages the young emperor Nero to take to heart the point that while many may have the power to put others to death, he alone has the power to give life (that is, to allow life where the punishment of death is justified), yet Seneca may well have been party to Nero’s assassination of his own mother. At the very least, Seneca was unable to stop Nero. Again, Seneca upholds the importance of freedom from emotion in living a happy life. He encourages daily exercises to rid oneself of anger and other emotions, yet he writes tragedies in which unbridled emotions are the central focus. He encourages his readers to reflect on what is really theirs and to distance themselves from the inner workings of the political mob, yet he writes a political satire (the Apocolocyntosis) which assumes detailed knowledge of the inner workings of the imperial court under Claudius. Finally, Seneca is reported to have written Nero’s address for the funeral of Claudius. While this work is lost to us, it is unlikely that it had much in common with the Pumpkinification of Claudius, which he must have penned at around the same time.

These features of Seneca’s life and work have been both targets for criticism and spurs for investigation. To his credit, Seneca denies (even in the Letters, some of his latest works) that he is close to living a fully philosophical life. He works toward this goal but falls short. Notwithstanding his own profession of philosophical failure, the spirit of his philosophical works seems clearly (to the extent that we can see clearly into his life) undermined by his role in Roman life. A number of views can be taken here. Perhaps Seneca simply fails to live the philosophical life he aspires to live. Perhaps his philosophical ambitions were really secondary to his political ambitions. While many scholars have noted the inconsistencies and many have rejected Seneca’s work on the grounds of hypocrisy, some scholars (notably Emily Wilson) have challenged this view. Wilson notes that, “The most interesting question is not why Seneca failed to practice what he preached, but why he preached what he did, so adamantly and so effectively, given the life he found himself leading” (Wilson, 2014).

A final and more philosophically substantive criticism also relies on a claim that there is some disparity between what Seneca advises and what Seneca does. This criticism, articulated by J. M. Cooper, argues that Seneca’s aim to guide his reader toward moral improvement is ultimately undermined by his advice to avoid the study of logic. Stoic theory requires that one have knowledge of ethics, physics, and logic. The Stoics, in fact, have much to say about the important interconnections among these three branches of study. Though one may begin with ethics, one’s philosophical study is simply not complete unless one has mastered the argument forms that fall under the scope of logic. Despite this, Seneca repeatedly tells his readers, particularly in the Letters, that the study of advanced logic, including Zeno’s syllogisms and certain logical fallacies, is a waste of time. In doing so, Seneca is advising his readers to avoid something that is, according to his own theory, necessary for moral progress.

Despite these criticisms, Seneca’s works have been widely read since his own lifetime. Seneca’s works, along with Cicero’s, were much more readily accessible to medieval Europeans who no longer read Greek. Thus, Seneca served for a long time as one of only a few sources for Stoic philosophy. Seneca’s works were well received by Christian thinkers in the Early Middle Ages. This was no doubt partly due to the forged correspondence (long thought to be genuine) between Seneca and the apostle Paul. Partly though, Seneca’s acceptance by Christian thinkers was surely due to similarities between Christian and Stoic doctrines. Seneca’s doctrine of the first movements of emotions (those experiences of being drawn toward something, or the initial experience that precedes becoming angry or grief-stricken) finds a welcome reception in Christian thinkers who are working on accounts of temptation and the original failings of human nature.

During and after the Renaissance, Seneca’s works continued to be read widely. How much Seneca alone, apart from other surviving Stoic sources (including Cicero’s philosophical works), influenced a particular philosopher’s thinking is difficult to tell, but Seneca was clearly read. Descartes, for example, used Seneca’s On the Happy Life as the basis for the ethical view he develops in his correspondence with Princess Elizabeth. A near contemporary of Descartes, Justus Lipsius, relied on Seneca’s philosophy heavily in his attempt to develop a new form of Stoicism suitable to his age. One can find many references to Seneca in the works of philosophers throughout the history of philosophy in Europe. Seneca’s influence and importance can perhaps be seen most clearly in cases where philosophers identify with Seneca’s philosophical views and at the same time sympathize with the circumstances of his life. Thomas More, for example, who was also an advisor to a powerful monarch, read Seneca widely. It has been noted that one source for More’s Utopia was likely Seneca’s (incomplete) treatise De Otio (On Leisure). There, Seneca notes that the ideal state is “no place” (nusquam).

The influence of Seneca’s work, especially his account of the emotions and their therapy, can be seen in the work of philosophers such as Foucault and Pierre Hadot, who have both developed accounts of living philosophy. This includes focus on the source of one’s troubling emotions—anxiety, fear, anger—and how philosophy can address these. In psychology, the Stoic account of the emotions as cognitive has been influential in the development of cognitive therapies. Albert Ellis, for example, who developed rational emotive behavioral therapy (REBT), was heavily influenced by Stoic views of the emotions, and especially by Seneca.

3. References and Further Reading

a. Texts and Translations

All of Seneca’s works are available in English translation. For many years, the Loeb editions, which print the Latin and English side by side, were the standard English translations: Gummere’s for the Letters and Basore’s for the Dialogi or Moral Essays. New translations of particular works or selections of letters have since been published. Inwood’s 2007 collection contains extensive philosophical commentary on a collection of 17 philosophically substantive Letters.

Campbell, Robin, trans. Seneca: Letters from a Stoic. Penguin Classics. 2004.
Inwood, Brad, trans. Seneca: Selected Philosophical Letters. Oxford; New York: Oxford University Press, 2010.
Seneca, Lucius Annaeus. Epistulae Morales (Letters). Trans. Richard M. Gummere. London: Harvard University Press, 1917. 3 vols. Loeb.
Seneca, Lucius Annaeus. Moral Essays. Trans. John W. Basore. Cambridge, Mass.: Harvard University Press, 1928. 3 vols. Loeb.
Seneca, Lucius Annaeus. Tragedies. Trans. John G. Fitch. Annotated edition. Cambridge, Mass: Harvard University Press, 2002. 2 vols. Loeb.
Wilson, Emily, trans. Seneca: Six Tragedies. Oxford World’s Classics. New York: Oxford University Press, 2010.

An effort to produce new translations of all of Seneca’s works is currently underway through the University of Chicago Press. As of 2015, the following four volumes were available.

Seneca, Lucius Annaeus. Anger, Mercy, and Revenge. Trans. Robert Kaster and Martha Nussbaum. Chicago; London: University of Chicago Press, 2010.
Seneca, Lucius Annaeus. Hardship and Happiness. Trans. Elaine Fantham et al. Chicago; London: University of Chicago Press, 2014.
Seneca, Lucius Annaeus. Natural Questions. Trans. Harry Hine. Chicago; London: University of Chicago Press, 2010.
Seneca, Lucius Annaeus. On Benefits. Trans. Miriam Griffin and Brad Inwood. Chicago: University of Chicago Press, 2011.

b. Secondary Literature

Bartsch, Shadi and Wray, David, eds. Seneca and the Self. Cambridge: Cambridge University Press, 2009.
- A collection of essays evaluating Seneca’s contribution to the modern notion(s) of the Self.
Cooper, John M. Knowledge, Nature, and the Good: Essays on Ancient Philosophy. Princeton University Press, 2009.
- Chapter 12, “Moral Theory and Improvement: Seneca,” argues that Seneca’s dislike for logic is incompatible with his Stoic allegiance.
Fitch, John G., ed. Oxford Readings in Classical Studies: Seneca. New York: Oxford University Press, 2008.
- A collection of essays on many aspects of Seneca’s work—both philosophical and poetic.
Griffin, Miriam T. Seneca: A Philosopher in Politics. Oxford: Oxford University Press, 1992.
- An extensive study of what Seneca’s philosophical writings can tell us about his role as a political agent.
Hadot, Ilsetraut. Seneca und die Griechisch-Römische Tradition der Seelenleitung. Berlin: Walter De Gruyter & Co., 1969.
- Places Seneca’s work as a spiritual advisor to his audience in the context of Greco-Roman spiritual advice literature from Homer to Seneca.
Inwood, Brad. Reading Seneca: Stoic Philosophy at Rome. Oxford; New York: Oxford University Press, 2008.
- A collection of essays that explicate Seneca’s thinking about a number of philosophical problems.
Ker, James. The Deaths of Seneca. New York: Oxford University Press, 2009.
- An examination of Seneca’s life and work through the lenses of the various accounts of his death, both ancient and later.
Romm, James. Dying Every Day: Seneca at the Court of Nero. New York: Vintage Books, 2014.
- A biography aimed at reconciling the apparently incompatible versions of Seneca—the wealthy man who praises poverty, the philosopher who is so engaged in politics, and so forth. Romm focuses consistently on the role that death and thinking about death play in Seneca’s life and works.
Wilson, Emily. The Greatest Empire: A Life of Seneca. Oxford: Oxford University Press, 2014.
- A biography of Seneca informed by what is known about the dates of his philosophical and non-philosophical works. Wilson aims to explain, as much as possible, various tensions in the reception of Seneca.
Volk, Katharina, and Gareth D. Williams. Seeing Seneca Whole: Perspectives on Philosophy, Poetry, and Politics. Brill, 2006.
- A collection of essays from a variety of standpoints—philosophical, literary, historical—aimed at clarifying Seneca’s status as an author of many genres.

Author Information

Robert Wagoner
Email: wagonerr@uwosh.edu
University of Wisconsin Oshkosh
U. S. A.

Modal Logic: A Contemporary View

Modal notions go beyond the merely true or false by embedding what we say or think in a larger conceptual space referring to what might be or might have been, should be, or should have been, or can still come to be. Modal expressions occur in a remarkably wide range across natural languages, from necessity, possibility, and contingency to expressions of time, action, change, causality, information, knowledge, belief, obligation, permission, and far beyond. Accordingly, contemporary modal logic is the general study of representation for such notions and of reasoning with them.

Although the origins of this study lie in philosophy, since the 1970s modal logic has developed equally intensive contacts with mathematics, computer science, linguistics, and economics; and this circle of contacts is still expanding. But at the same time, in its technical development, modal logic has also become something more, starting from the discovery in the 1950s and 1960s of various translations taking modal languages into systems of classical logic. Investigation of modalities has also become a study of fine-structure of expressive power, deduction, and computational complexity that sheds new light on classical logics, and interacts with them in creative ways.

This article presents a panorama of modal logic today in the spirit of the Handbook of Modal Logic, emphasizing a shared mathematical modus operandi with classical logic, and listing themes and applications that cross between disciplines, from philosophy and mathematics to computer science and economics. While this style of presentation does not disown the metaphysical origins of modal logic, it views these as just one of many valid roads toward modal patterns of reasoning. Other roads traveled in this article run through other areas of philosophy, such as the epistemology of knowledge and belief, or even through other disciplines, such as logics of space in mathematics, or logics of programs, actions, and games in computer science and game theory.

Modal Notions and Reasoning Patterns: a First Pass
A Very Brief History of Modal Logic
The Basic System: Modality on Graphs
Some Active Current Applications
Modern Themes across the Field
Modal Logic and Philosophy Today
Coda: Modal Logic as a Part of Standard Logic
Conclusion
References and Further Reading

1. Modal Notions and Reasoning Patterns: a First Pass

Modal logic as a subject on its own started in the early twentieth century as the formal study of the philosophical notions of necessity and possibility, and this tradition is still very much alive in philosophy (Williamson 2013). In this article, however, we will paint on a larger canvas and introduce the reader to what modal logic as a field has become a century hence. Still, for a start, it is important to realize that modal notions have a long historical pedigree. They were already studied by Aristotle and then by the medieval logicians (Kneale & Kneale 1961), who noted many peculiarities of this province of reasoning. Often these studies started from raw inferential intuitions that can take several forms. We may judge some pattern to be valid (say, necessity implies truth), we may judge others to be invalid (truth does not imply necessity), or we may have ideas about connections between different modal notions, such as the necessity of some proposition $\phi$ being equivalent to the impossibility that not $\phi$. Modal logicians then start by introducing notation to make all this crystal-clear. Say, writing $\Box \phi$ for necessary truth of $\phi$ and $\Diamond \phi$ for its possibility, the above claims amount to, respectively,

$\Box \phi \rightarrow \phi$ is valid
$\phi \rightarrow \Box \phi$ is not valid
$\Box \phi \leftrightarrow \neg\Diamond\neg \phi$ is valid.

$\neg$ is negation. Later on, through the work of C. I. Lewis (Lewis & Langford 1932), further philosophical notions were drawn into this family, in particular, that of entailment, or ‘strict implication’, between two propositions. Whereas the plain material implication $\phi \rightarrow \psi$ expresses the fact that $\phi$ and $\neg \psi$ do not occur together: $$\neg(\phi \wedge \neg \psi)$$ an entailment is the stronger modal assertion that the two cannot occur together: $$\neg\Diamond(\phi \wedge \neg \psi).$$

With this notation, we can then swing into the second professional mode of logicians, operating on these forms by valid steps of abstract reasoning to discover new insights. For instance, a few simple inference steps with plausible modal principles will show that $$\neg\Diamond(\phi \wedge \neg \psi)$$ is equivalent to $$\Box(\phi \rightarrow \psi)$$ giving another view of entailment, this time as the necessity strengthening of the material implication. With such a proof calculus in hand, we can analyze many philosophical arguments from the classical and modern literature involving modality. Famous examples abound in the work of Arthur Prior, Peter Geach, Jaakko Hintikka, Stig Kanger, Saul Kripke, David Lewis, Robert Stalnaker, and other pioneers, all the way to the new wave of philosophical logicians of today. But we can also use a logical proof calculus, once we have settled on one, independently to reveal more of the abstract system of principles governing modality.
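The "few simple inference steps" mentioned above can be sketched as follows. This is only one possible route, and it assumes two principles: the duality $\Box \chi \leftrightarrow \neg\Diamond\neg \chi$ listed earlier, and the replacement of classically equivalent formulas inside the scope of $\Diamond$ (an assumption made for this sketch, though it holds in the usual modal systems).

$$\begin{aligned}
\Box(\phi \rightarrow \psi) &\leftrightarrow \neg\Diamond\neg(\phi \rightarrow \psi) && \text{duality, with } \chi := \phi \rightarrow \psi \\
&\leftrightarrow \neg\Diamond(\phi \wedge \neg \psi) && \text{since } \neg(\phi \rightarrow \psi) \leftrightarrow (\phi \wedge \neg \psi) \text{ classically}
\end{aligned}$$

Read from bottom to top, the same two steps turn the strict-implication formula into the necessitated material implication.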

Still, this cozy world of intuitions and mathematical systems is not enough for many logicians. Why would some principles of modal reasoning be valid, while others are not? One common way of analyzing this further is by giving a semantic model for the meaning of the modalities that fits the earlier-stated facts. As it happens, such a model was given already centuries ago, following an idea going back to Leibniz and earlier thinkers like the Jesuit Luis Molina. We can explain the surplus of necessary truth over ordinary truth by going beyond the actual world $s$ in terms of some larger universe $W$ of metaphysically possible worlds. The truth of a statement $\phi$ is truth at $s$ only, while:

the necessity statement $\Box \phi$ says that $\phi$ is true in all possible worlds $t$.

Likewise, the existential modality $\Diamond \phi$ says that $\phi$ is true in at least one possible world. In this way, modalities become like standard universal and existential quantifiers, ranging over some suitably chosen larger family of worlds. Despite the existence of alternatives, and the occasional attack on the above framework, this quantifier view has been dominant since the 1950s, and it has influenced all that is to come in this article.

Note how this setting makes the interpretation of necessity relative to the choice of a model $\mathbf{M}$ containing all relevant possible worlds, something that will return in our later formal modal truth definition, which employs a ternary format:

$\mathbf{M}, s \vDash \phi$

formula $\phi$ is true in model $\mathbf{M}$ at world $s$.

Despite the metaphysical terminology, often retained for nostalgic reasons, such models have very different interpretations today. ‘Worlds’ can stand for situations, stages of a process, information states, locations in space, or just abstract points in a graph. This trend toward exploring a wider spectrum of interpretations was reinforced by the addition, in the 1950s, of a crucial further parameter (by Kanger, Hintikka, Kripke, and Montague) that increased the reach of modal logic immensely. We give each world in a model $\mathbf{M}$ a range of its ‘accessible worlds’, and then let ‘necessity’ (or whatever this notion turns into on a concrete interpretation) range only over all accessible worlds.
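To make the ternary format $\mathbf{M}, s \vDash \phi$ concrete, here is a minimal Python sketch of a truth evaluator over models with an accessibility relation. The tuple encoding of formulas, the names `Model` and `holds`, and the three-world example are all assumptions introduced for illustration; they are not taken from the literature discussed here.

```python
from dataclasses import dataclass


@dataclass
class Model:
    worlds: set      # the universe W
    access: set      # pairs (s, t) meaning: t is accessible from s
    valuation: dict  # proposition letter -> set of worlds where it is true


def holds(m, s, f):
    """Return True iff formula f is true in model m at world s."""
    op = f[0]
    if op == "atom":
        return s in m.valuation.get(f[1], set())
    if op == "not":
        return not holds(m, s, f[1])
    if op == "and":
        return holds(m, s, f[1]) and holds(m, s, f[2])
    if op == "imp":
        return (not holds(m, s, f[1])) or holds(m, s, f[2])
    if op == "box":  # true at s iff f[1] holds at every world accessible from s
        return all(holds(m, t, f[1]) for (u, t) in m.access if u == s)
    if op == "dia":  # true at s iff f[1] holds at some world accessible from s
        return any(holds(m, t, f[1]) for (u, t) in m.access if u == s)
    raise ValueError(f"unknown operator: {op!r}")


# A hypothetical three-world model, invented for this example.
M = Model(
    worlds={"s", "t", "u"},
    access={("s", "t"), ("s", "u"), ("t", "t")},
    valuation={"p": {"t", "u"}, "q": {"t"}},
)
p, q = ("atom", "p"), ("atom", "q")

print(holds(M, "s", ("box", p)))              # p holds at both t and u
print(holds(M, "s", ("imp", ("box", p), p)))  # fails: s does not access itself
print(holds(M, "u", ("box", q)))              # vacuously true: u accesses nothing
```

In this sketch, the earlier reading of necessity as truth in all possible worlds corresponds to the special case where `access` contains every pair of worlds. The second printed line also shows that a pattern like "necessity implies truth" can fail once accessibility is restricted, here because world `s` is not accessible from itself.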

Defining modal notions somewhat loosely as those that look beyond the actual, here, and now, natural language is full of modality, since all our thinking and actions wade in a sea of possibilities, many of them never realized, but all-important to deliberation and decision, rational or otherwise. Explicitly modal linguistic expressions show a great variety: temporal (past, present, future), epistemic (know, believe, doubt, must), normative (may, ought), or causal, while there is also a lot of implicit modality, for instance in verbs like “seek” that can even refer to presumably (one more modal expression!) non-existent worlds containing a fountain of eternal youth. There is no good definition covering all these linguistic cases, though failure of substitution of extensional equivalents is often cited as a connecting thread. A ‘modal’ sentence operator can be sensitive to the substitution of propositions with the same truth value. This criterion looks a bit ‘symptom-based’, though, and perhaps a better criterion for spotting a modality is a semantic one of expressions whose truth value may have to look ‘beyond the actual facts’. But the stability of modality also shows in characteristic inference patterns, such as the many dualities instancing the earlier equivalence

$\Box \phi = \neg\Diamond\neg \phi$ and its dual $\Diamond \phi = \neg\Box\neg \phi$.

For instance, ‘always’ is ‘not sometimes not’, or ‘ought’ is ‘not permitted that not’. Such algebraic duality patterns are so ubiquitous that in the 1950s, it was even proposed to include them in broadcasts into outer space announcing our presence to other galactic civilizations – a truly fitting endeavor for possible worlds theorists.
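The algebraic flavor of these dualities can be illustrated by treating the two modalities as operators on sets of worlds. The sketch below is an illustration under assumptions, not a standard library: the function names and the brute-force strategy are chosen for this example. It checks both dualities for every accessibility relation and every set of worlds over a small finite universe.

```python
from itertools import chain, combinations, product


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]


def box(R, W, X):
    """Worlds all of whose R-successors lie in X."""
    return frozenset(s for s in W if all(t in X for t in W if (s, t) in R))


def diamond(R, W, X):
    """Worlds with at least one R-successor in X."""
    return frozenset(s for s in W if any(t in X for t in W if (s, t) in R))


def duality_holds(W):
    """Check box X = not diamond not X, and its dual, for all R and X over W."""
    W = frozenset(W)
    for R in subsets(product(W, repeat=2)):
        for X in subsets(W):
            if box(R, W, X) != W - diamond(R, W, W - X):
                return False
            if diamond(R, W, X) != W - box(R, W, W - X):
                return False
    return True


print(duality_holds(range(3)))
```

A finite check of this kind is no proof of the general law, but it shows the pattern at work: complementing a set of worlds plays the role of negation, and the two operators are interdefinable through it.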

By now the marriage between necessity, possible worlds, and universal quantification over these has become so ingrained that it may be hard to imagine other approaches. Nevertheless, other semantics exist for modal notions, such as the topological models we will mention later, that generalize possible worlds models in the accessibility style, and even predate them historically. In fact, it is one of those ironies of scientific life that this more general semantics was already explicit in the 1930s in Tarski’s work on modality in topology and algebra, but it did not ‘take off’ the way the possible worlds paradigm did in the 1950s. And we need not even think semantics and model theory only: a perfectly good alternative view of modality comes from proof theory. A proof-theoretic explanation of the surplus of stating a necessity $\Box \phi$ over plain truth of $\phi$ is the existence of some strong a priori argument for $\phi$, perhaps a mathematical proof.

Interestingly, on the latter understanding, the ‘intensional surplus’ of modality comes as an existential, rather than a universal quantifier. And yet the proof-theoretic interpretation validates many base laws that also hold for the universal quantifier. For instance, the well-known law of Modal Distribution $$\Box(\phi \rightarrow \psi) \rightarrow (\Box \phi \rightarrow \Box \psi)$$ is valid on both views, though for intuitively different reasons. With universal quantification in models, it reflects the predicate-logical law $$\forall x(\phi \rightarrow \psi) \rightarrow (\forall x\phi \rightarrow \forall x\psi)$$ while in terms of the existence of proofs, it says that proofs for $\phi$ and for $\phi \rightarrow \psi$ can be combined into one for $\psi$. This harmony between the existence of a proof of a formula and its universal truth in some suitable semantic universe is of course not unheard of: it will be familiar to students of completeness theorems for logical systems.

Modal logic is at work in many disciplines beyond philosophy, as one can see in the 2006 Handbook of Modal Logic or the conference series Advances in Modal Logic. Van Benthem 2010 is a textbook in modal logic with the same broad thrust. This is the breadth of the field that we are after in this article, though, in the context of an Encyclopedia like this, we will be making special reference to interfaces of modal logic and philosophy, past and present, at various places.

2. A Very Brief History of Modal Logic

Aristotle already considered a calculus for reasoning with modal syllogistic forms like “every $P$ is necessarily $Q$”. The topic continued in the Middle Ages, and we still find modality firmly entrenched as a major logical notion in the famous Table of Categories in Kant’s Kritik der Reinen Vernunft. All this was swept aside in the extensional turn of Frege’s Begriffsschrift in 1879. On one telling page the author enumerates a list of things for which he sees no need – and readers of some erudition will recognize the anonymous enemy as Kant’s Table of Categories. Nevertheless, in the twentieth century modal notions made their way back onto the logical agenda, leading to extensions of classical systems with operators of necessity, possibility, entailment, and other notions.

Over time, these formalisms have become influential as a tool for analyzing a wide range of philosophical arguments about various modal notions, such as the many beautiful examples of temporal reasoning in Prior 1967. But non-philosophical applications were never far away, starting with mathematics. Gödel 1933 showed how to embed Heyting’s intuitionistic propositional logic faithfully into the modal logic S4, Tarski 1938 showed how to axiomatize modal structures in topological spaces, and the classic paper Jónsson–Tarski 1951 provided a seminal technical apparatus for modal logic in terms of universal algebra, with representation theorems going to accessibility-based possible worlds models. Nevertheless, it is often thought that modal logic is the tool par excellence for philosophical logic, giving the practitioner just the right expressive finesse to deal with metaphysical modality, time, space, knowledge, belief, counterfactuals, deontic notions, and so on. The Handbook of Philosophical Logic (Gabbay & Guenthner, eds., 1981–1987, 2001–2007) has a wide range of pertinent illustrations, also for many topics in this article. However, our focus implies no claim to exclusivity: for some philosophical fare the right conceptual cutlery may be first- or higher-order logic rather than modal logic. Or better yet, as we shall see soon, one can use both.

In some circles, modal logic still has a flavor of ‘alternative’ logic, a sort of counter-culture to standard systems like first-order logic. Some philosophers see the intensional character of modality as a challenge to, rather than a natural extension of, extensional notions. This also seems to be the view enshrined in some fashionable terminology calling modal formulas not ‘true’ in models, as one does for ordinary logical languages, but ‘forced’ in some mysterious manner. This impression of exoticness is wildly obsolete, and modal languages will be a standard part of the heartland of logic in the perspective taken later on, applying also to a variety of standard topics in mathematical logic.

Moving beyond philosophy and mathematics, since 1970, modal logic has come to flourish at interfaces with linguistics: compare the treatment of intensional operators and verbs in Montague 1974, the modal grammar of Blackburn & Meyer Viol 1994, or modal logics of context in linguistics and AI such as the one in Buvac & Mason 1994. It has also thrived in computer science with dynamic or temporal logics of programs, logics of spatial structures, or modal description logics for knowledge: see the Handbooks van Leeuwen ed. 1991, Abramsky, Gabbay & Maibaum eds. 1992, Gabbay, Hogger & Robinson eds. 1997, Aiello, Pratt & van Benthem eds. 2007, or monographs such as Fagin, Halpern, Moses & Vardi 1995, Harel, Kozen & Tiuryn 2000. In fact, the range of applications is still growing, with seminal uses of modal logic in economics (for example, logics of knowledge in the foundations of game theory: see Leyton-Brown & Shoham 2008, Perea 2011), or new ventures in argumentation theory (Grossi 2010). We cannot compile a representative bibliography for the field in an article like this. Suffice it to say that the bulk of modal logic research today, both applied and pure, takes place inside or close to computer science and related fields.

Restated more in terms of themes, the major interpretations of modal formalisms these days fall under two main headings: information and action (van Benthem & Blackburn 2006). A typical modal formalism for analyzing information (though by no means the only one) is ‘epistemic logic’ where possible worlds are viewed as epistemic alternatives to the actual world, and the universal modality $\Box \phi$ expresses knowledge in the sense of having the semantic information that $\phi$ holds. A well-known formalism for action is ‘dynamic logic’ where worlds are states of some computational process, and a labeled modality $[a]\phi$ says that all states reachable from the current one by performing action $a$ satisfy $\phi$. We will discuss both of these interpretations in more detail below. The fact that modal laws can be similar in both cases also highlights a deep conceptual duality between information and action that has also been noted by philosophers.

In this process of expansion, but also for internal theoretical reasons that we shall see, modal operators are now often viewed as a special kind of ‘bounded quantifiers’, making modal logic, not an extension of classical logic, but rather a fragment in terms of its expressive power over possible worlds. As such its attraction acquires a new flavor. Rather than being baroque extensions of the sort that Frege rejected, modal languages have a charming austerity, and they demonstrate how ‘small is beautiful’.

But emphasizing distance from the original philosophical habitat may be misleading. Expats may return to their homeland, and indeed, many modern themes and results of modal logic make sense inside contemporary philosophy. They find continued and even reviving spheres of application in metaphysics, mereology, epistemology, meta-ethics, and other areas – and one might even make the case that information and action are just as crucial notions to philosophy as the original metaphysical modalities.

3. The Basic System: Modality on Graphs

In this section, we review the basic system of propositional modal logic, emphasizing key technical features. With this in place, we will survey extensions in later sections, while ending this article with a few deeper excursions to the contemporary scene.

Basic setting Our basic idea is simply this: we describe properties of directed graphs consisting of points (‘possible worlds’ if you like grandeur) with directed links encoded in an ‘accessibility relation’ between points. A universal modality $\Box \phi$ is true at a point in a graph if $\phi$ is true at all points reachable by a directed arrow. Graphs are ubiquitous in many areas, and they are a good abstraction level for understanding what modal logic is about. And as we all know, pure high mountain air is good for you.

The basic modal language is a useful laboratory for logical techniques. We sketch the basic modal logic of graphs, including the usual topics of language, semantics, and axiomatics. But sticking to only these would mean ordering only part of the full menu available today, depriving you of acquiring a richer palate. So, we will serve you richer fare in what follows, allowing you to appreciate more of a broader literature.

Language and semantics We interpret formulas in models $\mathbf{M} = (W, R, V)$, that may be viewed as directed graphs $(W, R)$ with annotations for proposition letters, given by the valuation $V$ sending each proposition letter $p$ to the set of points $V(p)$ where $p$ is true. When evaluating complex formulas, one can take either the existential or the universal modality as a primitive (both have their comfort zones in logical research):

$\mathbf{M}, s \vDash \Diamond \phi$ iff for some $t$ with $Rst$, $\mathbf{M}, t \vDash \phi$

$\mathbf{M}, s \vDash \Box \phi$ iff for all $t$ with $Rst$, $\mathbf{M}, t \vDash \phi$

It helps to think of points in $W$ as states of some kind, while accessibility encodes dynamic moves that can be made to get from one state to another. But there are many other useful views of these ‘decorated graphs’, including complete ‘worlds’.
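The two truth clauses can be read off directly as a recursive procedure. The following is a minimal sketch in Python, assuming a finite model given as a triple of sets; the tuple encoding of formulas and the three-point model are illustrative assumptions, not the article's own picture.

```python
def holds(model, s, f):
    """Truth of formula f at point s in model = (W, R, V).

    Formulas are proposition letters (strings) or tuples:
    ("not", f), ("and", f, g), ("or", f, g), ("dia", f), ("box", f).
    This encoding is an assumption made for the sketch.
    """
    W, R, V = model
    if isinstance(f, str):
        return s in V.get(f, set())
    op = f[0]
    if op == "not":
        return not holds(model, s, f[1])
    if op == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if op == "or":
        return holds(model, s, f[1]) or holds(model, s, f[2])
    successors = [t for (u, t) in R if u == s]
    if op == "dia":   # some successor satisfies f[1]
        return any(holds(model, t, f[1]) for t in successors)
    if op == "box":   # every successor satisfies f[1]
        return all(holds(model, t, f[1]) for t in successors)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical model: arrows 1 -> 2, 2 -> 3, 1 -> 3, with p true only at 3.
M = ({1, 2, 3}, {(1, 2), (2, 3), (1, 3)}, {"p": {3}})

print(holds(M, 1, ("dia", "p")))   # p is reachable in one step from 1
print(holds(M, 1, ("box", "p")))   # fails because successor 2 lacks p
print(holds(M, 3, ("box", "p")))   # vacuously true: 3 has no successors
```

Note that an endpoint such as 3 satisfies every formula $\Box \phi$, which is why $\Box\bot$ is a useful way of saying ‘there are no successors’.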

As an example, consider the following graph:

Using the above truth definition, the formula $\Diamond\Box\Diamond p$ is true at $1, 4$, but it is false at $2, 3$.

One conceptual finesse should be stressed here that is often ill-understood. Some critics find the ‘points’ in this picture too unstructured and poor to model lush possible worlds in some pre-theoretical philosophical sense. But the total modal structure of a point includes its environment, with all its interactions with other points through the relation $R$. This is more like the way we think of ‘objects’ in category theory as given not so much by their internal structure as by their pattern of functional interactions with other points. Indeed, modal models can be viewed as categories, and this, too, has proved a valid and rich interpretation – even though it is beyond the scope of this article.

Remark. There is a continuing historical discussion about the origins of this semantics. Often-quoted papers are Kripke 1959, 1963, but there were predecessors on the other side of the Atlantic, of which we mention Kanger 1957. To avoid taking sides through terminology, in this article, we choose neutral terms such as ‘models’ and ‘frames’.

Expressive power and invariance for bisimulation Languages are used to define and say things, a communicative function that may even be prior to reasoning. The expressive power of a modal language, or indeed any language, can typically be measured by a notion of similarity between different models, telling us what differences in structure the language can and cannot detect. Mathematically, such an analysis calls for a suitable ‘invariance relation’, or philosophically: a ‘criterion of identity’, between models – and finding one is a test on whether one has really understood a given logic. Here is an invariance relation that fits the basic modal language: it is not standard fare in philosophical textbooks, but learn it, and you will have entered the realm of modern modal logic.

Definition A bisimulation between two models $\mathbf{M}, \mathbf{N}$ is a binary relation $E$ between points $m, n$ in the respective models such that, whenever $m E n$, then (a) $m, n$ satisfy the same proposition letters, (b1) if $m R m'$, then there exists a world $n'$ with both $n R n'$ and $m' E n'$, (b2) the same ‘zigzag clause’ holds in the opposite direction.

Together, this atomic harmony for proposition letters plus the two dynamic zigzag clauses that can be called again and again, make bisimulation a natural notion of process equivalence tracking possible evolutions of a process step by step. Indeed, this notion was discovered independently in modal logic, computer science, and set theory.

Here is an example, disregarding proposition letters for simplicity. The two black worlds in the depicted models $\mathbf{M}, \mathbf{N}$ are linked by a bisimulation consisting of all matches marked by dotted lines – but there is no bisimulation that includes a match between the black worlds in the following models $\mathbf{N}$ and $\mathbf{K}$:

Here is a first case of ‘fit’: modal formulas are invariant for bisimulation.

Invariance Lemma If $E$ is a bisimulation between $\mathbf{M}$ and $\mathbf{N}$ with $m E n$,
then $m, n$ satisfy the same modal formulas.

In particular, we can show the failure of bisimulation between the above models $\mathbf{N}$, $\mathbf{K}$ by noting that $\mathbf{N}$ satisfies the modal formula $\Diamond\Diamond\Box\bot$ (with $\bot$ for the constant formula ‘false’) in its root (marked as a black dot), whereas $\mathbf{K}$ does not.

The converse to the Lemma only holds for a modal language with arbitrary infinite conjunctions and disjunctions – or for the plain modal language over special models.

Proposition If $m, n$ satisfy the same modal formulas in two finite models $\mathbf{M}, \mathbf{N}$, then there exists a bisimulation $E$ between $\mathbf{M}, \mathbf{N}$ with $m E n$.
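For finite models, the definition of bisimulation suggests a simple refinement procedure: start from all pairs in atomic harmony and repeatedly discard pairs that violate a zigzag clause. The sketch below is one way to do this, not an optimized algorithm; the models `M`, `N`, `K` are hypothetical small examples (not the article's pictures) without proposition letters.

```python
def successors(R, s):
    return {t for (u, t) in R if u == s}


def largest_bisimulation(M, N):
    """Greatest bisimulation between finite models M and N, each a triple (W, R, V)."""
    (W1, R1, V1), (W2, R2, V2) = M, N
    letters = set(V1) | set(V2)
    # Clause (a): start with every pair that agrees on all proposition letters.
    E = {(m, n) for m in W1 for n in W2
         if all((m in V1.get(p, set())) == (n in V2.get(p, set())) for p in letters)}
    changed = True
    while changed:
        changed = False
        for (m, n) in list(E):
            # Clause (b1): every move from m is matched by some move from n.
            forth = all(any((m2, n2) in E for n2 in successors(R2, n))
                        for m2 in successors(R1, m))
            # Clause (b2): every move from n is matched by some move from m.
            back = all(any((m2, n2) in E for m2 in successors(R1, m))
                       for n2 in successors(R2, n))
            if not (forth and back):
                E.discard((m, n))
                changed = True
    return E


# Hypothetical models: M forks into two endpoints, N has one endpoint, K is a two-step chain.
M = ({"a", "b", "c"}, {("a", "b"), ("a", "c")}, {})
N = ({"x", "y"}, {("x", "y")}, {})
K = ({"u", "v", "w"}, {("u", "v"), ("v", "w")}, {})

print(("a", "x") in largest_bisimulation(M, N))   # roots of M and N are bisimilar
print(("a", "u") in largest_bisimulation(M, K))   # roots of M and K are not
```

In line with the Invariance Lemma, the failure for `M` and `K` is witnessed by a modal formula: $\Diamond\Box\bot$ holds at `a` but not at `u`.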

There are many further definability results in modal model theory. For instance, for any model $\mathbf{M}, s$ with designated point $s$, there is an infinitary modal formula $\phi^{\mathbf{M},s}$ true in only those models $\mathbf{N}, t$ that are bisimilar to $\mathbf{M}, s$ (that is, some bisimulation links $t$ to $s$). Deeper model-theoretic studies of definability aspects of modal logic can be found in Blackburn, de Rijke & Venema 2001, Blackburn, van Benthem & Wolter, eds. 2006.

Invariance is of independent interest for its emphasis on comparisons between different models, a topic that seems somewhat neglected in philosophical logic. Barwise & van Benthem 1999 even have interpolation theorems casting bisimulation in the role of ‘transfer inference’, allowing us to find out facts about one model by reasoning about another model sufficiently ‘like it’. This brings us to the second main aspect of logic, providing a calculus of reasoning for the intended area of application.

Validity, proof systems, deductive power Universal validity in the basic modal logic is axiomatized in Hilbert-style by a system called the minimal modal logic K (for Kripke):

(a) all laws of propositional logic

(b) a definition of $\Diamond \phi$ as $\neg\Box\neg\phi$

(c) the modal distribution axiom $\Box(\phi\rightarrow \psi) \rightarrow (\Box \phi\rightarrow\Box \psi)$

(d) the necessitation rule “if $\vdash \phi$, then $\vdash \Box \phi$”

This looks like a standard axiomatization of first-order logic with $\Box$ as $\forall$, and $\Diamond$ as $\exists$, but leaving out first-order axioms with tricky side conditions on freedom and bondage of terms:

$\forall x\phi \rightarrow [t/x]\phi$ and
$\phi \rightarrow \forall x\phi$

Modal deduction is simple quantifier reasoning in a perspicuous variable-free notation. Many other formats for modal proof systems exist, such as sequent calculus or natural deduction. Modal proof theory is still an area in progress (Wansing, ed. 1996), but important strides are being made (compare Negri 2011).

Mathematical theory Starting from the 1970s, an extensive mathematical theory has sprung up for basic modal logic, including model theory and proof theory, while using perspectives from universal algebra. Instead of listing the classical references, we refer the reader to a modern monograph like Chagrov & Zakharyashev 1996, or the Handbook Blackburn et al. eds. 2006. In this article, we only mention a few highlights.

Translation and invariance One basic technique for putting modal logic in a broader perspective is a translation $T$ from modal formulas $\phi$ to first-order formulas $T(\phi)$ with one free variable $x$ having the same truth conditions on models $\mathbf{M}, s$:

(a) $T(p) = Px$,

(b) $T$ commutes with Boolean operators,

(c) $T(\Diamond \phi) = \exists y(Rxy \wedge [y/x]T(\phi))$, $T(\Box \phi) = \forall y(Rxy \rightarrow [y/x]T(\phi))$.

With some care, only 2 variables $x, y$ are needed in these translations (free or bound). For instance, $$\Box\Diamond\Box p$$ translates faithfully into $$\forall y(Rxy \rightarrow \exists\mathbf{x}(Ry\mathbf{x} \wedge \forall y(R\mathbf{x}y \rightarrow Py)))$$
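The two-variable trick amounts to swapping the roles of $x$ and $y$ each time the translation passes a modality. Here is a small sketch that produces the translation as a plain string; the tuple encoding of formulas and the ASCII spelling of the connectives (`forall`, `exists`, `->`, `&`, `|`, `~`) are assumptions made for the illustration.

```python
def translate(f, x="x", y="y"):
    """Standard translation of a modal formula, with free variable x, using only x and y.

    Formulas are proposition letters (strings) or tuples:
    ("not", f), ("and", f, g), ("or", f, g), ("dia", f), ("box", f).
    """
    if isinstance(f, str):
        return f"{f.upper()}{x}"              # T(p) = Px
    op = f[0]
    if op == "not":
        return f"~{translate(f[1], x, y)}"
    if op == "and":
        return f"({translate(f[1], x, y)} & {translate(f[2], x, y)})"
    if op == "or":
        return f"({translate(f[1], x, y)} | {translate(f[2], x, y)})"
    # At a modality the bound variable is y; inside, the roles of x and y are swapped.
    if op == "dia":
        return f"exists {y} (R{x}{y} & {translate(f[1], y, x)})"
    if op == "box":
        return f"forall {y} (R{x}{y} -> {translate(f[1], y, x)})"
    raise ValueError(f"unknown operator: {op!r}")


print(translate(("box", ("dia", ("box", "p")))))
```

For $\Box\Diamond\Box p$ this yields the same variable pattern as the displayed formula: $y$ is bound, then $x$ is re-bound, then $y$ again.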

Here is the essential semantic feature that makes these translated modal formulas special inside the full first-order language over the signature $R^{2}, P^{1}, Q^{1}, \ldots$ of models:

Modal Invariance Theorem The following assertions are equivalent for all first-order formulas $\phi = \phi(x)$: (a) $\phi$ is equivalent to a translated modal formula, (b) $\phi$ is invariant for bisimulations.

The resulting modal fragment of first-order logic turns out to share nice properties of the full system such as Compactness, Interpolation, Löwenheim-Skolem, model-theoretic preservation theorems, and others. This is not automatic inheritance, and classical meta-proofs often need to be adapted creatively using bisimulation. But unlike first-order logic, modal logic is decidable – showing fine-structure inside classical logic: with a delicate balance between expressive power and computational complexity.

The fragment perspective is quite general: many other modal languages live inside first-order logic or other standard logics under some translation for their standard semantics. We will see later what makes these fragments so well behaved.

Landscapism A typical feature of modal logic has to do with its historical proliferation of deductive systems: ‘modal logics’ of different proof strength inside the same basic language. On top of the minimal logic, there are uncountably many different normal modal logics given by the same rules of inference as above plus various sets of axiom schemata. This deductive landscape has two major highways, because of the following:

Theorem Every consistent normal modal logic is either a subset of the logic $Id$ with characteristic axiom $\phi \leftrightarrow \Box \phi$, or of $Un$ with axiom $\Box\bot$.

On the former road lie well-known systems like $T, S4, S5$, but the latter road has landmarks such as Löb’s logic of arithmetical provability axiomatized by $$\Box(\Box \phi \rightarrow\phi) \rightarrow \Box \phi$$ Logics in this deductive landscape can be studied by proof-theoretic methods, but also semantically – once we find completeness theorems bridging the two realms.

Completeness Let us now turn to the way in which modal logics viewed as deductive systems are correlated with semantic models. A typical completeness theorem is this:

Theorem A modal formula is provable in $K4$ (minimal $K$ plus the axiom $\Box \phi \rightarrow \Box\Box \phi$) iff it is true in all models whose accessibility relation is transitive.

There are many techniques for proving such results, ranging from simple inspection of the canonical Henkin model of all complete theories in the logic to forms of drastic model surgery. The demand for completeness theorems comes from two sides. Either one has a pre-existing modal logic given by syntactic axioms and rules (like many first-generation modal systems), and seeks a useful matching model class – or one has a natural model class (say, some interesting space-time structure), and wishes to axiomatize its laws for simple modal reasoning. The literature is replete with both. In this survey, we do not pursue either line, but they are very well-documented (Blackburn, de Rijke & Venema 2001, Chagrov & Zakharyashev 1996, amongst many sources).

Correspondence The correspondence between modal axioms and special properties of the accessibility relation in a class of models continues to be one of the major attractions of modal logic. It can be studied directly, calling a modal formula true in a frame $(W, R)$ (a model stripped of its valuation) if it holds under all valuations. Many modal axioms then correspond to simple first-order properties. The Sahlqvist Theorem describes an effective method constructing first-order equivalents from modal axioms of a suitable shape, which has by now reached the world of automated theorem proving. It proceeds by substituting first-order descriptions of ‘minimal valuations’ into the first-order translation of a modal axiom to get a natural first-order equivalent, if available.

As an instance of this procedure, a $K4$ axiom $$\Box p \rightarrow \Box\Box p$$ has a first-order translation $$\forall y(Rxy \rightarrow Py) \rightarrow \forall y(Rxy \rightarrow \forall z(Ryz \rightarrow Pz))$$ A minimal valuation for $p$ making the antecedent true is $Pu = Rxu$. Substituting this, and dropping the tautological antecedent, we obtain $$\forall y(Rxy \rightarrow \forall z(Ryz \rightarrow Rxz))$$ that is, frame transitivity. Non-first-order principles are the McKinsey Axiom $$\Box\Diamond p \rightarrow \Diamond\Box p$$ and our earlier Löb Axiom.
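The correspondence between the $K4$ axiom and transitivity can also be checked by brute force on small frames. The sketch below is a sanity check rather than a proof, under the assumption that three points suffice for illustration: it enumerates every frame on three points and compares frame truth of $\Box p \rightarrow \Box\Box p$, quantifying over all valuations for $p$, with transitivity of $R$. The function names are hypothetical.

```python
from itertools import product


def box(W, R, X):
    """Points all of whose R-successors lie in the set X."""
    return {s for s in W if all(t in X for (u, t) in R if u == s)}


def valid_4(W, R):
    """Is box p -> box box p true at every point under every valuation of p?"""
    points = sorted(W)
    for bits in product([False, True], repeat=len(points)):
        P = {s for s, bit in zip(points, bits) if bit}
        if not box(W, R, P) <= box(W, R, box(W, R, P)):
            return False
    return True


def transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def check_all_frames(n=3):
    """Compare frame validity of the axiom with transitivity on all frames with n points."""
    W = set(range(n))
    pairs = [(a, b) for a in W for b in W]
    for bits in product([False, True], repeat=len(pairs)):
        R = {pair for pair, bit in zip(pairs, bits) if bit}
        if valid_4(W, R) != transitive(R):
            return False
    return True


print(check_all_frames(3))
```

The refuting valuation found for a non-transitive frame is typically the minimal one from the text: $p$ true exactly at the successors of the point where transitivity fails.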

Correspondence theory has produced many general results. One classic is a theorem in Goldblatt & Thomason 1975 that we state for its form only, omitting details. A first-order frame-property is modally definable iff it is preserved under taking (a) generated subframes, (b) $p$-morphic frame images, (c) disjoint unions, and (d) inverse ultrafilter extensions. Correspondence theory involves a study of simple modal fragments of the complex realm of monadic second-order logic, a perspective we will not pursue here.

Digression This is the classical view of correspondence (van Benthem 1984). But one can always rethink orthodoxy. Are the usual ‘modal logics’ with their special axioms really logics, or theories of special domains over a unique minimal logic? Special frame properties are nice, but they may be in need of further explanation that suggests alternative views. For example, transitivity is an effect of closing an accessibility relation under iterations, and then $K4$ is the logic of a special closure modality definable in just the minimal $K$-style ‘dynamic logic’ to be discussed below.

Next, we move to two basic themes that have risen to prominence since the late twentieth century—not just in modal logic, but also for logical systems generally.

Computation The basic modal language is a decidable miniature of first-order logic. There are many decision methods for validity or satisfiability exploiting special features of modal formulas – each with their virtues. Well-known methods are selection, filtration, and reduction, for which we refer to the literature (Marx 2006).

But there is a deeper issue here, going beyond the traditional understanding of logical systems. What is the precise computational complexity of various key tasks for a logic, allowing us to gauge its difficulty as a device to be used seriously? These key tasks include testing for satisfiability, but also model checking for truth, as well as comparing models. Here are the facts for the basic modal logic. (a) Given a finite model $\mathbf{M}, s$ and a modal formula $\phi$, checking whether $\mathbf{M}, s \vDash \phi$ takes polynomial time in length($\phi$) + size($\mathbf{M}$). This is better than for first-order logic, where this task takes polynomial space. (b) Checking if a modal formula $\phi$ has a model takes polynomial space in the size of $\phi$. For first-order logic, this is undecidable. (c) Checking if there is a bisimulation between finite models $\mathbf{M}, s$ and $\mathbf{N}, t$ takes polynomial time in the size of these models.
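The polynomial bound in (a) comes from evaluating bottom-up: compute, for each subformula once, the set of points where it holds. A minimal sketch of such a labeling procedure follows, assuming the same tuple encoding of formulas as in the truth definition; the cache and the example model are illustrative assumptions.

```python
def extension(model, f, cache=None):
    """Set of points of model = (W, R, V) where f holds.

    Each distinct subformula is evaluated once (results are cached),
    and each modal step makes a single pass over the arrows in R.
    """
    W, R, V = model
    if cache is None:
        cache = {}
    if f in cache:
        return cache[f]
    if isinstance(f, str):
        result = set(V.get(f, set())) & set(W)
    elif f[0] == "not":
        result = set(W) - extension(model, f[1], cache)
    elif f[0] == "and":
        result = extension(model, f[1], cache) & extension(model, f[2], cache)
    elif f[0] == "or":
        result = extension(model, f[1], cache) | extension(model, f[2], cache)
    elif f[0] == "dia":
        X = extension(model, f[1], cache)
        result = {s for (s, t) in R if t in X}                 # some arrow into X
    elif f[0] == "box":
        X = extension(model, f[1], cache)
        result = set(W) - {s for (s, t) in R if t not in X}    # no arrow out of X
    else:
        raise ValueError(f"unknown operator: {f[0]!r}")
    cache[f] = result
    return result


# Hypothetical model: arrows 1 -> 2, 2 -> 3, 1 -> 3, with p true only at 3.
M = ({1, 2, 3}, {(1, 2), (2, 3), (1, 3)}, {"p": {3}})
print(extension(M, ("box", ("dia", "p"))))
```

A naive recursion on the truth definition may revisit the same point and subformula many times along different paths; the labeling version avoids this, which is the source of the polynomial behavior.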

These benchmark complexities for logics differ as languages are varied. Complexity awareness may be a new feature to many logicians and philosophers, but computational behavior seems a feature of basic importance in understanding formal frameworks.

Interaction and games The modern view of computation is one of interactive agency (compare the AAMAS conferences, http://www.ifaamas.org/index.html), and accordingly, games provide a new perspective on logics (van Benthem 2014), including modal logic. In a modal evaluation game, two players Verifier ($V$) and Falsifier ($F$) disagree about a formula at point $s$ in a given model $\mathbf{M}$. Disjunction is a choice for $V$, conjunction for $F$, negation is a role switch, $\Diamond$ makes $V$ pick a point reachable from the current point, $\Box$ does the same for $F$. A game $p$ is won by $V$ if the atom $p$ holds at the current point, otherwise by $F$. A player also wins if the opponent has no move for a modality.

The crucial equivalence governing this game is as follows:

Fact $\mathbf{M}, s \vDash \phi$ iff Verifier has a winning strategy for the $\phi$-game in $\mathbf{M}$ starting at $s$.
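On finite models the Fact can be tested mechanically by solving the game backwards and comparing the result with the truth definition. The sketch below does this for a hypothetical model and a few sample formulas; the `swapped` flag, which tracks role switches at negations, is an implementation device assumed here and not part of the article's description.

```python
def holds(model, s, f):
    """Truth definition, used only as the reference to compare the game against."""
    W, R, V = model
    if isinstance(f, str):
        return s in V.get(f, set())
    succ = [t for (u, t) in R if u == s]
    if f[0] == "not":
        return not holds(model, s, f[1])
    if f[0] == "or":
        return holds(model, s, f[1]) or holds(model, s, f[2])
    if f[0] == "and":
        return holds(model, s, f[1]) and holds(model, s, f[2])
    if f[0] == "dia":
        return any(holds(model, t, f[1]) for t in succ)
    return all(holds(model, t, f[1]) for t in succ)          # "box"


def verifier_wins(model, s, f, swapped=False):
    """Does the player who started as Verifier have a winning strategy?

    `swapped` is True when negations have exchanged the two roles an odd number of times.
    """
    W, R, V = model
    if isinstance(f, str):
        # Whoever currently holds the Verifier role wins iff the atom is true here.
        return (s in V.get(f, set())) != swapped
    if f[0] == "not":
        return verifier_wins(model, s, f[1], not swapped)
    if f[0] in ("or", "and"):
        moves = [(s, f[1]), (s, f[2])]
    else:                                                     # "dia" or "box"
        moves = [(t, f[1]) for (u, t) in R if u == s]
    outcomes = [verifier_wins(model, t, g, swapped) for (t, g) in moves]
    # "or"/"dia" are choices for the current Verifier role, "and"/"box" for Falsifier.
    original_verifier_moves = (f[0] in ("or", "dia")) != swapped
    # A player with no available move loses: any([]) is False, all([]) is True.
    return any(outcomes) if original_verifier_moves else all(outcomes)


# Hypothetical model and sample formulas.
M = ({1, 2, 3}, {(1, 2), (2, 3), (1, 3)}, {"p": {3}})
samples = [("dia", ("box", ("dia", "p"))),
           ("not", ("box", "p")),
           ("or", "p", ("box", ("not", "p")))]
print(all(verifier_wins(M, s, f) == holds(M, s, f) for s in M[0] for f in samples))
```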

Here is an example. Our first model picture when we introduced the basic semantics induces the following tree for an evaluation game for the formula $$\Diamond\Box\Diamond p$$ starting from point 1, with boldface indicating the winning positions for Verifier:

In this game, $V$ has two winning strategies: left and right, $<$right, down$>$. These are indeed the two possible successful ways of verifying $\Diamond\Box\Diamond p$ in the given model at point 1.

This style of analysis is widespread in the current literature. There are also model comparison games between players Duplicator (maintaining an analogy) and Spoiler (claiming a difference), playing over pairs of points $(m, n)$ in two given models $\mathbf{M}, \mathbf{N}$. This may be seen as a fine-structured way of checking for existence of a bisimulation, where successor states chosen in one model by Spoiler must be matched by successors in the other model, chosen by Duplicator, while atomic harmony always remains. Without providing details, we note that in games like this, (a) Spoiler’s winning strategies in a $k$-round game between $\mathbf{M}, s$ and $\mathbf{N}, t$ match the modal formulas of operator depth $k$ on which the points $s, t$ disagree, (b) Duplicator’s winning strategies over an infinite round game between $\mathbf{M}, s, \mathbf{N}, t$ match the bisimulations linking $s$ to $t$.

Many other logical notions can be ‘gamified’. For instance, proof games find deductions or counter-examples through a dialogue between two players about some initial claim. And always, the logical core notion turns out to match a strategy for interactive play.

This completes our sketch of basic modal logic as a meeting place for a wide range of logical notions, techniques, and results. In Section 4, we look at some concrete modern applications in more detail, and then in Section 5, we identify further general issues to which these give rise, reinforcing the role of modal logic as a conceptual lab.

4. Some Active Current Applications

We have given some information on the attractions of the basic system of abstract modal logic. At the same time, it is also important to see that many different concrete interpretations can be attached to this system, and how diverse these are.

Knowledge and belief One of the major interpretations of modal logic in use today reads modalities as operators of knowledge or belief (Hintikka 1962, Stalnaker 1984), though this reading is itself a subject of ongoing debate. Languages like this express many further basic epistemic patterns that occur in natural discourse, such as:

$K_{i}\phi \vee K_{i}\neg \phi$ “agent $i$ knows whether $\phi$ is the case”

On this interpretation, standard modal axioms acquire a new epistemic flavor, such as:

‘Positive introspection’: $K_{i}\phi \rightarrow K_{i} K_{i}\phi$
‘Negative introspection’: $\neg K_{i}\phi \rightarrow K_{i} \neg K_{i}\phi$

again readings that have been subject to critical debate.

A major new theme in the epistemic setting is a social one. It is not lonely thinkers that are essential to cognition, but interaction between different agents $i, j$ in a group, involving what they know about each other, in patterns such as $K_{i}K_{j}\phi$ or $K_{i}\neg K_{j}\phi$. What I know about your knowledge or ignorance is crucial, both to my understanding and to my actions. For instance, I might empty your safe tonight if I believe that you do not know that I know the combination. Some forms of group knowledge transcend simple iterations of individual knowledge assertions. A key example is common knowledge: if everyone knows that your partner is unfaithful, you have private embarrassment – if it is common knowledge, you have public shame. Technically, this works as follows in our models. A new common knowledge modality $C_{G}\phi$ says that $\phi$ holds at every world reachable via a finite chain of uncertainty relations for agents in $G$.

For instance, in the following picture, where epistemic accessibility is an equivalence relation, the atomic fact $p$ holds in the current world, marked by the black dot:

In the current world, our semantics yields the following further facts:

(a) agent $Q$ does not know whether $p$: $\neg K_{Q}p \wedge \neg K_{Q}\neg p$,
(b) agent $A$ does know that $p$: $K_{A}p$, while
(c) it is common knowledge in the group $\{Q, A\}$ that $A$ knows whether $p$: $C_{\{Q, A\}}(K_{A}p \vee K_{A}\neg p)$.
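Facts of this kind can be computed by treating $K_{i}$ as a box over agent $i$'s uncertainty relation and $C_{G}$ as a box over everything reachable through the relations of agents in $G$. The two-world model in the sketch below is a hypothetical reconstruction chosen to satisfy (a) to (c), not necessarily the article's picture: $Q$ cannot tell the worlds apart, while $A$ can.

```python
def knows(rel, agent, s, prop):
    """K_agent at s: prop (a set of worlds) contains every world the agent considers possible at s."""
    return all(t in prop for (u, t) in rel[agent] if u == s)


def common_knowledge(rel, group, s, prop):
    """C_group at s: prop holds at every world reachable by a finite chain of steps for agents in group."""
    frontier = [t for a in group for (u, t) in rel[a] if u == s]
    seen = set()
    while frontier:
        w = frontier.pop()
        if w in seen:
            continue
        seen.add(w)
        frontier.extend(t for a in group for (u, t) in rel[a] if u == w)
    return seen <= prop


# Hypothetical model: p holds at w1 (the current world) and fails at w2.
worlds = {"w1", "w2"}
p = {"w1"}
not_p = worlds - p
rel = {
    "Q": {(x, y) for x in worlds for y in worlds},   # Q cannot distinguish w1 from w2
    "A": {(x, x) for x in worlds},                   # A can distinguish them
}

# Worlds where A knows whether p.
a_knows_whether_p = {w for w in worlds if knows(rel, "A", w, p) or knows(rel, "A", w, not_p)}

print(not knows(rel, "Q", "w1", p) and not knows(rel, "Q", "w1", not_p))   # fact (a)
print(knows(rel, "A", "w1", p))                                            # fact (b)
print(common_knowledge(rel, ["Q", "A"], "w1", a_knows_whether_p))          # fact (c)
```

In the same model $p$ itself is not common knowledge at `w1`, since $Q$'s uncertainty leads to `w2`.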

Incidentally, this is a good situation for $Q$ to ask $A$ the question whether $p$ is true: but more on epistemic actions below. Common knowledge treated in a modal style is a widely used notion by now in philosophy (Lewis 1969), but also in computer science (Fagin et al. 1995) and game theory (Aumann 1976, Battigalli & Bonanno 1999).

Similar models can represent belief. This is often done a bit crudely by adding one more accessibility relation that is no longer reflexive to allow for false beliefs. But more illuminating is a richer approach (Grove 1988, Baltag & Smets 2008). Thinking of equivalence classes of the epistemic relation as the total range of what an agent knows, we endow these with binary plausibility orderings that encode what the agent considers less or more plausible. Then a belief modality $B\phi$ is interpreted as saying that $\phi$ is true in all most plausible epistemically accessible worlds. And plausibility models also support a richer notion, namely a binary modality of conditional belief $B^{\psi}\phi$ saying that $\phi$ is true in all most plausible epistemically accessible worlds that satisfy $\psi$. Unlike the situation with conditional knowledge, conditional belief cannot be defined in terms of absolute belief. Indeed, the logic of conditional belief is much like modal logics for conditional assertions in models with similarity relations (Lewis 1973, Burgess 1981, Veltman 1985).
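The plausibility reading of belief is easy to prototype. In the sketch below, one agent's information cell is a set of worlds and the plausibility ordering is encoded, as a simplifying assumption, by a numeric rank with lower values meaning more plausible. The function `believes` is hypothetical and covers both $B\phi$ and $B^{\psi}\phi$.

```python
def believes(cell, rank, prop, condition=None):
    """B^condition prop: prop holds in all most plausible worlds of the cell that satisfy condition.

    With condition=None this is plain belief B prop.
    """
    candidates = [w for w in cell if condition is None or w in condition]
    if not candidates:
        return True                      # no world satisfies the condition: vacuously believed
    best = min(rank[w] for w in candidates)
    return all(w in prop for w in candidates if rank[w] == best)


# Hypothetical example: three epistemically possible worlds, a most plausible, c least.
cell = {"a", "b", "c"}
rank = {"a": 0, "b": 1, "c": 2}
p = {"a", "c"}

print(believes(cell, rank, p))                          # belief in p: the best world a satisfies p
print(believes(cell, rank, p, condition={"b", "c"}))    # given 'not a', the best world is b, where p fails
```

The second call shows how conditional belief looks at worlds that absolute belief ignores: what the agent would believe on learning that `a` is excluded depends on the ordering among the remaining worlds.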

Caveat Our use of the terms ‘knowledge’ and ‘belief’ is mainly a tribute to the tradition. Most philosophers and logicians no longer think of the above modalities as modeling real knowledge and belief, and think of $K\phi$ rather as representing the agent’s semantic information (Carnap 1947), and of $B\phi$ as what is true according to ‘the best of the agent’s information’. For a current modal study of how to model genuine notions of knowledge using more sophisticated philosophical intuitions, see Holliday 2012.

Dynamic logic of action Accessibility arrows can also be viewed quite differently, not in terms of knowledge and information, but as transitions for actions viewed as changing states of some relevant process, a computation, or a general course of events (Harel, Kozen & Tiuryn 2000). Modalities now get labeled with explicit action expressions to show what they range over. In dynamic logic – originally designed to describe execution of computer programs, but now used as a general logic of action,

$[\pi]\phi$ says that after every successful execution of action $\pi$, $\phi$ holds.

Read in this way, modal statements now relate actions to ‘postconditions’ describing their effects and also to ‘preconditions’ for their successful execution. Concrete models of this sort are process graphs describing the possible workings of some computer or abstract machine. For instance, a labeled formula $[a]\langle b\rangle p$ says that, at the current starting state, after every execution of action $a$ (there may be zero, one or more ways of doing this), it is possible to then perform action $b$ to achieve a state where $p$ holds.

Another concrete model for dynamic logic are games, where actions are moves available to several players. For instance, in the following game tree, player $E$ has a strategy for achieving an outcome satisfying $p$ against any play by player $A$:

This strategic assertion is captured by the modal formula $[a\cup b]\langle c \cup d\rangle p$.

Again we get a minimal modal logic, this time a two-level system treating propositions and actions denoting transition relations on a par. This joint setup allows for an analysis of important action constructions, encoded in valid principles of dynamic logic:

$$ [\pi ; \pi']\phi \leftrightarrow [\pi][\pi']\phi $$ sequential composition

$$ [\pi \cup \pi']\phi \leftrightarrow ([\pi]\phi \wedge [\pi']\phi) $$ choice

$$ [(\phi)?]\psi \leftrightarrow (\phi \rightarrow \psi) $$ test for proposition $\phi$

A major new feature here is unbounded finite repetition of actions: $\pi^{*}$. This notion is typical for computation, but also for action in general (‘keep adding salt to bring up to taste’) and it is not first-order definable. This shows in two more axioms:

$$ [\pi^{*}]\phi \leftrightarrow (\phi \wedge [\pi][\pi^{*}]\phi) $$ fixed-point axiom

$$ (\phi \wedge [\pi^{*}](\phi \rightarrow [\pi]\phi)) \rightarrow [\pi^{*}]\phi $$ induction axiom
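The action constructions above can be checked on a small finite process graph. The following Python sketch is only an illustration under stated assumptions: programs are encoded as nested tuples, propositions as sets of states, and the four-state model with basic actions `a` and `b` is hypothetical.

```python
def relation(prog, states, steps):
    """Transition relation (set of state pairs) denoted by a program term."""
    kind = prog[0]
    if kind == "atom":                      # basic action, looked up in `steps`
        return set(steps[prog[1]])
    if kind == "seq":                       # pi ; pi'
        r1 = relation(prog[1], states, steps)
        r2 = relation(prog[2], states, steps)
        return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}
    if kind == "choice":                    # pi U pi'
        return relation(prog[1], states, steps) | relation(prog[2], states, steps)
    if kind == "test":                      # (phi)? : stay put where phi holds
        return {(s, s) for s in prog[1]}
    if kind == "star":                      # pi* : reflexive-transitive closure
        step = relation(prog[1], states, steps)
        closure = {(s, s) for s in states}
        while True:
            bigger = closure | {(s, u) for (s, t) in closure
                                for (t2, u) in step if t == t2}
            if bigger == closure:
                return closure
            closure = bigger
    raise ValueError(f"unknown program constructor: {kind}")


def box(prog, phi, states, steps):
    """States where [prog]phi holds; phi is given as a set of states."""
    rel = relation(prog, states, steps)
    return {s for s in states if all(t in phi for (s2, t) in rel if s2 == s)}


# Hypothetical process graph with two basic actions.
states = {0, 1, 2, 3}
steps = {"a": {(0, 1), (1, 2)}, "b": {(0, 2), (2, 3)}}
a, b = ("atom", "a"), ("atom", "b")
phi = {2, 3}

# The composition law [a;b]phi <-> [a][b]phi, evaluated on this model.
lhs = box(("seq", a, b), phi, states, steps)
rhs = box(a, box(b, phi, states, steps), states, steps)
print(lhs == rhs)
```

Because the listed principles are valid, the same comparison succeeds for choice, test, and the fixed-point axiom on any finite model supplied in this format.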

Dynamic logics resemble infinitary fixed-point extensions of classical logic, but with a modal stamp: like the basic modal logic, they are bisimulation-invariant and decidable, forming a core calculus for reasoning about the essentials of recursion and induction. Fixed-point definitions are ubiquitous in computer science, mathematics and linguistics, as many natural scientific notions involve recursion. An elegant powerful system of this kind generalizes dynamic logic by adding a facility for arbitrary fixed-point definitions: the so-called $\mu$–calculus that we will consider briefly below.

Information update Different kinds of modal logic can also form new combinations. An example is the logic of information change, which combines knowledge and action. Our earlier epistemic formulas tell us what information agents have right now, but they do not say how this information changes, through acts of observation, communication, or learning in general. To model such cognitive actions, we need to combine epistemic and dynamic logic. One powerful idea here is that an information update changes the current epistemic model. In the simplest case, reflecting a ubiquitous common-sense intuition, this update mechanism works as follows, decreasing the current epistemic range:

a public announcement $!\phi$ of a proposition $\phi$ to a group of agents eliminates all worlds in the current epistemic model $\mathbf{M}$ that satisfy $\neg \phi$

Suppose that in our earlier two-agent two-world picture, $Q$ asks $A$: “$p$?” and $A$ then truthfully answers “Yes”. Then the $\neg p$-world gets eliminated, and we are left with a one-world model where $p$ has become common knowledge among $\{Q, A\}$.

But more subtle cases are possible, even with simple models. For example, a question itself may convey crucial information. By asking, $Q$ conveys the information that she does not know whether $p$. Even if $A$ did not know the answer at the start, this may tell him enough to settle $p$, and now answer the question. Here is a case where this happens:

But the modeling power of epistemic dynamics is still higher. Suppose that neither $Q$ nor $A$ knew whether $p$, but $A$ asks expert $R$, who answers only to $A$. Then $A$ learns whether $p$, $Q$ is no wiser about $p$, but it has become common knowledge that $A$ knows if $p$. This private act requires a new update changing models by ‘link elimination’:

The modal logic of update has some delicate features. For instance, a public announcement that some formula $\phi$ is the case need not always result in our learning that $\phi$ holds in the updated model. The reason is that truth value switches may happen when announcing formulas $\phi$ that contain a statement of ignorance. A well-known example is ‘Moore sentences’ of the form $p \wedge \neg Kp$, which become false after announcement.
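The world-elimination update and the behavior of Moore sentences can be traced in a few lines of Python. This is a minimal sketch for a single agent; the dictionary layout, the world names, and the helper names `knows` and `announce` are hypothetical choices made for illustration.

```python
def knows(model, world, phi):
    """K phi: phi holds in every world the agent cannot tell apart from `world`."""
    return all(phi(model, v) for v in model["worlds"]
               if model["same_info"](world, v))


def announce(model, phi):
    """Public announcement of phi: keep only the worlds where phi currently holds."""
    kept = {w for w in model["worlds"] if phi(model, w)}
    return {**model, "worlds": kept}


# Hypothetical one-agent model: two worlds the agent cannot distinguish.
model = {
    "worlds": {"w_p", "w_not_p"},
    "same_info": lambda u, v: True,   # total uncertainty between the worlds
    "p_worlds": {"w_p"},              # where the atom p is true
}

p = lambda m, w: w in m["p_worlds"]
moore = lambda m, w: p(m, w) and not knows(m, w, p)   # p and not-K p

actual = "w_p"
print(moore(model, actual))        # True: p holds but is not yet known
updated = announce(model, moore)
print(knows(updated, actual, p))   # True: only the p-world survives
print(moore(updated, actual))      # False: the announced sentence is now false
```

The announced formula is evaluated in the old model to decide which worlds survive, and then re-evaluated in the new model, which is where the truth value switch shows up.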

Algorithms for model updates covering a wide range of communicative acts, public or private, and matching complete modal logics for formulas $[!\phi]K\psi$ have been studied extensively in dynamic epistemic logic (Baltag, Moss & Solecki 1998, van Ditmarsch et al. 2007, van Benthem 2011). Similar logics can deal with acts of belief change, triggered by the above events $!\phi$ of public ‘hard information’ or by softer triggers rearranging the current plausibility ordering (‘soft information’) to a new model supporting suitably modified absolute and conditional beliefs. Actions of plausibility change have been studied in belief revision theory (Gärdenfors & Rott 1995, Segerberg 1995), in dynamic-epistemic logics (see the earlier references on this field), and in formal learning theory (Kelly 1996, Gierasimczuk 2010).

Intuitionistic logic and provability logic Let us now move from information and action to the grand themes of mathematics. Modal logic has also been used to model constructive reasoning as encoded in intuitionistic logic where truth is reinterpreted in terms of being established, or having a proof (Kripke 1965, Troelstra & van Dalen 1988). Our earlier models can now be viewed as universes of information stages, and accessibility is upward extension. Intuitionistic logic is then about persistent assertions that, once established, remain true upward in the information order. In particular, as mentioned earlier, Gödel 1933 gave a faithful translation from intuitionistic logic into the modal logic S4, reading intuitionistic conjunction and disjunction as their standard counterparts, but sending intuitionistic negation ~ to the strengthened modal combination $\Box\neg$, and intuitionistic implication $\phi \rightarrow \psi$ to the modalized material implication $\Box(\phi \rightarrow \psi)$. The full modal language also contains non-persistent assertions beyond the translated intuitionistic language that fit with some earlier-mentioned epistemic statements such as Moore sentences that may become false after updating with new information.
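The translation clauses just listed can be sketched as a small recursive function. The encoding of formulas as nested tuples and the ASCII output notation (`[]` for box, `~` for negation) are illustrative choices. One clause is an assumption not spelled out in the text: atoms are prefixed with a box, as in the McKinsey-Tarski variant of the translation, so that translated atoms are persistent.

```python
def to_s4(formula):
    """Translate an intuitionistic formula (nested tuples) into a modal string."""
    if isinstance(formula, str):   # atom: boxed (an assumption, see the lead-in)
        return f"[]{formula}"
    op = formula[0]
    if op == "and":                # conjunction is read as its standard counterpart
        return f"({to_s4(formula[1])} & {to_s4(formula[2])})"
    if op == "or":                 # disjunction likewise
        return f"({to_s4(formula[1])} | {to_s4(formula[2])})"
    if op == "not":                # intuitionistic negation becomes box-not
        return f"[]~{to_s4(formula[1])}"
    if op == "implies":            # implication becomes boxed material implication
        return f"[]({to_s4(formula[1])} -> {to_s4(formula[2])})"
    raise ValueError(f"unknown connective: {op}")


print(to_s4(("or", "p", ("not", "p"))))   # ([]p | []~[]p)
```

The example shows the translation of excluded middle: the result is a disjunction of two boxed statements, which is not a theorem of S4, matching the intuitionistic rejection of that law.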

Another proof-oriented interpretation of the modal language occurs in provability logic (Boolos 1993, Artemov 2006). Here the box modality $\Box \phi$ gets interpreted as existence of a proof in some formal system of arithmetic. Note that this interpretation contains an existential, rather than a universal quantifier, as noted in our introduction. This view validates the laws of the minimal modal logic $K$, as well as the $K4$ transitivity axiom, that can now be read as saying that given proofs can be proof-checked for correctness. But this interpretation also validates Löb’s Axiom $$\Box(\Box \phi \rightarrow\phi) \rightarrow \Box \phi$$ This expresses a deep fact about arithmetical provability – and in fact, provability logic and its many extensions are decidable modal core theories of high-level features of mathematical provability in theories that have the coding power to discuss their own metatheory.

Temporal and spatial logic Still close to mathematics, another lively application area of modal logic concerns physical rather than human nature. A concrete interpretation of models is as flows of time, with accessibility as the temporal order ‘earlier than’ between points. The universal modality then says “everywhere in the future”, with a natural dual “everywhere in the past”. Temporal logics occur in linguistics and philosophy of language (Prior 1967), philosophy of science and philosophy of action (Belnap et al. 2001), but they have also reached computer science and AI, where they show a great diversity beyond the modal point of departure (see Abramsky, Gabbay & Maibaum eds. 1992, Gabbay, Hogger & Robinson eds. 1995). In particular, they can live over different primitive entities: durationless points, or extended periods (van Benthem 1983). The vocabulary of temporal logics is richer than the basic modal language. A typical case are operators saying what goes on during a successful transition: UNTIL $\phi \psi$ says that at some point later than now $\phi$ holds, while at all intermediate points $\psi$ holds.
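As a small illustration of the UNTIL clause, the following Python sketch evaluates it on a finite, discrete flow of time. The list representation of the timeline and the atom names are hypothetical, and restricting attention to finite discrete time is a simplifying assumption.

```python
def until(phi, psi, now, timeline):
    """UNTIL phi psi at `now`: phi at some later point, psi at all points strictly in between."""
    for later in range(now + 1, len(timeline)):
        if phi in timeline[later]:
            # check every point strictly between `now` and `later`
            if all(psi in timeline[t] for t in range(now + 1, later)):
                return True
    return False


# Hypothetical flow of time: index = time point, set = atoms true there.
timeline = [set(), {"waiting"}, {"waiting"}, {"served"}, set()]

print(until("served", "waiting", 0, timeline))   # True
print(until("served", "waiting", 3, timeline))   # False: nothing is served later
```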

In this same physical arena, modal logics of space are gaining importance, again in use both in philosophy of science and in knowledge representation in computer science. One of these revives an old idea from the 1930s. Let our modal models be topological spaces endowed with a valuation assigning distinguished subsets to proposition letters (Tarski 1938, Aiello et al., eds. 2007). Then the modality $\Box \phi$ may be read as saying that:

the current point lies in the topological interior of the set $[[\phi]]$ of all points where $\phi$ holds.

In this way, modal laws come to encode topological facts about space. For instance,

$\Box(\phi \wedge \psi) \leftrightarrow (\Box \phi \wedge \Box \psi)$ says that open sets are closed under finite intersections.

In fact, this interpretation validates all and only the theorems of the modal logic S4. The topological style of analysis extends to modal fragments of geometry. It provides a wide-ranging extension of our standard semantics quantifying over reachable points in graphs, which it contains as a special case. Technically, it suggests a generalized modal semantics in terms of neighborhood models, of a sort developed in the 1960s to explore axiomatic systems below the minimal modal logic $K$ (compare Segerberg 1971, Chellas 1980, Hansen, Kupke & Pacuit 2008) by generalizing the realm of standard relational models.
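The interior reading of the box can be tried out on a small finite space. The sketch below is only illustrative: the three-point topology is hypothetical, and propositions are represented directly by their truth sets.

```python
from itertools import combinations

points = {1, 2, 3}
# Hypothetical topology on three points (closed under unions and intersections).
opens = [set(), {1}, {1, 2}, {1, 2, 3}]


def interior(subset):
    """Box: the largest open set contained in `subset` (union of all open subsets)."""
    result = set()
    for o in opens:
        if o <= subset:
            result |= o
    return result


def all_subsets(universe):
    """Every subset of the universe, as a list of sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Check Box(phi & psi) <-> (Box phi & Box psi) for every pair of truth sets,
# together with the S4 laws Box phi -> phi and Box phi -> Box Box phi.
for phi in all_subsets(points):
    assert interior(phi) <= phi
    assert interior(interior(phi)) == interior(phi)
    for psi in all_subsets(points):
        assert interior(phi & psi) == interior(phi) & interior(psi)

print(interior({2, 3}))   # set(): no nonempty open set fits inside {2, 3}
```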

We do not intend a complete survey of all possible perspectives on modality in this article. One can consult the Handbook of Philosophical Logic for a wide array of uses that have been developed since the 1960s. To conclude here, we just mention one appealing concrete setting where many of the above strands come together naturally (van Benthem 2014).

Agency and games Consider several agents interacting strategically, the natural scenario in much of social life. To see what we are after, consider the simple game depicted in the following tree where players have preferences encoded in pairs:

(value for $\mathbf{A}$, value for $\mathbf{E}$).

The standard solution method of ‘Backward Induction’ for extensive games (compare the textbook Osborne & Rubinstein 1994) will analyze this game bottom up, telling player $\mathbf{E}$ to go left at her turn, which then gives player $\mathbf{A}$ a belief that this will happen – and so, based on this belief about his counter-player, $\mathbf{A}$ should turn left at the start. The resulting strategy is indicated by the two bold face lines:

This may be surprising, as the outcome $(99, 99)$ is better for both than reaching $(1, 0)$. So, why should players act this way, and what are plausible alternatives? To answer such questions, a logical approach tries to understand the reasoning underlying Backward Induction. Interestingly, that reasoning is a mix of many modal notions often studied separately. It is about actions, players’ knowledge of the game, their preferences, but also their beliefs about what will happen, their plans, and counterfactual reasoning about situations that will not even be reached with the plan decided on. Thus, well-understood, one extremely simple interactive social scenario involves just about the entire agenda of philosophical logic in a coherent manner.
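The bottom-up computation can be sketched in Python. Since the game tree itself is not reproduced here, the tree below is a hypothetical reconstruction: only the outcomes $(1, 0)$ and $(99, 99)$ come from the text, while the payoff $(0, 100)$ for $\mathbf{E}$ going left is an assumption chosen so that $\mathbf{E}$ prefers left at her turn.

```python
PLAYER_INDEX = {"A": 0, "E": 1}   # position of each player's value in a payoff pair


def backward_induction(node):
    """Return (payoff pair, list of moves) chosen bottom-up."""
    if not isinstance(node[1], dict):   # leaf: (value for A, value for E)
        return node, []
    player, moves = node
    idx = PLAYER_INDEX[player]
    best = None
    for move, child in moves.items():
        payoff, path = backward_induction(child)
        # the player to move keeps the option with the highest own value
        if best is None or payoff[idx] > best[0][idx]:
            best = (payoff, [move] + path)
    return best


# Hypothetical tree: A moves first, E moves after A goes right.
game = ("A", {
    "left": (1, 0),
    "right": ("E", {"left": (0, 100), "right": (99, 99)}),
})

print(backward_induction(game))   # ((1, 0), ['left'])
```

With these payoffs, $\mathbf{E}$ would go left, so going right is worth only 0 to $\mathbf{A}$, who therefore goes left at the start and the game ends in $(1, 0)$.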

As a case study, the bridge law for the mix of philosophical notions driving Backward Induction is rationality: “players never choose an action whose outcomes they believe to be worse than those of some other available action”. Evidently, this statement is packed with assumptions, and logic wants to clarify these, rather than endorse any unique game-theoretic recommendation. For instance, Stalnaker 1999 analyzes games in terms of additional information about players’ policies for belief revision, another area of modal logic as explained above. Thus, once we understand the standard reasoning, we can also come up with alternatives: logic helps us see the laws, and break them.

Game logics The preceding example suggests that a number of modal logics need to be put together in some appropriate way. We only give one illustration, but see van Benthem 2014 for more examples. One interesting mix of our earlier epistemics and dynamics occurs in imperfect information games, where players may not know the precise moves played by their opponents. Thus, in these games, the primary epistemic uncertainty is between actions, and only in a derived sense between the resulting game states. Think of a card game where we cannot observe which initial hand Nature is dealing to our opponent, or where some mid-play moves by our opponents may be partially hidden.

Consider the earlier game tree, but now with an uncertainty link for player $E$ at the second stage – she does not know the opening move played by $A$:

This is a model for a joint language with epistemic modalities $K_{i}$ and dynamic $[a]$ that interact. Halfway, player $E$ knows ‘de dicto’ that she has a winning move:

$K_{E}(\langle c\rangle p \vee \langle d\rangle p)$

but she does not know any particular winning move ‘de re’:

$\neg K_{E}\langle c\rangle p \wedge \neg K_{E}\langle d\rangle p$

This expresses the fact that the game depicted here is ‘non-determined’: $E$ cannot force an outcome $p$, but neither can $A$ force outcome $\neg p$ for the game.
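The contrast between ‘de dicto’ and ‘de re’ knowledge can be checked mechanically. The Python sketch below uses a hypothetical tree, since the picture is not reproduced here: it assumes that $c$ wins after $a$ and $d$ wins after $b$, and that $E$ cannot tell the two mid-game states apart.

```python
# Hypothetical game: A opens with a or b, then E answers with c or d.
moves = {
    ("root", "a"): "after_a", ("root", "b"): "after_b",
    ("after_a", "c"): "end1", ("after_a", "d"): "end2",
    ("after_b", "c"): "end3", ("after_b", "d"): "end4",
}
p_holds = {"end1", "end4"}   # outcomes where p is true (an assumption)
# E's uncertainty link: she cannot tell the two mid-game states apart.
indistinguishable_for_E = {
    "after_a": {"after_a", "after_b"},
    "after_b": {"after_a", "after_b"},
}


def diamond(action, state):
    """<action>p: the action is available at `state` and leads to a p-outcome."""
    return moves.get((state, action)) in p_holds


def K_E(prop, state):
    """K_E prop: prop holds at every state E cannot distinguish from `state`."""
    return all(prop(s) for s in indistinguishable_for_E[state])


state = "after_a"
de_dicto = K_E(lambda s: diamond("c", s) or diamond("d", s), state)
de_re_c = K_E(lambda s: diamond("c", s), state)
de_re_d = K_E(lambda s: diamond("d", s), state)
print(de_dicto, de_re_c, de_re_d)   # True False False
```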

The general logic of imperfect information games is the minimal dynamic logic plus epistemic ‘multi-S5’. But on top of that, the combined dynamic-epistemic language can also express modes of playing games. Take the basic game-theoretic notion of ‘Perfect Recall’. This describes players whose own actions never introduce uncertainties they did not have before. Properly understood this validates a modal interchange axiom:

$(\mathit{turn}_{E} \wedge K_{E}[a]\phi) \rightarrow [a]K_{E}\phi$

saying that what we know about the result of our own game moves is still known to us after we perform them. (To understand this, contrast the different effect of non-epistemically neutral actions such as drinking.) Thus, special modal axioms in this epistemic-dynamic language correspond with special styles of playing a game.

Of course, there are many other modal aspects to the above story. Games are not just driven by actions and information, but crucially also by players’ goals, depending on their preferences between outcomes. Thus game logics link with modal logics of preference (Von Wright 1963, Hansson 2001, Liu 2011), and with deontic logics of agents’ obligations, rights and duties (Hilpinen 1970, 1981, or the proceedings of the DEON conferences, http://www.deonticlogic.org/). Each of these represents an area of its own with ramifications in philosophy and computer science, witness the following two references: Gabbay & Guenthner, eds., 1981, and Shoham & Leyton-Brown 2008.

And so on Modal logic keeps finding new interpretations, and no attempt can be made here to list all its current manifestations, or, in some cases, independent rediscoveries. For instance, we omitted description logics for knowledge representation (Baader et al. eds. 2003), modal logics for webpage languages (ten Cate & Marx 2009), argumentation systems (Grossi 2010), epistemology (Holliday 2012), (Hawke 2015), and so on. This process is likely to go on, since the earlier-mentioned expressiveness/complexity balance of modal languages is a natural zoom level on many topics under the sun.

5. Modern Themes across the Field

We have sketched a few basic features of the classical theory of deduction and definability in modal logic, added a few further themes such as invariance and complexity, and then presented a wide array of current applications or manifestations of modal logic. Of course, there are no simple divisions between pure and applied in logic (or anywhere): applications themselves generate theoretical issues, and in this section, we outline a few themes from the 1990s onward that play across many different application areas.

Extended modal languages and hybrid logics The basic modal language is just a starting point for the analysis of modal notions, though it has acquired a sacred status over time, making extensions seem like foul play to some. Modal languages can be naturally enriched over their original models, and this has happened often, starting with the work of Prior on temporal logic. A well-known extension of this sort adds a universal modality $U\phi$ saying that $\phi$ is true at all worlds, accessible or not. This may look like adding all of first-order logic, but this is by no means the case: the universal modality stays inside the decidable two-variable fragment of first-order logic, at a modest price in computational complexity. The $\Box, U$ language has a matching invariance as before, now with ‘total bisimulations’ whose domains and ranges are the whole models being compared.

The more general move here is toward hybrid logics (Goranko & Passy 1992, Blackburn & Seligman 1995, Areces & ten Cate 2006) that add more expressive power to the basic modal language. One powerful hybrid device is ‘nominals’: names for unique worlds that formalize many natural styles of reasoning. This also plugs some blatant expressive gaps in the basic modal language. For instance, much has been made of the latter’s inability to express the natural frame property of irreflexivity $$\forall x \neg Rxx$$ But this property is expressed quite simply by the hybrid axiom $$i \rightarrow \neg\Diamond i$$ using a nominal $i$. Nowadays, the tendency is to add such devices freely, seeking a good balance between increased expressive power and manageable complexity. Another example is the earlier temporal operator Until, which again allows for bisimulation analysis, while keeping the resulting logic decidable. An extensive study of general hybrid logics is found in (ten Cate 2005).

While the preceding moves add ‘logical’ expressive power inside first-order logic (or beyond, as we shall see), ‘geometric extensions’ enrich the similarity type of models, adding modalities with new accessibilities. An important case are polyadic languages with $n$-ary accessibility relations. For instance, an existential dyadic modality $\Diamond \phi \psi$ holds at $s$ iff there exist $t, u$ such that $R^{3}s, tu$, $\phi$ holds at $t$, and $\psi$ holds at $u$. Concrete interpretations for ternary relations $R$ abound: ‘$s$ is the concatenation of expressions $t, u$’, ‘$s$ is the merge of the resources or information pieces $t, u$’, or ‘$s$ is the geometrical sum of the vectors $t, u$’.

A limit to which many extensions of both types, logical and geometric, tend is the Guarded Fragment of first-order logic (Andréka, van Benthem & Németi 1998). This is defined inside full first-order syntax by allowing only quantifiers of a guarded form:

$$\exists\mathbf{y} (G(\mathbf{x}, \mathbf{y}) \wedge \phi (\mathbf{x}, \mathbf{y}))$$

where $\mathbf{x}, \mathbf{y}$ are tuples of variables, $G(\mathbf{x}, \mathbf{y})$ is an atomic formula whose variables occur in any order and multiplicity, and $\phi$ is a guarded formula having only variables from $\mathbf{x}, \mathbf{y}$ free. Many modalities are guarded in this syntactic sense, witness translations such as $$\Diamond p = \exists y(Rxy \wedge Py)$$ $$\Diamond pq = \exists yz(Rxyz \wedge Py \wedge Qz)$$ This quite expressive sublanguage of first-order logic where groups of objects are only introduced ‘under guards’ still yields to modal analysis supporting a good meta-theory. The Guarded Fragment has a characteristic bisimulation, and it is decidable, be it now in doubly exponential time. These properties even transfer to extensions that can deal with temporal languages.

Here is what is going on now. The usual landscape of modal logics is one-dimensional: it keeps the basic language constant in expressive power and varies deductive strength of special theories expressed in it. But now we have a second dimension of variation in expressive power. This new landscape is still being charted.

Recursion, induction and fixed-point logics Another typical modern feature absent in classical modal logic are recursive definitions, whose meaning involves a process of infinite unwinding in order to reach equilibrium. In many modal systems today, recursive definitions play a role, say, for iteration of actions, common knowledge, or the description of temporal behavior on infinite histories. In principle, adding inductive definitions and recursion to classical logics leads to systems of high complexity that can encode True Arithmetic, a case in point being first-order logic with inductive definitions $LFP(FO)$ that is widely used in finite model theory (Ebbinghaus & Flum 1995, Libkin 2012). However, modal logics are often robustly decidable, carrying such loads without exploding in complexity. Propositional dynamic logic itself was a case in point, being a small decidable core theory of terminating recursions. New abstract theories of induction and recursion are thriving, such as the following one (Pratt 1981):

The modal $\mu$–calculus extends the basic modal language with operators $\mu p \bullet \phi(p)$ for ‘smallest fixed-points’ where formulas $\phi(p)$ have the following special syntactic format. The propositional variable $p$ occurs only positively, that is, each occurrence of $p$ in $\phi$ lies in the scope of an even number of negations. The semantics for this modal language is more sophisticated than what we have seen before. In particular, the special positive syntax pattern ensures that the following ‘approximation function’ for the predicate defined implicitly by the formula $\phi(p)$

$$F^{\mathbf{M}}_{\phi} (X) = \{ s\in\mathbf{M} \mid \mathbf{M}, [p:= X], s \vDash \phi\}$$

is monotone in the inclusion order:

whenever $X \subseteq Y$, then $F^{\mathbf{M}}_{\phi} (X) \subseteq F^{\mathbf{M}}_{\phi} (Y)$.

On so-called ‘complete lattices’ – a special case that often suffices are power sets of standard modal models –, the Tarski-Knaster Theorem then says that monotone maps $F$ always have a smallest fixed-point, an inclusion-smallest set of states $X$ where $F(X) = X$. Concretely, one can always reach this smallest fixed-point $F_{*}$ through a sequence of approximations indexed by ordinals until there is no more increase:

$$\varnothing, F(\varnothing), F^{2}(\varnothing), \ldots , F^{\alpha}(\varnothing), \ldots , F_{*}$$

Now, the formula $\mu p \bullet \phi(p)$ is said to hold in a model $\mathbf{M}$ at just those states that belong to the smallest fixed-point for the map $F^{\mathbf{M}}_{\phi}$. Completely dually, there are also greatest fixed-points for monotone maps, and these are denoted by formulas:

$\nu p \bullet \phi(p)$, with $p$ occurring only positively in $\phi(p)$.

Greatest fixed-points are definable from smallest ones, via the valid formula:

$\nu p \bullet \phi(p) \leftrightarrow \neg \mu p \bullet \neg \phi(\neg p)$, where $\neg \phi(\neg p)$ has its occurrences of $p$ positive.
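On finite models the approximation sequence stabilizes after finitely many steps, so both kinds of fixed-point can be computed by plain iteration. The Python sketch below is an illustration under stated assumptions: the four-state model and the two example formulas are hypothetical, and the greatest fixed-point is obtained through the duality just stated.

```python
def least_fixed_point(F):
    """Iterate a monotone map F from the empty set until nothing is added."""
    current = frozenset()
    while True:
        nxt = frozenset(F(current))
        if nxt == current:
            return current
        current = nxt


def greatest_fixed_point(F, states):
    """Dual computation: nu p. phi(p) is the complement of mu p. not phi(not p)."""
    dual = lambda X: states - F(states - X)
    return states - least_fixed_point(dual)


# Hypothetical model: a chain 0 -> 1 -> 2 with q true at 2, plus a loop on 3.
states = frozenset({0, 1, 2, 3})
succ = {0: {1}, 1: {2}, 2: set(), 3: {3}}
q = {2}

diamond = lambda X: {s for s in states if succ[s] & X}   # <>X

# mu p. (q or <>p): states from which some q-state is reachable.
reach_q = least_fixed_point(lambda X: q | diamond(X))
# nu p. <>p: states that start an infinite path.
infinite_path = greatest_fixed_point(diamond, states)

print(sorted(reach_q), sorted(infinite_path))   # [0, 1, 2] [3]
```

The first formula is built up from below (first the $q$-state, then its predecessors), while the second is reached by discarding states from the full set until only the looping state remains.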

The modal $\mu$–calculus is the decidable modal core theory of induction and recursion. Incidentally, a further example of such robust decidability is the Guarded Fragment: its fixed-point extension $LGF(FO)$ extending the modal $\mu$–calculus is still decidable.

There is a fast-growing literature on the $\mu$–calculus (compare Blackburn, de Rijke & Venema 2000). Venema 2007 is an up-to-date study in connection with current logics for computation, where many themes that we have mentioned for the basic modal logic return in more sophisticated forms, appropriate to infinite processes.

One more general background here is the study of ‘co-inductive’ infinite processes that are not built bottom-up, but can only be observed top-down. This study has become a thriving area of its own in the foundations of computation and games under the name of co-algebra. Modal fixed-point logics point the way toward much more abstract new modal logics that match the category-theoretic semantics of co-inductive computation (Kurz 2001).

What is striking in these developments is the merge of modal logic and automata theory and also game theory. Automata as perspicuous representations of modal formulas are affecting our very understanding of modal languages, and the resulting theory, of great power and elegance, may come to impact our understanding of the field as a whole.

System combination Another major theme in modal logic today is system combination. While single modal logics may be simple, many applications require combining several such logics, as we saw with knowledge, action, and preference in games. Here, crucially, the architecture of combined systems matters. Adding simple systems together need not result in simple systems at all. It depends very much on the mode of combination. There are several ways of combining modal logics, ranging from mere ‘juxtaposition’ to more intricate forms of interaction between the component logics. There is an incipient theory of relevant modes of combination, including new constructions of ‘product’ and ‘fibering’ (Gabbay 1996). Here we only mention one important phenomenon.

Complexity can increase rapidly when combined modal logics include what look like natural and attractive ‘commutation properties’.

Fact	The minimal modal logic of two modalities $[1], [2]$ plus the universal modality $U$ satisfying the axiom $[1][2]\phi \rightarrow [2][1]\phi$ is undecidable.

The reason is that such logics encode complex ‘tiling problems’ on the cross-product of the natural numbers (Harel 1985, Marx 2006). By methods of frame correspondence, the commutation axiom defines a grid structure satisfying a first-order convergence property:

$$\forall xyz: (xR_{2}y \wedge yR_{1}z) \rightarrow \exists u: (xR_{1}u \wedge uR_{2}z)$$

Here is a diagram picturing this, creating a cell of a geometric grid:

This complexity danger is general, and the following two mnemonic pictures may help the reader. Modal logics of trees are harmless, modal logics of grids are dangerous!

Many dangerous combinations of modal systems occur in combinations of epistemic and temporal logic, and the first pioneering results were in fact proved in this area in Halpern & Vardi 1989 (compare the survey in van Benthem & Pacuit 2006).

The general topic behind system combination, and one that seems to have attracted little attention in philosophical logic so far, is the architecture of logical systems.

Modal predicate logic An important topic in philosophical applications of modal logic that we have mostly ignored in this survey is modal predicate logic. While this is faithful to the field as a whole (technically, modal predicate logic is just one of many system combinations), it is a serious omission for many purposes, and we will only partly make up for it by mentioning some current trends and supporting literature.

Many philosophical issues have to do with the nature of objects and their identification across different modal situations, as explained at length in James Garson’s chapter on modal predicate logic in the Handbook of Philosophical Logic. Modal predicate logic has been important as a hotbed of discussion, both philosophical and technical. The main semantics seems obvious, annotating the possible worlds in an accessibility graph with domains of objects with predicates familiar from models for first-order logic. But a major challenge has been how to interpret assertions:

$$\mathbf{M}, s \vDash \Box \phi [\mathbf{d}]$$

representing a predication about objects $\mathbf{d}$ assigned to the free variables in $\phi$ from the domain of $s$. One semantics looks at accessible worlds $t$ with $R st$ where those self-same objects occur (Kripke 1980, Hughes & Cresswell 1969), but one can also merely allow ‘counterparts’ to the $\mathbf{d}$ in $t$ (Lewis 1968), an idea that has returned in sophisticated mathematical semantics for modal predicate logic where objects across worlds can only be related to each other through available functions. We will not provide further details, but refer the reader to (Rabinowicz & Segerberg 1994, Gupta & Thomason 1980, Belnap et al. 2001, Williamson 2000, 2013, Holliday & Perry 2013) for sophisticated modal predicate logics, showing how the interplay of modality, objects and predication forms a natural continuation of the modal themes in this article.

Modern modal predicate logic is a sophisticated area (Gabbay, Shehtman & Skvortsov, to appear). While many techniques for modal propositional logic extend to this area, the devil is in the details, and no consensus has emerged yet on a philosophically or a mathematically optimal framework for the whole field. In fact, some people feel that the underlying mathematical subtleties have to do with modal predicate logic being a ‘product logic’ of two systems (Gabbay, Kurucz, Wolter & Zakharyaschev 2007) that are themselves modal in character: modal propositional logic, and predicate logic itself, and we are not clear yet on what is the most natural system combination here.

Other mathematical approaches While this survey largely follows standard relational models for modal logic, it is important to realize that there are several other approaches in the area that have an even broader potential for theory and practice. We elaborate briefly on a few hints in this direction given earlier in this article.

One powerful paradigm is algebraic approaches, viewing modal logic as a study of classical algebras enriched with further operators, making the subject a branch of algebraic logic (Venema 2006). Our relational models are then connected to algebras through representation theorems, a tradition started by Stone and Birkhoff in Universal Algebra, and taken to modal logic in Jónsson & Tarski 1951. In particular, viewed algebraically, modal operators can then live on quite different base logics: intuitionistic, or even much weaker ones (Andréka, Németi & Sain 2003), (Palmigiano et al. 2014).

Another important strand of models, mentioned earlier in connection with topology, are neighborhood models with built-in world-to-set relations $N s X$ and a crucial truth clause

$\mathbf{M}, s \vDash \Box \phi$ iff there is a set $X$ with $N s X$ and $\mathbf{M}, t \vDash \phi$ for all $t$ in $X$
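The truth clause can be tried out directly. In the Python sketch below, the single-state neighborhood model and the truth sets are hypothetical, and formulas are represented by their truth sets rather than parsed.

```python
def box(neighborhoods, truth_set, state):
    """M, s |= Box phi iff some X with N s X lies inside the truth set of phi."""
    return any(X <= truth_set for X in neighborhoods[state])


# Hypothetical model: state "s" has two neighborhoods.
neighborhoods = {"s": [frozenset({"t", "u"}), frozenset({"v"})]}
phi = {"t", "u"}   # truth set of phi
psi = {"v"}        # truth set of psi

print(box(neighborhoods, phi, "s"), box(neighborhoods, psi, "s"))   # True True
print(box(neighborhoods, phi & psi, "s"))                           # False
```

Here $\Box\phi$ and $\Box\psi$ both hold at `s` while $\Box(\phi \wedge \psi)$ fails, a pattern that no relational model allows and one reason why neighborhood models fit systems below the minimal modal logic $K$.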

Neighborhood semantics date back to the 1960s (Segerberg 1971, Chellas 1980), but since then, they have found many new uses in co-algebraic computation (Hansen, Kupke & Pacuit 2008), refined notions of ‘powers’ for players in games, single or in coalitions, (Pauly 2001), or ‘evidence’ in inquiry, where different neighborhood sets record ‘reasons’ or observations made in the history so far (van Benthem & Pacuit 2011).

In particular, neighborhood models are also a general form of what are called ‘Hyper-graphs’ in mathematics, and as such, they have also been proposed in the recent philosophical literature as a way of modeling so-called ‘hyper-intensional’ notions where standard logical equivalence is replaced by finer sieves for defining propositions.

One important feature shared by these and other generalized semantics for modal logic is a change in appropriate base logics and base languages. What may be an appropriate logical language over some initially studied model class may fail to have enough power of making distinctions over a generalized model class. Modern logic is replete with examples of this phenomenon (Girard 1987, Restall 2000), and modal logic is no exception. We will encounter a concrete illustration in Section 7 below.

What is modal logic? The wealth of theory and applications in modal logic today may seem overwhelming: the 2006 Handbook of Modal Logic runs to some 1200 pages. The question arises: What is truly ‘modal logic’? The themes in this survey give a working answer as an agenda of themes plus a modus operandi, but there are also more mathematical angles. One general abstract approach is in terms of Lindström theorems (van Benthem, ten Cate & Väänänen 2009). The basic modal logic can be shown to be maximal with respect to possessing two major properties from our earlier analysis and from first-order logic in general: invariance for bisimulation, and the compactness theorem. Further results in this vein can help us understand what makes landmark modal systems tick. However, no such results are known yet for the modal fixed-point logics that are so prominent today, and model-theoretic analysis may have to merge with notions from automata theory.

6. Modal Logic and Philosophy Today

With a technical survey like this, the reader may have the impression that modal logic is one of those subjects that started in philosophy, but then went their own way to become independent disciplines. But leaving the nest for good is a rigid biological view of intellectual history. Prodigal sons leave, but also return. Technical modal logic still serves as a laboratory for new notions of interest to philosophers in modal predicate logic (Williamson 2013), and further examples abound: compare (Stalnaker 2006). Moreover, as we saw with strategic reasoning in games, the unity of modal patterns in new application areas provides a new unity all across philosophical logic. Even so, some manifestations of modal logic today seem fossilized remnants, where ‘being philosophical’ means no more than using systems with forbidding names like $S4.3$ or $KD45$ whose origins, long ago, had to do with philosophical motivations. But things can be much more lively than this.

A good case for optimism is the interface of modal logic and epistemology. This started in the 1960s with Hintikka’s pioneering work, carried on by Lewis, Stalnaker, and others. Ever since, the perceived inadequacies of our simple notion of knowledge have dominated discussions of issues such as logical omniscience and introspection. What happened next was a parting of the ways. Modal logicians found ever more uses of epistemic logics, whether or not their main modality captured the philosophical notion of knowledge. At the same time, philosophers developed interesting new accounts of knowledge undreamt of in the logical tradition. The ‘relevant alternatives theory’ of Dretske 1970, and later DeRose, Lewis, Lawlor, comes with a more dynamic account of choosing relevant spaces of alternative worlds that are essential to knowledge claims. This deeply changes the behavior of basic epistemic reasoning, making for large differences with classical epistemic logic. In an alternative line, Nozick 1981 and later on, Sosa and Roush, have introduced the ‘tracking theory’ of knowledge as true belief that correctly tracks the truth over time, and also counterfactually, in worlds slightly different from the present one. And yet one more rich line is the ‘stability theory’ of knowledge as belief that survives new information or criticism, developed by Lehrer, Stalnaker, Rott, and others. Until the beginning of the twenty-first century, discussions in the philosophical and logical milieus seemed largely disjoint. However, the two streams of thought are now converging. Maintaining relevant alternatives shows clear similarities with the information dynamics discussed earlier. Tracking and stability accounts of knowledge intertwine knowledge, truth, belief, and counterfactuals in ways that are intriguing to logicians as well. A current wave(let) of publications is bringing the two traditions together (Holliday 2012, Baltag & Smets 2008, Holliday & Perry 2014, Hawke 2015), opening new interfaces for modal logic far beyond the usual laments about the inadequacies of Hintikka’s original system.

And this is just one instance. Contacts between modal logic and philosophy in new modes are very much in evidence in the literature on metaphysics (Zalta 1993, Williamson 2000, 2013, Fine 2002) and on epistemic modals (expressions like “must”, “may”, “probably”, and so on), where modal logic meets with epistemology and philosophy of language (Swanson 2011, Yalcin 2007, Holliday & Icard 2013, Hawke & Steinert-Threlkeld 2015). And the same is true for social epistemology, and notions of group knowledge and information dynamics (Helzner & Hendricks 2013, Baltag & Smets 2012, List & Pettit 2002, Christof & Hansen 2015) and the epistemic foundations of game theory (Aumann 1976, Stalnaker 1999). If anything, contacts between modal logic and philosophy are livelier than ever before, though, to see this, one has to look broadly and not seek a monopoly of one favored philosophical interface.

7. Coda: Modal Logic as a Part of Standard Logic

In this article, pains were taken to emphasize that modal logic today in the early twenty-first century is not a sort of intensional epicycle or ornamentation of standard logical systems, but a tool inside the classical realm for analyzing the fine-structure of the rich landscape of systems that span the field of logic today. We have also emphasized that there is no case for opposition or replacement here: instead, we advocated a ‘tandem view’ of having both modal and classical perspectives at our disposal when studying some area of reasoning. A certain flexibility in bringing these to bear, though perhaps looking opportunistic to some, is in fact a hallmark of a creative attitude as a working logician.

But as always in logic, one can keep looking at any topic in different ways. Consider the contrast between ‘poorer’ modal and ‘richer’ classical formalisms. Many people see the business of logic as zooming in on some reasoning practice, supplying more and more details until total clarity and cogency is achieved. This is how one thinks of complete formalizations in the foundations of mathematics, that can be checked by machines. Adding layers of detail and precision is one important use of logic, but there is also an inverse one, consisting rather in zooming out. In the details of some reasoning practice, there may be higher-level patterns that form a simple system of their own that can be brought into the open. Modal logics often have this zooming out character, looking at some simple but very basic patterns of reasoning inside some richer practice: say, the way in which modal logics of space find a decidable core theory inside all the reasoning that goes on in a topology textbook. These dual skills of zooming in and zooming out seem equally important to logic, and modal logic seems a powerful tool for exercising both.

And here is one more dual view on what a modal analysis achieves. In this article, we have stressed how modal languages translate into fragments of classical languages. But as we shall see in a moment, a simple modal semantics for these fragments often suggests a generalized semantics for the complete language, yielding intriguing trade-offs between viewing modal laws as standard validities for some small part of classical first-order logic, or as the complete set of validities for a generalized view of what the full first-order language is about. While this may sound rather technical, the actual contemporary subtlety found in studies of logical systems is the best fuel for a practice-based philosophy of logic.

In addition to these general perspectives, modal logic and classical logic also interact in the form of unusual mixes. We end with two examples that may surprise the reader.

Modal foundations of predicate logic Predicate logic itself is a form of modal or dynamic logic. The key truth condition for the standard existential quantifier reads:

$\mathbf{M}, s \vDash \exists x\phi$ iff there exists an object $d$ in $D^{M}$ with $\mathbf{M}, s[x:=d] \vDash \phi$

This clearly has a modal pattern for evaluating an existential modality:

$\mathbf{M}, s \vDash \exists x\phi$ iff there exists $t$ with $R^{x}st$ and $\mathbf{M}, t \vDash \phi$

where we now think of the points $s$ as states of some semantic evaluation process.

Viewed in this light, the usual laws of first-order logic are deconstructed into several layers. The ‘decidable core’ is the minimal modal logic, containing practically important ubiquitous laws such as Monotonicity: $$\forall x(\phi \rightarrow \psi) \rightarrow (\forall x\phi \rightarrow \forall x\psi)$$ This level makes no presuppositions whatsoever concerning the form of the models: they could have any kind of ‘states’ and ‘variable shift relations’ $R^{x}$. Next, there are laws recording effects of taking states to be concrete variable assignments, connected by a special shift relation of ‘agreeing up to the value for $x$’. For instance, $$\forall x\phi \rightarrow \forall x\forall x\phi$$ expresses the transitivity of $R^{x}$: indeed, all of S5 holds. Finally, more specifically than these first two layers, some first-order laws express existence properties that demand richness of the universe of available states. As an example, the innocent-looking law:

$$\exists x\forall y \phi \rightarrow \forall y \exists x \phi$$

expresses confluence: if $s R^{x} t$ and $s R^{y} u$, there also exists a state $v$ with $t R^{y} v$ and $u R^{x} v$. When pictured, this is a grid property as discussed before with combinations of modal logics, and indeed, it is at this third level that the undecidability of first-order logic arises. Thus, modal analysis reveals unexpected ‘fine-structure’ in the class of what is usually lumped together as ‘standard validities’: they are valid for different reasons.
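
The second and third layers can be checked mechanically on small state sets. The following Python sketch is an illustration only: the two-variable, two-object setup and the names `shift`, `is_equivalence`, `is_confluent`, `full`, and `gappy` are assumptions introduced here. States are pairs giving the values of $x$ and $y$.

```python
from itertools import product

VARS = ("x", "y")
DOMAIN = (0, 1)

def shift(var, s, t):
    """R^var s t: s and t agree on every variable except possibly var."""
    return all(s[i] == t[i] for i, v in enumerate(VARS) if v != var)

def is_equivalence(states, var):
    """Check that R^var is reflexive, symmetric and transitive on states."""
    refl = all(shift(var, s, s) for s in states)
    symm = all(shift(var, t, s)
               for s in states for t in states if shift(var, s, t))
    trans = all(shift(var, s, u)
                for s in states for t in states for u in states
                if shift(var, s, t) and shift(var, t, u))
    return refl and symm and trans

def is_confluent(states):
    """If s R^x t and s R^y u, some v has t R^y v and u R^x v."""
    return all(
        any(shift("y", t, v) and shift("x", u, v) for v in states)
        for s in states for t in states for u in states
        if shift("x", s, t) and shift("y", s, u)
    )

full = list(product(DOMAIN, repeat=len(VARS)))  # every assignment (x, y)
gappy = [s for s in full if s != (1, 1)]         # one assignment removed
```

On `full`, both the S5 behavior of $R^{x}$ and confluence hold. On `gappy`, the shift relation is still an equivalence relation, but confluence fails because the state needed to close the grid is missing. This is the sense in which the third layer demands richness of the available states.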

We also see another earlier phenomenon exemplified: generalized semantics supports richer languages. On our general modal models, the first-order language gets increased expressive power, since new distinctions come up. In particular, polyadic quantifiers $\exists xy\,\phi$ introducing two objects simultaneously now become different from two-step iterations $\exists x\exists y\,\phi$ or $\exists y\exists x\,\phi$. Summing up, in a modal perspective, we get an unorthodox view that shifts the border line of basic logic. The (modal) core of standard first-order logic is decidable, just as Leibniz already thought – but piling up special (existential) conditions makes state sets behave so much like full function spaces $D^{VAR}$ that their logic becomes undecidable, since it now encodes the mathematics of such spaces. For much more on these modal foundations of predicate logic, see (van Benthem 1996).

Dynamic predicate logic Another new view on first-order logic emphasizes the intuitive state change implicit in evaluating an existential quantifier. The ‘dynamic semantics’ of (Groenendijk & Stokhof 1991) makes this explicit. Success is a move to a new state containing a suitable witness value for $x$ that makes the formula true. More generally, one can then let first-order formulas denote actions of evaluation: (a) atomic formulas are ‘tests’ if the current state satisfies the relevant fact, (b) an existential quantifier picks an object and assigns it to $x$ (‘random assignment’), (c) a substitution operator $[t/x]$ is a ‘definite assignment’ $x:=t$, (d) a conjunction is sequential action ‘composition’, and (e) a negation $\neg \phi$ is a test for the ‘impossibility’ of successfully executing the action $\phi$.

The resulting ‘dynamified’ first-order logic has applications in the semantics of natural language, since pronouns “he”, “she”, “it” show this kind of dynamic behavior. One nice illustration occurs with sentences like: $$\exists x Kx \rightarrow Hx$$ (“if you get a kick, it hurts”). Standard folklore ‘improves’ natural language here to a first-order form: $$\forall x (Kx \rightarrow Hx)$$ But with dynamic semantics, this meaning arises automatically for the above surface form, as any value assigned by the existential move in the antecedent will be bound to $x$ when the consequent is processed. The system has also inspired programming languages for dynamic execution of specifications. ‘Dynamic predicate logic’ is a general paradigm for bringing out the cognitive dynamics that underlies existing logical systems. This allows one to view natural language meanings in terms of updates of propositional content, perspective, and other parameters that determine the transfer of information.

The reader should have no difficulty seeing that there is again an underlying modal logic, this time related to the dynamic logic of programs discussed earlier in this article (van Eijck & de Vries 1992, Muskens, van Benthem & Visser 1997).

8. Conclusion

We have discussed modal logic as lying at a crossroads of many disciplines, though we have tried to maintain the original philosophical connections, and also pointed at some promising trends reviving that particular interface. The resulting presentation is different in spirit from other surveys in current anthologies, handbooks, and encyclopedias. We presented modal logic as a tool for fine-structure analysis of expressiveness and complexity of logical systems, including the sometimes surprising effects of their combinations, and we emphasized the major application areas (information, computation, action, agency) that drive abstract theory today. As a result, we had no uniform conclusion, or definition of modal logic to offer in the end: the field seems too rich for that. Our purpose with this panorama will have been served if the reader experiences a beneficial culture shock.

9. References and Further Reading

S. Abramsky, D. Gabbay & T. Maibaum, eds., 1992, Handbook of Logic in Computer Science, Oxford University Press, Oxford.
M. Aiello, I. Pratt & J. van Benthem, eds., 2007, Handbook of Spatial Logics, Springer Science Publishers, Heidelberg.
H. Andréka, I. Németi & J. van Benthem, 1998, ‘Modal Languages and Bounded Fragments of Predicate Logic’, Journal of Philosophical Logic 27, 217–274.
H. Andréka, I. Németi & I. Sain, 2003, ‘Algebraic Logic’, in Handbook of Philosophical Logic.
C. Areces & B. ten Cate, 2006, ‘Hybrid Logics’, In P. Blackburn et al. eds., Handbook of Modal Logic, Elsevier, Amsterdam.
S. Artemov, 2006, ‘Modal Logic and Mathematics’, in P. Blackburn et al., eds. Handbook of Modal Logic, 927–970.
R. Aumann, 1976, ‘Agreeing to Disagree’, The Annals of Statistics 4:6, 1236–1239.
F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, & P. F. Patel-Schneider, eds., 2003, The Description Logic Handbook: Theory, Implementation, Applications. Cambridge University Press, Cambridge.
A. Baltag, L. Moss & S. Solecki, 1998, ‘The Logic of Public Announcements, Common Knowledge and Private Suspicions’, Proceedings TARK 1998, 43–56, Morgan Kaufmann Publishers, Los Altos.
A. Baltag & S. Smets, 2008, ‘A Qualitative Theory of Dynamic Interactive Belief Revision’, in G. Bonanno, W. van der Hoek, M. Wooldridge, eds., Texts in Logic and Games Vol. 3, Amsterdam University Press, 9–58.
A. Baltag & S. Smets, 2012, Interactive Learning, Formal Social Epistemology and Group Belief Dynamics: Logical, Probabilistic and Game-theoretic Models, Lecture Notes, ESSLLI Summer School, Opole.
J. Barwise & J. van Benthem, 1999, ‘Interpolation, Preservation & Pebble Games’, Journal of Symbolic Logic 64, 881–903.
P. Battigalli & G. Bonanno, 1999, ‘Recent Results on Belief, Knowledge and the Epistemic Foundations of Game Theory’, Research in Economics 53, 149–225.
N. Belnap, M. Perloff & M. Xu, 2001, Facing the Future, Oxford Univ. Press, Oxford.
J. van Benthem, 1983, The Logic of Time, Kluwer, Dordrecht.
J. van Benthem, 1984, ‘Correspondence Theory’, in D. Gabbay & F. Guenthner, eds., Volume III, 167–247.
J. van Benthem, 1996, Exploring Logical Dynamics, CSLI Publications, Stanford.
J. van Benthem, 2010, Modal Logic for Open Minds, CSLI Publications, Stanford.
J. van Benthem, 2011, Logical Dynamics of Information and Interaction, Cambridge University Press, Cambridge.
J. van Benthem, 2014, Logic in Games, The MIT Press, Cambridge (Mass.).
J. van Benthem & P. Blackburn, 2006, ‘Modal Logic, A Semantic Perspective’, in P. Blackburn et al. eds., 2006, 1–84.
J. van Benthem, B. ten Cate & J. Väänänen, 2009, ‘Lindström Theorems for Fragments of First-Order Logic’, Logical Methods in Computer Science 5:3, 1–27.
J. van Benthem & E. Pacuit, 2006, ‘The Tree of Knowledge in Action’, Proceedings Advances in Modal Logic, ANU Melbourne.
J. van Benthem & E. Pacuit, 2011, ‘Dynamic Logic of Evidence-Based Beliefs’, Studia Logica 99:1, 61–92.
P. Blackburn, J. van Benthem & F. Wolter, eds., 2006, Handbook of Modal Logic, Elsevier Science Publishers, Amsterdam.
P. Blackburn & W. Meyer Viol, 1994, ‘Linguistics, Logic, and Finite Trees’, Logic Journal of the IGPL 2, 3–29.
P. Blackburn, M. de Rijke & Y. Venema, 2001, Modal Logic, Cambridge University Press, Cambridge.
P. Blackburn & J. Seligman, 1995, ‘Hybrid Languages’, Journal of Logic, Language and Information 4, 251–272.
G. Boolos, 1993, The Logic of Provability, Cambridge University Press, Cambridge.
J. Burgess, 1981, ‘Quick Completeness Proofs for some Logics of Conditionals’, Notre Dame Journal of Formal Logic 22:1, 76–84.
S. Buvac & I. Mason, 1994, ‘Propositional Logic of Context’, Proceedings AAAI, 412–419.
R. Carnap, 1947, Meaning and Necessity, The University of Chicago Press, Chicago.
B. ten Cate, 2005, Model Theory for Extended Modal Languages, Ph.D. Thesis, University of Amsterdam. ILLC Dissertation Series DS-2005-01.
B. ten Cate & M. Marx, 2009, ‘Axiomatizing the Logical Core of XPath 2.0’, Theory Comput. Syst. 44(4): 561–589.
A. Chagrov & M. Zakharyaschev, 1996, Modal Logic, Clarendon Press, Oxford.
B. Chellas, 1980, Modal Logic, An Introduction, Cambridge University Press, Cambridge.
Z. Christof & J-U Hansen, 2015, ‘A Logic for Diffusion in Social Networks’, Journal of Applied Logic 13, 48–77.
H. van Ditmarsch, W. van der Hoek & B. Kooi, 2007, Dynamic Epistemic Logic, Springer Science Publishers, Heidelberg.
F. Dretske, 1970, ‘Epistemic Operators’, The Journal of Philosophy, 67, 1007–1023.
H. D. Ebbinghaus & J. Flum, 1995, Finite Model Theory, Springer, Heidelberg.
J. van Eijck & F-J de Vries, 1992, ‘Dynamic Interpretation and Hoare Deduction’, Journal of Logic, Language and Information 1, 1–44.
R. Fagin, J. Halpern, Y. Moses & M. Vardi, 1995, Reasoning About Knowledge, The MIT Press, Cambridge (Mass.).
K. Fine, 2002, The Limits of Abstraction, Oxford University Press, Oxford.
G. Frege, 1879, Begriffsschrift. Eine der Arithmetischen Nachgebildete Formelsprache des Reinen Denkens, Louis Seifert Verlag, Halle.
D. Gabbay, 1996, ‘Fibred Semantics and the Weaving of Logics Part 1: Modal and Intuitionistic Logics’, Journal of Symbolic Logic 61, 1057–1120.
D. Gabbay & F. Guenthner, eds., 1981, Handbook of Philosophical Logic, four volumes, Kluwer, Dordrecht. Revised and expanded version appeared from 2001 onward with Springer Science Publishers.
D. Gabbay, Ch. Hogger & J. Robinson, eds., 1997, Handbook of Logic in Artificial Intelligence and Logic Programming, Oxford University Press, Oxford.
D. Gabbay, A. Kurucz, F. Wolter & M. Zakharyaschev, 2007, Many-Dimensional Modal Logics: Theory and Applications, Elsevier, Amsterdam.
D. Gabbay, V. Shehtman, D. Skvortsov, to appear, Quantification in Nonclassical Logic, King’s College London & Moscow University of Humanities.
P. Gärdenfors & H. Rott, 1995, ‘Belief Revision’, in D. M. Gabbay, C. J. Hogger & J. A. Robinson, eds., Handbook of Logic in Artificial Intelligence and Logic Programming 4, Oxford University Press, Oxford.
N. Gierasimczuk, 2010, Knowing One’s Limits, Logical Analysis of Inductive Inference, Dissertation, Institute for Logic, Language and Computation, University of Amsterdam.
J-Y Girard, 1987, ‘Linear Logic’, Theoretical Computer Science 50, 1–102.
K. Gödel, 1933, ‘Eine Interpretation des Intuitionistischen Aussagenkalküls’, Ergebnisse eines Mathematischen Kolloquiums 4, 34–38.
R. Goldblatt & S. Thomason, 1975, ‘Axiomatic Classes in Propositional Modal Logic’, in J. Crossley, ed., Algebra and Logic, Springer Lecture Notes in Mathematics 450, 163–173.
V. Goranko & S. Passy, 1992, ‘Using the Universal Modality: Gains and Questions’, Journal of Logic and Computation 2, 5–30.
J. Groenendijk & M Stokhof, 1991, ‘Dynamic Predicate Logic’, Linguistics and Philosophy 14, 39–100.
D. Grossi, 2010, ‘On the Logic of Argumentation Theory’, in W. van der Hoek et al. eds., Proceedings 9th International Conference on Autonomous Agents and Multiagent Systems, Toronto, 409–416.
A. Grove, 1988, ‘Two Modellings for Theory Change’, Journal of Philosophical Logic 17, 157–170.
A. Gupta & R. Thomason, 1980, ‘A Theory of Conditionals in the Context of Branching Time’, Philosophical Review 89, 65–90.
J. Halpern & M. Vardi, 1989, ‘The Complexity of Reasoning about Knowledge and Time, I: lower bounds’. Journal of Computer and System Sciences, 38(1):195–237.
H. Hansen, C. Kupke & E. Pacuit, 2008, ‘Neighbourhood Structures: Bisimilarity and Basic Model Theory’, in D. Kozen, U. Montanari, T. Mossakowski & J. Rutten, eds., Logical Methods in Computer Science 15, 1–38.
S. O. Hansson, 2001, ‘Preference Logic’, in D. Gabbay & F. Guenthner, eds., Handbook of Philosophical Logic IV, 319–393, Kluwer, Dordrecht.
D. Harel, 1985, ‘Recurring Dominoes: Making the Highly Undecidable Highly Understandable’, Annals of Discrete Mathematics 24, 51–72.
D. Harel, D. Kozen & J. Tiuryn, 2000, Dynamic Logic, The MIT Press, Cambridge (Mass.).
P. Hawke, 2015, Knowledge and Relevance, Ph.D. Thesis, Department of Philosophy, Stanford University.
P. Hawke & S. Steinert-Threlkeld, 2015, ‘Informational Dynamics of “Might” Assertions’, Department of Philosophy, Stanford University. Proceedings LORI 2015, Taipei.
J. Helzner & V. Hendricks, 2013, Agency and Interaction: What we Are and What we Do in Formal Epistemology, Cambridge University Press, Cambridge.
R. Hilpinen, ed., 1970, Deontic Logic: Introductory and Systematic Readings, Reidel, Dordrecht.
R. Hilpinen, ed., 1981, New Studies in Deontic Logic, Reidel, Dordrecht.
J. Hintikka, 1962, Knowledge and Belief, Cornell University Press, Ithaca.
W. Holliday, 2012, Knowing What Follows, Epistemic Closure and Epistemic Logic, Dissertation, Department of Philosophy, Stanford University.
W. Holliday & Th. Icard, 2013, ‘Logic, Probability and Epistemic Modality’, Departments of Philosophy, Berkeley and Stanford.
W. Holliday & J. Perry, 2013, ‘Roles, Rigidity and Quantification in Epistemic Logic’, Departments of Philosophy, Berkeley and Stanford.
G. Hughes & M.J. Cresswell, 1969, An Introduction to Modal Logic, Methuen, London.
B. Jónsson & A. Tarski, 1951, ‘Boolean Algebras with Operators’, Parts I and II, Amer. J. Math. 73, 74, 891–939, 127–162.
S. Kanger, 1957, Provability in Logic. Almqvist & Wiksell, Stockholm.
K. Kelly, 1996, The Logic of Reliable Inquiry, Oxford University Press, Oxford.
W. & M. Kneale, 1961, The Development of Logic, Oxford University Press, Oxford.
S. Kripke, 1959, ‘A Completeness Theorem in Modal Logic’, The Journal of Symbolic Logic, 24, 1–14.
S. Kripke, 1963, ‘Semantical Considerations on Modal Logic’, Acta Philosophica Fennica 16, 83–94.
S. Kripke, 1965, ‘Semantical Analysis of Intuitionistic Logic’, in J. Crossley and M. A. E. Dummett, eds., Formal Systems and Recursive Functions, North-Holland, Amsterdam, 92–130.
S. Kripke, 1980, Naming and Necessity. Harvard University Press, Cambridge (Mass.).
A. Kurz, 2001, Coalgebras and Modal Logic, Lecture Notes, CWI Amsterdam.
J. van Leeuwen, ed., 1991, Handbook of Theoretical Computer Science, North-Holland, Amsterdam.
C. I. Lewis & H. Langford, 1932, Symbolic Logic, Dover, New York.
K. Leyton-Brown & Y. Shoham, 2008, Essentials of Game Theory: A Concise Multidisciplinary Introduction, Morgan & Claypool Publishers, San Rafael.
D. Lewis, 1969, Convention, Harvard University Press, Cambridge (Mass.).
D. Lewis, 1973, Counterfactuals, Blackwell, Oxford.
L. Libkin, 2012, Elements of Finite Model Theory, Springer, Berlin.
Ch. List & Ph. Pettit, 2002, ‘Aggregating Sets of Judgments: An Impossibility Result’, Economics and Philosophy, 18, 89–110.
F. Liu, 2011, Reasoning About Preference Dynamics, Springer, Dordrecht.
M. Marx, 2006, ‘Complexity of Modal Logic’, in P. Blackburn et al. eds., 139–179.
R. Montague, 1974, Formal Philosophy, Yale University Press, New Haven.
R. Muskens, J. van Benthem & A. Visser, 1997, ‘Dynamics’, in J. van Benthem & A. ter Meulen, eds., Handbook of Logic and Language, Elsevier, Amsterdam.
S. Negri, 2011, ‘Proof Theory for Modal Logic’, Philosophy Compass 6/8, 523–538.
R. Nozick, 1981, Philosophical Explanations, Harvard University Press, Cambridge (Mass.).
M. Osborne & A. Rubinstein, 1994, A Course in Game Theory, The MIT Press, Cambridge (Mass.).
A. Palmigiano, W. Conradie, and S. Ghilardi, 2014, ‘Unified Correspondence’, in A. Baltag & S. Smets, eds., Johan van Benthem on Logic and Information Dynamics, Outstanding Contributions to Logic, Springer, Dordrecht, 933–975.
M. Pauly, 2001, Logic for Social Software, Dissertation, Institute for Logic, Language and Computation, University of Amsterdam.
A. Perea, 2011, Epistemic Game Theory, Cambridge University Press, Cambridge.
V. Pratt, 1981, ‘A Decidable Mu Calculus’, Foundations of Computer Science, SFCS 22, 421–427.
A. Prior, 1967, Past, Present and Future, Clarendon Press, Oxford.
W. Rabinowicz & K. Segerberg, 1994, ‘Actual Truth, Possible Knowledge’, Topoi 13, 101–115.
G. Restall, 2000, An Introduction to Substructural Logics, Routledge, London.
K. Segerberg, 1971, An Essay in Classical Modal Logic, Filosofiska Institutionen, University of Uppsala.
K. Segerberg, 1995, ‘Belief Revision from the Point of View of Doxastic Logic’, Bulletin of the IGPL 3, 534–553.
Y. Shoham & K. Leyton-Brown, 2008, Multiagent Systems: Algorithmic, Game Theoretic and Logical Foundations, Cambridge University Press, Cambridge.
R. Stalnaker, 1984, Inquiry, The MIT Press, Cambridge (Mass.).
R. Stalnaker, 1999, ‘Extensive and Strategic Form: Games and Models for Games’, Research in Economics 53, 293–319.
R. Stalnaker, 2006, ‘On Logics of Knowledge and Belief’, Philosophical Studies 128, 169–199.
E. Swanson, 2011, ‘On the Treatment of Incomparability in Ordering Semantics and Premise Semantics’, Journal of Philosophical Logic 40, 693–713.
A. Tarski, 1938, ‘Der Aussagenkalkül und die Topologie’, Fundamenta Mathematicae 31, 103–134.
A. Troelstra & D. van Dalen, 1988, Foundations of Constructivism, Elsevier, Amsterdam.
F. Veltman, 1985, Logics for Conditionals, Dissertation, Philosophical Institute, University of Amsterdam.
Y. Venema, 2006, ‘Modal Logic and Algebra’, In Handbook of Modal Logic.
Y. Venema, 2007, Lectures on the Modal Mu–Calculus, Institute for Logic, Language and Computation, University of Amsterdam.
H. Wansing, ed., 1996, Proof Theory of Modal Logic, Kluwer, Dordrecht.
T. Williamson, 2000, Knowledge and its Limits, Oxford University Press, Oxford.
T. Williamson, 2013, Modal Logic as Metaphysics, Oxford University Press, Oxford.
G. H. von Wright, 1963, The Logic of Preference, Edinburgh University Press, Edinburgh.
S. Yalcin, 2007, ‘Epistemic Modals’, Mind 116 (464):983–1026.
E. Zalta, 1993, ‘A Philosophical Conception of Propositional Modal Logic’, Philosophical Topics 21:2, 263–281.

Author Information

Johan van Benthem
Email: http://staff.fnwi.uva.nl/j.vanbenthem
University of Amsterdam, Stanford University, and Tsinghua University
The Netherlands, U. S. A., and China

Religious Disagreement

The domain of religious inquiry is characterized by pervasive and seemingly intractable disagreement. Whatever stance one takes on central religious questions—for example, whether God exists, what the nature of God might be, whether the world has a purpose, whether there is life beyond death—one will stand opposed to a large contingent of highly informed and intelligent thinkers. The fact of extensive religious disagreement raises several distinct philosophical questions. One significant question arises within the context of political philosophy: may religious conceptions of the good and the right legitimately ground one’s political convictions in a pluralistic society marked by diverse and often conflicting religious convictions? Other questions concern the possibility of reconciling disagreement data with specific religious beliefs. For example, can persistent religious disagreement be squared with the conviction of many Christians and other theists that God “desires everyone to be saved and to come to knowledge of the truth” (I Timothy 2:4, NRSV)? These and other important questions will not be taken up here. The focus of this article is the epistemic challenge raised by religious disagreement: does awareness of the nature and extent of religious disagreement make it unreasonable to hold confident religious, or explicitly irreligious, views? Many philosophers have answered this question in the affirmative, arguing that the proper response to religious disagreement is religious skepticism. Others contend that religious conviction may be reasonably maintained even in the face of disagreement with highly qualified thinkers.

Reflecting on the epistemic challenge posed by religious disagreement readily leads one to questions concerning the epistemic significance of disagreement in general, religious or otherwise. One might think that religious disagreement does not raise any distinctive epistemological questions beyond those that are addressed in a more general work on disagreement. There are, however, features of religious disagreements that present problems that, for the most part, are not adequately addressed in such a work. These features include the lack of agreement on what skills, virtues, and qualifications are most important for assessing the questions under dispute; the fact that many of the disputed beliefs are arguably epistemically fundamental; the significant evidential weight that is assigned to private experiences; and the prominence of practical or pragmatic considerations in the justifications offered for opposing viewpoints. While these features taken individually may not be exclusive to religious disagreements, the fact that they frequently coincide in religious disputes and are especially salient in such disputes makes religious disagreement a worthy epistemological topic in its own right. The bulk of this article will focus on these problematic features of religious disagreements and the special questions they raise.

The First-Order and Higher-Order Significance of Religious Disagreement
The Conciliatory Argument for the Higher-Order Defeat of Religious Belief
Permissivist Responses to the Conciliatory Argument
Religious Belief and the Problem of Judging Epistemic Credentials
Fundamental Versus Superficial Disagreements
Appeals to Religious Experience
Faith and Practical Responses to the Problem of Religious Disagreement
Conclusion
References and Further Reading

1. The First-Order and Higher-Order Significance of Religious Disagreement

Religious disagreement may present two distinct sorts of evidential challenges to a given religious belief: a first-order challenge and higher-order challenge. (Henceforth, the label “religious belief” will typically be used to refer to all beliefs that take a stand on religious questions, including explicitly irreligious beliefs such as the belief that there is no God.) The aim of this section is to clarify the distinction between first-order and higher-order evidential challenges and to look at examples of how religious disagreement may possess first-order significance for religious belief. The remaining sections will focus on the higher-order challenge posed by religious disagreement.

In the epistemological literature on disagreement, a contrast is frequently drawn between first-order and higher-order evidence (Kelly 2005). The distinction may roughly be characterized as follows. First-order evidence for or against some proposition p “directly” bears on the question of whether p, whereas higher-order evidence for or against p does not directly bear on the question of whether p but directly bears on the question of whether one has rationally assessed the first-order evidence for or against p. To illustrate the distinction, consider the case (from Rotondo 2013) of Detective, who has stayed up all night studying the evidence bearing on a particular crime. At the end of the lengthy process of sifting the evidence, Detective judges that it is very likely that Lefty, rather than Righty, committed the crime. When she calls Lieutenant to share her conclusion, Lieutenant asks whether Detective has stayed up all night and then informs Detective that every time Detective has stayed up all night in the past, her reasoning has been atrocious and unreliable (despite its seeming to Detective that nothing is amiss). Let’s call this fact that Detective has a bad track record after all-nighters UNRELIABLE. According to many epistemologists, upon learning UNRELIABLE, Detective ought to become significantly less confident that Lefty committed the crime. (Still, there are others [Lasonen-Aarnio 2014 and Titelbaum 2015] who argue that Detective should not reduce confidence if she in fact assessed the first-order evidence correctly.) Thus, UNRELIABLE is arguably evidence against Detective’s proposition that Lefty committed the crime. However, UNRELIABLE does not directly bear on the question of whether Lefty committed the crime in the same way that the evidence Detective stayed up examining does. It is not as though UNRELIABLE is more to be expected if Righty committed the crime than if Lefty committed the crime: someone who had a full night’s sleep before examining the evidence inspected by Detective could dismiss UNRELIABLE as evidentially irrelevant. If UNRELIABLE gives Detective reason to doubt the Lefty hypothesis, it is only because UNRELIABLE is higher-order evidence that raises doubts about any conclusion that Detective might have reached after an all-nighter.

Facts about religious disagreement may pose first-order or higher-order evidential worries (or both) for religious belief.

Suppose that religious view R1 suggests a view of human nature where persistent religious disagreement is to be expected, while religious view R2 suggests a view of human nature where persistent religious disagreement would be very surprising (though not impossible). Given this supposition, persistent religious disagreement would constitute first-order evidence in favor of R1 over R2. In addition to this first-order significance of religious disagreement, however, facts about disagreement may constitute higher-order evidence against both religious views if these facts raise worries about the rationality of one’s religious views or the reliability of the process by which one’s religious beliefs have been formed. Alternatively, we can imagine a situation where widespread religious agreement on the truth of R1 provides higher-order support of R1 (by boosting one’s confidence in the general reliability of religious belief formation) even though religious agreement is unexpected given R1, and thus counts as first-order evidence against it. The first-order significance of religious disagreement is thus distinct from its higher-order significance.
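One way to make the first-order comparison of R1 and R2 concrete is the odds form of Bayes' theorem, on which an observation favors R1 over R2 exactly to the extent that it is more to be expected given R1 than given R2. The sketch below is only an illustration under that Bayesian assumption. The function name and all of the probabilities are hypothetical and are not drawn from the literature discussed here.

```python
def posterior_odds(prior_odds: float, p_obs_given_r1: float, p_obs_given_r2: float) -> float:
    """Odds of R1 over R2 after an observation (odds form of Bayes' theorem)."""
    if p_obs_given_r2 <= 0:
        raise ValueError("p_obs_given_r2 must be positive")
    likelihood_ratio = p_obs_given_r1 / p_obs_given_r2
    return prior_odds * likelihood_ratio


# Hypothetical starting point: R1 and R2 are judged equally likely.
even_odds = 1.0

# Hypothetical likelihoods: persistent disagreement is expected on R1 (0.8)
# and surprising, though not impossible, on R2 (0.2).
odds_after_disagreement = posterior_odds(even_odds, 0.8, 0.2)

# An observation that is equally expected on both views has a likelihood
# ratio of 1 and so leaves the odds where they were.
odds_after_neutral_fact = posterior_odds(even_odds, 0.5, 0.5)
```

With these illustrative inputs the odds shift toward R1 in the first case and stay put in the second. The second case mirrors the point made about UNRELIABLE in the Detective example: a fact that is no more to be expected on one hypothesis than on the other has no first-order bearing, whatever higher-order significance it may carry.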

An example of an argument in the philosophy of religion that makes claims about the first-order significance of religious disagreement is the argument from divine hiddenness. This argument against theism begins by noting that according to most theists, the highest good for a human being is to be in a loving relationship with God. Many theists also claim that since God loves all human beings, God desires to be in a loving relationship with each human person. If this view is a component of theism, then, given theism, we have reason to expect that God would make God’s existence evident to all—for the lack of belief that God exists is a barrier to the loving relationship that God desires. The fact that many intelligent and thoughtful people fail to believe in God, including many people who indicate they would like to believe in God if it were possible for them to do so, is evidence that God has not made God’s existence very evident, contrary to what theism might lead us to expect. Thus, extensive and pervasive disagreement over whether God exists is claimed to be evidence against theism.

John Hick (2004) offers a very different characterization of the first-order evidential significance of religious disagreement. Rather than suggesting that such disagreement supplies an evidential basis for atheism, Hick suggests that such disagreement can instead be viewed as evidence that genuine encounters with “the Real”—that transcendent reality that is the source of salvation and that is encountered in all of the world’s great religions—are inevitably understood through conceptual frames that prevent unproblematic cognitive access to the Real as it is in itself and lead to diverse, and often conflicting, interpretations of such experiences. This position, which Hick labels “religious pluralism,” is not motivated by intractable religious disagreement alone. Hick emphasizes that the major world religions have all proven to be successful as vehicles that move practitioners from “self-centredness” to “Reality-centredness,” and this ethical parity across multiple faiths is seen by Hick as undermining the basis for thinking that one religious tradition may reasonably claim supremacy in the veridicality of its teaching. However, it is clear that Hick’s pluralism would be unmotivated if it were the case that religious dialogue typically led to an end of religious disagreement and to agreement on the teachings of one religious tradition. Hence, the fact of persistent religious disagreement does play a crucial evidential role in the case for Hick’s pluralistic hypothesis.

The argument from divine hiddenness and the case for religious pluralism can both be understood as appeals to the first-order rather than higher-order significance of religious disagreement. Consider first Hick’s pluralism. Since pluralism is itself a controversial and significantly contested religious viewpoint, the higher-order worries raised by disagreement as discussed below would seem, at least initially, to apply as much to the belief in pluralism as to other religious convictions. Considered as a piece of first-order evidence, however, religious disagreement does lend more support to religious pluralism than to many other religious hypotheses. If culturally-conditioned interpretive frameworks are as entrenched and significant as pluralists contend, then we should expect religious conversion to be fairly rare and religious disagreement to be rather intractable, as is in fact the case. Many non-pluralist religious perspectives will have a harder time accommodating this datum. Similarly, the argument from divine hiddenness is clearly a first-order rather than higher-order challenge to theistic belief. Even if religious disagreement did not pose a higher-order challenge to the theist, the fact of significant and persistent disagreement over theism could still be first-order evidence against theism. For example, even if a theist somehow knew that she and her fellow theists were in possession of more evidence than non-theists, so that the disagreement over theism did not give her any reason for questioning whether or not she and other theists have made some error in their assessment of the evidence, the fact that many reject theism, even if due to lack of information, would still constitute evidence against theism since prevalent disbelief is more to be expected given atheism than theism.

2. The Conciliatory Argument for the Higher-Order Defeat of Religious Belief

We turn now from the first-order significance of religious disagreement to an argument for the claim that religious disagreement constitutes higher-order evidence that renders religious belief (or at least many religious beliefs) unjustified—that is, that religious disagreement constitutes a higher-order defeater for religious belief. The argument for this conclusion can be seen as consisting of two components: one a priori and the other a posteriori. The a priori component aims to defend some general “conciliatory” policy that says that in disagreements that satisfy certain conditions, the proper response is a conciliatory one that gives significant weight to the views of one’s disputants. This conciliatory response might involve giving up one’s belief and adopting an agnostic stance towards the question under dispute. Or, if someone’s doxastic state is better described not in terms of belief or unbelief, but in terms of subjective probabilities or “credences,” then the appropriate conciliatory response might involve adopting a new credence for the disputed proposition that gives significant weight to the initial credences of one’s disputants. The a posteriori component of the argument aims to show that the core commitments of religious believers are in fact subject to the relevant type of disagreement—a disagreement where the aforementioned conciliatory principle requires significant reduction in confidence.
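For readers who think in terms of credences, the idea of a new credence that "gives significant weight to the initial credences of one's disputants" can be illustrated with a weighted average. This is a minimal sketch, assuming that conciliation is modeled by linear averaging. The source does not commit to any particular rule, and the function name, the default weight, and the example credences are all hypothetical.

```python
def conciliate(own: float, others: list[float], own_weight: float = 0.5) -> float:
    """Weighted average of one's own credence and the credences of one's disputants.

    own_weight is the share of weight kept for one's own initial credence;
    the remainder is split equally among the disputants (an assumption).
    """
    if not others:
        return own  # no disagreement to respond to
    if not 0.0 <= own_weight <= 1.0:
        raise ValueError("own_weight must lie between 0 and 1")
    for credence in [own, *others]:
        if not 0.0 <= credence <= 1.0:
            raise ValueError("credences must lie between 0 and 1")
    weight_per_other = (1.0 - own_weight) / len(others)
    return own_weight * own + sum(weight_per_other * c for c in others)


# Hypothetical case: I am at 0.9 and a single disputant is at 0.1.
split_the_difference = conciliate(0.9, [0.1])              # equal weights
steadfast = conciliate(0.9, [0.1], own_weight=1.0)         # no conciliation
```

On this toy model, equal weighting moves the two parties to the midpoint of their initial credences, while a weight of 1 on one's own view corresponds to not conciliating at all. Intermediate weights represent the less demanding policies discussed later.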

a. Strong Conciliatory Policies

The a priori stage of the argument defends some conciliatory policy that is demanding enough to require significant reduction of confidence in religious disagreements. There is no canonical conciliatory policy that is agreed upon by those who argue that disagreement has significant higher-order force, but a variety of conciliatory requirements have been proposed, some more demanding than others. Despite the diversity of conciliatory proposals, one can discern behind the most demanding conciliatory views two basic commitments (Vavova 2014). The first is a principle that requires epistemic deference to other thinkers in proportion to their apparent epistemic qualifications, and the second is a principle that constrains the types of reasons to which one can legitimately appeal when assessing the relative epistemic qualifications of the various sides of the dispute. It is worth separating these principles out, since some criticisms of the most demanding conciliatory views can be understood as targeting the first principle, while others, the second. We might articulate these principles as follows:

DEFERENCE: In a disagreement over p, one ought to show epistemic deference to suitably qualified thinkers, giving equal weight to one’s own view and the view of an apparent epistemic peer (where an “epistemic peer” is someone whose epistemic credentials with respect to p are equal to one’s own) and more weight to the view of someone who appears to be one’s epistemic superior.

INDEPENDENCE: In assessing my own and my disputant’s epistemic credentials with respect to p, in order to determine how (or whether) to modify my own belief about p, I should base my assessment on grounds that are independent of my disputed reasoning concerning p. (Adapted from Christensen 2011.)

Consider DEFERENCE first. The basic idea behind DEFERENCE is that one cannot reasonably maintain confident belief that p while thinking that those who reject p are just as qualified and well-positioned to assess the plausibility of p as those on one’s own side of the dispute. For example, according to DEFERENCE, someone who believes that Muhammad is a prophet of God cannot reasonably think that those who reject this claim are, taken as a whole, just as qualified to assess the claim as those who accept it. Whether DEFERENCE is plausible partly depends on what is meant by “epistemic credentials.” The principle is plausible only if our understanding of epistemic credentials is such that we should expect those who are more credentialed to be more reliable in their views on the disputed question than those who are less credentialed. This means that epistemic credentials must be assessed relative to the particular proposition under dispute and the particular occasion when the proposition was assessed. Furthermore, the relevant credentials must take into account all of the dimensions of epistemic evaluation that bear on the reliability of one’s judgment on the matter at hand, such as the quality and quantity of one’s evidence and the ability to assess that evidence in a rational and unbiased manner. This understanding of epistemic credentials may not align with conventional notions of expertise. For example, someone who is a leading researcher on a contested scientific question might count as less credentialed than a well-informed non-specialist if there is concern that the researcher’s personal involvement has biased her judgment.

Given a sufficiently fine-grained understanding of epistemic credentials, DEFERENCE looks very plausible. Those who reject DEFERENCE must affirm that there are situations where I can reasonably stand by my view that p even though I acknowledge that my epistemic position with respect to p is no stronger than that enjoyed by my interlocutor who denies p. It seems that in such a situation, I would need to hold that, despite our equally strong epistemic positions, I was simply lucky in arriving at the truth and my interlocutor was not. Many writing on disagreement seem to take it for granted that this would not be a rational position. Even those who oppose demanding conciliatory views typically hold that in order to dismiss the worries raised by disagreement, it is necessary to identify a “symmetry breaker” (Lackey 2010), or some reason for thinking that one’s own side is better placed to assess the disputed matter than the opposition. In the next section, we consider one response to disagreement-motivated religious skepticism that involves rejecting DEFERENCE.

INDEPENDENCE is the more controversial of the two conciliatory principles offered above. Christensen (2009), who is responsible for labeling the principle, argues that INDEPENDENCE is the key principle separating “conciliationists” and their opponents. According to Christensen, INDEPENDENCE captures what is wrong with “blatantly question-begging” responses to disagreement (2011). He gives an example where two individuals who are sharing a dinner at a restaurant with several friends both calculate in their heads what each person’s share of the total bill comes to (Christensen 2007). They agree to add 20% of the post-tax total for tip and to split the check evenly among the members of the party. Both friends do this sort of calculation often and know that the other person is no more or less reliable than they are. They usually agree on the answer in such cases, but in those instances when they do reach different answers, neither of them has proven more likely than the other to be the one who has made an error. While nothing is out of the ordinary in this case (for example, neither friend is especially distracted or extra alert), upon finishing their mental calculations they discover that their answers differ: one arrived at an answer of $43, and the other, $45. According to Christensen (2011), INDEPENDENCE is needed to explain why it would be illegitimate for one of the two friends to dismiss the significance of the disagreement by reasoning as follows: “Since my friend fails to see that the facts support an answer of $43, I have good reason for thinking that, contrary to my expectations, she is not (at least at this moment) a reliable judge of the question we are disputing; therefore, her disagreement gives me no reason at all to question my answer.”
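The pressure that the restaurant case exerts can be illustrated with a small probability calculation. The sketch below is a simplified model and not Christensen's own formalization. It assumes that each friend independently gets the right answer with the same probability r, and that two mistaken calculations never land on the same wrong figure. The function name and the value r = 0.9 are hypothetical.

```python
def prob_i_am_right_given_disagreement(r: float) -> float:
    """Probability that my answer is correct, given that my friend's answer differs.

    Assumptions (for illustration only): both of us are right with the same
    probability r, our errors are independent, and whenever both of us are
    wrong our wrong answers differ from each other.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    i_am_right_friend_wrong = r * (1 - r)
    friend_right_i_am_wrong = (1 - r) * r
    both_wrong = (1 - r) ** 2
    disagreement = i_am_right_friend_wrong + friend_right_i_am_wrong + both_wrong
    return i_am_right_friend_wrong / disagreement


# Hypothetical track record: each friend is right 90% of the time.
my_chance = prob_i_am_right_given_disagreement(0.9)
```

Under these assumptions the result simplifies to r / (1 + r), which is below one half for any r less than 1, and the same figure applies to the friend. The model therefore gives neither party grounds for favoring the $43 answer over the $45 answer, which is why the dismissive reasoning quoted above looks illegitimate: it treats the bare fact of disagreement as evidence about who erred.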

Most agree that this sort of response to the disagreement is unreasonably sanguine. But it is questionable whether a principle as strong as INDEPENDENCE is needed to explain why this response is unreasonable (Kelly 2013). Note that in the present example, the speaker did not have any reason to be dismissive of his friend’s view antecedent to learning that she arrived at a different answer. His only reason for judging that she was unreliable was the fact that her answer differed from his. Such crudely dogmatic reasoning, if acceptable, could be used to dismiss as misleading any piece of evidence that goes against what one believes. This is quite different from a situation where someone’s belief that p gives him a reason for thinking, even before learning what his friend thinks about p, that his friend is unlikely to be reliable in her judgment concerning p. Consider, for instance, a situation where someone comes to believe in a religion that teaches that the wealthy are frequently biased when it comes to assessing spiritual questions. Such a person has a reason for distrusting his wealthy friend’s opinion concerning the religion even before learning what her opinion is. Suppose the new convert learns that his wealthy friend does reject the new religion as false, but the convert is largely unworried by the disagreement since his new religion teaches that the wealthy are biased on such matters. While this dismissal appeals to dispute-dependent reasons and thus violates INDEPENDENCE, the dismissal would not be based on the crude sort of dogmatic reasoning that would always be available in any dispute. It is at least less obvious in this case that the dispute-dependent reasoning is objectionable. This provides some reason for doubting whether a strong anti-question-begging principle like INDEPENDENCE is needed in order to explain why the quick dismissal in the calculation case is problematic.

b. Modest Conciliatory Policies

The last section focused on the most demanding sort of conciliatory policy, which features both DEFERENCE and INDEPENDENCE. But many proponents of broadly conciliatory views advocate less demanding policies that feature weaker principles than these. In particular, many seek to avoid some of the radical implications that are thought to follow from INDEPENDENCE, opting instead for a weaker anti-question-begging constraint. To see why INDEPENDENCE taken as a general requirement is thought to support implausibly demanding prescriptions, suppose you find yourself in a disagreement with a radical skeptic who believes that human cognitive faculties, including those employed in philosophical reasoning, are systematically unreliable. You might have many reasons for thinking that this skeptic is not particularly qualified with respect to philosophical matters. Perhaps he has not read any academic philosophy and succumbs to several logical fallacies in his argumentation. Still, these reasons for putting little trust in your interlocutor would not be dispute-independent, since you should not trust your ability to judge epistemic credentials if you took seriously the skeptic’s view that your cognitive faculties are systematically unreliable. It would seem, then, that in this context, you cannot have dispute-independent reasons for thinking that you are more qualified than your disputant. Of course, you also cannot have dispute-independent reasons for thinking that your disputant is more qualified. Nonetheless, a lack of dispute-independent reasons for favoring either side may itself be considered a dispute-independent reason for having an equal estimation of the epistemic credentials of the two sides. If this is right, then INDEPENDENCE, in combination with DEFERENCE, would seem to require that you give up believing in the reliability of your cognitive faculties. Since most do not think that we must have non-question-begging reasons for rejecting skepticism, even when we encounter a real skeptic, many advocates of conciliatory views aim to articulate an anti-question-begging constraint that is less absolute than INDEPENDENCE. Conciliatory views that feature a weaker anti-question-begging requirement may nonetheless be sufficiently powerful to undermine religious beliefs.

One example of a weaker anti-question-begging requirement is that proposed by Schellenberg (2007, 171). Schellenberg allows that the need to avoid the most general skepticism warrants trusting those belief-forming mechanisms that are (nearly) universal and unavoidable, even when there are no non-question-begging reasons for such trust. This would explain why we need not capitulate in disagreements with an isolated skeptic who doubts the reliability of our perceptual, memorial, and/or inferential faculties. According to Schellenberg, however, a mechanism that is neither universal nor unavoidable should not be trusted in the absence of independent grounds for thinking that it is reliable. Since he maintains that religious belief-formation is neither universal nor unavoidable, and since it is not possible in the current context of religious controversy to give non-question-begging grounds for taking some particular mechanism of religious belief formation to be reliable, Schellenberg concludes that religious skepticism is the only rational option.

Alston (1991, 198-9) has contested the claim that non-universal belief-forming mechanisms should be held to higher epistemic standards than universal mechanisms, claiming that there is no reason to suppose that all mechanisms worthy of default trust will be common to all or most mature adults. Additionally, Schellenberg’s criteria appear to have consequences that many would find dubious.

Consider the example (adapted from Plantinga [2000, 450]) of someone in colonial America who is strongly inclined towards the view that chattel slavery is morally abhorrent, but who is not unavoidably drawn to this conclusion. Schellenberg’s criteria seem to imply that such a person cannot rationally judge slavery to be morally abhorrent unless she can cite dispute-independent reasons for thinking her own moral views are more likely to be reliable than the majority position. This is an unpalatable conclusion, since it is questionable whether such dispute-independent reasons could be identified.

Another attempt to articulate a more qualified constraint on question-begging is that supplied by Christensen (2011). Christensen acknowledges that in a dispute with a radical skeptic (or with someone else who questions all of the beliefs we rely on to assess epistemic credentials), we lack a dispute-independent reason for thinking that we are more qualified than our disputant. Nevertheless, he argues that the mere absence of dispute-independent reasons for favoring one’s own side is not enough for disagreement to pose legitimate skeptical worries. Nor will disagreements pose serious skeptical worries in cases where a dispute-independent evaluation produces only a very weak reason for thinking that the credentials of one’s disputant rival or surpass one’s own. On the contrary, significant conciliation will typically be required only when there are sufficiently strong positive dispute-independent grounds for trusting the other side. Christensen’s bill tabulation case, where the disagreeing friends have significant track record data that suggest that they are equally reliable at mental math, is an example where the disputants have strong dispute-independent reasons for taking the other person to be an epistemic peer, resulting in significant conciliatory pressure. In contrast, consider a disagreement between two philosophers who systematically disagree across a wide range of ethical questions. While these philosophers might acknowledge that they are both comparable in intelligence, degree of philosophical training, and general intellectual virtue, this sort of parity provides a much weaker reason (in comparison to the solid track record data of the first case) for thinking that the other person is an approximate peer. Christensen’s view would therefore suggest that the conciliatory pressure is weaker in this latter case. The significance of Christensen’s moderate conciliationism for religious belief is discussed in section 4.

c. The a Posteriori Stage of the Argument

No plausible conciliatory policy will require giving up religious belief in the face of just any disagreement. Plausible policies will require religious skepticism only if one’s religious beliefs are contested by a sufficiently large and qualified contingent of individuals. The full argument that religious disagreement defeats some religious belief must do more than merely assert that the belief is contested; it must assert that the degree of dissent is significant enough that the correct view on disagreement will require abandoning the belief. This is the a posteriori stage of the argument, which has received little attention despite the fact that it is far from trivial. Consider the evidential implications of the distribution of opinion concerning the existence of God. A 2010 poll of over 18,000 adults conducted by Ipsos in 23 countries found that 51% of respondents reported believing in at least one God or “Supreme Being,” 17% reported sometimes believing and sometimes not believing, 13% reported not being sure if they believe, and 18% indicated that they do not believe in any sort of divine being. These percentages vary only slightly across different levels of education. The epistemic import of this data is far from clear. Kelly (2011) suggests that the fact that theistic belief is significantly more prevalent than atheistic belief constitutes evidence that at least slightly supports theism. But some proponents of religious skepticism may argue that the exact proportions are not very significant, and that what is epistemically significant is the lack of consensus. Additionally, many hold that the beliefs in the overall population are far less significant than the beliefs of those with relevant expertise. And atheism is the dominant position in certain communities of experts. For example, a large 2009 survey of professional philosophers conducted by Bourget and Chalmers at 99 leading departments found that 73% of professional philosophers accepted or leaned towards atheism, while only 15% accepted or leaned towards theism. On the other hand, most philosophers specializing in philosophy of religion were theists, with 72% accepting or leaning towards theism and only 19% accepting or leaning towards atheism. Draper and Nichols (2013) argue that the specialists in philosophy of religion are significantly influenced by pro-religious biases, a claim which, if true, would perhaps significantly diminish the epistemic significance of the prominence of theistic belief among philosophers of religion. No doubt some theists would counter that certain selection effects and anti-theistic biases in the wider professional culture of philosophy help to explain the prominence of atheism among philosophers as a whole. In any case, many religious believers would contest the notion that philosophical expertise is the most important qualification for reliably evaluating religious questions (see section 4). Clearly, several delicate and contentious questions must be addressed by anyone attempting to measure the degree of qualified opposition to a given religious perspective.
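The way the balance of opinion shifts with the choice of reference class can be seen by tabulating the figures just cited. The sketch below uses only the percentages reported above. The group labels and variable names are illustrative, and treating the Ipsos categories "believes in at least one God or Supreme Being" and "does not believe in any sort of divine being" as theism and atheism is a simplifying assumption.

```python
# (theist %, atheist %) as reported in the surveys cited above.
surveys = {
    "general public (Ipsos 2010)": (51, 18),
    "professional philosophers (2009)": (15, 73),
    "philosophers of religion (2009)": (72, 19),
}


def theist_to_atheist_ratio(theist_pct: float, atheist_pct: float) -> float:
    """How many theists there are per atheist in a surveyed group."""
    if atheist_pct <= 0:
        raise ValueError("atheist_pct must be positive")
    return theist_pct / atheist_pct


ratios = {group: theist_to_atheist_ratio(*pcts) for group, pcts in surveys.items()}
majority_view = {
    group: "theism" if theist > atheist else "atheism"
    for group, (theist, atheist) in surveys.items()
}

for group in surveys:
    print(f"{group}: ratio {ratios[group]:.2f}, larger camp: {majority_view[group]}")
```

The tabulation settles nothing about which group's opinion deserves the most weight. It only makes vivid that the direction of the majority depends on whose views are counted, which is the question the a posteriori stage of the argument has to answer.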

d. The Scope of the Conciliatory Argument

Some proponents of disagreement-motivated religious skepticism target any confident view on contentious religious questions, including those atheistic perspectives that would not typically be labeled “religious.” Others argue that religious disagreement defeats the justification of explicitly religious worldviews, but do not think that secular atheism is similarly threatened. According to Kitcher (2014, 7), “the religious convictions of many contemporary believers are formed in very much the same ways,” namely, through trusting a religious community that claims to preserve and pass on the teachings of prophets and mystics who had some special connection with God or the transcendent. Since this process leads to incompatible beliefs when it is employed in different cultural contexts, the process is unreliable and cannot justifiably be trusted. Because Kitcher does not think that religious disagreement undermines secular atheism—indeed, he appeals to religious disagreement precisely to motivate a “soft atheism” that makes “small concessions in the direction of agnosticism” (23-4)—he presumably thinks that acceptance of secular atheism is not based on a process of communal trust that is relevantly like the unreliable sort of trust that typically grounds religious conviction. Thus, religious disagreement defeats those beliefs typically labeled “religious,” but does not defeat secular convictions.

A significant problem for this more narrowly targeted defeat argument is that many religious believers will deny that the process by which they hold their convictions is accurately characterized as one of uncritically trusting their religious community. They may acknowledge that unreflective trust of a religious community is unjustified (in light of religious diversity), but insist that the process by which they hold their beliefs is one of critically reflecting on all of the evidence. This evidence includes their personal experience and community’s experience, to be sure, but also testimonial evidence from other communities, scientific evidence, and philosophical considerations. Kitcher anticipates a response along these lines, but insists that even when we confine our focus to reflective and philosophically sophisticated religious believers, we still find a substantial amount of controversy, and for this reason he thinks that even philosophically reflective religious belief is defeated by disagreement (9-10). What is unclear, however, is why Kitcher thinks that an irreligious secular outlook can avoid epistemic defeat when the process that presumably accounts for his secular convictions—namely, the process of critical philosophical examination of all the relevant evidence—is a process that appears to lead many thinkers to conclusions that are incompatible with his secularism. If one insists that religious disagreement defeats even reflective religious belief, it will be difficult to maintain that explicitly irreligious belief is not similarly threatened (Bogardus 2013a, 390).

3. Permissivist Responses to the Conciliatory Argument

As discussed in the last section, the conciliatory views that lie behind arguments for disagreement-based religious skepticism can often be understood as consisting of two commitments: a commitment to a principle like DEFERENCE that requires epistemic deference to apparently qualified interlocutors, and a commitment to a principle like INDEPENDENCE that prohibits a question-begging assessment of epistemic qualifications. While many responses to arguments for disagreement-based religious skepticism take issue with INDEPENDENCE, some target DEFERENCE. These responses are the focus of this section.

Critics of DEFERENCE maintain that its plausibility depends on an unacceptably restrictive conception of rationality according to which a given body of evidence rationalizes at most one doxastic attitude towards any given proposition (Schoenfield 2014). On this restrictive picture, if two agents A and B have exactly the same evidence bearing on p and are both perfectly rational in responding to that evidence, then A and B will have the same view on p’s plausibility. This thesis is frequently called the “uniqueness thesis” (Feldman 2007), since it holds that there is a uniquely rational doxastic response to any particular body of evidence. Critics of DEFERENCE deny uniqueness and maintain that, in at least some contexts, there is no single response to a given body of evidence that is maximally rational. Religious questions are often cited as one context where rationality is “permissive” in this way. Surely, these “permissivists” maintain, there is no single credence for, say, God’s existence that stands alone as the maximally rational response to a given body of evidence.

The permissivist objection to the applicability of DEFERENCE in religious disputes thus consists of two claims: (i) DEFERENCE applies in contexts of religious disagreement only if such contexts are rationally impermissive (such that there is a unique doxastic response that is fully rational); and (ii) many religious disagreements are contexts where rationality is permissive.

Here is one way of motivating the first claim. Suppose that Al knows that Beth possesses more or less the same evidence as him concerning religious matters and that she is epistemically impeccable. Presumably, the discovery that Beth rejects Al’s religious views should lead Al to worry about his religious views only if this discovery gives Al reason to suspect that his view is not a fully rational response to his (pre-disagreement) evidence. Furthermore, Beth’s contrary religious viewpoint gives Al a good reason to suspect such irrationality only if it cannot be the case that there are multiple contrary religious viewpoints that are each a fully rational response to their evidence. In other words, the disagreement gives Al a good reason to question the rationality of his initial view only if their religious dispute is a context where rationality is impermissive. If full rationality permits a variety of religious perspectives in response to the same evidence, then religious disagreement does not raise worries about the rationality of one’s pre-disagreement religious views and the epistemic deference commended by DEFERENCE would seem to be unmotivated.

It is controversial whether DEFERENCE depends for its motivation on a non-permissivist conception of rationality as the above argument maintains. While advocates of conciliatory views do frequently characterize the worry raised by disagreement as a worry about the rationality of one’s pre-disagreement position, this need not be the only concern raised by disagreement (Christensen 2014). Even if Al knew that his religious reasoning is perfectly rational, Beth’s disagreement could still raise a different sort of worry: Beth’s disagreement might constitute evidence that rational reflection on religious questions does not reliably lead to true religious beliefs. And the knowledge that the rational formation of one’s religious views does not reliably conduce to true belief arguably gives Al a defeater for his religious views, even if Al knows that, prior to learning of the disagreement, his views were fully rational. Of course, if almost all rational people agreed with Al and only a few with Beth, Al might be able to affirm that rational reflection on religious matters does reliably lead to the truth, and perhaps he could be untroubled by the fact that in Beth and a few others, genuinely rational reflection has led to false religious views. (Beth, being in the small minority, could not be similarly sanguine.) But if the number of apparently rational thinkers who are as informed as Al and yet disagree with him rivals the number who agree with him, then religious disagreement would supply evidence of the unreliability of rational religious reflection and may on that account defeat confident religious belief.
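The reliability worry in this paragraph rests on a simple counting point: if fully rational and equally informed thinkers divide into camps whose views cannot all be true, then the share of them who have arrived at the truth can be no larger than the share belonging to the biggest camp. The sketch below illustrates that bound. It assumes that the camps hold mutually incompatible views, so that at most one camp is correct, and the function name and head counts are hypothetical.

```python
def reliability_upper_bound(camp_sizes: dict[str, int]) -> float:
    """Largest possible fraction of thinkers holding a true view.

    Assumes the camps hold mutually incompatible views, so at most one camp
    can be right; the best case is that the largest camp is the correct one.
    """
    total = sum(camp_sizes.values())
    if total <= 0:
        raise ValueError("there must be at least one thinker")
    return max(camp_sizes.values()) / total


# Hypothetical head counts of apparently rational, equally informed thinkers.
near_consensus = reliability_upper_bound({"Al's view": 95, "Beth's view": 5})
even_split = reliability_upper_bound({"Al's view": 50, "Beth's view": 50})
```

In the near-consensus case the bound is high, which fits the thought that Al could remain untroubled by a few dissenters, though the bound is reached only if the majority happens to be right. In the evenly split case rational reflection can be reaching the truth at most half the time, and the bound falls further as the number of rival camps grows.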

Supposing that DEFERENCE does depend on a non-permissive conception of rationality, despite the preceding reflections, is it plausible to maintain that rationality is permissive with respect to many religious questions? It seems fairly clear that there are contexts where rationality is not permissive. For example, someone’s credence that they will win a particular lottery ought to conform to the mathematical odds (absent any reason to suspect supernatural intervention or corrupt lottery officials). But many philosophers find it implausible to suppose that the requirements of rationality are equally precise in domains of inquiry like religion. The types of evidence and rational standards that govern views on the reality of an afterlife, for example, seem too coarse-grained to admit of a precise credence that is maximally rational, or even a maximally rational credence range with precise endpoints. (For a vigorous defense of the uniqueness thesis, see White [2005].) Also, even if we are concerned not with credences but with the coarse-grained doxastic attitudes of belief, disbelief, and withholding (neither believing nor disbelieving), it seems implausible to suppose that there are no borderline cases where either of two attitudes (for example, belief or withholding) could be fully rational.

Even if there are good reasons for thinking that rationality is permissive in religious contexts, it is not clear that an appeal to permissive rationality can defuse worries raised by religious disagreement. First, the affirmation that rationality permits fundamentally opposed religious perspectives appears to be incompatible with certain religious perspectives. The apostle Paul, for instance, asserts that creation provides evidence of God’s eternal power and divine nature that is plain to all, and that the wicked who turn away from God “suppress the truth” (Romans 1:18ff). Second, even if permissivism is correct and religiously acceptable, questions may be raised about just how permissive rationality is. Perhaps weak belief in the reality of an afterlife as well as agnosticism on the question could both be fully rational responses to a given body of evidence. But could confident belief as well as confident disbelief in the afterlife both be fully rational responses to a given body of evidence? An extreme permissivist view that answers this question in the affirmative is significantly more controversial than permissivism itself.

Because it is not clear how permissive rationality is, if indeed it is permissive, with respect to religious questions, it is not clear what degree of religious disagreement among those with similar evidence is required to indicate likely irrationality on the part of at least one of the parties to the disagreement. But religious disagreement is notable for being so extreme. Many Christians, for example, are utterly convinced that God raised Jesus from the dead; on the contrary, many atheists think that theism, not to mention Jesus’ resurrection, is fanciful nonsense that can be dismissed out of hand. Perhaps no other domain of inquiry exhibits this degree of doxastic polarization. Those who appeal to permissivism in order to defuse worries raised by religious disagreement must therefore affirm a very strong form of permissivism according to which rationality radically underdetermines the appropriate response to a given body of evidence. Even philosophers who are inclined towards permissivism may find such an extreme form of the view unpalatable and implausible.

4. Religious Belief and the Problem of Judging Epistemic Credentials

If DEFERENCE is correct, despite the permissivist challenge just considered, then the religious views of apparent epistemic peers or epistemic superiors on religious matters ought to be accorded significant weight. But how are we to determine who our epistemic peers and superiors are? Asked differently, how are we to assess epistemic credentials? As discussed in section 2, many affirm some principle like INDEPENDENCE and maintain that epistemic credentials ought to be assessed in a dispute-independent manner. The fact that INDEPENDENCE helps to explain the intuitive verdict in the calculation case discussed above does lend it some plausibility. But as already noted, some proponents of conciliatory theories deny that we are always required to rely only (or even primarily) on dispute-independent reasons in responding to disagreement. Thus, even if INDEPENDENCE is on the right track, there may be features of religious disagreements that distinguish them from the calculation case and that weaken or render inapplicable the anti-question-begging requirement that applies in the calculation case.

One potentially significant difference between religious disagreements and the calculation case (and similar cases that are used to motivate INDEPENDENCE) is that in many religious disagreements, there is no clear basis for arriving at a dispute-independent judgment concerning the epistemic qualifications of the parties to the dispute (Pittard 2014). In the calculation case, the robust track record information provides a good dispute-independent basis for estimating the reliability of each friend in answering questions like the one under dispute. But consider a dispute between, say, a theist and an atheist. What nonpartisan, dispute-independent grounds do the disputants have for arriving at an estimation of their epistemic credentials concerning the question of God’s existence? One might think that they should compare their track record on other religious questions: for example, whether a relationship with God would be a great good, whether the sorts of suffering we observe in this world could be redeemed by God (should God exist), or whether there are plausible naturalistic and non-teleological explanations for the existence and character of our universe. However, it is quite clear that this procedure is unlikely to yield a nonpartisan assessment of their respective epistemic credentials. The atheist and theist will probably disagree on these questions as well, for reasons that are not independent of their dispute concerning theism, preventing them from arriving at a dispute-independent calculation of their “religious track record.”

If religious track records cannot provide the atheist and the theist with a basis for a nonpartisan assessment of epistemic credentials, one might think that they can arrive at such an assessment by considering the degree to which they each exhibit the intellectual capacities and epistemic advantages that are most important for a reliable assessment of religious claims. For example, they could estimate one another’s intelligence and analytic sophistication by means of some indicator like academic performance, and through conversation they could perhaps ascertain how extensively each of them has studied topics relevant to an assessment of theism. Unfortunately, in a wide range of religious disputes, this sort of procedure is unlikely to deliver a dispute-neutral assessment of epistemic credentials. This is because many systems of religious belief include incompatible views on what qualifies one to reliably assess religious claims. In this respect, religious disagreement is quite different from controversies in many other domains. Two civil engineers with opposing views on the merits of some bridge proposal will most likely agree on what sort of training and cognitive capacities are required to be a good judge of engineering questions, and they probably agree on which institutional signals (for example, academic degrees, professional experience, publications) serve as reliable evidence that someone possesses the requisite capacities. In many religious disputes, however, whether the disputed proposition is true or false has significant implications for the question of which qualifications best position one to assess the disputed proposition. Some Buddhists, for example, maintain that meditative disciplines are required in order to loosen the grip of certain delusions and to enable an adequate appreciation of the truth of Buddhist teachings concerning the non-existence of a personal self. 
Those who have considered Buddhism and who are not convinced are unlikely to accept that these meditative disciplines are an important qualification for an assessment of Buddhist claims. To consider another example, a Christian, inspired by the apostle Paul’s writings in I Corinthians 1-2, might affirm that scholarly credentials and analytic sophistication do not help one to see the truth of the Christian message, but that the key qualification is the possession of a divinely-given insight into the beauty and excellence of God as portrayed in the Christian message. Non-Christians will clearly not share this view concerning which qualifications are most important.

In many religious disputes, then, questions about which qualifications are most important cannot be separated from the primary religious matter under dispute, so that there is no shared theory of epistemic credentials that could ground a dispute-independent assessment of the disputants’ qualifications. If Christensen’s moderate conciliatory position is right and significant conciliation is required only when one has positive dispute-independent reasons for trusting the other side, and if in many religious disagreements there is no basis for a dispute-independent assessment of epistemic qualifications since questions about which credentials are relevant are caught up in the dispute at hand, then it seems that the correct conciliatory view may not require significant conciliation in many religious disputes.

Against this conclusion, one might protest that even if there is no nonpartisan theory of epistemic credentials that one can employ for a dispute-independent assessment of epistemic qualifications, it is quite probable that one’s own partisan view on epistemic credentials will give one reason to trust the other side. And if that is the case, then surely conciliation will be required. Suppose that atheists maintain that the most significant qualification in assessing theism is the possession of philosophical aptitude, and that theists maintain that the most significant qualification is a selfless love for others, which they think properly disposes the heart to see the truth of “divine things.” While there is no dispute-neutral theory of epistemic credentials in this case, it is certainly possible that the atheists’ own theory will give them a reason to assign significant weight to the views of theists if there are numerous theists who are philosophically qualified, and that the theists’ own theory will give them a reason to assign significant weight to the views of the atheists if atheists exhibit just as much selfless love as theists. In this case, both sides would, by their own lights, have reason to significantly reduce confidence in their respective views. In fact, we could say that both sides do have a dispute-independent reason for trusting the other side, the reason being that on either of the competing theories concerning which credentials are most relevant, there is reason to think the other side highly qualified.

As the above rejoinder shows, the mere lack of a common perspective on the relevant epistemic credentials is not enough to escape the threat posed by disagreement, even given a more moderate conciliatory view like Christensen’s. Conflicting theories of epistemic credentials will alleviate the worries posed by disagreement only if one’s partisan theory of epistemic credentials does not give one strong reason to trust the other side. Is there any reason to think that a theory of epistemic credentials that is part of some religious belief system will not supply strong reasons for thinking that adherents of other belief systems are epistemically qualified? This is, ultimately, an empirical matter that must be settled on a case-by-case basis: what does a given religious viewpoint say about which epistemic credentials are most important when it comes to religious matters? Does the theory of epistemic credentials implied by the religious perspective give strong reasons for thinking that many of those who reject the religious perspective in question are nonetheless epistemically qualified? Clearly, the answers to these questions could vary depending on which religious perspective we are inquiring about. Pittard (2014) gives some reasons to think that, in many cases at least, religiously-motivated theories of epistemic credentials will not supply strong reasons for thinking that those who reject the viewpoint in question are epistemically qualified. First, the credentials that are emphasized by religious traditions are often credentials the possession of which is not easily discernible. Taking inspiration from Jesus’ sermon on the mount, suppose that “purity of heart,” that is, untainted desire for God, is necessary in order to see the truth of divine things. Unlike more standard epistemic credentials that are relevant in mundane domains of inquiry, purity of heart is not something whose presence in one’s disputant can easily be confirmed or disconfirmed. 
And if the most important epistemic credentials pertaining to religious questions are unobservable, then one may not have very strong reasons for thinking that one’s disputant is qualified with respect to religious matters. Second, many systems of religious belief feature credentials that are unlikely to be possessed by someone who is not disposed to accept the belief system in question. Consider a Theravada Buddhist who maintains that the truth of “emptiness” is unlikely to be evident apart from substantial engagement in Buddhist meditation. While it is perhaps easy to confirm whether or not someone has practiced Buddhist meditative practices in a disciplined way, it is unlikely that someone would pursue years of Buddhist meditation unless she was already positively disposed towards Buddhism. When the putative epistemic credentials are self-selecting in this way, it is less likely that there will be large numbers of disbelievers who possess the credentials.

If (i) a dispute-independent evaluation does not provide strong reasons for thinking that the other side of a religious dispute is as credentialed as one’s own side, (ii) one’s own partisan theory of religious epistemic credentials also does not supply such a reason, and (iii) significant conciliation is required in disagreements only to the extent that one has strong reasons for thinking that the qualifications of those on the other side of the dispute rival or surpass the qualifications of one’s own side, as Christensen asserts, then the skeptical implications of religious disagreement may be quite limited.

Against this line of thinking, some have complained that a view on disagreement is too weak if it allows religious believers to set aside worries raised by disagreement simply because their religiously-motivated theory of epistemic credentials does not give them reason to highly estimate the credentials of their disputants. Lackey (2014, 308), for example, notes that we should not affirm the reasonableness of the sexist who dismisses disagreement-related worries on the grounds that his disputant is a woman. Similarly, she insists that one should not be able to escape worries raised by religious disagreement simply by affirming a partisan and contestable view on the nature of the relevant epistemic credentials. In response, one could point out that while the sexist’s position after dismissing his female disputant is highly unreasonable, this is compatible with its being the case that he applied the correct policy for responding to disagreements. The unreasonableness of his final position may be explained by the unreasonableness of the sexist views he held before the dispute, and not by the employment of an incorrect disagreement norm. After all, we should not expect that applying the right disagreement norm will correct for rational failures that one brings into a disagreement situation. Lackey considers this response and answers as follows: “If an atheist sticks to her guns with respect to her belief that God does not exist just by regarding the theist as her epistemic inferior, this is irrationality in her response to a disagreement. It is not clear what could justify relegating these failures of rationality to epistemology generally rather than to the epistemology of disagreement in particular” (311).

What follows if Lackey’s more expansive conception of the epistemology of disagreement is granted? Perhaps the correct disagreement norm will still allow that the significance of religious disagreement is sensitive to one’s theory of epistemic credentials, but with the added caveat that one’s theory of epistemic credentials can mitigate the worries raised by disagreement only if one’s adherence to the theory is reasonable and not just an unmotivated attempt to block disagreement worries. It isn’t clear how this changes the dialectical situation, since adherents of a particular religiously-motivated theory of epistemic credentials presumably think that their theory is reasonable, and thus not analogous to the prejudice of the sexist. On the other hand, the correct disagreement norm could deny that the evidential significance of religious disagreement is sensitive to what theory of epistemic credentials one happens to hold. One way to do this would be for the disagreement norm to simply stipulate the correct theory of epistemic credentials. But this would require taking a stand on questions that are contested on religious grounds. The resultant norm could not supply a religiously neutral motivation for religious skepticism. Alternatively, the norm could require that one’s assessment of a disagreement always be dispute-neutral, and that equal weight be assigned to both sides in those cases where there is no agreement on the relevant credentials. But such a strong conciliatory norm would require capitulation in disagreements with radical skeptics, which is what led Christensen and others to search for principled conciliatory policies with more modest anti-question-begging constraints. In short, it is not clear whether there is a conciliatory norm that is religiously neutral and not overly skeptical, but that completely forbids relying on one’s particular theory of epistemic credentials in assessing the significance of a religious disagreement.

It should be emphasized that a moderate conciliatory view like Christensen’s will require reduction of confidence in many religious disputes, even if it does not require significant conciliation in inter-religious disputes where the two sides share very little common ground. Significant doxastic revision will likely be required in a wide range of religious disputes between those with broadly similar positions. Consider a disagreement between two theologians who disagree over the finer details of some shared theological framework. Given their extensive theological agreement, each party to the dispute has strong dispute-independent reasons for thinking that the other person is quite reliable as a guide to theological matters. This suggests that a moderate conciliatory framework of the sort considered here would call for significant deference. So even if outright religious skepticism is not required, believers might be required to loosen their religious views by adopting an agnostic stance towards many intramural disputes.

5. Fundamental Versus Superficial Disagreements

In philosophical discussions of disagreement, one frequently encounters the view that fundamental disagreements—that is, disagreements driven by incompatible epistemic starting points—should occasion less doxastic revision than disagreements that are superficial. Many who readily concede that disagreement can easily defeat one’s belief about the answer to a multistep math problem, for example, deny that one’s fundamental moral, philosophical, or religious convictions are similarly vulnerable in the face of disagreement. The previous section pointed to one reason that arguably goes some way towards explaining why fundamental disagreements might be less worrying than superficial ones: it might be that in fundamental disagreements, it is unclear what the relevant epistemic credentials are and who possesses them, making it unlikely that one will have strong independent grounds for thinking that the epistemic credentials of those on the other side of a dispute either rival or surpass the credentials of one’s own side. This section briefly considers two different accounts as to why religious disagreements that are suitably fundamental will pose less of a skeptical threat, and then considers whether religious disagreements are fundamental in the relevant sense.

Bogardus (2013b) argues that while peer disagreement undermines what he calls “knowledge from reports,” it does not undermine “knowledge from direct acquaintance.” Knowledge from reports, according to Bogardus, is mediated knowledge that rests on the output of some cognitive faculty, while knowledge from direct acquaintance involves “immediate and unproblematic access” (9) to the truth of the known proposition. In a case of knowledge that p from direct acquaintance, one “just sees” that p is the case, and p is part of one’s evidence base. Bogardus cites our knowledge that 2+2=4 as a paradigmatic example of such knowledge. In a case of knowledge that p from a report, one does not “see” p directly but sees p by seeing q, where q is some proposition concerning the report of one or more cognitive faculties. In this case, q but not p is part of one’s evidence. A paradigmatic example of knowledge from reports would be a belief based on memory. Christensen’s bill calculation case also seems to be a case where something known from a report is the subject of peer disagreement. When one of the friends concludes that each person’s share is $43, he does not “just see” that $43 is the correct answer. Rather, what he “just sees” is that the answer he has reached after a series of calculative steps (many of which he probably does not remember) is $43, and this is the basis for his belief that each person’s share is $43.

Assuming that there are these two types of knowledge, it is not implausible to think that knowledge from direct acquaintance is less susceptible to higher-order defeat than knowledge from reports. Because knowledge from reports involves trusting the “readouts” of one’s “cognitive instrument,” such knowledge is understandably threatened by worries raised about the reliability of that instrument or by the fact that some other similar instrument, an epistemic peer, delivers an inconsistent “readout.” Knowledge by direct acquaintance, on the other hand, is more fundamental to one’s cognitive perspective in that it is not mediated by instrumental reports. If such knowledge is not based on the report of one’s cognitive faculties, that knowledge may not be similarly undercut when one learns that an epistemic peer’s faculties deliver an inconsistent report. Therefore, on Bogardus’ view, if a contested religious belief is known by direct acquaintance, or if a religious belief is based on some contested proposition that is known by direct acquaintance, then the party who enjoys such knowledge rationally ought to stand by the belief in the face of disagreement.

A second account as to why fundamental disagreements may pose less of a skeptical threat comes from Gellman (1993, 355ff.). Gellman argues that religious beliefs may be immune to defeat by disagreement if those beliefs are numbered among the “rock bottom” epistemological starting points that supply the basis for epistemic evaluation of other beliefs. However religious believers may have come to initially acquire their religious beliefs, for many believers these beliefs come to achieve rock bottom status, alongside other commitments, such as basic rational principles and fundamental beliefs about the world, that serve as justifiers of other beliefs and that do not themselves stand in need of grounding. Gellman acknowledges that there is a hierarchy among such rock-bottom beliefs: some of these beliefs are given more weight in rational deliberation, and some are given priority in that they invariably trump other rock-bottom commitments in cases when they conflict. He also holds that for many religious believers, core religious beliefs are hierarchically prior to many of the rational norms identified by epistemologists, including norms like DEFERENCE and INDEPENDENCE described above. Given this priority, Gellman maintains that it would not be rational for the religious believer to abandon core religious beliefs just because this is what DEFERENCE and INDEPENDENCE require.

It is, of course, questionable whether the above thinkers are right in thinking that beliefs that are suitably fundamental are thereby protected from the disagreement threat. Many will question the Cartesian optimism implicit in Bogardus’ conception of knowledge from direct acquaintance. And even if there are fundamental beliefs that are presumed “innocent” and that therefore do not stand in need of evidential support, as Gellman claims, it need not follow that such presumptive innocence remains intact in the face of direct challenge from other qualified thinkers. Finally, even if the significance of disagreement is mitigated in fundamental disputes, it may be that neither Bogardus nor Gellman has adequately articulated the relevant sense of “fundamentality.”

Even once the relevant sense of fundamentality is fully clarified, the question of whether a given religious disagreement is fundamental will in many cases be a controversial one. This is because there is significant disagreement among philosophers of religion on the place that religious belief occupies in the believer’s “noetic structure,” and thus on the source of religious disagreement. Consider, for instance, the conflicting accounts of reflective theistic belief developed by Richard Swinburne (2004) and Alvin Plantinga (2000). Swinburne maintains that reflective theists who are aware of evidential challenges to religious faith, including facts about religious diversity, will typically be unable to take their theistic convictions for granted, but will need to proportion their credence in theism to the evidence. Swinburne holds, moreover, that evidential reasoning about God’s existence can and should employ the same principles of confirmation theory that are widely accepted in the sciences, and that the pre-evidential probabilities that serve as the starting point for such reasoning can and should be sufficiently determined by the application of generally-accepted inductive criteria such as explanatory scope and simplicity. This view seems to suggest that when two equally informed thinkers disagree on the plausibility of theism, the most plausible explanation is that at least one of them has made some mistake in the application of agreed-upon criteria that serve as the epistemic starting points for both parties. If this is right, then there is some reason to think that cases of religious disagreement can be assimilated to the calculation case discussed above, a case of disagreement that seems not to be fundamental since the dispute stems from performance error on the part of one of the thinkers rather than from any fundamental divergence in the disputants’ perspectives antecedent to the process of calculation.

In contrast to Swinburne, Plantinga maintains that for most theists, the belief in God is not the product of inference, but is basic in that it is not based on other beliefs. Plantinga acknowledges that theistic beliefs are often prompted by certain experiences: upon viewing a breathtaking mountain vista, one might find oneself believing that the world was created by God; or after doing some evil act, one might find oneself believing that God disapproves of what one has done. According to Plantinga, however, while these experiences may occasion theistic beliefs, these beliefs are not based on inferential reasoning that appeals to facts about these experiences as evidence. Instead, these beliefs are like the belief in other minds or the reality of the past or in the reliability of memory: such beliefs are held with a high degree of confidence whether or not we are aware of any good arguments in their favor. If Plantinga is right that theistic belief is not typically grounded in evidential reasoning, then there is reason to think that disagreements between theists and atheists are typically fundamental in a way that the disagreement in the calculation case is not. Disagreements over theism would not result from some performance error in inferential reasoning, but would be the product of differences in the basic outlooks of different thinkers.

The aim in comparing Swinburne and Plantinga is not to suggest that if Plantinga is right, theistic belief is fundamental in a way that lessens its vulnerability to defeat by disagreement (or that, if Swinburne is right, theistic belief is more vulnerable to the disagreement threat). While this is a conclusion that some have drawn, the principal aim of the comparison is to show that even if we agreed on a characterization of “fundamentality” that protects beliefs from being defeated by disagreement, there may very well be disagreement concerning the structure of religious belief and the question of whether it is fundamental in the relevant sense. While there is some reason for thinking that Swinburne’s view of theistic belief would place theistic belief on the non-fundamental side, there are also considerations that call this supposition into question. There are potentially significant disanalogies between theistic belief on Swinburne’s picture and the calculation case, which does seem to be a paradigm of a superficial disagreement. For example, in the calculation case, the several steps that led to one’s answer are presumably forgotten, and the exact source of the disagreement cannot be pinpointed. (And if it could be pinpointed, no doubt one party would recognize their error.) One might think that in many religious disagreements, disputants can rehearse the most important steps of the reasoning that grounds their view, and that as a result they can locate the precise point where their reasoning diverges. And a disagreement that persists even when the point of divergence has been identified is quite different from one where the disagreement persists precisely because the two parties cannot reconstruct their reasoning and thus cannot identify the point of divergence. 
The former sort of disagreement, which is driven by stable differences in how one applies inferential norms, is perhaps fundamental in a way that the calculation disagreement, which results from some obscured performance error, is not.

In addition to questioning whether Swinburne’s picture supports the “non-fundamental” characterization of religious disagreement, there is also room to question whether religious disagreement on Plantinga’s picture really does qualify as fundamental in the relevant sense. While Plantinga maintains that theistic belief is basic, one might argue that on his model theistic belief is an instance of knowledge from reports rather than knowledge from direct acquaintance. Even if theistic belief is not inferred from facts about the report of some cognitive faculty, the believer may believe in response to a report from some cognitive faculty (what Plantinga calls the sensus divinitatis) and may not “just see” that God exists. Consider: basic perceptual beliefs seem susceptible to defeat by disagreement in a way that basic mathematical beliefs are not. If two normal and (up till now) healthy friends are standing before an open garage, and one says he sees a car in the garage and the other says the garage is empty, it is reasonable to suppose that both of them should significantly lower their confidence in their initial belief, since it is likely that someone is hallucinating and neither has reason to think that their friend is more susceptible to hallucination than they are. However, if these two friends are talking and it comes to light that one of them believes that 1+1=2 while the other believes that 1+1=5, it is less plausible to suppose that the friend with the correct belief should reduce confidence to any significant extent. Both of these disagreements arguably involve conflicts between basic beliefs, but basic perceptual beliefs appear to be more vulnerable to defeat than basic mathematical beliefs. Perhaps this is because basic mathematical beliefs arise from a direct awareness of mathematical truths, while perceptual beliefs are mediated by the reports of perceptual faculties. 
This diagnosis and the preceding discussion involve a number of controversial claims and assumptions, controversies that will not be pursued here. The main point is that from the position that theistic belief is basic, it does not straightforwardly follow that theistic belief is among those beliefs that can plausibly be said to be resistant to defeat by disagreement in virtue of their fundamental status. The relevantly fundamental beliefs may be some subclass of basic beliefs, those that are the product of rational insight rather than the product of some perceptual or quasi-perceptual faculty.

6. Appeals to Religious Experience

For many religious believers, personal religious experience plays a crucial role in the formation, development, and sustaining of religious belief. Theravada Buddhists emphasize the importance of experiences arising from certain meditative disciplines, experiences that open the mind to the truth of certain Buddhist teachings. Charismatic Christians frequently refer to certain bodily sensations that serve as experiential signs of the presence and activity of the Holy Spirit. Theists of various stripes emphasize profound experiences of God’s presence and divine communication, experiences that frequently occur in times of prayer or worship but that may also come unbidden outside of any specific religious practice. In addition to claimed direct experience of God, many believers in God or gods purport to discern providential influence on their circumstances, and not infrequently believers claim to have witnessed or received physical healing in response to prayer. This is, of course, only a sample of the diverse religious experiences that are represented in religious traditions across the globe. Atheists who reject any religious viewpoint may also cite personal experiences in accounting for their disbelief—for example, experiences of silence and absence of divine comfort in a season of acute suffering.

The fact that such experiences frequently play a prominent role in motivating and supporting religious belief is potentially significant for an assessment of the epistemic significance of religious disagreement. Many who argue for the defeating power of disagreement are explicitly concerned with contexts where each side to the dispute has fully disclosed the grounds for their view. If disagreement is most worrying when it persists in a context of full disclosure, then there is some reason to think that many religious disagreements will not present serious skeptical threats. To be sure, some “religious experiences” are such that their epistemically relevant content can easily be communicated to others. Suppose someone prays for a new car and the next day receives a car from a complete stranger who says that she felt moved to give her car away. The epistemically relevant aspects of such an experience could easily be communicated to others. (Whether the testimony would be believed is, of course, another story.) But one might think that in many instances of what we call “religious experience,” the content of the experience that is taken to be epistemically relevant cannot be communicated. Suppose that someone in desperate straits cries out to God for help and immediately experiences a “peace that surpasses all understanding” (Philippians 4:7), a peace that seems in its profundity to be a divine gift rather than a purely natural phenomenon. Could someone who believes in God partly on the basis of such experience fully disclose his reasons for belief? He could, of course, report having such an experience and describe the belief changes that seemed appropriate in its wake. However, the epistemic significance of the experience may significantly depend on subjective aspects of the experience whose qualities cannot be adequately communicated by means of verbal testimony (James 1902, 371).
If this is right, then religious disagreements may be quite different from disagreements in many other domains where the subjective qualities of private experiences do not play a significant epistemic role.

There are reasons for doubting whether the significance of religious experience to religious belief could justify both sides of a religious dispute in confidently maintaining their religious beliefs in the face of disagreement (Schellenberg 2007, 182-3). Consider a disagreement between a Buddhist and a Muslim who both appeal to distinctive sorts of experiences in justifying their contested religious beliefs. While the Muslim does not herself experience the same sort of ineffable experiences that ground the Buddhist’s belief in, say, the doctrine of non-self, the Buddhist can tell the Muslim of his experiences and he can describe the doxastic responses that seem to him appropriate in light of the experiences. If the Muslim trusts the judgment of the Buddhist, then it seems that the Buddhist’s belief in non-self constitutes evidence that his experiences, combined with his other evidence, supply good evidence for the doctrine of non-self. Furthermore, evidence that there is good evidence for p is often itself evidence for p. Hence, the Buddhist’s belief in response to the reported experience may serve as a piece of proxy evidence that stands in for the experience itself. Since this proxy evidence is available to the Muslim, it seems that the incommunicability of the Buddhist’s experience does not prevent that experience from having indirect evidential weight for the Muslim. Of course, a symmetrical story can be told as to why the Muslim’s report of mystical experiences and her doxastic response can serve as proxy evidence that stands in for the experience itself and can be appreciated by the Buddhist. Assuming that both attach comparable weight to their experiences and have responded with equal conviction, there is arguably no reason for either thinker to maintain that his or her own experience should be given more evidential weight than the inaccessible experience of the trustworthy interlocutor. 
On this view, the inaccessibility of religious experience is unlikely to relieve religious believers of the worries raised by religious disagreement. As long as multiple sides accord significant weight to private experiences, there is a kind of epistemic symmetry that arguably demands a skeptical response.

Still, one might resist the above reasoning by noting, first, that we do not have some metric that we can use to measure and compare the apparent evidential value of various mystical experiences. We communicate the perceived evidential significance of our experience through coarse-grained descriptive language, like “a deep and incredibly profound sense of God’s love” or “a brilliantly clear insight into the unity of all things,” language that is not calibrated in a way that would allow us to make reliable interpersonal comparisons of the significance of different mystical experiences. It is possible that two speakers could both be reasonable in describing their religious experiences as, say, “utterly profound and clarifying” even though one person’s experience was in fact much more profound and clarifying than the other’s. The fact that two people use similarly strong language to describe their experiences is poor evidence that the experiences were comparable in their epistemic import. Of course, this by itself does not give one any reason for thinking that one’s own experience is likely to be more significant than someone else’s similarly-described experience. All the same, consider the case of some religious believer who has had a mystical experience of arresting intensity and profundity, and who attempts to convey the significance of this experience using fairly extreme language, and then discovers that believers from opposing standpoints use similarly extreme language to convey the apparent significance of their own mystical experiences. 
If the religious believer thinks that it is quite plausible that people would use similarly extreme vocabulary even if their experiences were much less profound and compelling than his own, and if he can easily entertain the possibility of others having less compelling experiences than his own but cannot easily entertain the possibility of others having experiences that are more compelling than his own, then he might be reasonable in believing that his own experience is evidentially more significant than the experiences of his disputants (despite the fact that these experiences are similarly described). According to this reasoning, religious belief that is grounded in surprisingly powerful experiences might be reasonably held in the face of religious disagreement even if multiple sides cite similar “powerful” religious experiences in explaining their view.

7. Faith and Practical Responses to the Problem of Religious Disagreement

Epistemic or “theoretical” rationality is the sort of rationality that is principally exhibited by someone’s beliefs, and the norms of epistemic rationality are concerned with such matters as logical consistency and evidential support. Practical rationality, on the other hand, is the sort of rationality that is principally exhibited by someone’s actions, and the norms of practical rationality are concerned with such matters as the compatibility of one’s various goals and the degree to which one’s actions conduce towards the attainment of those goals. The discussion thus far has proceeded under the assumption that whether religious conviction is rational in light of disagreement is a matter to be settled by the norms of epistemic rather than practical rationality. This assumption is contested by many who maintain that the reasonability of religious belief, or at least of religious faith, is best evaluated from the standpoint of practical rather than (or in addition to) epistemic rationality. According to these thinkers, reasonable religious conviction is often based not on the sort of evidential reasons that bear on the question of whether religious claims are true or probable, but instead on moral, prudential, or existential reasons for thinking that it would be in some way good or valuable to have some particular religious commitment. For example, a theist might believe in God for the reason that belief in God gives her a sense of deep purpose, both for her own life and for the cosmos as a whole, or because it helps her to maintain her moral commitments even when they lead to significant suffering. If religious belief may be rational in light of such practical reasons, and if religious disagreement does not pose a threat to the practical justification of religious belief in the same way that it threatens its epistemic justification, then the claim that religious belief ought to be abandoned on account of religious disagreement is arguably more questionable.

It might seem that practical reasons could make religious convictions rational only if those convictions are based on practical reasons, and religious convictions can be based on practical reasons only if they are voluntary. Many philosophers maintain that beliefs are not voluntary, and for this reason are not evaluable according to the norms of practical rationality. If this is right, then the rationality of religious belief is arguably a matter of epistemic rather than practical rationality. The faith of the religious “believer” may not always be an instance of belief in the conventional sense, however. Of the philosophers who have considered the nature of faith, a good number have argued for a “non-cognitive” conception of faith that does not require outright belief in the propositions that are the object of faith. On Alston’s (1996) view, for example, one may fail to believe a proposition while nonetheless “accepting” it as a matter of faith. Such acceptance is like belief in many respects—one views the world from a standpoint that takes the accepted proposition for granted, and one employs the accepted proposition as a premise in practical reasoning—but acceptance is a voluntary state that does not require believing the proposition or judging it to be more probable than not.

Even if non-cognitivists about faith are wrong and belief is essential to faith, there could still be reasons why religious faith is appropriately evaluated from the standpoint of practical rationality rather than (or in addition to) epistemic rationality. First, some contest the claim that belief is inevitably involuntary. William James (1896), for example, argues that belief is governed by two competing aims (“Believe truth! Shun error!”), and how these aims are prioritized in a given context may be a voluntary matter that helps to determine whether one ends up believing a given claim. Second, even if belief cannot be chosen in the rather direct manner supposed by James, there is little doubt that one can undertake courses of action that may indirectly influence one’s religious beliefs.

Granting that religious faith is responsive to practical reasons, either because it is a voluntary state that does not require belief at all or because religious belief can be directly chosen or indirectly influenced by voluntary means, what implications does this have for the rational significance of religious disagreement? Holley (2013) suggests that if commitment to a religious way of life is valuable, then religious belief will likely be practically rational. For engaging in a religious way of life tends to produce religious beliefs, and the exercise of epistemic discipline that would be required to avoid falling into belief is likely to be incompatible with genuinely entering into the way of life in question. For this reason, Holley maintains that one can be reasonable in persisting in religious belief even if systematic religious disagreement defeats the epistemic justification of religious belief. Just how much erosion of epistemic support can the practical grounds for faith tolerate? The answer is by no means clear. For example, even if one can reasonably accept a proposition for which one has a credence of around 0.5 (a credence that might be insufficient for belief), there still could be non-trivial credence thresholds below which acceptance is not practically rational. If this is right, then the degree to which a given claim enjoys epistemic support is not irrelevant to an assessment of the practical rationality of accepting that claim. Those who argue on Jamesian grounds that belief can be responsive to practical considerations often hold that believed propositions must be judged more probable than not, and that this judgment of probability should be responsive to purely epistemic considerations (Pace 2011). As these examples suggest, the mere fact that practical considerations are relevant to an assessment of religious faith does not mean that the practical rationality of faith can be settled without reference to its epistemic merits.

Even if we agreed that the merits of some religious belief that p should be evaluated according to purely practical criteria that in no way depend on the strength of the epistemic reasons for the belief, it is still possible that knowledge of religious disagreement could constitute a defeater that renders religious belief unreasonable. This is because in addition to disagreements about the truth of various religious truth claims, there is also disagreement concerning the merits of various practical or “pragmatic” arguments for religious belief. This disagreement could undermine the epistemic justification of the beliefs that constitute the practical grounds for religious belief, and one might think that a practical rationale whose epistemic justification has been defeated cannot ground reasonable religious belief. Suppose that Theo believes in God on purely practical grounds. Perhaps Theo believes on the basis of a Kantian argument that concludes that belief in God is important in order to engage in the moral life without despair. Alternatively, perhaps Theo believes for the Kierkegaardian reason that passionate commitment in the face of “objective uncertainty” is the highest form of human existence. Still another possibility is that his faith is a response to the prudential reasoning articulated in Pascal’s “Wager” argument. All of these arguments are the subject of immense controversy. If these arguments must be epistemically justified in order to make it practically rational to have religious faith, then disagreement would threaten to undermine religious faith even if the religious claims that are accepted by faith do not themselves stand in need of epistemic justification. Moreover, several opponents of religious faith offer arguments for the conclusion that religious belief is positively harmful, either to the believer or to society as a whole (Fumerton 2013).
Practically rational religious belief arguably requires that one be epistemically justified in rejecting such arguments, but disagreement of the right sort might undercut such justification.

8. Conclusion

Even if individual attempts at characterizing the rational significance of religious disagreement prove controversial, for many thinkers, including many religious believers, the intuition that persistent religious disagreement poses a significant challenge to religious belief is incredibly strong. As this article has attempted to show, clarifying the nature and scope of that challenge requires not only that one resolve various controversial questions in the epistemology of disagreement, but also that one settle difficult questions concerning such matters as the place of religious belief in the noetic structure of religious believers, the epistemic significance of various types of religious experiences, the role played by practical reasons in grounding religious conviction, and the theories of religious epistemic credentials implied by various religious belief systems. Given the complexity of such questions, there is little doubt that the epistemic significance of religious disagreement will remain a topic of lively philosophical dispute.

9. References and Further Reading

Adams, Robert M. 1994. “Religious Disagreements and Doxastic Practices.” Philosophy and Phenomenological Research 54 (4): 885–90.
Alston, William P. 1991. Perceiving God: The Epistemology of Religious Experience. Ithaca, NY: Cornell University Press.
Alston, William P. 1996. “Belief, Acceptance, and Religious Faith.” In Faith, Freedom, and Rationality, edited by Jeff Jordan and Daniel Howard-Snyder, 3–27. Lanham, MD: Rowman & Littlefield.
Baldwin, Erik, and Michael Thune. 2008. “The Epistemological Limits of Experience-Based Exclusive Religious Belief.” Religious Studies 44 (4): 445–55.
Basinger, David. 1999. “The Challenge of Religious Diversity: A Middle Ground.” Sophia 38 (1): 41–53.
Basinger, David. 2002. Religious Diversity: A Philosophical Assessment. Aldershot, UK: Ashgate.
Bogardus, Tomas. 2013a. “The Problem of Contingency for Religious Belief.” Faith and Philosophy 30 (4): 371–92.
Bogardus, Tomas. 2013b. “Disagreeing with the (Religious) Skeptic.” International Journal for Philosophy of Religion 74 (1): 5–17.
Bourget, David, and David Chalmers. 2009. “The PhilPapers Surveys: Results, Analysis, and Discussion.” Accessed August 5, 2015. http://philpapers.org/surveys/.
Christensen, David. 2007. “Epistemology of Disagreement: The Good News.” Philosophical Review 116 (2): 187–217.
Christensen, David. 2009. “Disagreement as Evidence: The Epistemology of Controversy.” Philosophy Compass 4 (5): 756–67.
Christensen, David. 2010. “Higher-Order Evidence.” Philosophy and Phenomenological Research 81 (1): 185–215.
Christensen, David. 2011. “Disagreement, Question-Begging and Epistemic Self-Criticism.” Philosophers’ Imprint 11 (6): 1–22.
Draper, Paul, and Ryan Nichols. 2013. “Diagnosing Bias in Philosophy of Religion.” Monist 96 (3): 420–46.
Elga, Adam. 2007. “Reflection and Disagreement.” Noûs 41 (3): 478–502.
Everett, Theodore J. 2001. “The Rationality of Science and the Rationality of Faith.” The Journal of Philosophy 98 (1): 19–42.
Feldman, Richard. 2007. “Reasonable Religious Disagreements.” In Philosophers Without Gods: Meditations on Atheism and the Secular Life, edited by Louise M. Antony, 194–214. New York: Oxford University Press.
Frances, Bryan. 2008. “Spirituality, Expertise, and Philosophers.” Oxford Studies in Philosophy of Religion 1: 44–81.
Fumerton, Richard. 2013. “Epistemic Toleration and the New Atheism.” Midwest Studies In Philosophy 37 (1): 97–108.
Gellman, Jerome. 1993. “Religious Diversity and the Epistemic Justification of Religious Belief.” Faith and Philosophy 10 (3): 345–64.
Gutting, Gary. 1982. Religious Belief and Religious Skepticism. Notre Dame, IN: University of Notre Dame Press.
Hick, John. 2004. An Interpretation of Religion: Human Responses to the Transcendent. 2nd ed. New York: Palgrave Macmillan.
Holley, David M. 2013. “Religious Disagreements and Epistemic Rationality.” International Journal for Philosophy of Religion 74: 33–49.
“Ipsos Global @dvisory: Supreme Being(s), the Afterlife and Evolution.” 2015. Ipsos In North America. Ipsos. 2011. Accessed August 6, 2015. http://www.ipsos-na.com/news-polls/pressrelease.aspx?id=5217.
James, William. 1896. The Will to Believe and Other Essays in Popular Philosophy. New York: Longmans, Green and Co.
James, William. 1902. The Varieties of Religious Experience. New York: Random House.
Kelly, Thomas. 2005. “The Epistemic Significance of Disagreement.” Oxford Studies in Epistemology 1: 167–96.
Kelly, Thomas. 2011. “Consensus Gentium: Reflections on the ‘Common Consent’ Argument for the Existence of God.” In Evidence and Religious Belief, edited by Kelly James Clark and Raymond J. VanArragon, 135–56. New York: Oxford University Press.
Kelly, Thomas. 2013. “Disagreement and the Burdens of Judgment.” In The Epistemology of Disagreement: New Essays, edited by David Christensen and Jennifer Lackey. Oxford University Press.
King, Nathan L. 2008. “Religious Diversity and Its Challenges to Religious Belief.” Philosophy Compass 3 (4): 830–53.
Kitcher, Philip. 2014. Life After Faith: The Case for Secular Humanism. Yale University Press.
Koehl, Andrew. 2005. “On Blanket Statements About the Epistemic Effects of Religious Diversity.” Religious Studies 41 (4): 395–414.
Lackey, Jennifer. 2014. “Taking Religious Disagreement Seriously.” In Religious Faith and Intellectual Virtue, edited by Laura Frances Callahan and Timothy O’Connor. Oxford University Press.
Lasonen-Aarnio, Maria. 2014. “Higher-Order Evidence and the Limits of Defeat.” Philosophy and Phenomenological Research 88 (2): 314–45.
McKim, Robert. 2001. Religious Ambiguity and Religious Diversity. New York: Oxford University Press.
Pace, Michael. 2011. “The Epistemic Value of Moral Considerations: Justification, Moral Encroachment, and James’ ‘Will To Believe.’” Noûs 45 (2): 239–68.
Pittard, John. 2014. “Conciliationism and Religious Disagreement.” In Challenges to Moral and Religious Belief: Disagreement and Evolution, edited by Michael Bergmann and Patrick Kain, 80–97. New York: Oxford University Press.
Plantinga, Alvin. 1995. “Pluralism: A Defense of Religious Exclusivism.” In The Rationality of Belief and the Plurality of Faith, edited by Thomas David Senor, 191–215. Ithaca, N.Y.: Cornell University Press.
Plantinga, Alvin. 2000. Warranted Christian Belief. New York: Oxford University Press.
Quinn, Philip L., and Kevin Meeker. 2000. The Philosophical Challenge of Religious Diversity. New York: Oxford University Press.
Rotondo, Andrew. 2013. “Undermining, Circularity, and Disagreement.” Synthese 190 (3): 563–84.
Schellenberg, J. L. 2007. The Wisdom to Doubt: A Justification of Religious Skepticism. Ithaca, N.Y.: Cornell University Press.
Schoenfield, Miriam. 2014. “Permission to Believe: Why Permissivism Is True and What It Tells Us About Irrelevant Influences on Belief.” Noûs 48 (2): 193–218.
Swinburne, Richard. 2005. Faith and Reason. 2nd ed. New York: Oxford University Press.
Thune, Michael. 2010. “Religious Belief and the Epistemology of Disagreement.” Philosophy Compass 5 (8): 712–24.
Thurow, Joshua C. 2012. “Does Religious Disagreement Actually Aid the Case for Theism?” In Probability in the Philosophy of Religion, 209–24. Oxford: Oxford University Press.
Titelbaum, Michael G. 2015. “Rationality’s Fixed Point (Or: In Defense of Right Reason).” Oxford Studies in Epistemology 5.
van Inwagen, Peter. 1996. “It Is Wrong, Everywhere, Always, and for Anyone, to Believe Anything upon Insufficient Evidence.” In Faith, Freedom and Rationality: Essays in the Philosophy of Religion, edited by Jeff Jordan and Daniel Howard-Snyder, 137–53. Lanham, MD: Rowman & Littlefield.
Vavova, Katia. 2014. “Moral Disagreement and Moral Skepticism.” Philosophical Perspectives 28 (1): 302–33.
White, Roger. 2005. “Epistemic Permissiveness.” Philosophical Perspectives 19 (1): 445–59.

Author Information

John Pittard
Email: john.pittard@yale.edu
Yale University
U. S. A.

Karl Rahner (1904-1984)

Karl Rahner was one of the most influential Catholic philosophers of the mid- to late twentieth century. A member of the Society of Jesus (Jesuits) and a Roman Catholic priest, Rahner, as was the custom of the time, studied scholastic philosophy, through which he discovered Thomas Aquinas. From Aquinas’ epistemology and philosophical psychology Rahner was introduced to the Aristotelian-Thomistic notion of abstraction. This theory holds that human beings, as embodied souls or spirits, directly know only that which is sensed; direct sensory knowledge is physical knowledge. The intellect, through complex actions best described as abstraction, draws from sensory knowledge. This knowledge is indirect but valid knowledge of spiritual or non-physical realities. Thus, Rahner, learning from Thomas, held that it is the abstractive power of the mind that leads to indirect knowledge of the spiritual. Kant led Rahner to the philosophical work of Joseph Maréchal, a fellow Jesuit. Maréchal attempted to use Kant to create a revitalized Thomism. Maréchal held that the dynamic of the mind transcends the dichotomy of phenomenon and noumenon by attaining the utter unity of the Absolute. Rahner learned from Maréchal that the Kantian frustration could be overcome by the dynamic of the mind. Finally, Rahner learned from Pierre Rousselot, another Jesuit, that the mind’s dynamic is drawn to the Absolute because the Absolute is the pure unity of being and spirit. So from Rousselot, Rahner came to understand the absolute terminus of the dynamic of mind to be the pure unity of being and spirit. It is from these strands that Rahner weaves his unique philosophical system.

1. Life
2. Influences
a. Kant
b. Rousselot
c. Maréchal
d. Heidegger
e. Summary
3. Rahner’s System
a. Geist in Welt
b. Hörer des Wortes
4. Summary
5. References and Further Reading

1. Life

Karl Rahner was born 5 March 1904 in the university town of Freiburg-im-Breisgau in the then Grand Duchy of Baden, the fourth of seven children. His father Karl was an educator; his mother Luise, a homemaker. Rahner’s mother was pious, but in a healthy sense: the atmosphere of a university town imbued that piety with openness. It can therefore be said that Rahner’s childhood laid the groundwork for his later complex philosophical and theological projects: a pious openness, seeking the most effective formulations to gain insight into the character of the world.

At age eighteen, on 20 April 1922, Karl Rahner entered the novitiate of the Society of Jesus. The Jesuits had been, since their inception, an intellectual religious congregation: among their number were philosophers such as Francisco Suárez; nascent biologists such as Athanasius Kircher, discoverer of microbes; missionary-linguists such as Matteo Ricci; paleontologist Pierre Teilhard de Chardin; and cosmologist Georges Lemaître. It was thus the perfect environment in which Rahner might begin to develop his own thought: intellectually rigorous, pioneering, open.

After the completion of the novitiate and the taking of vows, Rahner entered the scholasticate, the formal program of studies. These studies were founded upon the current Neo-Scholastic manuals, much defamed but actually thorough presentations of Scholastic thought. Rahner was deeply influenced by Aquinas, of course; Aquinas had been mandated by Leo XIII as the Catholic philosopher. At the same time Rahner discovered three of the four major influences that would form his intellectual horizons: Immanuel Kant and fellow Jesuits Pierre Rousselot and Joseph Maréchal. It was, however, Maréchal, and Maréchal’s interpretation of Kant, that became the decisive impetus to Rahner’s ongoing philosophical reflections.

Rahner’s superiors soon noted the caliber of his intellect, and so in 1934 he was sent to the University of Freiburg in Freiburg-im-Breisgau, his home, to begin doctoral studies in philosophy: his superiors foresaw for him a university career as a professor of philosophy. It was at Freiburg that Rahner, despite his sincere acknowledgements of the importance of Aquinas, Kant and especially Maréchal, discovered the philosopher whom he would later call his true teacher: Martin Heidegger. Until his death Rahner kept with him the list of courses he had taken with Heidegger. Heideggerian thought became the catalyst through which the transcendental philosophies of Kant, and especially Kant through Maréchal, began to coalesce into the Rahnerian philosophical system. In 1936, Rahner submitted his doctoral thesis, Geist in Welt, usually rendered Spirit in the World, which attempts a radical re-reading of Aquinas through Kant, Maréchal and Heidegger. Geist in Welt was essentially a lengthy gloss on a single question in Thomas’ Summa, an intricate, complicated, tightly woven, and impenetrable Maréchallian-Heideggerian interpretation. It was rejected as being too influenced by Heidegger. That same year Rahner was transferred to Salzburg to study theology, gaining a doctorate there.

Rahner then began his university career in 1937 at the University of Innsbruck. In that same year Rahner gave a series of lectures at Innsbruck; these became the basis for Rahner’s last purely philosophical work, his philosophy of religion and revelation, Hörer des Wortes (Hearers of the Word).

During the war and post-war years, 1939-1949, Rahner engaged in pastoral work in Vienna. He returned to Innsbruck in 1949 and began to develop his theological system, a system rooted completely in the metaphysics of Geist in Welt and Hörer des Wortes: human beings, finite, yet invested by an infinite and inexhaustible epistemological dynamic, are intrinsically open to the revelation of the utter mystery that is God. Thus religion is the thematization of the absolutely unthematic.

In 1962, while Rahner was at Innsbruck, his superiors received a monitum from the Holy Office in Rome: Rahner was neither to lecture nor to publish without Rome’s explicit permission. The irony is that in that same year Pope John XXIII named Rahner a peritus to the Second Vatican Council. Rahner’s influence was profound; he was the principal figure behind the drafting of Lumen Gentium, the Dogmatic Constitution on the Church. The monitum, obviously, disappeared into bureaucratic oblivion. Rahner taught at Innsbruck until 1964, and from 1964 until 1981 at the University of Munich. He then retired to Innsbruck, where he died on 30 March 1984. Karl Rahner was, and remains, a powerful influence upon the Roman Catholic Church. Rahner’s philosophy was at once the foundation and framework for his far-reaching and in some ways radically different re-reading of Roman Catholic dogma. It is important to note, however, that Rahner’s philosophical system precedes and is separable from the theological system built upon it.

2. Influences

In summary: Rahner derived the notion of the transcendental structure of knowledge from Kant, and from Rousselot and Maréchal he derived the notion of the infinite dynamic inherent in this transcendental structure. This infinite dynamic possesses an intrinsic inevitability toward the Absolute or God. Because of his exposure to Heidegger’s system of thought, Rahner ultimately came to characterize human beings as utterly finite yet as ever ordered to being.

a. Kant

Immanuel Kant (1724-1804) brought to a crescendo the “philosophy of the subject” that had been steadily on the rise from the time of the great Scholastics. For Kant an authentic subjectivity, one that at once addressed the real, however unknowable (the noumenal), and from that address structured the known (the phenomenal), was the only answer to the radical skepticism of Hume. It was in response to this skepticism that Kant created his great work, Kritik der Reinen Vernunft (Critique of Pure Reason), in two editions: the first (A) in 1781, the second, greatly expanded edition (B), in 1787. The Kritik was the impetus for Joseph Maréchal to establish Transcendental Thomism, which, in turn, decisively influenced Rahner. This is the central concern of the Kritik: how can one gain certitude? How, in the face of radical skepticism, can one be sure of the world? Simply put, is knowledge possible, and, if so, what is the guarantee of that knowledge? Kant reasoned that the facticity of this or that experience is formed within a grid of pre-determined schemata, and from this application there emerges consistent, verifiable and, thus, dependable knowledge. These schemata are the a priori structures of reason. These a priori structures, the categories, render that which is experienced globally consistent and temporally consistent. The categories, the a priori structures of reason, are therefore frameworks to which the to-be-known must conform to be known. Thus, Kant finds the consistency and dependability of knowledge in the constant a priori schemata or categories proper to reason as reason. Kant was fully satisfied that he had established a lasting guarantee of the certainty of knowledge. Joseph Maréchal, who together with Heidegger was the decisive influence upon Rahner’s philosophical thought, would serve as mediator between Kant and Rahner. It is through Maréchal’s system that transcendental idealism is married to Aquinas’ Aristotelian epistemology. The mind, dynamic in its address of that which is to be known, structures the known through abstraction, but this abstraction is a neo-Kantian impetus to the Absolute or God.

b. Rousselot

Pierre Rousselot (1878-1915) was ordained in England in 1908. Remarkably (and in significant demonstration of his intellectual capabilities) Rousselot completed his theological preparations for ordination while simultaneously earning the customary two doctorates from the Sorbonne in 1908, the year of his ordination: the major thesis, L’intellectualisme de saint Thomas, and the minor thesis, Pour l’histoire du problème de l’amour au Moyen Âge. In L’intellectualisme Rousselot created an entirely unique Neo-Thomist system, one he styled as a Platonized Thomism. The entire system hinges on the identification of spirit and being in divinity as the very nature of the Godhead. This was an attempt at a defensible interpretation of Thomas. The unity of spirit and being that is God is thus the self-knowing of God as his being: in other words, God is infinite intelligibility utterly transparent in pure self-possession. Rousselot takes this as his model for all forms of knowledge. Rousselot holds that every act of comprehension, of understanding, as the discovery and appreciation of intelligibility, is in fact an affirmation of the existence of the God which is pure intelligibility. From Rousselot, Rahner came to appreciate the identity of spirit and being, and thus the intrinsic intelligibility of all which, when realized in the act of intellection, requires the co-affirmation of the existence of God. Rahner learned from Rousselot that knowing strengthens the relation to the Absolute. For Rahner the mind is an organ of the affirmation of divinity. It was Rousselot who opened Rahner to Maréchal.

c. Maréchal

Maréchal and Heidegger were the decisive influences upon Rahner’s thinking. Joseph Maréchal was a contemporary of Rousselot, but although there was productive dialogue between the two, Maréchal pursued a deliberately Kantian path. Joseph Maréchal (1878-1944) entered the Society of Jesus in 1895 and while in England during the Great War he began work on his system, explicated in the five-volume opus Le Point de Départ de la Métaphysique: Leçons sur le Développement Historique et Théorique du Problème de la Connaissance (henceforward simply Le Point de Départ de la Métaphysique). Maréchal sought in these five volumes to trace the history of western philosophy and describe what he thought to be the true philosophical system: a Thomism rethought in the light of Kant.

It is the fifth volume of Le Point de Départ de la Métaphysique, titled Le Thomisme devant la Philosophie Critique, that deeply influenced Rahner. Maréchal appreciates Kant’s notion that the mind is an active and dynamic structuring of that which it knows. However, he believes that Kant fails to honor the character of this dynamic. Thus, Maréchal sees the dynamic as twofold. First, it is the dynamic openness of the mind, for the mind seeks to grasp all it encounters. Second, the dynamic of the mind is invested with an intrinsic direction to a specific end. The dynamic of the mind, investing all that can be known, structuring the knowable to the known, is quite literally driven to an end (in Maréchal’s scholastic vocabulary, it is possessed of an irresistible “finality”) which is its ultimate goal. Maréchal argues that Kant does not appreciate the power of mind he has discovered. Maréchal sees that the Kantian transcendental dynamic, the mind structuring the knowable to the known, must ultimately terminate in absolute being. Maréchal holds that the dynamic, searching, structuring character of the mind will structure everything knowable to the known. This Kantian impetus is ultimately rooted in its trajectory toward the absolute, toward the absolute as absolute being in which absolute being is absolute truth. For Maréchal, as for Rousselot, every act of knowing is at once an implicit affirmation of the absolute that is absolute being as absolute truth. Here Rahner found the means to go beyond Kant and give systemic form to Rousselot’s lyrical Platonized Thomism: Maréchal gave to Rahner the Thomist framework to appropriate both Rousselot’s theme of the identity of spirit and being in the pure intelligibility that is God and the utter dynamic of human knowing grounded in that identity.

d. Heidegger

It was Heidegger who was perhaps the greatest influence on Rahner. It is the marriage of Heidegger’s thinking to that of Maréchal’s that joins Heideggerian finitude as open-ness to being-as-irreducible to the Maréchallian dynamism of knowing intrinsically ordered to the Absolute. This move firmly established the foundation of Rahner’s philosophical system. Martin Heidegger’s (1889-1976) Sein und Zeit is the final and decisive influence on Rahner. Its themes, melded with those of Maréchal, give the cast to Rahner’s thought. The animating principle and overarching motif of Sein und Zeit is being or being at its most irreducible. In it Heidegger seeks to discover, amongst and through and beneath the myriad kinds of beings, that uttermost manner of being-ness that underlies this myriad. To discover this being-most-irreducible it is necessary to seek this being, and it is human beings who seek being-most-irreducible. Heidegger calls this being Dasein: literally, being-there, being-emplaced, being-in-and-of-the-world-of-beings. Dasein puts being-most-irreducible in question when it gives itself over to the mystery that is being-most-irreducible. Dasein becomes the possibility of being-most-irreducible revealed as it is. But how, then, does Dasein in its questioning quest reveal being-most-irreducible? Only in the authenticity of Dasein: being authentic. How does Dasein be-authentically? Through Dasein’s realization of its utter finitude. And how is finitude disclosed to Dasein? When Dasein as being-authentically accepts being-toward-death. The complete acceptance of death, the radical finitude of being, opens Dasein to being-most-irreducible, for radical finitude recognizes that which is most irreducible as the reply to that finitude. Rahner neatly synthesizes Maréchal’s dynamic of mind inevitably ordered to absolute being with Heidegger’s notion of Dasein.

e. Summary

It is through Maréchal that Rahner understands Kant and Maréchal’s notion of the dynamic of mind thrusting to the absolute being. This becomes the core of Rahner’s system. Rousselot provides the inspiration in the identity of spirit and being in knowing, and Heidegger’s thinking brings this together. The dynamic thrust of the knowing mind understands the being-most-irreducible as God and the unity of being and spirit in knowing. God is implicitly and intrinsically affirmed in the dynamic of every act of knowing. Thus, through Maréchal Rahner appropriates the Kantian notion of the transcendental structuring of the known by the mind. Through Rousselot and most especially Maréchal, Rahner sees this structuring of the known as a drive to attain the Absolute or God, and it is through Heidegger that this drive is rooted in radically finite human beings, and he discloses God as the identity of spirit and being.

3. Rahner’s System

Rahner’s system is fully explicated in Geist in Welt. Here the foundation is meticulously and densely established. Rahner’s second work, Hörer des Wortes, forwards a system whereby the human and the divine are intrinsically ordered one to the other. For Rahner, human being as defined in Geist in Welt is intrinsically open to God or the Absolute. It is necessarily the receptacle of revelation. In Rahner’s view, even if God or the Absolute remains utterly silent and completely hidden, that silence and hiddenness are, in fact, revelations.

a. Geist in Welt

Translating Geist in Welt as Spirit in the World demonstrates Rahner’s dense use of language. Geist qua spirit denotes both spirit as the unity of being and spirit as known and knowing in human reason, as demonstrated in the thinking of Rousselot. The preposition in does not simply note location but it is also indicative of a movement-towards; Welt is the Heideggerian Welt, the world as the location of Dasein as the arena of the quest for Being-most-irreducible. Geist in Welt is remarkably simple in concept and extraordinarily complex in execution. It takes a single question and a single article from Thomas’ Summa Theologiae (ST) and uses them as the fulcrum to erect the Rahnerian synthesis. The question: ST I, q 84, a 7; seven hundred words devoted to the crux of Thomistic psychology and epistemology become the cornerstone of Rahner’s philosophical system.

Thomas notes that in this life the soul is joined to its body; it is through the physicality of its body that the soul interacts with the world. Thomas, following his master Aristotle, holds that the intellect (mind) cannot directly know what the body senses and perceives. The mind is spirit, the body matter. Keep in mind Thomas is not a dualist like Descartes; soul and body, spirit and matter are united, wholly interactive and completely congruent. However, a transition from the sensed and the perceived to the known is necessary for Thomas. This transition occurs within the process of intellection (the process of the mind coming to understanding). Thomas, in a manner reminiscent of Kant, also holds that the mind does not directly know the world. Thomas did not see the mind possessing a priori structures. Rahner introduces a Kantian a priori thematic through Maréchal into Thomas. Thomas sees that there is a mediative structure to knowing and this mediation occurs in the following sequential constellation. The imagination receives the impressions of the experienced from the senses and creates an image of that which is experienced, the phantasm. The phantasm is received by the active (or agent) intellect, which abstracts from the phantasm the universal(s) proper to the object(s) experienced; what is thus contained in the phantasm is the intelligible species. The passive (or possible) intellect receives the intelligible species and renders it to the verbum mentis, the “mind’s word,” which is the achieved comprehension and attained understanding.

So like Kant, Thomas does not believe human beings directly experience the world as it is; unlike Kant, there are no categories to structure the perceived world as understood. Rather there is a complex translation of the perceived (which is of the body) to the understood (which is of the soul). It is important to remember that Thomas insists human beings cannot know through their spiritual essence (here the soul) as do the angels. Human beings are of the world and so all human knowledge is the result of mediation and translation. It is this thematic that Rahner appropriates as the jumping off point of Geist in Welt and it is also this thematic that, through Maréchal and Heidegger, becomes the very core of Rahner’s philosophical system. Rahner begins with the process of abstraction which is the work of the agent intellect upon the phantasm. Rahner, along with Thomas, notes the world is the arena of the metaphysical. It is in and through the world that human beings through their agent intellect encounter and grasp thematically being-most-irreducible and also the unthematically present absolute being, God.

Thus: esse—Thomas’ word for being-most-irreducible—is, via Rahner, now the woauf of the Vorgriff, which is of the agent intellect. Woauf, a compound of prepositions, means: wo, where or how; auf, toward, up to, into. Thus, woauf might be rendered, however ungrammatically, as “potential-toward.” Vorgriff is another Rahnerian coinage: vor means before, previous, ahead; and greifen (from which griff is derived) means seize, grasp, hold or comprehend. Thus the Vorgriff is that prior projectedness to comprehension. Thus esse, grasped by the agent (active) intellect, is the potential toward which the prior projectedness to comprehension is directed. It is the Vorgriff, the projectedness to comprehension (as grasping, seizing), that is key to Rahner’s system. The Vorgriff bounds and compasses being-most-irreducible. It is at once, however, non-objective in this bounding and compassing. The Vorgriff is the condition of all knowing and the Vorgriff, as the bounding and compassing of being-most-irreducible, is a directedness to the absolute being, God, which is the ground of being-most-irreducible.

It is this Vorgriff that is the condition or possibility of the knowing of all objective beings. Indeed, all objective beings, and all possible objects of knowledge, are of the index of being that is the Vorgriff. The passive (possible) intellect is the self-becoming of human beings as knowing beings. Receiving the verbum mentis, which is the appropriation of the endless scope of the Vorgriff of the agent (active) intellect, the passive intellect is therefore the human spirit as identity of being and spirit. The passive intellect realizes through the active intellect the utterness of being-most-irreducible. The passive intellect becomes all beings as rendered knowable by the agent intellect through the Vorgriff. It is the full scope of being as spirit being as known which is the dynamic of the human mind encompassing being-most-irreducible. The human spirit is directed toward being-most-irreducible through the Vorgriff as the potential prior-directedness to comprehension of being-as-irreducible. In turn the being-as-irreducible is constituted human as both the identity of spirit and being in the mind knowing all possible beings because it becomes all beings rendered knowable. Through this occurs the revelation of absolute being, God. This is the knowing of absolute being as last-knowable-being but knowable only in its infinitely distant obscurity. Geist in Welt blends the following: 1.) Maréchal: the themes of dynamism; 2.) the co-affirmation of absolute being with the grasp of being-most-irreducible and all possible beings to be known in the act of knowing of the human being; 3.) Heidegger’s themes of being-most-irreducible; 4.) Dasein as unformed and thus the self-constituting embeddedness of human being in the world and 5.) the world and the beings of the world as the means to discover being-most-irreducible.

b. Hörer des Wortes

In Hörer des Wortes Rahner restates, more lucidly, his themes from Geist in Welt. Recasting these themes in terms of metaphysics, Rahner notes that metaphysics addresses the question of being as being-most-irreducible. Metaphysics formulates the question to the beingness of beings to being-most-irreducible. The question of the beingness of beings, being-most-irreducible, addresses the ground of this beingness, this being-most-irreducible. These questions arise because of the ultimate unity of being and being-known. Rahner called this unity of being and being-known, in another neologism, Bei-sich-sein, being-present-to-itself. Beingness is analogical. There are degrees of being-present-to-itself as the unity of being and being-known just as there are degrees of intensity of the self-presence, the unity. God is absolute being and therefore absolute being-present-to-itself. For Rahner God is the absolute unity of being and being-known, the absolute possession of beingness; therefore God is the ground of being-most-irreducible. Human being, through the Vorgriff, constitutes itself as the dynamic self-movement of spirit, the identity of being and being-known, to the absolute compass of all possible beings-as-knowable.

This movement requires the co-affirmation of the absolute being, God, as the being characterized by absolute self-possession of being. God as God, the pinnacle of the analogy of being-present-to-itself, is the possibility of the Vorgriff. Thus, human being as spirit is the openness of the finite to God, the absolute infinite. Human being as spirit is the dynamic self-movement of transcendence to absolute being, to God, and thus the possibility of the disclosure of this Absolute Being. The absolute being-present-to-itself that is God is correlated to human being as an endlessly dynamic self-movement of transcendence and thus the analogy of being-present-to-itself in degrees of self-possession and the degrees of intensity of unities of being and being-known. This absolute transcendence of human being as spirit toward the infinity of beingness as the absolute self-presence of absolute being is the limitless compass of the Vorgriff. In addition, this same Vorgriff is the possibility of the appearance of the limitless God to limited human beings. The Vorgriff is not limitless in itself, but it is limitless in the endless dynamic of spirit. In this endless dynamic is the co-affirmation of absolute being in the limitless compass of the Vorgriff as it addresses all possible beings as knowable. Fundamentally, the Vorgriff is the awaiting of the disclosure of the absolute being present in that dynamism. Therefore, Rahner declares that this self-disclosure is inevitable and even the silence of refusal is disclosure of absolute being.

4. Summary

Rahner’s utterly unique reading of Thomas through Maréchal and Heidegger cost him his doctorate in philosophy at Freiburg. Yet Geist in Welt demonstrates the fecundity of Rahner’s mind. Taking a single question from the Summa and but a single article in that question, Rahner, using medieval epistemological categories, weaves Maréchal, Rousselot, and Heidegger into a vibrant transcendental synthesis. Other Catholic philosophers remained closer to Maréchal, especially Francophone philosophers. These were the practitioners of Transcendental Thomism. Rahner’s philosophy forwards a unique transcendentalism from Thomism featuring 1.) the Heideggerian rootedness of human being in its world. This comprises the vast field of beings that is the medium through which being-most-irreducible is revealed; being-most-irreducible is the proper fulfillment of human being. 2.) The Rousselotian identity of knowing and being as spirit as the hierarchy of degrees of self-possession. 3.) The Maréchallian themes of the endless dynamism of mind and the intrinsic co-affirmation of absolute being in that dynamism. This is Rahner’s unique synthesis; it demonstrates the power of his mind as synthetic, the uniqueness of his insight to build this edifice on the alien foundation of medieval scholasticism, and the complexity and subtlety of his system-building skill.

5. References and Further Reading

a. Karl Rahner: Primary

Karl Rahner, Geist in Welt, dritte Auflage München: Kösel, 1941
Karl Rahner, Hörer des Wortes, zweite Auflage München: Kösel, 1968
Karl Rahner, Spirit in the World New York: Continuum, 1994
Karl Rahner, Hearer of the Word New York: Continuum, 1994

b. Karl Rahner: Secondary

Patrick Burke, Re-interpreting Rahner: A Critical Study of his Major Themes NY: Fordham, 2002
Stephen Fields, Being as Symbol Washington DC: Georgetown, 2000
Karen Kilby, Karl Rahner: Theology and Philosophy London: Routledge, 2003
Thomas Sheehan, Rahner Athens OH: Ohio University Press, 1987

c. Immanuel Kant: Primary

Immanuel Kant, Kritik der Reinen Vernunft 1 & 2 (Bände III/IV, Werkausgabe in 12 Bänden) Berlin: Suhrkamp, 1974

d. Pierre Rousselot: Primary

Pierre Rousselot, L’intellectualisme de St. Thomas 2. ed Paris: Beauchesne, 1908
Pierre Rousselot, The Intellectualism of St Thomas (translated, James Mahoney) New York: Sheed and Ward, 1935
Pierre Rousselot, Intelligence: Sense of Being, Faculty of God (translated, Andrew Tallon) Marquette WI: Marquette University Press, 2002

e. Pierre Rousselot: Secondary

John McDermott, Love and Understanding Rome: Gregorian University, 1983

f. Joseph Maréchal: Primary

Joseph Maréchal, Le Point de Départ de la Métaphysique 5 volumes Paris: Desclée de Brouwer, 1922

g. Joseph Maréchal: Secondary

Anthony M. Matteo, Quest for the Absolute DeKalb IL: Northern Illinois University Press, 1992

h. Martin Heidegger: Primary

Martin Heidegger, Sein und Zeit zehnte Auflage Tübingen: Max Niemeyer, 1963
Martin Heidegger, Being and Time (translated, John Macquarrie and Edward Robinson) NY: HarperCollins, 2008

Author Information

Guy Woodward
Email: gwoodward127@gmail.com

U. S. A.

Dialogical Logic

Dialogical logic is an approach to logic in which the meaning of the logical constants (connectives and quantifiers) and the notion of validity are explained in game-theoretic terms. The meaning of each logical constant (such as “and”, “or”, “implies”, “not”, “every”, and so forth) is given in terms of how assertions containing these logical constants can be attacked and defended in an adversarial dialogue. Dialogues are described as two-player games between a proponent and an opponent. A dialogue starts with an assertion made by the proponent. This assertion can then be attacked according to its logical form by the opponent. Depending upon the kind of attack, the proponent can now either defend against, or attack, the opponent’s move. The two players alternate until one player is unable to make another move. In this case, the dialogue is won by the other player who made the last move. An assertion made in the initial move by the proponent is said to be valid, if the proponent has a winning strategy for it, that is, if the proponent can win every dialogue for each possible move made by the opponent. The dialogical approach was initially worked out for intuitionistic logic and for classical logic; it has been extended to other logics, among them modal logic and linear logic.

1. Introduction
2. Argumentation Forms and the Meaning of Logical Constants
3. Dialogues for Intuitionistic Logic
a. Definition
b. Examples
4. Winning Strategies and Validity
5. Dialogues for Classical Logic
a. Examples
b. Classical Dialogical Validity and Completeness
6. Origins and Recent Developments
7. References and Further Reading

1. Introduction

Dialogical logic comprises three main constituents:

(i) Argumentation forms. The meaning of the logical constants (like “implies”, for example) is given by so-called argumentation forms. An argumentation form describes in terms of two possible kinds of moves, called attack and defense, how assertions containing a certain logical constant in main position can be attacked and defended. For example, the argumentation form for “implies” says that if one player asserts “$A$ implies $B$”, then the other player can attack this assertion by claiming $A$, which can in turn be defended by the first player by claiming $B$. This reflects how logical constants are understood in everyday argumentation: someone arguing for “$A$ implies $B$” has to be able to argue for $B$ when being granted that $A$.

(ii) Dialogues. A dialogue is a single game played by two players, called proponent and opponent. The proponent moves first by making an assertion. Then players alternate moves. Each move has to be made according to an argumentation form. In addition, certain rules or conditions are imposed, which go beyond what has been laid down in the argumentation forms. An example of such a rule is that a defense against a certain attack cannot be repeated. There are rules that hold for both players as well as rules that restrict only one of the two players. An example of the latter is the rule that a statement containing no logical constant can only be asserted by the proponent after it has already been asserted by the opponent, whereas the opponent can assert such a statement at any time, if allowed by an argumentation form and not prohibited by other conditions. The proponent wins a dialogue if the opponent cannot make another move.

(iii) Winning strategies. The notion of winning strategy for dialogue games provides the dialogical notion of validity or, depending on the point of view, of provability. In dialogical logic, an assertion is called valid if there exists a winning proponent strategy for it. That is, an assertion is valid if the proponent can always win a dialogue for it, no matter what moves are made by the opponent. Depending on the conditions specifying the kind of dialogues which are played, there may or may not be a winning proponent strategy for a given assertion. This means that by changing the rules of the dialogue games one can obtain different notions of validity, such as intuitionistic validity or classical validity, for example.
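The idea of a winning proponent strategy can be sketched on an abstract finite game tree. The following is only an illustration under simplifying assumptions: the tree, the position names and the function `proponent_wins` are hypothetical, and the actual dialogue rules are not modeled. The only features taken from the description above are that the players alternate and that the proponent wins when the opponent cannot move.

```python
from typing import Dict, List

# Hypothetical toy game: each position maps to the positions reachable
# from it in one move. This is not a real dialogue, only a game tree.
GameTree = Dict[str, List[str]]


def proponent_wins(tree: GameTree, position: str, to_move: str) -> bool:
    """Return True if the proponent can force a win from `position`.

    `to_move` is "P" (proponent) or "O" (opponent). The player who
    cannot make another move loses.
    """
    successors = tree.get(position, [])
    if to_move == "O":
        # Every opponent move must lead to a position the proponent wins.
        # If the opponent has no move, all(...) over nothing is True.
        return all(proponent_wins(tree, nxt, "P") for nxt in successors)
    # The proponent needs just one good move.
    # If the proponent has no move, any(...) over nothing is False.
    return any(proponent_wins(tree, nxt, "O") for nxt in successors)


# "start" is the position after the proponent's initial assertion,
# so the opponent moves first from there.
winning_tree: GameTree = {
    "start": ["o1", "o2"],
    "o1": ["p1"],        # after p1 the opponent is stuck
    "o2": ["p2", "p3"],  # p2 is a bad reply, p3 is a good one
    "p2": ["o3"],        # after o3 the proponent is stuck
}
losing_tree: GameTree = {"start": ["o1"]}  # no reply to o1 exists

print(proponent_wins(winning_tree, "start", "O"))
print(proponent_wins(losing_tree, "start", "O"))
```

In the first tree the proponent has a reply that wins against each opponent move, so a winning strategy exists; in the second the single opponent move cannot be answered.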

2. Argumentation Forms and the Meaning of Logical Constants

Argumentation forms are formulated for an extended first-order language. The first-order language consists of formulas $A, B, \ldots, A_1,\ldots$, which are constructed from atomic formulas $a, b, c, \ldots$ with the logical constants $\wedge$ (conjunction; “and”), $\vee$ (disjunction; “or”), $\rightarrow$ (implication; “implies”), $\neg$ (negation; “not”), $\forall$ (universal quantifier; “every”) and $\exists$ (existential quantifier; “there is”), together with terms $t$, which can be variables $x,y,\ldots$, and auxiliary signs “$($”, “$)$” and “$,$”. The atomic formulas can be relation symbols of arbitrary arity taking terms as arguments. An example of a first-order formula is $$\forall x \exists y (a(x,y) \rightarrow b(x))$$

This language is extended by using $?1$, $?2$, $?\!\vee$, $?t$ (for terms $t$) and $?\exists$ as special symbols (that contain a preceding question mark). In addition, the signatures $P$ and $O$ stand for the two players proponent and opponent, respectively. An expression $e$ is either a formula or a special symbol. For each expression $e$ there is a $P$-signed expression $P\, e$ and an $O$-signed expression $O\, e$. These signed expressions are called moves in general. Examples of moves are $$P\, \forall x \exists y (a(x,y) \rightarrow b(x))$$ and $$O\, ?\!\vee$$

A signed expression is called an assertion if the expression is a formula; it is called a symbolic attack if the expression is a special symbol (there is no such thing as a symbolic defense). $X$ and $Y$, where $X \neq Y$, are used as variables for $P$ and $O$.

For each logical constant there is one argumentation form, which determines how a formula, with the respective logical constant as main constant, that is asserted by $X$ can be attacked by $Y$, and how this attack can be defended by $X$ (if possible):
$$\begin{array}{rlll} \wedge: & \text{assertion}: & X\, A_1 \wedge A_2 & \\ & \text{attack}: & Y\, ?i & (Y \text{ chooses } i = 1 \text{ or } i = 2)\\ & \text{defense}: & X\, A_i & \\ &\\ \vee: & \text{assertion}: & X\, A_1 \vee A_2 & \\ & \text{attack}: & Y\, ?\!\vee & \\ & \text{defense}: & X\, A_i & (X \text{ chooses } i = 1 \text{ or } i = 2)\\ &\\ \rightarrow: & \text{assertion}: & X\, A \rightarrow B & \\ & \text{attack}: & Y\, A & \\ & \text{defense}: & X\, B & \\ &\\ \neg: & \text{assertion}: & X\, \neg A & \\ & \text{attack}: & Y\, A & \\ & \text{defense}: & \mathit{no\ defense} & \\ &\\ \forall: & \text{assertion}: & X\, \forall x A(x) & \\ & \text{attack}: & Y\, ?t & (Y \text{ chooses the term } t)\\ & \text{defense}: & X\, A(x)[t/x] & \\ &\\ \exists: & \text{assertion}: & X\, \exists x A(x) & \\ & \text{attack}: & Y\, ?\exists & \\ & \text{defense}: & X\, A(x)[t/x] & (X \text{ chooses the term } t) \end{array}$$

The argumentation form for $\wedge$ says that an assertion of the form $A_1 \wedge A_2$ made by player $X$ can be attacked by the other player $Y$ by choosing one of the two conjuncts $A_1$ and $A_2$. This is expressed by $Y$ stating the special symbol $?1$ or $?2$, respectively. This attack can then be defended by player $X$ by asserting the conjunct $A_1$ or the conjunct $A_2$ according to the choice of $Y$. A concrete instance of this argumentation form is the following:
$$\begin{array}{l}P\, \neg a \wedge (b \vee a)\\O\, ?2\\P\, b \vee a\end{array}$$
Here the proponent $P$ has asserted the conjunction $\neg a \wedge (b \vee a)$. This is attacked by the opponent $O$ choosing the second conjunct $b \vee a$, indicated by stating the special symbol $?2$. The proponent defends against this attack by asserting the second conjunct $b \vee a$. Informally, someone claiming “$A_1$ and $A_2$” has to be able to argue for $A_1$ and for $A_2$; an opponent can thus ask for any of the two.

For disjunctions of the form $A_1 \vee A_2$ asserted by player $X$, the attack by the other player $Y$ is indicated by the special symbol $?\!\vee$. The defending player $X$ chooses one of the two disjuncts $A_1$ and $A_2$. Informally, someone claiming “$A_1$ or $A_2$” has only to be able to argue for one of the two disjuncts, and can therefore choose to argue for $A_1$ or for $A_2$, if the claimed disjunction is questioned.

In the case of implications $A \rightarrow B$ asserted by player $X$, the attacking player $Y$ asserts the antecedent $A$ of the implication, and the defending player $X$ asserts the consequent $B$. Informally, someone claiming “$A$ implies $B$” has to be able to argue for $B$ whenever $A$ is given as an assumption.

Negated assertions $\neg A$ made by player $X$ can only be attacked, namely by the other player $Y$ asserting $A$. There is no defense against such an attack. Informally, when claiming that “$A$ is not the case”, one has to be able to argue against $A$.

The argumentation form for the universal quantifier $\forall$ says that an assertion of the form $\forall x A(x)$ made by player $X$ can be attacked by player $Y$ by choosing a term $t$ in the symbolic attack $Y\, ?t$. Player $X$ can then defend by asserting the formula $A(x)[t/x]$, where the term $t$ chosen by $Y$ is substituted for (all occurrences of) the variable $x$ in the formula $A(x)$, also written $A(t)$. Informally, someone claiming that “every object has the property $A$” has to be able to show for any object that it has the property $A$. An opponent can thus ask for any object (denoted by the term $t$, for example) whether it has the property $A$. This has then to be answered by an argument for $A(t)$.

For existential assertions of the form $\exists x A(x)$ made by player $X$, the attack by the other player $Y$ is indicated by the special symbol $?\exists$. The defending player $X$ chooses a term $t$ and asserts the formula $A(x)[t/x]$, that is, the formula resulting from substituting $t$ for (all occurrences of) $x$ in $A(x)$. Informally, someone claiming that “there exists an object with property $A$” has only to be able to present one object (denoted by the term $t$, for example) with the property $A$, and can thus choose to argue for $A(t)$, when the claimed existence of such an object is questioned.

The argumentation forms thus explain the meaning of the logical constants by saying how assertions that contain the respective logical constant in main position are used in argumentations between the two players, proponent $P$ and opponent $O$. This explanation is intended to capture how assertions are used according to their logical form in actual argumentations.

In the literature, argumentation forms are also called particle rules or logical rules.
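
The argumentation forms can be read as a small table of attack and defense options. The following Python sketch encodes them under an illustrative tuple encoding of formulas. The function names `subst`, `attacks` and `defenses`, the symbol strings `?1`, `?2`, `?or` and `?ex`, and the restriction of terms to plain names are assumptions of this sketch, and the substitution does not guard against variable capture.

```python
# Illustrative encoding (an assumption of this sketch, not notation from the text):
#   ("atom", "a", ("x",))                  predicate a applied to a tuple of terms
#   ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B)
#   ("all", "x", A), ("ex", "x", A)

def subst(f, x, t):
    """A[t/x]: replace the free occurrences of variable x in f by the term t."""
    op = f[0]
    if op == "atom":
        return ("atom", f[1], tuple(t if s == x else s for s in f[2]))
    if op in ("all", "ex"):
        # a quantifier that binds x again shields its body from the substitution
        return f if f[1] == x else (op, f[1], subst(f[2], x, t))
    return (op,) + tuple(subst(g, x, t) for g in f[1:])

def attacks(f, terms=()):
    """Attacks that player Y may make on an assertion f of player X."""
    op = f[0]
    if op == "and":
        return ["?1", "?2"]                # Y chooses a conjunct
    if op == "or":
        return ["?or"]                     # symbolic attack on a disjunction
    if op in ("imp", "not"):
        return [f[1]]                      # Y asserts the antecedent / the negated formula
    if op == "all":
        return ["?" + t for t in terms]    # Y chooses a term
    if op == "ex":
        return ["?ex"]                     # symbolic attack on an existential assertion
    return []                              # atomic formulas cannot be attacked

def defenses(f, attack, terms=()):
    """Defenses that player X may make for the assertion f against the attack."""
    op = f[0]
    if op == "and":
        return [f[1] if attack == "?1" else f[2]]
    if op == "or":
        return [f[1], f[2]]                # X chooses a disjunct
    if op == "imp":
        return [f[2]]
    if op == "all":
        return [subst(f[2], f[1], attack[1:])]
    if op == "ex":
        return [subst(f[2], f[1], t) for t in terms]   # X chooses a term
    return []                              # there is no defense for a negated assertion

a_x = ("atom", "a", ("x",))
print(defenses(("all", "x", a_x), "?t1"))  # [('atom', 'a', ('t1',))]
```

The asymmetry between the quantifiers shows up in who supplies the term: for `"all"` it comes with the attack, for `"ex"` it is chosen in the defense.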

3. Dialogues for Intuitionistic Logic

Dialogical logic was at first developed for intuitionistic logic and for classical logic.

Classical logic is usually based on the principle of bivalence. Each assertion is either true or false, and the truth value of a compound assertion is determined by the truth values of its constituents. For example, the meaning of the logical constant $\wedge$ is given by saying that assertions of the form $A \wedge B$ have the truth value “true” if both conjuncts $A$ and $B$ have the truth value “true”; otherwise the truth value of $A \wedge B$ is “false”.

Intuitionistic logic is not based on bivalence. Instead of employing truth values, the intuitionistic meaning of logical constants is usually explained in terms of proofs or transformations of proofs. For example, the meaning of $\wedge$ is explained by saying that an assertion of the form $A \wedge B$ has a proof if and only if both $A$ and $B$ have a proof. An implication $A \rightarrow B$ has a proof if and only if one is in possession of a construction that transforms any proof of the antecedent $A$ into a proof of the consequent $B$. The classical principle of tertium non datur ($A \vee \neg A$) is rejected in intuitionistic logic, and other classical principles such as double negation elimination ($\neg\neg A \rightarrow A$) do not hold. The set of intuitionistic theorems is a subset of the set of classical theorems. In particular, $A \rightarrow B$ is not equivalent to $\neg A \vee B$ in intuitionistic logic, whereas it is in classical logic. In intuitionistic logic, implication ($\rightarrow$) is a genuine logical constant; it cannot be expressed by using other logical constants such as negation ($\neg$) and disjunction ($\vee$).
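
The classical, truth-value side of this contrast can be checked mechanically. The following Python sketch evaluates formulas by truth tables; the tuple encoding and the names `atoms`, `value`, `is_tautology` and `equivalent` are assumptions made for illustration. It says nothing about intuitionistic provability, which is not captured by two truth values.

```python
from itertools import product

# Illustrative encoding (an assumption of this sketch): atoms are strings,
# compound formulas are tuples ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B).

def atoms(f):
    """Set of atomic formulas occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    """Classical truth value of f under the assignment v (dict: atom -> bool)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not value(f[1], v)
    if op == "and":
        return value(f[1], v) and value(f[2], v)
    if op == "or":
        return value(f[1], v) or value(f[2], v)
    if op == "imp":
        return (not value(f[1], v)) or value(f[2], v)
    raise ValueError(f"unknown connective: {op}")

def is_tautology(f):
    """True if f is true under every assignment of truth values to its atoms."""
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

def equivalent(f, g):
    """Classical equivalence: f and g imply each other under every assignment."""
    return is_tautology(("and", ("imp", f, g), ("imp", g, f)))

print(is_tautology(("or", "a", ("not", "a"))))                    # True
print(is_tautology(("imp", ("not", ("not", "a")), "a")))          # True
print(equivalent(("imp", "a", "b"), ("or", ("not", "a"), "b")))   # True
```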

Dialogical formulations can be given for both classical logic and intuitionistic logic. The difference between these formulations is made by different notions of dialogue, specified by certain conditions. Dialogues for intuitionistic logic are defined next. Dialogues for classical logic will be dealt with below, in Section 5. The argumentation forms underlying these notions do not differ; they are the same for both logics.

a. Definition

Dialogues for intuitionistic logic are defined with respect to the given argumentation forms to be finite or infinite sequences of moves satisfying the following dialogue conditions:

0. The first move is made by the proponent $P$ with the assertion of a non-atomic formula, and proponent $P$ and opponent $O$ alternate moves as determined by the argumentation forms.
1. $P$ may assert an atomic formula only if it has been asserted by $O$ before.
2. If there is more than one open attack, then only the last one may be defended. (An attack is open if it has not been defended yet. Attacks made according to the argumentation form for $\neg$ are always open, since there is no defense against them.)
3. An attack may be defended at most once.
4. A $P$-signed formula may be attacked at most once.

These conditions are also called structural rules or frame rules in the literature. They are here given only informally. Condition 3 refers to one occurrence of an attack, and condition 4 refers to one occurrence of a $P$-signed formula. Hence, if an already defended attack is repeated, then condition 3 does not prohibit that this new occurrence of the attack can be defended once, too. This holds likewise for an already attacked $P$-signed formula. If this formula is again asserted by $P$, then condition 4 does not prohibit that this new occurrence can be attacked once as well.

It can be observed that proponent and opponent are not interchangeable. This is only due to conditions 1 and 4, which are asymmetric for $P$ and $O$. In particular, the moves $X\, A$ and $Y\, \neg A$ do not amount to the same because of condition 1. The argumentation forms, on the other hand, are completely symmetric with respect to $P$ and $O$, that is, the argumentation forms are player-independent.

The proponent wins a dialogue for a formula $A$ if the dialogue is finite, begins with the move $P\, A$ and ends with a move of $P$ such that $O$ cannot make another move, that is, every move that $O$ could make according to the argumentation forms violates at least one of conditions 0 to 4.

b. Examples

Dialogues are written with position numbers on the left and with comments on the right. The comments make explicit what kind of move is made (attack or defense) and to which preceding move it refers. In this notation, moves have the format $$\langle\text{position number}\rangle \langle\text{signed expression}\rangle \langle\text{comment}\rangle$$
The following is a dialogue for the formula $a \rightarrow (b \rightarrow a)$:
$$\begin{array}{rll}0. & P\, a \rightarrow (b \rightarrow a) & \\1. & O\, a & (\text{attack on }0)\\2. & P\, b \rightarrow a & (\text{defense against }1)\\ 3. & O\, b & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }3)\end{array}$$
The dialogue starts with the assertion of the formula $a \rightarrow (b \rightarrow a)$ by the proponent $P$ in the initial move at position 0. This initial move is attacked by the opponent $O$ at position 1 with the assertion of the antecedent $a$ of the implication asserted by $P$ at position 0. The attack is thus made according to the argumentation form for $\rightarrow$. In the next move at position 2, the proponent defends against this attack according to the argumentation form for $\rightarrow$ by asserting the consequent $b \rightarrow a$ of the attacked implication $a \rightarrow (b \rightarrow a)$. The implication $b \rightarrow a$ is attacked by $O$ at position 3 by asserting its antecedent $b$. This attack is defended by $P$ at position 4 by asserting $a$, the consequent of $b \rightarrow a$. Here $P$ is allowed to assert the atomic formula $a$, since $O$ has asserted $a$ before (compare condition 1). These last moves are also made according to the argumentation form for $\rightarrow$. The opponent cannot make another move, since atomic formulas cannot be attacked (there are only argumentation forms for non-atomic assertions), and $O$ cannot repeat attacks due to condition 4. The dialogue is thus won by $P$.

The following is a dialogue for the formula $\neg a \vee (a \rightarrow a)$:
$$\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, \neg a & (\text{defense against }1)\\3. & O\, a & (\text{attack on 2})\end{array}$$
The initial move is attacked by $O$ with the symbolic attack $O\, ?\!\vee$ according to the argumentation form for $\vee$. This attack can be defended by $P$ either by asserting the left disjunct $\neg a$ or by asserting the right disjunct $a \rightarrow a$. Here, the proponent chooses the former in the defense move at position 2. This is attacked by $O$ with the assertion of $a$ according to the argumentation form for $\neg$. The dialogue is not won by $P$, and $P$ cannot make another move: the atomic formula $a$ cannot be attacked, there is no defense against attacks on negated formulas, and due to condition 3 it is not possible to defend against the attack $O\, ?\!\vee$ again.

Another dialogue for the same formula $\neg a \vee (a \rightarrow a)$ is obtained if $P$ defends against the attack $O\, ?\!\vee$ by choosing to assert the right disjunct $a \rightarrow a$ instead of the left disjunct at position 2:
$$\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, a \rightarrow a & (\text{defense against }1)\\3. & O\, a & (\text{attack on }2)\\4. & P\, a & (\text{defense against }3)\end{array}$$
At position 3, the opponent attacks $a \rightarrow a$ by asserting its antecedent $a$, which $P$ defends against at position 4 with the assertion of the consequent $a$. This dialogue is finite and ends with a move of $P$ such that $O$ cannot make another move, that is, this dialogue is won by $P$.

These two dialogues for the formula $\neg a \vee (a \rightarrow a)$ show that for a valid formula there can be dialogues which are won by $P$ as well as dialogues which are not won by $P$, although every possible move has been made. There are also invalid formulas for which this is the case. An example is $a \wedge (a \rightarrow a)$. If $O$ attacks this formula with $O\, ?2$, then $P$ wins the dialogue: the defense $P\, a \rightarrow a$ is attacked with $O\, a$, which $P$ defends against with $P\, a$ as final move. If, however, $O$ attacks with $O\, ?1$, then $P$ cannot make another move, since the first conjunct $a$, which is an atomic formula, cannot be asserted because of condition 1.

A dialogue for the first-order formula $\neg\forall x\neg a(x) \rightarrow \exists x a(x)$ is the following:
$$\begin{array}{rll}0. & P\, \neg\forall x\neg a(x) \rightarrow \exists x a(x)\\1. & O\, \neg\forall x\neg a(x) & (\text{attack on }0)\\2. & P\, \forall x\neg a(x) & (\text{attack on }1)\\3. & O\, ?t_1 & (\text{attack on }2)\\4. & P\, \neg a(t_1) & (\text{defense against }3)\\5. & O\, a(t_1) & (\text{attack on }4)\end{array}$$
Instead of defending against $O$’s attack (made at position 1) with $P\, \exists x a(x)$ at position 2, the proponent attacks $O$’s assertion of $\neg\forall x\neg a(x)$ by asserting $\forall x\neg a(x)$, according to the argumentation form for $\neg$. At position 3, the opponent attacks according to the argumentation form for $\forall$ by choosing the term $t_1$ in the symbolic attack $O\, ?t_1$, which $P$ defends against by asserting $\neg a(t_1)$ at position 4. The opponent attacks this according to the argumentation form for $\neg$ with $a(t_1)$ in the last move. Note that there are now two open attacks made by $O$: the one at position 1 and the one in the last move. Due to condition 2, only the last of the two may be defended. Thus the proponent cannot catch up with the defense $P\, \exists x a(x)$ of $O$’s attack made at position 1, and according to the argumentation form for $\neg$ there is no defense to the last move. The proponent can only repeat the attack made at position 2, which leads to an infinite dialogue where a sequence of moves like
$$\begin{array}{rll}n. & P\, \forall x\neg a(x) & (\text{attack on }1)\\n+1. & O\, ?t & (\text{attack on }n)\\n+2. & P\, \neg a(t) & (\text{defense against }n+1)\\n+3. & O\, a(t) & (\text{attack on }n+2)\end{array}$$
is repeated endlessly; the opponent may choose a different term $t$ in each repetition. Note that repeated attacks on the same move are prohibited only for $O$, while for $P$ there is no such restriction. Hence $P$ can repeatedly attack the move $O\, \neg\forall x\neg a(x)$ (made at position 1) with the move $P\, \forall x\neg a(x)$, starting the loop. Furthermore, each occurrence of an attack is defended at most once, and each occurrence of a $P$-signed formula is attacked at most once, in accordance with conditions 3 and 4, respectively. This dialogue for the formula $$\neg\forall x\neg a(x) \rightarrow \exists x a(x)$$ is therefore neither won by $P$ nor does it end with an opponent move.

In summary, it can be observed that there are valid formulas for which there are finite dialogues that are not won by $P$ as well as dialogues that are won by $P$, there are invalid formulas for which the same is the case, and there can be infinite dialogues, too. The notion of winning a dialogue is itself not sufficient for making a distinction concerning the validity of a formula.

4. Winning Strategies and Validity

In dialogical logic the central logical notion of validity is explained in terms of the game-theoretic notion of strategy. A strategy determines each move of a player. The crucial notion for validity is that of winning proponent strategy. Dialogical validity of a formula consists in the existence of a winning proponent strategy for that formula.

a. Winning Strategies

A player $X$ has a winning strategy if for every move made by the other player $Y$ player $X$ can make another move, such that each resulting play of the game (that is, each resulting dialogue) is eventually won by $X$. In dialogical logic one is usually only interested in winning strategies for the proponent $P$. A winning proponent strategy for a formula $A$ is a tree $S$ whose branches are dialogues for $A$ won by $P$, where the nodes are the moves, such that

$S$ has the move $P\, A$ as root node (with depth 0),
if the depth of a node is odd (that is, if the node is an opponent move), then it has exactly one successor node (which is a proponent move),
if the depth of a node is even (that is, if the node is a proponent move), then it has as many successor nodes as there are possible moves for $O$ at this position.

b. Examples

The following dialogue, which has been discussed above, is already a winning proponent strategy for the formula $\neg a \vee (a \rightarrow a)$:
$$\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, a \rightarrow a & (\text{defense against }1)\\3. & O\, a & (\text{attack on }2)\\4. & P\, a & (\text{defense against }3)\end{array}$$
This winning proponent strategy has only one branch, which is the dialogue shown. The root node $$P\, \neg a \vee (a \rightarrow a)$$ has only one successor, since the move $O\, ?\!\vee$ is the only possible move for $O$. This node at depth 1 has exactly one successor node, namely the move $P\, a \rightarrow a$ at depth 2. This in turn has again only one successor node, namely the move $O\, a$ at depth 3, since no other opponent moves are possible. Its one successor node is $P\, a$, which has no successor as there are no possible opponent moves. The dialogue is won by $P$, and all possible opponent moves have been taken into account. This single branch tree is thus a winning proponent strategy for the formula $\neg a \vee (a \rightarrow a)$.

In general, winning proponent strategies have more than one branch. If there are several opponent moves possible after a proponent move, then there will be a branch for each of the possible opponent moves. Consider the following winning proponent strategy for the formula $(\neg a \vee b) \rightarrow (a \rightarrow b)$:

$\begin{array}{rlcl}0. &\hspace{5em} & P\, (\neg a \vee b) \rightarrow (a \rightarrow b) &\\1. & & O\, \neg a \vee b & (\text{attack on }0)\\2. & & P\, a \rightarrow b & (\text{defense against }1)\\3. & & O\, a & (\text{attack on }2)\\4. & & P\, ?\!\vee & (\text{attack on }1)\end{array}$

$\begin{array}{lll|ll}5. & O\, \neg a & (\text{defense against }4)\, & O\, b & (\text{defense against }4)\\6. & P\, a & (\text{attack on }5) & P\, b & (\text{defense against }3)\end{array}$

This tree of signed expressions has two branches. After the move $P\, ?\!\vee$ at depth 4 there are two possible moves for $O$, yielding the two successor nodes $O\, \neg a$ (left branch) and $O\, b$ (right branch). The proponent wins both resulting dialogues: $O$ can neither make another move after $P\, a$ (left dialogue) nor after $P\, b$ (right dialogue). Thus, this tree is a winning proponent strategy.
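
The shape of such a strategy tree can be written down as a nested data structure. The following Python sketch records the tree just described and checks two structural properties: the players alternate starting with $P$ at depth 0, and every opponent node has exactly one successor. The node layout and the names `node`, `branches` and `well_formed` are assumptions of this sketch, and moves are stored as uninterpreted strings. It does not check that every possible opponent move is covered, which would require generating the legal moves.

```python
def node(player, expr, *children):
    """A tree node: the player, the expression of the move, and the successor nodes."""
    return (player, expr, list(children))

# The two-branch tree for (not a or b) -> (a -> b); the strings are labels only.
strategy = node("P", "(-a v b) -> (a -> b)",
    node("O", "-a v b",
        node("P", "a -> b",
            node("O", "a",
                node("P", "?v",
                    node("O", "-a", node("P", "a")),
                    node("O", "b", node("P", "b")))))))

def branches(tree):
    """All root-to-leaf paths, each a list of (player, expression) pairs."""
    player, expr, children = tree
    if not children:
        return [[(player, expr)]]
    return [[(player, expr)] + rest for c in children for rest in branches(c)]

def well_formed(tree, depth=0):
    """P moves at even depth, O moves at odd depth, one successor per O node."""
    player, _, children = tree
    if player != ("P" if depth % 2 == 0 else "O"):
        return False
    if player == "O" and len(children) != 1:
        return False
    return all(well_formed(c, depth + 1) for c in children)

print(well_formed(strategy))                 # True
print([len(b) for b in branches(strategy)])  # [7, 7]
```

Since every opponent node has exactly one successor, each branch of a well-formed tree necessarily ends with a proponent move.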

c. First-Order Winning Strategies

Winning strategies for quantifier-free formulas are always finite trees, whereas winning strategies for first-order formulas can in general be trees of countably infinitely many finite branches. An example is the following winning proponent strategy for the first-order formula $\neg\exists x a(x) \rightarrow \forall x \neg a(x)$:

$\begin{array}{rcl}0. & \hspace{3em} P\, \neg\exists x a(x) \rightarrow \forall x \neg a(x) \hspace{3em}&\\1. & O\, \neg\exists x a(x) & (\text{attack on }0)\\2. & P\, \forall x \neg a(x) & (\text{defense against }1)\end{array}$

$\begin{array}{rl|l|l|ll}3. & O\, ?t_1 & O\, ?t_2 & O\, ?t_3 & \ldots\hspace{.85em} & (\text{attack on }2)\\4. & P\, \neg a(t_1) & P\, \neg a(t_2) & P\, \neg a(t_3) & \ldots & (\text{defense against }3)\\ 5. & O\, a(t_1) & O\, a(t_2) & O\, a(t_3) & \ldots & (\text{attack on }4)\\ 6. & P\, \exists x a(x) & P\, \exists x a(x) & P\, \exists x a(x) & \ldots & (\text{attack on }1)\\ 7. & O\, ?\exists & O\, ?\exists & O\, ?\exists & \ldots & (\text{attack on }6)\\ 8. & P\, a(t_1) & P\, a(t_2) & P\, a(t_3) & \ldots & (\text{defense against }7)\end{array}$

The move $P\, \forall x \neg a(x)$ at depth 2 has countably infinitely many successor nodes, since for each choice of a term $t_i$ (for natural numbers $i$) the symbolic attack $O\, ?t_i$ is a possible move. For pairwise distinct terms $t_i$ the tree therefore has infinitely many branches (indicated by “$\ldots$”), where each branch is a dialogue won by $P$.

Such infinite winning strategies can be avoided by using the following restrictions with respect to winning proponent strategies:

(i) If the depth of a node is even, and the symbolic attack $O\, ?y$ on the move $P\, \forall x A(x)$ is a possible move, where $y$ is a new variable in this branch, then $O\, ?y$ is the only immediate successor node that is an attack on $P\, \forall x A(x)$.
(ii) If the depth of a node is even, $P\, ?\exists$ is an attack on an assertion $O\, \exists x A(x)$, and the move $O\, A(x)[y/x]$ is a possible defense against this attack, where $y$ is a new variable in this branch, then $O\, A(x)[y/x]$ is the only immediate successor node that is a defense against $P\, ?\exists$.

There may be further possible moves, which are not attacks on $P\, \forall x A(x)$ or defenses against $P\, ?\exists$. In this case the node at even depth has more than one immediate successor node. However, the number of these immediate successor nodes can only be finite, and there is thus no more infinite ramification within winning proponent strategies.

For example, the infinite winning proponent strategy indicated above is reduced to the following finite winning proponent strategy:
$$\begin{array}{rll}0. & P\, \neg\exists x a(x) \rightarrow \forall x \neg a(x) & \\ 1. & O\, \neg\exists x a(x) & (\text{attack on }0)\\ 2. & P\, \forall x \neg a(x) & (\text{defense against }1)\\ 3. & O\, ?y & (\text{attack on }2)\\ 4. & P\, \neg a(y) & (\text{defense against }3)\\ 5. & O\, a(y) & (\text{attack on }4)\\ 6. & P\, \exists x a(x) & (\text{attack on }1)\\ 7. & O\, ?\exists & (\text{attack on }6)\\ 8. & P\, a(y) & (\text{defense against }7)\end{array}$$
The restrictions (i) and (ii) have the effect that in winning strategies only one of the possible attacks $O\, ?t$ (for each term $t$) on $P\, \forall x A(x)$ or defenses $O\, A(x)[t/x]$ (for each term $t$) against $P\, ?\exists$ has to be taken into account, namely one where the term $t$ is a new variable, whereas in winning strategies which are not thus restricted one has to consider the corresponding moves for each term $t$, including variables. It can be shown that a restricted winning strategy can always be extended to an unrestricted one.
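
The effect of the restrictions on branching can be illustrated with a small Python sketch. Unrestricted, there is one symbolic attack $O\, ?t$ per term; restricted, a single attack with a new variable is considered. The names `unrestricted_attacks`, `fresh_variable` and `restricted_attack`, the naming scheme `y`, `y1`, `y2`, ... for variables, and the representation of a branch by the set of terms occurring in it are assumptions of this sketch.

```python
from itertools import count, islice

def unrestricted_attacks(terms):
    """One symbolic attack ?t per term t; `terms` may be an infinite iterator."""
    return ("?" + t for t in terms)

def fresh_variable(used):
    """First variable among y, y1, y2, ... that does not occur in `used`."""
    candidates = ("y" + (str(i) if i else "") for i in count())
    return next(v for v in candidates if v not in used)

def restricted_attack(used):
    """The single attack considered under the restriction: ?y for a new variable y."""
    return "?" + fresh_variable(used)

all_terms = ("t%d" % i for i in count(1))                # t1, t2, t3, ... without end
print(list(islice(unrestricted_attacks(all_terms), 3)))  # ['?t1', '?t2', '?t3']
print(restricted_attack({"x"}))                          # ?y
print(restricted_attack({"x", "y"}))                     # ?y1
```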

Infinite ramifications in winning strategies can also be avoided by replacing the player-independent argumentation forms for $\forall$ and $\exists$ by the following player-dependent argumentation forms:
$$\begin{array}{rlll}\forall_{P}: & \text{assertion}: & P\, \forall x A(x) & \\ & \text{attack}: & O\, ?y & (\text{variable }y\text{ is new})\\ & \text{defense}: & P\, A(x)[y/x] & \\ \\ \forall_{O}: & \text{assertion}: & O\, \forall x A(x) & \\ & \text{attack}: & P\, ?t & (P\text{ chooses the term }t)\\ & \text{defense}: & O\, A(x)[t/x] & \\ \\ \exists_{P}: & \text{assertion}: & P\, \exists x A(x) & \\ & \text{attack}: & O\, ?\exists & \\ & \text{defense}: & P\, A(x)[t/x] & (P\text{ chooses the term }t)\\ \\ \exists_{O}: & \text{assertion}: & O\, \exists x A(x) & \\ & \text{attack}: & P\, ?\exists & \\ & \text{defense}: & O\, A(x)[y/x] & (\text{variable }y\text{ is new})\end{array}$$
The argumentation forms $\forall_{P}$ and $\exists_{O}$ contain the condition that the variable $y$ is new. They are thus history-dependent in the sense that the possibility of a move $O\, ?y$ or $O\, A(x)[y/x]$ in a dialogue depends on whether the variable $y$ has already occurred in this dialogue. The player-independent argumentation forms for $\forall$ and $\exists$, on the other hand, are not history-dependent.

Winning proponent strategies for the resulting dialogues can then be restricted as follows: Only one successor node for a node at even depth has to be considered if

the symbolic attack $O\, ?y$ according to the argumentation form $\forall_{P}$ is a possible opponent move,
or the opponent move defending a symbolic attack $P\, ?\exists$ according to the argumentation form $\exists_{O}$ is a possible move.

Again, further moves according to other argumentation forms may be possible. The resulting winning proponent strategies are finite. The use of the player-dependent argumentation forms therefore has technical advantages. However, from a conceptual point of view, player-independent argumentation forms might be preferable.

d. Tertium Non Datur and the Principle of Non-Contradiction

The proponent does not have a winning strategy for every formula. An example is the instance $a \vee \neg a$ of tertium non datur. There is only one possible dialogue, namely:
$$\begin{array}{rll}0. & P\, a \vee \neg a & \\ 1. & O\, ?\!\vee & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{defense against }1)\\ 3. & O\, a & (\text{attack on }2)\end{array}$$
which is not won by $P$. At position 2, the proponent can only defend against $O$’s symbolic attack $O\, ?\!\vee$ by choosing to assert the right disjunct $\neg a$. Choosing the left disjunct is not an option due to condition 1. Due to condition 4, the opponent cannot repeat the symbolic attack $O\, ?\!\vee$ at position 3; the only possible move for $O$ is to attack $P\, \neg a$ with $O\, a$. This attack can neither be defended nor can it be attacked, and another defense against the already defended symbolic attack $O\, ?\!\vee$ is ruled out by condition 3.

On the other hand, there is a winning proponent strategy for each instance of the principle of non-contradiction, $\neg (A \wedge \neg A)$. Consider the following winning proponent strategy for $\neg(a \wedge \neg a)$:
$$\begin{array}{rll} 0. & P\, \neg(a \wedge \neg a) & \\ 1. & O\, a \wedge \neg a & (\text{attack on }0)\\ 2. & P\, ?1 & (\text{attack on }1)\\ 3. & O\, a & (\text{defense against }2)\\ 4. & P\, ?2 & (\text{attack on }1)\\ 5. & O\, \neg a & (\text{defense against }4)\\ 6. & P\, a & (\text{attack on }5)\end{array}$$
Here it is essential that $P$ can attack the same assertion repeatedly. The opponent’s assertion of $a \wedge \neg a$ at position 1 is attacked by $P$ first at position 2 by choosing the first conjunct, and again at position 4, now by choosing the second conjunct. Both attacks are necessary for having a winning proponent strategy.

There is no winning proponent strategy in the case of tertium non datur, since $P$ cannot defend against the attack on $a \vee \neg a$ repeatedly, while in the case of the principle of non-contradiction there is a winning proponent strategy because $P$ can attack $a \wedge \neg a$ repeatedly. In classical logic, tertium non datur and the principle of non-contradiction are equivalent, while in intuitionistic logic only the principle of non-contradiction holds. For the dialogues considered, this distinction rests upon the fact that $P$ can repeatedly attack but not repeatedly defend against an assertion.

To sum up, there are formulas for which there exists a winning proponent strategy, and there are formulas for which this is not the case. A given formula can also have more than one winning proponent strategy.

e. Dialogical Validity and Completeness

The dialogical notion of validity is defined as follows:

A formula $A$ is called valid if there is a winning proponent strategy for $A$.

That this dialogical notion of validity corresponds exactly to intuitionistic provability is the content of the following completeness result:

A formula $A$ is valid if and only if $A$ is provable in intuitionistic logic.

Hence, for the dialogues defined by conditions 0 to 4, one obtains a dialogical formulation of intuitionistic logic.

Provability is closed under uniform substitution of formulas for atomic formulas. That is, if a formula $A$ is provable in intuitionistic logic, then each substitution instance $A'$, obtained by uniformly substituting formulas for atomic formulas in $A$, is provable in intuitionistic logic, too. The completeness result implies that validity is also closed under uniform substitution. That is, if there is a winning proponent strategy for $A$, then there are winning proponent strategies for each instance $A'$ of $A$ that is the result of a uniform substitution of formulas for atomic formulas in $A$.
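
Uniform substitution itself is a simple syntactic operation. The following Python sketch shows it for propositional formulas; the tuple encoding (atoms as strings, compound formulas as tuples) and the name `substitute` are assumptions made for illustration.

```python
def substitute(f, sigma):
    """Replace every atom a in f by sigma[a]; atoms not listed in sigma stay unchanged."""
    if isinstance(f, str):
        return sigma.get(f, f)
    return (f[0],) + tuple(substitute(g, sigma) for g in f[1:])

# a -> (b -> a), with a replaced by (c or d) and b replaced by (not c)
schema = ("imp", "a", ("imp", "b", "a"))
instance = substitute(schema, {"a": ("or", "c", "d"), "b": ("not", "c")})
print(instance)
# ('imp', ('or', 'c', 'd'), ('imp', ('not', 'c'), ('or', 'c', 'd')))
```

The substitution is uniform because both occurrences of the atom `a` receive the same replacement.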

f. Winning Strategies as Proofs

Dialogues can also be viewed as constituents of a proof system. On this view, the proofs of a formula $A$ are the winning proponent strategies for $A$. Completeness is then formulated as an equivalence theorem for winning proponent strategies and proofs in a given proof system such as, for example, sequent calculus:

There is a winning proponent strategy for $A$ if and only if $A$ is provable in sequent calculus for intuitionistic logic.

A constructive proof of this theorem has been given by Felscher [1985] by showing that there are algorithms transforming any winning proponent strategy for a formula $A$ into a proof of $A$ in sequent calculus for intuitionistic logic, and, the other way round, transforming any proof of $A$ into a winning proponent strategy for $A$.

5. Dialogues for Classical Logic

A dialogical rendering of classical logic is obtained by relaxing conditions 2 and 3 for the proponent $P$, while keeping them for the opponent $O$. That is, conditions 2 and 3 are replaced by the following conditions:

2′. If there is more than one open attack by $P$, then only the last one may be defended by $O$.

3′. An attack by $P$ may be defended by $O$ at most once.

For $P$ this means that if there is more than one open attack made by $O$, then $P$ may defend against any of these attacks (instead of only the last one), and $P$ can defend against attacks made by $O$ repeatedly.

No changes are made to the argumentation forms. Classical dialogues are defined with respect to them to be finite or infinite sequences of moves made according to conditions 0, 1, 2′, 3′ and 4.

Classical winning proponent strategies for a formula $A$ are defined as before in the case of intuitionistic logic, but with the notion of dialogue replaced by the notion of classical dialogue.

a. Examples

There is a classical winning proponent strategy for the formula $a \vee \neg a$. It consists in the following classical dialogue:
$$\begin{array}{rll} 0. & P\, a \vee \neg a & \\ 1. & O\, ?\!\vee & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{defense against }1)\\ 3. & O\, a & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }1)\end{array}$$
At position 2, the proponent can defend against $O$’s symbolic attack $O\, ?\!\vee$ only by asserting the right disjunct $\neg a$, since the atomic left disjunct $a$ has not been asserted by $O$ yet (compare condition 1). But due to the replacement of condition 3 by condition 3′, the proponent can defend against this attack again at position 4, now by asserting the left disjunct $a$, which has been asserted by $O$ at the preceding position 3.

There is a classical winning proponent strategy for the formula $\neg\neg a \rightarrow a$, an instance of the intuitionistically invalid principle of double negation elimination. It consists in the following classical dialogue:
$$\begin{array}{rll} 0. & P\, \neg\neg a \rightarrow a & \\ 1. & O\, \neg\neg a & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{attack on }1)\\ 3. & O\, a & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }1)\end{array}$$
The last move is possible due to the replacement of condition 2 by condition 2′. At position 3 there are two open attacks by $O$ (made at positions 1 and 3, respectively). Condition 2 would prohibit $P$’s defense against the first attack, since this is not the last open attack. But condition 2′ enables $P$ to defend against any earlier open attack. At position 4, the proponent can thus defend against $O$’s first attack by asserting the consequent $a$ of the attacked implication $\neg\neg a \rightarrow a$.

These two examples show that the replacement of both conditions 2 and 3 by conditions 2′ and 3′, respectively, is necessary to obtain classical logic. Otherwise, there would not be winning proponent strategies for (all instances of) either tertium non datur or the principle of double negation elimination, which are both principles of classical logic.
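
For propositional formulas, conditions 0 to 4 and their classical relaxation by conditions 2′ and 3′ can be sketched as a bounded search for winning proponent strategies. The following Python code is a minimal sketch, not a reference implementation: the tuple encoding, the symbol strings `?1`, `?2`, `?or`, all function names and the bound `max_moves` are assumptions. Because $P$ may repeat attacks indefinitely, the search is cut off after `max_moves` moves, so a result of `False` only means that no winning strategy was found within that bound.

```python
# Atoms are strings such as "a"; compound formulas are tuples ("not", A),
# ("and", A, B), ("or", A, B), ("imp", A, B). A move is a tuple
# (player, expression, kind, ref) with kind in {"init", "att", "def"}.

def is_atom(e):
    return isinstance(e, str) and not e.startswith("?")

def attacks(f):
    """Attacks on an assertion f according to the argumentation forms."""
    if isinstance(f, str):                 # atoms and symbolic attacks cannot be attacked
        return []
    if f[0] == "and":
        return ["?1", "?2"]
    if f[0] == "or":
        return ["?or"]
    return [f[1]]                          # "imp": antecedent; "not": the negated formula

def defenses(f, attack):
    """Defenses of the assertion f against the given attack."""
    if f[0] == "and":
        return [f[1] if attack == "?1" else f[2]]
    if f[0] == "or":
        return [f[1], f[2]]
    if f[0] == "imp":
        return [f[2]]
    return []                              # no defense for negation

def legal_moves(hist, player, classical=False):
    other = "O" if player == "P" else "P"
    o_atoms = {e for (pl, e, _, _) in hist if pl == "O" and is_atom(e)}

    def allowed(e):                        # condition 1
        return player == "O" or not is_atom(e) or e in o_atoms

    moves = []
    for i, (pl, e, _, _) in enumerate(hist):
        if pl != other:
            continue
        if player == "O" and any(k == "att" and r == i for (_, _, k, r) in hist):
            continue                       # condition 4
        moves += [(player, a, "att", i) for a in attacks(e) if allowed(a)]
    theirs = [j for j, (pl, _, k, _) in enumerate(hist) if pl == other and k == "att"]
    if classical and player == "P":
        targets = theirs                   # conditions 2' and 3' do not restrict P
    else:
        defended = {r for (_, _, k, r) in hist if k == "def"}
        targets = [j for j in theirs if j not in defended][-1:]   # conditions 2 and 3
    for j in targets:
        attacked = hist[hist[j][3]][1]
        moves += [(player, d, "def", j)
                  for d in defenses(attacked, hist[j][1]) if allowed(d)]
    return moves

def p_wins(hist, budget, classical):
    """O is to move: True if P can force a win with at most `budget` further moves."""
    for o_move in legal_moves(hist, "O", classical):
        h = hist + [o_move]
        if budget < 2 or not any(p_wins(h + [m], budget - 2, classical)
                                 for m in legal_moves(h, "P", classical)):
            return False
    return True

def has_winning_strategy(f, classical=False, max_moves=12):
    if is_atom(f):                         # condition 0: the initial formula is non-atomic
        return False
    return p_wins([("P", f, "init", None)], max_moves, classical)

tnd = ("or", "a", ("not", "a"))
print(has_winning_strategy(tnd))                  # False
print(has_winning_strategy(tnd, classical=True))  # True
```

With the same bound, the sketch also finds a winning strategy for $\neg\neg a \rightarrow a$ only when `classical=True`, and finds one for $\neg(a \wedge \neg a)$ under both settings.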

b. Classical Dialogical Validity and Completeness

The dialogical notion of classical validity is defined as follows:

A formula $A$ is called classically valid if there is a classical winning proponent strategy for $A$.

The following completeness theorem holds:

A formula $A$ is classically valid if and only if $A$ is provable in classical logic.

A proof of this theorem for a sequent calculus for classical logic can be found in Sørensen and Urzyczyn [2006].

A dialogical formulation of classical logic has thus been obtained by a modification of the dialogue conditions for intuitionistic dialogues. This means that one can obtain dialogical formulations of different logics by changing the rules of the dialogue games.

6. Origins and Recent Developments

The dialogical approach to logic was first proposed by Lorenzen in 1958 (Lorenzen [1960]; see also Lorenzen [1961]) for intuitionistic logic as well as for classical logic. That the dialogical approach as such cannot be taken as a foundation of intuitionistic logic is obvious, since a dialogical notion of classical validity can be obtained by modifying the dialogue conditions given for intuitionistic logic. If one wants to obtain a dialogical foundation for intuitionistic logic, it is therefore necessary to give a justification for the special kind of dialogues needed for intuitionistic logic. Such a justification has been proposed by Felscher [2002] (first published in 1986); it is based on the notions of contention, hypothesis and relevance.

The dialogical approach has been extended to several non-classical logics, including modal logic and linear logic; for an overview see Rahman and Keiff [2005] and Keiff [2011]. A dialogical setting for the interpretation of implications as rules has been considered by Piecha and Schroeder-Heister [2012]. Dialogical logic can thus provide a common basis for discussing different kinds of logics.

7. References and Further Reading

Andreas Blass. A Game Semantics for Linear Logic. Annals of Pure and Applied Logic, 56:183–220, 1992.
- Presents a dialogue semantics for linear logic. Starting point for game-theoretic developments in computer science.
Walter Felscher. Dialogues, Strategies, and Intuitionistic Provability. Annals of Pure and Applied Logic, 28:217–254, 1985.
- Constructive completeness proof for intuitionistic logic.
Walter Felscher. Dialogues as a Foundation for Intuitionistic Logic. In D. M. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, 2nd Edition, Volume 5, pages 115–145. Kluwer, Dordrecht, 2002.
- Explains the basic concepts of dialogical logic, gives an overview on the literature on dialogues, and develops an argumentative foundation for a certain kind of dialogues as a basis for intuitionistic logic.
Wilfrid Hodges and Erik C. W. Krabbe. Dialogue Foundations. Aristotelian Society Supplementary Volume, 75:17–49, 2001.
- Critical discussion between Hodges and Krabbe on dialogues as a foundation for logic.
Laurent Keiff. Dialogical Logic. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Stanford University, Summer 2011 edition, 2011.
- Overview of dialogical logic, including dialogues for modal, linear and other non-classical logics. The presentation uses an alternative formalization of dialogical logic.
Erik C. W. Krabbe. Dialogue Logic. In D. M. Gabbay and J. Woods, editors, Handbook of the History of Logic, Volume 7: Logic and the Modalities in the Twentieth Century, pages 665–704. Elsevier North-Holland, Amsterdam, 2006.
- Traces the historical development of dialogical logic. Contains a useful bibliography.
Kuno Lorenz. Basic Objectives of Dialogue Logic in Historical Perspective. In S. Rahman and H. Rückert, editors, New Perspectives in Dialogical Logic, volume 127 of Synthese, pages 255–263. Springer, Berlin, 2001.
- Describes the development of dialogical logic in historical context, with emphasis on the notion of dialogue-definiteness.
Paul Lorenzen. Logik und Agon. In Atti del XII Congresso Internazionale di Filosofia (Venezia, 12–18 Settembre 1958), volume 4, pages 187–194. Sansoni Editore, Firenze, 1960.
- First proposal of dialogues as a means to explain intuitionistic logic and classical logic.
Paul Lorenzen. Ein dialogisches Konstruktivitätskriterium. In Infinitistic Methods. Proceedings of the Symposium on Foundations of Mathematics (Warsaw, 2–9 September 1959), pages 193–200. Pergamon Press, Oxford/London/New York/Paris, 1961.
- Dialogical explanation of the meaning of logical constants and of the meaning of inductive definitions.
Thomas Piecha and Peter Schroeder-Heister. Implications as Rules in Dialogical Semantics. In M. Peliš and V. Punčochář, editors, The Logica Yearbook 2011, pages 211–225. College Publications, London, 2012.
- Formulates dialogical semantics for implications-as-rules approach.
Shahid Rahman and Laurent Keiff. On How to Be a Dialogician. In D. Vanderveken, editor, Logic, Thought and Action, volume 2 of Logic, Epistemology, and the Unity of Science, pages 359–408. Springer, Dordrecht, 2005.
- Survey of dialogical formulations of a variety of logics.
Morten Heine Sørensen and Paweł Urzyczyn. Lectures on the Curry-Howard Isomorphism, volume 149 of Studies in Logic and the Foundations of Mathematics. Elsevier, New York, 2006.
- Contains a completeness proof for classical dialogues.
Wolfgang Stegmüller. Remarks on the completeness of logical systems relative to the validity-concepts of P. Lorenzen and K. Lorenz. Notre Dame Journal of Formal Logic, 5:81–112, 1964.
- Contains a comparison of the dialogical approach to semantics with the Bolzano-Tarski approach.

Author Information

Thomas Piecha
Email: thomas.piecha@uni-tuebingen.de
University of Tübingen
Germany

F. H. Bradley: Logic

Although the logical system expounded by F. H. Bradley in The Principles of Logic (1883) is now almost forgotten, it had many virtues. To appreciate them, it is helpful to understand that Bradley had a very different view of logic from that prevalent today. He is hostile to the idea of a purely formal logic. Today, deductive logic is largely restricted to a study of the rules through which we can legitimately re-arrange our thoughts, permitting the elimination of items no longer required, but not allowing the addition of anything genuinely new. Bradley had a much wider conception and took logic to be the discipline through which we give an account and explanation of the special function of thought through which we transcend immediate experience. Bradley believes logic covers topics that would fall today under the heading of theory of knowledge.

For Bradley, the processes of thought through which we transcend immediate experience involve ideas, judgments, and inferences. He begins with judgment and offers a natural account of both relational judgments with more than one subject and judgments without a special subject, such as: “It is raining.” His general theory that the ultimate subject of all judgment is reality as such could also accommodate the mass terms that give modern logicians so much trouble.

Although Bradley accepts the credo of empiricism that all our knowledge begins in experience, he does not accept Hume’s view that our immediate experience is composed of a swarm of impressions. He rejects the theory, widespread at the time, that knowledge could be explained through the association of ideas derived from such impressions. Neither psychological particulars nor any connections among them are the sort of thing capable of representing anything beyond themselves. Judgment requires “logical” ideas that are universal, not particular.

What most baffles readers is an esoteric doctrine in which Bradley assimilates judgment and inference as processes in which there is a movement of thought from a ground to a conclusion. Unless there is a change, nothing has happened, but any change requires justification, if the inference is to be valid or the judgment true. For the movement of thought to be satisfactory, the ground and justification cannot remain external and must be brought inside. This is achieved to the extent that we can enlarge our system of thought. It may seem that Bradley is now heading to a Hegelian solution in which the completion of the system of thought brings about the identity of Thought and Reality, but Bradley is not prepared to go this far. This is, however, a matter for metaphysics and is beyond the scope of logic.

1. Biography
2. Bradley’s Conception of Logic
3. Judgment
4. Logical Ideas
5. Categorical Judgments
6. Hypothetical Judgments
7. The Esoteric Doctrine
8. Other Types of Judgments
   a. Negative Judgments
   b. Disjunctive Judgments
9. Other Topics
10. Judgment: Concluding Remarks
11. The Nature of Inference
12. The Association of Ideas
13. Inductive Inference
14. Inference: The Inclusive Theory
15. Inference and Judgment
16. Formal Logic
17. Truth and Validity
18. The Final Doctrine
19. References and Further Reading
   a. Selected works by F. H. Bradley
   b. Further Reading

1. Biography

Francis Herbert Bradley was born in 1846 into a very large family that included the celebrated Shakespearean critic, A.C. Bradley. Having studied at Oxford University, F. H. Bradley was awarded in 1870 a Fellowship at Merton College, where he remained until his death in 1924. He was not required to teach and did not do so. The dominant philosophy in England when he came to Oxford was the (kind of) empiricism, originally due to John Locke, whose champion in the nineteenth century was John Stuart Mill. This theory attempted to explain cognition through the association of mental particulars, impressions and ideas, originally introduced into the mind, it was supposed, by external causes. Bradley was implacably opposed to this position and determined to demolish it. He gained assistance in this from his wide reading in German philosophy, but refused to call himself a Hegelian, since he denied the central principle of the identity of Thought and Reality. Nonetheless, he is generally regarded as the central figure in the group of British Idealists in the late nineteenth century.

2. Bradley’s Conception of Logic

The principal source for Bradley’s thoughts about logic is a substantial two-volume work entitled The Principles of Logic, published in Oxford in 1883. A second edition appeared in 1922, in which the original text was supplemented by a large number of additional notes and terminal essays through which Bradley expressed his mature position. (Page references in what follows will be to this second edition.)

Bradley had a very different view of logic from that prevalent today. Today, logic is largely restricted to a study of the rules through which we can legitimately re-arrange our thoughts, permitting the elimination of items no longer required, but not allowing the addition of anything genuinely new. Bradley had a much wider conception and took logic to be the discipline through which we give an account and explanation of the special function of thought through which we transcend immediate experience. Logic, for Bradley, therefore covers topics that would fall today under the heading of theory of knowledge.

The processes of thought were traditionally taken to involve ideas, judgments, and inferences. These topics, however, are very closely connected. One could begin at any point, but Bradley proposes to begin in the middle with the faculty of judgment.

3. Judgment

Bradley’s central definition is as follows: “Judgment proper is the act which refers an ideal content (recognized as such) to a reality beyond the act.” (10) This definition immediately raises two serious questions: (1) What is this ideal content and how is it acquired? (2) What is reality and how is it accessed? These are questions that Bradley tackles in considerable detail. Moreover, the definition commits Bradley to the thesis that the structure of judgment is essentially subject-predicate, “that in every judgment there is a subject of which the ideal content is asserted.” (13) The subject is what is real, and the predicate is the ideal content referred to it: judgment is essentially predication.

This is, of course, to display the form of the act or function of judgment. It does not specify the essential structure of the ideal content, nor does it trap Bradley within the traditional logic of the categorical statement, as Russell believed. Categorical statements involve the combination of two terms—a subject term and a predicate term—with the two terms united by the copula in such a way that the act of combination is the act of judgment. Bradley resists this account on the ground that the ideal complex expressed is the same whether the proposition is asserted or merely entertained. “We may say then, if the copula is a connection which couples a pair of ideas, it falls outside judgment; and, if on the other hand it is the sign of judgment, it does not couple. Or, if it both joined and judged, then judgment at any rate would not be mere joining.” (21) It is not even true that every judgment contains two ideas: on the contrary, it has but one. The ideal content may be as complex as you please: it may be “a complex totality of qualities and relations” (11); but even if we distinguish separate ideas within the complex, it is as a unit that it is referred to reality. When we assert that the wolf eats the lamb, it is the whole complex that is referred beyond the act of judgment, even if we distinguish within it the separate ideas of (at least) the wolf and the lamb.

Because we can distinguish separate objects such as the wolf and the lamb that can function as special subjects, we can draw at the level of logic a distinction between singular judgments that characterize single things and plural judgments in which a number of such things may be related. But even with non-singular judgments, we must assume a unified reality within which various objects are assigned a place.

Bradley’s theory that relational judgments that appear to refer to a number of identifiable and discriminable individuals actually presuppose a single underlying reality gets confirmation from his logical analysis of a kind of judgment in which this reality is introduced directly. This is the kind of judgment that denies the existence of things of a certain type, such as sea-serpents. “Sea-serpents do not exist” has “sea-serpents” as its grammatical subject, but we must distinguish the grammatical subject from the real subject that confers a truth-value upon the statement. Sea-serpents are not the reality to which we refer when making this judgment, since there are no sea-serpents. The correct logical analysis is something like: “Reality is such that it contains no sea-serpents.” This parallels the analysis of a relational judgment such as “A and B are simultaneous” as: “Reality is such that A and B are simultaneous.” Bradley can therefore handle this kind of judgment without presupposing the existence of what is denied. What he presupposes is the reality that is the ultimate subject of every judgment. The competing analysis offered by modern logic through the negation of existential quantification presupposes a universe of discourse comprising all possible values of the individual variables in the system.
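
The modern analysis can be made explicit with a brief sketch in standard model-theoretic notation. The structure M = (D, I), with domain D and interpretation I, and the predicate letter SeaSerpent are illustrative assumptions; they do not appear in Bradley’s text.

```latex
% Modern analysis: negated existential quantification, evaluated in a
% structure M = (D, I). The names D, I and SeaSerpent are assumed here.
\[
\mathcal{M} \models \neg\exists x\,\mathrm{SeaSerpent}(x)
\quad\Longleftrightarrow\quad
I(\mathrm{SeaSerpent}) = \varnothing,
\qquad\text{where } I(\mathrm{SeaSerpent}) \subseteq D .
\]
```

The condition on the right can be stated only once a domain D has been fixed, which is the presupposition just noted. Bradley’s analysis instead takes reality as a whole as the subject of the judgment and does not quantify over a previously given stock of individuals.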

Judgment has a dimension of truth and falsity, and Bradley uses this to confirm his view that judgment necessarily involves a reference to what is real. “For consider;” he says, “a judgment must be true or false, and its truth or falsehood cannot lie in itself. They involve a reference to a something beyond. And this, about which or of which we judge, if it is not fact, what else can it be?” (41) It may be thought that logical truths, said to be true in all possible worlds, are an exception. For Bradley, logical truths, or tautologies, are not true in all possible worlds: they are not true in any possible world. “A bare tautology …is not even so much as a poor truth or a thin truth. It is not a truth in any way, in any sense, or at all.” (Appearance and Reality, Note A, 501.)

4. Logical Ideas

Bradley’s definition of judgment introduces “ideal content.” What is “ideal content” and how is it acquired? Bradley was completely sure that the psychological particulars with which empiricists furnished the mind could not begin to explain judgment, knowledge, and cognition. If such things existed, they certainly could not function as predicates in judgment, since they could not be moved from their place in the mind.

What Bradley had to explain was how we get from psychological ideas, which are mental particulars, to logical ideas, which are universal ideal contents, while preserving the information that the impressions have no doubt acquired from elsewhere. He begins by distinguishing two sides that belong to every psychological idea—its existence as a mental particular and its content. “We perceive both that it is and what it is.” (3) Unlike existence, content can be loosened from its home in the psychological idea and transferred elsewhere—a loosening of content that takes place within the act of judgment. It is not, however, the entire content of the psychological idea that is used in judgment. The original content, he says, is “mutilated.” That the acquisition of ideal content involves abstraction is more clearly appreciated, if we move from the Humean picture of a swarm of distinct impressions arriving together in the mind to the notion of an organic immediate experience with which Bradley is more comfortable. It is clear that the logical ideas used in judgment require the separation of elements within the “sensuous felt mass” presented in immediate experience. Even if we begin, however, with an isolated impression or sense-datum, we must recognize that universals are associated at different levels.

Bradley makes an unsuccessful attempt to explain what he has in mind by using the notion of a symbol. A symbol, such as a particular inscription, has, like everything else, two sides: its existence and its content. But it has also a third side—its meaning or signification. This meaning can be identified with the logical idea used in judgment. The symbol RED has as its meaning exactly what we assign to a variety of objects in the act of judgment. This provides an opening for Frege and those who favor the linguistic turn to slip in an item distinct from any image or psychological idea that may be associated with the word. (The logical idea is, of course, to be identified with what Frege calls the sense of the sign, not the referent.) But the attachment of the idea to the symbol through decision or convention does nothing to explain the connection between the abstract universal and the immediate experience which must be its home. It is only because we can abstract a part of the given content that we obtain the sense that we attach to the sign in the language.

5. Categorical Judgments

a. Universal Judgments

The standard classification of judgments distinguished categorical, hypothetical, and disjunctive. Bradley reduces the universal form of the categorical judgment to a hypothetical form. The universal form does not even guarantee the existence of real things to which we refer. “All trespassers will be prosecuted” is designed to ensure that the subject class remains empty. Thus, “Animals are mortal” becomes “If anything is an animal, then it is mortal.” (47) Bradley admits that he got this from Herbart, and Russell admits, in turn, that he got it from Bradley.
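
In the notation of modern predicate logic, which is not Bradley’s own, the reduction can be sketched roughly as follows. The predicate letters are introduced only for illustration, and the arrow is the truth-functional conditional of modern logic, a reading that Bradley himself does not adopt for hypothetical judgments.

```latex
% Universal categorical judgment rendered as a hypothetical.
% Predicate letters are illustrative assumptions, not Bradley's notation.
\[
\text{``Animals are mortal''}
\quad\text{becomes}\quad
\forall x\,\bigl(\mathrm{Animal}(x) \rightarrow \mathrm{Mortal}(x)\bigr)
\]
% No existential import: the universal form holds when the subject class is empty.
\[
\neg\exists x\,\mathrm{Trespasser}(x)
\;\models\;
\forall x\,\bigl(\mathrm{Trespasser}(x) \rightarrow \mathrm{Prosecuted}(x)\bigr)
\]
```

The second line corresponds to the trespassers notice: on this reading the judgment is true even if no one ever trespasses.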

b. Analytic Judgments of Sense

Singular judgments, however, are different. Bradley takes as his example: “I have a toothache.” I and my toothache are both individual, but I describe my condition in general terms as “suffering from toothache.” This example belongs to the first division of singular judgments that he calls “analytic judgments of sense.” “The essence of these is to hold only of the now, and not to transcend the given presentation.” (56) Analytic judgments of sense do not always have a grammatical subject or copula. We may call the cry “Wolf” a warning, but it is also a statement of fact, or is supposed to be. The cry of “Wolf” or “Rain” refers to an undifferentiated present reality. The thought is that a wolf is somewhere and that rain is everywhere, at least everywhere that matters. But there are also singular judgments without grammatical subjects in which we qualify by our idea “but one piece of the present.” (57) One way to do this is by pointing. I point to my dog and say “Asleep.” Bradley rejects the view that the grammatical subject is merely suppressed, even if a grammatical subject may appear when my judgment is reported.

Bradley identifies a second kind of analytic judgments of sense that do have a grammatical subject. “The ideal content of the predicate is here referred to another idea, which stands as a subject. But in this case, as above, the ultimate subject is no idea, but is the real in presentation. It is this to which the content of both ideas, with their relation, is attributed.” (57) “This bird is yellow” is a typical example. The ideal content “bird”, perhaps aided by a pointing finger, is used to identify the particular object that is the special subject of the judgment.

In addition to analytic judgments of sense in which a real object is introduced through what we would now call a definite description, there are other cases in which a proper name is used, such as “John is asleep.” The name “John” is bestowed to help us identify a particular person. Bradley attacks the view that a proper name has a denotation, but no connotation. The proper name is a sign connected with what it denotes, but I could not identify what it denotes without some descriptive content to help me recognize it.

c. Synthetic Judgments of Sense

The discussion of proper names allows Bradley to move to a second category of singular judgment: synthetic judgments of sense. “Proper names,” he says, “have a meaning that always goes beyond the presentation of the moment.” (61) In using the name of a person, we assume an existence that goes beyond what is available in immediate experience, a reality that appears but is distinct from its appearance. In a synthetic judgment of sense, “we make generally some assertion about that which appears in a space or time that we do not perceive.” (61-2) But how is this possible? How can we make a judgment about a reality that appeared in the past, will appear in the future, or is now over the horizon, if we encounter reality only through presentation in immediate experience? No idea can capture the uniqueness of the day that is last Tuesday. We can form the idea of a certain kind of event: we can form the idea of an extensive history involving as large a sequence of events as you please, but such ideal contents cannot capture the unique past that actually took place, which alone can make the ideas we refer to the past either true or false.

For Bradley, the solution requires a crucial distinction between “this” and “thisness”. Only this day is today. Yesterday was today yesterday, but it is no longer today today. Today is also a particular day distinct from every other day and has its own date. It has its own position in a series of days within which every day is rigidly ordered through the relation of earlier and later. This series of days does not change, even when it is envisaged at different times. It is therefore a universal ideal content, and each day within the series has particularity or “thisness”. After McTaggart, the series has been known as the B-series. This ideal series can be attached to reality, only through the identification of a particular day within it with the reality given in present experience, which will turn that day into “today.” Once this is done, days that come after the day with that date are future days that will be real, and days that come before are past days that were real. This introduces the McTaggart A-series. To explain Bradley’s theory, the unit “day” has been used, although it does not appear in the text and involves an oversimplification, since we cannot identify an entire day with the present of immediate experience. On the other hand, it would be a complete mistake to identify the immediate present with an instant or a moment, imagined as either the end product of the infinite division of a period of time or as the interface between adjacent periods.

Since we cannot introduce a reference to what really happened in the past or will really happen in the future, which synthetic judgments of sense seem to demand, through the construction of even the most complex and extensive ideal content constituting a history of a possible world, how is the feat to be accomplished? Bradley’s solution is that although I can access reality only through a point of contact in immediate present experience, reality is not restricted to its appearance in my experience. The problem of appearance and reality is metaphysical and requires another book; but even at the level of logic it is clear that the identity of reality and what appears in experience is not mandated. “If the real must be ‘this’, must encounter us directly, we cannot conclude that the ‘this’ we take is all the real, or that nothing is real beyond the ‘this’.” (70) Being given in experience is not a quality of reality “in such a sense as to shut up reality within that quality.” (70) An ideal content can be true “because it is predicated of the reality, and unique because it is fixed in relation with immediate perception.” (72) Since immediate perception may involve an experience of change, a fragment of the temporal series may be abstracted and extended indefinitely through an ideal process.

Bradley has one further move to make to introduce the idea of a particular fact. “The idea of particularity implies two elements. We must first have a content qualified by ‘thisness’, and we must add to that content the general idea of reference to the reality.” (77) Without the second element, we have members that are exclusive within the series, “but the whole collection is not unique.” (77) For absolute uniqueness, we require the connection of the series with direct presentation. To think of tomorrow we may require a universal ideal content to connect it with today, but the day we think about is as unique as is today.

6. Hypothetical Judgments

Bradley handled universal judgments by reducing them to hypothetical form, but how can a hypothetical judgment be taken as true, since its antecedent is supposed, but not categorically affirmed? Modern logic evades this problem by treating hypothetical statements as truth-functional, but this evasion has consequences. For Bradley, the hypothetical judgment involves an ideal experiment. “The supposal is treated as if it were real, in order to see how the real behaves when qualified thus in a certain manner.” (86) The connection of the components is what is asserted in the hypothetical judgment, and it is this that has its ground in reality.
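
The truth-functional treatment mentioned here can be stated briefly in standard modern notation. The two entailments in the second line are one commonly cited consequence, offered as an illustration rather than as an example drawn from Bradley.

```latex
% Truth-functional (material) reading of "if p then q".
\[
p \rightarrow q \;\equiv\; \neg p \lor q
\]
% A false antecedent, or a true consequent, suffices for the conditional's truth.
\[
\neg p \models p \rightarrow q,
\qquad
q \models p \rightarrow q
\]
```

On this reading no connection between antecedent and consequent is asserted, whereas for Bradley the connection is precisely what the hypothetical judgment affirms.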

Bradley believes that not only are all universal judgments hypothetical, but also that all hypothetical judgments are universal. This may be thought doubtful, since there seem to be exceptions. “If this man has taken that dose, he will be dead in twenty minutes.” (89) This would not be necessarily true of any man who took the dose; but if the judgment is true, there will be some universal connection, even if restricted to the case of that specific man.

Bradley is assuming that the truth of a hypothetical statement must depend on some (possibly) latent feature of reality. Singular judgments, however, appear to connect us more directly with solid fact. The synthetic judgment of sense has its special status as categorical because of its connection with a reality actually given. It therefore depends on the analytic judgment of sense which assigns an ideal content to that given. Bradley has already argued that all universal statements are hypothetical. This is now widely accepted. He now moves to the startling claim that all singular statements are hypothetical, which he recognizes as an “unwelcome conclusion.” (91) Construed as categorical, analytic judgments of sense are all false, because they do not provide the whole truth about what is given in immediate experience, far less the whole truth about reality. This follows from his original story that an ideal content used in judgment is limited to part of the content of the given reality. But to say that the judgment is not the whole truth is not to say that it is not wholly true and hence partly false, even false tout court. Bradley complains that the choice of an ideal content to qualify the immediate given is arbitrary. Arbitrary is too strong, since the choice may very well have a purpose, but even if it were arbitrary, the assignment of universal content to the given reality would be just as true as the choice of any other content from the selection available.

Bradley is suggesting that the loosening of part of the content of the given reality that he introduced earlier as the very essence of thought is doomed to failure in advance. This is why he talks about “mutilation”. But the success or failure of the operation is surely relative to what it is intended to achieve. It is not designed to provide an ideal content that will be a complete characterization of reality as a whole; it has surely a much more limited aim. One idea is that loosening a part of the content is associated with separating out a segment of the given reality that conforms to the concept introduced. Loosening the concept of a dog from what I am given allows me to separate out Fido and perhaps other dogs within my field of view. The analytic judgment of sense that here is a dog would appear to be categorically true. This way of explaining the function of the judgments immediately associated with the loosening of ideal contents would allow Bradley, were he so minded, to make peace with logical systems, such as both Aristotelian and modern logic, that give a central position to the individual object. (This is essentially the problem of “special subjects”, discussed in Campbell: 1967.)

7. The Esoteric Doctrine

We have now come to a parting of the ways. If we accept the truth of analytic judgments of sense, such “judgments that analyze what is given in perception will all be categorical.” (106) Abstract, universal judgments will all be hypothetical. Synthetic judgments “about times and spaces beyond perception” (106) are also categorical, although they require inferences that rely on the universal. Bradley is prepared to allow those who lack the courage to follow him to a more esoteric theory “to remain at a lower point of view.” (106) Bradley, however, proposes a trip to a region where the “distinction between individual and universal, categorical and hypothetical, has been quite broken through.” (106) It is at this higher level that Bradley’s logic becomes so difficult, perhaps impossibly difficult. At the lower point of view, we separate out individual objects that we characterize through universal properties and relations in singular and plural judgments. Bradley begins the move to what is higher (or deeper) with the point that these individual objects are conditioned by the setting in which they are found. They are not unconditioned, but are asserted subject to a condition. What is subject to a condition can be asserted categorically, if the condition is taken as satisfied. Bradley is well aware that conditional and conditioned are not the same. “A thing is conditional on account of a supposal, but on the other hand it is conditioned by a fact.” (99) His argument is that for anything with a setting in space and time, the condition can never be satisfied. To introduce the series of conditions in space and time is to introduce a chain whose last link hangs unsupported in the air. This is a worrying argument, traditionally used to prove that the world must have a beginning in time (perhaps also a First Cause), or else by Kant to vindicate transcendental idealism. The assessment of how far it provides a solid support for what Bradley proposes to build on it will be postponed until section 18b.

Rejecting the categorical judgment that assigns an ideal content to the segment of reality from which it has been loosened, Bradley is left with no more than hypothetical judgments. These cannot even be our standard hypothetical judgments that are composites of categorical statements. They are mere husks, connecting adjectives: for example, “If lightning, then thundering.” Certainly, hypotheticals that connect adjectives are in a way also categorical, since they affirm a ground of connection in reality. But we have lost our standard hypothetical judgments and are left with mere scraps. Even more baffling is the replacement we are offered for a singular judgment in the higher point of view. “Instead of meaning by ‘Here is a wolf,’ or ‘This tree is green’ that ‘wolf’ and ‘green tree’ are real facts, it must affirm the general connection of wolf with elements in the environment, and of ‘green’ with ‘tree.’” (104)

Bradley offers a further explanation of his “unwelcome conclusion” in Terminal Essay II, which I discuss in section 18b, where I also offer a way of escape. In the meantime, he returns from the heights and provides a more mundane account of other kinds of judgment.

8. Other Types of Judgments

a. Negative Judgments

Bradley now turns to negative judgments. Negative judgments, he believes, are more complicated than affirmative, since they must begin with a suggestion that is rejected in the judgment. Moreover, this rejection must depend on the assumption of a positive ground of exclusion, even if what this is may not be known. Negative existential judgments are of particular interest. In “Ghosts do not exist,” the grammatical subject cannot be the real subject; the real subject is the nature of things to which we deny the quality of harboring ghosts. The positive character of reality that excludes ghosts is not, however, determined through the negative judgment. This entails that the same character of the real may exclude a variety of different suggestions. The suggestions excluded have their source in an ideal experiment and not in the nature of reality. The negative judgment affirms that some quality of the real excludes a suggestion, but it does not determine what quality that is. The truth of a negative judgment depends on a quality of the real incompatible with the quality excluded in the judgment. The true quality and the quality assigned in the judgment are thus contraries and not contradictories. The way in which a negative judgment presupposes a quality in what is real that we may not be able to specify may be compared with the way in which a hypothetical judgment presupposes the same kind of quality as grounding its connection. It follows that the negation of a hypothetical judgment would be the rejection of this sort of ground. The mere assertion of the antecedent and the negation of the consequent is indeed incompatible with the hypothetical judgment, but it is not its contradictory. A genuine contradictory would be strong enough to rule out counterfactual conditionals.
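
The closing point about negating a hypothetical can be set beside the truth-functional treatment in a rough sketch. The letter H, standing for Bradley’s non-truth-functional hypothetical “if p then q”, is an assumption introduced here, and the second line is only a loose rendering of the claim that the two judgments are incompatible without being contradictories.

```latex
% Truth-functional logic: negating a conditional is equivalent to asserting
% the antecedent and denying the consequent.
\[
\neg(p \rightarrow q) \;\equiv\; p \land \neg q
\]
% Bradley's position as described above, with H (an assumed symbol) for his
% hypothetical judgment: "p and not-q" rules out H, but the falsity of H
% does not require "p and not-q".
\[
p \land \neg q \models \neg H,
\qquad
\neg H \not\models p \land \neg q
\]
```

The gap between the two lines is where the rejection of a ground of connection, rather than a mere counterexample, does its work.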

b. Disjunctive Judgments

Bradley understands disjunction as providing a list of two or more mutually exclusive alternatives. He is willing to associate disjunction with a nest of hypothetical judgments, but since neither the hypothetical judgments nor the disjunction are truth-functional, the disjunctive judgment may have a certain categorical aspect. “Disjunctive judgment is the union of hypotheticals on a categoric basis.” (131)

Bradley connects disjunction with choice, where we make a selection from a number of alternatives. There is a definite list of possibilities; this is its categorical feature. We cannot use disjunctive addition to add in an arbitrary fashion another disjunct that is not a real possibility. In the same way, to say that something is colored is associated with a list of possibilities from which we select the actual color. To produce the disjunctive judgment that lists the varieties of color is to assign to the object categorically the property of being some kind of color, even if we do not know which color it is.

This example conforms to the template that Bradley favors in place of the form “either p or q or…” that is used today. Bradley treats the disjunctive judgment as a kind of singular judgment, with the format “A is either b or c or d….” This analysis will run into difficulties when A does not exist, but Bradley has met this problem before, and deals with it by replacing the grammatical subject with the real subject. This maneuver can even handle cases that seem most recalcitrant, such as “Either the light bulb is dead or the fuse has blown.” This would become: “Reality is either characterized by light bulb malfunction or fuse meltdown.”

9. Other Topics

a. Logical Principles

Chapter V examines logical principles. Bradley dismisses the Law of Identity as an empty tautology. Judgment requires the identity of differences, not provided by “A is A.” This means that the accusation (by Bertrand Russell) of confusing the “is” of predication with the “is” of identity cannot be fair, since for Bradley predication is the essence of judgment, whereas through the “is” of strict identity we do not make a judgment at all.

The most interesting part of the section on “The Principle of Contradiction” is the discussion of (Hegelian) dialectic. Bradley’s simple solution is that if the ideas combined in the synthesis are merely different, there is no problem. The ideas of self and other are different ideas, but no one would say that it is a contradiction to assert the existence of the self and other things as well. The challenge to the principle of contradiction arises only if the different ideas combined are taken to be discrepant or contrary, since the contrary of a given proposition entails its contradictory. Bradley offers a compromise according to which ideas that appear to be contrary are reconciled when harmonized within a wider reality. For example, opposite properties can be assigned to the same thing at different times.

The Law of Excluded Middle takes the form of a disjunctive judgment and would be expressed today as “either p or not p.” Bradley, however, has a different form for disjunction, so that his version of the principle will be: “A is either b or not-b.” A is not always a real particular thing, but sometimes reality as such. Indeed, if Bradley gets his way, the ultimate subject will always be reality. Excluded middle uses the variety of disjunction in which the number of disjuncts is exactly two. When the second disjunct is constructed as the negation of the first, there can be no other choice.

b. Extension and Intension

Bradley next tackles the familiar distinction between intension and extension in the chapter on the quantity of judgment, explaining that “in every symbol we separate what it means from that which it stands for.” (168) (This corresponds to Frege’s distinction between Sinn and Bedeutung.) His account of the extensional treatment of universal judgments such as “Dogs are mammals” is disappointing, because he fails to register that a set is a special kind of entity, suggesting that a set of dogs must be a pack of dogs, failing which the only alternative is the ludicrous idea of a collection of dog-images in the mind! With a proper notion of set in place, “Dogs are mammals” can be taken to assert a relation between two sets, just as many other judgments assert a relation between two objects.

Judgments founded on intension refer to the connection of attributes and meanings, and ignore the denotation of objects. Universal judgments based on meanings are those Kant considers strictly universal, because they do not permit even the possibility of exceptions. Not all universal judgments are of this type, and singular judgments never are. Our concept of what is real, denoted in a singular judgment, is the concept of the individual, which is both particular, excluding all other individuals, and universal, as unifying various characteristics and constituting an identity in difference. The real individual is a concrete universal: abstract universals, which can be separated from the individual in thought and applied elsewhere, cannot be real. In a similar way, what is truly individual is a concrete particular; abstract particulars that are nothing more than their distinction from other particulars are also unreal. “A reality in space must have spatial diversity, internal to itself.” (188) A point in space is distinct from all other points, but is a mere abstraction. A moment in time is also an abstraction; a concrete individual existing in time must have some duration.

c. Modality

Bradley rejects as erroneous the view that modal differences do not affect the actual content of the judgments involved. Certainly, you can take any judgment and “express any attitude of your mind towards it.” (198) These propositional attitudes are many and various. I may say: “I wish to make it” or “I fear to make it” or “I am forced to make it.” “All these are simple assertorical statements about my condition of mind.” (198) Statements about possibility and necessity do not, however, express my state of mind. They are assertions that claim objective truth. “There clearly can be but one kind of judgment, the assertorical. Modality affects not the affirmation, but what is affirmed.” (197) This is in line with the logic of Principia Mathematica, in which everything takes place under the aegis of the assertion sign. In this system, there is not even a corresponding negation sign, just a sign for the negation of a proposition. This is more extreme than Bradley, who does allow a distinct function of negation.

Thus, judgments of necessity and possibility have a special content not to be found in the corresponding assertoric judgment. For Bradley, “The possible and the necessary are special forms of the hypothetical.” (198) Necessity consists in a necessary connection between antecedent and consequent in a hypothetical judgment. To say that a fact is necessary is not to elevate it to a higher status, but merely to say that it is a necessary consequence of some other state of affairs, also taken as fact. As already explained, the connection through which the antecedent necessitates the consequent must itself depend on a categorical ground. This includes cases where we assert a necessary connection, because of a regular succession of events. Not that this ground has to be a necessary causal connection. “The real connection which seems the counterpart of the logical sequence, is in itself not necessary.” (206)

Bradley also connects the possible with the hypothetical. To say that something is possible is to say that some of its conditions are satisfied, excluding those specified in the antecedent of the associated hypothetical statement. “It is possible to see an eclipse of the moon tonight” means “If you get up early enough and the weather co-operates, you will see an eclipse of the moon.” To assert a potentiality or power or disposition is to commit to a hypothetical judgment stating that if certain other conditions are satisfied, a certain state of affairs will necessarily come to pass.

Bradley has a problem with modality because of his metaphysical vision of a Parmenidean Absolute Reality. Modal distinctions come to life with the conception of an open future, in which some things are unavoidable and others are possibilities among which we may choose. What is actual at the present time cannot be properly said to be either possible or necessary (Bradley gets this right!); although some things that have taken place were necessary and others were not. Without this kind of background, the conceptual scheme Bradley is discussing would not exist.

10. Judgment: Concluding Remarks

In his 1957 presidential address to the American Philosophical Association, “Speaking of Objects,” W.V. Quine presents the manifesto for the position of modern logic. “We persist in breaking reality down somehow into a multiplicity of identifiable and discriminable objects to be referred to by singular and general terms. We talk so inveterately of objects that to say we do seems almost to say nothing at all; for how else is there to talk?” The reality to which Quine referred at the beginning disappears under the carpet and is heard from no more. For Bradley, the reality that is broken down is, and has to be, the reality available in immediate experience. It is broken down through the faculty of thought and judgment, which introduces distinct individuals characterized through universal logical ideas. This makes possible singular and plural judgments involving qualities and relations. Not all judgments about what is real conform, however, to this template. There are genuine judgments about reality that bypass a reference to real individuals. Some such judgments modern logic may handle in other ways, but there are some that remain troublesome, such as judgments involving mass terms. Bradley’s system of logic is more flexible and can handle the variety we find.

The strength of Bradley’s theory of judgment is the flexibility through which it accommodates a variety of forms. Its weakness is that through insisting that the ultimate subject of judgment is reality, he seems to undermine the legitimacy of the singular and plural judgments on which we normally rely. One way to retain Bradley’s logic while rejecting the absolute monism of his metaphysical theory is to recognize that “reality” is itself a mass term. The later developments in the logic of mass terms that are proving such a headache for modern logic also make more palatable the logic of Bradley. Concepts, like “gold”, which do not by themselves package reality into units in the same way as count nouns like “dog”, can be used in various ways. They can be used in a singular judgment to refer to a piece of gold: they can be used in plural judgments to refer to pieces of gold: and there is also a third use, as in “Gold is yellow,” where the concept is associated with a mass term. (Interestingly, Bradley uses this very example (46) without noticing its special character.) The possibility of this third use surely does not invalidate the other uses in singular and plural judgments.

This explanation of the process described by Quine is, of course, given at Bradley’s lower point of view, but the use of a mass term to designate the setting for the individual object, in place of a string of other individuals, may well discourage the desire to move to the mysterious higher view. To isolate within the sensuous felt mass, designated by a mass term, an individual object associated with an ideal content loosened from what is given, seems about as good an account of the process of thought as we can get.

11. The Nature of Inference

Bradley moves on in Books II and III to the important topic of inference. There is a problem emerging from the distinction between analytic and synthetic judgments of sense introduced in Book I, in that the synthetic judgments move us beyond what is given in immediate experience and must involve some kind of inference. In a book on the principles of logic, Bradley must also engage with the traditional doctrine of the syllogism, which was taken to be the core of deductive inference. Bradley proposes in the second book to deal with deductive inferences generally agreed to be valid, without probing too deeply, then moving in a third book to a fundamental theory intended to cover all forms of inference.

He begins by setting out three features of inference with which it is difficult to disagree. First, the conclusion of an inference depends on a process of thought through which it is reached. Second, the process rests on a basis. “In inference, we advance from truth possessed to a further truth.” (245) Third, there must be a difference between basis and conclusion; otherwise, the supposed inference is a “senseless iteration.” (246)

Bradley makes a list of forms of deductive inference, casting his net more widely to capture specimens that do not usually appear in the textbooks of the day. The traditional syllogism cannot be taken as fundamental, since it does not cover all the forms that Bradley has listed, such as those empowered by transitive relations. Bradley describes the process of inference as an operation of synthesis which “takes its data and by ideal construction combines them into a whole.” (256) Logical connection, however, requires the identity of common links, such as the middle term in a syllogism. The first step is to form the whole: the second step is to extract the conclusion perceived within the whole by omitting parts that are no longer of interest. Bradley denies that there is any general principle that will serve as a test of the validity of reasoning. The traditional syllogism is not up to the job and no replacement can be found.

The common link required to combine premisses is both the same and different. “If it were not different it would have nothing to connect, and if it were not the same there could be no connection.” (288) But how can we have both identity and difference? The solution is that the common term is an ideal content “appearing in and differenced by two several contexts.” (288)

The process of inference depends entirely on this identity in difference. There are, however, two radically different kinds of identity that Bradley does not distinguish at this point. There are universal characters which are identical throughout their various instantiations (abstract universals) and there are individual objects that remain identical throughout their various appearances (concrete universals). These individuals may even combine characters that are in some sense discrepant, if they are extended in space or enduring in time. Caesar was in Gaul, and Caesar was in Italy. Both types of identity in difference can provide a ground for inference, even within traditional syllogistic logic. By suggesting that inference takes place only through the development of an ideal content and not via reference to an individual object, Bradley undermines the singular judgment and prepares the ground for a logical doctrine that downgrades it.

12. The Association of Ideas

The “association of ideas” is the name for a process that exists as a psychological fact; what Bradley is attacking is the empiricist account of this fact and the use of it to explain judgment and inference. The empiricist theories of David Hume and John Stuart Mill attempt to explain the life of the mind in terms of the association of ideas that are distinct existences or psychological atoms. The laws of association usually recognized are contiguity and similarity. Bradley argues that the empiricists do not have the resources even to state clearly their central position, and offers the following restatement: “Any element tends to reproduce those elements with which it has formed one state of mind.” (304) He calls this law “redintegration”, getting the term from Sir William Hamilton. The use of the qualification “tends” is standard for laws of association. Bradley insists that his law “does not exclude any succession of events which comes as a whole before the mind,” (305) which is, of course, vital for the explanation of causal inference.

In spite of a superficial resemblance, there is a chasm that divides Bradley’s redintegration and the association of the empiricists. Association is cohesion between psychical particulars: redintegration concerns the connection of universals, “which is an ideal identity within the individuals.” (306) Only an ideal connection in the mind can survive the disappearance of connected individuals. The impressions originally given in conjunction are gone and cannot be resurrected. Only the universal ideal content, the “what” as opposed to the “that” is left behind as a memory trace. Through the universals, we may perhaps be able to produce images that are, as it were, ghosts of the past, but these images will be fresh particulars and distinct existences that can be considered re-incarnations of the past, only in virtue of an ideal identity preserved through the universal.

In the empiricist theory developed, for instance, by John Stuart Mill, the bare contiguity of impressions was not considered to be by itself sufficient to operate the mechanism of association. Past contiguity can be operative only if the memory thereof is introduced through the similarity between a component in a past experience and a sensation now being enjoyed. But we still face the problem: “What has been called up has never been contiguous; and what has been contiguous cannot be called up.” (318) Not even similarity can resurrect what is now dead and gone. Similarity can exist, only if the similar terms both exist. Therefore, reproduction through similarity is not possible, since the similarity requires that what is reproduced is already there.

There are few traces surviving today in either psychology or philosophy of the theory demolished by Bradley. The violence of the rhetoric, although amusing, might be considered excessive, but in its day the theory was solidly entrenched, and dynamite may have been justified.

13. Inductive Inference

It seems that we often make inferences from particulars to particulars. We take note that Fido barks when approached by a stranger; we infer that Rover will do the same. Bradley denies that such inferences tacitly involve the inductive generalization that all dogs bark when approached by strangers, since people quite happy to make the inference from Fido to Rover might be reluctant to issue a general guarantee for all inferences of this type. This does not mean, however, that universals are not involved. The inference to the barking of Rover is based on a connection of ideal content, acquired through the encounter with Fido.

Bradley now turns to inductive generalization through which we reach a conclusion about all members of a certain class when only some members have been examined. This arena is the stamping ground of John Stuart Mill against whom Bradley directs his fire. Even if Mill’s Methods may be useful, standard textbooks agree that they are not logically sound. Bradley endorses the usual criticisms, and adds the point that in any case they do not take us from mere particulars to general truths, since the facts from which they begin are already conceptualized as instances of general kinds.

14. Inference: The Inclusive Theory

The story so far is that inference operates by combining premises that contain a ground of identity. A conclusion is reached by eliminating the middle term. Bradley now recognizes that this theory will not cover all forms of reasoning and sees the need for a third book in which to put things right. The original theory will handle the syllogism and many other arguments. What it does not cover is arguments where there is no elimination of a middle term, where the conclusion emerges as a structure incorporating A, B, and C on the basis of information relating A to B and B to C. An example may clarify what Bradley has in mind. We connect a day to the day before through the identity of the intervening night and the same day to the day after through a similar process. In this way we construct a succession of days that will constitute a history. This result will count as the conclusion of an inference in the wide sense.

Mathematics is also important in our cognitive life, and often not covered by the theory in Book II. Other exceptions are the processes of comparison and distinction. These are mental operations resulting in judgment, and are therefore inferences. Recognition is also inference, when we make the move from the perception of the man entering the room to the recognition of someone seen before.

Hegelian Dialectic also transcends the pattern permitted in the original theory. Bradley offers a heretical version that tones down the excesses of the orthodox view. Instead of supposing that the process begins in contradiction, Bradley suggests that our unrest begins in the recognition that the original datum is incomplete. The dialectical move is to complete the incomplete through positing a larger whole in which it is a component. This larger whole is itself seen to be incomplete, and the process is repeated. The way in which the incomplete is completed has its source in the subject. Although a dialectical move may have a source in past experience, the inferential move goes directly from the datum to what lies beyond, even if we are able sometimes to uncover a hypothetical judgment expressing the function that controls the inference.

Bradley is now ready to unveil general characteristics of inference. Because it is intended to cover all cases, this will have to be vague. In the beginning is a datum or data, followed by a mental operation, producing a result. For example, in the inference: “A to the right of B, and B of C, and therefore A to the right of C” (432), we begin with “two sets of terms in relations of space” (432) and put them together. This act of construction makes a difference, “but it does not make such a difference to the terms that they lose their identity.” (432-3) Nor do A and C change their identity when directly related in the conclusion. Inference makes a change, but it does not change the world. Bradley often describes inference as “ideal experiment.” It is a movement of thought that we make, but we are not compelled to take this path. If we have several premises, we are not compelled to put them together. The act of combination is arbitrary, in the sense that it is something that we choose, but might not have chosen. The act of inference is not a revision of the original data, although it introduces a fresh thought.
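The two-stage picture of the spatial example, in which the premisses are first put together into a whole and the conclusion is then read off, can be sketched in code. This is only a hedged illustration: the helper names `construct_whole` and `right_of` are hypothetical, and the sketch assumes that the premisses chain into a single line through shared terms.

```python
# Hypothetical sketch of "construction, then reading off" for a spatial series.

def construct_whole(premisses):
    """Combine ('x', 'y') pairs, each meaning 'x is to the right of y',
    into one left-to-right order. Assumes the pairs link up through
    identical terms at the ends of the series built so far."""
    order = []
    pending = list(premisses)
    while pending:
        progressed = False
        for right, left in list(pending):
            if not order:
                order = [left, right]
            elif order[-1] == left:
                order.append(right)
            elif order[0] == right:
                order.insert(0, left)
            else:
                continue  # no shared term yet; try again on a later pass
            pending.remove((right, left))
            progressed = True
        if not progressed:
            raise ValueError("premisses share no identical term")
    return order

def right_of(order, x, y):
    """Read off a single relation, ignoring the other terms in the whole."""
    return order.index(x) > order.index(y)

whole = construct_whole([("A", "B"), ("B", "C")])
print(whole)                      # ['C', 'B', 'A']
print(right_of(whole, "A", "C"))  # True
```

The terms A and C are the same items before and after the construction; only their arrangement in a single series is new, which mirrors the claim that the act of construction does not cost the terms their identity. Premisses with no shared term cannot be combined, and the sketch raises an error in that case.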

This makes sense where there is more than one premiss and an act of combination is required that depends upon the will of the agent. But Bradley discovers many inferences where the conclusion issues through the development of a single premiss. Certainly, there is no inference without mental activity in which we begin with a datum and end with a judgment predicating a fresh characteristic; but does such intellectual activity all count as inference? Standard inference involves “a construction round an identical centre” (457), but there are non-standard inferences in which there seems to be no given identity. However, the middle process, the operation leading from datum to conclusion, cannot “dispense with all identity.” (457) The mere co-presence of all my thoughts is not enough, since this does not explain the special identity that enables the inference. Take “recognition” and “dialectic”, where we are given a real thing with a quality and infer another quality. The inference depends on the connection of these qualities, and we might want to say that the middle term is the given quality. The problem is that the connection of the qualities is neither explicit nor given. “It is a function of synthesis, which never appears except in its effects.” (458) “It is a construction by means of a hidden centre.” (458)

Bradley distinguishes two operations associated with inference: synthesis and analysis. In synthesis the many become one; in analysis the one becomes many. Bradley makes a further distinction between analysis and elision. We may begin with a judgment about a given whole, move by analysis to a plural judgment about its elements, and then by elision reach a conclusion about specific elements. Central cases of inference in which premises are combined and a middle term eliminated involve both synthesis and analysis, but there are other inferences in which one or other operation is at least predominant.

Although they are different functions, analysis and synthesis have an intimate connection. In analysis, the elements in the result are separated, but this means that they are also combined in a latent synthetic unity. In synthesis, elements are combined, but the unity formed will be capable of analysis into the original components. “Analysis is the synthesis of the whole which it divides, and synthesis the analysis of the whole which it constructs.” (471) The crucial idea is the idea of the whole that analysis disassembles and synthesis constructs. In analysis we operate on an explicit whole that falls into the background. In synthesis we bring out the invisible totality comprehending the elements combined.

15. Inference and Judgment

With this wider conception of inference, it is getting harder to separate inference and judgment. Certainly, synthetic judgments of sense involve a substantial inferential component, but even a judgment that comes straight from presentation seems to involve the analysis and synthesis that is characteristic of inference. Judgment involves abstraction from the sensuous felt mass, and hence analysis. Judgments assigning various characters to reality involve synthesis. Bradley is certainly anxious to retain the distinction between judgment and inference. “Inference is an experiment performed on a datum,” whereas in judgments of perception “there is properly no datum.” (479) They do, indeed, have a basis, but this basis is for the intellect nothing. “It is a sensuous whole which is merely felt and is not idealized.” (479) Judgment is required to provide the ideal content from which inference takes its start. In judgments of perception we have no rational ground to justify our result and “the stuff, upon which the act is directed, is not intellectual.” (480) We can now, perhaps, make this clearer by explaining that the stuff in question is designated by a mass term.

The distinction between judgment and inference may not, however, be as sharp as one might like, as becomes clear when Bradley discusses the beginnings of our intellectual life. “The earliest judgment will imply an operation which, although it is not inference, is something like it; and the earliest reasoning will begin with a datum, which though kin to judgment, is not intellectual.” (481) “Experience starts with a stimulation coming in from the periphery [what John McDowell calls ‘a brute impact from the exterior’]; but….the stimulation must be met by a central response.” (481) Sensations do not “simply walk into the mind.” They are “the product of an active mental reaction.” (482) The senses may give us sensations, but “the gift contains traces of something like thought.” (482) The interface between cognition and the sensory input is murky indeed, but two things are clear. The response to the stimulus is not entirely arbitrary, nor is it a simple re-enactment of a given. Nothing is given until it is received!

16. Formal Logic

Bradley is hostile to the idea of a purely formal logic whose goal is to construct a system of valid patterns of inference, covering all cases through the use of blanks and variables. Partly, he does not believe that the goal can be achieved. More basically, his concern is that the attempt to reconstruct inference in terms of the manipulation of counters in accordance with rules breaks the connection between inference and that continued reference to reality that lies at its heart.

Inferences do, indeed, proceed in accordance with principles, and we can reject a principle employed by finding another similar inference in which the premiss is true and the conclusion false. In a particular inference, we can distinguish the principle from the matter involved, but we should not separate it and turn it into a major premiss in order to exhibit the argument as a syllogism. The principle is not a premiss, because it is not a datum but a function. There may sometimes be a point in replacing the original argument with such a syllogism, but this option will not always be available. Every inference depends on a principle that is not a premiss, as Lewis Carroll has shown in “What the Tortoise Said to Achilles.” Even Principia Mathematica has the Law of Substitution and the Law of Detachment that are not axioms of the system!
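The test mentioned above, rejecting a principle by finding a similar inference with true premisses and a false conclusion, can be sketched for simple propositional patterns. The sketch assumes a truth-functional reading of “if … then”, which is exactly the kind of formal treatment Bradley distrusts, so it illustrates the test itself and not his own theory. The function name `counterexamples` is hypothetical.

```python
from itertools import product

def counterexamples(premisses, conclusion, variables):
    """Return every assignment that makes all premisses true
    and the conclusion false."""
    found = []
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premisses) and not conclusion(v):
            found.append(v)
    return found

def implies(a, b):
    """Material conditional (an assumption of this sketch)."""
    return (not a) or b

# Modus ponens: if p then q; p; therefore q.
modus_ponens = counterexamples(
    [lambda v: implies(v["p"], v["q"]), lambda v: v["p"]],
    lambda v: v["q"],
    ["p", "q"],
)

# Affirming the consequent: if p then q; q; therefore p.
affirming_consequent = counterexamples(
    [lambda v: implies(v["p"], v["q"]), lambda v: v["q"]],
    lambda v: v["p"],
    ["p", "q"],
)

print(modus_ponens)          # []
print(affirming_consequent)  # [{'p': False, 'q': True}]
```

Note that the pattern being tested is applied by the function and is never itself listed among the premisses, which is in keeping with the point that the principle of an inference is a function and not a datum.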

17. Truth and Validity

So far the focus has been on the phenomenology of inference. But inference is important, not because it takes place, but because it is taken to have validity and justification. The problem is to explain how inference can have validity and justification in the face of the fundamental dilemma that Bradley identifies. Unless there is a transition from the premiss to a different conclusion, nothing has happened, and there is no inference; but if there is a difference between premiss and conclusion, how can we justify the intellectual move? Bradley dismisses the extreme claim that since they are different, there is an actual contradiction between premiss and conclusion. To assert the premisses is not to deny the conclusion: it is merely to fail to assert it until the inference is completed. But how is the eventual assertion of a different conclusion to be justified?

Logicians who do not challenge the legitimacy of the analytic judgment of sense can form a concept of truth that will allow them to explain that what is crucial for a valid inference is not that there be no change from premiss to conclusion, but merely that there be no change in the truth value from true to false. In the case of valid deductive inference this is guaranteed, because we merely re-arrange our information to make a certain element more salient. What changes is merely our knowledge of the relation implicit in the premisses. The act of inference requires an intervention by the subject that is arbitrary in the sense that it might not have taken place; but in the case of valid deductive inference, it is not an intervention that tampers with the truth. There is, perhaps, more interference by the subject when a decision is made to eliminate part of the original ideal content, as when we drop the middle term in the conclusion of a syllogism. Dropping ideal content even makes it possible that the conclusion is true, when the premisses contain error; but this does not matter, so long as it remains the case that if the premisses are true, the conclusion must also be true.
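The closing point, that a valid inference can yield a true conclusion from erroneous premisses once the middle term is dropped, can be illustrated with a toy extensional model. The sketch assumes that universal judgments are read as subset relations between sets, a reading Bradley himself does not adopt, and the sets are made-up miniature extensions.

```python
# Toy extensions (illustrative assumptions, not zoological data).
mammals = {"whale", "dog"}
fish = {"trout", "shark"}
whales = {"whale"}

def all_are(xs, ys):
    """'All X are Y' read extensionally as a subset relation."""
    return xs <= ys

# A syllogism in Barbara whose middle term is 'fish'.
premiss_1 = all_are(fish, mammals)     # "All fish are mammals"
premiss_2 = all_are(whales, fish)      # "All whales are fish"
conclusion = all_are(whales, mammals)  # "All whales are mammals"

print(premiss_1, premiss_2, conclusion)  # False False True
```

The form is valid because the subset relation is transitive, so true premisses would guarantee a true conclusion. Here both premisses are false, yet the conclusion, which no longer mentions the middle term, happens to be true.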

Perhaps deductive inference can be handled, if we do not probe too deeply, but Bradley now comes to a “rising sea” of non-deductive inferences that are not so easily controlled. In mathematical construction we may infer the extension of a given straight line to double its size, but this is not the deduction of a conclusion from a premiss. Comparison and distinction are also acts of the mind that are not deductive inference. It could be argued, indeed, that these acts are not in fact inferences at all, but rather forms of plural judgment, originally involving more than one object distinguished within immediate experience. Bradley, however, would not be greatly interested in this, since in his final view the distinction between judgment and inference is to be broken down.

The really serious problem, however, is empirical inference, including the prediction of the future on which we rely so heavily to carry out our purposes. Bradley took the first step at the beginning of The Principles of Logic when he introduced the loosening from the given experience of an ideal content that can be transferred elsewhere. This may explain how it is possible to formulate a belief about what will happen, but it does not explain why we choose to adopt the beliefs we do, or how these beliefs are to be justified. Suppose we abstract from immediate experience a conjunction of ideal elements. This may tempt us to imagine a similar conjunction in our representation of the future, but this would be justified, only if the connection of the elements were unconditioned and necessary. Since in abstracting the conjunction from the given experience it has been separated from the context in which it was found, it remains, as Bradley believes, conditioned by that context. Since this context is never completely known, the successful transfer of an ideal complex abstracted from the given context to a fresh context that may well be different cannot be guaranteed.

The recognition of the context in which the given ideal content is embedded undermines its guaranteed transfer elsewhere. Does it also undermine the analytic judgment of sense that predicates the content of immediate experience? This is what we are led to think in the move to the higher point of view, and it would be extremely serious, since it would destroy the very concept of true judgment. It is ironic that at the beginning of The Principles of Logic Bradley uncovers the source of true judgment in the predication of an ideal content of an immediate experience from which it has been loosened and with which it is necessarily connected. This explains how it is possible to transfer an ideal content extracted from immediate experience to a segment of reality not immediately experienced. Such judgments, of course, may be either true or false.

This system is available as a lower point of view for those who are unable to follow Bradley all the way. (It is also there as a fallback position, in the event that a fatal flaw is discovered in Bradley’s advanced reasoning, although Bradley himself does not seem to fear this possibility.) The lower point of view is happy enough with the argument that empirical inferences have no logical guarantee, since the given object involved in the premiss is embedded in a context, ultimately unknown. This argument establishes a conclusion to which everyone would agree. What cannot be accepted is the use of the same fact to break the tie between ideal content and object that constitutes true judgment. Without a viable concept of true judgment, even inference as we normally understand it will disappear, since the premisses and conclusion of an inference are all judgments, and a deductive argument is valid, if the conclusion must be true when the premisses are all true.

18. The Final Doctrine

a. Inference

We have been following the argument in the first edition of The Principles of Logic, in which Bradley tries to keep out the influence of his own metaphysical ideas, when operating at the lower level. This is fortunate, because it makes Bradley’s often insightful discussion available to logicians who would be appalled by his metaphysics. Bradley, as we know, is not ultimately satisfied with the lower point of view and feels compelled to move to a different position, where the influence of his metaphysical views can be detected. This difficult theory was not well understood, so that in the second edition of The Principles of Logic he included a set of terminal essays, which he hoped would provide a clearer exposition of his final views.

The original book began with judgment; the terminal essays begin with inference, which Bradley now moves to the center. “Every inference is the ideal self-development of a given object taken as real.” (598) This definition attempts to explicate inference without using the notion of judgment, which will later be explained as a kind of inference. Even the third member of the logical trinity, the universal idea, is partly concealed under cover as “the given object.” The given object must be ideal, since this is the only kind of entity capable of ideal self-development. Bradley’s definition of inference would have been much clearer if he had explained it as the ideal self-development of a logical idea taken as real. The concept of ideal self-development, however, contains a problem encountered before. If there is no change, there is no inference; but if there is change, then “the inference is destroyed.” (599) Bradley cannot take the usual line that the transition in inference from judgment to judgment is valid, so long as the preservation of truth is guaranteed. This would be circular, since he intends to explain judgment in terms of inference. Bradley’s solution relies on the double nature of the datum, considered in itself and as part of a systematic whole. This is what is involved in the reference of the ideal content to reality. This reference to reality, familiar from Bradley’s initial account of judgment, now turns out to mean “taken to be real, as being in one with Reality, the real Universe.” (598) This is the point of “taken as real” in the original definition. To take an ideal content as real is to identify it with Reality, in so far as it belongs to Reality.

We can now perhaps understand why Bradley replaces “logical idea” with “given object” in his initial definition. A logical idea can only be a part of a system of logical ideas, a system of thought. A given object, as normally understood and as understood within Bradley’s lower point of view, is a part of the real universe. It is the act of judgment that connects the domain of thought with the real world. It is judgment that predicates a logical idea of reality or of an object that belongs to reality. Without judgment, the only possible movement of thought is a movement along a stream of ideas. The only thing more real than a logical idea is a complete system of all ideas, and we have fallen into the clutches of Hegel! To adopt the term “given object” to denote logical ideas makes it difficult to use the same term to introduce concrete individuals constituting the universe.

The movement of inference can be illustrated in the Dialectical Method, in which we expand a given content through recognition of its incompleteness. The explicit premiss is “some distinguished content set before us.” (601) Implicit is “the entire Reality as an ideal systematic Whole.” “Every member in this system…develops itself through a series of more and more inclusive totalities until it becomes and contains the entire system.” (601) When I use this method, everything is necessary except where I begin and when I stop. For Bradley, however, such inferences are never fully satisfactory, since their ground is largely implicit and unknown.

Bradley goes on to consider in some detail other processes such as analysis, abstraction and comparison. His discussion of arithmetic is of surprising interest, because the construction of the natural number series does seem to make sense of the notion of ideal self-development. Each natural number develops itself through the successor function to introduce the number that follows it. The number three is an ideal content, since it is a universal property shared by all triples, so that the transition to four must lie in the domain of ideality.

The representation of space and time is constituted through a similar process involving the ideal self-development of a given space or time. Although these examples may illuminate the obscure notion of ideal self-development, they will not help to explain inference, if the construction of the successor of a natural number or the space and time that lies beyond what is given is not an inference. Inference is usually considered a movement of thought from judgment to judgment, from premiss to conclusion. This is not what happens when we extend a line or form a new number.

b. Judgment

Bradley, however, would not accept this, since he considers judgment itself to be a kind of inference in the wide sense. It is a kind of inference in which the ground that compels the judgment is not made explicit. Inference is present, even in the purest case of an analytic judgment of sense. As we have seen, Bradley recasts the judgment “S is P” in the form: “Reality is such that S is P.” The word “such” is the placeholder for the ground in reality that compels the conclusion “S is P.” Since this condition is unspecified and not completely specifiable, the inferential structure is merely implicit. This is a radical change, under the influence of Bosanquet, from Bradley’s original position, where judgment lies at the interface between the ideal and the actual, between the universal and particular, and is hence distinct from inference which is a movement within thought.

Bradley supports his change of heart by giving an example. Suppose I immediately experience A to the right of B and therefore form the judgment that A is to the right of B. There is, presumably, some sort of causal explanation for the relative position of these things. My objection is that any such condition for the existence of a state of affairs is not a truth condition for the corresponding judgment. It would be a truth condition only if it were incorporated in the judgment, which it is not. Even if I am prepared to say that A is to the right of B because John put it there, I am not saying that A is to the right of B, if John put it there. My statement is categorical, not conditional, and I will insist that A is to the right of B, even if it turns out that John is not responsible.

The objects A and B that are the special subjects of the plural judgment are necessarily selected from and connected with “our whole Universe.” (Presumably, this is our Universe, because it is connected with our immediate experience.) In a singular judgment the special subject is this reality, which is “some special and emphasized feature in the total mass.” (629) All such special subjects are conditioned by what lies beyond. Even without invoking the law of causality, they are all conditioned by their setting in space and time. Bradley argues that since the special subject of the judgment must be conditioned, even if its conditions are not known, the judgment itself cannot be unconditioned. “The object therefore remains conditioned by that which is unknown, and only on and subject to this unknown condition is the judgment true.” (631) This sentence explicitly identifies the existence conditions of the object with the truth conditions of the judgment. If we refuse to make this jump, we can remain comfortably at Bradley’s “lower point of view” and ignore the obscure and baffling complexities of the esoteric theory.

c. The Fundamental Problem of Thought

Even if we insist on a sharper distinction between judgment and inference than Bradley would allow, there is a general idea of a movement of thought that covers both activities. There may be some movements of thought we prefer to call judgments and others we call inferences, but Bradley’s purpose is to dig out what all acts of thought have in common. He believes he can state the fundamental problem without a final distinction between judgment and inference. Thinking is a process that reaches a result, and this implies the transcending of some initial state. It is not enough, however, that there be a mere succession of states. The movement of thought requires justification. The movement of thought must “satisfy the intellect.” In the case of inference, the satisfactory is called “valid”; in the case of judgment, the satisfactory is called “true.” In both cases the problem of the satisfaction condition is essentially the same. “Thought demands to go proprio motu…with a ground and reason…. Now to pass from A to B, if the ground remains external, is for thought to pass with no ground at all.” (Appearance and Reality, Note A, 501) We might suppose that in the case of deductive inference, there is an internal ground within the domain of ideas, although Bradley would not agree. But there is clearly no such internal justification for the inferential move in the case of non-deductive or empirical inferences. The success of empirical inferences or predictions depends on the way the world is or will be. Our general level of success depends on our living in a reasonably well-ordered world in which we have developed reliable systems for the acquisition of information.

Since the ground that justifies the movement of thought is the nature of reality, this ground can never be brought within thought without the identity of thought and reality. Nothing less than this will satisfy the intellect. This is the essentially Hegelian move to identify thought and reality by turning reality into a system of thought. Not that a finite center can ever reach an unconditioned completion of its thought. We may try to get as close as we can, and the closer we get to a final completion, the more truth our thought contains. As we expand our system of thought to make it more comprehensive, the truer it will become, so long as it remains harmonious and coherent. Although the goal of Thought in Dialectic may be to complete the incomplete, Bradley believes that there is more to reality than even a completed system of thought could provide. Bradley is not a Hegelian, because he denies that the completion of thought, even if it were possible, would be identical with the Absolute. He rejects the replacement of reality by “some spectral woof of impalpable abstractions, or unearthly ballet of bloodless categories.” (591) Although Bradley follows Kant in accepting the transcendental ideality of the series of phenomena, a position that provided a stepping stone for Hegel, Bradley refuses to accept this creation of the mind as the reality encountered in immediate experience. For Bradley, “it is the whole continuity of the total series which is absolutely based on ideal reconstruction. By means of this function, and this function alone, we have connected the past in one line with the present.” (587)

d. Immediate Experience and the Absolute

Immediate experience is associated with a cluster of ideas: “this”, “my”, “now”, “here”. What is immediately experienced is felt. “Feeling may be either used of the whole mass felt at any one time, or it may again be applied to some element in that whole.” (659) What I immediately experience is real enough, but this does not mean that everything real must be experienced by me. As less than reality as a whole, Bradley calls my immediate experience an appearance of reality. To Bradley, “it seems clear that we not only start from the given ‘this,’ but remain resting on that foundation throughout. Our whole ordered universe we may call a construction resting on immediate experience.” (661)

Bradley clearly retains the phenomenal realism at the heart of traditional empiricism, while rejecting the idea that immediate experience is a collection of distinct existences, which was responsible for its demise. Experience, for Bradley, is originally a sensuous, felt mass. This is particularly acceptable with the re-instatement of mass terms, excluded by the logic of Principia Mathematica.

For Bradley, a collection of distinct existences is not given, but emerges through an analysis carried out by thought. “I have to turn my experience into a disjunctive totality of elements.” (665) This is uncannily like Quine’s idea that “we persist in breaking reality down somehow into a multiplicity of identifiable and discriminable objects.” The connection is particularly striking, once we realize that special subjects, as well as Reality as a Whole, may extend beyond what is presented in immediate experience. The ideal contents, necessary to separate objects within the sensuous felt mass, do not confine these objects to their presentation in immediate experience. Because the contents are universal, they permit what Hume would call the continued existence of such real things beyond their appearance in my mind.

Bradley’s theory must be taken very seriously because of the detailed account that it offers of a process that Quine leaves shrouded in mystery. It may be understood as a way of fixing what is wrong with empiricism. It is harder to sympathize with the arguments that led Bradley to abandon what he calls the “lower point of view” and which may be based on a mistake.

19. References and Further Reading

a. Selected works by F. H. Bradley

The Principles of Logic. Oxford University Press, 1883; second revised edition including terminal essays, 1922.
- (This is the main source for Bradley’s logical theory.)
Appearance and Reality. Oxford University Press, 1893; second edition with appendix, 1897.
- (The metaphysical theory.)
Essays on Truth and Reality. Oxford University Press, 1914.
- (A collection of articles, for the most part originally published in Mind, and many on broadly logical topics.)
Collected Works. Thoemmes Press: Bristol, England and Sterling, Va., 1999.
- (Volume I contains Bradley’s notes for The Principles of Logic.)

b. Further Reading

Allard, J. W., 2005, The Logical Foundations of Bradley’s Metaphysics: Judgment, Inference, and Truth. Cambridge University Press.
Basile, Pierfrancesco, 1999, Experience and Relations: an Examination of F. H. Bradley’s Conception of Reality. Chapter 4.
Blanshard, Brand, 1939, The Nature of Thought. Two Volumes. London: George Allen & Unwin.
- (Especially, Chapter XIII: Bradley on Ideas in Logic and in Psychology.)
Bosanquet, Bernard, 1885, Knowledge and Reality, A Criticism of Mr. F. H. Bradley’s ‘Principles of Logic’. London: Kegan Paul, Trench.
Bradley, James (ed.), 1996, Philosophy after F. H. Bradley. Bristol: Thoemmes.
Bradley Studies, the journal of the Bradley Society, was published from 1995 to 2004.
- (It has now been succeeded by Collingwood and British Idealist Studies.)
Campbell, C. A., 1931, Scepticism and Construction: Bradley’s Sceptical Principle as the Basis of Constructive Philosophy. London: George Allen & Unwin.
Campbell, C. A., 1957, On Selfhood and Godhood. London: George Allen & Unwin.
- (Gifford Lectures delivered at the University of St. Andrews.)
Campbell, C. A., 1967, In Defence of Free Will. London: George Allen & Unwin.
- (Chapter XII. The Mind’s Involvement in Objects. This was originally published in 1962 as a contribution to Theories of the Mind, edited by Jordan M. Scher, published by the Free Press of Glencoe, a division of the Macmillan Company.)
Candlish, S., 2007, The Russell/Bradley Dispute and its Significance for Twentieth-Century Philosophy. Basingstoke: Palgrave Macmillan.
Ferreira, P., 1999, Bradley and the Structure of Knowledge. Albany: SUNY Press.
Ferreira, P., 2014, ‘Idealist Logic’ in The Oxford Handbook of British Philosophy in the Nineteenth Century, Oxford: Oxford University Press, pp. 111-132.
Hylton, Peter, 1990, Russell, Idealism, and the Emergence of Analytic Philosophy. Oxford University Press. Chapter 2.
Levine, James, 1998, “The What and the That: Theories of Singular Thought in Bradley, Russell and the Early Wittgenstein” in Appearance Versus Reality: New Essays on Bradley’s Metaphysics. Oxford: Clarendon Press.
Mander, W. J. (ed.), 1996, Perspectives on the Logic and Metaphysics of F. H. Bradley. Bristol: St. Augustine’s Press.
Mander, W.J., 2008, ‘Bradley’s Logic’ in D. Gabbay and J.H. Woods (eds.) Handbook of the History of Logic. Volume Four: British Logic in the Nineteenth Century, Elsevier, pp. 663-717.
Mander, W., 2011, British Idealism. A History. Oxford University Press.
Manser, A., 1983, Bradley’s Logic. Oxford University Press.
Peacocke, C., 1992, A Study of Concepts. Chapter 3. Cambridge MA and London: MIT Press.
- (This entry requires explanation, since Bradley is never mentioned in the book. Chapter 3 introduces scenarios, which are non-conceptual representational contents. As general, they qualify as ideal contents in Bradley’s sense. The positioning of scenarios in reality is therefore a special case of an act of judgment that refers an ideal content to a reality beyond the act. Peacocke is thus presenting the essence of Bradley’s position in an up-to-date form.)
Sprigge, T.L.S., 1993, James and Bradley. Chicago and La Salle, Illinois: Open Court. Part II. Chapters 2 and 3.
Wollheim, R., 1959, F. H. Bradley. Harmondsworth: Penguin Books.

Author Information

D. L. C. Maclachlan
Email: lorne.maclachlan@gmail.com
Queen’s University
Canada

Natural Deduction

Natural Deduction (ND) is a common name for the class of proof systems composed of simple and self-evident inference rules based upon methods of proof and traditional ways of reasoning that have been applied since antiquity in deductive practice. The first formal ND systems were independently constructed in the 1930s by G. Gentzen and S. Jaśkowski and proposed as an alternative to Hilbert-style axiomatic systems. Gentzen introduced a format of ND particularly useful for theoretical investigations of the structure of proofs. Jaśkowski instead provided a format of ND more suitable for practical purposes of proof search. Since then, many other ND systems of apparently different character have been developed.

What is it that makes them all ND systems despite the differences in the selection of rules, construction of proof, and other features? First of all, in contrast to proofs in axiomatic systems, proofs in ND systems are based on the use of assumptions which are freely introduced but discharged under some conditions. Moreover, ND systems use many inference rules of simple character which show how to compose and decompose formulas in proofs. Finally, ND systems allow for the application of different proof-search strategies. Thanks to these features, proofs in ND systems tend to be much shorter and easier to construct than in axiomatic or tableau systems. These properties make ND one of the most popular ways of teaching logic in elementary courses. In addition to its educational value, ND is also an important tool in proof-theoretical investigations and in the philosophy of meaning (specifically, of the meaning of logical constants). This article focuses on the description of the main types of ND systems and briefly mentions more advanced issues concerning normal proofs and proof-theoretical semantics.

1. History of Natural Deduction
a. Origins
b. Prehistory
2. Applications
3. Demarcation Problem
a. Wide and Narrow Sense of ND
b. Criteria of Genuine ND
4. Rules
5. Proof Format
a. Tree Proofs
b. Linear Proofs
6. Other Approaches
7. Rules for Quantifiers
8. ND for Non-Classical Logics
9. Normal Proofs
10. Philosophy of Meaning
11. References and Further Reading

1. History of Natural Deduction

When dealing with the history of ND, one should distinguish between the exact date when the first formal systems of ND were presented and much earlier times when the rules of ND were actually applied. Although one may claim that ND techniques have been used for as long as people have reasoned, it is unquestionable that the exact formulation of ND and the justification of its correctness had to wait until the 20th century.

a. Origins

The first ND systems were developed independently by Gerhard Gentzen and Stanisław Jaśkowski and presented in papers published in 1934 (Gentzen 1934, Jaśkowski 1934). Both approaches, although different in many respects, realized the same basic idea: a formally correct systematization of traditional means of proving theorems in mathematics, science and ordinary discourse. This was a reaction to the artificiality of the formalization of proofs in axiomatic systems. Hilbert’s proof theory offered high standards of precise formulation of the notion of proof, but formal axiomatic proofs were quite different from the ‘real’ proofs offered by mathematicians. The process of actual deduction in axiomatic systems is usually complicated and needs a lot of invention. Moreover, the resulting proofs are usually lengthy, hard to decipher and far from the informal arguments provided by mathematicians. In informal proofs, techniques such as conditional proof, indirect proof or proof by cases are commonly used; all are based on the introduction of arbitrary, temporarily accepted assumptions. Hence the goals of Gentzen and Jaśkowski were twofold: (1) a theoretical and formally correct justification of traditional proof methods, and (2) a system which supports actual proof search. Moreover, Gentzen’s approach provided the programme for proof analysis which strongly influenced modern proof theory and philosophical research on theories of meaning.

b. Prehistory

According to some authors, the roots of ND may be traced back to Ancient Greece. Corcoran (1972) proposed an interpretation of Aristotle’s syllogistic in terms of inference rules and proofs from assumptions. One can also look for the genesis of ND systems in Stoic logic, where many researchers (for example, Mates 1953) identify a practical application of the Deduction Theorem (DT). But all these cases, even if we agree with the arguments of historians of logic, are only examples of using some proof techniques. There is no evidence of theoretical interest in their justification.

In fact, the introduction of DT into the realm of modern logic seems to be one of the most important steps on the way leading eventually to the discovery of ND. Although Herbrand did not present a formal proof of it for axiomatic systems until Herbrand (1930), he had already stated it in Herbrand (1928). At the same time, Tarski (1930) included DT as one of the axioms of his Consequence Theory; in practice he had used it since 1921. Other ND-like rules were also applied in practice in the 1920s by many logicians from the Lvov-Warsaw School, like Leśniewski and Salamucha, as is evident from their papers.

Jaśkowski was strongly influenced by Łukasiewicz, who posed the following problem at his Warsaw seminar in 1926: how to describe, in a formally proper way, proof methods applied in practice by mathematicians. In response to this challenge Jaśkowski presented his first formulation of ND in 1927, at the First Polish Mathematical Congress in Lvov, mentioned in the Proceedings (Jaśkowski 1929). A final solution was delayed until Jaśkowski (1934) because Jaśkowski had a lengthy break in his research due to illness and family problems. Gentzen also published the first part of his famous paper in 1934, but his first results are present in Gentzen (1932). This early paper, however, is concerned not with ND but with the first form of Sequent Calculus (SC). Gentzen was influenced by Hertz (1929), where a tree-format notation for proofs, as well as the notion of a sequent, were introduced. One can also look for a source of the shape of his rules in Heyting’s axiomatization of intuitionistic logic (see von Plato 2014).

It should be no surprise that two logicians, with no knowledge of each other’s work, independently proposed quite different solutions to the same problem. Axiom systems, although theoretically satisfying, were considered by many researchers to be practically inadequate and artificial. Thus the need for more practice-oriented deduction systems was in the air.

2. Applications

This article distinguishes at least three main fields of application of ND systems: practical, theoretical and philosophical.

Since 1934, many systems called ND have been offered by the authors of numerous textbooks on elementary logic. In this way ND systems became a standard tool of working logicians, mathematicians, and philosophers. At least in the Anglo-American tradition, ND systems prevail in teaching logic. They also had a strong influence on the development of other types of non-axiomatic formal systems such as sequent calculi and tableau systems. In fact, the former were also invented by Gentzen as a theoretical tool for investigating the properties of ND proofs, whereas the latter may be seen (at least in the case of classical logic) as a further simplification of sequent calculus that is easier to apply in practice.

But the importance of ND is not only practical. Since the 1960s, the works of Prawitz (1965) and Raggio (1965) on normal proofs have opened up a theoretical perspective on the applications of ND. In fact, Prawitz was rediscovering things known to Gentzen but not published by him, as was later shown by von Plato (2008). In addition to extended work on normalization of proofs, ND is also an interesting tool for investigations in theoretical computer science through the Curry-Howard isomorphism. This approach shows that (normal) ND proofs may be interpreted in terms of executions of programs.
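The proofs-as-programs reading can be illustrated with a small sketch. The Python below is only an informal illustration, not part of any ND system discussed here: it assumes the usual reading in which a proof of a conjunction is a pair and a proof of an implication is a function, and all function names (`and_intro`, `and_comm`, `detour`, and so on) are hypothetical. Python does not check the type hints at run time, so they serve only as documentation of the formulas being "proved".

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def and_intro(a: A, b: B) -> Tuple[A, B]:
    """Conjunction introduction: a proof of A and a proof of B give a pair."""
    return (a, b)


def and_elim_left(p: Tuple[A, B]) -> A:
    """Conjunction elimination (left): project the first component."""
    return p[0]


def and_elim_right(p: Tuple[A, B]) -> B:
    """Conjunction elimination (right): project the second component."""
    return p[1]


def imp_elim(f: Callable[[A], B], a: A) -> B:
    """Implication elimination (modus ponens): apply the function."""
    return f(a)


def and_comm(p: Tuple[A, B]) -> Tuple[B, A]:
    """A 'proof' of (A and B) -> (B and A).

    Defining the function plays the role of implication introduction:
    the parameter p is the assumption that gets discharged.
    """
    return and_intro(and_elim_right(p), and_elim_left(p))


def detour(a: A, b: B) -> A:
    """An introduction immediately followed by an elimination.

    Evaluating this call simply returns a, which mirrors the way
    normalization removes such a detour from a proof.
    """
    return and_elim_left(and_intro(a, b))
```

For instance, `imp_elim(and_comm, (1, "x"))` evaluates to `("x", 1)`, and `detour(1, "x")` evaluates to `1`, the same value that the detour-free program `a` would give directly.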

Finally, the special form of the rules of ND provided by Gentzen led to extensive studies on the meaning of logical constants. This article takes a look at theoretical and philosophical applications of ND in sections 9 and 10.

3. Demarcation Problem

The great variety of systems called ND leads to some theoretical problems concerning the precise meaning of the term ‘ND’. It seems that no generally accepted definition of ND systems has been offered. This demarcation problem has been investigated by many authors, and different criteria have been proposed for establishing what is, and what is not, an ND system. A detailed survey of these matters may be found in Pelletier (1999) or in Pelletier and Hazen (2012); this article points out only the most important features.

a. Wide and Narrow Sense of ND

Some authors tend to use the term in a broad sense in which it covers almost everything that is not an axiomatic system in Hilbert’s sense. Hence systems like sequent calculi or tableau calculi are sometimes treated as ND systems. All these systems are in fact closely related, but this article chooses to consider ND only in the narrow sense. There are at least three reasons for making this choice:

Historical. The narrow sense follows the original ideas of Gentzen, who introduced two systems: NK (natürlicher Kalkül) and LK (logistischer Kalkül). The former is just an ND system, whereas the latter, a sequent calculus, was meant as a technical tool for proving some metatheorems on NK, not as a kind of ND.
Etymological. ND is supposed to reconstruct, in a formally proper way, traditional ways of reasoning. It is disputable whether existing ND systems realize this task in a satisfying way, but certainly systems like tableaux or SC are even worse in this respect.
Practical. Taking the term ND in a wide sense would be a classifying operation of doubtful usefulness. From the point of view of this article’s presentation, it is more convenient to use a more narrowly defined concept.

b. Criteria of Genuine ND

But what criteria should be used for delimiting the class of systems called ND? Many proposals seem to be too narrow (that is, strict) since they exclude some systems usually treated as ND, so it is better not to be very demanding in this respect. An ND system should therefore satisfy three criteria:

Possibility of entering and eliminating (discharging) additional assumptions during the course of the proof. Usually it requires some bookkeeping devices for indicating the scope of an assumption, that is, for showing that a part of the proof (a subproof) depends on a temporary assumption, and for marking the end of such a subproof, that is, the point at which the assumption is “discharged”.
Characterization of logical constants by means of rules rather than axioms. The role of axioms is taken over by the set of primitive rules for introduction and elimination of logical constants, which means that elementary inferences instead of formulas are taken as primitive.
The richness of forms of proof construction. Genuine ND systems admit a lot of freedom in proof construction and in the possibility of applying several strategies of proof-search.

These three conditions seem to be the essential features of any ND system. These characteristics are quite general, but the third at least serves to exclude tableau systems and sequent calculi, since genuine ND should allow both direct and indirect proofs, proofs by cases, and so forth. This flexibility of proof construction is vital for ND, whereas in a standard tableau system, for example, we have only indirect proofs and elimination rules. On the other hand, the criteria do not require that the rules of an ND system strictly realise the schema of providing a pair of introduction and elimination rules for each constant, or that axioms be disallowed.

4. Rules

An ND system consists of a set of (schemata of) simple rules characterising logical constants. For example, the conjunction connective $\wedge$ is characterised by means of the following rules:

$$\begin{array}{ccc}(\wedge I)\ \ \dfrac{\varphi, \psi}{\varphi\wedge\psi}\quad&(\wedge E)\ \ \dfrac{\varphi\wedge\psi}{\varphi}\quad&(\wedge E)\ \ \dfrac{\varphi\wedge\psi}{\psi}\end{array}$$

where $\varphi$ and $\psi$ denote any formulas. Material above the horizontal line represents the premises; and that below represents the conclusion of the inference. The letters $I$ and $E$ in the names of the rules come from “introduction” and “elimination” respectively since the first allows introduction of a conjunction into a proof, and the second allows for its elimination in favor of simpler formulas. Often the following horizontal notation is applied (instead of vertical which is more space-consuming):
\begin{gather*}
(\wedge E) \quad \varphi \wedge \psi \vdash \varphi, \quad \varphi \wedge \psi \vdash \psi \\
(\wedge I) \quad \varphi , \psi \vdash \varphi \wedge \psi
\end{gather*}
Here $\vdash$ is used to point out that the relation of deducibility holds between premises and the conclusion of a rule instance. In what follows, such phrases are called sequents. In fact such deducibility statements in general do not uniquely characterise inference rules, but it does no harm so they are used in what follows for simplicity’s sake.

One can easily check that the rules stated above adequately characterise the meaning of classical conjunction which is true iff both conjuncts are true. Hence the syntactic deducibility relation coincides with the semantic relation of $\models$, that is, of logical consequence (or entailment). Unfortunately not all logical constants may be characterised by means of such simple rules. For example, implication $\rightarrow$ in addition to modus ponens (or detachment rule):

$$(\rightarrow E)\quad \varphi \rightarrow \psi , \varphi \vdash \psi$$

which is known from axiomatic systems, requires a more complex rule $(\rightarrow I)$ of the shape:

$$\begin{array}{cc}& [\varphi] \\ & \vdots \\ \Gamma\ & \psi \\ \hline & \varphi \rightarrow \psi\end{array}$$

or:

$(\rightarrow I)$ If $\Gamma$ , $\varphi\vdash\psi$ then $\Gamma\vdash\varphi\rightarrow\psi$

where $\Gamma$ and $\varphi$ form a collection of all active assumptions previously introduced which could have been used in the deduction of $\psi$. When inferring $\varphi\rightarrow\psi$, one is allowed to discharge assumptions of the form $\varphi$. The fact that after deduction of $\varphi\rightarrow\psi$ this assumption is discharged (not active) is pointed out by using [ ] in vertical notation, and by deletion from the set of assumptions in horizontal notation. The latter notation shows better the character of the rule; one deduction is transformed into the other. It shows also that the rule $(\rightarrow I)$ corresponds to an important metatheorem, the Deduction Theorem, which has to be proved in axiomatic formalizations of logic. In what follows, all rules of the shape $\Gamma\vdash\varphi$ will be called inference rules, since they allow for inferring a formula (conclusion) from other formulas (premises) present in the proof. Rules of the form:

If $\Gamma_1 \vdash \varphi_1, \ldots, \Gamma_n \vdash \varphi_n$, then $\Gamma \vdash \varphi$

will be called proof construction rules since they allow for constructing a proof on the basis of some proofs already completed. One characteristic feature of such rules is that they involve the process of entering new assumptions as well as conditions under which one can discharge these assumptions and close subordinated proofs (or subproofs) starting with these assumptions.
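The distinction between inference rules and proof construction rules can be illustrated in a proof assistant whose term language follows the ND pattern. The following is a minimal sketch in Lean 4 syntax (an assumption of this illustration, not a system discussed in the article); the propositional letters `p`, `q` and the hypothesis names are chosen only for the example.

```lean
-- Inference rules: the conclusion is obtained directly from the premises.
-- (∧I)
example (p q : Prop) (hp : p) (hq : q) : p ∧ q := And.intro hp hq
-- (∧E), both variants
example (p q : Prop) (h : p ∧ q) : p := h.left
example (p q : Prop) (h : p ∧ q) : q := h.right
-- (→E), modus ponens
example (p q : Prop) (hpq : p → q) (hp : p) : q := hpq hp

-- Proof construction rule (→I): `fun hp => ...` opens a subproof under the
-- temporary assumption `hp : p`; `hp` is discharged when the `fun` is closed.
-- Here the context Γ consists of the single assumption `hq : q`.
example (p q : Prop) (hq : q) : p → (p ∧ q) := fun hp => And.intro hp hq
```

In the last example the name `hp` is not available outside the `fun`, which plays the role of the bookkeeping device marking the scope of the assumption.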

The complete set of rules provided by Gentzen for IPL (Intuitionistic Propositional Logic) is the following:

$$\begin{array}{ll} (\bot E) & \bot \vdash \varphi \\ (\neg E) & \varphi , \neg \varphi \vdash \bot \\ (\neg I) & \text{If } \Gamma , \varphi \vdash \bot \text{, then } \Gamma \vdash \neg \varphi \\ (\wedge I) & \varphi , \psi \vdash \varphi \wedge \psi \\ (\wedge E) & \varphi \wedge \psi \vdash \varphi \text{ and }\varphi \wedge \psi \vdash \psi \\ (\rightarrow E) & \varphi , \varphi \rightarrow \psi \vdash \psi \\ (\rightarrow I) & \text{If }\Gamma , \varphi \vdash \psi \text{, then }\Gamma \vdash \varphi \rightarrow \psi \\ (\vee I) & \varphi \vdash \varphi \vee \psi \text{ and }\psi \vdash \varphi \vee \psi \\ (\vee E) & \text{If }\Gamma , \varphi \vdash \chi \text{ and }\Delta , \psi \vdash \chi \text{, then }\Gamma , \Delta , \varphi \vee \psi \vdash \chi\end{array}$$

What is evident from this set of rules is the Gentzen policy of characterising every constant by a pair of rules, in which one is the rule for introducing a formula with that constant into a proof, and the other is the rule of elimination of such a formula, that is, inferring some simpler consequences from it, sometimes with the aid of other premises. More will be said about philosophical consequences of this approach in section 10.

In order to obtain CPL (Classical Propositional Logic), Gentzen added the Law of Excluded Middle $\neg \varphi \vee \varphi$ as an axiom, but the same result can easily be obtained by a suitable inference rule of double negation elimination: $\neg \neg \varphi \vdash \varphi$, or by changing one of the proof construction rules, namely $(\neg I)$, which encodes the weak form of indirect proof, into the strong form:

$(\neg E)$ If $\Gamma , \neg \varphi \vdash \bot$, then $\Gamma \vdash \varphi$

This solution was applied by Jaśkowski (1934).
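The difference between the weak and the strong form of indirect proof can be sketched in Lean 4 (again only as an illustration; `Classical.byContradiction` and `Classical.em` are names from Lean's core library, not part of Gentzen's or Jaśkowski's systems). The weak form needs no classical principle, whereas the strong form, double negation elimination and excluded middle all rely on one.

```lean
-- Weak indirect proof (¬I): from a refutation of p conclude ¬p.
-- In Lean, ¬p unfolds to p → False, so no classical principle is used.
example (p : Prop) (h : p → False) : ¬p := h

-- Strong indirect proof: from a refutation of ¬p conclude p.
example (p : Prop) (h : ¬p → False) : p := Classical.byContradiction h

-- Double negation elimination is the same principle in rule form.
example (p : Prop) (h : ¬¬p) : p := Classical.byContradiction h

-- Excluded middle (stated here with the disjuncts in Lean's order).
example (p : Prop) : p ∨ ¬p := Classical.em p
```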

5. Proof Format

In addition to providing suitable rules, one must also decide about the form of a proof. Two basic approaches due to Gentzen and Jaśkowski are based on using trees as a representation of a proof and on using linear sequences of formulas. This article focuses on the most important differences between these two approaches. For detailed comparison see Pelletier and Hazen (2014), and Restall (2014).

a. Tree Proofs

Let us start with an example of a proof in Gentzen’s format, that is, as a tree of formulas:

$$\begin{array}{cl} \underline{[p]^1\hspace{.5cm} [p\rightarrow q]^3}\hspace{2cm} & ass. \\ \underline{q \hspace{2cm} [q \rightarrow r]^2} & (\rightarrow E) \\ \underline{\hspace{1cm}r\hspace{1cm}} & (\rightarrow E) \\ \underline{\hspace{1cm}p \rightarrow r^1\hspace{1cm}} & (\rightarrow I) \\ \underline{\hspace{.5cm}(q \rightarrow r)\rightarrow (p\rightarrow r)^2\hspace{.5cm}} & (\rightarrow I) \\ (p\rightarrow q)\rightarrow ((q \rightarrow r)\rightarrow (p\rightarrow r))^3 & (\rightarrow I) \end{array}$$

Here the root of a tree is labelled with a thesis and its leaves are labelled with (discharged) assumptions: $p\rightarrow q, q\rightarrow r$ and $p$. All assumptions were discharged while $(\rightarrow I)$ was applied successively building implications from $r$ — the numbers of assumptions indicate the order in which they were discharged, and the suitable number is attached to the formula inferred by the assumption discharging rule. Before that, $r$ was deduced by two applications of $(\rightarrow E)$, first to two assumptions (active at this moment), then to the third assumption and previously deduced $q$.

Gentzen’s tree format of representing proofs has many advantages. It is an excellent representation of real proofs; in particular, deductive dependencies between formulas are directly shown. But if we are concerned with actual deduction, this format of proof is far from being useful and natural. Moreover, one is often forced to repeat identical, or very similar, parts of the proof, since, in tree format, inferences are conducted not on formulas but on their particular occurrences. For example, if $\varphi \wedge \psi$ is an assumption from which we need to infer both $\varphi$ and $\psi$, then a suitable branch starting with $\varphi \wedge \psi$ must be displayed twice. The following example illustrates the point:

$$\begin{array}{cl} \hspace{.5cm}\underline{[p\wedge (q \wedge p \rightarrow r)]^2}\hspace{2.5cm} & ass.\\ \underline{[q]^1\hspace{1cm} p}\hspace{2cm}\underline{[p\wedge (q \wedge p \rightarrow r)]^3} & (\wedge E) \\ \underline{q \wedge p \hspace{3cm} q \wedge p \rightarrow r} & (\wedge I), (\wedge E) \\ \underline{\hspace{.5cm}r\hspace{.5cm}} & (\rightarrow E) \\ \underline{\hspace{.5cm}q \rightarrow r^1\hspace{.5cm}} & (\rightarrow I) \\ (p\wedge (q \wedge p \rightarrow r))\rightarrow (q\rightarrow r)^{2, 3} & (\rightarrow I)\end{array} $$

Here, the attachment of two numerals $2, 3$ to the formula in the last line indicates that both occurrences of the same assumption were discharged in this step.
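For comparison, the same thesis can be written as a proof term in which the assumption is introduced once under a name and then referred to twice, so no part of the proof has to be repeated. This is a sketch in Lean 4 syntax, with hypothesis names chosen for the illustration.

```lean
-- h : p ∧ (q ∧ p → r) is assumed once and used twice:
-- h.left yields p, h.right yields q ∧ p → r.
example (p q r : Prop) : (p ∧ (q ∧ p → r)) → (q → r) :=
  fun h hq => h.right (And.intro hq h.left)
```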

Gentzen himself was aware of the disadvantages of his representation of proof, but it proved useful for his theoretical interests described in section 9. It is not surprising that the tree format of proofs is mainly used in theoretical studies on ND, as in Prawitz (1965) or Negri and von Plato (2001).

b. Linear Proofs

Jaśkowski, on the other hand, preferred a linear representation of proofs since he was interested in creating a practical tool for deduction. Linear format has many virtues over Gentzen’s approach. For example, inferences are drawn from assumptions rather than from their occurrences, which means that, for example, one needs to assume $\varphi \wedge \psi$ only once to derive both conjuncts. It is also more natural to construct a linear sequence trying, one by one, each possible application of the rules. But there is a price to be paid for these simplifications—the problem of subordinated proofs. How should we represent that some assumption and its subordinated proof are no longer alive because a suitable proof construction rule was applied? If we apply a proof construction rule which discharges an assumption, we must explicitly show that the subordinate proof dependent on this assumption is dead in the sense that no formula from it may be used below in the proof. In a tree format this is not a problem—to use a formula as a premise for the application of some inference rule we must display it (and the whole subtree which provides a justification for it) directly above the conclusion. In linear format this leads to problems, and some technical devices are necessary which forbid using the assumptions and other formulas inferred inside completed subproofs. Jaśkowski proposed two solutions to this problem: graphical (boxes) and bookkeeping (in the terminology of Pelletier and Hazen 2012). Let us compare these two simple proofs:

On the left we have an example of a proof in graphical mode where each assumption opens a new box in which the rest of the proof is carried out. On the other hand when a suitable proof construction rule is applied, the current subproof is boxed which means that nothing inside is allowed in further proof construction. In lines 3 and 5 an additional rule of repetition (often called reiteration) is applied which allows for moving formulas from outer to inner boxes. On the right the same proof is represented in bookkeeping style where instead of boxes we use prefixes (sequences of natural numbers) for indicating the scope of an assumption. Each assumption is preceded with the letter S from Latin suppositio and adds a new numeral to the sequence of natural numbers in the prefix. When a proof construction rule is applied, the last item is subtracted from the prefix. Hence a thesis can occur with an empty sequence, signifying that it does not depend on any assumption. No repetition rules are applied in this version of Jaśkowski’s system; hence the proof is two lines shorter.

Although Jaśkowski finally chose the second option (perhaps due to editorial problems), nowadays the graphical approach is far more popular, probably due to the great success of Fitch’s textbook (1952) which popularized a simplified version of Jaśkowski’s system (now called Fitch’s approach). In Fitch’s system one uses vertical lines for indicating subproofs. Below is an example of a proof in Fitch’s format:

Other devices were also applied such as brackets in Copi (1954), or even just indentation of subordinate proofs. Jaśkowski’s original boxes were used by Kalish and Montague (1964) with an additional device of great heuristic value: each box is preceded by a show-line which displays the current aim of the proof. Show-lines are not parts of a proof in the sense that one is forbidden to use them as premises for rule application. But after completing a subproof, a box is closed and the opening show-line becomes a new ordinary line in the proof (which is pointed out by deleting the prefix “show”).

The second solution of Jaśkowski was not so popular. One can mention here Quine’s system (1950) (with asterisks instead of numerals) or Słupecki and Borkowski’s system (1958) popular in Poland.

6. Other Approaches

Gentzen (1936) introduced yet another variant of ND which may be considered as lying between his first system described in subsection 5a and his famous sequent calculus. It shows another possible way of arranging the bookkeeping of active assumptions. As a result, in this approach the basic items which are transformed in proofs are not formulas but rather sequents. For example, both rules for conjunction are of the form:

$$\begin{array}{ll} (\wedge I’) & \text{If }\Gamma \vdash \varphi \text{ and } \Delta \vdash \psi\text{, then } \Gamma, \Delta \vdash \varphi \wedge \psi \\ (\wedge E’) & \text{If } \Gamma \vdash \varphi \wedge \psi\text{, then } \Gamma \vdash \varphi \text{;} \ \ \text{If } \Gamma \vdash \varphi \wedge \psi\text{, then } \Gamma \vdash \psi \end{array}$$

where $\Gamma , \Delta$ are records of active assumptions.

The full list of rules for CPL contains also:

$$\begin{array}{ll} (\neg I’) & \text{If } \Gamma, \varphi \vdash \psi \text{ and } \Delta, \varphi \vdash \neg\psi\text{, then } \Gamma, \Delta \vdash \neg\varphi \\ (\neg E’) & \text{If } \Gamma \vdash \neg\neg\varphi\text{, then } \Gamma \vdash \varphi \\ (\rightarrow E’) & \text{If } \Gamma \vdash \varphi \text{ and } \Delta \vdash \varphi \rightarrow \psi\text{, then } \Gamma, \Delta \vdash \psi \\ (\rightarrow I’) & \text{If } \Gamma , \varphi \vdash \psi\text{, then } \Gamma \vdash \varphi \rightarrow \psi \\ (\vee I’) & \text{If } \Gamma \vdash \varphi\text{, then } \Gamma \vdash \varphi \vee \psi\text{;} \ \ \text{If } \Gamma \vdash \psi\text{, then } \Gamma \vdash \varphi \vee \psi \\ (\vee E’) & \text{If } \Gamma \vdash \varphi \vee \psi \text{ and } \Delta, \varphi \vdash \chi \text{ and } \Lambda , \psi \vdash \chi\text{, then } \Gamma , \Delta , \Lambda \vdash \chi \end{array} $$

Assumptions are sequents of the form $\varphi \vdash \varphi$. Theses are sequents with an empty antecedent. Here is an example of a proof:

$$\begin{array}{c} \underline{p \vdash p\hspace{1cm} p\rightarrow q \vdash p\rightarrow q}\hspace{4cm} \\ \underline{p, p\rightarrow q\vdash q \hspace{3cm} q \rightarrow r\vdash q\rightarrow r} \\ \underline{p, p\rightarrow q, q\rightarrow r \vdash r} \\ \underline{p\rightarrow q, q\rightarrow r \vdash p \rightarrow r} \\ \underline{p\rightarrow q \vdash (q \rightarrow r)\rightarrow(p\rightarrow r)} \\ \vdash (p\rightarrow q)\rightarrow ((q\rightarrow r)\rightarrow(p\rightarrow r)) \end{array}$$

One can observe that in the context of such a system the difference between inference and proof construction rules disappears. The only difference is that in the former all transformations are performed on consequents of sequents whereas in the latter some operations (that is, subtractions) are allowed also on antecedents. This is the difference from Gentzen’s ordinary sequent calculus where we have rules introducing constants to antecedents of sequents (instead of rules of elimination). Of course one can go further and allow this kind of rule as well (such a system was constructed, for example, by Hermes 1963), but it seems that Gentzen’s choice offers significant simplifications. First of all, the tree format is not necessary, and one can display proofs as linear sequences since the record of active assumptions is kept with every formula in a proof (as the antecedent). Moreover, since no operation except subtraction is carried out on antecedents, we can get rid of formulas in antecedents and use instead numerals of lines where suitable assumptions were introduced into proofs. Both simplifications are present in Suppes’ system (1957) of ND where the same proof looks as follows:

$$\begin{array}{lcll} 1 & \{1\} & p\rightarrow q & \text{ass.} \\ 2 & \{2\} & q\rightarrow r & \text{ass.} \\ 3 & \{3\} & p & \text{ass.} \\ 4 & \{1, 3\} & q & 1, 3, \ (\rightarrow E) \\ 5 & \{1, 2, 3\} & r & 2, 4, \ (\rightarrow E) \\ 6 & \{1, 2\} & p\rightarrow r & 5, \ (\rightarrow I) \\ 7 & \{1\} & (q \rightarrow r)\rightarrow (p\rightarrow r) & 6, \ (\rightarrow I) \\ 8 & \varnothing & (p\rightarrow q)\rightarrow ((q \rightarrow r)\rightarrow (p\rightarrow r)) & 7, \ (\rightarrow I) \end{array}$$
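The bookkeeping in the proof above is mechanical enough to be sketched as a small program. The following Python fragment is only an illustration under simplifying assumptions: it covers just assumptions, $(\rightarrow E)$ and $(\rightarrow I)$, represents atoms as strings and implications as tuples, and all names (`SuppesProof`, `imp`, `assume`, `imp_elim`, `imp_intro`) are hypothetical rather than taken from Suppes.

```python
from dataclasses import dataclass


def imp(a, b):
    """Build the implication a -> b; atoms are plain strings."""
    return ("->", a, b)


@dataclass(frozen=True)
class Line:
    deps: frozenset   # numbers of the assumption lines this line depends on
    formula: object


class SuppesProof:
    def __init__(self):
        self.lines = {}  # line number -> Line

    def _add(self, deps, formula):
        n = len(self.lines) + 1
        self.lines[n] = Line(frozenset(deps), formula)
        return n

    def assume(self, formula):
        # An assumption depends exactly on itself.
        return self._add({len(self.lines) + 1}, formula)

    def imp_elim(self, i, j):
        """(->E): line i is phi -> psi, line j is phi; dependencies are united."""
        major, minor = self.lines[i], self.lines[j]
        f = major.formula
        if not (isinstance(f, tuple) and f[0] == "->" and f[1] == minor.formula):
            raise ValueError("premises do not match (->E)")
        return self._add(major.deps | minor.deps, f[2])

    def imp_intro(self, i, k):
        """(->I): discharge assumption line k from the dependencies of line i."""
        body, assumption = self.lines[i], self.lines[k]
        if assumption.deps != frozenset({k}):
            raise ValueError("line k is not an assumption")
        return self._add(body.deps - {k}, imp(assumption.formula, body.formula))


# The eight-line proof shown above.
proof = SuppesProof()
proof.assume(imp("p", "q"))   # 1
proof.assume(imp("q", "r"))   # 2
proof.assume("p")             # 3
proof.imp_elim(1, 3)          # 4
proof.imp_elim(2, 4)          # 5
proof.imp_intro(5, 3)         # 6
proof.imp_intro(6, 2)         # 7
proof.imp_intro(7, 1)         # 8

for n, line in proof.lines.items():
    print(n, sorted(line.deps), line.formula)
```

The only operations performed on dependency sets are union and subtraction, which reflects the observation that antecedents need not carry formulas at all.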

Other solutions generalising standard proof representations were also considered. One can mention at least two approaches without going into details: ND operating on clauses instead of formulas (Borićić 1985, Cellucci 1992, Indrzejczak 2010) and ND admitting subproofs as items in the proof (Fitch 1966, Schroeder-Heister 1984).

7. Rules for Quantifiers

Gentzen (1934) also provided the first set of ND rules adequate for CFOL (Classical First-Order Logic) whereas the rules of Jaśkowski’s system characterised the weaker system of IFOL (Inclusive First-Order Logic) which admits empty domains in models. As pointed out by Bencivenga (2014), a minimal relaxation of Jaśkowski’s rules yields also Free Logic, that is, a logic allowing non-denoting terms, hence it may be claimed that it is the first formalization of Universally Free Logic, that is, allowing both empty domains and non-denoting terms.

Before characterising Gentzen’s original rules for quantifiers let us note that he was using two sorts of symbols to distinguish between free and bound individual variables. The former are often called individual parameters. Such a solution simplifies a formulation of rules and eliminates the risk of a clash of variables while applying the rules. When we provide ND rules for more standard approaches with just individual variables which may have free or bound occurrences, we must be careful to define precisely the operation of proper substitution of a term for all free occurrences of a variable. ‘Proper’ means that no occurrence of a free variable substituted for another (or, when function-symbols are used, within a term substituted for a variable) gets bound by a quantifier. For simplicity’s sake we will keep Gentzen’s solution; let $x, y, z$ denote (bound) variables and $a, b, c$ free variables or individual parameters. Gentzen’s rules are the following:

$$\begin{array}{ll} (\forall E) & \forall x\varphi \vdash \varphi[x/a] \\ (\exists I) & \varphi [x/a] \vdash \exists x\varphi \\ (\forall I) & \text{If }\Gamma \vdash \varphi [x/a] \text{, then }\Gamma \vdash \forall x\varphi \\ (\exists E) & \text{If }\Gamma \vdash \exists x\varphi\text{ and } \Delta, \varphi[x/a] \vdash \psi\text{, then }\Gamma, \Delta \vdash \psi \end{array}$$

where $\varphi[x/a]$ denotes the operation of substitution, that is, of replacing all free occurrences of $x$ in $\varphi$ with a parameter $a$. In case of $(\forall I)$ and $(\exists E)$ a parameter $a$ is required to be “fresh” in the sense of having no other occurrences in $\Gamma , \Delta, \varphi, \psi$. Such a fresh $a$ is sometimes called an ‘eigenvariable’ or a ‘proper variable’.

The last rule in Gentzen’s tree format looks as follows:

$$\begin{array}{crc} \Gamma & & [\varphi[x/a]], \Delta \\ \vdots & & \vdots \\ \exists x\varphi & & \psi \\ \hline & \psi & \end{array}$$
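The four quantifier rules, including the freshness requirement, can be sketched in Lean 4 (an illustration only; the type `α`, the predicates `P`, `Q` and the hypothesis names are assumptions of the example). In such a setting the eigenvariable condition is enforced by scoping: the parameter `a` bound by `fun a => ...` cannot occur in the surrounding hypotheses or in a conclusion stated outside its scope.

```lean
-- (∀E): a universal statement is instantiated with any term a.
example (α : Type) (P : α → Prop) (a : α) (h : ∀ x, P x) : P a := h a

-- (∃I): a witness a together with a proof of P a.
example (α : Type) (P : α → Prop) (a : α) (h : P a) : ∃ x, P x :=
  Exists.intro a h

-- (∀I): `fun a => ...` introduces a fresh parameter a; it does not occur
-- in the hypothesis h, so the result may be universally quantified.
example (α : Type) (P Q : α → Prop) (h : ∀ x, P x ∧ Q x) : ∀ x, P x :=
  fun a => (h a).left

-- (∃E): from ∃ x, P x and a derivation of q from P a for an arbitrary a,
-- conclude q; q is declared outside the scope of a, so a cannot occur in it.
example (α : Type) (P : α → Prop) (q : Prop)
    (h : ∃ x, P x) (hq : ∀ a, P a → q) : q :=
  Exists.elim h hq
```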

Although Gentzen provided this set of rules for his tree-system of ND, it was easily adapted also to linear systems based on Jaśkowski’s (or Suppes’) format of proof. Let us illustrate their application in Fitch’s proof format (but not with his original rules):

The first application of $(\forall E)$ introduces a parameter $a$ in place of $x$. In lines 3 and 7 the assumptions for the applications of $(\exists E)$ in lines 5 and 10, respectively, are introduced, each time with a new eigenparameter in place of $y$. Note that both applications of $(\exists E)$ are correct since neither $b$ nor $c$ is present in the formulas ending the respective subproofs. Also the application of $(\forall I)$ in line 6 is correct since $a$ is not present in line 1.

The fact that $(\forall I)$ is a proof construction rule is obscured here since there is no need to introduce a subproof by means of a new assumption. We just require that in order to apply $(\forall I)$ there be no occurrence of an involved parameter (here $a$) in active assumptions. However, there are systems of ND where such a subproof (usually flagged with a fresh parameter which will be universally quantified below) is explicitly introduced into a proof. For instance, Fitch’s original rule is based on such a solution; in fact it follows closely Jaśkowski’s original rule for the inclusive general quantifier.

Gentzen’s $(\exists E)$ was sometimes considered as complex and artificial, and some inference rules were proposed instead where $\varphi[x/a]$ is directly inferred and not assumed. Although the idea is simple, its correct implementation leads to trouble. Careful formulations of such a rule (as in Quine 1950) are correct but hard to follow; simple formulations (as in several editions of Copi 1954) make the system unsound. For a detailed analysis of the relations between Gentzen-style and Quine-style quantifier rules one should consult Fine (1985), Hazen (1987) and Pelletier (1999). All these problems with providing correct and simple rules for quantifiers led some authors to doubt if it is really possible (see Anellis 1991). It seems that the only correct system of ND for CFOL with a ‘really’ simple rule of this kind is in Kalish and Montague (1964), but this is rather a side-effect of the overall architecture of the system which is not discussed here (but see a detailed explanation of the virtues of Kalish and Montague’s system in Indrzejczak 2010).

8. ND for Non-Classical Logics

ND systems were also offered for many important non-classical logics. In particular, Jaśkowski’s graphical approach is very handy in this field due to the machinery of isolated subproofs. It turned out that for many non-classical logics one can obtain a satisfying result by putting restrictions on the rule of repetition in the case of some subproofs. Let us take as an example the ND formalization of the well-known propositional modal logic T; for simplicity we restrict considerations to rules for $\Box$ (necessity). $(\Box E)$ is obvious: $\Box \varphi \vdash \varphi$. With $(\Box I)$ the situation is more complicated since it is based on the following principle:

If $\varphi_1, \ldots, \varphi_n \vdash \psi$, then $\Box\varphi_1, \ldots, \Box\varphi_n \vdash \Box\psi$

where formulas in the antecedent are also being changed by addition of $\Box$. It is realised by means of a special ‘modal’ subproof which is opened with no assumption, but no other formulas may be put in it except those which were preceded by $\Box$ in outer subproofs (and with $\Box$ deleted after transition). If in such modal subproof we deduce $\psi$, it can be closed and $\Box\psi$ can be put into the outer subproof. The following proof in Fitch’s style illustrates this:

In line 4 a modal subproof was initiated which is shown by putting a sole $\Box$ in place of the assumption. Lines 5 and 6 result from the application of modal repetition. Such an approach may be easily extended to other modal logics by modifying conditions of modal repetition; for example, for S4 it is enough to admit that formulas with $\Box$ (no deletion) also may be repeated; for S5, formulas with negated $\Box$ are also allowed. Such an approach to modal logics was initiated by Fitch (1952); an extensive study of such systems can be found in Fitting (1983), Garson (2006) and Indrzejczak (2010), where some other approaches are also discussed.
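As a separate, minimal illustration of modal repetition in T (with its own line numbering, independent of the proof discussed above), the following sketch derives $\Box q$ from the assumptions $\Box(p\rightarrow q)$ and $\Box p$. Indentation stands in for the vertical line of the modal subproof; the layout is an assumption of this sketch rather than Fitch's exact notation.

$$\begin{array}{lll} 1 & \Box(p\rightarrow q) & \text{ass.} \\ 2 & \Box p & \text{ass.} \\ 3 & \quad \Box & \text{modal subproof opened} \\ 4 & \quad p\rightarrow q & 1, \ \text{modal repetition} \\ 5 & \quad p & 2, \ \text{modal repetition} \\ 6 & \quad q & 4, 5, \ (\rightarrow E) \\ 7 & \Box q & 3\text{--}6, \ (\Box I) \end{array}$$

Lines 4 and 5 enter the modal subproof with their leading $\Box$ deleted, as the rule for T requires; under the S4 variant described above, $\Box p$ itself could also have been repeated unchanged.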

This mode of formalizing logics in ND was also applied to other non-classical logics including conditional logics (Thomason 1970), temporal logics (Indrzejczak 1994) and relevant logics (Anderson and Belnap 1975). In the latter, however, the technique of restricted repetition is not enough (and is not even required for some logics of this kind). Far more important is the technique of labeling all formulas with sets of numbers annotating active assumptions, which is necessary for keeping track of relevance conditions. Subsequently, the application of labels of different kinds has in fact become one of the most popular techniques, used not only in tableau methods but also in ND. Vigano (2000) provides a good survey of this approach.

9. Normal Proofs

When constructing proofs one can easily make some inferences which are unnecessary for obtaining a goal. Gentzen was interested not only in providing an adequate system of ND but also in showing that everything which may be proved in such a system may be proved in the most straightforward way. As he put it, in such a proof “No concepts enter into the proof other than those contained in its final result, and their use was therefore essential to the achievement of the result” (Gentzen 1934).

In particular, such unnecessary moves are performed if one first applies some introduction rule for logical constant $c$ and then uses the conclusion of this rule application as a premise for the application of the elimination rule for $c$. In such cases the final conclusion is either already present in the proof (as one of the premises of respective introduction rule) or may be directly deduced from premises of the application of introduction rule. For example, if one is deducing $\varphi\rightarrow\psi$ on the basis of $(\rightarrow I)$ and then by $(\rightarrow E)$ is deducing $\psi$ from this implication and $\varphi$, then it is simpler to deduce $\psi$ directly from $\varphi$; the existence of such a proof is guaranteed because it is a subproof introducing $\varphi\rightarrow\psi$. Let us call a maximal formula any formula which is at the same time the conclusion of an introduction rule and the main premise of an elimination rule. A proof is called normal iff no maximal formula is present in it. Roughly speaking we can obtain such a proof if first we apply elimination rules to our assumptions (premises) and then introduction rules to obtain the conclusion. Such proofs are analytic in the sense of having the subformula property: all formulas occurring in such a proof are subformulas or negations of subformulas of the conclusion or premises (undischarged assumptions).
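The removal of maximal formulas can be sketched as a small program operating on derivations. The fragment below is only an illustration under stated assumptions: it covers the $\wedge$ and $\rightarrow$ rules alone, represents derivations as nested tuples with labelled assumptions, assumes that the input encodes a correct derivation (otherwise the procedure need not terminate), and uses hypothetical names (`normalize`, `subst`, `free_labels`) and generated labels of the form `_0`, `_1` that are assumed not to clash with user labels.

```python
import itertools

# Derivations as nested tuples:
#   ("ass", x)          open assumption labelled x
#   ("and_i", d1, d2)   (∧I)
#   ("and_e", i, d)     (∧E); i = 0 selects the left conjunct, 1 the right
#   ("imp_i", x, d)     (→I), discharging the assumptions labelled x in d
#   ("imp_e", d1, d2)   (→E); d1 derives the implication, d2 its antecedent

_fresh = itertools.count()


def free_labels(d):
    """Labels of the assumptions that are still open in d."""
    tag = d[0]
    if tag == "ass":
        return {d[1]}
    if tag == "imp_i":
        return free_labels(d[2]) - {d[1]}
    if tag == "and_e":
        return free_labels(d[2])
    return free_labels(d[1]) | free_labels(d[2])  # and_i, imp_e


def subst(d, x, e):
    """Replace every open assumption labelled x in d by the derivation e."""
    tag = d[0]
    if tag == "ass":
        return e if d[1] == x else d
    if tag == "and_e":
        return ("and_e", d[1], subst(d[2], x, e))
    if tag == "imp_i":
        y, body = d[1], d[2]
        if y == x:
            return d  # x is discharged here, so no open x occurs inside
        if y in free_labels(e):
            # Rename the discharged label so it cannot capture an open
            # assumption of e.
            z = f"_{next(_fresh)}"
            body = subst(body, y, ("ass", z))
            y = z
        return ("imp_i", y, subst(body, x, e))
    return (tag, subst(d[1], x, e), subst(d[2], x, e))  # and_i, imp_e


def normalize(d):
    """Remove maximal formulas: an introduction followed by its elimination."""
    tag = d[0]
    if tag == "ass":
        return d
    if tag == "and_i":
        return ("and_i", normalize(d[1]), normalize(d[2]))
    if tag == "imp_i":
        return ("imp_i", d[1], normalize(d[2]))
    if tag == "and_e":
        inner = normalize(d[2])
        if inner[0] == "and_i":          # (∧I) followed by (∧E)
            return inner[1 + d[1]]
        return ("and_e", d[1], inner)
    major, minor = normalize(d[1]), normalize(d[2])
    if major[0] == "imp_i":              # (→I) followed by (→E)
        return normalize(subst(major[2], major[1], minor))
    return ("imp_e", major, minor)


# A derivation with two detours: (∧I) then (∧E) inside (→I) then (→E).
detour = ("imp_e",
          ("imp_i", "x", ("and_e", 0, ("and_i", ("ass", "x"), ("ass", "y")))),
          ("ass", "z"))
print(normalize(detour))  # ('ass', 'z')
```

The $\rightarrow$ case is the one described in the text: the subproof that introduced $\varphi\rightarrow\psi$ already derives $\psi$ from $\varphi$, so the derivation of $\varphi$ is plugged in for the discharged assumption and the implication itself disappears.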

Although the idea of a normal proof is rather simple to grasp it is not so simple to show that everything provable in an ND system may have a normal proof. In fact for many ND systems (especially for many non-classical logics) such a result does not hold. Gentzen proved such a result directly for an ND system for Intuitionistic Logic, but he was unable to provide a proof for his ND for Classical Logic. He did not publish the proof for the Intuitionistic case and instead he provided the result for both his ND systems indirectly. First he introduced an auxiliary technical system of sequent calculus and proved for it (both in the classical and intuitionistic cases) the famous Cut-Elimination Theorem. Then he showed that this result implies the existence of a normal proof for every thesis and valid argument provable in his ND systems. Such a result is usually called the Normal Form Theorem whereas the stronger result showing directly how to transform every ND-proof into normal proof by means of a systematic procedure is called the Normalization Theorem. That Gentzen indeed proved the Normalization Theorem for the Intuitionistic case became known recently due to von Plato (2008) who found a preliminary draft of Gentzen’s thesis. The first published versions of proofs of Normalization theorems appeared in the 1960s due to Raggio (1965) and Prawitz (1965) who proved this result also for ND systems for some non-classical logics. For a detailed account of these problems see Troelstra and Schwichtenberg (1996) or Negri and von Plato (2001).

One thing should be noticed with respect to proofs in normal form. Although normal proofs are in a sense the most direct proofs, this does not mean that they are the most economical. In fact, non-normal proofs often may be shorter and easier to understand than normal ones. Perhaps it is simpler to understand if we recall that normalization in ND is the counterpart of cut-elimination in sequent calculi. Applications of cuts in proofs correspond to applications of previously proved things as lemmas and may drastically shorten proofs. When a proof is normalized, its size may grow exponentially (see, for example, Boolos 1984, Fitting 1996, D’Agostino 1999). What is important in normal proofs is that, due to their conceptual simplicity, they provide a proof theoretical justification of deduction and a new way of understanding the meaning of logical constants.

10. Philosophy of Meaning

Aesthetics was not the only reason for insisting on having both introduction and elimination rules for every constant in Gentzen’s ND. He also wanted to realise a deeper philosophical intuition concerning the meaning of logical constants. It is claimed that if a set of rules is intuitive and sufficient for adequate characterisation of a constant, then it in fact expresses our way of understanding this constant. Moreover, such an approach may be connected with Wittgenstein’s program of characterization of meaning by means of the use of words. In this particular case the meaning of logical constants is characterised by their use (via rules) in proof construction. There is also a strong connection with the anti-realist position in the philosophy of meaning where it is claimed that the notion of truth may be successfully replaced with the notion of a proof (Dummett 1991). One recent, and very strong, version of this trend is represented in Brandom’s (2000) program of strong inferentialism, where it is postulated that the meanings of all expressions may be characterised by means of their use in widely understood reasoning processes. However, inferentialism is not particularly connected with ND nor with the specific shapes of rules as giving rise to the meaning of logical constants.

Leaving aside the far-reaching program of inferentialism, one can quite reasonably ask whether the characteristic rules of logical constants may be treated as definitions. The term ‘Proof-Theoretic Semantics’ first appeared in 1991 (Schroeder-Heister 1991), but the roots of this idea are certainly linked with Gentzen (1934). He himself preferred introduction rules as a kind of definition of a constant. Elimination rules are just consequences of these ‘definitions’, not in the sense of being deducible from them but in the sense that their application is a kind of inversion of introduction rules. The notion of inversion was precisely characterised by Prawitz’s principle of inversion (see Prawitz 1965): if by the application of an elimination rule $r$ we obtain $\varphi$, then proofs sufficient for deduction of premises of $r$ already contain a deduction of $\varphi$. Hence one can directly obtain $\varphi$ on the basis of these proofs with no application of $r$. As these sufficient conditions for deductions of premises are characterised by introduction rules, we can easily see that the inversion principle is strongly connected with the possibility of proving normalization theorems; it justifies making reduction steps for maximal formulas in normalization procedures.

Not all authors dealing with proof-theoretic semantics followed Gentzen in his particular solutions. Popper (1947) was the first who tried to construct deductive systems in which all rules for a constant were treated together as its definition. There are also approaches (such as Dummett 1991, chapter 13, and Prawitz 1971) in which elimination rules are treated as the most fundamental. No matter which kind of rules should be taken as basic for characterization of logical constants, it is obvious that not just any set of rules may be treated as a candidate for definition. Prior (1960) drew attention to this fact by means of his famous example. Let us consider a connective “tonk” characterised by the following rules:

(tonk I) $\varphi \vdash \varphi$ tonk $\psi$
(tonk E) $\varphi$ tonk $\psi \vdash \psi$

One can easily show that any formula is deducible from any formula after adding such rules to an ND system. However, Prior’s example only showed that one should carefully characterise conditions of correctness for rules which are proposed as a tool for characterisation of logical constants. One of the first proposals is due to Belnap (1962) who emphasized that, just as for definitions, rules must be noncreative in the sense that if we add them to some ND system, then we obtain its conservative extension. In other words, if some formula with no occurrence of this new constant was not deducible in the ‘old’ system, then it is still not deducible in the extended system. Rules for “tonk” do not satisfy this requirement. Although Belnap’s solution is not sufficient, he opened the door for further research of such conditions. The term “(proof-theoretic) harmony” is widely used for specification of such adequacy conditions for rules, and there is a large amount of literature concerned with this question. Schroeder-Heister (2014) provides one of the recent solutions to this problem whereas Schroeder-Heister (2012) offers extensive discussion of other approaches.
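The trivialising effect of the rules for tonk can be displayed as a three-line derivation in the linear format used earlier, where $\varphi$ and $\psi$ are arbitrary formulas:

$$\begin{array}{lll} 1 & \varphi & \text{ass.} \\ 2 & \varphi \ \text{tonk} \ \psi & 1, \ (\text{tonk } I) \\ 3 & \psi & 2, \ (\text{tonk } E) \end{array}$$

Since neither $\varphi$ nor $\psi$ needs to contain tonk, the sequent $\varphi \vdash \psi$ becomes derivable for formulas of the old language, which is exactly the failure of conservativeness that Belnap's requirement rules out.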

11. References and Further Reading

[1] Anderson, A. R. and N. D. Belnap, Entailment: the Logic of Relevance and Necessity, vol. I, Princeton University Press, Princeton 1975.
[2] Anellis, I. H., ‘Forty Years of “Unnatural” Natural Deduction and Quantification. A History of First-Order Systems of Natural Deduction from Gentzen to Copi’, Modern Logic, 2(2):113-152, 1991.
[3] Belnap, N. D., ‘Tonk, Plonk and Plink’, Analysis, 22(6):130-134, 1962.
[4] Bencivenga, E., ‘Jaśkowski’s Universally Free Logic’, Studia Logica, 102(6):1095-1102, 2014.
[5] Boolos, G., ‘Don’t eliminate Cut’, Journal of Philosophical Logic, 7:373-378, 1984.
[6] Boričić, B. R., ‘On Sequence-conclusion Natural Deduction Systems’, Journal of Philosophical Logic, 14:359-377, 1985.
[7] Borkowski, L. and J. Słupecki, ‘A Logical System based on rules and its applications in teaching Mathematical Logic’, Studia Logica, 7:71-113, 1958.
[8] Brandom, R., Articulating Reasons. An Introduction to Inferentialism, Harvard University Press, Cambridge 2000.
[9] Cellucci, C., ‘Existential Instantiation and Normalization in Sequent Natural Deduction’, Annals of Pure and Applied Logic, 58:111-148, 1992.
[10] Copi, I. M., Symbolic Logic, The Macmillan Company, New York 1954.
[11] Corcoran, J., ‘Aristotle’s Natural Deduction System’, in: J. Corcoran (ed.), Ancient Logic and its Modern Interpretations, Reidel, Dordrecht 1972.
[12] D’Agostino, M., ‘Tableau Methods for Classical Propositional Logic’, in: M. D’Agostino et al. (eds.), Handbook of Tableau Methods, pp. 45-123, Kluwer Academic Publishers, Dordrecht 1999.
[13] Dummett, M., The Logical Basis of Metaphysics, Harvard University Press, Cambridge 1991.
[14] Fine, K., ‘Natural deduction and arbitrary objects’, Journal of Philosophical Logic, 14:57-107, 1985.
[15] Fitch, F. B., Symbolic Logic, Ronald Press Co, New York 1952.
[16] Fitch, F. B., ‘Natural deduction rules for obligation’, American Philosophical Quarterly, 3:27-38, 1966.
[17] Fitting, M., Proof Methods for Modal and Intuitionistic Logics, Reidel, Dordrecht 1983.
[18] Fitting, M., First-Order Logic and Automated Theorem Proving, Springer, Berlin 1996.
[19] Garson, J. W., Modal Logic for Philosophers, Cambridge University Press, Cambridge 2006.
[20] Gentzen, G., ‘Über die Existenz unabhängiger Axiomensysteme zu unendlichen Satzsystemen’, Mathematische Annalen, 107:329-350, 1932.
[21] Gentzen, G., ‘Untersuchungen über das logische Schliessen’, Mathematische Zeitschrift, 39:176-210 and 39:405-431, 1934.
[22] Gentzen, G., ‘Die Widerspruchsfreiheit der reinen Zahlentheorie’, Mathematische Annalen, 112:493-565, 1936.
[23] Hazen, A. P., ‘Natural deduction and Hilbert’s epsilon-operator’, Journal of Philosophical Logic, 16:411-421, 1987.
[24] Hazen, A. P. and F. J. Pelletier, ‘Gentzen and Jaśkowski Natural Deduction: Fundamentally Similar but Importantly Different’, Studia Logica, 102(6):1103-1142, 2014.
[25] Herbrand, J., abstract in: Comptes Rendus des Séances de l’Académie des Sciences, vol. 186, 1275, Paris 1928.
[26] Herbrand, J., ‘Recherches sur la théorie de la démonstration’, in: Travaux de la Société des Sciences et des Lettres de Varsovie, Classe III, Sciences Mathématiques et Physiques, Varsovie 1930.
[27] Hermes, H., Einführung in die Mathematische Logik, Teubner, Stuttgart 1963.
[28] Hertz, P., ‘Über Axiomensysteme für beliebige Satzsysteme’, Mathematische Annalen, 101:457-514, 1929.
[29] Indrzejczak, A., ‘Natural Deduction System for Tense Logics’, Bulletin of the Section of Logic, 23(4):173-179, 1994.
[30] Indrzejczak, A., Natural Deduction, Hybrid Systems and Modal Logics, Springer 2010.
[31] Jaśkowski, S., ‘Teoria dedukcji oparta na dyrektywach założeniowych’, in: Księga Pamiątkowa I Polskiego Zjazdu Matematycznego, Uniwersytet Jagielloński, Kraków 1929.
[32] Jaśkowski, S., ‘On the Rules of Suppositions in Formal Logic’, Studia Logica, 1:5-32, 1934.
[33] Kalish, D. and R. Montague, Logic, Techniques of Formal Reasoning, Harcourt, Brace and World, New York 1964.
[34] Mates, B., Stoic Logic, University of California Press, Berkeley 1953.
[35] Negri, S. and J. von Plato, Structural Proof Theory, Cambridge University Press, Cambridge 2001.
[36] Pelletier, F. J., ‘A Brief History of Natural Deduction’, History and Philosophy of Logic, 20:1-31, 1999.
[37] Pelletier, F. J. and A. P. Hazen, ‘A History of Natural Deduction’, in: D. Gabbay, F. J. Pelletier and J. Woods (eds.), Handbook of the History of Logic, vol. 11, 341-414, 2012.
[38] von Plato, J., ‘Gentzen’s proof of normalization for ND’, The Bulletin of Symbolic Logic, 14(2):240-257, 2008.
[39] von Plato, J., ‘From Axiomatic Logic to Natural Deduction’, Studia Logica, 102(6):1167-1184, 2014.
[40] Popper, K., ‘Logic without assumptions’, Proceedings of the Aristotelian Society, 47:251-292, 1947.
[41] Popper, K., ‘New foundations for Logic’, Mind, 56, 1947.
[42] Prior, A. N., ‘The runabout inference ticket’, Analysis, 21:38-39, 1960.
[43] Prawitz, D., Natural Deduction, Almqvist and Wiksell, Stockholm 1965.
[44] Prawitz, D., ‘Ideas and Results in Proof Theory’, in: J. E. Fenstad (ed.), Proceedings of the Second Scandinavian Logic Symposium, North-Holland, Amsterdam 1971.
[45] Quine, W. Van O., Methods of Logic, Holt, New York 1950.
[46] Raggio, A., ‘Gentzen’s Hauptsatz for the systems NI and NK’, Logique et Analyse, 8:91-100, 1965.
[47] Restall, G., ‘Normal Proofs, Cut Free Derivations and Structural Rules’, Studia Logica, 102(6):1143-1166, 2014.
[48] Schroeder-Heister, P., ‘A Natural Extension of Natural Deduction’, Journal of Symbolic Logic, 49:1284-1300, 1984.
[49] Schroeder-Heister, P., ‘Uniform Proof-Theoretic Semantics for Logical Constants (Abstract)’, Journal of Symbolic Logic, 56:1142, 1991.
[50] Schroeder-Heister, P., ‘Proof-Theoretic Semantics’, in: E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy, 2012.
[51] Schroeder-Heister, P., ‘The Calculus of Higher-Level Rules, Propositional Quantification and the Foundational Approach to Proof-Theoretic Harmony’, Studia Logica, 102(6):1185-1216, 2014.
[52] Suppes, P., Introduction to Logic, Van Nostrand, Princeton 1957.
[53] Tarski, A., ‘Fundamentale Begriffe der Methodologie der deduktiven Wissenschaften’, Monatshefte für Mathematik und Physik, 37:361-404, 1930.
[54] Troelstra, A. S. and H. Schwichtenberg, Basic Proof Theory, Cambridge 1996.
[55] Viganò, L., Labelled Non-Classical Logics, Kluwer 2000.

Author Information

Andrzej Indrzejczak
Email: indrzej@filozof.uni.lodz.pl
University of Lodz
Poland

Pierre Bayle (1647–1706)

Pierre Bayle was a seventeenth-century French skeptical philosopher and historian. He is best known for his encyclopedic work The Historical and Critical Dictionary (1697, 1st edition; 1702, 2nd edition), which was widely influential on eighteenth-century figures such as Voltaire and Thomas Jefferson. Bayle is traditionally described as a skeptic, though the nature and extent of his skepticism remain hotly debated. He is also known for his explicit defenses of religious faith against the attacks of reason, for his attacks on specious theological doctrines, and for his formulation of the doctrine of the erring conscience as a basis for religious toleration.

In contrast to his seventeenth-century contemporaries, Bayle is fundamentally an anti-systematic thinker. In keeping with his skepticism (understood in the ancient sense), he is committed to the thorough examination of arguments for and against the position under examination. This entails making the best arguments possible on both sides, as well as raising the strongest possible objections to both sides. As a result, in many cases, it is difficult to determine just what Bayle’s position is. Commentators refer to this phenomenon as the “Bayle enigma,” and it affects virtually every area of Bayle’s thought, undermining the legitimacy of his defenses of religious faith and calling into question the sincerity of his attacks on theology.

Bayle’s influence extends beyond philosophers; his texts have occasioned interest from historians, theologians, literary scholars, and political theorists. Bayle was incredibly prolific, both in personal correspondence and in published work. The encyclopedic format of his Dictionary showcased the dazzling breadth and depth of his knowledge, a learning which was also on display during his years as the editor of the intellectual journal News from the Republic of Letters (1684-1687). Bayle produced most of the content of the journal—primarily book reviews—during his editorship. His authorship of anonymous works has also been established, most recently in the case of the Important Advice to Refugees (1690). The enormous variety of topics that Bayle treated over the course of his lifetime, the diversity of formats that he used to do so, and the indeterminate nature of his arguments make him a rich topic for scholarly investigation.

1. Biography and Intellectual Context
2. Anti-Systematicity
3. Skepticism
  a. What Kind of Skeptic was Bayle?
    i. The “Surreptitious Atheist” Reading
    ii. The “Christian Fideist” Reading
  b. Moral Knowledge
4. The Problem of Evil
5. The Erring Conscience and Religious Tolerance
6. References and Further Reading
  a. Primary Sources
  b. Secondary Sources

1. Biography and Intellectual Context

Bayle was raised as a French Calvinist, or Huguenot, from his birth in 1647 in Le Carla, a small village in the south of France, until he left for the Jesuit college in Toulouse. His father, a Huguenot pastor, and the rest of his family were astonished by his 1669 conversion to Catholicism, presumably a result of his studies under the Jesuits at Toulouse. Bayle reconverted to Calvinism eighteen months later, however, officially becoming a rélaps, the most persecuted religious classification under the French Catholic monarchy. Predictably, Bayle then fled France and studied at a Calvinist seminary in Geneva for two years under Louis Tronchin. After realizing that the pastoral vocation was not for him, Bayle transferred to the University of Geneva to study Cartesian philosophy. After completing his studies there and returning to France in disguise as “Bâle” in 1674, Bayle spent a year as a tutor in Rouen and Paris before securing a position in 1675 at the Protestant Academy of Sedan.

It was at Sedan that Bayle first came into contact with Pierre Jurieu, a Calvinist theologian who became Bayle’s mentor and, over time, his most bitter enemy. Bayle and Jurieu were initially so close that when the French government closed the Sedan academy in 1681, Bayle followed Jurieu to the Ecole Illustre, an academy in Rotterdam for Huguenot refugees, where they both joined the faculty. Their mutual animus likely had its genesis in Bayle’s refusal of a marriage arranged by the Jurieu family, but there were also intellectual reasons for the cooling of their relationship. The publication of Bayle’s Philosophical Commentary (1686-88), which advocated religious toleration, had already raised Jurieu’s suspicion of Bayle. The animosity increased markedly in 1690, when Bayle’s anonymously published Important Advice to Refugees occasioned heated attacks by Jurieu, who saw the work as profoundly anti-Protestant.

During his initial years in Rotterdam, almost all of Bayle’s writings had been focused on attacking Catholic theology and practice, including General Critique of Maimbourg’s History of Calvinism (1682), Diverse Thoughts on the Occasion of a Comet (1683), and An Entirely Catholic France (1686). The deaths of Bayle’s father and brothers in 1684 and 1685, and the Revocation of the Edict of Nantes in 1685, provided strong personal reasons for Bayle to attack Catholic intolerance. Jurieu saw the Advice to Refugees, however, as evidence that Bayle had turned against his Huguenot roots, and denounced Bayle as a heretic. Jurieu’s public proclamations against Bayle were nevertheless inconsistent with Bayle’s fidelity to the Reformed community in Rotterdam, and evidence from Bayle’s deathbed seems to support his adherence to the Calvinist religion for the rest of his life.

The text that solidified Bayle’s reputation as a grave danger to religious belief, however, was his Dictionary, the encyclopedic work that was Bayle’s magnum opus. The Dictionary contains many articles that implicitly criticize his Protestant contemporaries, including Jurieu, as well as articles that seem to undermine the rationality of religious belief as a whole. Bayle clarified his criticisms in the second edition of the Dictionary in 1702, which included “Eclaircissements”, or Clarifications, on several of the most controversial articles. These explanations did not deflect criticism, however, and Bayle provided even more fodder for his critics with the publication of his Response to the Questions of a Provincial (1704) and Continuation of Diverse Thoughts (1705). These late works contain reconstructions of coherent atheist positions, and support Bayle’s earlier position from Diverse Thoughts that atheists could be morally upright. Bayle continued to respond to his critics until the day of his death on December 28, 1706. That day, he wrote in a letter to a friend, “I am dying as a Christian philosopher, convinced of and pierced by the bounties and mercy of God.”

Despite this final piece of evidence of Bayle’s religious fidelity, many Enlightenment philosophes in the generations following Bayle saw him as their intellectual ancestor. One of Bayle’s most famous admirers was Voltaire, who is probably most responsible for Bayle’s reputation as the “arsenal of the Enlightenment,” a reference to the many arguments that the philosophes found in Bayle. The philosophes adapted these arguments to attack religious and superstitious beliefs among philosophers and theologians, using the arguments to show the absurdity of any supernatural belief whatsoever. The Enlightenment portrait of Bayle defined his place in intellectual history until the more recent interpretations of the twentieth century.

2. Anti-Systematicity

In the history of early modern philosophy, Bayle is one of the most controversial, and least understood, intellectuals of the period. Unlike other canonical seventeenth-century figures, Bayle gave no explicit systematization of his philosophical positions. While Bayle wrote on philosophical and theological problems ranging from toleration to the problem of evil, he produced no definitive or complete exposition of his ideas. Despite the widespread popularity of his Dictionary, Bayle is typically not considered to be a canonical philosopher. This is perhaps because the philosophical insights in Bayle’s work are buried in theological polemic, obscure reference material, and extremely prolix arguments. Until recently, relatively few scholars had taken on the difficult task of mining these insights.

The Dictionary, one of the most problematic texts of the early modern period, is the obvious place to begin any interpretation of Bayle’s thought. Bayle’s stated purpose in writing the Dictionary was to update and correct Louis Moréri’s Grand Historical Dictionary (1674). Bayle thought that Moréri’s dictionary was hopelessly out of date and inaccurate, and he hoped that his own work would replace it as a standard reference work. The Dictionary, however, is neither objective nor exhaustive, at least by today’s standards. The majority of the Dictionary’s pages are not even devoted to the scholarly articles themselves, but rather to remarks and footnotes that Bayle uses to articulate his own thoughts on the topics of the articles – or even on other topics that are only tangentially related. Furthermore, Bayle routinely makes mutually inconsistent claims throughout the Dictionary.

It is not just the underdetermined, dense, and paradoxical nature of the Dictionary that poses an interpretive problem for scholars of Bayle; the problem is magnified when one examines Bayle’s corpus as a whole. The breadth and complexity of his work are dizzying; Bayle’s writing ranges over a wide variety of topics and genres, from superstition to Biblical exegesis to astronomy to metaphysics, and from historical critiques to literary reviews to journal articles to theological treatises. Elisabeth Labrousse, an internationally regarded scholar of Bayle, notes that “[a]t turns, Bayle speaks the language of a Calvinist theologian, a Huguenot pamphleteer, a disciple of Malebranche, or a spiritual child of Erasmus, Montaigne, and Naudé.” Furthermore, Bayle’s scholarship on all of these topics and in all of these genres was exhaustingly thorough. His scholarly training at Toulouse taught him to examine not just his own position on a particular issue, but also all possible objections and replies to that position, in as much detail as necessary to demolish his opponent. His arguments cite both the relevant historical and contemporary sources, a testament both to his encyclopedic mind and to his lifelong obsession with the intellectual trends of his day. Bayle’s arguments are so intricate that it is often unclear exactly what positions they are supposed to be defending. As Jean Delvolvé, an early twentieth-century scholar of Bayle, aptly notes,

The very originality of Bayle’s ideas, their lack of systematic construction, their diffusion in the mass of a work that is prolix to excess, their intentionally obscured and enveloped exposition – for they must be discovered through a thousand réticences, and among the trompe-l’oeil of affirmations to the contrary – all these reasons hindered the comprehension of Bayle by his contemporaries and have hindered him taking his rightful place in the history of human thought.

The paradoxes of Bayle’s work have given rise to a number of different readings of Bayle. First, the complexity and seeming ambiguity of Bayle’s arguments have been cited as evidence that Bayle ought to be read primarily as an ironic critic. According to this reading, all of his arguments that ostensibly defend traditionalist positions are really just vehicles for proto-Enlightenment critiques of those same positions. The completeness of Bayle’s arguments, and his dedication to charitable reconstruction of his opponents’ arguments, are not evidence of responsible scholarship, but rather a chance for him to advance his own subversive views. That these views are in fact Bayle’s is supported by the paradoxical replies and weak counterarguments that he offers in response to the charges of his opponents. According to this reading, the apparent acceptance of obviously anti-intellectual paradoxes by an otherwise philosophically sophisticated mind provides support for reading Bayle as a kind of subversive anti-traditionalist.

An alternative reading of Bayle is as a kind of complicated traditionalist. The complex structure of Bayle’s arguments reflects not subversive critique, or even rational agnosticism, according to this reading. Instead, it reflects Bayle’s desire to demonstrate for his opponents, via a reductio ad absurdum, the paradoxes of reason with respect to metaphysics in general, and with respect to philosophical theology in particular. This reductio of reason provides an explanation both of Bayle’s use of rigorous philosophical argumentation, and of his explicit affirmation of apparent paradoxes. This reading of Bayle as a philosopher who uses reason to disarm itself is consistent not only with his commitment to responsible argument, but also with the evidence of his lifelong adherence to traditional Huguenotism.

Recent readings of Bayle have resisted even attempts to make him into either an ironic critic or a complicated traditionalist. This anti-systematic reading of Bayle recognizes the multiple ambiguities and difficulties inherent in any attempt to provide a systematic interpretation of Bayle. According to this reading, the nature of Bayle’s texts prohibits fixing any sort of singular interpretation to his thought. What is most distinctive about Bayle’s thought is not its irony or traditionalism, but rather its dialogic character and polyphonic thinking. Bayle’s texts consistently allow multiple voices to speak autonomously, rather than as vehicles for his own views; it is thus a grave interpretive error, on this reading, to impose an artificial systematization on a text to create a single voice or interpretation. In other words, the typical temptation to force internal consistency onto Bayle’s texts – even a skeptical consistency – would not just be a hermeneutic mistake; it would be a philosophical one, because it would require the pursuit of consistency between arguments defending opposing positions.

3. Skepticism

a. What Kind of Skeptic was Bayle?

Reading Bayle as a skeptic of one kind or another has a long history, going back to his own contemporaries and continuing through present-day commentators. The sense in which Bayle is a skeptic is not entirely straightforward, but what is clear is that Bayle exhibits a profound suspicion of reason’s ability to deliver certain knowledge. In Bayle’s view, reason seems to be useful in enabling us to draw conclusions about the world, but it runs into so many contradictions and yields so many paradoxes that it ultimately undermines itself, and thus cannot be trusted. Thus, Bayle’s skepticism is, minimally, skepticism about the reliability of reason. Aside from this point, however, interpreters of Bayle diverge about the nature and extent of Bayle’s skepticism. How best to understand Bayle’s skepticism is often a function of the more general reading that one takes of Bayle’s overall projects and positions.

i. The “Surreptitious Atheist” Reading

Taking its cues from the “ironic critic” reading of Bayle, this interpretation of Bayle’s skepticism sees it as fundamentally a kind of Stratonianism, a position that Bayle outlines in the Continuation of Diverse Thoughts (1705). Strato, the position’s namesake, was the third leader of the ancient Lyceum, after Aristotle and Theophrastus. Unlike other ancient philosophers, Strato was uncompromising in his atheism. Bayle is interested less in the position advocated by Strato himself than in a modern adaptation of Stratonianism. This is because Strato represents for Bayle the position of seventeenth-century libertins: the denial of a providential God, and the affirmation of the eternity and infinity of the universe.

The case that Stratonianism represents Bayle’s own philosophical position is not found in Bayle’s arguments themselves, but rather in a methodological feature of their structure. Bayle typically structures his arguments not to support directly the position he actually holds; rather, he constructs the best possible argument for the strongest opposing position, and then defeats it later. This eventual defeat makes evident the superiority of the position Bayle actually holds. Bayle explicitly develops the position of the Stratonian atheist over the course of the Continuation, and, according to this reading, this position is never refuted by Bayle. Thus, the strongest opposing position to natural philosophical theology is left standing as a menace to theist philosophers. Reading Bayle in this way assumes that if Bayle’s position were not that of the Stratonian atheist, then he would have provided more decisive objections; in the absence of those objections, Bayle is implying that Stratonianism is the only philosophically defensible position.

ii. The “Christian Fideist” Reading

Taking its cues from the “complicated traditionalist” reading of Bayle, this interpretation of Bayle’s skepticism sees it as a kind of fideism. Bayle’s (heterodox) Calvinism, and the context of Cartesianism and Protestant theology more generally, is taken as fundamental to his thought. According to this reading, the complex structure of Bayle’s arguments reflects not an implicit atheism, but rather his desire to demonstrate the paradoxes of reason with respect to metaphysics, and with respect to the metaphysical claims of religion in particular. This demonstration of the paradoxes of reason provides a basis both for Bayle’s affirmation of Calvinist theology, and for his use of rigorous philosophical argumentation. This reading is thus consistent not only with his commitment to responsible argument, but also with his apparent lifelong adherence to the Calvinist faith.

Textual evidence for this reading is found in Bayle’s furious reply to the Jesuit father Maimbourg’s History of Calvinism (1682). Bayle wrote his reply – General Critique – in two weeks, and in it, Bayle makes clear both his Protestant convictions and his commitment to them. Bayle emphasizes that since the workings of Providence are infinite, they could not be comprehended by finite reason. However, French Calvinism contains strong elements of Cartesianism, and Bayle himself asserts in Diverse Thoughts that his views were not far from those of Malebranche.

Ultimately, though, this reading holds that Bayle’s pessimistic assessment of reason is what characterizes the bulk of his work. Throughout Bayle’s journal News from the Republic of Letters, he makes critical remarks about the arguments of secular rationalists, and these remarks indicate that all rational investigation of theological or philosophical questions results in puzzles that reason is powerless either to affirm or deny. Bayle also remarks in his Dictionary that “there is no contradiction between these two things: (1) the light of reason teaches me that that is false; (2) Moreover, I believe it because I am persuaded that this light is not infallible and because I prefer to defer to the proofs of sentiment and to the impressions of conscience, in a word, to the word of God, than to defer to a metaphysical demonstration” (“Spinoza,” Rem. M). This is evidence not only of Bayle’s sincerity in his faith, but also of his confidence in the coherence of his religious and philosophical views.

b. Moral Knowledge

Bayle’s account of moral knowledge rests on a function of reason that he calls la droite raison, or right reason. Despite his skepticism, Bayle seems to hold that what he calls the “common notions” of morality are well-grounded insofar as they come from right reason.

The most famous example of a “common notion” delivered by right reason is found in Bayle’s Philosophical Commentary, where he argues that the interpretation of Scripture must be limited by the “clear and distinct notions of the natural light… with respect to morality” (I.i). This conclusion initially appears to be quite heterodox; if read in its most radical form, it seems to imply that any Christian doctrine that is refuted by reason (“the natural light”) is false. What Bayle actually asserts here, however, is not the falsity of any Christian doctrine that is against reason; rather, he asserts only the falsity of particular dogmas that are purported to be in Scripture. For Bayle, the “natural light” reveals the immorality of the forced conversions for which Catholics purported to find justification in Scripture, and their immorality invalidates their purported justification. This highlights the most important consequence of the passage: that the natural light trumps the claims of dogma with respect to morality. Bayle’s skepticism entails that the natural light is fallible, and can be self-contradictory in some domains. It appears, however, that the natural light is reliable with respect to moral truths – at least, with respect to those that apply to humans.

Bayle reiterates the reliability of the natural light with respect to moral truths consistently throughout the Philosophical Commentary, which is unsurprising since the text is a defense of the morality of religious toleration. This position, however, appears in other texts as well. In Diverse Thoughts, wherein Bayle argues that atheists can be moral, he notes not only that certain moral principles are rational, but also that moral praise and blame can be rationally assigned to those who live accordingly.

Bayle argues that the atheist has access to right reason, which confirms basic moral truths. Bayle also provides examples of the specific basic moral truths in question: “it is rational to respect one’s father, to hold to one’s word, to console the afflicted, to help the poor, to have gratitude for one’s benefactors, etc.” (OD III 406a). There is no hint of any of the skeptical doubts that Bayle characteristically raises; this suggests that he is using a non-skeptical notion of reason when discussing basic moral beliefs.

One of the final texts of Bayle’s life, Response to the Questions of a Provincial (1704-1707; hereafter RQP), also offers evidence of Bayle’s insistence on the rational accessibility of moral truths. Bayle’s position there is that atheists can be moral because they can know the conformity of virtue with right reason. He concedes that if this were not true – that is, if morality were only clearly conceivable through revelation – then atheists could not be moral. According to Bayle, however, right reason is as universal as the principles of logic. Bayle’s point in RQP is not to highlight the universality of the principles of logic, but simply to note that if one acknowledges the authority of principles of logic, then the sort of reason at issue here – right reason – should enjoy the same privileges. Other passages in RQP call into question the universality of right reason, particularly in rendering moral judgments about the conduct of God, but not with respect to human conduct.

Bayle’s Abridged System of Philosophy (1675-1680), a set of lecture notes from his first position as a professor at Sedan, is where he provides his most systematic treatment of the notion of right reason. In the section of notes on moral theory, Bayle defines right reason as “the judgment that the soul naturally renders on practical conclusions, or conclusions regarding morality that are drawn from practical principles” (OD IV 261b). Bayle thus restricts the scope of right reason to moral, or practical, principles. He argues that the natural light of reason – which Bayle uses interchangeably with right reason, when the natural light is illuminating practical matters – suffices to know moral truth, in contrast to a skeptical conception of reason, which yields merely plausible conclusions. The principles of morality that are known by right reason are universally and evidently true. Bayle argues, further, that right reason is also the standard by which the goodness of particular actions is judged (OD IV 261b).

There is a significant complication in Bayle’s account of moral knowledge, however; in the midst of a discussion on right reason, he introduces the notion of conscience. Bayle defines conscience as

a practical judgment of the understanding, which dictates to us that we must do or ought to have done something, as being praiseworthy, and that we must avoid or ought to have avoided something, as being shameful. In a word, it’s an understanding of the natural law by which each person judges which thing is praiseworthy & ought to be done, and which other thing is shameful & ought to be avoided (OD IV 261b).

This sounds very similar to Bayle’s description of the guidance offered by right reason.

Bayle’s account of moral knowledge is complicated even further by his use of illumination language to describe the conscience: he claims that the “natural light” leads us to affirm the principles of morality. He initially refers to natural morality itself as a “certain light in the soul” that obliges the recognition of general principles of morality. He also, however, makes reference to the light by which we affirm the principles of morality, and which supposedly leads us to natural morality. There seems to be a distinction, then, between “natural morality” (“the first general principles of morality”), which is a certain light, and the “natural light” of conscience – non-identical to the “natural light” of reason – for which the standard is not praiseworthiness, but rather fairness. Further, those led by conscience are merely supposed to have natural morality.

There is a clear connection for Bayle, then, between right reason as the faculty that grounds moral knowledge, and our rational nature – or at least the leftovers of our prelapsarian rational nature. Unfortunately, it also opens the possibility that the obligation of one’s conscience could attach to moral beliefs that were erroneous, or that were in some way contrary to the dictates of right reason, if the conscience were not being guided by right reason. Right reason is a crucial check on the moral “knowledge” provided by conscience in the following ways. First, the conscience can be affected by prejudices and errors, and unless it is rid of those, it cannot function as a moral guide. Relatedly, as a result of its susceptibility to prejudice and error, a conscience can be falsely persuaded of the licitness or illicitness of a particular action. Finally, one whose conscience is falsely persuaded can still commit acts that are in conformity with right reason, even though her erring conscience says that such acts are illicit. Similarly, a person who commits a wrongful act deemed by his erring conscience to be licit is still acting against right reason, despite the conformity with conscience. Thus, while conscience delivers verdicts on the morality of particular actions by particular individuals, right reason is the ultimate arbiter of morality in general. This provides a significant external check on the potentially erring conscience.

4. The Problem of Evil

Bayle’s treatment of the problem of evil is well-known, and occasioned Leibniz’s writing of the Theodicy (1710). Bayle’s Dictionary articles on the Paulicians, the Manicheans, and the Marcionites, as well as his subsequent clarification on the “Paulicians” and “Manicheans” articles, are where Bayle develops the position to which Leibniz is responding. Bayle also treats the issue in Response to the Questions of a Provincial and Dialogues of Maximus and Themistius (1707), where he critiques rationalist responses to the problem of evil. Bayle is pessimistic regarding the use of reason to make sense of evil: he holds that a priori reasons fail to address the a posteriori reality of evil. In other words, any attempt to explain the existence of evil rationally is contradicted by lived experience. Bayle supports this position by showcasing the strengths and weaknesses of both the orthodox and the Manichean solutions to the problem of evil, and concludes that both positions fail. What’s more, the failure of these solutions is not merely beyond the ken of human reason; the proposed solutions are comprehensible to reason, but simply fail its evaluation.

Bayle’s first extensive treatment of the problem of evil is in the Dictionary. In particular, the articles on the Manicheans and the Paulicians provoked a strong response from his fellow Huguenot refugees in Holland, prompting him to write a clarification of his position in those two articles for the second edition of the Dictionary. In Remark D of “Manicheans,” Bayle considers two different responses to the problem of evil, using the personage of Zoroaster on the one hand, and Melissus of Samos on the other. Bayle frames their positions in terms of a priori and a posteriori reasons. According to Bayle, the rational notions of order are what naturally lead us to think that an eternal, self-existent, and necessary being must also possess omnipotence and omnibenevolence. According to Bayle, this is an instance of an a priori reason: the ideas therein are clear and distinct, and it is internally coherent. With respect to the problem of evil, however, a priori reasons are merely the beginning of the discussion; this is because evil is a phenomenon – it is experienced. This entails that, according to Bayle, a posteriori reasons are also relevant; whatever conclusion is supported by a priori reasons – that of a single unifying principle – may or may not be the same conclusion supported by a posteriori reasons.

Bayle imagines a debate between Melissus and Zoroaster in which they examine the pros and the cons of both the proposed solutions to the problem of evil, with Melissus defending the single unifying principle, and Zoroaster defending the existence of two principles, one evil and one good. Melissus holds that a priori reasons favor the existence of a single unifying principle, and Zoroaster agrees that Melissus surpasses him “in the beauty of ideas and in a priori reasons” (305b). Zoroaster challenges Melissus, however, to explain the source of the evil caused by humankind, and argues that the existence of two principles better explains this phenomenon; it provides better a posteriori reasons than a single unifying principle. Even when Melissus argues that physical evil is simply a response of God’s justice to moral evil, Zoroaster replies that humankind’s inclination to evil is a defect that could not be caused by a single unifying principle with every perfection. Melissus’ final attempt to blame humankind for evil fails, according to Zoroaster, because even the freedom that Melissus claims for humankind is not truly free, since it exists completely by the action of God. Zoroaster argues that it is inconsistent with a priori reasons that a single, omnibenevolent principle would not only fail to prevent moral evil, but would then punish humankind with physical evil for the moral evil that they commit – but for which the single principle is still ultimately responsible.

There is a rational intractability, then, in Bayle’s conception of the problem of evil: a priori reasons contradict a posteriori evidence, and yet the solution that best accounts for the a posteriori evidence – the “two principles” solution – is inconsistent with a priori reasons – particularly with the notion that a single omnibenevolent principle could in any way be the origin of evil. The intractability of the problem forces Bayle to propose an entirely different strategy: the only way out of the rational dilemma of evil is to look beyond the contradictions of reason to the realm of facts. (By “facts,” Bayle means something like “that which is found in Scripture”.) In the case of the problem of evil, the relevant “fact” is the evidence of Scripture that an omnibenevolent, holy, and omnipotent God has either allowed or caused evil to exist. Further, as revelation, Scripture is not merely additional a posteriori evidence; it has the added epistemological weight of faith. The actuality of this state of affairs – the coexistence of this kind of God with evil – is enough to counter the objection of impossibility, according to the principle of logic: “From the actual to the possible is a valid inference.” This factual strategy for addressing the problem of evil is consistent throughout the rest of the Dictionary, and is consistent with Bayle’s continual insistence in the Dictionary on the supremacy of revelation (“faith”) in the face of rational challenges.

Though the Dictionary is the most famous place where Bayle engages the problem of evil, his last two works, Response to the Questions of a Provincial and Dialogues of Maximus and Themistius, contain an extensive treatment of related issues as well. Bayle’s targets are many in these works, but one of the central ones is Isaac Jacquelot, a Reformed theologian who defends a theodicy-type position. Jacquelot was one of the Huguenot rationaux, a group of intellectuals defined by Calvinist theological commitments and broadly Cartesian philosophical ones. Jacquelot was deeply engaged in the project of rational theology, and had a fruitful intellectual history with Bayle. Jacquelot was profoundly influenced by Malebranche, particularly in the divine omniscient governance of nature, and the sinful effects of free will. The common interests of Malebranchean philosophy and Huguenot theology make Jacquelot an excellent interlocutor for Bayle. Bayle’s proposed explanation of the problem of evil remains essentially unchanged from his position in the Dictionary: that ultimately, it is futile to argue a priori reasons against the fact of the coexistence of God’s nature with evil.

Bayle’s proposed solution to the problem of evil reappears in Response to the Questions of a Provincial as part of a debate about free will. Since a hallmark of Reformed theology is the total sovereignty of God over creation, it is difficult for any reformer to hold that the freedom granted to humankind can clear God of responsibility for the evil acts of his creatures. If God is truly sovereign, then he would have some kind of governance over the choices of humans – minimally, he would have foreknowledge of the choices causally connected to the existence of evil, and thus foreknowledge coupled with omnipotence seems to entail a responsibility for God to act such that evil does not come into existence. If this is true, then God is in fact responsible for the existence of evil just insofar as he has not prevented it. Bayle never denies any part of this argument; he seems unwilling to overlook or explain away its various premises in the way that his predecessors and contemporaries do.

Bayle’s original proposal for addressing the coexistence of God and evil, however, is consistent with this line of argumentation. As in the Dictionary, Bayle advocates in RQP a “factual” approach to the intractability between God’s omnibenevolence and evil: Scripture declares that this coexistence is so, and it is nonsensical for reason to argue against a matter of fact. Bayle also explicitly refuses the proposal by Jacquelot that the incompatibility is simply above reason by rejecting the “above reason/against reason” distinction. According to Bayle, there is no such thing as “above reason” when the reason at issue is human reason: either an axiom is compatible with human reason, or it is against human reason. If something appears not to conform to human reason, then by definition, Bayle argues, it also appears contrary to it.

One objection to this reading of Bayle is that there is not much difference between Bayle’s position and the “above reason” position – the two positions represent a distinction without a difference. If Bayle ultimately endorses belief in the coexistence of God and evil in the face of apparent contradiction, the objection goes, he is at least implicitly endorsing some truth that is beyond human reason. The point of true disagreement, however, is that according to the “above reason” position, what is above human reason is still consistent with human reason, though incomprehensible to it. When one considers the divine mysteries, however, it is obvious that, to the extent that they are comprehensible by human reason, they run contrary to it. The doctrine of the Trinity runs contrary to the laws of mathematics; the doctrine of the Incarnation runs contrary to our conception of an object’s ability to have more than one nature; and the doctrine of Jesus’ bodily resurrection runs contrary to our conception of the nature of physical bodies. These conflicts are within the realm of human reason, not above it, and though the mysteries are not fully explicable – thus “mysteries” – they are comprehensible enough to make the conflict a real one, not merely apparent.

In the Dialogues of Maximus and Themistius, Bayle is careful to restrict his rejection of the “above reason/against reason” distinction to the scope of human reason. This is because the problem of evil is so repugnant to human reason that the only possible response to it must completely throw out the conclusions of human reason. Bayle challenges Jacquelot to explain how God’s allowing evil could ever be adequately explained using human reason. According to human reason, Bayle argues, God’s allowing evil to exist violates a priori reasons and our idea of God as omnibenevolent. The position here is essentially that of the Dictionary, and Bayle’s reiteration of it in the Dialogues seems to show that he is unimpressed with Jacquelot’s proposed solution to the problem of evil.

According to Bayle, the specific problem with Jacquelot’s proposed solution to the problem of evil is that Jacquelot accepts divine foreknowledge. Presumably, Jacquelot’s retention of divine foreknowledge is supposed to support the possibility of a free will defense. Bayle notes, however, that divine foreknowledge is actually not all that helpful: even with divine foreknowledge, the existence of evil calls into question God’s omnibenevolence, since a being who foresees the negative consequences of free will cannot have good intentions if he persists in bestowing it on humans.

These objections support Bayle’s assertion in the Dialogues that his solution to the problem of evil is really the last left standing: believing, in spite of lacking an understanding of how God’s omnibenevolence is compatible with evil. Importantly, for Bayle, this belief is not grounded in the faculty of reason, but rather on the declaration of Scripture that God and evil in fact coexist. Bayle’s later works trend toward a kind of moral rationalism with respect to human conduct, but his advocacy of this factual solution to the problem of evil never changes throughout his life, and his debate with Jacquelot on the problem of evil does not undermine the tenability of his position. Divine conduct is simply not susceptible to the judgments of right reason.

5. The Erring Conscience and Religious Tolerance

Bayle’s concern with conscience and toleration is not limited to the Philosophical Commentary, but it is where Bayle most clearly argues for religious toleration. He articulates two lines of argument for religious toleration: one on the basis of his doctrine of the erring conscience, as developed in the General Critique (1682) and the New Letters (1685); and one on the basis of a principle of the natural light according to which any reading of Scripture that implies a moral crime is a false reading. For Bayle, both ways of arguing for religious toleration are necessary in order to prevent coercion of, or by, people who act on the basis of conscience – whether that conscience is accurate or erring.

Bayle’s argument for religious toleration based on his doctrine of the erring conscience assumes that we have a duty and a right to act according to the lights of conscience. This is a less controversial claim when the beliefs of conscience are accurate; however, Bayle’s doctrine of the erring conscience entails that even when the beliefs of conscience are in error, the same duties and rights of conscience obtain. Bayle does place some conditions on the erring conscience’s acquiring these duties and rights; only when the erring conscience is “in good faith” – that is, when the error is sincere – does the erring conscience obtain the relevant rights and duties. Bayle consistently holds to the “good faith” requirement in both the New Critical Letters and the Philosophical Commentary; in the New Critical Letters, he writes that “[a]ll good faith errors have the same right over conscience as orthodoxy, whether we embraced those errors a bit too lightly, or whether we ran them through the most rigorous examination that we could manage.” Bayle places the good faith errors of the sincere lay person on the same footing as the good faith errors of the rigorous intellectual – and, most significantly, on the same ground as orthodoxy.

This allows Bayle to affirm a kind of moral equivalence between the accurate conscience and the erring one: whatever rights and duties accrue to an accurate conscience also accrue to the erring conscience. Thus, if the beliefs of the accurate conscience ought to be tolerated, so ought the beliefs of the erring conscience. Bayle marshals several different arguments for the moral equivalence claim, but the most powerful is the argument from skepticism. Presumably, each person cannot help but think that her conscience is in the right in cases where beliefs of conscience conflict. In the absence of definitive and objective proof for a belief of conscience, then, there is no reason to grant one conscience rights and duties over another.

A serious potential problem arises with respect to the doctrine of the erring conscience, however: the issue of fanaticism. Assuming that an erring conscience has all of the same duties and rights as an accurate conscience, what’s to prevent an individual from acting on a fanatical conscience? Bayle says in the Dictionary that it is the fanatics – the people who would benefit the most from the doctrine of the erring conscience – who support the principle that acting against one’s conscience can be a good. Bayle thus conceives of fanatics as the sort of people who are willing to subvert morality, and even the rights of their own conscience, in order to undercut the rights of others. True fanatics, however, often do not recognize that they are doing so, since they are typically convinced that they are the only people who perceive truth for what it really is. If a fanatic is convinced of his correctness – that is, that the lights of his conscience are accurate – then he will apply to himself whatever is said in favor of truth against those whom he perceives to be in error. The fanatic shifts the burden of falsity to those with whom he disagrees as a way to discharge doubt or discomfort, while simultaneously creating a double standard: an act is permissible when I do it, but not when others do. What fanatics fail to grasp when they argue for the rights of truth (presumably in order to justify the persecution of those whom they believe to be in error) is that if the roles were reversed – if the fanatics were in the minority – they would no doubt be arguing in favor of religious toleration.

This leads to Bayle’s second argument for religious toleration based on the principle of the natural light articulated in the Philosophical Commentary that forbids the commission of crimes. Bayle’s moral principle against committing crimes supports his defense of the doctrine of the erring conscience: if the accurate conscience did indeed have the right to coerce, it would only be a right considered from an abstract point of view. According to Bayle, the abstract point of view is not that of conscience; conscience provides direction for the particular beliefs and actions of a particular person. Setting aside the abstract point of view, the only way to justify coercion is by appeal to the conscience itself, whose accuracy is exactly the issue at hand. Since the only justification available to conscience is the force of its persuasion, then if the true religion were ordered by God to persecute heretics, heretics would also have the right to persecute the true religion. This scene of rampant persecution is the epitome of moral breakdown, and Bayle thinks that no such situation can be justified with an appeal to Scripture – or to conscience. Religious coercion is not only morally villainous, but it violates the very heart of all religions – and most importantly for Bayle’s readers, it violates the heart of Christianity.

Bayle’s principle of the natural light – that no reading of Scripture can be true that justifies the commission of moral crimes – thus adds moral disapprobation to any conscience-based sanction against coercion. It also provides a principle upon which those of differing consciences can agree. The revelation of the natural light that Bayle cites here – that committing crimes is always immoral no matter what the justification – comes from the faculty of right reason. Bayle argues in Diverse Thoughts that this faculty of reason, responsible for intuiting certain basic rational moral maxims, is equally accessible to both atheists and believers – whether heretical or orthodox. This implies that everyone is subject to these same moral maxims, including the absolute prohibition on using conscience as a motive to justify committing crimes. (Note, however, that this principle of the natural light only governs action – that is, it prohibits committing crimes, which is the realm of action. It gives no clear doxastic guidance outside of these basic moral maxims.)

This principle of natural light thus separates religious beliefs, where Bayle is rather permissive, from basic moral beliefs, where only right reason has sway. There are two major benefits to this separation. First, it allows Bayle to maintain that all individuals of every confession – or no confession – are subject to the same basic moral maxims, which apply equally to everyone with access to the “natural light” of right reason. Second, it allows Bayle to maintain that we may still have good reason to condemn beliefs of those with an erring conscience, but that rather than condemning those who believe erroneously, we should condemn those who profess to be in good faith but are not – a sin not merely of belief, but of action. Bayle specifically tackles this issue in his Dictionary article on Arius. The group for whom Bayle reserves his strongest condemnation in that article is not heretical teachers who are in good faith, instructing people in a simple way in accord with the teachers’ conscientious beliefs. Rather, his strongest words are for heretical teachers who teach heresy without believing it; he calls them “monsters of ambition and malice.” Presumably, the force of Bayle’s condemnation rests not on the heresy of such teachers, but on their hypocrisy – the discrepancy between belief and action.

Interestingly, for all of Bayle’s emphasis on right action over right belief, he still leaves room for a distinction between valuable and worthless beliefs. Bayle’s insistence on the primacy of right praxis over right doxa does not imply that all opinions are equally good. This is consistent with Bayle’s position that there is good reason to condemn false religious beliefs and to maintain orthodox beliefs. What is most distinctive about Bayle, however, is his redefinition of the essence of religion: what is most important is not right belief, but right action. Right action requires right reason, and right reason requires religious toleration.

6. References and Further Reading

a. Primary Sources

Bayle, Pierre. Correspondance de Pierre Bayle. Eds. Elisabeth Labrousse & Antony McKenna. 12 vols. Oxford, 1999-2015.
- A monumental assembly of Bayle’s correspondence from February 1662 onward. Projected to extend to 20 volumes.
Bayle, Pierre. Dictionnaire historique et critique, par M. Pierre Bayle. Amsterdam, Leyde, La Haye; 1740. 5th Edition, 4 vols. in-folio.
- The work for which Bayle is most famous. The fifth edition of 1740 is the easiest to access online, at the University of Chicago’s ARTFL project (https://artfl-project.uchicago.edu/content/dictionnaire-de-bayle), but the definitive version is the second edition of 1702, which is the first to include the “Clarifications” as appendices.
Bayle, Pierre. Historical and Critical Dictionary, selections. Trans. & ed. by Richard Popkin. Indianapolis: Bobbs-Merrill, 1965.
- The standard contemporary edition of Bayle’s Dictionary in English, though unfortunately it includes only a small fraction of the original.
Bayle, Pierre. Œuvres diverses de M. Pierre Bayle, professeur en philosophie et en histoire à Rotterdam. La Haye/The Hague, 1727-31; Hildesheim, 1964-68. 4 vols, in-folio; Vols V.1 & V.2: Hildesheim: G. Olms, 1982-1990.
- The standard edition of Bayle’s corpus (not including the Dictionary); it includes all of Bayle’s published works, as well as some fragments of correspondence.
Bayle, Pierre. A Philosophical Commentary on These Words of the Gospel. Eds. J. Kilcullen & C. Kukathas. Indianapolis: Liberty Fund, 2005.
- The standard contemporary edition of Bayle’s Philosophical Commentary in English. The translation is an amended version of the first English translation in 1708.

b. Secondary Sources

Bost, Hubert. Pierre Bayle. Paris: Fayard, 2006.
- The definitive contemporary biography of Bayle. In French.
Brush, Craig B. Montaigne and Bayle: Variations on the Theme of Scepticism. The Hague: Nijhoff, 1966.
- An early and thorough treatment of Bayle’s skepticism.
Delvolvé, Jean. Religion, critique, et philosophie positive chez Pierre Bayle. Paris: Alcan, 1906.
- The beginning of twentieth-century scholarship on Bayle, defending a fundamentally proto-Enlightenment reading of Bayle. In French.
Hickson, Michael W. “Theodicy and Toleration in Bayle’s Dictionary” Journal of the History of Philosophy 51 (1):49-73 (2013).
- A rigorously argued, meticulously detailed treatment of the relationship between Bayle’s position on theodicy and his defense of religious toleration.
Irwin, Kristen. “Which ‘Reason’? Bayle on the Intractability of Evil,” in New Essays on Leibniz’s Theodicy, eds. Larry Jorgensen & Samuel Newlands (Oxford University Press, 2014), 43-54.
- A contextually sensitive account of Bayle’s position on theodicy. It argues that Bayle’s final position on theodicy contains the resources to reply to Leibniz’s objections.
Irwin, Kristen. “Bayle on the (Ir)rationality of Religious Belief,” Philosophy Compass 8:6 (2013), 560-569.
- An exposition of Bayle’s treatments of the rationality of religious belief.
Kilcullen, John. Sincerity and Truth: Essays on Arnauld, Bayle, and Toleration. Oxford: Clarendon Press, 1988.
- A masterful treatment of Bayle’s arguments defending religious toleration.
Labrousse, Elisabeth. Pierre Bayle: Hétérodoxie et rigorisme. Paris: Albin Michel, 1996. 2nd ed.
- An especially thorough treatment of Bayle’s thought by the premier Bayle scholar of the twentieth century. In French.
Lennon, Thomas. Reading Bayle. Toronto: University of Toronto Press, 1999.
- The definitive treatment of Bayle’s thought in English. It argues that Bayle’s thought is deeply and irreducibly anti-systematic in nature.
Lennon, Thomas. “What Kind of a Skeptic Was Bayle?” Midwest Studies in Philosophy XXVI (2002), 258-279.
- An exceptionally clear taxonomy of the various senses in which Bayle has been thought to be a skeptic.
Maia Neto, Jose R. “Bayle’s Academic Skepticism,” Everything Connects: In Conference with R.H. Popkin, eds. J.E. Force and D.S. Katz. Leiden: Brill, 1999; 264-275.
- A compelling argument that Bayle’s skepticism is not Pyrrhonian, but fundamentally fallibilist and concerned above all with intellectual integrity.
Mori, Gianluca. Bayle philosophe. Paris: Honoré Champion, 1999.
- The most contemporary treatment of Bayle as an ironic critic of religion, and as a moral thinker focused on “common notions”. In French.
Popkin, Richard. The History of Scepticism from Savonarola to Bayle. New York: Oxford University Press, 2003.
- The definitive history of fifteenth, sixteenth, and seventeenth-century skepticism in Europe.
Sandberg, Karl C. At the Crossroads of Faith and Reason: An Essay on Pierre Bayle. Tucson: University of Arizona Press, 1966.
- A short, clear primer on the themes of faith and reason in the Baylean corpus.

Author Information

Kristen Irwin
Email: kirwin@luc.edu
Loyola University Chicago
U. S. A.

The Aesthetics of Classical Music

Musical aesthetics as a whole seeks to understand the perceived properties of music, in particular those properties that lead to experiences of musical value for the listener. It may also be understood more broadly as essentially synonymous with the philosophy of music, thus including issues of musical ontology, epistemology, ethics, and sociology. A specific area of focus within musical aesthetics is the aesthetics of classical music; it addresses questions relating to the aesthetic properties and aesthetic value of music in the Western classical tradition.

What aesthetic content does classical music have to offer? Does it consist simply in pleasing patterns, which have no meaning outside of the musical structures themselves? Can it express emotion, feeling, or other kinds of inner states? Does classical music offer insights into life not available through other art forms? Can it possess identifiable meanings, or significant conceptual, historical, or symbolic content? If so, how could this be achieved, given that its materials appear to be non-signifying in nature? These are some of the principal questions that concern the aesthetics of classical music.

After discussion of several important issues relating to classical music as an art form and an overview of influential discussions of the topic prior to the 20th century, this article addresses these principal questions through a discussion of four major topic areas in the aesthetics of classical music: musical understanding, musical form, emotion and expressiveness, and some further types of aesthetic content in classical music.

Classical Music as an Art Form
Historical Discussions
1. Kant
2. Schopenhauer
3. Hanslick
4. Gurney
Understanding
Form
Emotion
Human Experience and Values
References and Further Reading

1. Classical Music as an Art Form

In the case of music, as in other arts, the term ‘classical’ indicates the presence of an established or long-standing tradition. While the roots of classical music extend back to Gregorian chant, three developments occurring in the 11th century are often regarded as marking the beginning of the classical tradition in Western music. These are the developments of polyphony, the principles of order, and the establishment of musical pieces as compositions. The classical tradition is centrally defined by European art music composed during the Common Practice period, which encompasses Baroque, Classical, and Romantic music (roughly 1650-1900). It also includes Medieval, Ars Nova, and Renaissance art music, as well as non-European, 20th century, and contemporary art music that incorporates compositional practices that are recognized as being well-established in Western art music. While the vast majority of compositions in Western art music unambiguously fall under the category of ‘classical music’, one can argue that, though there will be no decisive line, certain highly experimental or innovative pieces cannot be a part of an established tradition of composition and thus should not be considered ‘classical’.

In contrast to the aesthetics of popular music, the aesthetics of classical music has traditionally focused on aesthetic content that is strictly musical in nature, excluding any additional content conveyed through words, actions, visual displays, or any other non-musical elements. It has typically limited itself to inquiry into the aesthetic content in musical works that is available from music alone, considered apart from any non-musical elements. Although there are clearly topics of significant interest in the additional aesthetic qualities of classical works that include non-musical elements (whether these be semantic, poetic, dramatic, or dance-related), most philosophers writing about classical music have been unwilling to venture into this territory. The focus on music as such in the aesthetics of classical music is due to the compelling philosophical questions generated by pure or ‘absolute’ music, the complexity involved in considering music in combination with non-musical elements, and a desire to understand the art of music apart from any aesthetic content contributed from other sources. In keeping with the historical focus of the aesthetics of classical music on music as such, this article restricts itself to discussion of aesthetic content that is purely musical in nature and it does not address topics involving the combination of music with other aesthetic elements.

Several features of classical music as an art form play a central role in defining the areas of aesthetic inquiry that pertain to it. Three features in particular deserve attention. These are the unique impact classical music has on our inner experience, its temporal nature, and the central role played by the tradition of tonal harmony, even after its “collapse” at the beginning of the 20th century.

a. Music and Inner Experience

Classical music’s ability to engage and enliven our inner experience is a primary reason why it holds so much philosophical interest. What is it about classical music as an art form that enables it to connect so strongly with our inner life? Part of the answer would seem to lie in the fact that it is an auditory art. The perception of aesthetic content through hearing differs in fundamental ways from the perception of aesthetic content through vision, especially in the case of visual arts that make use of representation. One of the greatest differences between the two modes of artistic perception is that unless we are given rather clear guidelines, we do not interpret musical sounds as representations of objects. The preexisting ability to interpret and assign meanings to visual images does not automatically come into play when we hear musical sounds. It appears that music has the capacity to engage our aesthetic sensibility without also engaging the cognition of objects. This sensibility is linked in complex ways to inner experience, feelings, moods, and emotions.

In Western philosophy, discussion of the special power of music to shape our inner life predates Plato, as evidenced by the lively debates of the pre-Socratics on this topic. Plato himself devotes substantial attention to it in both the Republic and Laws, conceiving of music as an art that can bypass reason and penetrate into our innermost self, impacting the constitution of our character. To use Plato’s terms, music acts as a “charm” on our inner life, shaping this life to its pattern. Classical music in particular stands out among musical cultures for its ability to evoke compelling inner experiences in the listener. Curiously, the power of classical works to evoke such experiences appears to be heightened in many purely instrumental works despite the fact that such pieces possess no readily identifiable meaning.

b. The Temporal Aspect of Music

In addition to its distinctive characteristics as an art form perceived through hearing, music is, of course, always temporal. Many writers, Roger Scruton among them, suggest that music leaves our minds no choice but to move with it when we listen attentively. This activity of the mind is not merely the recognition of new sounds as they occur. The mind moves sympathetically with the motion it perceives in the music. Thus, another important aspect of classical music is that frequently our mental perception of the movement in the music is so strong that we can feel it almost like we feel physical motion.

Our minds also respond to the temporal nature of music in another way. It is the automatic response of the mind to follow the progress of what it hears and assimilate this content into its conception of the piece as a whole. Music’s temporal nature entails that we do not experience the whole work at once or in an order of our choosing, and that consequently the order of presentation is fundamental to our experience of the musical content. In most classical music, and perhaps all art music of the Common Practice period, we perceive purposive and goal-directed movement along with structures and relationships that develop over time, though the scope and complexity of such content varies greatly from piece to piece. As listeners we recognize that an effort has been made to produce an aesthetic value-content, whether formal, expressive, or otherwise, worthy of appreciation or understanding. Due to this recognition, the assumption of an aesthetic attitude is a common practice in listening to classical music and thought to be an important means of enhancing our experience of the music as it unfolds.

c. Classical Music as an Historical Tradition

As an historical tradition, classical music gradually expands its artistic resources, from the practices of medieval polyphony, through the incorporation of new elements in the Renaissance, to the achievement of a conception of music and musical composition that is shared across Europe by the middle of the Baroque. The subsequent development of classical music during the Common Practice period is unique in the way that it preserves a strong continuity in compositional techniques while at the same time evolving continually as an art form. The late works from this period make use of the same basic musical materials (scales and chords) as the early ones: the diatonic scales, triadic functional harmony, primary organization around the dominant-tonic relationship, integration of vertical and horizontal dimensions, and so on. Early works differ from later ones in countless ways, but the fundamental musical materials and relationships do not change until the extended chromaticism of late romantic music begins to dissolve a sense of the tonic altogether. Later works differ from earlier ones primarily through creative innovations made by particular composers that are compatible with the existing tonal system, and through a gradual exploration and expansion of resources already implied in the tonal system itself. This gradual expansion within the context of a continuous tradition has significant implications for the expressive possibilities classical music possesses as an art form, allowing for the emergence of a repertoire of expressive compositional techniques that grows in effectiveness and scope as it progressively develops the potential that is inherent in tonal harmony.

The diverse compositional approaches developed in classical music in the early part of the 20th century introduce new questions for musical aesthetics. Many aesthetic theories based on analysis of music of the Common Practice period do not apply to compositions based on approaches divergent from those used by tonal harmony. This difference in aesthetic content applies to theories of meaning, form, and expressiveness. Most influential contemporary philosophers of the aesthetics of classical music focus almost exclusively on tonal classical music (including music that achieves a tonal center by means other than tonal harmony, as found in the music of Stravinsky, Debussy, and Bartok). Given that many of these theoretical perspectives do not apply to non-tonal music, the aesthetics of non-tonal classical music is an area that is in need of further development by the discipline.

d. Musical Works and Musical Performances

There are many philosophical questions surrounding the nature and definition of music and the ontological status of works of music. However, because these questions do not apply to classical music in particular, and because the discussion of these topics benefits greatly by comparisons between different musical genres and traditions, they are more appropriately addressed under the philosophy of music or musical aesthetics in general. As a result this entry offers just a brief summary of issues concerning the definition of music, musical ontology, and authentic performance of musical works.

General definitions of music most often focus on a demarcation between music and the non-musical (largely due to the musical experimentalism prominent in western art music from the 20th century onward), and on ensuring that the diversity found in the world’s musical traditions is taken into account. These definitional questions are not pertinent to the aesthetics of classical music, since they focus on issues involving music outside of the classical tradition.

A similar situation exists with regard to musical ontology, though primary focus is given to works of classical music in some instances. One ontological issue pertaining centrally to classical music concerns the metaphysical nature of a work of music. Do musical works exist? If so, in what sense? With regard to musical ontology a Platonist would hold that a work of classical music is an abstract object, while a nominalist would hold that it must be understood solely in terms of particular objects that relate to it, such as the musical score. In contrast to both of these, anti-realists deny that musical works have any kind of real existence at all. Stopping short of discounting the question altogether, however, some anti-realists grant musical works a fictional status.

A second issue has to do with what constitutes an authentic performance of a piece. Is it sufficient to perform the right pitches and rhythms in the right order, or is more required to produce an instance of a given work? How essential to authenticity is the use of appropriate period instruments? Is a piano reduction of an orchestral score still an instance of the same work? Debate over these questions centers around which elements must be present in order for a performance to constitute an instance of the work in question. Even if a performance meets the criteria required for authenticity, there is a further question about its reception by the audience. Considering that the sensibilities of listeners continue to change, what is the significance of the fact that contemporary audiences cannot experience works as their original counterparts did?

Influential discussions of musical ontology and authentic performance as they pertain to classical music include Jerrold Levinson’s Music, Art, and Metaphysics, Lydia Goehr’s The Imaginary Museum of Musical Works, and Stephen Davies’ Musical Works and Performances.

2. Historical Discussions

Although discussion of topics relevant to Western musical aesthetics dates back to the pre-Socratics, it is not until the 18th century that musical aesthetics takes shape as an inquiry focused on the understanding of perceived properties and capacities of the art of music. Starting with early theorists such as Mattheson and progressing through thinkers such as Kant and Schopenhauer to later writers such as Gurney, historical discussions of musical aesthetics in Western philosophy are in fact discussions of the aesthetics of classical music. This is for the simple reason that they take music of the classical tradition as their subject matter.

Many of the early discussions of classical musical aesthetics revolve around the question of what music itself is capable of presenting to the listener, with much of the debate centering on the question of how and to what extent music can convey emotional content. German and English discussions of the topic, such as those of Mattheson and Hutcheson, are typically characterized by the view that music either stimulates psychological states directly or arouses them through imitation of ways that emotion is expressed, principally by the human voice. By contrast French theorists during this early period, such as Boyé and Chabanon, oppose the idea that music is capable of expressing emotion on the grounds that it lacks the tools required for successful imitation or representation. These early writers prefigure the debates between expressionist and formalist viewpoints in later discussions of the role of emotion in the experience of classical music (see Lippman for selected excerpts from these authors and further detail on early musical aesthetics).

a. Kant

Following early explorations of the topic the first major contributor to the aesthetics of classical music is Immanuel Kant in his Critique of Judgment. In applying his aesthetic theory to music Kant’s primary concern is with the question of whether, or to what degree, music belongs to the beautiful or to the pleasing arts. Kant maintains that aesthetic judgments consist in feeling disinterested pleasure in perceiving the form of purposiveness in an object, apart from charm, emotion, or any definite concept of what the object ought to be. He further claims that the perception of the form of purposiveness puts the imagination and understanding into harmony such that they are able to freely play with one another. This state of free play, so far as it can be felt in sensation, is the basis of the pleasure that we feel in response to beauty.

Kant considers the possibility that the imagination can apprehend a form in the musical composition which, when compared by reflective judgment to the faculty of referring intuitions to concepts, places the imagination in harmony with the understanding. In music this form, apprehended independently of any conception of an object, is purely a pattern of melodic and harmonic intervals. Harmonious agreement between the imagination and the understanding in the perception of the form of the composition would, provided that this is possible, result in the music being perceived as purposive for reflective judgment. It would also mean that music deserves to be classified among the beautiful arts.

Initially Kant identifies music as an object of pure aesthetic judgments, classifying “all music without words” as a type of free, rather than dependent, beauty. In his more detailed discussion of music in sections 51-54 of his Critique of Judgment, however, Kant seems to vacillate between the possibility that music belongs to the beautiful arts and the possibility that it falls short of providing a formal content suitable for aesthetic judgments and thus is merely a pleasant art. This ultimately leaves the question of which category music belongs to undecided. If music can qualify as beautiful, the composition as a form alone must constitute the object of aesthetic judgment. Factors such as the instruments used to play the composition and the quality of their tone may add charm to the piece and they may even enhance our experience of its beauty, but by themselves such factors do not constitute objects of aesthetic judgment.

While Kant explores the possibility that the composition as an abstract pattern of relationships may present purposive form and thus qualify as beautiful, he appears to conclude that the apprehension of purposive form in music is difficult at best. In the absence of the apprehension of such form, music is limited to the pleasant rather than the beautiful, consisting primarily in a changing play of auditory sensations. In this case, music can produce enjoyment and emotion, but is not a subject for pure judgments of taste. Apart from his enormous influence on the field of aesthetics as a whole, Kant’s writing on music has been influential for its emphasis on purely formal properties and its concomitant rejection of the value of emotion and sensuous qualities to the listening experience. As such, it clearly lays the groundwork for more explicitly formalist approaches in the 19th century.

b. Schopenhauer

Arthur Schopenhauer in The World as Will and Representation interprets ‘will’ as the underlying metaphysical reality, as the thing-in-itself, and grants music the privileged status of presenting it. Departing from Plato and Kant, Schopenhauer denies that the underlying metaphysical reality is rational in nature. Instead, will is a blind urge whose continuous striving has no guiding purpose. Unlike the other arts, whose significance lies in the ability to capture “the permanent essential forms of the world,” thus limiting their reach to interpretations of the phenomenal realm, music expresses the will itself, directly and immediately, speaking the “universal imageless language of the heart.” While in music we experience a direct presentation of the will, nevertheless as thing-in-itself, the musical presentation of will, like will itself, is indescribable.

Despite his allegiance to Kant’s transcendental idealism, Schopenhauer’s aesthetics represents an important departure from Kant. Whereas Kant viewed the aesthetic value of music in purely formal terms as a play of patterns, Schopenhauer advocates that music is valuable for its direct expression of the continuous striving of the will. Thus, the contrasting views of Kant and Schopenhauer prefigure later debates between formalists and expressivists concerning the aesthetic properties of music.

c. Hanslick

In his influential treatise On the Musically Beautiful Eduard Hanslick argues for a strong version of aesthetic formalism that limits aesthetically valuable content to the audible analogue of a moving arabesque or kaleidoscope, differing from these only in that music “manifests itself on an incomparably higher level of ideality” and “presents itself as the direct emanation of an artistically creative spirit.” Hanslick rejects the view that music is capable of expressing emotions, holding instead that music consists purely in tonal forms that develop in time. In doing so he presents an early cognitivist account of emotions, holding that emotions are primarily defined by concepts. He claims that music is incapable of conveying the conceptual content needed to differentiate between specific emotional states. As a result, the aesthetic content of music is limited to a specifically musical kind of beauty that “consists simply and solely of tones and their artistic combination.” His conclusion is that the “representation of a specific feeling or emotional state is not at all among the characteristic powers of music.”

The production of an experience of motion is the aspect of music that is shared with emotion. Through dynamics, tempo, shape, and timbre, music can present auditory instances of qualities that accompany emotions, but no actual emotional content is present, since this would require music to convey concepts: “Music can, in fact, whisper, rage, and rustle. But love and anger occur only within our hearts.” As one might expect given his allegiance to a purely formal conception of musical value, Hanslick also rejects the idea that music as such is suitable for the representation of extramusical content.

d. Gurney

In the latter part of the 19th century Edmund Gurney developed an approach to musical expression based on Darwinian evolutionary theory. Gurney, preceded by Herbert Spencer, postulated a biological origin of music in the impulse to attract and court a mate. According to Gurney, music originates from the capacity that evolved in our ancestors to use sounds to arouse responses from potential mates and rivals. Given that it evolved in this way, music is directly connected to the arousal of our passions. This original connection to the passions and to sexual excitement is fundamental to music in all of its forms. Emotion in classical music is a sublimation of this original sexual excitement. Its origins do not, however, constitute a link between music and extra-musical values or interests. Gurney argues that music offers a profound and entirely self-contained pleasure, whose origins grant it a special connection to our inner experience. Gurney’s work addresses many other fundamental questions in musical aesthetics, including the nature of musical motion, the basic components of musical understanding (which Gurney believed to be melodic forms), musical beauty, and musical value. It is also the inspiration for a recent work by Jerrold Levinson on the nature of musical understanding entitled Music in the Moment.

3. Understanding

Following Gurney’s claims for the role of melodic structures in musical understanding, scholars have generally agreed that an account of the nature of musical understanding must accompany any comprehensive treatment of the aesthetic properties of classical music. Musical understanding in this sense refers to how specific musical structures combine to convey an intelligible sense to the listener. As a result, this establishes a basis upon which to make further claims about the formal or expressive content of music.

a. The Listening Experience

In contemporary discussions there is general consensus that when we experience classical music, we hear the pattern of sounds as an intentional object. That is, we hear the musical work in the form of an unfolding audible musical structure. The term ‘intentional’ in this context signifies that music exists for us in virtue of its being an object of our conscious focus. Hearing patterns of sounds as music is something we contribute as listeners, since it is perfectly possible for someone not familiar with a particular kind of music to fail to grasp its aesthetic qualities. In appreciating music we hear the sounds as musical elements relating to one another within an aesthetic framework as components of a work of art. This audible musical structure together with any additional attendant qualities such as timbre, dynamics and vibrato, is the object of appreciation that produces experiences of aesthetic value. In Values of Art Malcolm Budd attempts a narrower definition that limits the aesthetic content of music to the work’s audible musical structure alone, leaving out of consideration timbral and performance-related aspects. More recently, multiple authors have presented arguments that these attendant qualities are significant aspects of the experience of aesthetic value. Regardless of these particular issues, there is a broad consensus that the experience of aesthetic value in classical music should not be considered separable from the listener’s experience of the audible musical structure of a work. It is this structure, perceived through active listening, that both contains the aesthetic content and produces the experiences of aesthetic value.

In perceiving the audible musical structure, our minds follow the succession of events and we grasp them aesthetically in relation to one another when we listen attentively. This activity of the mind is not merely the recognition of new sounds as they occur, but involves a sense of motion in the music. Given that the unfolding audible structures of classical music do not involve motion in a literal sense, the perception of motion presents a problem for the theorist.

In his pioneering treatise Sound and Symbol, Victor Zuckerkandl identifies our perception of motion as resulting from the directional tendencies present in tonal music. This includes the leading tone seeking to find resolution in or “move to” the tonic. Roger Scruton finds that while this observation is accurate, it does not capture the essence of musical motion. Scruton argues that motion must be understood as part of a system of indispensable metaphors involved in perceiving the music, and further that we perceive musical motion in spatial terms. Malcolm Budd argues that Scruton’s insistence on a spatial conception of musical movement is unnecessary and that a better approach would be to characterize music in terms of a purely temporal Gestalt, limiting music to movement in time and eliminating the need for metaphor. Scruton’s reply is that the concept of merely temporal movement is itself metaphorical in nature and that foundational metaphors such as spatial movement are also indispensable because they connect music to human experience. This allows, he claims, for the development of a complete account of music’s meaning and value.

Another topic of debate concerns the extent to which the perception of larger scale structures plays a role in musical understanding and appreciation. In agreement with the emphasis placed on the value of larger scale formal structures by Heinrich Schenker and Leonard Meyer, Peter Kivy emphasizes the architectonic aspects of the listening experience. He argues that large scale structural patterns and relationships constitute an important aspect of the expressiveness and aesthetic value experienced by the educated listener. In The Aesthetics of Music Roger Scruton also advocates for the importance of these aspects. He finds the comparison between the methods of music composition and architecture to be an apt one, though he rejects the Schenkerian claim that the surface structure of a piece is generated by its underlying large scale plan. In Scruton’s view musical understanding consists in perception of the composer’s development of the fundamental linear and vertical relationships present in tonal music, which he describes as an “order of polyphonic elaboration” that is inherent in the practices of triadic harmony. Inspired by Gurney, Jerrold Levinson disagrees, arguing instead for ‘concatenationism’, the view that basic musical understanding, together with the greater part of music’s aesthetic value, does not require perception of large scale formal relationships and that “the core experience of a piece of music is a matter of how it seems at each point.”

Related to the question of the value of perceiving larger scale formal patterns in classical music is the question of whether formal training or a certain level of education is required for the appreciation of classical music. Though scholars agree that a certain amount of acculturation is required for its understanding and appreciation, there is debate concerning the extent to which education and musical training can enhance the listener’s ability to perceive the aesthetic content of the music. Those such as Kivy who locate the primary aesthetic content of classical music in the musical form and the purely musical relationships that exist within it tend to argue that a higher level of education or acculturation is needed. On the other hand, others such as Levinson locate the primary aesthetic content in expressive qualities and in the way the music unfolds from moment to moment. They vary in their assessment of the aptitude required of the listener depending on their conception of what musical expression consists in and how it occurs.

b. Theories of Musical Meaning

Recognizing that we identify a pattern of sounds as an intentional object aids in understanding how we come to perceive the sounds produced as a form of art. However, this does not address the question of how an unfolding musical structure produces meaningful aesthetic content. An account of musical understanding requires an explanation of how the patterns and relationships present in the musical structure produce meaning for the listener.

In The Language of Music, Deryck Cooke seeks to show that certain recurrent patterns present in the music of the Common Practice period have specific emotional meanings, making it possible to construct a basic emotional vocabulary of classical music that is composed according to the principles of Common Practice tonality. Cooke further extends his analytical approach to defining emotional content contextually. If correct, his insights would establish a basis for understanding the emotional content of most classical music. Malcolm Budd and Roger Scruton have objected to Cooke’s theory on multiple grounds. They argue that it is inappropriate to construe music as a language because music lacks both a syntactic and a semantic structure, and that even if the claim to be a language is taken in a metaphorical sense, the reappearance of similar musical patterns in similar expressive contexts is not a matter of meaning, but of conventionally tested appropriateness to the context in question. Another important objection focuses on Cooke’s claim that composers use music’s vocabulary of emotions to convey the emotions that they felt when composing the work, sometimes labelled the ‘expression-transmission theory’ or simply the ‘expression theory’ of musical expression. Budd points out that by locating the value of the experience in reception of the composer’s emotions, the expression-transmission theory removes the aesthetic value from the work itself, conceiving of music as a tool for arousing the emotions of the composer in the listener. In reality, he argues, we experience aesthetic value primarily in the experience of listening to the music itself. It would misrepresent our motivation for listening to say that experiencing the emotions that the composer experienced could be a substitute for the experience of the specific aesthetic qualities found in a musical piece.

Following Cooke, a comprehensive and detailed attempt to understand how tones and rhythms produce an experience of meaningful content was made by Leonard Meyer in Emotion and Meaning in Music. Meyer, whose basic approach was further developed by Eugene Narmour, makes use of information theory in developing the thesis that a great deal of what we appreciate in classical music is the result of a sense of expectation produced by antecedent-consequent relationships. According to Meyer, a sequence of tones has musical meaning if it points to or sets up the expectation of other tones that will follow. Meyer calls this type of meaning ‘embodied meaning’, as distinguished from ‘designative meaning’, which consists in culturally established references to some extramusical content. Largely due to his reliance on information theory, Meyer defines embodied meaning purely in terms of expectation. It is generated by directionality inherent in the diatonic scale (leading tone-tonic relationships in melodies and harmonies) as well as by expectation that is built on the listener’s familiarity with traditional forms. One of the most important instances of expectation is the perception of an incomplete pattern, leading to a desire for its fulfillment on the part of the listener.

Finding Meyer’s concept of embodied meaning to be too one-dimensional and seeking to restrict musical meaning to the audible musical structure itself (that is, to the exclusion of what Meyer described as designative meanings), Budd offers the concept of ‘intramusical meaning’. This concept, Budd suggests, consists in the ensemble of musical features and relations present in an audible structure as perceived by an educated listener. In developing the concept of intramusical meaning, Budd is seeking to emphasize the abstractness of music as an art form. He wishes to establish that perceiving the audible structure of a work and the relationships that this structure contains, its intramusical meaning, is a necessary precondition for any further interpretation of a musical work. As Budd conceives of it, intramusical meaning is the most basic and fundamental characteristic of a musical piece. Budd points out that Meyer’s concept of embodied meaning clearly does not account for the diversity of feelings generated in our experience of music of the Common Practice period. Intramusical meaning encompasses all significant relationships perceived by the listener, so it does not restrict musical meaning to a specific process, such as that of antecedent-consequent relationships. At the same time, Budd acknowledges that Meyer’s concept of embodied meaning does account for the production of responses such as anticipation, frustration, confusion, surprise, and satisfaction, with varying degrees of intensity. A potential criticism of Budd’s concept of intramusical meaning is that it places all musical meaning under a single all-encompassing category and gives no account of how specific types of structures or relationships lead to specific musical meanings.

c. Theories of Musical Symbolism

Inspired by Ernst Cassirer’s Philosophy of Symbolic Forms, in Philosophy in a New Key Susanne Langer interprets musical understanding to consist in grasping a symbolic content rather than in the perception of discrete intramusical meanings. Langer offers a theory of musical works as “unconsummated presentational symbols.” As such, each piece of music symbolizes the form, but not the content, of a feeling. Unlike words, presentational symbols are understood only through seeking to grasp the whole, the elements of which must be interpreted in relation to each other. Pictures are presentational symbols, as are works of music. The main function of musical compositions is to symbolize feelings. Music is an unconsummated presentational symbol because it only reflects the morphology of feeling, not the content of specific feelings. If true, Langer’s theory entails that we can understand a given work as a formal abstraction of an emotional experience. In evaluating Langer’s views, Roger Scruton argues that because Langer’s unconsummated symbols do not have a specific meaning, reflecting instead only the morphology of feelings, her theory reduces to the claim that musical processes have a formal resemblance to emotional processes.

Another significant attempt to speak of music in symbolic terms was made by Nelson Goodman and given further musical focus by Monroe Beardsley. Arguing that works of art symbolize through predication rather than denotation, Goodman develops the concept of ‘exemplification’ to explain artistic expression. An instance of exemplification is one in which a predicate attaches to something which also refers to the predicate, as in a swatch of cloth from a tailor, which “exemplifies only those properties that it both has and refers to.” The difference between everyday instances of exemplification and exemplification in art is that in art the referential component is metaphorical rather than literal in nature. In applying Goodman’s concept of exemplification to music, Beardsley offers the example of a sonata whose first movement has a diffident, indecisive character. Given that it is displayed by the sonata and also plays a significant role in the piece as a whole, diffidence is an instance of musical exemplification.

d. Theories of Musical Syntax and the Influence of the Cognitive Sciences

Several notable authors sought to offer an account of musical meanings by analyzing music in terms of a musical syntax. Influenced by the structuralism of Ferdinand de Saussure, Nicolas Ruwet and Jean-Jacques Nattiez argue that music does possess a syntax and therefore can be interpreted and understood similarly to any other system of signs. A prominent criticism of this approach argues that such an attempt will necessarily be unsuccessful because, unlike the case of natural language, it does not appear to be possible to define musical structures in terms of a generative grammar. Fred Lerdahl and Ray Jackendoff seek to address precisely this issue in A Generative Theory of Tonal Music. A key issue here is whether it is possible to establish a relationship between deep structure and surface structure in music by providing transformation rules for the generation of surface structures from deep structures. In seeking to establish that music possesses a generative syntax, Lerdahl and Jackendoff put forward the ‘reduction hypothesis,’ which they draw from cognitive science. This hypothesis states that we as listeners seek to organize all musical events within a piece into a “single coherent structure, such that they are heard in a hierarchy of relative importance.” Though the attempt to identify syntactic structures in music has been influential, most contemporary theorists would deny that music possesses a syntax in any robust sense.

The emphasis placed by Lerdahl and Jackendoff on how music is organized by our brains while listening shifts the focus from meaning in the music to the cognitive processes by which we understand it (though of course the two are related and both need to be accounted for). This shift makes salient the importance of supplementing philosophical investigations of musical understanding and experience with scientific approaches. Although this entry does not consider specific scientific investigations into musical cognition, it is important to acknowledge the work in areas related to understanding and experiencing music that is being done in the cognitive sciences of psychology and neuroscience. Seeing as musical understanding and experience necessarily relate to cognitive structures and processes, approaches undertaken within various subdisciplines of psychology and neuroscience offer increasingly illuminating investigations into the topics of musical meaning and musical understanding.

In assessing the potential contribution of these fields, Tom Cochrane argues that studies in psychology and neuroscience can provide additional support for one theory of our experience of music over another, as well as in some cases allow us to reframe and synthesize traditionally distinct positions. He also acknowledges the limitations of many scientific studies, which, he suggests, point to the value of an interdisciplinary collaboration between philosophy and the cognitive sciences, including psychology and neuroscience. A further consideration in support of scientific investigations of musical experience is the fact that philosophical authors commonly make reference to their own personal experience of music as a partial justification for their views. Scientific research into musical cognition also potentially has value for this reason. It may be a way of providing additional support for an otherwise highly subjective component of philosophical theories.

4. Form

Accounts of understanding classical music address the question of how patterns of sound generate meaning for the listener. As such, they have to do with the unfolding of these patterns in time during the listening experience and with the listener’s perception of relationships between musical ideas in the piece. Insofar as they focus on the process of understanding, they only partially address the more general question of what kind or kinds of aesthetic content a musical structure is capable of conveying. Is the aesthetic content of classical music limited to appreciation of patterns and relationships present in the formal structure, or does the musical form relate in some significant way to our experience outside of music? Is the aesthetic experience of this music primarily or wholly intellectual in nature as the cognitivist would claim, or does the listener experience the content in emotional terms through the music’s expressive qualities? The fact that music unaided by words is generally agreed to possess meaning of some sort, but does not appear to possess adequate tools for either representation or signification makes answering these questions especially challenging.

The question of whether music means or expresses anything beyond itself is present in musical aesthetics from the time of the earliest discussions of the topic in the first half of the 18th century. Kant makes the formalist idea of limiting content to form prominent by virtue of his conception of aesthetic beauty as purposiveness without a purpose, or as the form of purposiveness. Hanslick further develops this train of thought in claiming that the aesthetic content of classical music is best understood through the analogy of a moving arabesque. Meyer emphasizes the fundamental importance of formal structures, though he acknowledges extramusical content as a legitimate aspect of some music. Influential contemporary accounts of the aesthetic value and content of the formal structure as such have been offered by Malcolm Budd, Peter Kivy, and Nick Zangwill. Underlying each of these accounts is the formalist intuition that the aesthetically significant qualities of music as an art form result from appreciation of aspects of the musical structure itself as a structure and that music, as such, has no meaning beyond the patterns and relationships present in it. While Budd ultimately appears to reserve judgment about the possibility that music could possess emotionally expressive or extramusical content in addition to the purely musical content that he advocates for, Kivy and Zangwill take a stronger stance, arguing that aesthetically significant content in music is strictly musical in nature.

a. Music as an Abstract Art

In Values of Art, Malcolm Budd characterizes music as the “art of uninterpreted sounds,” arguing that music is essentially an abstract art and that the essence of music is the audible musical structure perceived by the listener. Budd does not deny that music can contain other elements and serve other purposes, as when a musical instrument, passage or motif is used to signify something extramusical, when a musical work in some fashion represents extramusical things or events, or when music is combined with other art forms. His claim is that such elements in music are not proper to the art; that they are not part of music as such. For Budd, the musical content in music is present in an abstract audible structure whose meaning is not determined by meanings in or references to the external world. In this way, music represents nothing, makes no reference to anything, and is not about anything other than itself. Budd restricts what is essential to understanding music to the perception of the audible structural patterns present in a piece and their musically significant relations with one another. All other content is excluded.

Budd calls this form the ‘musical structure’ of the piece. For Budd, music is abstract in the sense that it does not depend for its success as an art form upon a referential relation to other areas of our experience or knowledge, whether this reference be by means of representation, imitation, signification, or by some other technique that referentially links musical sounds to things in the outside world or our experience. It is important to note that Budd, in keeping with the majority of those writing in this area, places emphasis purely on musical content and does not give referential meanings serious consideration as aesthetically significant to music as an art form. Music may possess a variety of referential meanings, from the imitation of extramusical sounds, to culturally established meanings attached either to specific types of sounds or melodies, to imitations of content supplied by a program or accompanying words. Most writers would argue, however, that such referential meanings are not proper to the aesthetic content of classical music, given that they rely for their specification on extramusical elements such as words and cultural conventions. For Budd, the musical structure alone constitutes all of the musically significant content of the music. Other elements may be added for artistic enhancement. Examples of structural elements as Budd conceives of them would include melody, rhythm, and harmony, as well as other aspects of the music judged by the listener to be musically significant, such as clearly identifiable formal patterns, relations between parts (including contrapuntal motion, imitation, etc.), harmonic texture (polyphonic, homophonic, heterophonic, etc.), variations in the number of parts and in performing forces, and the like. Audible aspects of the music including the type and quality of instrument, the quality of the performer’s technique, and the artistic choices that the performer makes are secondary to what is contained in the music apart from these factors.

In defining music as the art of uninterpreted sounds, Budd locates the strictly musical content of music first and foremost in the listener’s perception of relationships between musical structures. Hearing the music in a work consists in perceiving the relatedness of structural features. Music is an unfolding of patterns and relationships in time. Hearing music as such is primarily a dynamic experience, that is, an experience of the flow of energies generated by the temporal unfolding of pitch relationships and rhythmic patterns.

b. Musical Formalism

The claim that music is fundamentally an abstract art may be taken to mean that music contains nothing other than sounds and their relations to one another. In other words, it may be taken to mean that music possesses only formal content such that any content other than this formal content is of secondary importance and an optional addition on the part of the hearer, and hence, not part of music itself. An account of this sort would allow that musical forms can possess emotional content as an expressive property grasped through intellectual perception and that musical forms can produce an affective state in the listener in response to aesthetically significant qualities such as beauty or impressiveness (as with Gurney). However, it would deny that music expresses emotions in any normal sense of the term. Musical formalism holds instead that all aesthetic content in music is purely musical in nature. For this reason, it also denies that music is capable of conveying human experience or values, as well as any kind of broader conceptual content relating to human life.

Peter Kivy, a prominent advocate of this approach, argues that in essence music is “a quasi-syntactical structure” that is understandable solely in musical terms having “no semantic or representational content, no meaning, making reference to nothing beyond itself.” He offers a sustained argument for this viewpoint in Music Alone and develops his discussion further in New Essays on Musical Understanding. It should be noted that in advocating what he describes as ‘musical purism’, Kivy does acknowledge that music can possess some expressive features, provided that these features are non-representational, non-referential, and possess no meaning other than a purely musical one. Kivy suggests that while music neither expresses emotions nor arouses them in us, it can possess expressive properties through resemblance, much in the same way, to use Kivy’s example, we recognize sadness in the face of a St. Bernard.

A centerpiece of Kivy’s argument is his ‘contour theory’ of musical expressiveness, first articulated in The Corded Shell. Kivy argues that the experience of expressive content in music consists, not in the emotional experience of such content, but instead in the recognition of emotional qualities through a similarity between musical shape and the characteristic shape of utterances or bodily gestures. We make this association, according to Kivy, because we are psychologically determined to animate what we perceive and interpret it in human terms. The perception of emotion in music is thus public and objective in the same way it is in people.

Kivy identifies some instances of expressive content that cannot be explained by his contour model, such as our experience of the respective qualities of the major and minor modes. He argues that these instances, whatever their origin, are established by convention and hence have the same objective character as those resembling human behavioral expressions of emotion. While acknowledging the strength of Kivy’s perspective, Mark DeBellis suggests that an appeal to resemblance via contour lacks explanatory power, since to say that we perceive both music and speech or gestures as having the same expressive quality is merely to restate the problem of expressive character. DeBellis also points to the possibility of music resembling human actions that cause emotion rather than resembling the expression of the emotion itself, as in satisfaction resulting from the perception of struggle followed by resolution. He questions whether Kivy’s claims about the conventional nature of the major and minor modes can be verified. More recently, Kivy has modified his position to one of “enhanced formalism,” holding that pure instrumental music is a “black box” regarding the question of how it comes to possess expressive properties and suggesting that the important question is instead that of understanding the role that these properties play in the formal structure.

Following a similar conception of music’s aesthetic content to that of Kivy, and in agreement with Scruton concerning the metaphorical nature of our descriptions of musical qualities, Nick Zangwill argues for the ‘aesthetic metaphor thesis’. This thesis holds that, except in exceptional cases, emotion descriptions of music are metaphorical descriptions of music’s aesthetic properties. Thus, just as we say without controversy that a passage is delicate, in the same metaphorical manner we can also describe a musical passage as serene. Zangwill acknowledges that we do have intensely valuable aesthetic responses to some works of music, but denies that these responses are emotional in nature. The mistake, according to Zangwill, is to take our metaphorical descriptions literally and confuse the feelings involved in experiencing music with emotions. In agreement with Kivy, Zangwill holds that absolute music cannot evoke ‘garden variety’ emotions and argues instead that in listening to music, we experience specifically aesthetic feelings which share some, but not all of the features found in actual emotional experiences.

c. Beauty, the Sublime, and Sensuous Pleasure

Regardless of the stance taken on whether or not music is capable of expressing emotions or other types of extramusical content, there is universal agreement among theorists that classical music offers unique and highly valuable experiences of musical beauty. Historically, the predominant tendency has been to limit musical beauty to the perception of relationships existing in the formal structure of the work, excluding its sensuous qualities. The most common type of musical beauty attributed to classical music is found in melody. The great majority of individually identifiable melodies that we describe as beautiful possess certain characteristics that are easily recognizable. These include a predominantly conjunct motion, graceful contours, elegance of design, a duration such that the whole can be grasped in the listener’s immediate awareness, a sense of arrival or return toward the end of the melody, a moderate to slow tempo, and a song-like quality in the production of the sound and phrasing (such as bel canto style, for example). The details of style evolve over time, but these general characteristics hold for beautiful melodies throughout the Common Practice period and beyond, as well as for instances of melodic beauty that predate Common Practice tonality. Musical beauty in the sense of patterns pleasing to the intellect and imagination may also be found in the perception of larger scale musical forms. Assessment of the significance of these forms varies depending on the weight granted to architectonic features in the musical experience. At the very least, certain readily perceivable formal structures such as those present in canons and harmonic ostinatos can be included uncontroversially in standard aspects of musical beauty in classical music. Well-crafted ‘counterpoint’ is a third commonly identified type of musical beauty. At slower tempos and especially in lower registers, counterpoint is also acknowledged by many theorists to contribute to perceptions of musical profundity.

Closely related to musical profundity is experience of the sublime. In classical musical aesthetics, as with other arts, the sublime is usually taken to refer to evocation of that which is beyond human comprehension. In keeping with Edmund Burke’s influential analysis, the experience of sublimity in classical music is most often associated with feelings such as awe, astonishment, obscurity, and terror. Musical passages have been considered to evoke the sublime through several qualities. These include complexity, whether of overall design or of interaction between musical elements; emotional expression and mood, which may involve intense conflict or turbulence, but could also be present as transcendence or otherworldliness; and creative power, arising either from an impression of the composer’s creative power in the scope or impressiveness of the work or through qualities evoking creativity in the work itself (as in a fantasia).

In contrast to the traditional focus on formal qualities, classical musicians themselves, as well as contemporary listeners to classical music, would almost universally include sensuous qualities as important contributors to musical beauty and sublimity. Indeed, a primary goal for the classical musician is to develop beauty of tone. Additionally, timbres and coloristic effects play an increasingly important role in classical compositions starting in the latter part of the 19th century, as seen in musical impressionism and minimalism, as well as in the expanded palette available through the use of greater and more varied performing forces from the Romantic period onward. For these reasons it seems difficult to deny that tone quality and the listener’s experience of both successive and simultaneous combinations of timbres should be possible objects of musical beauty and contributors to the experience of musical sublimity. In the case of sublimity, dynamics and texture would also seem to have an important role, as would, in some instances, articulation and attack. A further question would be the extent to which virtuosic elements and displays of musical virtuosity by soloists constitute or enhance beauty or sublimity in music. A common analogy notes that such displays are the auditory equivalent of fireworks.

5. Emotion

Can music possess expressive content in a more substantial way than in the intellectual recognition of resemblances to human expressive behavior in purely structural qualities, as the cognitivist would suggest? Theories addressing this question can be classed into several categories, as follows. Transmission-expression theories, such as Deryck Cooke’s, claim that the emotions experienced in the music are those experienced by the composer. Arousal theories claim that the music’s expressiveness consists in its ability to move the listener to have an affective response. Resemblance theories claim that musical expressiveness lies in perception of a similarity between the way the music sounds and the way emotions feel. Mirroring response theories claim that expressiveness lies in the music itself rather than originating in the composer or being located in the listener. Nevertheless, these theories claim that listeners often mirror the emotional qualities that the music expresses, though their doing so is not required for the music to be considered expressive. Imaginative response theories claim that we experience music as expressive by imagining that the emotions we perceive in it belong to an indeterminate persona (since the music itself cannot be the possessor of emotions). Accordingly, to hear emotion in music is to hear it as the expression of feelings by an imagined individual. A related approach emphasizes the metaphorical nature of expression without attributing it to an imagined persona. Sympathy theories emphasize our sympathetic engagement in the music and corresponding enhanced recognition of its qualities.

Although the literature is less extensive, theorists have also examined the presence and role of moods in classical music. ‘Mood’ here refers to the feeling of a state or states that persist over a significant period of time and have the capacity to color our attitude toward all of the musical content that we hear while they are being felt. It is generally assumed that moods differ from emotions not only in that they apply globally, but also in their lack of an intentional object. Although it is difficult to claim that moods contain much expressive content themselves, they may set the stage for the experience of more specific kinds of expressive content. Thus, a joyous mood might set the stage for feelings of triumphant arrival, a somber one for mourning and loss. Noel Carroll proposes that moods in music can offer a solution to the debate between formalists and arousalists, conceding to the formalist that music lacks the tools to represent the kinds of objects emotions require while granting to the arousalist the point that music can arouse “affective states that are objectless, global, [and] diffuse.” Peter Kivy disagrees, claiming that while there are certainly experiential differences between moods and emotions, they are identical in regard to how music can be expressive of them.

a. Association and Arousal Theories

Leonard Meyer combines his account of musical meaning with a theory of affective arousal. Building on the theory of emotions developed by John Dewey (whose aesthetics offers illuminating applications to classical music even though it does not consider classical music specifically), Meyer claims that emotion is evoked “when a tendency to respond is inhibited.” This situation occurs in classical music in innumerable instances when composers establish expectations, then delay the satisfaction of these expectations, as in delayed arrival on the tonic, or failure to complete a pattern that has been initiated. These examples, and countless others like them found throughout the fabric of classical compositions, trigger an affective response by establishing an expectation of fulfillment, then inhibiting that expectation. Meyer claims that this affective response can be either undifferentiated, in which case only a “feeling tone” is present (perhaps akin to purely musical feelings), or differentiated into a specific emotion by the listener in a process of imaginative association. Meyer’s theory is thus an arousal theory in its conception of affective response and an association theory in its account of the experience of specific emotions by the listener. However, as Malcolm Budd and numerous others have observed, in order to be aesthetically significant expressive content must be a product of properties perceived in the music itself. Consequently, expressive content cannot be the product of an association between the music and some extramusical content that defines or shapes our experience of it.

More recently, Jenefer Robinson has advanced another version of the arousal theory, arguing that music has the ability to excite physiological arousal directly in the listener. According to Robinson, the listener attaches an emotional label to the state of arousal after this arousal takes place. Making a claim similar to that of Meyer in his theory of emotional differentiation, Robinson holds that this label is governed by the context that the listener brings to the listening experience. Following the contributions of Robinson, many theorists now accept that arousal plays a role in the experience of classical music, even if it is only part of a more complete account. Peter Kivy figures as an exception by taking a formalist point of view, suggesting that to interpret our inner state as an emotional one after the fact is optional at best, and furthermore, is not the type of listening that appreciates what music as an art form has to offer.

b. Resemblance Theories

In his Music and the Emotions, Malcolm Budd reviews and rejects many of the prominent theories of musical expressiveness. In his Values of Art, Budd offers an argument for a “basic and minimal concept” of what the expression of emotion in music consists in. According to Budd, the expression of emotion in music amounts to hearing the music as sounding the way an emotion feels. Thus, the core element in the emotional expressiveness of music is the listener’s perception of a likeness between what is in the music and the experience of a particular emotion. In Budd’s view, this basic “cross-categorical likeness perception” must underlie any account of the expression of emotion in music. However music is expressive of emotion, the expression of emotion must always rely at bottom upon the perception of the music as sounding like the way emotions feel. Budd goes on to identify three likely “accretions” to this “basic and minimal account,” but does not commit himself to any one view. First, the music may induce the feeling whose likeness is perceived. Second, the perception of a likeness to emotional experience may be accompanied by listeners imagining an occurrence of the perceived feeling in themselves. Third, instead of imagining experiencing the feelings that are perceived in the music, the listener may imagine that the music is an instance of these feelings rather than the feelings of any specific individual.

In The Aesthetics of Music Roger Scruton classifies Budd’s idea of a cross-categorical likeness perception, Langer’s conception of music as an unconsummated presentational symbol, and Kivy’s contour theory as versions of what he calls ‘the resemblance theory’. Scruton argues that all versions of the resemblance theory will be unsatisfactory for two reasons. First, resemblance theories confuse expression with the means by which it is achieved (as with other arts such as poetry, music does not resemble what it expresses). Second, if resemblance involves recognizing expression without requiring that we experience something of value as a result of it (as Kivy would have it), then successful expression may occur in an aesthetically uninteresting piece of music and it is unclear why the musical presentation of expression would have any special value.

Approaching the problem of expressiveness from another angle, Stephen Davies endorses a contour model similar to Kivy’s, but also emphasizes the centrality of the listener’s response to the perceived expressive properties. Thus, experiencing expressive content involves a ‘mirroring response’ in which the listener experiences an emotion similar to that perceived in the musical structure, though the music itself is not thought to arouse this emotion directly or in a mechanical way.

In his recent Critique of Pure Music, James Young advances versions of both arousal and resemblance theories as components of his anti-formalist position. Arguing in a manner similar to Budd, but in greater detail, he claims that music arouses emotions through the resemblance the listener perceives between the experience of music and the experience of human behavior expressive of emotions. Identifying this process as the result of a ‘cross-domain mapping,’ Young follows an approach similar to that recommended by Tom Cochrane in drawing on empirical studies of listener responses as well as theories of brain function.

c. The Role of Imagination and Metaphor

Jerrold Levinson focuses on the imaginative contribution of the listener in offering an account of hearing music as drama. Heard as drama, music consists in the interplay of forces within a piece, energies or impetuses within the piece whose interaction involves qualities such as tension, suspense, assertion, struggle, and conflict. Levinson suggests that when we hear music as drama, we imagine the dramatic actions and motivations to belong to indeterminate personae or person-like agents. He acknowledges that this way of listening adds an optional layer of content not strictly derivable from the music itself.

Aaron Ridley takes an approach similar to Levinson’s in regard to the imagination of indeterminate personae, but places special emphasis on the melismatic gesture in classical music as a primary vehicle of emotional expressiveness. Ridley argues that the melismatic gesture “resembles items in the expressive repertoire of extramusical human behavior, either physical or vocal,” thus allowing the music to present states of feeling which the listener experiences through a sympathetic response to the music. Following the contributions of Levinson and Ridley, several theorists, Scruton among them, have suggested that the introduction of an imagined persona is unnecessary and that the musical entities themselves qualify as dramatic agents interacting with one another.

Much of western classical music from the Common Practice period can easily be characterized as inherently dramatic in nature, involving development, struggle, and resolution, due to its fundamental reliance on the tonic-dominant relationship. This relationship allows for multiple large and small scale instances of motivic development, of tension and resolution, departure and return, and movement and rest to occur within the context of a single piece. The tension found in the dominant seventh, as well as in other chords that function similarly, places the listener in a state of suspense and instills a desire for resolution. Tonal harmony exploits the dynamic qualities of chords within a given harmonic context to create tension, suspense, expectation, and surprise. It is worth noting that conceiving of music as a dramatic art would seem to shift the emphasis away from the value of a particular content in the music itself and toward the experience of dramatic qualities by the listener. Provided that we give ourselves over to it fully, a highly dramatic work may allow us to experience a form of catharsis and perhaps a state of exhausted repletion following the experiences of tension, suspense, and fulfillment.

Roger Scruton focuses on the listener’s sympathetic participation in the music in his account of musical expressiveness. He begins by suggesting that, because music cannot express exact states of mind, transitive notions of expressiveness give way to an intransitive conception of it. As a result, the import of expression in music lies in the listener’s response. Scruton claims that the listener’s response to expressive music is essentially a sympathetic one, a response to “human life, imagined in the sounds we hear.” For Scruton, the sympathetic response includes not only feelings, but actions and gestures as well. In order to hear music with understanding, we must move with it internally. Ultimately, for Scruton, our sympathetic response, our ‘moving with’ the music, is defined by the fact that music avoids explicit statement, while still inviting the listener to ‘enter into’ its expressive content. The experience of musical expressiveness consists in hearing it as “metaphorical movements in a metaphorical space.” The sounds are heard as figurative life, so that “you are the music while the music lasts.” In addition, though he does not believe it expresses any kind of cognitive content, Scruton suggests that the expressive qualities of a significant musical work can allow us to rehearse emotions that are otherwise very hard to feel.

Like Scruton, Christopher Peacocke gives a central place to metaphor in the experience of musical expressiveness. In a recent influential paper, Peacocke suggests that when music is heard as expressing a particular property, some feature of it is heard “metaphorically-as” that property. Offering a non-linguistic account of metaphor informed by current accounts in cognitive science, Peacocke argues that in listening to music metaphor “is exploited in the perception, rather than being represented.” Thus when a piece of music succeeds in expressing a particular property, some of its features are perceived metaphorically-as possessing some of the characteristics of this property. This may occur at a single moment, or through the development of the music over time.

In a reply to Peacocke, Malcolm Budd contrasts his characteristically minimalist account of metaphorical content as the listener capturing some character of the music as he perceives it, with Peacocke’s account of the perceived property as a constituent of the intentional content of the listener’s perception. Budd questions what information a metaphorical-as constituent of a perception carries. He suggests that if it is no information, then the claim of metaphorical-as perception to cognitive status lapses. Kivy also raises questions about Peacocke’s account, asking the normative question of what metaphorical readings are permissible in Peacocke’s sense of metaphorically-as. He worries that it is unclear whether the account places limits on what can be heard metaphorically-as, leaving open the possibility that anything is permissible.

d. The Expression of Negative Emotions

The traditional question of the value of negative emotions in aesthetic experience applies to classical music as it does to the other arts. However, the question involves additional challenges in the case of pure music if one considers such music to be both abstract and highly expressive. In arguing for a specifically musical emotion that is both pleasurable to experience and universal to all aesthetically significant works of music, Gurney sidesteps the issue altogether. Nelson Goodman addresses the question by suggesting that in aesthetic experience “emotions function cognitively,” meaning that we use emotions to understand the aesthetic content of the work. In an influential essay entitled “Music and Negative Emotion” Jerrold Levinson accepts the suggestion made by Goodman and argues that Aristotle’s original claim of catharsis also has substantial merit in the case of classical music. Beyond these he identifies six additional “rewards” that may be associated with listening to music expressive of negative emotions, most having to do with benefits associated with experiencing and understanding emotions, either ours or another’s. Stephen Davies, by contrast, suggests in Musical Meaning and Expression that there is no real difference between our willingness to expose ourselves to negative emotions in music and our willingness to do so in other areas of life, so the question is more about our response to the human condition than it is about listening to music. A related possibility is that negative emotions in music offer a truthful reflection of our experience outside of music, and that we value such music in part because it affirms a reality we experience in our lives.

6. Human Experience and Values

Beyond the claim for emotionally expressive content in music, some writers have suggested that classical music possesses content that reflects aspects of human experience and values that surpass the expression of emotion, mood, and feeling, or the interplay of imagined personae. Wilhelm Dilthey and Jean-Paul Sartre both make such claims for music, and kindred claims can also be found in the writings of a number of contemporary aestheticians. However, while claims for a more significant human content in music resonate with many people, they have found only limited support among theorists because it has proven difficult to sustain an argument for the presence of this kind of content in music alone without tying the aesthetic claims to a larger philosophical framework that itself makes claims about human experience and values.

a. Dilthey and Music as the Expression of Lived Experience

Wilhelm Dilthey offers one of the most suggestive approaches to the expression of content holding a larger human significance in his late hermeneutical writings, especially in his discussion of musical understanding in “The Understanding of Other Persons and Their Manifestations of Life.” Dilthey’s argument for the expression of human experience in music depends upon a specific conception of what artistic expression consists in. Like Hegel, Dilthey holds that the psyche must obtain self-knowledge by objectifying itself. Unlike the literary, dramatic, and visual arts, however, music alone cannot make use of things or images from the shared external world, nor can it make use of the ability of words and images to refer to the inner world of emotions, perceptions, thoughts, and ideas. Instead, Dilthey argues, music transforms lived experience into a form of expression all on its own in a way that opens up areas of human experience not accessible to the other arts.

The composer does not translate feelings that arose outside of music into musical terms. Rather the composer develops a capacity for specifically musical feelings through immersion in a musical tradition, in this case the tradition of Common Practice tonality together with all of the expressive techniques developed within this framework by individual composers. This capacity allows the composer to transform non-musical experiences into musical ones. Unlike most other expressive arts, music does not achieve its meanings through signification or representation. Instead, the capacity for musical feelings, as developed in relation to a musical tradition, takes the place of the capacity for signification found in language or that of representation found in the visual arts. Every art requires some vehicle or means through which to pursue the goal of appropriating the human world. In the case of music, Dilthey suggests, this vehicle is a capacity for musical feelings developed within a specific cultural tradition.

Expressions of lived experience in music, then, are expressions, not just of the uniquely individual experience of the composer, but of individuality perceived against a particular cultural-historical background. Expressions of lived experience express not only the individuality of the composer’s experience, but also the composer’s experience as it is determined by cultural and historical factors. As Edward Lippman points out, a primary reason why Dilthey is able to develop his argument as he does is that he interprets the arts as a whole in relation to a conception of interconnected cultural systems that are themselves part of the “overall nexus of life.” It is only because classical music consists in a tradition that is interwoven into this nexus that it can transform lived experience into an object of artistic expression.

b. Sartre, Adorno, and Music as a Social Force

Offering a major revision of the theory of music that he presents in The Psychology of Imagination, Jean-Paul Sartre argues in “The Artist and His Conscience” that rather than consisting in an object of ideal beauty, music instead expresses cultural-historical values. Sartre explores the musical work as a historical and cultural totality, which simultaneously reflects and transcends its time. He identifies music as a “non-signifying art,” one that does not refer beyond itself, but nevertheless possesses a meaning. This meaning cannot be adequately expressed by any system of signs, but instead “is always a matter of a totality, a totality of a person, a milieu, time, or human condition.” Sartre’s focus in this essay is upon the possibility of music as a committed art form, by which he understands an art form that furthers human freedom. As George Bauer points out, for Sartre the goal of the musician is to find a means of “revealing the liberty of the human condition within his compositions–even to the untutored.”

Sartre’s basic claim is that the aesthetic choices a composer makes reflect the values of the composer’s cultural-historical context. Although Sartre does not deny that music is capable of reflecting the individual values of the composer, he is primarily interested in the way that music reflects, and possibly allows for the transcendence of, the human situation in a particular time and place. Sartre’s claim stems from the intuition, present in Western philosophical thinking about music since the time of Plato, that music has social and political implications, that it can be a transformational force and a potential threat to the established order.

Like Sartre, Theodor Adorno interprets strictly musical qualities in classical music to have social and political implications. Although his influential sociological interpretation and critique of classical music lies outside the scope of the aesthetics of classical music, in his writings on specific composers Adorno identifies political and social implications in classical music as well as other significant human content in the composer’s treatment and alteration of musical conventions. In his writing on Mahler, Adorno argues that a social critique is evident in the relationship the composer establishes between the individual theme and the larger symphonic form. Adorno claims that Mahler’s liberation of individual themes from ties to the larger formal structure, traditionally conceived as a problem in his music, in fact establishes an “archaic banality,” akin to improvisation, which is “located prior to the constitution of the harmonically symmetrical relationships and corrodes them.” Seen in this light, the true significance of Mahler is that he is “using the archaically corroded material of romanticism … in protest against the bourgeois symmetry of form.” Against this symmetry he opposes “the free contours of the freshly trodden landscape of the imagination.” Thus, Adorno finds in Mahler’s alteration of conventional musical relationships a subversion of the bourgeois order, which is capable of elevating the social awareness of the listener.

Adorno finds another kind of human significance in the late style of Beethoven, arguing that his late style reveals the ultimate inability of art to address the human condition. The traditional view held that Beethoven’s late work reflects “an uninhibited subjectivity … which breaks through the envelope of form to better express itself.” Against this view Adorno argues that in Beethoven’s works generally, rather than breaking through form, the composer’s subjectivity creates it. The middle Beethoven transforms his musical materials according to his intention, freeing them from convention through the compositional uniqueness that he achieves. The late Beethoven, by contrast, makes use of “conventions that are no longer penetrated and mastered by subjectivity, but simply left to stand.” According to Adorno these conventional materials exist in a fractured landscape that reflects the composer’s encounter with mortality: “the finite powerlessness of the I confronted with Being.” Thus, Adorno concludes, “[i]n the history of art late works are catastrophes.”

c. Contemporary Theories

More recently, Patricia Herzog has argued that purely instrumental music can convey content of profound significance to human life and that the value of such music resides largely in the value of the content that is conveyed. Purely formal accounts of music overlook this content and consequently cannot offer insight into the most important aspects of musical value. In Herzog’s view, music criticism must seek to articulate aesthetic value by grasping human values in music. Drawing on the work of Edward Cone and Joseph Kerman, Herzog bases her argument on the intuition that music contains a significance to human life that cannot be grasped by limiting the study of music to intramusical relations and any expressive content these abstract forms may yield.

Herzog claims that grasping purely intramusical meanings will never answer the important questions about music, since such meanings fail to provide a sufficiently rich interpretive vocabulary and “do not generate categories that tell us why music matters.” These questions must be answered through an evaluative connection to the music, one that links the music to human interests. For Herzog, the best works of classical music possess a recognizable conceptual content of human significance. The profundity of this content plays a major role in determining the work’s aesthetic value. Aaron Ridley also claims that music can convey a profound content. Drawing on the music criticism of J.W.N. Sullivan and echoing Dilthey, Ridley argues that certain works of classical music convey the depth and quality of the artist’s experience of life and that listening to them gives us the opportunity “to grasp, or at least to gain an inkling of, a state of soul or an outlook of extraordinary depth.” Arguing against positions such as those of Herzog and Ridley, in Music Alone Peter Kivy questions whether it is possible to articulate the profundity of music. Kivy suggests that the profundity of music can only be possessed directly through the listening experience. He agrees that music matters, but denies that its profundity consists in a content that can be articulated in terms of human experience and values. Kendall Walton takes a more moderate approach to extramusical content in purely instrumental music, proposing that while music does not, as some have suggested, call for imaginative interpretations of musical content in non-musical terms, it does call for “imaginative introspection.” This means that in the listening experience we imagine feeling particular emotions tied to the content of the music. Walton also suggests that music presents non-psychological properties such as struggle and achievement. According to Walton, music’s reference to extramusical realities, though imprecise, is important to explaining the power of music as an art form.

In The Aesthetics of Music Roger Scruton holds that we hear music as purposeful “in the manner of human intention,” and thus events are not just perceived as movement, but as action (though he rejects the need for reference to an imaginary subject). Scruton argues that because we experience music as “figurative life,” music embodies and transmits the values of the culture that produces it. When we enter into the music through sympathetic listening, we rehearse the patterns of emotions that correspond to those values. Like Plato, Scruton suggests that music exercises an influence on our character. Drawing an analogy to dance and its evolution from the Baroque period onward, Scruton claims that through the feelings it causes us to experience in our sympathetic engagement with its gestures, classical music educates our emotions, in contrast to popular music, which increasingly represents the decline of Western musical culture, a progressive movement toward disorder led by the sexual impulse. Appreciating classical music, Scruton argues, is a form of latent dancing so that “the search for objective musical values is one part of our search for the right way to live.”

Theories that find music alone to be capable of expressing aspects of human experience and values must account for how an apparently abstract art can convey such content. Though attempts continue to be made to explain how music achieves this kind of result, most theorists find the attempts made to date to be unsatisfactory. Dilthey’s hermeneutical account would appear to be among the best developed, but it relies upon additional assumptions about the nature of artistic expression and the compositional process that most theorists would not accept, or at the very least would find to be in need of significant additional exploration. Thus, while theories claiming the expression of human experience and values appeal to the common intuition that certain works of classical music possess a meaning that has larger implications for human life, definitive identification of such meanings has proven to be elusive.

7. References and Further Reading

Adorno, Theodor. “Late Style in Beethoven.” Trans. Susan Gillespie. Raritan 13:1 (1993): 102-06.
- A reinterpretation of the meaning of stylistic qualities in Beethoven’s late works.
Adorno, Theodor. “Mahler Today.” Essays on Music: Theodor Adorno. Ed. Richard Leppert. Trans. Susan Gillespie. Berkeley: University of California Press, 2002.
- Advances the claim that Mahler’s deviation from the thematic techniques of tonal harmony should be understood as an artistic subversion of the Bourgeois order.
Bauer, George Howard. Sartre and the Artist. Chicago: University of Chicago Press, 1969.
- An analysis of Sartre’s use of art and artists to convey his conception of the difference between being and existence as it relates to art.
Beardsley, Monroe. “Understanding Music.” On Criticizing Music: Five Philosophical Perspectives. Ed. K. Price. Baltimore: Johns Hopkins University Press, 1981.
- Extends Goodman’s concept of exemplification to music.
Budd, Malcolm. Music and the Emotions. London: Routledge, 1985.
- A penetrating critical examination of influential theories of emotion in music, including those of Hanslick, Gurney, Schopenhauer, Cooke, Langer, and Meyer.
Budd, Malcolm. “Musical Movement and Aesthetic Metaphors.” British Journal of Aesthetics 43:3 (2003): 209–23.
- Argues against Scruton’s account of musical motion in terms of spatial metaphors understood metaphorically, suggesting it is preferable to conceive of musical motion in terms of a purely temporal Gestalt.
Budd, Malcolm. “Response to Christopher Peacocke’s ‘The Perception of Music: Sources of Significance.’” British Journal of Aesthetics 49:3 (2009): 289-92.
- An evaluation of Peacocke’s conception of the role of metaphor in music.
Budd, Malcolm. Values of Art. London: Penguin, 1995.
- Complements his earlier work with the addition of a “basic and minimal” conception of emotion in music as well as an exploration of the value of music as an art form.
Carroll, Noël. “Art and Mood: Preliminary Notes and Conjectures.” The Monist 86:4 (2003): 521-555.
- Explores the possibility that musical moods can offer a solution to the debate between formalist and arousalist positions.
Clifton, Thomas. Music as Heard: A Study in Applied Phenomenology. New Haven, Conn.: Yale University Press, 1983.
- Considers the experience of music from a phenomenological perspective.
Cochrane, Tom. “Music, Emotions and the Influence of the Cognitive Sciences.” Philosophy Compass 5:11 (2010): 978–88.
- Suggests that psychology and neuroscience can provide additional support for one theory of our experience of music over another, as well as in some cases allow us to reframe and synthesize traditionally distinct positions.
Cone, Edward T. The Composer’s Voice. Berkeley: University of California Press, 1974.
- Argues for a theory of musical communication based on the composer’s musical personae.
Cook, Nicholas. Music, Imagination, and Culture. Oxford: Clarendon, 1990.
- Examines music from the point of view of the composer and the listener, arguing that the role of the listener is of primary importance.
Cooke, Deryck. The Language of Music. Oxford: Oxford University Press, 1964.
- Seeks to show that certain recurrent patterns present in the music have specific emotional meanings, making it possible to construct a basic emotional vocabulary of classical music.
Dahlhaus, Carl. The Idea of Absolute Music. Trans. Roger Lustig. Chicago: University of Chicago Press, 1989.
- A hermeneutical inquiry into the history of our conception of absolute music.
Davies, Stephen. Musical Meaning and Expression. Ithaca: Cornell University Press, 1994.
- A comprehensive discussion of major issues in musical aesthetics, including a presentation of his mirroring response theory of musical expression.
Davies, Stephen. Musical Works and Performances. Oxford: Clarendon, 2001.
- An in-depth exploration of the nature of musical works and of authenticity in musical performances.
Davies, Stephen. Musical Understandings and Other Essays on the Philosophy of Music. Oxford: Oxford University Press, 2011.
- A collection of essays addressing the listener’s response to the expression of emotion in music, the role of the listener in the perception and understanding of music, as well as other central issues in musical aesthetics.
DeBellis, Mark. “Music.” The Routledge Companion to Aesthetics. Ed. Berys Gaut and Dominic McIver Lopes. New York: Routledge, 2001.
- An overview of major topics in musical aesthetics.
Dilthey, Wilhelm. Selected Works, Vol. 3: The Formation of the Historical World in the Human Sciences. Ed. Rudolf Makkreel and Frithjof Rodi. Princeton: Princeton University Press, 2002.
- Contains Dilthey’s late hermeneutical approach to musical aesthetics in the essay “The Understanding of Other Persons and Their Manifestations of Life.”
Goehr, Lydia. The Imaginary Museum of Musical Works. Oxford: Oxford University Press, 1994.
- Offers a genealogy of the concept of a musical work from antiquity onward, arguing that no analytic method can succeed in defining musical works and that before 1800 compositions and performances were not governed by the work concept.
Goldman, Alan. “The Value of Music.” Journal of Aesthetics and Art Criticism 50:1 (1992): 35–44.
- Argues that music presents us with another world, separate from everyday life.
Goodman, Nelson. Languages of Art. Indianapolis: Bobbs-Merrill, 1968.
- Highly influential work exploring the nature of musical expression and the relationship between works and performances.
Gracyk, Theodore and Andrew Kania, eds. The Routledge Companion to Philosophy and Music. New York: Routledge, 2011.
- A comprehensive guide to major topics and thinkers in musical aesthetics.
Gurney, Edmund. The Power of Sound. New York: Basic Books, 1966.
- A monumental study drawing on evolutionary theory to analyze the nature of musical expression.
Hanslick, Eduard. On the Musically Beautiful. Trans. Geoffrey Payzant. Indianapolis: Hackett, 1986.
- Classic treatise in musical aesthetics, arguing that aesthetic value in music is purely formal in nature.
Herzog, Patricia. “Music Criticism and Musical Meaning.” Journal of Aesthetics and Art Criticism 53:3 (1995): 299-312.
- Makes the case for content of a profound human significance in classical music.
Kant, Immanuel. Critique of Judgement. Trans. J.H. Bernard. New York: Hafner, 1951.
- A foundational text in aesthetics; evaluates whether music is a proper object of aesthetic judgements.
Kivy, Peter. The Corded Shell. Princeton: Princeton University Press, 1980.
- Presents the author’s contour theory of musical expressiveness, supplemented by a convention theory that accounts for our responses to those aesthetic qualities not addressed by the contour theory.
Kivy, Peter. “Mood and Music: Some Reflections for Noël Carroll.” Journal of Aesthetics and Art Criticism 64:2 (2006): 271-281.
- Assesses Carroll’s account of the evocation of moods in classical instrumental music.
Kivy, Peter. Music Alone: Philosophical Reflections on the Purely Musical Experience. Ithaca: Cornell University Press, 1990.
- Considers the experience of textless instrumental music, clarifying and defending the author’s cognitivist position.
Kivy, Peter. New Essays on Musical Understanding. Oxford: Clarendon, 2001.
- A collection of essays addressing historical topics, emotional expression, and concatenationism vs. architectonicism.
Langer, Susanne K. Philosophy in a New Key. New York: Mentor, 1956.
- Argues that works of music should be understood as unconsummated presentational symbols and as such symbolize.
Levinson, Jerrold. Music, Art, and Metaphysics. Ithaca: Cornell University Press, 1990.
- An influential work containing six essays on musical aesthetics and covering topics such as the definition, ontology, meaning, performance, and appreciation of music.
Levinson, Jerrold. “Music as Narrative and Music as Drama.” Mind and Language 19:4 (2004): 428-441.
- Argues that it is natural to hear music as drama and that doing so benefits from the introduction of an imagined persona, while attempting to hear it as narrative poses significant problems.
Levinson, Jerrold. Music in the Moment. Ithaca: Cornell University Press, 1997.
- Presents a sustained argument for concatenationism.
Lippman, Edward. A History of Western Musical Aesthetics. Lincoln: University of Nebraska Press, 1992.
- A thorough survey of influential figures, with an emphasis in its 20th-century coverage on continental aesthetics.
Lippman, Edward. Musical Aesthetics: A Historical Reader. 3 vols. New York: Pendragon Press, 1986.
- An excellent source book in musical aesthetics.
Meyer, Leonard B. Emotion and Meaning in Music. Chicago: University of Chicago Press, 1961.
- A foundational inquiry into musical meaning, focusing on expectation generated by antecedent-consequent relationships.
Meyer, Leonard B. Music, the Arts, and Ideas. Chicago: University of Chicago Press,
- Reworks central aspects of the theory presented in Emotion and Meaning in Music.
Narmour, Eugene. The Analysis and Cognition of Basic Melodic Structures. Chicago: University of Chicago Press, 1990.
- A further development of the basic approach established by Meyer.
Nattiez, Jean-Jacques. Music and Discourse: Toward a Semiology of Music. Princeton, N.J.: Princeton University Press, 1990.
- Argues that music possesses a syntax and thus can be interpreted similarly to any other system of signs.
Peacocke, Christopher. “The Perception of Music: Sources of Significance.” British Journal of Aesthetics 49:3 (2009): 257-275.
- An influential paper arguing that in listening to music metaphor is “exploited in the perception, rather than being represented.”
Ridley, Aaron. Music, Value, and the Passions. Ithaca: Cornell University Press, 1995.
- Focuses on the melismatic gesture as a central component of musical expressiveness.
Robinson, Jenefer. Deeper than Reason: Emotion and its Role in Literature, Music, and Art. Oxford: Clarendon, 2005.
- Drawing on the author’s own theory of emotion, offers an account of musical expression and of the capacity for music to arouse emotions in the listener.
Sartre, Jean-Paul. The Psychology of Imagination. New York: Citadel, 1991.
- Sartre’s early account of music as presenting ideal beauty.
Sartre, Jean-Paul. Situations. Trans. Hazel E. Barnes. New York: George Braziller, 1965.
- Contains the essay, “The Artist and His Conscience,” which argues that music captures a historical milieu and additionally that music can be a transformational force used to further human freedom.
Schenker, Heinrich. Free Composition. Trans. and ed. Ernst Oster. New York: Longman,
- Classic treatise in musical analysis emphasizing the architectonic aspects of musical compositions.
Schopenhauer, Arthur. The World as Will and Representation. Trans. E.F.J. Payne. Indian Hills, Col.: Falcon’s Wing Press, 1958.
- Presents Schopenhauer’s philosophy of music as having the privileged status of being a direct presentation of the will, which is the thing-in-itself or underlying metaphysical reality.
Scruton, Roger. The Aesthetics of Music. New York: Oxford University Press, 1997.
- A thorough and insightful discussion of many of the major issues in musical aesthetics, including spatiality, ontology, expression, understanding, content, and both experiential and cultural value.
Scruton, Roger. “Musical Movement: A Reply to Budd.” British Journal of Aesthetics 44:2 (2004): 184–7.
- Argues for the indispensability of metaphor in the listening experience.
Serafine, Mary Louise. Music as Cognition: The Development of Thought in Sound. New York: Columbia University Press, 1988.
- Identifies twelve cognitive processes that are components of musical cognition and assesses experiments on people of different ages intended to shed light on how these processes develop.
Walton, Kendall. “What is Abstract about the Art of Music?” Journal of Aesthetics and Art Criticism 46:3 (1988): 351-364.
- Argues that music’s reference to extra-musical realities such as unnameable feelings and the dynamics of emotions, though imprecise, is important to explaining the power of music as an art form.
Zangwill, Nick. “Music, Metaphor, and Emotion.” Journal of Aesthetics and Art Criticism 65:4 (2007): 391–400.
- Argues against emotion theorists, claiming that what we experience in response to music is in some ways similar, but not equivalent to, actual emotion, and that instead of taking emotional descriptions of music literally, we should instead understand them as aesthetic metaphors.
Zuckerkandl, Victor. Sound and Symbol. Trans. Willard Trask. New York: Princeton University Press, 1956.
- An influential early study investigating our experience of tone, motion, time, and musical space.

Author Information

Michael Bazemore
Email: mbazemore01@gmail.com

U. S. A.

Ancient Ethics

Ethical reflection in ancient Greece and Rome starts from all of an agent’s ends or goals and tries to systematize them. Our ends are diverse. We typically want, among other things, material comfort, health, respect from peers and love from friends and family, successful children, healthy emotional lives, and intellectual achievement. We see all these things as good for us. So, systematizing our ends involves considering how various goods that we have or seek fit together. In particular, it involves thinking about what makes life good overall—what a happy human life consists in. In ancient ethical theory, then, the core question is: how can I live well? That is, how can I flourish and live a happy life? To a first approximation, happiness consists in having good things, but this formula must be read liberally. The most important goods in life may be activities or experiences, not things that one has in a quite narrow sense. If so, then happiness—having good things—centrally involves the relevant activities or experiences.

Rational reflection on these questions is not just an odd intellectual pursuit unconnected from living life well. Rather, the ancients agree that practical intelligence or wisdom—some sort of understanding of how our ends and goals fit together—is central to living well. We must grasp which ends subserve others (instrumentally or constitutively), which ends are important to our lives as a whole and which are not, and which ends we should reconceive, restrain, abandon altogether, or newly introduce because of how they fit (or fail to fit) with others. We can then guide our lives intelligently, better achieve our ends, and so live well and be happy. This ability to guide our lives intelligently is itself good for us. In fact, it can seem good in a different way from the other ends it governs. Other goods are bad in special circumstances and can be misused. For example, strength is bad when a tyrant conscripts the able-bodied to fight in an unjust war, and it can also be used to bully others. Practical intelligence is always good and cannot be misused; it is unconditionally good for the agent. Since happiness consists in having good things, in a suitably broad sense, and since practical intelligence is a preeminent good, living well centrally involves having and exercising practical intelligence.

This introduces another main feature of ancient ethics: it gives a central role to human excellence or virtue. Practical intelligence—a systematic, coherent grasp of all the goods in a life—is a virtue. Clearly, such a virtue, which amounts to expertise at living, plays a crucial role in living well (as expertise in any domain plays a crucial role in good performance in that domain). So this virtue, at least, is necessary for happiness. By reflecting on how practical intelligence connects to other virtues, we can see why ancient ethical theories say that virtue more generally is necessary, or even necessary and sufficient, for happiness.

1. Plato
2. Aristotle
3. Stoicism
4. Academic Skepticism
5. Epicureanism
6. Pyrrhonism
7. References and Further Reading
    a. Primary Works
    b. Secondary Works

1. Plato

Plato says that happiness is the possession, or the possession and correct use, of goods. Correlatively, misery is the possession of bads, or the possession and incorrect use of goods. If we ask why anyone does what she does, and reach the point of showing how her action fits into a happy life, we have fully explained and justified her action; no further question about why she wants to be happy and live well is apt. Put another way, we do everything for the sake of happiness, and we need nothing beyond happiness. Wisdom is both our highest good and the ability to use other goods well and beneficially. So, wisdom should be the first concern of anyone who wants to live well and be happy (that is, everyone). In particular, wisdom is more important than bodily and reputational goods such as health and honors. Now, just as the condition that enables skillful activity in any domain is expertise in that domain, so too the state that enables skillful activity with goods is expertise concerning goods. So, wisdom, the highest human good, is knowledge of the good.

However, a problem lurks. If wisdom is the highest good for a human being, and wisdom is knowledge of the good, then wisdom seems to be knowledge of itself. This is unintelligible, and even if it were intelligible, it sounds useless. So, Plato introduces the form of the Good, distinct from other goods (including the highest human good), as the proper object of wisdom. The form of the Good is good without qualification: it is not merely the good of this or that sort of thing; it is what goodness is, in relation to which other goods are (qualifiedly) good. This gives a formal characterization of the Good; more substantively, the Good is unity. So, each thing is good when it is unified; civic unity is the highest good of a city, and psychological unity the highest good of a soul. That is, the soul achieves its highest good by putting its ends and attitudes into a coherent structure. This happens by coming to know the Good; when someone grasps that, she becomes like the object of her knowledge. The Good is unity, and knowing the Good unifies the soul. This identification of the Good with unity is one reason why Plato thinks that mathematics prepares the way for ethical knowledge.

That covers wisdom and its primary object, but what about other virtues? Plato sometimes says that all the virtues simply are wisdom—for example, that wisdom enables one to rule one’s pleasures and appetites (so that temperance is wisdom) and fears (so that courage is wisdom). On this view, there is only one virtue with several names. Elsewhere, he offers a somewhat weaker view: there are several virtues, but having one requires having them all. Even the weaker view provokes surprise; common sense says one can be, for example, just but not temperate, or wise but not courageous. Both versions of the claim that virtue is unified are grounded partly in the claim that affective states represent their objects as good or bad. For example, when someone fears heights on some occasion, she fears the harms of falling—fear represents something as bad for the subject. But wisdom systematically grasps what is really good and bad for us. So, the wise person never harbors any false belief about what is really good or bad for her. Hence, she fears things only to the extent that they really are bad for her—neither more nor less. That is, she is courageous, rather than cowardly or rash. Some things that the wise person knows are not bad may still appear bad to her, though, just as perceptual illusions persist even for those who do not trust in them.

Justice is a particularly important case; two of Plato’s longest works defend the claim that justice is unconditionally good for its possessor. The Gorgias says justice is organization, so a just soul is an organized soul; the Republic says justice is the condition in which each does its own work, so a just soul is one in which each part of the soul does its own work. As with the other virtues, justice is closely connected to wisdom. Again, wisdom is knowledge of the good; it is a systematic and coherent grasp of the relationships among all the goods one seeks. So, wisdom organizes the soul, and the wise person will be just. Because justice is so closely tied to wisdom, it is unsurprising that, like wisdom, it is unconditionally good for the agent. Thus, acting unjustly for the sake of mere conditional goods (for example, wealth or political power) is never prudent. For example, one cannot betray a friend and still have an organized soul; such actions reveal deep ignorance of what goods are most important and make life go well. Some scholars have worried that one could perhaps betray friends and still have an organized soul. Addressing this concern requires reflecting on whether loyalty to friends is actually more important than wealth. If it is, then someone with an organized soul will track this fact, and will never betray friends for the sake of wealth. And if having an organized soul is unconditionally good, then betraying friends for the sake of wealth is never prudent.

As we have seen, Plato thinks virtue is closely related to happiness. In particular, virtue is necessary for happiness—the vicious are not happy, but miserable. But we have not yet seen whether he thinks virtue suffices for happiness, or what else might conduce to happiness. Two important commitments in this regard—which Plato never explicitly thematizes, but regularly assumes—are that virtue and happiness (and vice and misery) come in degrees. Because virtue is the central determinant of happiness, it seems clear that as one becomes more virtuous, one becomes happier. One might take virtue to be the sole determinant of one’s degree of happiness. But in fact, Plato thinks that goods and bads other than virtue and vice—conditional goods such as wealth and honors—are relevant to how happy one is. These have opposite effects on the virtuous and vicious. Somebody with a certain degree of virtue, but with more conditional goods, is happier than somebody with the same degree of virtue but without those goods, or with correlative conditional bads. Somebody with a certain degree of vice, but with more conditional goods, is more miserable than somebody with the same degree of vice but without those goods, or with correlative conditional bads.

The reason for this is that conditional goods enable one to exercise one’s character more widely, while conditional bads prevent one from exercising one’s character as widely. Conditional goods thus allow a virtuous person to exercise her virtue more widely and a vicious person to exercise her vice more widely—they allow virtuous and vicious people to perform more virtuous and more vicious actions, making them happier and more miserable, respectively. Conditional bads keep virtuous and vicious people from performing actions that express their virtue or vice as fully, which makes them less happy and less miserable, respectively. Plato may think that these activities affect our happiness or misery directly, or he may think that their influence on our happiness is fully mediated by how they further shape our characters; he never commits himself one way or the other.

Plato thinks the highest human good is systematic knowledge of the Good (unity) together with the virtues identical to or entailed by that knowledge. Naturally, he rejects competing candidates for the highest human good, such as pleasure, love and friendship, and artistic achievement. In each case, he says how these other plausible candidates relate to his view.

The main alternative way of trying to unify our ends is hedonism, the view that the good—which we do everything for the sake of and which is all we need—is pleasure. Plato argues against hedonism in two main ways: (i) pleasure and pain occur together and cease together in the same place at the same time, as opposites like good and bad do not; (ii) pleasure is a process of restoration culminating in a good, harmonious condition, so pleasure cannot be the same as the good, harmonious condition it culminates in. These points are related: since pain is the felt disturbance of a good, harmonious condition, and pleasure the felt restoration to a good, harmonious condition, pleasure and pain (for example, pains of hunger and pleasures of eating) often occur together and cease together. This observation also allows Plato to argue that the virtuous live most pleasantly (although their pleasures do not make them happy). Because most bodily and reputational pleasures coincide with contrasting pains, they seem more intense than they are. (Compare how colors seem more intense against a contrasting background.) In fact, though, the pleasures associated with virtue and knowledge are larger than bodily and reputational pleasures—or so Plato argues.

Plato takes a similar line on love, friendship, and art: he denies that any of these provides the principles around which one can successfully organize one’s ends and live well, but he recognizes that they play important roles in such a life. When two people love each other and are friends, we can ask about the basis of their friendship. Not just any relationship makes life go well, and relationships directed at some objects can actually keep us from living well. So, we must say what love and friendship are for; Plato suggests that proper love and friendship are directed at the human good, that is, at wisdom and virtue. But love and friendship are not just one way among others to seek wisdom and virtue: Plato always emphasizes the social character of philosophy (that is, love of wisdom). His approach to art is similar: the wrong kind corrupts us when young and tempts even good adults to hold vicious attitudes. However, the right kind of art is important to developing good character in childhood and to sustaining good character through an entire life.

One last topic deserves mention: Plato thinks that the soul is immortal and transmigrates. This is relevant to his ethics not because he thinks one should act differently in this life because the soul is immortal, but because it raises the stakes for decisions made in this life. Our choices have ramifications for our character in the afterlife and in our next life. So, Plato thinks about character development in the very long term—over many cycles of birth and death, covering many thousands of years.

2. Aristotle

Aristotle was Plato’s student, so we should not be surprised to see him developing similar ethical views. Still, there are differences of emphasis, points on which Aristotle is more explicit, and some points of clear disagreement between them.

Aristotle provides formal criteria for our final end, happiness, that closely resemble Plato’s. We do everything for the sake of happiness and do not seek it for the sake of anything further, and we need nothing beyond happiness. The best candidate for something of this sort, he argues, is a full life of excellent rational activity. Some readers think Aristotle has a compound theory of happiness: it is a full life of excellent rational activity, plus external goods such as health, wealth, good looks, and good children. However, Aristotle clearly distinguishes what happiness consists in from what it needs as background conditions, and he thinks happiness needs external goods as background conditions, not as constituents. There are two reasons for this. First, excellent rational activity requires some external goods as tools; second, lack of some external goods “spoils our blessedness.” One way to understand the latter claim is to notice that excellent rational activity must be unimpeded and pleasant; since everyone wants external goods, we need some external goods so as not to be pained by their lack. Aristotle and Plato agree in thinking that the virtuous person lives a better life with more external goods, but Aristotle thinks that enough external bads hinder excellent rational activity. They make the virtuous person unable to exercise her virtues fully, either for lack of tools or because her activities are impeded by pain. Plato thinks rather that lack of external goods or presence of external bads cannot prevent the virtuous person from living well; they can only prevent her from living the happiest possible life.

Aristotle distinguishes two kinds of virtues that rational creatures can have and exercise: intellectual and character virtues. The highest intellectual virtue is wisdom (sophia), which combines a grasp of the world’s highest principles (nous) and ability to reason deductively from them (epistêmê). The first principle of the world is the source of change that does not itself change—often called the “unmoved mover,” or God. Aristotle calls God the highest good, with which he proposes to replace Plato’s form of the Good. Plato distinguishes the form of the Good from any thinkers or thoughts about goodness, and identifies God with intelligence. But Aristotle says God is both thinker and object of thought. Plato’s God is personal; Aristotle’s is impersonal and does not think about the things it changes. God changes other things not by deliberating and acting, but by being what changing things strive to be like, to the extent possible. For example, the stars change in the smallest way possible for things that change: by circular motion. A life spent in exercising the highest intellectual virtues, to the extent possible, is the best life, and a life most like God’s. A life spent in exercising character virtues is also happy, but we exercise character virtues in part to make contemplation possible, while we contemplate just for its own sake. Thus, exercise of the character virtues fits the constraints on our final end less well than exercise of the highest intellectual virtues.

One intellectual virtue, practical wisdom (phronêsis), has a special relationship to character virtue; nobody can have any character virtue without practical wisdom, and nobody can have practical wisdom without all the central character virtues. Thus, Aristotle subscribes to a version of the unity of virtue. Practical wisdom and the character virtues shape and govern the parts of the human being that are non-rational but susceptible to reason (which are concerned with the material and social conditions of human life). One can exercise the character virtues in private life only or also in public life; the latter involves exercising them more widely, so it is preferable and more godlike. Thus, Aristotle addresses his discussion of character virtue to those who intend to enter politics. This is related to an odd feature of Aristotle’s account of justice: because each character virtue can be exercised in relation to others, he identifies “general justice” with the entirety of virtue. Again, practical wisdom and the character virtues can be exercised privately or politically, but achieve their fullest expression politically. The virtue concerned with other people is justice, so there is a sense in which justice encompasses all of character virtue and a correlative sense in which political expertise is simply practical wisdom writ large.

Aristotle describes each character virtue as being (and hitting) a “mean” in both action and feeling. Hitting the mean in action and feeling involves doing the right thing and feeling the right thing, at the right time, in the right ways, in relation to the right people. (Hitting the mean need not involve doing or feeling a moderate amount; it can be right to perform a grand action or to refrain from acting entirely, and it can be right to feel intensely or not to feel at all.) Each virtue is a mean by falling between two vices—wit, for example, is a mean that falls between buffoonery and boorishness.

The ability to figure out what to do can come apart from feeling the right way about one’s situation. (For example, one might see that one should confront a sexist comment, but be more afraid of doing so than one should be.) In such cases, one can either do the right thing despite one’s feelings, or act on one’s feelings contrary to one’s considered judgment. In the former case, the action is continent (but not virtuous); in the latter case, the action is incontinent (but not vicious). So, continence and incontinence are states of character between virtue and vice. Aristotle also sketches a character worse than vice, “brutishness,” and a character superior to virtue, which is godlike.

The ideal of being like God returns us to an important tension in Aristotle’s treatment of external goods. Usually, Aristotle thinks that lack of external goods can ruin the happiness of a virtuous person by impeding her exercise of virtue, and possession of external goods enables the wider exercise of virtue. He even introduces special virtues concerned with great wealth (magnificence) and great honor (magnanimity). Elsewhere, though, he argues that the contemplative life is superior to the political life in part because it needs fewer external goods, and he posits a godlike state that transcends virtue in its detachment from ordinary human concerns like health and wealth. This problem also arises in the case of friends. Friendship in the core sense involves seeking to become virtuous and acting well together. But the virtuous are self-sufficient, and the self-sufficient need friends least; so, the virtuous need friends least. In particular, the more godlike someone becomes, the less she needs friends at all. Aristotle has ways of trying to address this problem: he says that the virtuous need friends so that they have someone to benefit, and in order to best enjoy activities that are their own (since a friend is a “second self”). However, these two strands of Aristotle—one stressing the need for external goods and friends, the other stressing the need for independence from external goods and friends—remain in tension.

3. Stoicism

Stoicism comprises a centuries-long tradition, involving considerable disagreement among its adherents. This article focuses mainly on early Stoicism as articulated by its first three scholarchs: Zeno, Cleanthes, and especially Chrysippus. Some of the claims called “Stoic” here are rejected by other, later Stoics such as Panaetius and Posidonius. Some are rejected by an important early Stoic, Aristo, who lost a struggle to define the movement and so was retroactively deemed heterodox. There are disagreements among the earliest scholarchs as well, only a few of which are tracked here.

“Nature” plays a significant role in Plato and Aristotle’s ethics, especially in the contrast between nature and convention. But nature as a central organizing principle in ethical theory takes off in the Hellenistic period. For the Stoics, this emerges in their formula for the final end, “living in accordance with nature.” Cleanthes understands “nature” here as cosmic nature, while Chrysippus understands both cosmic and human nature.

One key appeal to human nature comes in the form of a “cradle argument,” which uses the behavior of unsocialized babies to establish what is natural and not merely conventional. The Stoics say that a newborn first finds herself and her constitution congenial (oikeion). So, she has an impulse to preserve herself and her constitution. Thus, the newborn finds whatever preserves herself and her constitution congenial, and has an impulse toward it; she finds whatever destroys herself and her constitution uncongenial, and has an impulse away from it. Our constitution includes bodily, psychological, and social abilities. At first, these are unsophisticated; the baby can flail her limbs, perceive her surroundings, and demand food from her caretakers. All these capacities are natural to her and congenial to her, and she has an impulse to exercise and preserve them. In short, the uncorrupted baby, her capacities, the exercise of those capacities, and whatever conduces to the preservation and exercise of herself and her capacities, have value for her. The opposites all have disvalue.

Next, the Stoics sketch the development of more bodily, psychological, and social abilities. We can stand, walk, and run; we can distance ourselves from appearances and assess whether things are as they seem; and we can engage in reciprocal relationships with others. These developments are natural to us. We continue to find ourselves and our developing constitutions congenial and have an impulse to exercise and preserve ourselves and our constitutions. Again, all these things have value for us and the opposites have disvalue.

Some key psychological aspects of our constitution are the capacity to receive impressions (for things to seem a certain way); the capacity to assent to impressions and so form beliefs, or else to withhold assent; the capacity to receive “graspable impressions” (true impressions that could not possibly be false); and the ability to distinguish graspable from non-graspable impressions, and assent to the former but not the latter. Assent to a graspable impression produces a grasp (katalêpsis), which constitutes an infallible awareness of a small part of reality. Grasps are the Stoic “criterion of truth”—the proper touchstone for any inquiry or argument—but they do not amount to knowledge. Knowledge requires stability, even in the face of dialectical examination (as it did for Plato). That requires assenting only to graspable impressions and organizing one’s grasps into a stable explanatory structure. This sets a high bar for knowledge (and for virtue, which, as we shall see, the Stoics identify with knowledge). Few humans, if any, ever attain knowledge. Still, grasps are a stepping stone; both the wise and the foolish have them, and they offer a path from foolishness to wisdom. Even though few of us make it, wisdom is the natural end point of human development.

This brings us back to value, which is distinct from goodness. Only what always benefits is good, just as only what always makes things hot is heat. That is, goodness is unconditional value. Most valuable things lack unconditional value (are not good) for familiar reasons: in special circumstances, things that are ordinarily valuable are disvaluable, and most valuable things can be misused. So, the Stoics call conditionally valuable things preferred indifferents, which should be selected; conditionally disvaluable things are dispreferred indifferents, which should be rejected. Things of no value or disvalue, or very little, are strictly indifferent and should be neither selected nor rejected. Only good and bad things should be chosen and avoided; these unconditional impulses are only fittingly directed at good and bad objects.

This introduces a crucial concept: appropriate actions (kathêkonta), or actions that admit of a reasonable defense. Importantly, the agent need not be able to provide such a defense to perform an appropriate action. (Even non-rational animals have and can perform their own appropriate actions.) As the wise and foolish both have grasps, so both the virtuous and vicious can perform appropriate actions. However, only the wise person can defend her grasps and her actions in the face of all questioning. Since the wise person (also called the sage) does appropriate actions for the right reasons, the Stoics call her actions right actions (katorthômata). The sage’s rational defense of her actions appeals to the value and disvalue of the preferred and dispreferred indifferents at stake, and explains how her selections and rejections respond appropriately to that value and disvalue. There are no action-types (aside from virtuous actions) that the sage always performs; occasionally, even cannibalism and incest are appropriate actions.

If the sage appeals to the value and disvalue of indifferents to explain her actions, where do virtue and the good enter the picture? Start from the developing agent who not only reacts immediately to particular valuable and disvaluable things, but who can compare value and disvalue and sometimes, at least, find the appropriate action. The next step in proper development is to perform appropriate actions regularly and reliably. Eventually, the agent appreciates how appropriate actions fit together into an orderly, harmonious life. At this point, the developing agent comes to see that the order and harmony of her life—made possible by reasoning about value and disvalue—has a value different in kind from the value of the things she reasons about. That order and harmony is, in a word, good.

The primary good thing in Stoicism is virtue, or practical intelligence about comparative selection-value. (Other goods include virtuous activity, the virtuous agent, and a friend—only the good are friends, because only they harmonize with themselves and each other.) The virtuous person appreciates the relevant values at stake in her circumstances and has a stable, coherent view about how to compare the values at stake. (She also knows that she acts with imperfect information, so she acts “with reservation”—in the knowledge that new information may require a change of plans or attitudes.) Unlike preferred and dispreferred indifferents, one would always rather have virtue so understood, and it cannot be misused. That is, virtue has unconditional value—it is good. The sage selects and rejects indifferents constantly and firmly and so has the “smooth flow of life” that the Stoics call happiness.

Since happiness is the possession (or possession and correct use) of goods, and since the Stoics think virtue is the only good and cannot be misused, the virtuous person is happy. The sage’s happiness does not depend upon whether she actually acquires preferred indifferents and not dispreferred indifferents; that is why they are indifferent (with respect to happiness). Virtue is perfect psychological coherence, which does not come in degrees, so neither does happiness. Thus, the sage is fully happy even on the rack (because she has and exercises virtue) and she always acts virtuously. Cicero illustrates this point with the example of Regulus, a Roman general who was captured by the Carthaginians. Regulus promised that he would carry terms of surrender back to Rome and then return. When he arrived in Rome, he argued against accepting the terms, returned to Carthage as promised, and was tortured and killed there. (Notice that this counts as an appropriate action only if keeping a promise to the enemy and its effects had greater selection-value than Regulus’ physical comfort and continued life and their effects. One cannot assume that Regulus’ behavior is required by justice, because the Stoics deny such general claims as “one should always keep promises,” “one should never have sex with close relatives,” and “one should never consume human flesh.”) On the flip side, everyone who is not a sage is foolish (because we all lack perfect psychological coherence) and miserable (because we all have the only bad thing, vice). All non-sages are equally vicious and miserable, even those who are making progress (prokopê), much as those who are underwater but rising toward the surface are drowning no less than those who are not rising toward the surface.

We are now in a position to understand the view most often associated with Stoic ethics: advocacy of freedom from passions (apatheia). This does not mean that we should have no affective life at all. The Stoics have a technical definition of passions (pathê) as fresh, weak judgments that something is good or bad. (A judgment is fresh when it is newly assented to; a judgment is weak when it is unstable and so not known, even if it is true.) The four highest species of passion are pleasure, pain, desire, and fear. Pleasure and desire represent their objects as good in the present and future, respectively, while pain and fear represent their objects as bad in the present and future, respectively. The sage has good versions of three of these four: joy (reasonable elation), wish (reasonable choice), and caution (reasonable avoidance). The Stoics omit any good version of pain, which suggests that the “good feelings” (eupatheiai) are strong, known judgments about what is good and bad, and are never directed at preferred and dispreferred indifferents. The sage, being wise, will never judge that anything that is neither good nor bad (for example, any preferred or dispreferred indifferent) is either good or bad. Further, the sage is not bad at present, but may become bad again. So, she is fittingly cautious about future bads, but she will never experience a negative affect directed at any present badness. For as long as she is wise, she is virtuous, good, and happy, not vicious, bad, and miserable.

So far we have focused on human nature, but we saw above that Cleanthes and Chrysippus both think our end involves living in accordance with cosmic nature. Accordingly, physics (knowledge of nature in general) is a virtue. But how more specifically does knowledge of the cosmos connect to ethics? In at least two ways. First, the Stoics are pantheists—the study of nature reveals that it is providentially ordered, and indeed that the cosmos simply is God. God’s beneficial arrangement of the cosmos (that is, of God’s body) requires that God be good and virtuous. Given the paucity of human sages, physics is the study of the only virtuous, good thing we know. Second, the Stoics use the providential governance of the cosmos and our role as parts of it to argue for ethical conclusions—especially that we should value the common interest more than our own. Chrysippus uses a striking image: suppose our feet were rational. The rational foot would understand itself as part of a larger rational organism, and conduct itself accordingly. For example, given its understanding of what is valuable for the whole of which it is a part, the foot would sometimes want to be muddied. The foot might even desire to be amputated if amputation were the only way for the whole rational animal to carry on in the best way. But each human being is in fact a rational part of a rational whole, the cosmos. So, given our understanding of what is valuable for the cosmos as a whole, we should sometimes want to have dispreferred indifferents, and even sometimes to die, so that the whole cosmos can carry on in the best way.

4. Academic Skepticism

The Academics take their name from Plato’s Academy. Arcesilaus was a head of the Academy who took the school back to (what he thought were) its skeptical roots. Here he could appeal to Plato’s Socrates, who denied knowing anything important and tried to show others that they were in the same position. He could also appeal to Plato, who can be seen as distancing himself from any dogmatic views by writing dialogues, many of which end in puzzlement anyway. The Academics would argue on both sides of any question; in one famous case, Carneades, the greatest of the Academics, went to Rome and argued for justice on one day and against justice on the next. A favorite Academic target was the Stoic claim that graspable impressions exist and can be distinguished from non-graspable ones; debates between Academics and Stoics persisted for quite a long time.

Like other global skeptics, Academics must explain how they can maintain their skepticism without walking off cliffs. They say that they do and maybe even believe what is reasonable or plausible. Plausibility comes in degrees, and Carneades suggests three important grades: initially plausible impressions, uncontroverted impressions (which are not only plausible but also agree with related plausible impressions), and thoroughly tested impressions (which require examining each of the related plausible impressions that agrees with an uncontroverted impression). One can rely on different grades of plausibility depending on the matter at hand. To jump away from something on the ground that may be a poisonous snake, the Academic only needs a plausible impression; to decide how to live, she will want thoroughly tested impressions.

In the Academic–Stoic debate, both sides made accommodations under dialectical pressure. Eventually, one Academic, Antiochus of Ascalon, rejected skepticism and accepted views close to Stoicism in both epistemology and ethics. (Cicero, another late Academic who held more firmly to skepticism, did something similar; his De Officiis rehearses and then supplements the Stoic Panaetius’ work on appropriate actions.) Antiochus claims to be recovering an ancient consensus among Plato, Aristotle, and the Stoics. In ethics, this putative consensus says that virtue suffices for happiness, but possession of external and bodily goods makes the happy person happier, while their lack makes her less happy. The Stoics (says Antiochus) just use new and misleading language to state this consensus view. Antiochus’ “consensus view” lies quite close to Plato’s (as described above), but he papers over differences among his view, Aristotle’s view, and the Stoics’ view on the role of bodily and external goods in happiness. Antiochus’ view of Aristotle is understandable, though, especially since the Aristotelians of his day did hold the view that he attributes to Aristotle.

5. Epicureanism

The views canvassed above all accept that living well consists in virtue or virtuous activity. (Though the Academics are skeptics, they reliably seem to find this sort of view more plausible than the alternatives.) Another kind of ancient ethical theory says that living well consists in pleasure; the most important such view is Epicureanism.

Although they are outliers in other ways, the Epicureans operate from standard constraints on our final end: we do everything else for its sake, and we do not seek it for the sake of anything else. They use several approaches to defend their claim that the final end of all our actions is pleasure. First, they say that pleasure’s goodness is evident in perception and need only be pointed out, not argued for—much as we need not argue that fire is hot, since its heat is evident in perception. Second, like the Stoics, the Epicureans offer a version of the cradle argument. Where the Stoics say that the newborn’s first, uncorrupted impulse is for the exercise and preservation of herself and her constitution, the Epicureans say that she goes for pleasure. Finally, some Epicureans responded to arguments against hedonism. Sadly, no direct replies to the best anti-hedonist arguments of antiquity survive, but we do have some attempts to explain why many people deny the obvious truth of hedonism.

In one way, bodily pleasures and pains have a special role in the Epicurean view: all other pleasures and pains must be “referred to” them, directly or indirectly. For example, worry about losing one’s job might be referred directly to pains of hunger and physical exposure (because the job pays for food and shelter). Worry about what the boss thinks might be referred to worry about losing one’s job, and indirectly to the same bodily pains. This can be repeated indefinitely; perhaps one’s worry about proper clothing is referred to what the boss thinks, and so on. The key claim is that all psychological pleasures and pains must ultimately be referred back to the body. Plato and others, in contrast, say that we have basic non-bodily pleasures and pains—for example, shame at one’s bad reputation, or pleasure when one learns something new, just by themselves.

In another way, though, psychological pleasures and pains have a special role: they have greater magnitude than bodily pleasures and pains. On this point, the Epicureans actually agree with Plato and others above. However, they explain the comparative magnitudes in a different way: the body only registers what is happening right now, while the soul ranges over past, present, and future. The soul thus represents to itself a much larger array of pleasures and pains, and can feel more pleasure and pain than the body can at a moment. (Here the Epicureans disagree with their hedonist predecessors, the Cyrenaics, who say that bodily pain is used as punishment because its magnitude is greater than pain of the soul.)

The other most important Epicurean thesis about pleasure and pain is their denial that there is any neutral hedonic state in which one experiences neither pleasure nor pain. (On this point, they disagree with both Plato and the Cyrenaics.) If there is no neutral hedonic state, then complete removal of pain obviously cannot culminate in the neutral state; the condition in which one is completely free of pain must be pleasure. In fact, once pain is removed, they say, pleasure cannot be intensified, in either the body or the soul. Because psychological pleasures are greater than bodily pleasures, freedom from disturbance of the soul (ataraxia) is the key determinant of happiness, more important than freedom from bodily pain (aponia). Thus, any bodily pain can be outweighed by the pleasure of freedom from disturbance, and the Epicurean sage can live well in any external circumstances, even on the rack. Ataraxia (sometimes translated “tranquility”) requires three main subsidiary achievements: freedom from fear of death, freedom from fear of the gods, and freedom from excessive desire.

Epicurean arguments that death is not fearful continue to attract a great deal of attention from contemporary philosophers. The Epicureans argue that death is the end for us; we are not immortal. Then—and this is where contemporary discussion usually begins—being destroyed cannot harm us, for two reasons. First, when we are dead, we perceive nothing, and only what we perceive can harm us. (Some people object: things we do not perceive can harm us, as when a friend betrays us but we never find out.) Second, when we exist, we are not yet dead, so death cannot harm us while we are alive. Once we are dead, we no longer exist, so death cannot harm us when we are dead either. The second argument can be developed in various ways. The Epicurean poet Lucretius asks whether we were harmed by our pre-natal non-existence, and argues that if we were not, then our post-mortem non-existence also will not harm us. (Some people object: we can be harmed when we do not exist, as when a project that we care about and work hard to support fails after our death. Nothing pre-natal could harm us in this way.) One important clarification: as we shall see, the Epicureans think it is (usually) natural to try to avoid death. However, trying to avoid death does not entail fearing it, any more than we must fear getting our shoes wet in order to avoid getting our shoes wet.

The Epicureans try to remove fear of the gods by appealing to the concept of divinity: gods are immortal and blessed. But perfectly blessed gods can neither be benefited nor harmed by others (including human beings). So, they will never be grateful to human beings for benefiting them or angry at human beings for harming them. Therefore, the phenomena popularly ascribed to divine agency—for example, thunderbolts, seen as expressions of divine anger—cannot be explained that way. To vindicate this claim, they offer scientific accounts of the world solely in terms of the basic principles of atoms and void.

Finally, the Epicureans divide desires: some are natural and others are not. The former are grounded in actual human needs; the latter (for example, the desire to have statues erected in one’s honor) are not. Among the natural desires, some are necessary and others are not. Unnecessary natural desires are grounded in actual human needs (they are natural), but they aim to meet that need in a particular way, even though it could be met in many other ways. For example, caviar can meet the human need for food, so desire for caviar is natural. But our need for food can be met in many ways, so the desire for caviar is not a necessary desire. Natural and necessary desires are for the proper objects of genuine human needs. There are three kinds of natural and necessary desires, depending on what they are necessary for: happiness, freedom from bodily pain, and life. This division is fairly clear: we need some things to stay alive, and desires for those things are natural and necessary. But we could be alive and in severe bodily pain, which is naturally bad for us. So, desires for what we need to remove bodily pain are also necessary—for example, food and drink in general (but not caviar and champagne specifically). Further, we can be alive and free from bodily pain but still miserable, because our minds are troubled. Thus, we also have natural and necessary desires for what can remove mental trouble: virtue and friendship.

Several virtues can be treated fairly quickly. Courage is the state in which one is free from irrational fear of death and the gods (which also requires piety). Temperance is the state in which one has natural desires and abandons unnecessary desires whenever circumstances make it difficult to eat (say) caviar instead of barley. Wisdom is knowledge of death, the gods, desires and pleasures, and the basic structure of the cosmos; it instills piety, courage, and temperance. That leaves the most interesting virtue for the Epicureans, justice, which has both social and personal aspects. Socially, justice is a useful agreement—in particular, an agreement to neither harm nor be harmed. For an agreement to be just, it must actually be useful. Which agreements are useful (and so just) varies, so different agreements are just in different circumstances. Still, the core concept of justice as a useful agreement does not change. Next, there are two accounts of why personal justice is important. First, even if one can get away with violating just social agreements, one cannot be sure that one will get away with it. So, violating just social agreements causes fear. Fear is a psychological pain; since such pains are greater than bodily pains, whatever material goods one hopes to gain by violating a just social agreement cannot compensate for injustice’s cost in fear. Second, whatever one might hope to gain through injustice will not be necessary for life, health, or tranquility. Since the sage is temperate, she desires only what is necessary to life, health, and tranquility. Such limited goods are (usually) easily obtained. So, the sage has no incentive to violate just social agreements. Whenever extreme circumstances might seem to give an incentive, we should reconsider whether the original agreements are genuinely useful in those extreme circumstances, and so whether the agreements are still just.

Lastly, Epicurus praises friendship for its ability to make us tranquil. It is tricky to say how friendship and justice differ. Epicurus says justice is an agreement neither to harm nor be harmed, which suggests a possibility: justice seeks mutual avoidance of harm, not only by not harming one another but also by assisting each other in not being harmed. Friendship goes beyond that; it requires mutual benefit. But what kind of benefits? Friends help each other when necessary, and Epicurus agrees that this is one benefit of friendship. But more important for our tranquility is our confidence that we will have help from our friends in the future, if we need it. This includes not only help with mundane tasks like moving, or momentous ones like providing for one’s children after one dies, but also help in becoming virtuous: the Epicureans actually formed a sort of commune near Athens, and dedicated their days to philosophical therapy through an elaborate set of confessional practices. Thus, friends help each other to achieve the highest good (tranquility) by helping each other to achieve its necessary means (virtue).

6. Pyrrhonism

Pyrrho himself is a nebulous figure, but in the wake of the Academy’s later skeptical turn (see above), Aenesidemus revived his legacy by using him as a figurehead for a different skeptical tradition. The differences between Academics and Pyrrhonists are not always easy to discern. Our main source for Pyrrhonism, Sextus Empiricus, says there are three kinds of philosophers: dogmatists (who claim to have grasped the truth), Academics (who say the truth cannot be grasped), and Pyrrhonists (who are still inquiring). Thus, Sextus effectively characterizes Academics as dogmatists who claim to have grasped one truth. However, his classification does not withstand scrutiny. The Academics follow persuasive appearances, and any claim that the truth cannot be discovered may be understood as what is plausible after extensive inquiry (not: what they claim to grasp as the truth). As we shall see, the Academics’ use of persuasive appearances is not far from what the Pyrrhonists say and do.

Still, there is a clear difference in the ethical attitudes taken by Academics and Pyrrhonists. The Academics typically say that something like the Aristotelian or Stoic view—that virtue and virtuous activity are the highest or only goods—is plausible. The Pyrrhonists say that their end is tranquility (again, ataraxia). This places their ethical attitude closer to the Epicureans, though their recipe for tranquility is rather different. (Here it is worth noting that later Roman Stoics also emphasized tranquility in a way that the early Stoics did not.)

We must work up to that point by considering the development of a young Pyrrhonist. First, she notices that different appearances often make incompatible reports. (The wind seems warm to her and cold to another; cremating the dead seems respectful to her and disrespectful to another.) Aenesidemus listed many ways that appearances can disagree, the “Aenesideman modes.” Such disagreement or relativity of appearances is puzzling: which appearances reflect how things really are? On topics that we care about, such puzzlement is painful and provokes attempts to remove it by vindicating some appearances over others. That is, puzzlement provokes inquiry into how things really are in themselves, as opposed to how they appear to various subjects.

When the Pyrrhonist inquires, though, she discovers equally strong reasons on both sides of every question. Further, whatever considerations she might appeal to in trying to resolve the dispute are also matters of disagreement, requiring more inquiry, and so on. The state in which one finds equally strong reasons on both sides of an issue is “equipollence”; the Pyrrhonist responds to equipollence by suspending judgment on which appearances reflect how things really are. When she does so, the pain that she felt at being puzzled dissolves. Sextus offers a simile: Apelles was trying and failing to paint the froth on a horse’s mouth. In frustration, he threw his sponge at the canvas; fortuitously, it produced the desired effect. Likewise, the budding Pyrrhonist wants to rid herself of troubles about the real nature of things by discovering the truth. She never finds reasons for any particular view better than the reasons on the other side. So, she suspends judgment. But when she does, she fortuitously achieves the end she sought: tranquility. As mentioned above, though, she does not rest on her laurels at this point; rather, she keeps inquiring.

Like Academics, Pyrrhonists must explain how they act. The Pyrrhonist criterion of action is the appearance. We can approach this through the examples of relativity above. When the wind seems warm to one person and cool to another, and they have equally strong reasons to trust each appearance, they might suspend judgment on the question whether the wind is really warm or cool. But this does not remove the appearances; the wind still seems cool to one and warm to the other. It also does not prevent either from acting on her appearance. One might put on another layer of clothing, while the other takes one off. Likewise when two people disagree whether it is respectful to cremate the dead. We might find equally good reasons to say that cremation is respectful and that it is disrespectful. But it may still seem respectful to one person and disrespectful to the other, and nothing prevents each person from acting on how things seem to them. (It is an open question whether this will produce toleration of different opinions or simply make practical disputes irresolvable.) It is unclear exactly how much the Pyrrhonist criterion of action, the appearance, differs from the Academic criterion, the plausible appearance. For example, both Pyrrhonists and Academics follow their traditional religious practices, which suggests some convergence in how ancient skeptics of different stripes deal with action.

Again, their clearest difference concerns the final end. Naturally, the Pyrrhonists do not dogmatically assert that tranquility is the end; it simply seems to them to be the end, and they act based on that appearance. But they say more about why Pyrrhonism seems to be the best path to tranquility—better than Epicureanism, for example. Certain appearances and feelings are unavoidable for us: hunger seems painful and leads us to relieve it. There is no getting rid of these appearances and feelings. However, those who dogmatically assert that pain is bad (for example) face a double dose of pain. They feel not only the inevitable pain of hunger, but also the further pain of mental trouble on reflecting that they possess something that is (by their lights) really bad for them. The Pyrrhonist, however, suspends judgment on the question whether the pains of hunger are really bad for her. Thus, she maintains her tranquility even in the face of life’s inevitable nuisances.

7. References and Further Reading

a. Primary Works

J. Annas and R. Woolf, Cicero: On Moral Ends (Cambridge: Cambridge University Press, 2001).
- Cicero presents the ethical views of the Epicureans, Stoics, and Antiochus, and disputes them with reference to Carneades’ division of ethical theories.
C. Brittain, Cicero: On Academic Scepticism (Indianapolis: Hackett Press, 2006).
- Our main source of information about the Stoic–Academic debate and the development of the Skeptical Academy.
J. Cooper, Plato: Complete Works (Indianapolis: Hackett Press, 1997).
R. Crisp, Aristotle: Nicomachean Ethics (Cambridge: Cambridge University Press, 2000).
M. Griffin and E. Atkins, Cicero: On Duties (Cambridge: Cambridge University Press, 1991).
- Cicero adapts and extends Panaetius’ work on appropriate actions.
B. Inwood and L. Gerson, Hellenistic Philosophy (Indianapolis: Hackett Press, 1998).
- An excellent source book.
A. Long and D. Sedley, The Hellenistic Philosophers (Cambridge: Cambridge University Press, 1987).
- Another excellent source book; v.1 contains translations, while v.2 contains the texts translated (and sometimes more) together with a substantial bibliography.

b. Secondary Works

K. Algra, et al., The Cambridge History of Hellenistic Philosophy (Cambridge: Cambridge University Press, 2000).
- A series of essays by various authors on central topics, with an extensive bibliography.
J. Annas, An Introduction to Plato’s Republic (Oxford: Oxford University Press, 1981).
J. Annas, The Morality of Happiness (Oxford: Oxford University Press, 1993).
- Influential overview of the ethical theories of Aristotle and the main Hellenistic schools.
J. Annas, Platonic Ethics, Old and New (Oxford: Oxford University Press, 2000).
- Argues for a Stoicized interpretation of Plato’s ethics by reference to Middle Platonist readings of Plato.
J. Barnes, The Toils of Skepticism (Cambridge: Cambridge University Press, 1990).
T. Brennan, The Stoic Life (Oxford: Clarendon Press, 2005).
S. Broadie, Ethics with Aristotle (Oxford: Oxford University Press, 1991).
G. Fine, Plato 2: Ethics, Politics, Religion, and the Soul (Oxford: Oxford University Press, 1999).
- A collection of essays, including many classics.
T. Irwin, Plato’s Ethics (Oxford: Oxford University Press, 1995).
R. Kraut, Aristotle on the Human Good (Princeton: Princeton University Press, 1989).
P. Mitsis, Epicurus’ Ethical Theory (Ithaca: Cornell University Press, 1989).
M. Nussbaum, The Fragility of Goodness (Cambridge: Cambridge University Press, 1986).
- A study of moral luck in Greek tragedy, Plato, and Aristotle.
M. Nussbaum, The Therapy of Desire (Princeton: Princeton University Press, 1994).
- Essays on ethical theory and therapy in Aristotle and Hellenistic philosophy.
T. O’Keefe, Epicureanism (Berkeley: University of California Press, 2009).
G. Lear, Happy Lives and the Highest Good (Princeton: Princeton University Press, 2004).
- A study of the relationship between ethical and theoretical virtues in Aristotle.
A. Rorty, Essays on Aristotle’s Ethics (Berkeley: University of California Press, 1980).
- A collection of essays, including many classics.
H. Thorsrud, Ancient Skepticism (Berkeley: University of California Press, 2008).

Author Information

Clerk Shaw
Email: jshaw15@utk.edu
University of Tennessee
U. S. A.

Zhou Dunyi (Chou Tun-i, 1017-1073)

Zhou Dunyi (sometimes romanized as Chou Tun-i and also known by his posthumous name, Zhou Lianxi) has long been highly esteemed by Chinese thinkers. He is considered one of the first “Neo-Confucians,” a group of thinkers who draw heavily on Buddhist and Daoist metaphysics to articulate a comprehensive, Confucian religious philosophy.

This article begins with a brief look at Zhou’s life and historical context before turning to a detailed examination of his major writings. It then looks at major themes in Zhou’s work as well as a few important philosophical concerns that his writings address. Finally, it turns to Zhou’s legacy and influence, providing information on additional readings for further study of Zhou’s thought.

Zhou combines deep spirituality with an emphasis on morality and politics. He places this humanistic ideal within a cosmic vision wherein the forces of creation find their fullest expression in human beings. Essentially, he articulates the common metaphysical framework that informed Chinese philosophy for nearly a millennium. In his work, Zhou follows earlier thinkers such as Mencius (Mengzi, 372-289 B.C.E.), but, unlike some of his stricter Confucian brethren, Zhou draws heavily on ideas associated with Daoism and Buddhism. This is particularly the case with Zhou’s stress on the primacy of “stillness” (jing) over “activity” (dong) and his strong cosmological orientation. Moreover, Zhou’s temperament seems marked more by Buddhist notions of equanimity and compassion than stereotypical Confucian formality and restraint. For these reasons, Zhou remains an intriguing yet controversial figure.

According to Zhu Xi (1130-1200), perhaps the most eminent early Neo-Confucian thinker, Zhou was the first sage since Mencius and a key figure in the “new transmission” of the Confucian Way (Dao). Zhou transmitted the Way to the Cheng brothers, Cheng Hao (1032-1085) and Cheng Yi (1033-1107), who then transmitted the Way to Zhu himself. In this view, Zhou is the “founding ancestor” of Zhu Xi’s school of Neo-Confucianism, a philosophical system that profoundly informed East Asian societies in the Middle Ages. Zhou’s best known works are the “Explanation of the Diagram of the Supreme Polarity” (Taijitu shuo), and Penetrating the Classic of Changes (Tongshu), both of which are included in the Zhouzi Quanshu (Collected Works of Master Zhou). Zhou also wrote a short poetic essay, “On the Love of the Lotus” (Ai lian shuo), that is part of the standard secondary school curriculum in contemporary Taiwan.

Life and Context
Works
Key Concepts
Principal Concerns
Legacy
References and Further Reading

1. Life and Context

Much of what we know of Zhou’s life comes from the Song Shi (History of the Song Dynasty), as well as anecdotes preserved in Reflections on Things at Hand (Jinsi lu), the anthology of Song-era Confucian treatises compiled by Zhu Xi with the help of the historian Lü Zuqian (1137-1181). Totaling some 622 passages culled from the writings of key thinkers (along with Zhu Xi’s comments), this book ranks as one of the most important works of Chinese philosophy.

Zhou was born in Daozhou (modern-day Hunan) into a family of scholar-officials. His “style name” was “Maoshu.” Originally, his personal name was “Dunshi,” but due to the taboo against using the name of the emperor (a widely observed practice in traditional China), Zhou’s name was changed to “Dunyi” when Emperor Yingzong ascended to the throne in 1063. When Zhou was 14 years old, his father passed away, and he was adopted by his maternal uncle, Zheng Xiang. It was through his uncle’s work that Zhou attained his first governmental post. During his career, Zhou served as district keeper of records, magistrate of various counties, and assistant prefect. Traditional accounts say that he was quite diligent in his duties, earning high praise from his colleagues and superiors; yet Zhou refused to participate in the civil service examination system, the typical route by which bright and capable men gained access to the elite levels of Song society. As a result, Zhou neither held a high governmental position nor attained the coveted “presented scholar” (jinshi) degree, the highest rank and a virtual necessity for attaining an influential post.

Towards the end of his life, Zhou fell ill and was transferred to Xingzi in Jiangxi province, where he settled near the foot of Mount Lu, one of China’s sacred mountains. Here he built a retreat along a tributary of the Pen River, naming it Lianxi (“Stream of Waterfalls”) after a stream in his home village; later generations honored Zhou by calling him “Master Lianxi” after his beloved study. Zhou resigned from office in 1071, passing away about eighteen months later. During his lifetime, Zhou was not well known, even though he briefly tutored both Cheng Hao (1032-1085) and Cheng Yi (1033-1107) when they were young. His contemporaries, however, revered him for his warm personality and intuitive insight into the Way of Heaven. Later Neo-Confucians came to regard him as an exemplar of “authenticity” (cheng), much like Confucius’ disciple Yan Hui. In 1200, Zhou was posthumously dubbed Yuangong (“Duke of Yuan”) and in 1241 was honored in the sacrifices performed in the official Confucian temple.

Zhou lived during the Northern Song (960-1126), the “second golden age” of Confucianism. The initial impetus for this Confucian renaissance came from late Tang Confucian thinkers such as Han Yu (768-824), Li Ao (772-836), and Liu Zongyuan (773-819). They were highly critical of Buddhism and advocated for a return to what they considered the true source of Chinese civilization (in Zhu Xi’s words, “this culture of ours”), a heritage enshrined in the Classical Confucian texts. After the collapse of the Tang and the eventual rise of the Song dynasty, Confucianism became the guiding Way and, just as in the Han dynasty (206 B.C.E.-220 C.E.), anyone seeking an official position had to be schooled in Confucian texts and doctrines.

The Confucian revival in the early Song was by no means monolithic, however, and several prominent thinkers also pursued studies outside of official circles. While looking to Confucian ideas, many of these thinkers investigated and embraced Daoist and Buddhist notions, particularly those pertaining to spiritual self-cultivation. The ensuing creative tension between these intertwined lines of thought inspired new interpretations of classical texts and pushed Confucianism beyond its traditional boundaries. Among these thinkers, Zhu Xi singles out a select few as the “Masters of the Northern Song,” a group that included Zhou Dunyi, Shao Yong (1011-1077), Zhang Zai (1020-1077), and the aforementioned Cheng brothers. While it would be wrong to consider these men as forming an institutionalized school, they were united in the view that a society based on the Way could only be achieved through personal reform grounded in cultivation of the xin (“mind-heart”) to harmonize Heaven, Earth, and Humanity.

2. Works

For such an influential figure, Zhou authored surprisingly few works. In fact, of the 622 passages in Reflections on Things at Hand, only 12 are by Zhou—far fewer than the number of passages from Zhang Zai and the Chengs. Most people know Zhou for his essay “Explanation of the Diagram of the Supreme Polarity” (Taijitu shuo) along with his extensive commentary, Penetrating the Classic of Changes (Tongshu). Both texts focus on cosmology as well as the ethical and spiritual implications of their depictions of the cosmos, and both texts continue to exert tremendous influence on Chinese thought. In addition, Zhou is credited with “On the Love of the Lotus” (Ai lian shuo), a short poetic essay that, like many such works, reveals unexpected philosophical depths.

a. “Explanation of the Diagram of the Supreme Polarity”

According to tradition, Zhu Xi was so struck by this treatise that he placed it at the beginning of Reflections on Things at Hand, thereby assuring it pride of place in Neo-Confucian thought. Broadly speaking, it has two main parts: the essay itself, which outlines the evolution of the cosmos, and the accompanying “Diagram” (taiji tu), a graphic illustration of the cosmic process described.

Taiji Tu from an ancient Chinese text

The main theme of the Diagram is simple: the human and cosmic realms are governed by the same norms; the microcosm and the macrocosm correspond perfectly. Much like earlier Chinese thinkers, Zhou proclaims that human life (including the socio-political realm) is rooted in the Way of Heaven, and that it is the duty of the sage-ruler to ensure that the cosmic and human realms harmonize. Nonetheless, Zhou presents this cosmology in a particularly powerful manner, prompting later thinkers to consider the “Explanation” a true masterpiece.

A close look at the “Explanation” yields interesting insights. The treatise can be divided into six parts, each corresponding to certain figures in the Diagram. Part 1 begins with the mysterious “Non-Polarity” (wuji), the primordial yet indefinite source of all reality, which Zhou identifies with the “Supreme Polarity” (taiji), the core of actual existence. The taiji gives rise to yin and yang by alternating from stillness to activity and back. Part 2 picks up with the yin and yang, speaking of how their alternation and combination produces the Five Phases (wuxing: water, fire, wood, metal, earth), which in turn form the basis for the cycles of nature (the Four Seasons). In Part 3, Zhou circles back to include the wuji and taiji, the “Two Modes” (yin and yang), and the Five Phases, noting that the latter interact and stimulate one another, thus generating the myriad things of our world.

At this point, Zhou has covered the entire Diagram, yet the “Explanation” is only half finished. With Part 4, he shifts to humanity, which emerges from the cosmic processes and, as such, is governed by both yin and yang which together engender our “five-fold nature.” In Part 5, Zhou turns to the sage, the ideal Chinese ruler, who more clearly perceives and embodies the cosmic forces than the majority of humankind. Mirroring the cosmic rhythm, the sage addresses and “settles” human affairs through the Confucian virtues of centrality, correctness, humaneness, and rightness while abiding in “stillness.” Finally, in Part 6 Zhou turns to the Yijing (Classic of Changes), referring to the Sage’s wisdom as one that embraces cosmic and human truths.

Zhou makes liberal use of paradoxical language in the “Explanation,” notably in the first line where he both distinguishes the wuji and taiji yet joins them together. In doing so, Zhou suggests an equivalence, if not actual identity. Zhou continues in this same rhetorical mode, speaking of the incipient cosmos as both “still” and “active” in its functioning: “Activity and stillness alternate; each is the basis of the other.” In Part 2, Zhou proclaims that the Five Phases are fundamentally one (“simply yin and yang; yin and yang are simply the taiji”) while each has its own nature. Part 4 opens by declaring that humans have the “finest and most spiritually efficacious [qi],” thus singling us out for special consideration. Humans are distinct yet not separate from other beings or the processes of creation. Part 5 focuses on the sage, a mysterious figure who manages human affairs effortlessly, as though he were the working of nature. Finally, Zhou concludes by stating “Great indeed is the Yijing! Herein lies its excellence!” By closing on this note of awe, Zhou suggests that his treatise proffers a glimpse of the Sage’s cosmic vision.

Zhou’s “Explanation” is simultaneously stirring, enlightening, yet maddeningly mysterious, and this air of mystery is a source of the text’s power. The mystery deepens as Zhou leads us through the Diagram, largely because he describes rather than explains the various figures, and he is strangely silent on some of the Diagram’s aspects. The essay, thus, resembles a theological treatise, laying out basic teachings derived from “scripture,” such as the Yijing, and relating them in a coherent way. In this regard, it resembles the Nicene Creed, a formal statement of core beliefs shared by many traditional Christians. Like the Creed, Zhou’s “Explanation” assumes its readers are familiar with its ideas, presenting them as “articles of faith” but never arguing for why these things should be the case.

b. Penetrating the Classic of Changes

This work comprises forty chapters in all, yet since each chapter is only a paragraph or so in length, it is still relatively short. Ostensibly, the title Tongshu comes from Zhou’s insistence that its principles penetrate (tong) and harmonize with the Yijing. The treatise also draws on the Zhongyong (Doctrine of the Mean), the Shujing (Classic of History), and the Analects. It is likely that the “Explanation” was originally the last section of the Tongshu but that Zhu Xi moved it to the beginning; eventually, it became an independent work due to its importance in Neo-Confucian thought.

The treatise’s main themes are central to the Neo-Confucian project: the necessity of authenticity (cheng) in attaining Sageliness, and how to enact Sageliness in accord with the cosmos to establish the true Way (Dao). Zhang Boxing (1652-1725), who compiled the Complete Collection of Zhou Dunyi’s Writings (Zhou Lianxi xiansheng quanji), divides the Tongshu into two parts, each comprising 20 chapters. Certain ideas and concerns link various chapters but a detailed presentation lies beyond the scope of this entry. Instead, this overview highlights key points and includes quotes to provide a sense of Zhou’s voice and style.

Part 1: Tongshu, Chapters 1-20

The first half of the treatise begins with a stirring proclamation: “Being authentic is the foundation of the sage.” Over the next few chapters, Zhou then touches on several traditionally Confucian topics: the importance of moral virtue, the necessity of learning, how to govern properly, and so forth. Not surprisingly, Zhou grounds each of these human concerns in the workings of the cosmos, much as we have seen with the “Explanation.” However, there are a few points in this first half that make the Tongshu rather unique and thus warrant close attention.

Chapters 7-10, for instance, consist of questions from unnamed students and Zhou’s replies, thereby rhetorically underscoring the essentially pedagogical and dialogical nature of Confucianism. Hearkening back to the example of Confucius, the text presumes that the reader is engaging with the teachings as if face-to-face with the teacher, the “old model” (laoshi), who, in this case, is Zhou himself. Chapter 7 (appropriately entitled “The Teacher”) opens: “Someone asks: ‘Who makes all under Heaven good?’” Reply: “The teacher.” Question: “What do you mean?” Reply: “[He is one whose] nature is simply in equilibrium between firm and yielding, good and evil.” Over the next few chapters, the Teacher reminds us of the good fortune of being able to correct our errors, the importance of thinking as an activity rooted in our primal authenticity, and the need for devotion to learning as we progress towards Sageliness.

Chapters 17-19, on the other hand, deal with what might seem to be a minor consideration: music, and, by extension, the “arts” in general. However, this topic is, in fact, central to Confucianism, which consistently upholds the importance of cultural refinement (wen) as part of the Way. Echoing words from Confucius himself, Zhou speaks of music as a positive influence on people, helping attune them to each other. Thus he says in chapter 17, “[The ancient sages and kings] created music to give expression to the airs of the eight [directional] winds and to pacify the dispositions of all under Heaven.” Not only does music attune the mind-hearts of all people, it also harmonizes us with animals and spiritual beings. We see in this short section the inseparability of the aesthetic, ethical, and spiritual dimensions of sagely learning.

Chapter 20 both summarizes Zhou’s points so far and leads us into the next half. It is fitting, then, that the chapter is entitled “Learning to be a Sage,” and, like chapters 7-10, it is a dialogue between student and teacher. The teacher explains the essentials of the Way of the sage, saying, “Unity is essential. To be unified is to have no desire. Without desire one is unoccupied when still and direct when active. Being unoccupied when still, one will be clear; being clear one will be penetrating. Being direct in activity one will be impartial; being impartial one will be all-embracing. Being clear and penetrating, impartial and all-embracing, one is almost [a sage].”

The Daoist flavor in the first half of the Tongshu is unmistakable, but Zhou is not suggesting that the sage observes the world with an empty mind. Rather, he observes that striving for sageliness means uniting all of one’s faculties. Rooted in one’s true nature, undistracted by wayward desires, and unoccupied by selfish lusts or passing whims, one is directly involved with all things. One can then see clearly and thus respond appropriately.

Part 2: Tongshu, Chapters 21-40

This second half of the Tongshu shifts from the more metaphysical stance of the first part to a more explicitly ethical orientation. Zhou starts off in a typically Confucian fashion by focusing on governing society, stressing the importance of being “impartial” (gong) —scrupulously avoiding selfishness—in order to attain “clarity” (ming). Much like the “Explanation,” the first few chapters correlate moral virtue with the cosmic processes of yin, yang, and the wu xing. Following contemporary Confucian Tu Weiming, we can say that Zhou articulates an “anthropocosmic” vision here. However, the references to the Zhongyong as well as the necessity for intelligence in perceiving truth remind us that metaphysical knowledge is but the first step towards enacting the Way.

One of the most interesting things in this second half of the Tongshu is the central role played by Yan Yuan (Yan Hui), Confucius’ most mystically-inclined disciple. At one point, for instance, Zhou exalts Yan Hui’s example: “Seeing what was great, his mind was at peace. With his mind at peace, nothing was insufficient. With nothing insufficient, then wealth and honor, poverty and humble station were all the same [to him]. Being all the same, then he was able to transform and equalize [others, that is regard others as equal]. Thus Yanzi [Yan Hui] was second only to the Sage [Confucius].” Zhou underscores this same point a little later on in chapter 29, where he exalts Confucius’ “comprehensiveness,” after which he immediately praises Yan Hui as the only one who was able to discern this quality and model it for succeeding generations.

While the entire Tongshu draws on the Yijing, it focuses most explicitly on that work in the last 10 chapters. Much of this section reads as if Zhou were leading the reader through a ritual consultation of that most enigmatic of Chinese classics, referring to hexagrams #1, 24, 25, and 37, among others. Furthermore, it quotes passages from the Xici (“Appended Remarks”), the most philosophically rich section of the Yijing.

Zhou begins the last section of the treatise (chapters 37-40) very simply, invoking the cosmic basis of the sagely Way and stressing the sage’s impartiality while recalling the pedagogical dialogue of earlier sections: “The Way of the sage is perfectly impartial,” I said. Someone asked, “What does that mean?” I replied, “Heaven and Earth are perfectly impartial.” Finally, in the very last chapter, Zhou concludes the Tongshu by giving guidance towards the sagely Way through lines of the Yijing. The last few sentences warrant special attention: “Be cautious! This means [to follow] the ‘timely mean’! ‘Keep the back still,’ for the back is not seen. When still, one can stop [at the right point]. To stop is not to act [deliberately]. To act [deliberately] is not to stop [at the right point]. This Way is profound!”

All told, the Tongshu is a rich, evocative text, appropriately mirroring the mysterious and compelling wisdom of the Yijing. Zhou’s elusive yet allusive style draws on multiple sources, encouraging the reader to make connections between the different sections and events within her own life. While revealing an inspiring cosmic vision, however, it continually reminds readers that its truth can only be realized when enacted daily.

c. “On the Love of the Lotus”

While not a philosophical treatise, “On the Love of the Lotus” (Ai lian shuo) remains Zhou’s most beloved work and reveals surprising spiritual depths. According to tradition, Zhou composed the poem in 1071 after he built his retreat, Lianxi, at the foot of Mount Lu. As was common practice among retired literati (Chinese scholar-bureaucrats), he dug a pond in front of his study and planted it with lotus blossoms, spending much of his leisure time contemplating the scene.

“On the Love of the Lotus” totals some 119 characters in addition to its title, arranged in eleven lines. Each of the lines is a couplet of verses, varying in length. Zhou wrote this piece in the gu wen (“ancient writing”) style, a literary style hearkening back to the elegant prose of the Han dynasty. This style had become increasingly common during the Confucian Renaissance, was a favorite of the late Tang Confucian critic Han Yu, and contrasts with the “parallel prose” style that had dominated Chinese prose previously with the latter’s very strict meter and rhyme scheme. During the Song era, gu wen became the style of choice among the literati, and was a rhetorical signal that the writer and reader were dealing with a work of “special writing” concerning high-minded ideals, versus low or vulgar subjects, much as “the King’s English” functioned in the British Empire during the 19th and 20th centuries. A mark of education and culture, gu wen was still accessible to a degree by members of the lower classes, and thus exemplified the power of wen as a culturally binding force among the Chinese populace.

On the surface, “On the Love of the Lotus” is Zhou’s heartfelt ode to the flourishing blossoms in his garden, evoking the serene presence of flowering chrysanthemums, peonies, and lotuses, each with its distinctive aura and beautiful form. Yet, the piece hints at subtle depths of meaning, pointing to the anthropocosmic vision that Zhou so explicitly discusses in his other works. For Zhou, the lotus exemplifies the cosmic/spiritual harmony that we should all seek. Thus he says, “Inside, it is open; outside, it is straight (zhi)” – a line recalling the time-honored Chinese ideal of Dao. Zhou contrasts this with the chrysanthemum, which is the “recluse” among the flowers, and the peony, which he speaks of as “wealthy,” or showy, gaudy, and appealing to the masses. The lotus, on the other hand, is the “gentleman among flowers.” The term “gentleman” (junzi), of course, has since the time of Confucius been the ideal human being.

Like other Chinese literary works, “On the Love of the Lotus” draws on cultural tropes shared by Confucian, Daoist, and Buddhist traditions. This is most obvious with the image of the lotus itself. As Zhou writes, “I love only the lotus, for rising from the mud yet remaining unstained; bathed by pure currents and yet not seductive.” “On the Love of the Lotus” pulses with subtle yet powerful symbolism, evoking a deep, tranquil mood while encouraging a dynamic and attentive state of awareness. It thus gives a glimpse of the sagely mind itself.

3. Key Concepts

Zhou’s works, while creative and eclectic in nature, establish the basic parameters of Neo-Confucian philosophy. While he never articulates a full-fledged system, most of the concepts he discusses support each other. This overview, therefore, looks at key themes running through Zhou’s writings, explaining what they entail and how they connect to each other.

a. Fundamental Unity within Diversity

A perennial issue in philosophy as expressed in all cultures is the relationship between the myriads of phenomena in the world, which are diverse and seemingly constantly changing, and the underlying unity and stability within this vast whole. A pond is filled with dozens of lotus blossoms, each distinct and with its own unique hue, some in bloom while others wither. Yet all seem to embody the same “lotus-ness,” and each specific blossom remains its own, separate self throughout its life cycle. Similarly, our world is peopled with thousands of different human beings, and every single person has his or her own unique background, thoughts and feelings. And yet, each person’s life follows a similar pattern and each person embodies the same “human-ness.” What is the relationship between the oneness and many-ness that characterizes our world? This problem, the problem of “the One and the Many,” lies at the heart of many of the world’s philosophies, from the Pre-Socratics of ancient Greece, such as Thales, Anaximander, Heraclitus, and others, to the nameless ṛṣis who composed the Upaniṣads, to the various thinkers of classical Chinese civilization. While answers have varied, most solutions assume that the world is “one thing” and so there has to be a unifying aspect to the obvious diversity.

For Zhou Dunyi, the answer is that a fundamental unity encompasses the myriads of things, including human beings. This unity, however, does not consist in some static metaphysical mush wherein all things collapse into a formless One, nor some immaterial Divine Being (“God”). Rather, this unity is a dynamic, integrated system in which all things function together. We can see this clearly in the Tongshu, chapter 22, where Zhou succinctly summarizes the cosmic process: “The two [modes of] qi and the five phases transform and generate the myriad things. The five are the differentia and the two are the actualities; the two are fundamentally one. Thus the many are one, and the one actuality is divided into the many.”

Despite such an all-encompassing metaphysical scheme, Zhou maintains the decidedly human focus typically associated with Confucianism, offering an anthropocosmic vision in which the root metaphors for understanding humanity itself are drawn from the workings of Nature. We can see this most clearly in Zhou’s “Explanation,” where he quotes from the commentary section of the Yijing: “the sage’s virtue equals that of Heaven and Earth; his clarity equals that of the sun and the moon; his timeliness equals that of the four seasons.” In this passage Zhou describes the Way of the sage, the ideal of humanity, in explicitly cosmological terms. Rhetorically, the message is clear: the Way of humanity is the Way of the cosmos.

Students of Chinese thought may recognize in Zhou’s metaphysical vision yet another variant of the notion of humanity forming a triad with Heaven and Earth, perhaps best expressed in the statement, “the unity of Heaven and Humanity” (tianren heyi). This harmonious unity of human beings and the cosmos lies at the center of Zhou’s philosophy and draws quite explicitly on earlier Confucian thinkers, notably Dong Zhongshu (c. 195-105 B.C.E.). In some respects, Zhou Dunyi merely expands upon this basis by borrowing insights from Buddhism and Daoism which he integrates into Confucian tradition. Human beings, along with all other natural phenomena, are integral parts of a larger whole, and in Zhou’s view, we can see this teaching both metaphysically and ethically. Julia Ching suggests that, under Buddhist influence, this idea transformed into the increasingly abstract adage “The Ten Thousand Things are One” (wanwu yiti), and although we can see aspects of such “pantheism” in Zhou’s writings, he never advocates pure withdrawal into metaphysical contemplation; for Zhou, embracing the actual embodied situation trumps mystical wonder.

Not surprisingly, this insistence on a non-dual unity-cum-diversity defies clear articulation. As with many mystical philosophers (for example, Zhuangzi, Huineng, Pseudo-Dionysius, Śaṅkara, Ibn Arabi, and others), Zhou often resorts to the language of paradox. Perhaps the most famous example is in the opening words of the “Explanation”: wuji er taiji (“Non-Polar(ity), and yet Supreme Polarity!”). This most curious of lines consists of a negation and a positive affirmation linked by a conjunction. Grammatically, this phrase both distinguishes the wuji and taiji yet joins them together in some sort of identity. This simultaneous identity and difference echoes chapter one of the Daodejing: ci liang zhe tong chu er yi ming (“these two [wu, ‘non-being,’ and you, ‘being’] interpenetrate, yet, after emerging, differ in name”). Zhou resorts to paradox elsewhere in his writings as well. Rhetorically, paradoxical language poses difficulty for rational understanding. No doubt this more mystical dimension of Zhou’s work has encouraged interpretations that emphasize his debts to Daoism and Buddhism.

The paradoxical harmonious unity of humanity and the larger cosmos also shows in Zhou’s discussion of “stillness” and “activity.” As the second line of the “Explanation” reads: “The Supreme Polarity in activity generates yang; yet at the limit of activity it is still. In stillness it generates yin; yet at the limit of stillness it is active. Activity and stillness alternate; each is the basis of the other.” Much like yin and yang, so cosmic stillness and activity are complementary opposites, not antithetical, but rather co-entailing each other. This cosmic pattern forms the model for the sage as well, who remains still in the midst of activity but also active while keeping still. Such active stillness and still activity expresses the fundamental dynamism governing existence as a whole.

One issue that arises with Zhou’s notion of unity within diversity is whether he is speaking strictly cosmologically, concerning the “physical” functioning of reality, or metaphysically, concerning the ultimate structure of the cosmos. Zhou’s writings are ambiguous on this point, and lend themselves to both readings. A. C. Graham, however, argues that Zhou is speaking cosmologically, and that the tendency to read Zhou metaphysically is due to Zhu Xi’s reading in which he equates the taiji with li (principle).

Zhou provides a subtle way to understand the psychological dimension of such unity when he speaks of “impartiality” (gong). One who is “impartial” remains unswayed by petty desires, and thus can respond to any situation without complications. As Zhou says, “Being direct in activity, one will be impartial; being impartial one will be all-embracing.” There is no sense of withdrawal, but rather an active embracing of existence. Moreover, such engaging with the world at large is the sage’s Way, a state that mirrors the cosmos.

b. Human Nature

Zhou’s anthropocosmic vision, centering as it does on the unity of Heaven, humanity, and all things, entails a specific notion of human nature (xing). Indeed, discussion of human nature is one of the hallmarks of Neo-Confucian tradition. Unlike the Chengs and Zhu Xi, Zhou does not explicitly spell out his view of human nature, but we can infer quite a lot from his writings.

Zhou never uses the actual term xing in the “Explanation” but he mentions it several times in the Tongshu. Much like what we see in Mencius and the Zhongyong, Zhou implies that the nature of human beings is endowed by Heaven and is fundamentally good. Zhou once more turns to the Yijing: “The alternation of yin and yang is called the Way. That which issues from it is good. That which fulfills [or constitutes] it is human nature.” Zhou closes this important chapter on a particularly reverent note: “Great indeed is change, the source of human nature and endowment!” Further on in chapter 3, Zhou extolls behavior in accord with the Five Constant Virtues (humanness, righteousness, propriety, wisdom, and honesty), observing that “One who is by nature like this, at ease like this, is called a sage. One who recovers it and holds onto it is called a worthy.”

Clearly, Zhou espouses the Mencian view of human nature as innately good. Human beings are naturally moral creatures. However, there is a tension in Zhou’s philosophical anthropology, in that the distinction between good and evil does not reside at the primary level of cosmic origin. As he states in the “Explanation”: “Only humans receive the finest and most spiritually efficacious [qi]. Once they are formed, they are born; when spirit [shen] is manifested, they have intelligence; when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue.” Similarly, in chapter 3 of the Tongshu Zhou cryptically says, “In being authentic there is no [intentional] acting [wuwei]. In incipience there is good and evil.” Here Zhou’s insistence on stillness as cosmically fundamental means that this ultimate level transcends the distinction between good and evil; the latter distinction only arises when human beings begin to interact with actual things. Later commentators have spilled much ink arguing about what Zhou means.

Broadly speaking, Zhou espouses the cultivation of the “mind-heart” (xin) that became a hallmark of Neo-Confucian religiosity, yet he apparently draws heavily on Daoism. Certainly Zhou uses terms often associated with Daoist neidan (“inner alchemy”), notably qi, the basic “stuff” of the universe, shen (“spirit”), and even jing (“essence”), although he mentions the latter only once or twice. By contrast, Zhou says quite a bit about shen, which he associates with cognitive abilities. Thus, as Zhou observes in the Tongshu, “That which ‘penetrates when stimulated’ is spirit (shen).” Apparently shen lies dormant until it is stimulated by external phenomena, at which point it is activated and “knowing” begins.

The place of qi in Zhou’s view of human nature is vague. That is, qi is a vital component of human beings and all things, yet Zhou never discusses it to the same extent that we find in the writings of later Neo-Confucians. Nor does he differentiate it explicitly from “Principle” (li). In the “Explanation,” Zhou speaks of the wu xing as the basic phases of qi, and hence fundamental to the workings of the cosmos, going on to note that “Only humans receive the finest and most spiritually efficacious [qi].” This statement implies that human nature is unique; people have a special status in the world albeit not as beings of a different order than the myriads of other things. Joseph Adler suggests that for Zhou, humans naturally manifest shen because they are endowed with the most refined qi. It is due to the functioning of shen, then, that we are able to encompass all things. Here, Zhou clearly anticipates later Neo-Confucian views concerning human cultivation as a refining of qi, although he does not speak of differences between people in terms of the “coarseness” and “refinement” of qi. We should note, however, that he does not articulate the full explanation we find in Zhu Xi’s works.

c. Authenticity as Humanity’s Ethical and Ontological Basis

Following the spiritual current of Confucian tradition exemplified in Mencius and the Zhongyong, Zhou maintains that authenticity (cheng) is essential to be fully human. In fact, Zhou opens the Tongshu by declaring, “Being authentic is the foundation of the sage.” He goes on to add that it is “the foundation of the Five Constant [Virtues]” as well as being “perfectly easy, yet difficult to practice.” Later, he underscores this rather paradoxical point by saying, “In being authentic, there is no [intentional] acting.” This seems decidedly Daoist (Zhou actually uses the term wuwei here), but Zhou’s meaning can only be understood through it. For Zhou, authenticity expresses human nature as it truly is; to be authentic is to manifest one’s Heavenly endowment. Speaking metaphorically, to be authentic is to remain still in one’s nature while acting in the world. Authenticity is, thus, both ontological and ethical; it is a manifestation of our fundamental being, while also serving as the root of moral activity.

For Zhou, being authentic is intimately tied to self-cultivation, a central concern of Song Confucianism that forms the heart of Neo-Confucian spirituality. In some sense, authenticity is a “given,” as it is rooted in our nature, yet we must work to develop it, just as with any innate ability. Zhou stresses the importance of such ethical/ontological striving throughout the Tongshu. Moreover, Zhou states that it is possible to be inauthentic (bu cheng) when in chapter 2 of the Tongshu he speaks of the Five Constant [Virtues] and the “hundred practices” of moral behavior as being “wrong” or “blocked by depravity and confusion.” Presumably, such cases arise when one is gripped by selfishness and egotism.

One of the most intriguing and controversial points that Zhou makes about striving for authenticity is that being authentic, a way of returning to one’s true human nature, is also the way for a person to “become One” (yi). Moreover, Zhou also says that to be in such a state is “to have no desire.” Zhou strikes a decidedly mystical tone here, with a slight ascetic edge that resonates strongly with Buddhism and Daoism. Contra Max Weber, the sociologist of religion who famously distinguished between ascetic and mystical forms of religion, Zhou suggests a spirituality that straddles this dichotomy. Certainly when read in context, Zhou actually seems to mean a state of clear, yet active engagement with one’s situation. Zhu Xi and later commentators, perhaps at pains to distance Zhou from accusations of Buddhist and Daoist influence, explain that Zhou means that one should attain an unbiased, undistracted state rather than renounce the world.

d. Inseparability of Ethical Life from the Workings of the Cosmos

As we have seen in his understanding of authenticity, Zhou also proclaims the integral relationship of cosmology and ethics. While this is a central theme in the “Explanation” and the Tongshu, one of the best hints of this point comes in “On the Love of the Lotus,” where he refers to the lotus blossom as the “gentleman (junzi) among flowers.” The junzi, the “noble person,” is the highest ethical ideal in early Confucianism and, essentially, the equivalent of the Sage in Neo-Confucian tradition. What’s more, not only is Zhou speaking of a natural phenomenon — a blossoming lotus flower — in moral terms here, he is also underscoring the deeply aesthetic dimension involved. Like the beautiful lotus, so the junzi marks the full flowering of human life.

The intertwining of the ethical and cosmological in Zhou’s thought shows, above all, in his practical focus. Throughout the “Explanation” and the Tongshu, Zhou speaks of our sagely dimension in dynamic, active terms. Be it in his admonitions regarding continual striving, his reminders of the importance of ordering society, or his cautious approach to acting in the world, Zhou maintains that the moral life reflects the cosmic order; sagely behavior is in tune with the creative guidance of Heaven and the nurturing vitality of Earth.

In his work, Zhou freely mixes metaphysical and ethical language, switching from one to the other effortlessly, like a sage acting in accordance with the cosmos by establishing a good society following Confucian moral teachings. Thus as he notes in the “Explanation,” “Only humans receive the finest and most spiritually efficacious [qi]. Once formed, they are born; when spirit (shen) is manifested, they have intelligence; when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue. The sage settles these [affairs] with centrality (zhong) and correctness (zheng), humanity (ren) and rightness (yi). . ..”

One final point that has some bearing on the inseparability of ethics from the working of the cosmos in Zhou’s work is how it may anticipate some of the views of Wang Yangming (Wang Shouren, 1472-1529), specifically the inseparability of “innate (moral) knowledge” (liangzhi) from action. In Zhou’s perspective, a sage is rooted in authenticity; as he says in the Tongshu, “being a sage is nothing more than being authentic.” Moreover, he later states, “Being perfectly authentic, one acts.” In other words, to be a sage is to act in an authentic (sagely) way. In a similar vein, Wang explains to his student Xu Ai in Instructions for Practical Living (Chuanxilu), “There have never been people who know but do not act. Those who are supposed to know but do not act simply do not know yet.” It seems that both Zhou and Wang would agree with Socrates’ famous dictum that “to know the good is to do the good.”

e. Sageliness as Ideal for Daily Life

The concept of sageliness as an ideal to be actualized in daily life is implicit in the previous point regarding Zhou and Wang Yangming. Even a cursory reading of the “Explanation” and the Tongshu reveals Zhou’s concern for putting sagely ideals into practice. As Zhou says, “To be active and correct is called the Way.” In the introduction to A Short History of Chinese Philosophy, Feng Youlan quotes one of his colleagues as saying, “Chinese philosophers were all of them different grades of Socrates. . . With him, philosophy was hardly ever merely a pattern of ideas exhibited for human understanding, but was a system of precepts internal to the conduct of the philosopher.” (A Short History of Chinese Philosophy, 10). This passage reads as if it were written specifically about Zhou himself. Clearly for Zhou, the true goal should be to realize sageliness, that is, to discover it and make it concretely real here and now.

Zhou makes clear that the sage as ideal must be engaged with society and the larger world. Not only does the sage “settle these [affairs],” according to the “Explanation,” but Zhou gives extensive guidance for sagely action in the world throughout the Tongshu. Perhaps his most succinct discussion comes in chapter 6: “The Way of the sages is nothing more than humanity and rightness, centrality and correctness. Preserve it and it will be valuable. Practice it and it will be beneficial. Enlarge it and it will match Heaven-and-earth.” The sage is actively involved with things, guided by morality rooted in the cosmos. We should remember, though, that this ideal is also profoundly spiritual, suggesting an “inner worldly mysticism” that embraces all of life.

Understandably, Zhou’s concern for sageliness manifests in the various models he upholds for our emulation. The most obvious example is Confucius, whom Zhou often quotes and to whom Zhou explicitly devotes two chapters (38 and 39) of the Tongshu. Zhou also holds up Confucius’ disciple Zilu and the legendary Fuxi, who is credited with writing the hexagrams of the Yijing. However, Zhou reserves special reverence for Yan Hui, that most spiritual of Confucius’ disciples. Thus when discussing the comprehensive nature of the sage, Zhou writes, “Master Yan was the one who brought out the Sage’s comprehensiveness and taught ten thousand generations without limit. Was he not equally profound?” Interestingly, Zhou himself plays a similar role for later Neo-Confucians, who held him up as a model of authenticity.

4. Principal Concerns

As is the case with all significant thinkers, Zhou Dunyi’s work provides a wealth of material for further analysis. Some of the concerns that Zhou deals with are of universal philosophical interest while others are rather unique to Chinese, or even more specifically, Confucian, thought.

a. Lineage

In traditional Chinese culture, wherein family relations lie at the center of social life and identity, lineage is paramount. This is true not just socially and politically, but in scholarly circles as well; after all, most “schools” of Chinese thought are called jia (“family”). Indeed, it is a cliché to say that Chinese society is envisioned as a large family with the emperor (“Son of Heaven”) as its father. To be true to one’s jia is crucial; to deviate from its ways or to step outside its bounds is to bring shame upon the larger family, including the ancestors, and risk severe punishment, even ostracism. To have a disreputable lineage or one that is haphazard or unknown is highly suspect in polite circles. For scholars, lineal connection to earlier thinkers is a necessity, since that helps certify that one has truly received Dao. The Way, if it is to continue, must be transmitted to succeeding generations. The fact, then, that Zhou’s teachings have a questionable lineage was a major concern in later Confucian circles. In the preface to his “Conversations of Master Zhu, Arranged Topically” (Zhuzi yulei) 94:3153, Zhu Xi gets to the heart of the matter when he says, “No one knows where his (Zhou Dunyi’s) teaching tradition came from.”

Most contemporary scholars agree that Zhou’s inclusion in the “orthodox” lineage of Song Neo-Confucianism is due to the efforts of Zhu Xi in the late 12th century. Almost from the start, Zhu faced conflict from various sources, notably the Lu brothers, Lu Jishuo (1120s-1190s) and Lu Jiyuan (1139-1193), two literati who argued that Zhou is far too Daoist to be considered a recipient of the Confucian Way. In addition, there are historical issues with Zhou’s alleged connection to the Cheng brothers, among them the fact that Cheng Yi declares that his older brother Cheng Hao personally rediscovered the Way via his study of the Classics. What’s more, neither of the Chengs refers to Zhou in terms typically reserved for teachers; instead, they call him by his personal names. Additionally, none of the Chengs’ disciples even mention Zhou in their writings. Altogether, these points call into question Zhou’s place in the direct line of Confucian transmission.

Joseph Adler and others investigated the historical and biographical records and discovered that during the latter part of Zhu Xi’s life there was a concerted effort on the part of Zhu and Hunan scholar-officials to elevate Zhou to sagely status despite prevailing opinion at the time – an endeavor that culminated during the reign of Emperor Lizong (1225-1264). For his part, Zhu Xi sidesteps the tenuous historical connection by attributing the source of Zhou’s sagely mind to a transcendent source. Thus, as Zhu writes in a record of his personal pilgrimage to the place of Zhou’s study:

“As for Master [Zhou] Lianxi, if he did not receive the propagation of this Dao by Heaven, how did he continue it so easily after such a long interruption, and bring it to light so abruptly after such extreme darkness? . . The Five Planets were in conjunction in Kui [a phase of lunar activity used to structure the ancient Chinese calendar], marking a turning point in culture. Only then did the heterogeneous qi homogenize and the divided [qi] coalesce; a clear and bright endowment was received in its entirety by one man, and the Master [Zhou Dunyi] appeared. Without following a teacher, he silently registered the substance of the Way, constructed the Diagram and attached a text to it, to give an ultimate foundation to the essentials. . . Ah! Such grandeur! Were it not for what Heaven conferred [on Zhou], how could we participate in this?”

Such appeals to Divine Authority, however, raise philosophical problems too numerous to discuss here.

b. Daoist and Buddhist Influences

Daoist and Buddhist influences on Zhou’s thought also warrant serious attention, particularly in light of the controversies surrounding Zhou’s lineage and Zhu Xi’s rather strained efforts to rope him into the Confucian camp. One common, albeit simplistic, view of Neo-Confucianism is that it began in the Southern Song (1127-1279) in response to widespread political, social, and cultural dislocation after the collapse of the Northern Song (960-1126). With the loss of Chinese territories, especially the Yellow River Valley, the traditional Chinese “heartland,” to non-Han invaders, various scholar-officials sought to re-claim a distinctly Chinese identity linked to Confucianism. As part of their efforts, they reformed the civil service system by purging it of Daoist and Buddhist elements. In doing so, they also diminished the political and institutional power of both rival “Ways” and articulated a philosophically robust Confucian philosophy that could hold its own against Buddhist and Daoist wisdom. Ironically, most contemporary scholars agree that Neo-Confucianism owes a great deal to Daoist and Buddhist ideas and practices.

Without doubt, Zhou’s connections to Daoism are deep. The diagram that Zhou uses in his “Explanation,” for instance, strongly resembles several others used by Daoists, such as the Wujitu (“Wuji Diagram”), which is included in the Daoist Canon (Daozang) and the Xiantian taiji tu (“Taiji Diagram which predates Heaven”). While there is some debate about the details, the prevalent view is that Zhou received his diagram from Mu Xiu (979-1032), a minor official who himself received it from Chong Fang (956-1015), a former official turned recluse. Chong Fang, in turn, received the diagram from Chen Tuan (d. 989), a famous Daoist master. Several key terms that Zhou uses – wuji and wuwei, for example – also have Daoist associations, and Zhou’s priority on “stillness” over “activity” also has a strongly Daoist overtone.

Zhou’s work also shows the marked influence of Buddhism. For instance, Cheng Yi refers to Zhou as a “poor Chan [Zen] fellow,” and records indicate that Zhou counted several Buddhists among his friends and teachers, notably Shou Ya, a master at the Helin Temple in Jiangsu province. It is possible that Zhou was even a Buddhist layman (upasaka) for a time. Some scholars have suggested connections to the work of Guifeng Zongmi (781-841), a patriarch in both the Chan and Huayan schools of Chinese Buddhism. Zhou’s discussion of the sage as having “no desire” and being “impartial” also resonates with the Buddhist virtue of upeksha (“equanimity”) and the ideal of mahakaruna (“great compassion”).

All told, it is impossible to deny influences, direct and indirect, of Daoism and Buddhism on Zhou Dunyi. The various issues surrounding such influences on Zhou may not matter much, however, to students of global philosophy. In fact, they may only be problematic for those who share a more traditional Confucian concern for purity of lineage, or for scholars who approach the study of Chinese (and, indeed, all of East Asian) philosophy and religion with more Western assumptions of exclusivity. This is not to deny the historical difficulties in pinning down Zhou’s religious and philosophical pedigree or the problems it caused later Confucian thinkers, but only to note that such concerns in no way detract from his philosophical and spiritual insights.

c. Criticism of Other Thinkers

Zhou’s oracular style (characterized by pronouncements), the fact that his writings consist mainly of commentary on the Classics, and his overall religious tone give the impression that he is not a “philosopher” in the modern academic sense. He is not, in other words, a thinker who critically engages with other thinkers, using logical arguments to disprove certain truth claims while establishing other ones; however, when we read carefully, we can see a number of implicit criticisms of rival thinkers.

One example is in the Tongshu, chapter 16, where in distinguishing “things” (having physical form) and “spirit” (shen) he observes, “Things, then, are not penetrating. Spirit renders the myriad things subtle.” This seems to be a counter assertion to the Huayan Buddhist doctrine of shi shi wu ai (“unobstruction of all phenomena,” that is, the interpenetration of all things). Shi shi wu ai, according to Neo-Confucians, effectively denies the reality of the actual world. In the next chapter, which is devoted to music and ritual, Zhou laments the present state of society: “Later generations have neglected ritual. Their governmental measures and laws have been in disorder. Rulers have indulged their material desires without restraint, and consequently the people below them have suffered bitterly.” This reads like standard Confucian boilerplate but its critical edge is unmistakable. In chapter 24, Zhou states, “The most revered thing in the world is the Way; the most honored is virtue; the most rare [difficult to attain] is the human being.” While the echoes of Laozi are unmistakable in Zhou’s praise for Dao and de, the fact that he immediately goes on to praise human beings as having a special status strikes a decidedly Confucian tone. Moreover, there are other examples of Zhou’s critical stance in the Tongshu. For instance, in chapters 28 and 34 Zhou criticizes superficial scholars who, he says, are concerned with elegant literary style rather than striving for sageliness, a common Confucian theme.

These passages remind us that Zhou’s work did not emerge in an intellectual vacuum. He worked from a perspective deeply informed by certain basic ideas and assumptions that arose within a highly complex and contested philosophical milieu. Thus, as we can see, Zhou Dunyi takes a strongly critical stance in much of his writings. Moreover, he offers insightful, albeit oblique, observations that shed light not only on his own context, but that also address ethical, political, and metaphysical issues that crop up in other cultural contexts – one of the hallmarks of any great thinker.

d. Quietism

From a global philosophical perspective, Zhou seems to espouse a form of quietism, in that he emphasizes a more interior, contemplative approach to life rather than acting boldly to shape events through force of will. Although “quietism,” strictly speaking, refers to a Christian theological position that held sway during the 17th century before being declared heretical by the Vatican, the centrality of attaining a detached, serene state of mind within Zhou’s writings strongly resonates with quietist doctrines. Such accusations of quietism are related to criticisms about the seemingly undue influence of Daoism and Buddhism on Zhou as well.

The charge of quietism is understandable in light of Zhou’s view of the relationship between stillness and activity. Stillness and activity co-entail each other, and, in fact, are just another way for Zhou to explain the interaction of yin and yang. Furthermore, Zhou does give priority to stillness as well – something several later Neo-Confucians express concerns about. The distinctly religious dimensions of Zhou’s work also make it easy for critics to dismiss him, especially in light of common stereotypes about mysticism as an excuse to withdraw into a timidly pious passive acceptance of things “just as they are.”

Nonetheless, arguments that Zhou espouses a passive quietism are, at best, straw men. Whatever his mystical inclinations, Zhou seems firmly focused on practical affairs. He draws heavily on Confucian directives on how to live a good life, and, in the Tongshu, explicitly attends to stereotypically “Confucian” concerns about education, ritual, and the proper governing of society, including the necessity for punishing wrongdoers. Even more to the point, Zhou provides clear instruction about activity, saying that one should pay attention and take great care when wielding power. As a thinker imbued with a sense of the Classical Chinese cultural heritage, Zhou repeatedly seeks guidance for engaging with life in authoritative sources, most especially the Yijing but also other Confucian texts such as the Analects, thereby anticipating Zhu Xi’s later comment that studying the classics is like meeting the sages face-to-face. Furthermore, as we have seen, Zhou holds up examples from Confucian history as models for our own behavior. While there are aspects of quietism in much of Zhou’s work, overall he does not advocate passive withdrawal, but a wise and attentive way of participating in the world without recklessly forcing it to conform to our selfish desires.

e. The Problem of Evil

Explaining evil, destruction, pain, cruelty, and so forth, has been a perennial problem for philosophers throughout history. Numerous solutions have been proposed over the centuries, ranging from the Christian doctrine of “original sin,” to the Buddhist and Hindu teaching that we are bemired in samsara (literally “wandering through,” the beginningless cycle of birth-and-death) due to fundamental ignorance underlying our incessant cravings and selfishness. For Chinese thinkers in general, evil is due to departure from Dao, which results in disharmony within the individual, society, and the world. Confucians are divided on some of the particulars here. Mencius, for example, holds that humans are innately good while Xunzi maintains that people are essentially animalistic. Both agree, however, that human beings can improve through the influence of a proper education and virtuous government.

Zhou by and large assumes a Mencian view of innate goodness, but he never spells it out explicitly. In the “Explanation” he states, “Only humans receive the finest and most spiritually efficacious [qi].” This seems to be an allusion to Mencius’ remark about nourishing his “vast, flowing qi” as a crucial component to moral and spiritual cultivation (Mencius 2A2), and certainly this is how Zhu Xi interprets Zhou. This view of human nature is also confirmed by passages in the Tongshu such as chapter 20, where Zhou affirms that sagehood can be learned by adhering to “the essentials” (being unified, without desire, clear, impartial, and so forth), most of which are associated with the exercise of moral virtue rooted in our Heavenly endowment.

Still, while Zhou clearly speaks of the fundamental goodness of humanity, he barely touches on evil itself. Of human beings, he says in the “Explanation,” “when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue.” Zhou repeats the same idea in the Tongshu, adding only that “In incipience there is good and evil.” The idea seems to be that good and evil, properly understood, only arise with the start of actual human activity. Does Zhou mean that one’s inherently good human nature, when coming into contact with external things, can give rise to actual good or evil affairs? It is unclear, but Zhou’s statements definitely provoked many later commentators. It is really only after Zhang Zai’s explanation of the role of qi that Neo-Confucians had a way to reconcile the Mencian view of fundamental goodness with the undeniable existence of evil in the world.

5. Legacy

Zhou Dunyi was a major influence on the development of Neo-Confucian metaphysics while the spiritual dimensions of his work continue to resonate with various thinkers. Wing-tsit Chan declares that the most accurate estimation of his work can be found in the comments of the later scholar Huang Bojia (1695), a passage that deserves to be quoted in full:

“Since the time of Confucius and Mencius, Han (206 B.C.E.-220 C.E.) Confucianists merely had textual studies of the Classics. The subtle doctrines of the Way and the nature of man and things have disappeared for a long time. Master Zhou rose like a giant. . . . Although other Neo-Confucianists had opened the way, it was Master Zhou who brought light to the exposition of the subtlety and refinement of the mind, the nature, and moral principles.” (quoted in Chan, A Source Book in Chinese Philosophy, 461; pinyin romanization substituted for Wade-Giles in original).

A. C. Graham, however, argues in his landmark Two Chinese Philosophers: The Metaphysics of the Brothers Ch’eng that Zhou had little direct influence on these seminal thinkers. Certainly, in light of the evidence of Zhu Xi’s creative work in establishing the orthodox “transmission of the Way” (Daotong), we should not consider Zhou to be the historical “founder” of Neo-Confucianism.

Still, while any direct connection between Zhou and later Neo-Confucians is tenuous, his inspirational role cannot be doubted. One famous story, attributed to Cheng Hao in Reflections on Things at Hand, says that Zhou refused to cut the grass growing outside his window, saying, “[The feeling of the grass] and mine are the same.” While this tale seems the stuff of hagiography, it does give us a sense of the reverence for Zhou within Confucianism. Indeed, as an affirmation of the fundamental continuity of all life, this story is a poignant example of what living out Zhou’s metaphysical vision might be like. Such stories have helped cement the image of Zhou as a “latter day Sage,” an image that fits well with the specific models of Sageliness he holds up (Yan Hui and Confucius, to name two). In this regard, it is noteworthy that in chapter 14 of Reflections on Things at Hand, entitled “On the Dispositions of Sages and Worthies,” Zhu Xi says of Zhou that “[his] mind was free, pure, and unobstructed, like a breeze on a sunny day and the clear moon.” Elsewhere, Zhu says that Zhou’s mind was “harmonious with the ‘Supreme Polarity’,” and that he “had the joy of Confucius and Yanzi.”

Joseph Adler argues that Zhou’s importance lies in the fact that his work provided a basis for Zhu Xi’s own religious practice. Specifically, Zhou’s teaching on the interrelationship of “stillness” and “activity” enabled Zhu to ground his methods of self-cultivation in the words of an earlier figure revered for his own spiritual example. Regardless, Zhou Dunyi is a profound thinker whose poetic words still provide philosophical and religious guidance.

6. References and Further Reading

Adler, Joseph A. Reconstructing the Confucian Dao: Zhu Xi’s Appropriation of Zhou Dunyi. Albany: State University of New York Press, 2014.
- The single best scholarly discussion of Zhou’s thought and his place within Neo-Confucianism currently available. In addition to his insightful analysis of the “Explanation” and the Tongshu, Adler argues that Zhou’s work provided the solution to Zhu Xi’s personal spiritual crisis by providing a cosmological and metaphysical underpinning for Zhu’s own religious practice. Includes clear annotated translations of the “Explanation” and the Tongshu along with Zhu Xi’s commentaries, prefaces, and postscripts, as well as passages from the writings (commentaries, prefaces and so forth) on Zhou’s work from other Neo-Confucian thinkers.
Adler, Joseph A. “Response and Responsibility: Chou Tun-I and Neo-Confucian Resources for Environmental Ethics.” In Confucianism and Ecology: The Interrelation of Heaven, Earth, and Humans, edited by Mary Evelyn Tucker and John Berthrong, 123-49. Cambridge, MA: Harvard University Center for the Study of World Religions, 1998.
- Excellent discussion of Zhou’s thought highlighting the ecological dimensions of his ethical/spiritual scheme.
Adler, Joseph A. “Zhou Dunyi: The Metaphysics and Practice of Sagehood.” In Sources of Chinese Tradition, 2nd ed., vol. 1, edited by Wm. Theodore de Bary and Irene Bloom, 669-78. New York: Columbia University Press, 1999.
- Good annotated English translations of Zhou’s “Explanation” in its entirety (including the Diagram itself) along with selections from the Tongshu (chapters 1, 3, 4, 16, and 20). Includes a useful introductory discussion of Zhou’s life and work.
Chan, Wing-tsit, ed. A Source Book in Chinese Philosophy. Princeton: Princeton University Press, 1963.
- A must-read for anyone interested in Chinese thought. Chan’s own perspective is heavily colored by Neo-Confucianism (particularly the Cheng-Zhu line). Chapter 28 is devoted entirely to Zhou, and includes not only biographical information and philosophical analysis, but annotated English translations of both the “Explanation” and the Tongshu in their entirety.
Chan, Wing-tsit, trans. Reflections on Things at Hand: The Neo-Confucian Anthology Compiled by Chu Hsi and Lu Tsu-Ch’ien. New York: Columbia University Press, 1967.
- Masterful philosophical translation of the primary text of Neo-Confucian thought. Heavily annotated with a 27-page introduction that includes biographical information about Zhou and the other three “founders” of the Cheng-Zhu line. Also includes a 25-page glossary of key Chinese terms (Wade-Giles Romanization and traditional characters) and a short (11-page) essay entitled “On Translating Certain Chinese Philosophical Terms.” Not only does this anthology open with Chan’s translation of the “Explanation,” the index makes it easy to locate all 12 of the passages from Zhou’s writings that Master Zhu included.
Fung Yu-lan [Feng Youlan]. A History of Chinese Philosophy. Volume II: The Period of Classical Learning (from the Second Century B.C. to the Twentieth Century A.D.). Translated by Derk Bodde. Princeton: Princeton University Press, 1953.
- Rather dated but masterful overview of the history of Chinese thought. Like Chan’s Source Book, a must read for students of Asian philosophy. Section 1 of Chapter XI focuses on Zhou’s thought. The condensed A Short History of Chinese Philosophy (a single volume distillation of Fung’s larger two-volume work) is also informative.
Tu Wei-Ming and Mary Evelyn Tucker, eds. Confucian Spirituality. Volume Two. New York: The Crossroad Publishing Company, 2004.
- Part of the “World Spirituality” series, this collection of nearly 20 essays examines Confucian religious thought and practice from the Song era down to the present, covering the spread of Neo-Confucianism to Korea, Japan, and Vietnam and its development into a truly global tradition. Although Zhou Dunyi is not the focus of any specific essay, discussion of his thought and influence figures prominently in several pieces in the first part of the volume.
Tu Wei-Ming. Confucian Thought: Selfhood as Creative Transformation. Albany: State University of New York Press, 1985.
- Classic discussion of the spiritual dimensions of Confucian tradition (particularly the more Mencian Neo-Confucian dimensions) by its foremost proponent. While not explicitly devoted to Zhou, Tu’s discussion illuminates themes that run throughout the Song master’s work.
Wang, Robin. “Zhou Dunyi’s Diagram of the Supreme Ultimate Explained: A Construction of the Confucian Metaphysics.” Journal of the History of Ideas 66/3 (July 2005): 307-323.
- Highlights ways that Zhou’s thought traces a notion of gender complementarity in his depiction of human beings as arising from and embodying the original and sustaining energies of the cosmos (yin and yang). Human persons are its highest exemplification and as such are a prime phenomenon of this dynamic cosmic creation.
Zhou Dunyi. Zhou Dunyi ji (Collected Works of Zhou Dunyi). Edited by Chen Keming. Beijing: Zhonghua Shuju, 1990.
- Good contemporary Chinese edition of Zhou’s primary works.

Author Information

John Thompson
Email: john.thompson@cnu.edu
Christopher Newport University
U. S. A.

Charles Hartshorne: Dipolar Theism

From the beginning to the end of his career Charles Hartshorne maintained that the idea that “God is love” was his guiding intuition in philosophy. This “intuition” presupposes both that there is a divine reality and that that reality answers to some positive description of being a loving God. This article focuses on the latter issue, namely, Hartshorne’s concept of God. Hartshorne’s views on the former issue are treated separately in another article, “Charles Hartshorne: Theistic and Anti-Theistic Arguments.” Hartshorne vigorously defended both propositions by clarifying what he meant by the phrase, “God is love,” by defending his views against a variety of objections, and generally by arguing that his version of theism (called “dipolar” or “neoclassical” theism) survives critical scrutiny better than its philosophical competitors.

Heavily influenced by Alfred North Whitehead, Hartshorne borrowed some of Whitehead’s technical vocabulary and he often promoted broadly Whiteheadian ideas. It is a mistake, however, to style him as Whitehead’s disciple for he departed from the older philosopher on a number of points, most notably (where this article is concerned), on questions surrounding the concept and the existence of God. In what follows, Hartshorne’s ideas about the concept of God are examined. It is important, however, to appreciate that the formulation of a coherent theism is an integral part of the rational defense of theism. Hartshorne spent much of his career in a philosophical atmosphere in which the question was not so much “Does God exist?” as it was “Does ‘God’ name a coherent idea?” Philosophers from very diverse schools of thought—from Sartre to the Logical Positivists—rejected theism on the basis of alleged inconsistencies in the very idea of deity. Hartshorne himself remarked that there would be fewer atheists if theists had done a better job of making sense of the concept of God. Hartshorne’s response to this situation was to develop his dipolar or neoclassical concept of God. It can plausibly be claimed that Hartshorne accomplished at least two tasks: first, he introduced a sophisticated and religiously important form of theism heretofore unheard of or at least very poorly developed through philosophical argument and, second, he shifted the burden of proof onto those who claim that the concept of God is hopelessly muddled.

Divine Love and Divine Relativity
Existence and Actuality
Divine Perfection
Divine Power
Divine Knowledge
Panentheism
Conclusion
Suggestions for Further Reading

1. Divine Love and Divine Relativity

The only deity worthy of worship, Hartshorne believed, is one that could be described as “Love divine, all loves excelling,” as in the title of Charles Wesley’s hymn. Hartshorne did not identify himself as Christian nor did he consider himself a theologian. He argued, however, that Christian thinkers had an unfortunate tendency to allow what he considered to be warped ideas about absolute power and unchanging perfection to eclipse the central teaching of their faith concerning divine love. The parables of Jesus and the personal qualities he exhibits in the Gospels reflect, for the Christian, the image of a loving God. They portray one who not only acts for the benefit of the beloved but also sympathizes with others in such a way as to rejoice in their well-being and feel sorrow in their tragedies. These are the qualities of love that Hartshorne takes to be essential to it; at a bare minimum, love requires both the capacity to act for the welfare of others and to sympathize with their feelings. As the etymology of “compassion” suggests, it is “to suffer with” another in the desire to ameliorate the other’s suffering. If this sort of love is to be attributed to the divine being, then it must not only be possible for God to act for the welfare of the creatures but also to be affected by their weal and woe. In short, divine love entails the divine relativity: a social conception of God—the title of Hartshorne’s fourth book, published in 1948, now considered a classic in the philosophy of religion.

Divine relativity is precisely what much of traditional theology would not allow. As Aquinas said in Summa Theologica, the creatures are really related to God but God has only a rational (that is, an imagined) relation to the creatures (ST I, Q 13, a. 7). In short, God is impassible or unaffected by anything external. The only doctrine of divine love consistent with the doctrine of impassibility is one in which God promotes the welfare of the creatures, but is unaffected by what happens to them. On this view, divine love, unlike human forms of love, involves neither sympathy nor empathy. John Sanders demonstrates in The God Who Risks that Christian thinkers, from as early as Justin Martyr, realized that there is a tension between the belief in the goodness of God and the denial that God somehow shares in the joys and sorrows of the creatures. Anselm raised the question explicitly in chapter 8 of Proslogion: How can God be all-loving without any sympathetic responsiveness? Anselm answered by promoting a kind of theological behaviorism: we feel the effects of God’s goodness, but God feels nothing. On Hartshorne’s view, this doesn’t answer the question; it only reasserts divine impassibility.

Hartshorne affirms God’s love as involving both benevolence and feeling. Because God loves the creatures, what happens to them is felt also by God. As a loving parent suffers for a child who is ill or who has lost her way in life, so the God in whom Hartshorne believes suffers through the misfortunes and the mischief of the creatures. Hartshorne was fond of quoting one of the final statements from Whitehead’s Process and Reality that “God is the great companion—the fellow-sufferer who understands.” Hartshorne, following both Whitehead and Berdyaev, maintained that there can be tragedy, even for God. As Martha Nussbaum argues, tragedy can happen only to someone who cares enough about others to be disappointed by them or hurt by what happens to them. God, in Hartshorne’s view, is one who cares and who can therefore be disappointed or hurt by the actions of the creatures.

Hartshorne’s basic argument for divine relativity is stated throughout his writings. If God knows contingent states of affairs (for example, a woman listening to a bird singing at a particular time and place), then there must be contingency in God. For, if the object of knowledge can be other than it is (for example, the woman not listening to the bird), then the knowledge itself could be otherwise (for example, God knowing that the woman is not listening to the bird). The argument is not that God might have failed to be omniscient, but that the particular cognitive states of God could have been different. As Hartshorne noted, Aristotle inferred from this reasoning that God does not know the world; Spinoza, on the other hand, denied the contingency of the world—despite what seems to be the case, it is impossible, at that very moment, that the woman not be listening to the bird. Hartshorne concludes that one must choose among the mutually exclusive options: a God that is ignorant of the world, a world devoid of contingency, or the neoclassical view that there is contingency in God. What is ruled out by this argument is the Thomistic view that God knows contingent states of affairs but there is no contingency in God. (For different formalizations of this argument see Shields 1983 and Viney 2007/2012.)

Hartshorne’s basic argument for divine relativity is expressed in terms of the idea of God’s exhaustive knowledge but it could equally well be rephrased in terms of inexhaustible love, for love, like knowledge, has its objects. Of course, these are not the only qualities that theists usually ascribe to God; there are also such qualities as eminent creativity, perfect power, and infinite wisdom. Hartshorne attempts to do justice to these ideas in formulating his neoclassical concept of God, but for him divine love remained paramount. This is significant for it highlights Hartshorne’s commitment to the principle that negation is parasitic upon positive attributions, that there are no merely negative facts (see “Charles Hartshorne: Neoclassical Metaphysics”). Many theologians, eager to affirm the transcendence of God, emphasize what cannot be known of God and argue that, in view of this ignorance, the most appropriate theological language is by way of negation (via negativa): God is not finite (infinite), not changeable (immutable), not affected by anything external (impassible), not contingent (necessary), not in time (non-temporal), and so forth. Hartshorne also emphasized what is not known of God and he did not deny that negations play an important role in religious discourse. In A Natural Theology for Our Time, he comments that our knowledge of the concrete divine reality is “negligibly small.” He argues, however, that as the sole or even primary approach to religious language, “the negative way” is a case of false modesty. Negative theologians are supposedly being deferential to God by stressing what cannot be known or said of God, but this masks the fact that they consider themselves privy to enough knowledge about the divine reality to know what cannot be attributed to it.

Hartshorne couples the accusation of false modesty with the charge that the negations used of deity by negative theologians almost invariably presuppose invidious contrasts: the finite is inferior to the infinite, the changeable to the unchangeable, the passible to the impassible, the temporal to the non-temporal, and so forth. Hartshorne argues that it is much too simplistic to label one side of an ultimate contrast as “better” and the other side “worse.” On the contrary, there are better and worse forms of each side of each contrast. For example, there are better and worse ways of being affected by others (passibility) and better and worse ways of being unaffected by others (impassibility): to identify too much with the suffering of others is damaging to one’s own well-being and may prevent one from helping others in need; to remain unaffected by the plight of others exhibits the character flaw of insensitivity. In Hartshorne’s view, theologians should not chase after negations if they wish to speak of one that is worthy of worship; rather, they should explore ways of attributing to God what is best in both sides of any particular contrast. For this reason, Hartshorne maintains that, to the extent that language is adequate to theological purposes, only a properly dipolar concept of deity can reflect the divine perfection: God is both finite and infinite, both passible and impassible, and so forth, but in different respects and in eminent ways.

In Analytic Theism, Hartshorne, and the Concept of God, Daniel Dombrowski notes that Hartshorne sought a theory of religious language that avoids two extremes: (1) language is wholly inadequate to describe God and (2) verbal formulae may capture God without doubt or obscurity. Hartshorne considered the formal abstractions of metaphysics to be the most nearly univocal language that is possible for deity, for they do not admit of degrees. For example, on Hartshorne’s view, God is, in different respects, necessary and contingent; we shall see, however, that this does not mean that God is more or less necessary or more or less contingent. Hartshorne calls the most nearly equivocal language about God “symbolic” because it presupposes particular times, places, and situations. Metaphors such as “shepherd,” “mother,” “father,” are examples. Analogical language holds a place between the abstract contraries of metaphysics and the concrete imagery of poetic imagination. Analogical language is a matter of degree, as when one says that love comes in many forms, but the eminent form of love belongs to God. In Beyond Humanism, Hartshorne claimed that psychical predicates such as memory, feeling, and volition admit of an infinite variability, extending beyond their specifically human forms to include the non-human animal world and to include what might exist in a superhuman form, such as deity. Hartshorne sometimes says that these sorts of predicates only apply literally to God and not the creatures. As Dombrowski avers, the most parsimonious interpretation of this “negative anthropology” is that Hartshorne is emphasizing that God alone has the supreme or eminent form of these qualities.

2. Existence and Actuality

To say that God exhibits both sides of a metaphysical contrast would be a logical contradiction unless there were a way of showing that the polar extremes apply to God in different respects. Søren Kierkegaard seemed to relish the paradox that “the eternal came to be in time.” Hartshorne did not mention Kierkegaard in this connection, but he apparently saw little advantage in this way of speaking. In The Divine Relativity, he complained that a theological paradox seems to be what a contradiction is when applied to God. In Hartshorne’s view, asserting contradictory things of God is not a sign of profundity but of confusion. Hartshorne’s proposal is to make a three-fold distinction of logical type, applicable to both God and the creatures, among existence (that a thing is), essence (what a thing is), and actuality (the particular state in which a thing is). To illustrate how this distinction can be applied to both God and the creatures, consider the case of a woman listening to a bird sing and of God knowing this fact. The woman exists, has the cognitive capacity to hear songbirds, which is part of her essence (insofar as audition is part of her natural endowment), and she is currently listening to a bird sing, which is her actual state. The same distinctions apply to God: God exists, has the essence of being all-knowing, and is in the actual state of knowing that the woman is listening to the bird sing.

The tripartite distinction of existence, essence, and actuality is one of logical type analogous to the logical type difference between universals and particulars. One may, for example, deduce that the woman exists if she is listening to the bird, but one may not deduce from the fact of her existence that she is listening to a bird. For this reason, Hartshorne maintains that existence (also essence) is abstract relative to actuality. Actuality is, so to speak, information rich, relative to existence (and essence). This is recognized in modern logic in the use of the existential quantifier which, by itself, gives no details about the existent object. Hartshorne’s three-fold distinction also allows one to make a distinction within God between what is necessary (could not be otherwise) and what is contingent (could be otherwise). It is conceivable that God exists necessarily and necessarily has the quality of being all-knowing, but the actual state of God’s knowing (for example, knowing that the woman is listening to a bird sing) might be contingent. Barring determinism, the woman’s listening to the bird is contingent: she might have been asleep, she might have been listening to a different bird, she might have been distracted, and so forth. If God is necessarily all-knowing, then God knows about the woman and her actual state, regardless of what it may be. Moreover, God’s actual state of knowing the woman as listening to the bird sing is as contingent as the fact that she is listening to the bird sing. The following diagram summarizes how the distinctions between the concrete and the abstract and the necessary and the contingent map onto Hartshorne’s three-fold distinction of existence, essence, and actuality as it applies to God and the creatures.

The three-fold distinction is often referred to by means of the simpler distinction between existence and actuality thereby anticipating the thesis of Hartshorne’s ontological argument that existence belongs to the nature or essence of God. One need not accept the ontological argument, however, to appreciate the importance of the distinction. David Tracy calls the distinction “Hartshorne’s Discovery” and Hartshorne himself said, “I rather hope to be remembered for this distinction.” Hartshorne notes that Aristotle anticipated the tripartite distinction of existence, essence, and actuality when he spoke of substance, essence, and accident. Hartshorne’s criticism of the Stagirite is that he considered substance as ontologically basic and thus could speak of accidental compounds. For Hartshorne, actuality is ontologically basic in the sense of being most concrete. In Philosophers Speak of God, Hartshorne writes, “It is actuality of accidents, not existence of substances that is prior” (1953, 72).

The distinction between existence and actuality is important because it allows, among other things, that there can be give-and-take relations between God and the creatures without reducing God to the status of a creature. Contrary to the ancient tradition of divine impassibility, God can be conceived as affected by the creatures. In the example, the woman listening to the bird brings it about that God knows that she is listening to the bird, although she does not bring it about that God is omniscient, for God would have been omniscient even had she never existed. In Summa Contra Gentiles, Aquinas argued that any contingency in God implies the possibility of God’s non-existence, thereby reducing God’s existence to the status of creaturely existence (SCG I, 16.2). In view of the difference between existence and actuality, the inference is invalid. God’s actual states can be contingent while God’s existence and essence remain necessary. Moreover, the essence of God must now be described not merely as necessary but as necessarily somehow actualized.

3. Divine Perfection

Hartshorne’s three-fold distinction allows one to appreciate the extent of his divergence from the dominant tradition in philosophical theology which he called “classical theism.” This article has noted that classical theists, committed to the transcendence of God, were keen on the via negativa: God was placed on one side only of the pairs of contrasts, absolute/relative, infinite/finite, immutable/mutable, impassible/passible, necessary/contingent, and eternal/temporal. Hartshorne rejects this as a “monopolar prejudice,” an expression that highlights not only the “monopolar” aspect of classical theism but also the invidious character of the contrasts—the “prejudice”—as applied to God and the creatures. Hartshorne speaks instead of God’s dual transcendence. God transcends the creatures by being the supreme instance of both sides of the contrasts. The distinction between existence and actuality permits a logically coherent doctrine of dual transcendence by distinguishing different aspects of God. For example, God is immutable with respect to existence and essence, but mutable with respect to actuality. That is to say, God’s existence and essence are always the same, but God’s actual states are constantly being added to with the creative advance of the world. Or again, God is both necessary and contingent, but in different respects. God’s existence and essence are necessary (that is, could not be otherwise) whereas God’s actuality is contingent (that is, could be otherwise). The examples of divine mutability and contingency represent God’s flexibility in being able to respond to every possible change. It should now be clear why Hartshorne was making a serious point when he quipped that he believed in twice as much transcendence as was usually found in more traditional forms of theism.

From time to time, Hartshorne has been characterized as promoting a merely finite deity such as one finds in Mill’s essay Theism. Hartshorne’s commitment to the principle of dual transcendence entails that this is mistaken. Insofar as God has actual states, God is indeed finite. Furthermore, God can be nothing other than finite in this respect. God’s actuality is the realization of concrete value in the life of God and every realization of value, whether in God or in any other being, is finite in the sense that it excludes values that could have been achieved. For example, from an early age, Mozart’s father set his son on the trajectory of being a musician. Apart from this education and training, Mozart might have lived a very different life, as a lawyer, a military leader, or a peasant farmer. Each path would have led to a certain value achievement, but each, to a greater or lesser extent, excludes the others. In some fashion, God incorporates Mozart’s achievement into the divine life; as the values Mozart did not achieve were not part of his life, no more are they part of God’s. To say that God is not finite in this sense is to risk accepting a doctrine according to which God is merely infinite—that is to say, that God excludes whatever is of worth in the enjoyment of a finite realization of value. Hartshorne long maintained that the concept of the realization of all possible values is a meaningless ideal. God must, therefore, be finite, but not merely so. Dual transcendence means, among other things, that God must be infinite in receptive capacity; whatever comes to be, comes to be for God and becomes an everlasting component in God’s memory. There must also be in God an inexhaustible or infinite capacity to appreciate the creative advance. In addition, Hartshorne allowed that God is actually infinite in the sense that there was never a time when God did not exist and that God is omniscient with respect to this past life. 
Hartshorne was quick to add that this form of infinity is not the realization of all possible values, for the actually infinite life of God could have been different in as many ways (an infinite number) as the creative advance itself could have been different.

Classical theologians adopted an ideal of perfection as unchanging, often using the argument from Plato’s Republic that change for the better or worse implies an unchanging measure of perfection. The argument is that if something changes for the better then it is not yet perfect, but if it changes for the worse then it is no longer perfect. In either case, change implies imperfection. God, being perfect, must be devoid of change. This argument, however, begs the question against a dipolar conception of God like Hartshorne’s by assuming that there cannot be perfect forms of change. Hartshorne argues, on the contrary, that some forms of value—aesthetic qualities in particular—do not admit of a maximum. Just as it is impossible to speak of a greatest possible positive integer, so it may be impossible to speak of a greatest possible beauty. The fact that Mozart’s music achieved a new level of beauty does not mean that there was nothing left for Beethoven to do. Another analogy is interpersonal relationships. It is a good thing to be flexible in one’s responses to others. The ideal is not unchangeableness; it is, rather, adequate response to the needs of others. It is true that stability and reliability of character are desirable. But this means, in part, that the person can be relied upon to respond in ways appropriate to each situation, and responsiveness is a kind of change. The analogy is particularly appropriate in the divine case since there are always new creatures to which God must respond and hence there is no upper limit to the values associated with these relationships, for each is as unique as the individuals with whom God is related.

As Hartshorne distinguished existence and actuality, so he distinguished different ways in which God is perfect. Taking a clue from the work of Gustav Fechner, Hartshorne noticed an ambiguity in the concept of perfection. If one is perfect, then one is unsurpassable, but by what or by whom is one unsurpassable? The obvious answer is “by others.” This leaves open the possibility that one may surpass oneself. Thus, there is a distinction between (a) being unsurpassable by all others including self and (b) being unsurpassable by all others excluding self. In Man’s Vision of God and the Logic of Theism, Hartshorne labels these two ideas respectively A-perfection (for absolute perfection) and R-perfection (for relative perfection). God is A-perfect with respect to existence and essence and R-perfect with respect to actuality. Hartshorne agrees with more traditional theists who spoke of God as infinite, immutable, impassible, necessary, and eternal, for this is God’s A-perfection. Hartshorne quickly adds, however, that God is not in all respects infinite, immutable, impassible, necessary, and eternal. To use our previous example, if aesthetic values exhibit an unlimited possibility of increase, then God’s appreciation of beauty may, indeed must, exhibit this possibility. Again, Beethoven’s music introduces new forms of beauty that did not exist prior to his creative life. Hartshorne would also say that God, in enjoying the changing beauty of the world, is also the supremely beautiful object of contemplation, a point that is returned to in the discussion of panentheism. Hartshorne summarized these ideas about divine perfection in The Divine Relativity when he spoke of God as “the self-surpassing surpasser of all.”
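
The two senses of perfection distinguished above can be stated compactly. The following is a rough sketch under assumed notation, not Hartshorne's own symbolism: S(y, x) is a hypothetical relation read as "y, in some possible state, surpasses x in the respect under discussion."

```latex
% Assumed notation (introduced here for illustration only):
% S(y, x) : y, in some possible state, surpasses x in the relevant respect
\begin{align*}
\text{A-perfect}(x) &\iff \forall y\; \neg S(y, x)\\
\text{R-perfect}(x) &\iff \forall y\; \bigl(y \neq x \rightarrow \neg S(y, x)\bigr)
\end{align*}
```

So stated, A-perfection entails R-perfection but not conversely: R-perfection leaves open S(x, x), that is, x in a later state surpassing x in an earlier one, which is the possibility the phrase "self-surpassing surpasser of all" exploits.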

4. Divine Power

Theologians have often commented on how difficult it is to define “omnipotence.” Most of those who have thought about this, Hartshorne included, conclude that René Descartes was wrong, in his letter to Mersenne (May 27, 1630), to suppose that God could bring about logically inconsistent states of affairs. Aquinas, for example, in Summa Contra Gentiles denied that God could draw a circle with unequal radii, for this involves a logical inconsistency: one must fix the angle of the compass in order to guarantee that the arc becomes a circle, but one must at the same time not fix the angle, allowing it to become wider or smaller, in order to make the radii unequal (SCG II, 25.14). Aquinas also denied that God could change the past once it has occurred. In Summa Theologica, Aquinas says that not even God can restore virginity to someone who has lost it (ST I, Q 25, a. 4, reply to Obj. 3). Finally, Aquinas denied that God can do what is contrary to God’s nature, such as doing an unloving deed (ST I, Q 25, a. 3, reply to Obj. 2). On each of these points, Hartshorne agrees.

Beyond these agreements, Hartshorne attributes both more power and less power to God than did the Angelic Doctor. For Aquinas, God can act but not be acted upon by anything external—this is the doctrine of impassibility. As seen, Hartshorne argues that God has the power to be acted upon by the creatures and to respond to them. In this sense, Hartshorne attributes more power to God than does Aquinas. On the other hand, Aquinas apparently believed that God can unilaterally bring about some states of affairs in which more than one agent makes decisions. For Aquinas, God is called omnipotent because everything that does not imply a contradiction in terms is within God’s power to accomplish (ST I, Q 25, a. 3). Hartshorne rejects this claim and holds instead that any state of affairs in which more than one agent makes decisions cannot be conceived as the product of one agent, even if that agent is God. Suppose Ruth loves Naomi and Naomi loves Ruth—their mutual love can be explained only by referring to the activity of two persons, Ruth and Naomi. The logic of the situation does not change if one of the agents is God. The state of affairs described by God loving Ruth and Ruth loving God can only be explained by the activity of both God and Ruth, and not by God alone. Of course, if God is all-loving, then it is impossible that Ruth (an actual person) not be loved by God; but this does not change the fact that two agents—God and Ruth—are required to create the situation of their mutual love. If this is correct, then it is false that God, acting alone, can bring about any state of affairs in which more than one agent is making decisions. A corollary is that it is false that God can bring about any state of affairs the description of which is logically consistent—for there is nothing logically inconsistent about two individuals loving each other.

Classical theists, Aquinas in particular, are not without responses to Hartshorne’s reasoning. Aquinas made two claims relevant to Hartshorne’s argument. First, he maintained that the self-same result could be wholly attributed to two different causes; perhaps Ruth’s loving God can be wholly attributed to Ruth and wholly attributed to God. In Summa Contra Gentiles, Aquinas’ example is that the music of a flute is wholly attributable to the instrument and to the musician (SCG III, Pt. 1, 70.8). Of course, the music is manifestly not attributable to either the instrument or the musician singly; both are required, which supports Hartshorne’s claim. It is relevant to note that it is illicit to distribute “wholly” through a conjunction. There is no valid inference from “X is wholly the result of (A and B)” to “X is wholly the result of A and X is wholly the result of B.” The second thing that Aquinas says that might undermine Hartshorne’s argument is his claim that God has the power to bring about some events necessarily and to bring about other events contingently (ST I, Q 19, a. 8). In this way, one might make headway in making sense of the idea that God creates a person’s decision while yet preserving the contingency (an element of freedom) of the decision. Again, however, Hartshorne demurs. It makes sense to say that one can be the cause of a contingent event; every roll of the dice is proof of that. It is much less clear that it makes sense to say that one can guarantee the outcome of a contingent event. If one loads the dice in such a way that a particular number must appear (say, seven), then the outcome is not contingent; only if the dice are not loaded is the outcome truly contingent. Again, one should take note of an illicit distribution, but this time it is the problem of distributing “causes” or “guarantees” over a disjunction. There is no valid inference from “X causes (A or B or C)” to “X causes A or X causes B or X causes C.”
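
The two illicit distributions just described can be displayed side by side. This is only a schematic restatement under assumed notation (the operators W and G are hypothetical symbols introduced here, not drawn from Hartshorne or Aquinas).

```latex
% Assumed notation (for illustration only):
% W(X, S) : X is wholly the result of S
% G(X, E) : X causes (or guarantees) E
\begin{align*}
W(X,\; A \wedge B) \;&\not\Rightarrow\; W(X, A) \wedge W(X, B)\\
G(X,\; A \vee B \vee C) \;&\not\Rightarrow\; G(X, A) \vee G(X, B) \vee G(X, C)
\end{align*}
```

The flute example illustrates the first line: the music is wholly the result of instrument and musician together, yet wholly the result of neither alone. The dice example illustrates the second: the thrower causes some outcome or other without causing any particular one.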

Hartshorne’s most controversial departure from classical theism is his denial of creation ex nihilo. Indeed, the argument just given that some states of affairs require multiple decision makers is itself an argument against ex nihilo creation, at least in its classic form. God was said to create the universe, which includes the decisions that creatures make, in one non-temporal and unilateral act. Hartshorne’s argument entails that no universe with multiple decision makers can be created in its entirety by God alone. Aquinas notwithstanding, the making of decisions is a paradigm of creative activity, for something is brought into existence, if only the decision itself. For this reason, Hartshorne’s example of multiple decision makers is also an example of multiple creators. Hartshorne saw in Jules Lequyer’s statement that “God created me creator of myself” an anticipation of his own views on divine creativity. A hallmark of Hartshorne’s neoclassical theism is that the universe is a joint creative product of (a) the lesser creators that are the creatures, localized in space and time, and (b) the eminent creator, which is God, whose influence extends to every creature that ever has existed or ever will exist.

Hartshorne defends a metaphysical view that posits creativity as a transcendental, applicable to both God and the creatures. Creativity, in such a metaphysic, is never “from nothing” but is relational, requiring a pre-existent universe (see “Charles Hartshorne: Neoclassical Metaphysics”). It follows that there can be no such thing as God without a universe or, for that matter, a universe without God. A common objection to this view is that it portrays God as dependent upon the universe. Hartshorne considers the objection to be flawed in two ways. First, it assumes an invidious contrast between independence and dependence. As noted, Hartshorne is at pains to instruct philosophers and theologians to be wary of devaluing dependence (and, more generally, to be cautious of simplistic valuations of metaphysical contrasts). Second, the objection is subtly ambiguous. If Hartshorne is correct, then God and the universe are indeed necessary to each other. The proviso, however, is that no particular set of creatures (that is, no particular universe) is necessary to God. An analogy that Hartshorne uses in Creative Synthesis is of a mathematical set that necessarily has numbers, but the numbers that it has are not necessary. God’s actual states, being contingent, are dependent upon interaction with the creatures; God’s existence, on the other hand, is necessary, for it depends upon no particular creatures or groups of creatures. It should also be noted that Hartshorne preserves the distinction between God and the creatures: the divine being meets with no universe that it did not have a hand in co-creating whereas the creatures, because they begin to exist, are born into a universe that they had no part in making. Of course, once the creature exists, it becomes a lesser co-creator with God.

In The Divine Relativity and elsewhere, Hartshorne distinguishes two forms of power involving direct and indirect causation. Direct causal influence occurs when one entity (Hartshorne’s name for the metaphysically basic entities is “dynamic singulars”) acts on another without an intermediary, as when a present experience acts upon an immediately subsequent experience in the life of a single individual; one’s memory of the preceding moment, for example, is the feeling of one experience acting on its successor in direct fashion. Hartshorne avers that a similar direct action occurs between parts of the nervous system and between the nervous system and the body. Indirect causal influence, on Hartshorne’s account, occurs when one body acts upon another body, which often involves modifying the inter-bodily environment in some way, as in speaking, which sets the air in motion and produces sound waves that another person hears. Some cases of indirect causal action are examples of “brute force” whereby one body moves another body from one place to another. Barring telepathy, cases of one person acting on another are always indirect. On the other hand, Hartshorne maintains that God’s action on dynamic singulars is never indirect. Because each entity retains its own power of creative experiencing, this direct causal influence is not deterministic. Hartshorne, following Whitehead (who was following the later Plato), refers to this mode of influence as “divine persuasion” which is, in effect, the active side of divine love. God acts as a supreme ideal, urging each dynamic singular to achieve an intensity of experience appropriate to its level of complexity. Thus, in Creative Experiencing Hartshorne says, “It is the [divine] love that explains the [divine] power, not vice versa.”

Some philosophers accept Hartshorne’s critique of the traditional concept of omnipotence but argue that the neoclassical account of divine power does not endow God with the highest degree of power conceivable. One may concede that “divine persuasion” is the most admirable form of power, but insist nevertheless that God should also be conceived as having the ability to thwart human decisions by preventing them from being acted upon or by preventing their natural consequences from occurring. In Divine Power in Process Theism, David Basinger notes that a parent can force an unruly child to go to bed by physically putting the child there. If God is unable to accomplish such a feat then, Basinger argues, God does not have the highest degree of power, for the parent is able to do what God cannot. In response one may note that Hartshorne’s metaphysical principles allow that God has the ability to persuade the child to get into bed or even to persuade the parent to force the child into bed. It is contrary to Hartshorne’s thinking, however, to say that God has a body with a location within the cosmos. This is also contrary to classical theism (also Basinger’s “free will theism”)—the idea that Jesus was God embodied involves metaphysical issues which Basinger’s critique does not presuppose. In view of these qualifications, Basinger’s objection seems to be that if God is to be conceived as having the highest degree of power, God must be able to accomplish miraculously what the parent accomplishes without a miracle through the use of his or her body.

Hartshorne responded to Basinger’s critique in a letter (dated August 4, 1988) and said, among other things, that he doubted that he ever claimed that miracles never occur. He was disinclined to believe that miracles have in fact occurred on grounds similar to those offered by Hume (also Montaigne): probabilities favor deceit or error over genuine miracles. Hartshorne attributed the laws of nature to God’s influence over all dynamic singulars (see the article, “Charles Hartshorne on Theistic and Anti-theistic Arguments: Global Argument”) and said that he doubted our wisdom to judge how far the value of such laws “justifies the absence of notable divine intervention.” Doubting, however, the quality of evidence for miracles is different from doubting the possibility of miracles. Basinger replied to Hartshorne (August 24, 1988) that he wasn’t “quite sure” what it could mean in neoclassical metaphysics to suppose that miracles could occur. This is a fair question, especially in light of Hartshorne’s denial that God acts indirectly. On the other hand, it is fair to ask for an account of divine power that is not merely ad hoc but flows naturally from general metaphysical principles such as Hartshorne was at pains to give. With the possible exception of Descartes’ concept of omnipotence, every account of divine power includes propositions of the form “God cannot X.” The force of the “cannot” may be in the logical impossibility of the act named (for example, making a circle with unequal radii), in the nature of God (for example, God cannot intend evil), in the nature of that over which divine power is exercised (for example, God cannot create a creature’s creative act), or in the particular relations that God has with the creatures (for example, God cannot act indirectly). It is a legitimate question what it means to speak of attributing the highest degree of power to God apart from a system of metaphysical principles. 
It is not that a particular metaphysic is a final court of appeal for a concept of divine power; on the other hand, an appeal to divine power may be no more than a deus ex machina apart from a well-articulated metaphysic.

5. Divine Knowledge

One of the lessons to be learned from debates about divine power is that one’s ideas about God have implications for one’s ideas about the world and vice versa. To assume that God can bring about any logically possible state of affairs presupposes that all states of affairs are such that, in principle, they require only a single being to bring them about. That presupposition, however, begs the question against a world-view like Hartshorne’s in which reality has a social structure. In such a world, it is no limit on God if God cannot bring about every logically possible state of affairs. There is an analogous lesson where divine knowledge is concerned. If reality is continually in-the-making, as Hartshorne maintains, then there is a fundamental asymmetry between past and future. The past is fully determinate and the future is the realm of the partially indeterminate. If God is all-knowing, then God must know the future for what it is, as partially indeterminate. If one raises the objection that such a deity is not omniscient because the future is partially hidden from it, one has failed to cross the pons asinorum of the debate. It is a defect in divine knowledge not to know a fully determinate future only if there is a fully determinate future to be known. The assumption of a fully determinate future is evident in the use that Aquinas makes of the analogy made famous by Boethius: as each point on the circumference of a circle is equidistant from the center, so God is equally knowledgeable of every moment of time (SCG I, 66.7; see also Boethius, Consolation of Philosophy, Bk 4, Pr. 6). As Hartshorne noted, however, the analogy assumes that time can be represented as a completed whole, whereas time may be more like an endless line whose points are added from moment to moment.

Hartshorne’s criticism of the circle analogy was anticipated by late medieval philosophers like John Duns Scotus (Ordinatio I, d. 39, q. 1-5) and Luis de Molina (Concordia IV, d. 49.18). The questions raised by the circle analogy concern not only the nature of time, but also the nature of God. Traditional theists were reluctant to attribute any passive potency to God; they thought that the perfection of the divine being required that God be immutable and impassible. If, however, God is not affected by anything external, then how is it that God knows the world? Aquinas answered that the cognitive relation in God is the reverse of what it is for humans. We know the world because it affects us, but God knows the world because God is its creator. The Thomistic solution may preserve divine impassibility but at the expense of making human freedom problematic. This problem was discussed in the previous section. There was, however, another very imaginative solution to the “mechanics of omniscience” given by Molina. He argued that, prior to creating the world, God has knowledge of what any possible free creature would do in any particular circumstance. Using this “middle knowledge” in combination with the knowledge of what creatures God has in fact chosen to create, God is able to know what every free creature will do in the circumstances where they have been placed.

New life was breathed into Molinism by analytic philosophers of religion in the late twentieth century. For his part, Hartshorne never directly addressed Molina’s theory. It is easy enough, however, to reconstruct a Hartshornean response to Molinism. Above all, it is important to appreciate that, of necessity, the logical subjects of God’s middle knowledge are possible persons. God’s knowledge of what would be the case for any free creature is pre-volitional; that is to say, God knows, prior to creating, what any creature, whether it is eventually created or not, would do under any given circumstance. Middle knowledge cannot serve to guide God’s providential decisions about which world to create if it depends upon which world God creates. For this reason, the usual characterization of middle knowledge as “counter-factuals of freedom” is seriously misleading. Prior to God’s decision to create a world, there are no creatures and, hence, no fact of the matter about any actual creature. There are only possible creatures. Hartshorne denied the existence of possible non-actual individuals. In Man’s Vision of God and the Logic of Theism he wrote, “There is an unutilized possibility of individuals, but not an individuality of unutilized possibility.” (See also “Charles Hartshorne: Neoclassical Metaphysics.”) Given these views, it is clear that Hartshorne would reject Molinism.

There is a hint of irony in claiming to know what Hartshorne would say about middle knowledge. Does this not presuppose a kind of middle knowledge of Hartshorne? In view of what was just said about the logical subjects of middle knowledge, the answer to this question should be obvious. Hartshorne was not a possible person; he was a real person whose views on various philosophical topics were clearly stated. The argument is this: Molinism entails belief in possible persons; Hartshorne denied the existence of possible persons; therefore, Hartshorne would deny Molinism. This argument points to one of the most puzzling features of Molinism, to wit, that middle knowledge is not grounded in fact. Hartshorne’s developmental and cumulative view of process permits speculation about what a given actual person would or might do under various sets of circumstances. These “would be” and “might be” statements are grounded in the world-historical process itself, including a person’s character as so far formed or (as in Hartshorne’s case) as it was formed. Hartshorne made precisely this point in his response to Robert Kane in the Library of Living Philosophers volume devoted to Hartshorne’s work. For Hartshorne, God’s knowledge of the world is similar to our knowledge in that it requires a real relation from the object of knowledge to the knower. The difference, in God’s case, is that divine knowledge is eminent—God perfectly knows the extent to which the future is open or closed at any particular juncture of the creative advance.

A subtle objection to Hartshorne’s theory of omniscience is that it represents God as ignorant of certain truths. To be sure, the neoclassical God perfectly knows the past—what did or did not happen—but does God, as so conceived, know everything that will or will not happen? Consider a person, P, at time T1 as yet undecided about a difficult choice: will P choose B or not-B? Let us suppose that at T2 the person decides B. On Hartshorne’s account, God knows at T2 that P chooses B, but God does not know at T1 that P will choose B. The argument can be further refined: an omniscient being knows all truths; at T1, either “P will choose B” or “P will choose not-B” is true; the neoclassical God does not know at T1 which of the statements is true; therefore, this God is not omniscient.

Hartshorne’s initial response to this objection, in a 1939 article, was to argue, in effect, that there are three truth values: true, false, and indeterminate. According to this view (which may have been Aristotle’s), future tense statements have an indeterminate truth value. Hartshorne was unhappy with this idea because it requires abandoning the law of excluded middle; if p concerns a future event, then “p or not-p” is best construed as indeterminate rather than (as in traditional logic) a tautology. In Man’s Vision of God and the Logic of Theism, Hartshorne hit upon a different response to the argument, one which he would develop more fully in an article in Mind in 1965 (reprinted in Creative Experiencing). Hartshorne’s mature position was to argue that “P will choose B” and “P will choose not-B” are best construed as contraries rather than contradictories. The strict contradictory of “P will choose B” is “P may not choose B” and the strict contradictory of “P will not choose B” is “P may choose B.” The statement forms in the triad (“P will choose B,” “P will not choose B,” and “P may or may not choose B”) are mutually exclusive: if one is true, the other two are false. In this way, Hartshorne preserves the law of excluded middle as to truth values while allowing for the openness of the future.

Since, on Hartshorne’s view, “will” and “will not” statements are contraries, it is incorrect to represent them in the sentential meta-language as, respectively, p and not-p. Rather, “X will occur” and “X will not occur” should be represented as p and q, where ~ (p & q) (that is, “not-(p and q)”). A similar mapping of object language expressions onto sentential meta-language is needed in other domains as when one represents the pairs of contraries, “commands X” vs. “forbids X” or “legally requires X” vs. “legally requires not-X”: the remaining alternative in each case, respectively, is “makes no command with respect to X” and “there is no legal requirement with respect to X.” The metaphysical underpinning of Hartshorne’s proposed semantics of future tense statements is his indeterminism, according to which past causal conditions require (X will occur), exclude (X will not occur), or permit (X may or may not occur) various effects in the future.
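The logical relations just described can be checked mechanically. The following is only an illustrative sketch, not Hartshorne’s own formalism: it assumes that the causal status of a future event X can be modeled as exactly one of three values, and the names Status, will, will_not, may, may_not, and may_or_may_not are hypothetical labels introduced here for the example.

```python
from enum import Enum


class Status(Enum):
    """Assumed model: how past causal conditions bear on a future event X."""
    REQUIRED = "required"    # conditions require X
    EXCLUDED = "excluded"    # conditions exclude X
    PERMITTED = "permitted"  # conditions permit X without requiring it


def will(s):            # "X will occur"
    return s is Status.REQUIRED


def will_not(s):        # "X will not occur"
    return s is Status.EXCLUDED


def may_or_may_not(s):  # "X may or may not occur"
    return s is Status.PERMITTED


def may(s):             # "X may occur": strict contradictory of "will not"
    return s is not Status.EXCLUDED


def may_not(s):         # "X may not occur": strict contradictory of "will"
    return s is not Status.REQUIRED


def check_triad():
    """Return True if the triad behaves as described in every possible status."""
    for s in Status:
        # Exactly one member of the triad is true.
        if sum([will(s), will_not(s), may_or_may_not(s)]) != 1:
            return False
        # "Will" and "will not" are contraries: never both true.
        if will(s) and will_not(s):
            return False
        # Each statement and its strict contradictory have opposite truth values,
        # so excluded middle holds for the pairs will/may-not and will-not/may.
        if will(s) == may_not(s) or will_not(s) == may(s):
            return False
    return True


print(check_triad())
```

On this model, when the status is PERMITTED both “X will occur” and “X will not occur” come out false, which is what distinguishes contraries from contradictories.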

Anticipating an objection, Hartshorne admits that it seems paradoxical to say that “X will occur,” as a prediction, is false even when X in fact occurs. Hartshorne replies that the “paradox” may be no more problematic than the familiar fact that a false scientific law can be verified (or corroborated). This is simply one more instance of the so-called paradox of material implication. We accept that “if p then q” is true when p is false, even if this seems counter-intuitive. The paradox dissolves upon the realization that any other truth functional definition of the conditional besides the standard one (on which “if p then q” is equivalent by definition to “not-p or q”) yields manifestly invalid inferences. Hartshorne takes a cue from Popper and says that the decisive operation where “will be” statements are concerned is falsification. “X will occur” is shown to be false when X does not occur, but it is not shown to have been true when X occurs. Hartshorne’s view requires that, in the strictest philosophical sense, “will be” statements are disguised “must be” statements. Intuitions among competent speakers of the language differ on this point so it is reasonable not to expect the issue to be decided by ordinary language. When Scrooge, in Dickens’ A Christmas Carol, asks the Ghost of Christmas Yet to Come whether he is seeing the shadows of the things that “will be” or the shadows of the things that “may be only,” he is expressing in a precise way Hartshorne’s analysis of future tense statements. If the shadows are of the things that “will be,” then all hope is lost, but if they are the shadows of the things that “may be only” then Scrooge can change his ways and make for himself a different future.
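The remark about the material conditional can be illustrated by brute force. The sketch below is one possible reconstruction, not an argument taken from Hartshorne: it assumes three requirements on any truth-functional “if p then q” (that “if p then p” is always true, that modus ponens is valid, and that affirming the consequent is invalid) and then searches all sixteen two-place truth functions. The function names are hypothetical.

```python
from itertools import product

ROWS = list(product([True, False], repeat=2))  # all (p, q) assignments


def candidate_connectives():
    """Yield every two-place truth function as a dict from (p, q) to a truth value."""
    for outputs in product([True, False], repeat=4):
        yield dict(zip(ROWS, outputs))


def acceptable(f):
    """Assumed requirements on a truth-functional conditional."""
    # "If p then p" must be true whatever the value of p.
    reflexive = f[(True, True)] and f[(False, False)]
    # Modus ponens valid: no row where p and "if p then q" are true but q is false.
    modus_ponens = not f[(True, False)]
    # Affirming the consequent invalid: some row where "if p then q" and q are
    # true but p is false; the only such row is (False, True).
    no_affirming_consequent = f[(False, True)]
    return reflexive and modus_ponens and no_affirming_consequent


survivors = [f for f in candidate_connectives() if acceptable(f)]
material = {(p, q): (not p) or q for p, q in ROWS}

print(len(survivors), survivors[0] == material)
```

Under these assumed requirements only one truth function survives, and it is the standard table on which “if p then q” agrees with “not-p or q.”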

Our discussion to this point has followed philosophical orthodoxy by focusing on whether God knows the truth values of propositions. For Hartshorne, however, this question is secondary, for there is more to knowledge than knowledge that a proposition is true. In The Principles of Psychology, William James, following John Grote, distinguished “knowledge of acquaintance” and “knowledge-about,” a distinction later made famous by Bertrand Russell, who spoke of “knowledge by acquaintance” and “knowledge by description.” To have information about something or someone is not the same as having first-hand awareness of them. The two sorts of knowledge are related as more abstract to more concrete. It is one thing to read about a battle, quite another to have experienced it for oneself. Moreover, as a general rule, the more abstract the knowledge, the more emotionally detached it can be. The basic form of knowledge that Hartshorne attributes to deity is direct acquaintance through the affective bonds of feeling; Hartshorne adopts Whitehead’s term “prehension” for the most concrete facts of relatedness among dynamic singulars. If God’s knowledge is prehensive, it is perhaps easier to understand why Hartshorne resists the idea that God knows the future as determinate: no one is acquainted with the future; at best one has knowledge of acquaintance of the future as an array of tendencies towards actualization or as possibilities entertained. Moreover, conceiving God’s relations with the creatures as prehensive places emphasis on the affective dimension of divine knowing. God’s knowing, as feelings of the feelings of others, can then be conceived as a form of caring.

Hartshorne’s theory posits God’s perfect knowledge of the future as relatively indeterminate and of the past as determinate. Yet, the past, even if it is determinate as Hartshorne claims, is no longer. Does this mean that God also lacks knowledge of acquaintance with the past? Hartshorne answers in the negative and it is important to understand his reasons. A creature, having specific spatio-temporal location, has acquaintance with at most a vanishingly small segment of events in space and time, and even that knowledge is shot through with fallibility. Most of our knowledge of the past is through inference and by description. We know by acquaintance with the past we have lived, but most of our knowledge of the past is about the past. God’s knowledge is both quantitatively and qualitatively different. Divine experience encompasses everything that has ever come to pass. As a localized individual has acquaintance with its past, God, in an analogous fashion, has acquaintance with all that is past. Divine knowledge, moreover, not only knows all of the past but knows it with perfect adequacy. God’s is the eminent form of prehension. On Hartshorne’s principles, the distant past must be as vivid for God as the recent past. In other words, the past does not “fade” for God. The difference, for God, between distant past events and recent ones is in the knowledge that recent events were preceded by the distant ones whereas there was a time when the recent events were, at best, outlines of what could be relative to distant past events.

The extent of God’s knowledge of the past is a point of contention between Whitehead (or Whiteheadians) and Hartshorne. In the concluding lines of Process and Reality, Whitehead speaks of how creaturely achievements, though transient, are everlastingly remembered by God, making them objectively immortal. Our immediate actions, whose importance is “unfading,” are said to “perish and yet live for evermore.” In the Library of Living Philosophers volume on Hartshorne, Lewis Ford interprets Whitehead to mean that each actual occasion (Hartshorne’s dynamic singulars) undergoes a two-stage process, its coming-to-be (during which it is a subject of experience) and its objectification (in which it ceases to be a subject of experience) in the coming-to-be of subsequent occasions. According to Ford, it was Whitehead’s “momentous discovery” in metaphysics that the subject/object distinction is a difference in temporal modality; that is to say, an occasion’s status in the present, as it comes to be, is to be a subject, but as past it is an object. Hartshorne agrees with much of this analysis, but he objects to Whitehead’s metaphor of perishing. Hartshorne contends that the objects that are prehended by subsequent occasions are past subjects. If the being of an actual entity is constituted by its becoming, as Whitehead says (and Hartshorne agrees), then God’s prehension of an occasion is precisely God’s feeling of that occasion’s feelings. What exists everlastingly in the divine memory is not merely a knowledge that a dynamic singular felt in a particular way, but an acquaintance with how it felt. Hartshorne likens God’s memory of a person’s experiences to the person’s own vivid recollection of their past experiences.

6. Panentheism

A distinctive feature of Hartshorne’s theism, and one that sets it apart from Whitehead’s theism, is that God includes the universe in a way that bears a distant analogy with the way that a person includes his or her body. Until 1941 Hartshorne spoke of a “new pantheism,” but afterwards he spoke of panentheism, meaning that all (pan) is in (en) God (theos). Hartshorne cited Plato’s World-Soul analogy in some of the later dialogues as an anticipation of panentheism. Hartshorne, however, divests the doctrine of any vestige of mind-body dualism. God is not an immaterial entity haunting the universe; rather, as Hartshorne says in Omnipotence and Other Theological Mistakes, God is “the individual integrity of ‘the world,’ which otherwise is just the myriad creatures.” Hartshorne relies on modern cell theory for an analogy which, of course, was unavailable to Plato. Every localized dynamic singular is, as it were, a cell, in the body of God. An important disanalogy is that the universe, unlike a body within the universe, has no environment external to itself. Thus, in the divine case, the “body” of God and the “environment” in which God operates are one and the same. Hartshorne expresses this idea by saying that God’s “environment” is wholly internal. He adds that the disanalogy explains why there are no specialized organs—such as liver, heart, and brain—in the divine body as there must be in a localized body. Specialized organs allow a localized body to monitor itself in its relation to its environment, but there is no other environment for God to negotiate except the universe. Dombrowski rightly says that, for Hartshorne, it is as true to say that the cosmos is ensouled as to say that God is embodied (Dombrowski 1996, 86).

Hartshorne also used analogies of persons related to persons as symbolic language for the relationship between God and the creatures. He was deeply critical, however, of the male bias of traditional theology. The few female metaphors used for God in the Bible, for example, were overshadowed by the dominance of male images (Lord of Hosts, Father, King), which reinforced patriarchal attitudes. Hartshorne considered himself a feminist. Sometime in the late 1970s or early 1980s, he was alerted to the problem of sexism in language and so he began using inclusive language, as one can see in Omnipotence and Other Theological Mistakes and elsewhere. He said that, in retrospect, it would have been better had his early book Man’s Vision of God been titled Our Vision of God (Auxier and Davies 2001, 159). Hartshorne’s feminism is also apparent in a variation he gives to panentheism. He argued that the relationship between mother and fetus is decidedly more intimate than the relation between father and fetus. Thus, for some purposes, the analogy of a pregnant mother for the relation between God and the creatures is preferable to any male counterpart. Of course, the pregnancy analogy, like all symbolic language for deity, has a restricted use. Nevertheless, re-imagining God as a woman is a useful reminder of the male bias of traditional theology and it helps to highlight aspects of the God/World relationship that were obscured by that bias.

Analogies like World-Soul, person-cell, or pregnancy are at best distant approximations for the relationship of God to the world. As metaphors they are literally false, but they are aids in understanding what Hartshorne has in mind when he says that God includes the world. Hartshorne’s argument for panentheism is disarmingly simple: If God is the greatest conceivable reality, then God must include all that is valuable in the universe. Otherwise, there would be a reality greater than God, namely, the universe-plus-God. Could God include what is valuable in the universe without including the universe? Hartshorne does not think so. Each dynamic singular that comes to be is not simply an additional fact; it is, by virtue of Hartshorne’s panexperientialist psychicalism, also a value-achievement, and that value-achievement is greater in more complex organisms. This article has previously used the examples of Mozart and Beethoven as introducing new values into the universe, but other examples are legion. The sum total of value in the universe, which is inseparable from the dynamic singulars that comprise it, is ever increasing according to Hartshorne’s process-relational metaphysic. It must therefore be included within God if God is to be conceived as the reality than which none is greater.

Norris Clarke says that medieval philosophers anticipated Hartshorne’s argument and replied to it (Clarke 1990, 108). They said that the reality described by “God plus the universe” involves more beings in a quantitative sense, but not greater perfection of being in a qualitative sense. More precisely, says Clarke, “God plus the universe” means that there are more sharers in being. All value is in God, and the creatures merely share or participate in that value. By way of analogy, Clarke says that a mathematician may impart her knowledge to her students. Once the students learn what the teacher has to teach, there is not more knowledge in the class; there are only more people sharing in the knowledge. A different analogy, however, could be used to bring out the distinctiveness of Hartshorne’s view. A music teacher may provide her class with the basics of theory and composition, but the students can create new musical pieces, each with a value of its own. In this example, there are not simply more sharers of being, but more creators of value. The medieval response that Clarke gives is defective, on Hartshorne’s reckoning, at precisely the point that process-relational theology departs from classical theism: the universe is not simply a product of divine creativity but of multiple creative agents. Classical theism had the unhappy consequence of divesting the creatures of any value that is their own, except for what is on loan from God. The sum-total value or perfection of existence is the same whether or not the creatures exist. For this reason, Hartshorne considered his panentheism to give a better account than classical theism of what it means to serve God. If the value in a creature is wholly borrowed from God, then the individual can offer God nothing that did not already belong to God by natural endowment. For Hartshorne, on the other hand, the creatures may be imperfect, but they are not mere conduits for values that God already possesses. On the contrary, their value contributes to that of God; hence Hartshorne’s expression, “contributionism.”

A question that Hartshorne raised in Man’s Vision of God and the Logic of Theism and that he discussed with E. S. Brightman in their correspondence was whether it is possible for God to include individuals that hold erroneous beliefs without also holding those beliefs. Put somewhat differently, if different individuals hold contradictory beliefs and God “includes” those individuals and their beliefs, does God hold contradictory beliefs? Similar puzzles can be raised about God’s inclusion of individuals who commit terrible crimes—is the evil of the criminal deed a property of God? Or again, can God include creatures who are anxious about their death without also being anxious about death? Hartshorne replies that the logic of parts and wholes is such that they do not necessarily share properties—for example, a sand dune is not the size of a grain of sand even though it is made of grains of sand. Each part of the universe, Hartshorne holds, is a dynamic singular with an activity of its own that is not simply the activity of the universe as a whole (this is another way of expressing indeterminism). By parity of reasoning, these centers of individual activity, or the organisms of which they are parts, can have properties (such as false beliefs, evil deeds, or anxiety about death) that are not shared by the whole. A person can remember formerly holding a false belief or doing something wrong; God, by analogous extension, can prehend—that is, make part of the divine life—the errors and sins of the creatures without thereby being in error or sinning. It is important to add that while Hartshorne denies that God is the author of creaturely lack of wisdom and virtue, God nevertheless suffers their negative effects. In Creativity in American Philosophy, Hartshorne maintains that God feels how others feel without feeling as they feel (1984, 199).

Two advantages of panentheism, as Hartshorne argues for it, are that it provides a ready argument in support of monotheism and it addresses the empiricist challenge of how to identify the referent of the word God. If God is an all-inclusive reality, then there can be only one God because there can be only one all-inclusive reality. In “Synthesis as Polydyadic Inclusion,” Hartshorne defines inclusion in these terms: if X includes Y, then X + Y = X (1976, 247). If X denotes God and Y denotes the universe, then God plus the universe is God. The argument that there could not be two all-inclusive deities is this: suppose W and X are two all-inclusive deities; this means that each must include the other. That is to say, W + X = X and W + X = W, but in that case, W = X. As for the empiricist challenge, the conditions for the identification of the panentheistic God are not the same as would be required to identify a localized being. Individuals within the cosmos occupy a tiny portion of the universe for a vanishingly brief period. Their influence is felt locally but not universally. God, on the other hand, is affected by all and affects all. As Hartshorne says, God is the one individual with strictly universal functions (1948, 31; 1967, 76). From this, he infers that God is the one individual identifiable, or picked out, by concepts alone. Other individuals have properties that might have been had by others (for example, Obama was the Democratic candidate for President in 2008, but he need not have been) and the properties they actually have might have been different (for example, Hillary Clinton was born in Chicago, but she could have been born elsewhere). The formal properties of God as all inclusive are unique to God: no other individual has universal functions. One might search the earth for Obama or Clinton, but it would be profoundly misguided to search the earth, or the cosmos, for God. The description of God in the book of Acts is applicable to Hartshorne’s panentheism: God is the one “in whom we live, move, and have our being.”
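The uniqueness argument from the definition of inclusion can be tried out on a small model. The sketch below is offered only as an illustration under a loud assumption: it reads Hartshorne’s “+” as set union over a finite stock of items, which is a simplification chosen for the example and not something the 1976 article is claimed to endorse. The function names are hypothetical.

```python
from itertools import combinations


def includes(x, y):
    """Definition from the text, with "+" read as union: X includes Y iff X + Y = X."""
    return x | y == x


def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mutual_inclusion_implies_identity(universe):
    """Check: whenever W includes X and X includes W, W and X are the same."""
    all_sets = subsets(universe)
    return all(w == x
               for w in all_sets
               for x in all_sets
               if includes(w, x) and includes(x, w))


def all_inclusive(universe):
    """Every candidate that includes each candidate in the model."""
    all_sets = subsets(universe)
    return [x for x in all_sets if all(includes(x, y) for y in all_sets)]


toy_universe = {"a", "b", "c"}  # hypothetical stand-ins for creatures
print(mutual_inclusion_implies_identity(toy_universe))
print(len(all_inclusive(toy_universe)))
```

In this toy model there is exactly one all-inclusive candidate, the whole universe itself, which mirrors the step from W + X = X and W + X = W to W = X.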

7. Conclusion

The amount of energy that Hartshorne devoted to questions surrounding the nature and existence of God might lead one to classify him as a theologian. Yet, his defense of dipolar theism presupposes no sectarian dogma, makes no appeals to “revealed” truths or books, and privileges no mystical experience. There can be no question that he was first and foremost—as he himself emphasized—a philosopher. Various ideas about deity that he defended, most notably his critique of divine immutability and impassibility, have been widely influential although few would be willing to call themselves Hartshorneans. A case in point is the late William P. Alston who had been a student in Hartshorne’s class and who, late in his career, attempted to find a mediating position between Hartshorne and Aquinas. Another example of Hartshorne’s influence is that he spoke explicitly of “the openness of God” fully thirty years before that expression was adopted by a group of evangelical Christians to describe a deity open to creaturely influence and that faces a relatively open future. Some of the major figures in that movement—William Hasker, Gregory Boyd, and Richard Rice—acknowledge a debt to Hartshorne’s arguments for conceiving God in relational terms even as they distance themselves from the heterodox elements of his thinking. One may also mention Hartshorne as a pioneer who contributed to the recent widespread interest among philosophers of religion in panentheism. Carol Christ, long at the forefront of feminist theology, sees in Hartshorne’s work philosophically sophisticated ways of “re-imagining the divine in the world.”

Although he was a philosopher, Hartshorne’s work has attracted the attention of theologians. In 1973, a volume devoted to his thought was published in a series titled “Makers of the Modern Theological Mind.” Many theologians, such as Schubert Ogden (who studied with Hartshorne at Chicago), Marjorie Suchocki, Sheila Davaney, Anna Case-Winters, and Theodore Walker, Jr., have critically appropriated Hartshorne’s philosophical theology. John B. Cobb Jr. (who also studied with Hartshorne at Chicago) once commented that it is often the case that a philosopher who gains a following among theologians is regarded with suspicion by other philosophers. This tendency may be less prominent since the resurgence of interest in philosophy of religion in the closing decades of the twentieth century. Of course, Hartshorne was active throughout the century, vigorously defending the rationality of dipolar theism in the heyday of the Vienna Circle. At a time when religious discourse was widely regarded as nonsensical, Hartshorne met and challenged the positivists on their own terms. It is fair to say that Hartshorne was influenced by his Chicago colleague Rudolf Carnap in his insistence on high standards of logical rigor. Carnap was, in turn, constructively engaged with Hartshorne’s work. Carnap was reportedly intrigued by Hartshorne’s formal reductio ad absurdum argument against the coherence of the classical attributes of deity as developed in The Divine Relativity; he worked with Hartshorne closely on the technical appendix to Chapter II on “Relativity and Logical Entailment” in The Divine Relativity.

Hartshorne’s development of a philosophical theology according to which God is transcendent yet inseparable from temporal processes is arguably one of his lasting achievements. His defense of divine relativity may well be the single most important factor in dissolving the near consensus that once prevailed that an entirely unchanging and eternal deity should be considered normative for theology. He considered the deity of the classical tradition as at once too active and too static. It is too active in the sense that nothing falls outside its control; the creatures are left to unwittingly play roles decided for them in eternity (“imitations of life,” as Jules Lequyer called them). It is too static in the sense that it cannot change or be affected by the triumphs and tragedies of the creatures. In short, it is a deity that acts but is never acted upon and can therefore never interact. This is captured in the Aristotelian formula that was borrowed and reinterpreted by medieval thinkers to denote the God of the Abrahamic traditions: God as the “unmoved mover.” In a discussion of Mortimer Adler’s use of this formula, Hartshorne once called it a half-truth parading as the full truth. Hartshorne admired Abraham Heschel for reversing this idea by calling God the “most moved mover” (a phrase later adopted by Clark Pinnock). Hartshorne amended this formula to distill the essence of dipolar or neoclassical theism: God is the most and best moved mover.

8. Suggestions for Further Reading

a. Primary Sources

i. Books (in order of appearance)

Hartshorne, Charles. 1941. Man’s Vision of God and the Logic of Theism. Chicago: Willett, Clark and Company.
Hartshorne, Charles. 1948. The Divine Relativity: A Social Conception of God. New Haven, Connecticut: Yale University Press.
Hartshorne, Charles. 1953. Reality as Social Process: Studies in Metaphysics and Religion. Boston: Beacon Press.
Hartshorne, Charles and William L. Reese, eds. 1953. Philosophers Speak of God. University of Chicago Press. Republished in 2000 by Humanity Books.
Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1965. Anselm’s Discovery: A Re-examination of the Ontological Proof for God’s Existence. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1967. A Natural Theology for Our Time. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1976. Aquinas to Whitehead: Seven Centuries of Metaphysics of Religion. Milwaukee, Wisconsin: Marquette University Publications.
Hartshorne, Charles. 1984. Creativity in American Philosophy. Albany: State University of New York Press.
Hartshorne, Charles. 1984. Omnipotence and Other Theological Mistakes. Albany: State University of New York Press.
Hartshorne, Charles. 1987. Wisdom as Moderation: A Philosophy of the Middle Way. Albany: State University of New York Press.
Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy, edited by Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom, edited by Donald W. Viney and Jincheol O. Albany: State University of New York Press.
Auxier, Randall E. and Mark Y. A. Davies, editors. 2001. Hartshorne and Brightman on God, Process, and Persons: The Correspondence, 1922-1945. Nashville: Vanderbilt University Press.
Viney, Donald W., guest editor. 2001. Process Studies, Special Focus on Charles Hartshorne, 30/2 (Fall-Winter).
Viney, Donald W., guest editor. 2011. Process Studies, Special Focus Section: Charles Hartshorne, 40/1 (Spring/Summer): 91-161.

ii. Hartshorne’s Response to his Critics

Cobb, John B. Jr. and Franklin I. Gamwell, editors. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
Hahn, Lewis Edwin, editor. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. La Salle, Illinois: Open Court.
Kane, Robert and Stephen H. Phillips, editors. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
Sia, Santiago, editor. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

iii. Selected Articles

Hartshorne, Charles. 1945. Entries for “Eternal” (256), “Eternity” (257), “Foreknowledge, Divine” (284), “Omniscience” (546-47), “Time” (787-88), “Transcendence” (791-92) in An Encyclopedia of Religion, ed. Vergilius Ferm. New York: Philosophical Library.
Hartshorne, Charles. 1950. “The Divine Relativity and Absoluteness: A Reply [to John Wild].” Review of Metaphysics 4, 1: 31-60.
Hartshorne, Charles. 1966. “A New Look at the Problem of Evil,” Current Philosophical Issues: Essays in Honor of Curt John Ducasse, edited by Frederick C. Dommeyer. Springfield, Illinois: Charles C. Thomas: 201-212.
Hartshorne, Charles. 1967. “Religion in Process Philosophy,” Religion in Philosophical and Cultural Perspective: A New Approach to the Philosophy of Religion Through Cross Disciplinary Studies, edited by J. Clayton Feaver and William Horosz. Princeton, New Jersey: D. Van Nostrand Company, Inc.: 246-268.
Hartshorne, Charles. 1967. “The Dipolar Conception of Deity.” Review of Metaphysics 21, 2: 273-89.
Hartshorne, Charles. 1969. “Divine Absoluteness and Divine Relativity.” Transcendence, eds. Herbert W. Richardson and Donald R. Cutler. Boston: Beacon: 164-71.
Hartshorne, Charles. 1971. “Could There Have Been Nothing? A Reply [to Houston Craighead].” Process Studies 1, 1: 25-28.
Hartshorne, Charles. 1976. “Synthesis as Polydyadic Inclusion: A Reply to Sessions’ Charles Hartshorne and Thirdness,” Southern Journal of Philosophy 14/2: 245-55.
Hartshorne, Charles. 1977. “Bell’s Theorem and Stapp’s Revised View of Space-Time.” Process Studies 7/3 (Fall): 183-191.
Hartshorne, Charles. 1978. “Theism in Asian and Western Thought.” Philosophy East and West 28, 4: 401-11.
Hartshorne, Charles. 1980. “Mysticism and Rationalistic Metaphysics.” Understanding Mysticism, edited by Richard Woods. Garden City, New York: Image: 415-421.
Hartshorne, Charles. 1984. “Toward a Buddhisto-Christian Religion.” Buddhism and American Thinkers, edited by Kenneth K. Inada and Nolan P. Jacobson. Albany: State University of New York Press: 1-13.
Hartshorne, Charles. 1992. “The Aesthetic Dimensions of Religious Experience.” Logic, God and Metaphysics, edited by James Franklin Harris. Dordrecht: Kluwer Academic Publishers: 9-18.
Hartshorne, Charles. 1993. “Can Philosophers Cooperate Intellectually: Metaphysics as Applied Mathematics.” The Midwest Quarterly 35/1 (Autumn): 8-20.

b. Secondary Sources

Blanchette, Oliva. 1994. “The Logic of Perfection in Aquinas.” Thomas Aquinas and His Legacy. Edited by David M. Gallagher. Studies in Philosophy and the History of Philosophy, Volume 28. Washington, D.C.: The Catholic University of America Press: 107-130.
Boyd, Gregory A. 1992. Trinity and Process: A Critical Evaluation and Reconstruction of Hartshorne’s Di-Polar Theism Towards a Trinitarian Metaphysics. New York: Peter Lang.
Burrell, David B. 1982. “Does Process Theology Rest on a Mistake?” Theological Studies 43/1 (March): 125-135.
Case-Winters, Anna. 1990. God’s Power: Traditional Understandings and Contemporary Challenges. Louisville, Kentucky: Westminster/John Knox Press.
Christ, Carol P. 2003. She Who Changes: Re-Imagining the Divine in the World. New York: Palgrave Macmillan.
Clarke, Bowman. 1966. Language and Natural Theology. The Hague: Mouton & Co.
Clarke, Bowman. 1995. “Two Process Views of God.” God, Reason and Religions: New Essays in the Philosophy of Religions. Edited by Eugene Thomas Long. Dordrecht: Kluwer Academic Publishers: 61-74.
Clarke, W. Norris. 1990. “Charles Hartshorne’s Philosophy of God: A Thomistic Critique,” Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Edited by Santiago Sia. Dordrecht: Kluwer Academic Publishers: 103-23.
Davaney, Sheila Greeve. 1986. Divine Power: A Study of Karl Barth and Charles Hartshorne. Harvard Dissertations in Religion, number 19. Philadelphia: Fortress Press.
Dombrowski, Daniel A. 1996. Analytic Theism, Hartshorne, and the Concept of God. Albany: State University of New York Press.
Dombrowski, Daniel A. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
Enxing, Julia and Klaus Müller, editors. 2011. Perfect Changes: Die Religionsphilosophie Charles Hartshornes. Regensburg: Friedrich Pustet.
Enxing, Julia. 2013. Gott im Werden. Die Prozesstheologie Charles Hartshorne. Regensburg: Friedrich Pustet.
Fitzgerald, Paul. 1972. “Relativity Physics and the God of Process Philosophy.” Process Studies 2/4 (Winter): 251-276.
Ford, Lewis S. 1968. “Is Process Theism Compatible with Relativity Theory?” Journal of Religion 48/2 (April): 124-135.
Geisler, Norman L. 1976. “Process Theology.” Tensions in Contemporary Theology. Edited by Stanley N. Gundry and Alan F. Johnson. Chicago: Moody Press: 235-284.
Gragg, Alan. 1973. Charles Hartshorne, Maker of the Modern Theological Mind, edited by Bob E. Patterson. Waco, Texas: Word Books Publisher.
Griffin, David Ray, John B. Cobb Jr., Marcus P. Ford, Pete A. Y. Gunter, and Peter Ochs. 1993. Founders of Constructive Postmodern Philosophy: Peirce, James, Bergson, Whitehead, and Hartshorne. Albany: State University of New York Press.
Gruenler, Royce Gordon. 1983. The Inexhaustible God: Biblical Faith and the Challenge of Process Theism. Grand Rapids, Michigan: Baker Book House.
Gunton, Colin E. 1978. Becoming and Being: The Doctrine of God in Charles Hartshorne and Karl Barth. Oxford University Press.
James, Ralph E. 1967. The Concrete God, A New Beginning for Theology—The Thought of Charles Hartshorne. Indianapolis, Indiana: The Bobbs-Merrill Company.
Kachappilly, Kurian. 2002. God of Love: A Neoclassical Inquiry. Bangalore, India: Dharmaram Publications.
Moskop, John C. 1984. Divine Omniscience and Human Freedom: Thomas Aquinas and Charles Hartshorne. Foreword by Charles Hartshorne. Macon, Georgia: Mercer University Press.
Myers, William, guest editor. 1998. The Personalist Forum, Special Issue on Charles Hartshorne, 14/2 (Fall).
Nash, Ronald H. editor. 1987. Process Theology. Grand Rapids, Michigan: Baker Book House.
Neville, Robert C. 1980. Creativity and God: A Challenge to Process Theology. New York: The Seabury Press.
Neville, Robert C. 2009. Realism in Religion: A Pragmatist’s Perspective. Albany: State University of New York Press.
Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
Pratt, Douglas. 2002. Relational Deity: Hartshorne and Macquarrie on God. Lanham, Maryland: University Press of America.
Ramal, Randy, editor. 2010. Metaphysics, Analysis, and the Grammar of God: Process and Analytic Voices in Dialogue. Tübingen, Germany: Mohr Siebeck.
Sanders, John. 2007. The God Who Risks: A Theology of Divine Providence, revised edition. Downers Grove, Illinois: IVP Academic.
Shields, George W. 1983. “God, Modality and Incoherence.” Encounter 44/1: 27-39.
Shields, George W. 1992. “Hartshorne and Creel on Impassibility,” Process Studies 21/1 (Spring): 44-59.
Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49/2 (January): 123-134.
Shields, George W. 2003. “Omniscience and Radical Particularity: Reply to Simoni,” Religious Studies 39/2 (October).
Shields, George W. 2009. “Quo Vadis?: On Current Prospects for Process Philosophy and Theology,” The American Journal of Theology & Philosophy, 30/2 (May).
Shields, George W. 2010. “Eternal Objects, Middle Knowledge, and Hartshorne: A Response to Malone-France,” Process Studies, 39/1 (Spring/Summer): 149-165.
Shields, George W. 2010. “Panexperientialism, Quantum Theory, and Neuroplasticity” in Process Approaches to Consciousness, eds. Michel Weber and A. Weekes. (Albany: State University of New York Press).
Shields, George W., editor. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
Sia, Santiago. 1985. God in Process Thought: A Study in Charles Hartshorne’s Concept of God. Postscript by Charles Hartshorne. Dordrecht, the Netherlands: Martinus Nijhoff.
Sia, Santiago. 2004. Religion, Reason and God: Essays in the Philosophy of Charles Hartshorne and A. N. Whitehead. Frankfurt am Main: Peter Lang.
Sia, Santiago, editor. 1986. Process Theology and the Christian Doctrine of God, special edition of Word and Spirit, a Monastic Review, 8. Petersham, Massachusetts: St. Bede’s Publications.
Simoni-Wastila, Henry. 1999. “Is Divine Relativity Possible? Charles Hartshorne on God’s Sympathy with the World.” Process Studies 28/1-2 (Spring-Summer): 98-116.
Sprigge, T. L. S. 2006. The God of Metaphysics. Oxford: Clarendon Press.
Suchocki, Marjorie Hewitt and John B. Cobb, Jr. editors. 1992. Process Studies, Special Issue on the Philosophy of Charles Hartshorne, 21/2 (Summer).
Towne, Edgar A. 1997. Two Types of Theism: Knowledge of God in the Thought of Paul Tillich and Charles Hartshorne. New York: Peter Lang.
Viney, Donald Wayne. 1985. Charles Hartshorne and the Existence of God. Albany: State University of New York Press.
Viney, Donald Wayne. 1989. “Does Omniscience Imply Foreknowledge? Craig on Hartshorne.” Process Studies, 18/1 (Spring): 30-37.
Viney, Donald Wayne. 2000. “What is Wrong with the Mirror Image? A Brief Reply to Simoni-Wastila on the Problem of Radical Particularity,” Process Studies, 29/2 (Fall-Winter): 365-367.
Viney, Donald Wayne. 2005. “Hartshorne, Charles (1897-2000)” The Dictionary of Modern American Philosophers, edited by John R. Shook (London: Thoemmes Press): 1056-62.
Viney, Donald Wayne. 2006. “God as the Most and Best Moved Mover: Charles Hartshorne’s Importance to Philosophical Theology.” The Midwest Quarterly, 48/1: 10-28.
Viney, Donald Wayne. 2007. “Hartshorne’s Dipolar Theism and the Mystery of God.” Philosophia, 35: 341-350.
Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40/4 (October): 293-300.
Wood, Forest Jr. and Michael DeArmey, editors. 1986. Hartshorne’s Neo-Classical Theology. Tulane Studies in Philosophy, volume 34.

c. Bibliography

“Primary Bibliography of Philosophical Works of Charles Hartshorne” (compiled by Dorothy Hartshorne; corrected, revised, and updated by Donald Wayne Viney and Randy Ramal) in Herbert F. Vetter, editor, Hartshorne: A New World View: Essays by Charles Hartshorne (Cambridge, Massachusetts: Harvard Square Library, 2007): 129-160. Also published in Santiago Sia, Religion, Reason and God (Frankfurt am Main: Peter Lang, 2004): 195-223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Charles Hartshorne: Biography and Psychology of Sensation

Charles Hartshorne is widely regarded as having been an important figure in twentieth century metaphysics and philosophy of religion. His contributions are wide-ranging. He championed the aspirations of metaphysics when it was unfashionable, and the metaphysic he championed helped change some of the fashions of philosophy. He counted some well-known scientists among his friends, and he embraced the deliverances of modern science (he never questioned, for example, the truth of evolution); however, he insisted that metaphysics and empirical science have different aims and methods, each ensuring in its own way a disciplined objectivity. His “neoclassical” or “process” metaphysics is in the same family of speculative philosophy that one finds in the works of Charles Sanders Peirce and the later writings of Alfred North Whitehead. Although he did not style himself a disciple of Peirce or of Whitehead, he made significant contributions to the study of these philosophers even as he developed his own views. Like them, he endeavored in his own metaphysical thinking to give full weight to the dynamic, relational, temporal, and affective dimensions of the universe. He emphasized, as few before him had, in logic and in the processes of nature, the foundational nature of asymmetrical relations.

Hartshorne was also a theist at a time when the coherence of theism was under attack from quarters as various as logical positivism and Sartre’s existentialism. Hartshorne’s name is inseparable from the revival of the ontological or modal argument for God’s existence; he devoted twenty-three articles and the better part of two books to the topic. He insisted, however, that it was unavailing to appeal to the ontological argument (or any theistic argument) as support for theism without first rethinking the concept of deity. He argued that thinking about God had been handicapped by lack of attention to the logically possible forms of theism, and in place of the unmoved mover of classical theology, he proposed “the most, and best, moved mover.” Hartshorne endorsed a “dipolar” version of theism according to which God is both necessary and contingent, but in different respects. Hartshorne sought a “panentheism” in which God includes the creatures without negating their distinctiveness. He argued that no putative inerrant revelation or infallible institution could negate the effects of the inherent fallibility of human knowledge. He occasionally worried that his “highly rationalized” form of theism would not have wide appeal; on the other hand, it was precisely a God of love and the love of God that were ever his “intuitive clue[s]” in philosophy. His ideas about deity influenced both the philosophy of religion and theology; Hartshorne argued that it is necessary to take seriously an alternative to classical understandings of God, one that avoids their shortcomings while preserving their best insights.

Hartshorne did not devote all of his intellectual energies to metaphysics and philosophical theology. His first book, The Philosophy and Psychology of Sensation (1934), ventured empirical hypotheses about sensation, a subject to which he returned intermittently throughout his life. Also of note is his Born to Sing: An Interpretation and World Survey of Bird Song (1973), which established him as a serious ornithologist. What follows is an overview of Hartshorne’s life as well as a discussion of his first book and its relation to the larger themes of his philosophy.

Biography
The Affective Continuum and the Psychology of Sensation
Conclusion: Hartshorne’s Work on Sensation and the Rest of his Philosophy
References and Further Reading

1. Biography

Charles Hartshorne (pronounced “Harts-horne”; literally, “deer’s horn”) was born June 5, 1897 in Kittanning, Pennsylvania, the second of six children of Francis Cope Hartshorne (1868-1950), an Episcopal minister, and Marguerite Haughton Hartshorne (1868-1959). He and his brother Richard (1899-1992)—who would achieve fame as a geographer—attended Yeates Boarding School (1911-1915), where he acquired a life-long interest in ornithology. Later, he attended Haverford College (1915-1917), where he was a student of the Quaker mystic Rufus Jones. With America’s entry into the First World War imminent, Hartshorne volunteered for the medical corps and spent the war years (1917-1919) in Le Tréport, France as an orderly in a British hospital.

What Hartshorne referred to as “the second period” of his intellectual development began when he enrolled at Harvard in 1919. He majored in philosophy and minored in English literature. Among his teachers were James Haughton Woods (named after Hartshorne’s maternal grandfather), W. E. Hocking, H. M. Sheffer, Ralph Barton Perry, C. I. Lewis, and the psychologist L. T. Troland. He completed the Ph.D. in 1923, writing a 306-page dissertation titled An Outline and Defense of the Argument for the Unity of Being in the Absolute or Divine Good. The broad outlines of his later thought are evident in the dissertation, but he never published any part of it. He would later remark, in Creativity in American Philosophy (1984), that it was a form of process philosophy that was “somewhat naïve and best forgotten.” Nevertheless, he was productive throughout his career, writing twenty-one books and over five hundred articles and reviews.

After graduation, Hartshorne returned to Europe as a Sheldon Traveling Fellow (1923-1925). He spent most of his time in Germany, but he also visited England, France, and Austria. He was fluent in German and spoke French reasonably well. His travels were rich with intellectual stimulation. In Europe he encountered many philosophical luminaries, including Moritz Schlick, Heinrich Gomperz, Lucien Lévy-Bruhl, Edouard Le Roy, Lucien Laberthonnière, Samuel Alexander, R. G. Collingwood, J. S. Haldane, G. E. Moore, G. F. Stout, Harold H. Joachim, Richard Kroner, Oskar Becker, Julius Ebbinghaus, Max Scheler, Max Planck, Adolf Harnack, Jonas Cohn, Paul Natorp, and Nicolai Hartmann. The most famous philosophers he met and with whom he studied were Edmund Husserl and Martin Heidegger. On his return to the United States, Hartshorne wrote the first English language review of Heidegger’s Sein und Zeit (Being and Time); the review appeared in the Philosophical Review and was published as part of the penultimate chapter of his second book.

Hartshorne was an Instructor and Research Fellow at Harvard (1925-1928) where he was simultaneously exposed to the two thinkers with whose philosophies he felt the most affinity: Charles Sanders Peirce (1839-1914) and Alfred North Whitehead (1861-1947). Boxes of Peirce’s unpublished manuscripts were donated to the Harvard library by Peirce’s widow, and Hartshorne was given the assignment of editing these papers. In 1927, Paul Weiss joined Hartshorne in the project. The Collected Papers of Charles Sanders Peirce was published in six volumes between 1931 and 1935 and would become the standard edition of Peirce’s work throughout the century. Although Hartshorne published enough articles on Peirce to fill a book—a total of seventeen—neither he nor Weiss thought of becoming Peirce scholars. Hartshorne’s duties at Harvard also included helping to grade papers for Whitehead, who was a recent addition to the faculty (1924). As Whitehead’s assistant, Hartshorne witnessed the Englishman develop “the philosophy of organism” that would find expression in Whitehead’s Gifford Lectures, published as Process and Reality (1929). This book, as well as others written by Whitehead during this period, formed the foundation of twentieth century process philosophy.

Hartshorne’s earliest writings, prior to his encounter with Whitehead, emphasize process and relativity as metaphysically basic; for this reason, he characterized his relation to Whitehead (also to Peirce) as one of pre-established harmony. Just as he would write much on Peirce’s philosophy, so he promoted Whitehead’s importance in thirty-nine articles and reviews; thirteen of these articles are collected in Whitehead’s Philosophy: Selected Essays 1935-1970 (1972). For a time Hartshorne considered himself a Peircean and a Whiteheadian, in each case, as he said, “with reservations”—in later years he emphasized the reservations. It is clear, in any event, that the exposure to Peirce and Whitehead helped him to focus his thinking. Whitehead’s works, in particular, provided him with a technical vocabulary for expressing his own metaphysics that in some respects overlaps with Whitehead’s but in other respects is very different from it. In the fullness of time, these differences led some Whitehead scholars to complain of an overly Hartshornean slant to Whitehead studies, thus bearing testimony to Hartshorne’s dominance. Hartshorne referred to the years between 1925 and 1958 as his “third period” to highlight the significant influence of Peirce and Whitehead on his thinking.

When Harvard announced that it had “no job” for Hartshorne after his third year of teaching and research, he took a position in 1928 at the University of Chicago, where he was a faculty member in the Department of Philosophy until 1955. He eventually held a joint appointment as a member of the Meadville Theological School (1943-1955). Shortly after the move to Chicago, he married Dorothy Eleanore Cooper (1904-1995), his life-long companion. The Hartshornes’ only child, Emily (Schwartz), was born in 1940. In 1936, he served as secretary (that is, chairperson) of the department of philosophy, during which time Rudolf Carnap was hired. Hartshorne was a visiting faculty member at Stanford University in 1937, and he spent the 1941-42 academic year at the New School in New York. From 1948 to 1949 he taught at Goethe University in Frankfurt and also lectured at the Sorbonne in Paris. He was president of the Western Division of the American Philosophical Association in 1949, and he was a Fulbright Lecturer at Melbourne, Australia during 1951-52. Hartshorne was also a member of the informal group of theologians called “the Chicago school,” which included Henry Nelson Wieman, Daniel Day Williams, Bernard Meland, and Bernard Loomer.

At Chicago, Hartshorne’s thinking matured, and he developed the outlines of his own system of speculative philosophy, which he called neoclassical metaphysics. The hiring of Carnap was especially ironic since he was the most famous of the logical positivists while Hartshorne was one of positivism’s greatest critics. However, Hartshorne reported that, despite his and Carnap’s profound differences in philosophical outlook, their engagement was cordial and fruitful. The German helped him to formalize his objection to the classical understanding of divine foreknowledge in his book The Divine Relativity (1948). Hartshorne published six books while at Chicago (in addition to the Peirce papers), including the wide-ranging survey of philosophical theology titled Philosophers Speak of God (1953/2000), edited with his student William L. Reese. Hartshorne’s other books during this period, apart from his first one, focused on the problems of metaphysics: Beyond Humanism: Essays in the Philosophy of Nature (1937), Man’s Vision of God and the Logic of Theism (1941), and Reality as Social Process: Studies in Metaphysics and Religion (1953).

Hartshorne attracted many graduate students from Chicago’s three federated seminaries, two of whom became well-known theologians (John B. Cobb, Jr., b. 1925, and Schubert Ogden, b. 1928). He was unhappy, however, that few graduate students in philosophy studied with him. Two of the most well-known students in Hartshorne’s classes were Richard Rorty (1932-2007) and Huston Smith (b. 1919). Each became known for defending views at odds with Hartshorne’s ideas: Rorty in philosophy and Smith in religious studies. Even as he disagreed sharply with his former teacher, Rorty made clear that he never ceased to admire Hartshorne’s intellectual passion and generosity of spirit.

Hartshorne and his family left Chicago and moved to Atlanta, Georgia in 1955, where he taught at Emory University until 1962. In 1958, he taught at the University of Washington and visited Kyoto, Japan as a Fulbright Lecturer. There he learned more about Buddhism, which he called the first process philosophy. It was also in Japan that he began a more intense focus on Anselm of Canterbury’s ontological argument for God’s existence. He would soon publish in the second chapter of The Logic of Perfection (1962), for the first time in the history of philosophy, a formalization of the argument using modal symbolism. Soon afterwards came Anselm’s Discovery (1965), which includes an overview of treatments of the ontological argument in the works of various philosophers and theologians. Hartshorne described this time in his life as the beginning of his “fourth period,” as he gained more critical distance from the philosophies of Peirce and Whitehead and began in earnest to refine his own metaphysical synthesis. Now in his sixties, he faced mandatory retirement at Emory at age 68. In 1962, John Silber, then at the University of Texas at Austin, invited Hartshorne to Texas. Hartshorne accepted the invitation and, in 1963, became Ashbel Smith Professor of Philosophy; he taught full-time until his official retirement in 1978, and part-time for a few years thereafter. During his years at Texas he taught and traveled widely, throughout the United States, including two summer sessions at Colorado College (1977 and 1979), but also to India and Japan on a third Fulbright (1966), Australia (1974), the University of Louvain, Belgium (1978), and again to Japan and Hawaii (1984).

Hartshorne’s productivity in the last three decades of his life was prodigious, beginning with four major works; these included the aforementioned book on Whitehead, the book on bird song, as well as A Natural Theology for Our Time (1967) and Creative Synthesis and Philosophic Method (1970), the latter being his most comprehensive and systematic presentation of neoclassical metaphysics. In his eighties, Hartshorne published dozens of articles, reviews, and forewords, and completed numerous books. Hartshorne gave his most complete assessment of western philosophy in Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy (1983) and in Creativity in American Philosophy (1984). Omnipotence and Other Theological Mistakes (1984) is a nontechnical introduction to his philosophical theology. The posthumously published Creative Experiencing: A Philosophy of Freedom (2011), completed during the 1980s, complements Wisdom as Moderation: A Philosophy of the Middle Way (1987) and more or less rounds out the technical metaphysical work begun in Creative Synthesis.

The last of Hartshorne’s books to appear during his lifetime, The Zero Fallacy and Other Essays in Neoclassical Philosophy (1997), published in the year of his centenary, was edited by Muhammad Valady, a philosopher he met in 1985. Valady made a thorough study of Hartshorne’s works and engaged him in conversation on a regular basis over lunch. Valady compiled the essays in The Zero Fallacy to reflect the full range of Hartshorne’s thinking, including his empirical work on sensation and on bird song (approximately half the book consists of essays not previously published). The book opens with a “brisk dialogue” between Hartshorne and Valady that conveys both the charm of a conversation with the aging philosopher and the keenness of his mind in dealing with philosophy. In his twilight years, Hartshorne also contributed to four books devoted exclusively to his thought, giving detailed replies to sixty-two essays by fifty-six scholars (see secondary sources, books edited by Cobb and Gamwell, Kane and Phillips, Sia, and Hahn). His responses fill approximately one fourth of the pages in these volumes. With good reason he expressed concern that philosophers might find it difficult to stay abreast of his writing.

Hartshorne died on Yom Kippur, October 9, 2000 (incorrectly reported as October 10th by The New York Times). He was preceded in death by his wife, who passed away at the age of ninety-one on November 21, 1995.

2. The Affective Continuum and the Psychology of Sensation

Hartshorne began thinking seriously about sensation after an experience he had while serving as an orderly in France during the First World War. As he stood on a cliff looking over a scene of great natural beauty, George Santayana’s phrase “beauty is objectified pleasure” came to him. Hartshorne rejected that slogan on the basis of what he was experiencing. It seemed to him that the pleasure was not experienced in himself as a subject and only then projected onto nature; rather the pleasure was itself given as in the object. He concluded that experience, all experience, is saturated with affect, given in emotional terms. In the essay “Some Causes of My Intellectual Growth” he says, “Nature comes to us as constituted by feelings, not as constituted by mere lifeless, insentient matter.” The point is not that we never attribute more to an object than what the object contains; it is, rather, that objects are never given to us in experience as completely lacking affective tone. Hartshorne never strayed from the conviction that matter devoid of feeling is an abstraction from experience and not a datum of experience.

Hartshorne’s first published book was The Philosophy and Psychology of Sensation (1934), the result of his intense philosophical interest in aesthetic motifs proffered by Peirce and Whitehead and his longstanding interest in empirical psychological inquiries into sensation begun under Troland at Harvard. This interest in empirical inquiries continued with study of some European experimental psychologists such as Julius Pikler, whose name is sometimes paired with Hartshorne’s in the literature on sensation, as in “the Hartshorne-Pikler Hypothesis” discussed by Lawrence E. Marks in The Unity of the Senses (New York: Academic, 1978). Hartshorne argues for a theory that, in his view, integrates themes of evolutionary biology with experimental and phenomenological data on intersensory analogies, with aesthetic and religious values, and with an overall enhancement of intelligibility or the “unity of knowledge.” The work was written when interest in sensation had dwindled under the influence of American behaviorist theory, when the odd indifference of William James to considerations of sensation was still lingering, and when psychologists were little interested in grand theoretical integrations, including integrations with evolutionary theory. The work, arguably ahead of its time, can be much better appreciated now than when it was first published.

Hartshorne’s theory is organized around the defense of five theses, to be discussed in turn below: (1) the sensory modalities exhibit quantitative continuity, exhibiting no absolute difference of kind; (2) sensory qualia are essentially affective (a theme echoed in the early Heidegger with whom Hartshorne studied); (3) all experience is analyzable as essentially social in the Whiteheadian sense of “feeling of feeling”; (4) sensation is essentially “adaptive” in the evolutionary biological sense; and, (5) sensory qualia have a common origin in evolutionary history. The whole doctrine might be conveniently labeled as the “affective continuum hypothesis.” The third item is central to the thesis of panexperientialism, which Hartshorne defended throughout his career. In view of its importance to his metaphysics, it will require separate discussion. Here the focus will be on a brief exposition of the other mentioned theses.

First, Hartshorne rejects the “classical” doctrine of Hermann von Helmholtz that the various sensory modalities (visual, olfactory, tactile, gustatory, and auditory experiences) are tightly compartmentalized, allowing no degrees of lesser or greater similarity, and no transition from one modality to another. According to the classical doctrine, while degrees of qualitative similarity or analogy might be permissible within a given sensory modality (for example, dark magenta and royal purple are qualitatively “closer” to one another than are, say, candy red and canary yellow), no inter-modal sensory analogies are permissible such that we could intelligibly say that, for instance, certain odors are more or less similar to certain colors. Moreover, the classical theory of sensation held that sensations are not inherently emotional or affective in character; any affective properties found to be associated with sensations are culturally conditioned “additions” to the sensations; in effect, sensations are essentially pure “registrations” of cognitive data. For classical theory, emotions and sensations are entirely separate functions of consciousness. To the contrary, Hartshorne argues that the classical theory does not fit the phenomenological and empirical evidence, is out of touch with the intersensory analogies provided in all manner of ordinary language metaphors, and does not cohere with the concept of an evolutionary history of sensory systems.

While experimentation on intersensory phenomena is a complex affair and interpretation of some results is disputable, it is fair to say that a body of evidence has emerged which bodes well for the thesis of intersensory connection. Indeed, it is now a commonplace of contemporary psychology texts to discuss evidence for intersensory analogies, for instance, the establishment of connections between visual and auditory neural systems as well as evidence of visual-auditory correlations in the cognitive development of infants. It is also particularly telling that neuroscientists have developed sensory substitution systems that can allow the blind to construct images, objects, and words from tactile stimulation. Moreover, Hartshorne points to abundant metaphors of common parlance which make intersensory connections: some colors are said to be “warm” or “loud,” some sounds are said to be “sweet” or “sour,” some affective states or moods are said to be “blue” or “dark,” or some smells are said to be “delicious” or “distasteful.” The practice of employing intersensory metaphors occurs widely across cultures and is broadly communicative or publicly accessible, pointing (at the very least) to the possibility of intersensory continuity and to an underlying objective affect-quality in sensation, thus grounding the communicability of the intersensory metaphors. If the sensory modes are as rigidly separated and analogical connection is as unintelligible as classical theory maintains, it is difficult to explain why language is so saturated with intersensory metaphors. Hartshorne does not deny that there are strong qualitative differences between the qualia of various sensory modes (indeed his theory posits qualitative difference in terms of a geometric notion of “distance on a continuum”), nor does he deny that cultural conditioning can play an important role in constructing affective associations with sensations. Rather, his theory rejects the rigid discontinuity of the sensory modes and the separation of sensation from affectivity.

While Hartshorne is cognizant of cultural conditioning of sensory experience, he argues that such conditioning can be shown to presuppose an underlying affect in the “conditioned” sensation. Consider a locus classicus case of culturally constructed associations of affectivity in classical theory: the preference for white dress in traditional Chinese funerals as opposed to black or dark dress in traditional Spanish or Italian funerals is said to show that there are fundamentally different emotional qualities attached to white in Chinese as opposed to European cultures. Hartshorne argues that this misconstrues the situation. The cultural difference is found in different attitudes toward death and funeral rites, not in different feelings concerning the colors white or black (the Chinese think of funerals as positive celebrations of past life). Hartshorne also applies this reasoning to variations in individual sensory-qualitative preferences. In Creative Synthesis and Philosophic Method, he remarks on how the fact that some persons prefer a certain bitter quality of strong dark chocolate does not show that such individuals “fail to sense the contrast, sweet-bitter, as essentially positive-negative.” It means rather that they do not want mere sweetness or pleasantness; they want a more complex sensory experience. Hartshorne’s point is that an adequate phenomenology of sensation must include the appropriate “layered” complexity of sensory experience and thus accommodate the fact that we have meta-feelings (“feelings about our sensory feelings”) in addition to “object-feelings” (feelings about things that are not feelings, like chocolate). It is the duality of this, so to speak, “meta-feeling/object-feeling” situation which is the source of the distinction (which Hartshorne calls a “pseudo-duality”) of affect and mere sensation posited by the classical theory of sensation.

In addition, it is not clear how the classical view can be squared with the evolutionary development of sensory modes. If the sensory modes are as separate as classical theory supposes, then how could new sensory modes which evolve have meaningful connections to older modes? Were the transitions from one mode to another simply de novo additions abruptly occurring all-at-once, contrary to standard neo-Darwinian assumptions of gradualism? If one sensory mode evolved from another, then how could it be impossible for the new sensory mode to have analogical connections with its modal parent? How could information from the different sensory modes be coordinated during early moments of evolutionary transition if there is no meaningful analogical connection between them? Would not an organism that possessed the capacity to integrate information from different sensory modes be better adapted to its environment? Hartshorne’s theory, on the other hand, supposes that sensory modes are intrinsically connected by their common evolutionary origins (with tactile capacities as the earliest), that sensation is a form of affectivity that serves the purpose of enhancing the prospects of an organism’s survival, and that this underlying physiological connection of sensation and affectivity is what is primal—it is the “object-feeling” pole of the “meta-feeling/object-feeling” duality found in our complex emotional life.

The affective properties of sensation are most immediately evident in the case of pain; indeed, intense sensations of pain are ineluctably described in strongly affective terms such as “horrific” or “torturous” or “excruciating.” While there may be cases in which, paradoxically, pain is experienced as pleasure, such cases by definition posit a hedonic property to the experience inimical to the notion of a thoroughly “disinterested” pain. The affective aversion that is part and parcel of the experience of pain also clearly coheres with the biological or adaptive value of affectivity that Hartshorne’s theory asserts. Organisms that are not warned of injury by virtue of pain, and that do not seek to avoid such injury by virtue of visceral, emotional aversion to pain, are to that extent vulnerable to their environments. Other tactile qualia such as sensual touch are obviously inseparable from hedonic content. Gustatory qualia are also affective, as enjoyment of delicious foods and strong aversion to extremely sour or spoiled foods attest. Newborn infants react with aversion to sour, bitter, or fetid substances, and so it is difficult to “argue away” gustatory affect as culturally conditioned. Here again, there are obvious biological or adaptive advantages for organisms capable of being affectively reinforced by and motivated to seek nutritious foods and avoid fetid substances or spoilage. Sounds, especially in the form of music, are readily seen to evoke emotions in immediate ways. Minor chords, for instance, have an immediate “sad” or “melancholy” tonality which explains their use in ballads evocative of such moods.

Hartshorne understood that the more difficult case for his theory is visual phenomena. For this reason, he discusses at length the affective nature of visual experience with a particular emphasis on color sensation. Careful attention to our experience of color reveals that strong primary colors exhibit affective qualities, as in the paradigm cases that “gaiety” is part and parcel of yellow and “warmth” of red. While Hartshorne admits that there seem to be dull color sensations to which we may seem affectively indifferent, that such sensations possess some slight degree of affect could be shown by imagining blindness with respect to such colors; in addition, such colors have a valuable contextual role to play in providing certain nuances of contrast. In his treatment of Hartshorne’s theory in the Library of Living Philosophers (LLP), psychologist Wayne Viney notes that some previously blind persons who are successfully re-sighted attach much significance even to the visually trivial. Importantly, Hartshorne argues that without such an affective account of color, it is extremely difficult to give a coherent account of the visual arts. If affective qualia are always merely accidentally “associated” with color by virtue of idiosyncrasies of personal experience, how could artists communicate or express intelligibly? For instance, the dulled grayish-brownish tones of an Edward Hopper painting convey the depressive atmosphere of life during the Great Depression far better than would the alternative use of bright yellows or Kelly greens or Titian reds. Indeed, certain projects of modern art, such as those found in the work of Kandinsky, depend on the notion that color expression can in and by itself evoke emotion without mediation through well-defined objects, whether in surreal juxtaposition or otherwise.

Adaptive values for color sensation are not difficult to conceive. The greater discriminatory information provided by color sensation at least enhances, say, human abilities to demarcate and map out their immediate environments. Moreover, at least one affective property of color can be correlated with experimental neuro-physical evidence; the inherent “aggressiveness” of red correlates with the empirically discerned increase in cortical stimulation when compared to exposure to blue. While this may be explained by cultural conditioning (for example, our learned response to red stop signs), such an explanation may also beg the question as to why red is so often selected as a color of warning. On Hartshorne’s theory, the selection of red occurs precisely because it has the stimulating or aggressive affect it does. In general, Hartshorne sides with Julius Pikler in connecting all affectivity of sensation at its most fundamental level with excitations to act or with behavioral avoidances, and these in turn have an evolutionary “cash value” or utility. Nonetheless, empirical study of the affectivity of color sensation is by no means settled, and results are unclear, in part because it is difficult to separate out learned from universal emotional responses to color. Hartshorne’s theory, however, points in the direction of an overall evolutionary account of sensation. Even if Hartshorne has some of the details mishandled, the general thesis of color affect brings color vision in line with other sense modalities and best explains why it was strongly “naturally selected.”

3. Conclusion: Hartshorne’s Work on Sensation and the Rest of his Philosophy

Hartshorne’s first book could be seen, in one respect, as a systematic attack against the form of materialism that finds inspiration in the theory of sense data. From the times of John Locke and David Hume, some empirically minded philosophers and psychologists analyzed experience in terms of “sensory impressions.” Emotions were conceived as annexed onto bare impressions; Hartshorne characterizes this as “the annex view of value.” As already noted, this view of emotion is at odds with evolutionary thinking since a sensation-minus-affect would be lacking in adaptive value. Equally, it is not clearly a deliverance of experience. The analysis of experience into sensory impressions is, Hartshorne held, bad phenomenology; it is an intellectualized reconstruction of experience. The mistake was, in part, due to the excessive attention paid to visual experience, which as we have noted, is where affect is least apparent. Visual experience exhibits less felt relevance of the body than one finds in the other sensory modalities. This may account for the prevalence of visual metaphors for a supposedly immaterial process of intellection. It is easier to forget that one sees with the eyes than it is to forget, for example, that one touches with the skin.

In light of Hartshorne’s conviction concerning the data of experience, it is not difficult to understand why he resonated to the expression “feeling of feeling,” an idea (if not the exact wording) that he found in Chapter X (section II) of Whitehead’s Process and Reality. The clearest instance of a feeling of feeling, for both Whitehead and Hartshorne, is memory, for it is at a minimum the record of a past experience in a present experience. The example of memory also supports Hartshorne’s contention that, while every sensation is a feeling, not every feeling is a sensation. Hartshorne would later refer to the difference between introspection and perception as the difference between personal and impersonal memory.

When Hartshorne came to the business of ontology, he could find nothing more consonant with his psychology of sensation, nothing more in keeping with evolutionary thinking, and nothing more coherent philosophically than panexperientialism, the view that the basic constituents of reality are momentary flashes of experience. Whitehead called these “actual entities” or “actual occasions”; Hartshorne sometimes called them dynamic singulars. Panexperientialism implies that there must be non-human and non-conscious forms of experience. Leibniz had argued this case before evolutionary theory, but evolution made the case even more convincing. Humans are different from the creatures from which they evolved by matters of degree. Mind-like qualities, Hartshorne argued, are susceptible to an infinitely flexible number of forms. Hartshorne and Whitehead held that every concrete particular is an experient occasion; they did not, however, believe that every whole made of such occasions can be said, as a whole, to feel the world. Whitehead spoke of a tree as a democracy, the cells making up its members—there can be cellular feelings even if the tree as a whole does not feel. Hartshorne used the analogy of a flock of birds: there are feelings in each bird, but the flock itself does not feel.

If Hartshorne followed Whitehead on the ontology of actual occasions, he parted ways with him on how best to construe the nature of possibility. Whitehead took possibility to be grounded in an array of eternal objects, including particular sensory qualities, constituting an ideal world. As is evident in his first book, Hartshorne preferred to think of sensory qualities as existing along an affective continuum. Whitehead, it seems, was not dogmatic in rejecting this view. Hartshorne reports that he presented Whitehead with the following reasoning: if points are constructed from the extensive continuum and not vice versa, as Whitehead held, perhaps, by parity of reasoning, particular sensory qualities are extracted from an affective continuum and not vice versa. According to Hartshorne, Whitehead called the argument “subtle” requiring “further reflection.” It is also worth remarking that Hartshorne’s view is more radically processive than Whitehead’s since it implies that sensory qualities are emergent as the affective continuum is sliced in various ways through the evolutionary process within and between species.

Hartshorne’s theory of the affective continuum is very much in keeping with his aesthetics and with his theory of a monotony threshold in song birds. Hartshorne’s aesthetics locates beauty—which could also be called intense satisfactory experience—as a mean between two extremes: absolute order vs. absolute disorder and ultra complexity vs. ultra triviality. Aesthetic experience, like all sensory experience, must have, on Hartshorne’s account, both a subjective and an objective side. In a word, Hartshorne denies that the quality of beauty is “merely in the eye of the beholder,” or to generalize, “merely in the perception of the perceiver.” Hartshorne’s study of bird song convinced him that oscines have a primitive aesthetic sense. He found evidence that birds with more varied repertoires have shorter pauses between their songs than do birds with less varied repertoires. In a word, simpler repertoires invoke more boredom whereas varied repertoires are more interesting—hence, a “monotony threshold.” Hartshorne meant his theory to supplement, not to replace, standard accounts of bird song as the marking of territory. His view of the aesthetics of bird song coheres nicely with his evolutionary view of sensation and affective tone.

Hartshorne’s emphases on the primacy of feeling in perception and of aesthetic experience are also evident in his form of theism. God, he held, has the eminent form of “feeling of the feelings” of others. In the first instance, this means that God’s knowledge is suffused with affect and is not simply an intellectual awareness of the world, for example, a knowing of the truth value of propositions. According to Hartshorne, divine cognition is a form of what William James called “knowledge of acquaintance” rather than simply a “knowledge-about.” This idea yields a view of omniscience that is decidedly more intimate than one that is couched in terms of the metaphor of an “all-seeing” deity. Since, for Hartshorne, the relation of “feeling of feelings” has a temporal structure, every instance of awareness in the present must be nothing other than an awareness of the past. It stands to reason that, if God is the eminent embodiment of “feeling of feelings,” God must also have the eminent form of memory. This is indeed Hartshorne’s view, which he calls “contributionism.” Every experience of a non-divine being is felt and retained in perfect memory by God, thereby contributing to the richness of the divine immortal life. In Hartshorne’s words, God’s possession of us, not our possession of God, is our final achievement.

4. References and Further Reading

a. Primary Sources

i. Life

Hartshorne, Charles. 1970. “The Development of My Philosophy.” Contemporary American Philosophy: Second Series, ed. John E. Smith. London: Allen & Unwin, 1970: 211-28.
Hartshorne, Charles. 1970. “Charles Hartshorne’s Recollections of Editing the Peirce Papers.” Transactions of the Charles S. Peirce Society 6, 3-4: 149-59.
Hartshorne, Charles. 1973. “Pensées sur ma vie”: 26-32; “Thoughts on my Life”: 60-66. Bilingual Journal, Lecomte du Noüy Association, 5 (Fall).
Hartshorne, Charles. 1984. “How I Got that Way.” Existence and Actuality: Conversations with Charles Hartshorne. John B. Cobb, Jr. and Franklin I. Gamwell, eds. Chicago: University of Chicago Press: ix-xvii.
Hartshorne, Charles. 1990. The Darkness and the Light: A Philosopher Reflects Upon His Fortunate Career and Those Who Made it Possible. Albany: State University of New York Press.
Hartshorne, Charles. 1991. “Some Causes of My Intellectual Growth.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 3-45.

ii. Psychology of Sensation

Hartshorne, Charles. 1927. Review of A.N. Whitehead. Symbolism, Its Meaning and Effect (New York: Macmillan, 1927). Hound and Horn 1: 148-52.
Hartshorne, Charles. 1931. “Sense Quality and Feeling Tone.” Proceedings of the Seventh International Congress of Philosophy. Gilbert Ryle, ed. London: Oxford UP: 168-72.
Hartshorne, Charles. 1934. The Philosophy and Psychology of Sensation. University of Chicago Press. Republished in 1968 by Kennikat Press.
Hartshorne, Charles. 1934. “The Intelligibility of Sensations.” The Monist 44, 2: 161-85.
Hartshorne, Charles. 1961. “Professor Hall on Perception.” Philosophy and Phenomenological Research 21, 4: 563-71.
Hartshorne, Charles. 1963. “Sensation in Psychology and Philosophy.” Southern Journal of Philosophy 1, 2: 3-14.
Hartshorne, Charles. 1965. “The Social Theory of Feelings.” Southern Journal of Philosophy 3, 2: 87-93. Reprinted in Persons, Privacy, and Feeling: Essays in the Philosophy of Mind, ed. Dwight Van de Vate, Jr. Memphis: Memphis State UP, 1970: 39-51.
Hartshorne, Charles. 1967. “Psychology and the Unity of Knowledge.” Southern Journal of Philosophy 5, 2: 81-90.
Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1973. Born to Sing: An Interpretation and World Survey of Bird Song. Bloomington: Indiana University Press.
Hartshorne, Charles. 1984. “Response to George Wolf.” Existence and Actuality: Conversations with Charles Hartshorne. John B. Cobb, Jr. and Franklin I. Gamwell, eds. Chicago: University of Chicago Press: 184-188.
Hartshorne, Charles. 2001. Notes on A. N. Whitehead’s Harvard Lectures 1925-26, transcribed by Roland Faber. Process Studies 30/2: 301-373.

b. Secondary Sources

i. Life

Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press: 1-14.
Viney, Donald Wayne. 2003. “Charles Hartshorne.” American Philosophers Before 1950. In Dictionary of Literary Biography, volume 270, edited by Philip B. Dematteis and Leemon B. McHenry. Detroit: Thomson Gale, 2003: 129-51.
Viney, Donald Wayne. 2004. “Charles Hartshorne.” Dictionary of Unitarian Universalist Biography, 1999-2004. On-line at: http://www.uua.org/uuhs/duub/articles/charleshartshorne.html
Viney, Donald Wayne. 2005. “Hartshorne, Charles (1897-2000)” The Dictionary of Modern American Philosophers, edited by John R. Shook (London: Thoemmes Press): 1056-62.
Viney, Donald Wayne. 2008. “Charles Hartshorne (1897-2000),” Handbook of Whiteheadian Process Thought, Volume 2, edited by Michel Weber and Will Desmond. (Frankfurt / Paris / Lancaster: Ontos Verlag): 589-596.

ii. Psychology of Sensation

Anon. 1985. Report on Hartshorne’s “My Enthusiastic but Partial Agreement with Whitehead,” presented at the eleventh Congreso Interamericano de Filosofía, Guadalajara, Mexico, Nov. 15, 1985, Center for Process Studies Newsletter, 9, 4, 7.
Dombrowski, Daniel. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
Hospers, John. 1991. “Hartshorne’s Aesthetics.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 113-134.
Viney, Wayne. 1991. “Charles Hartshorne’s Philosophy and Psychology of Sensation.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 91-112.

c. Bibliography

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Charles Hartshorne: Theistic and Anti-Theistic Arguments

Charles Hartshorne is well known in philosophical circles for his rehabilitation of Anselm’s ontological argument. Indeed, he may have written more on that subject than any other philosopher. He considered it to be the argument that, more than any other, reveals the logical status of theism. Nevertheless, he always clearly and explicitly denied that the argument was his reason for being a theist. There are two reasons for this. First, he believed that, without a revision in the very concept of deity, Anselm’s argument could readily be turned upside down, so to speak, so as to constitute not a proof of theism but its disproof. Consequently, Hartshorne believed that a full defense of theism requires developing a coherent concept of God. (See “Charles Hartshorne: Dipolar Theism.”) Second, Hartshorne’s revised ontological argument does not stand alone. It is one strand in a fabric of reasoning which he sometimes called “the global argument.” He followed C. S. Peirce’s recommendation that philosophy should rely on a variety of interrelated pieces of evidence rather than trust to the conclusiveness of a single argument. Peirce (5.265) used the analogy of a cable, the strength of which is in the combination of its numerous fibers. Peirce specifically mentioned that this way of arguing is typical of science, but it is also evident in other areas such as law, history, and literary criticism. Nowadays, philosophers use Basil Mitchell’s terminology and call the multiple argument strategy a “cumulative case.” Hartshorne’s most systematic presentation of the global argument is in the fourteenth chapter of Creative Synthesis and Philosophic Method, titled “Six Theistic Proofs.” Not long after this essay appeared, he stopped calling the arguments proofs, for he recognized that it is often the case that equally rational and informed philosophers disagree on fundamental issues. For this reason, he presented the global argument in a way that emphasizes both the rational basis of neoclassical theism and the rational cost of rejecting it. In addition to discussing Hartshorne’s case for theism, this article also addresses Hartshorne’s reflections on the problem of evil.

Anselm’s Discovery and the Ontological Argument
The Global Argument
The Problem of Evil and Theodicy
Conclusion
References and Further Reading

1. Anselm’s Discovery and the Ontological Argument

It used to be customary to speak in the singular of “Anselm’s ontological argument.” Hartshorne was the first to argue that this is mistaken. Setting aside the question of Anselm’s intentions, Hartshorne found that two arguments are suggested in Anselm’s Proslogion, one in chapter II, another in chapter III. Hartshorne made this point in 1944 in an article published in The Philosophical Review and again in 1953 in Philosophers Speak of God. The philosophical world did not take notice until 1960 when Norman Malcolm’s article, “Anselm’s Ontological Arguments,” made the distinction between the two arguments famous. Hartshorne, like Malcolm, agreed with Anselm’s critics that the first argument (in chapter II) is fallacious, but the second argument (in chapter III), which has a modal structure, he considered valid. The difficulty in showing that the argument is sound kept Hartshorne from thinking of it as demonstrating God’s existence. In The Logic of Perfection, Hartshorne presented a formalized version of the argument using C. I. Lewis’s system S5, the first such formalization to be published. In Anselm’s Discovery, he again defended a version of the argument and canvassed the various treatments of Anselm’s reasoning in the history of philosophy, including an anticipation of the argument in Plato noted by the scholar Prescott Johnson.

In the introduction to George L. Goodwin’s The Ontological Argument of Charles Hartshorne, and again in Creative Experiencing, Hartshorne reduced the modal ontological argument to what he considered to be its essentials. The argument’s logical symbols are the tilde (~) for negation, the arrow (→) for strict implication, M for “is logically possible” (thus, “~M~” means “is logically necessary”), and p* stands for “God exists,” where God is defined as “a being unsurpassable by any other conceivable being.” (In Hartshorne’s dipolar theism, the divine can, in some senses, surpass itself but it is unsurpassable by any other being). The argument is presented as follows:

Mp*
Mp* → ~M~p*
Therefore, ~M~p*

If necessity (~M~) is what is common to all possibilities—a common definition—and if any state of affairs that is actual is also possible—a standard modal principle—then the conclusion to be drawn is that God exists (p*). Hartshorne was under no illusions that this mode of reasoning would convince the skeptic that God exists. Nor did he use it as his reason for believing in God. Nevertheless, the argument is not, in the hyperbole of Graham Oppy (199), “completely worthless.” In A Natural Theology for Our Time, Hartshorne credited George Mavrodes with the insight that it is unreasonable to suppose that no doubts about theism can be removed because an argument cannot remove all doubts about theism. Moreover, the simple deductive structure of the argument clarifies what is at stake in the theistic question. If one denies the conclusion, one must deny one or more of the premises or what their denials entail. Hartshorne follows Gottfried Wilhelm Leibniz in urging that, in questions of metaphysics, philosophers are more apt to err in what they deny than in what they affirm. Highlighting the rational cost of rejecting theism can, for this reason, be a fruitful method in metaphysics.

If one rejects the conclusion of Hartshorne’s modal argument, one of two alternatives is possible. First, it may be that God’s existence is impossible (~Mp*), which is the denial of the first premise. This is the view that J. N. Findlay originally took in his famous 1948 article, “Can God’s Existence Be Disproved?” In effect, Findlay’s argument turns Hartshorne’s modal modus ponens upside down to make a modal modus tollens disproof: If M~p* and Mp* → ~M~p*, it follows that ~Mp*. Hartshorne referred to this as the a priori atheist or positivist position. The second alternative is that a logical consequence of the second premise is false. The strict implication of the second premise allows one to infer that if God’s existence is logically possible then it is logically necessary. If this is false, then God’s existence and non-existence are equally possible: Mp* and M~p*. This was the view of David Hume, for whom every proposition asserting or denying existence, including “God exists,” is logically contingent. Hartshorne calls this the empiricist position, or sometimes empirical theism or empirical atheism depending on whether or not the empiricist thinks that God exists.
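The three positions just described (Hartshorne’s, Findlay’s, and Hume’s) can be laid out as a small truth-checking exercise. The following is only an illustrative sketch, not Hartshorne’s own formalization: it assumes a simplified S5-style setting in which every world is accessible from every other, so that a “model” is fully described by the truth values p* takes across its worlds. All function and variable names are hypothetical.

```python
# Illustrative sketch only: every name here is hypothetical, not Hartshorne's.
# A "model" is a non-empty tuple of worlds; each world is just the truth value
# of p* ("God exists") in that world. Every world sees every other (S5-style).

MODELS = {
    "necessary": (True,),          # p* holds in every world
    "impossible": (False,),        # p* holds in no world
    "contingent": (True, False),   # p* holds in some worlds but not others
}

def possible_p(model):
    """Mp*: p* holds in at least one world."""
    return any(model)

def possible_not_p(model):
    """M~p*: p* fails in at least one world."""
    return any(not w for w in model)

def necessary_p(model):
    """~M~p*: there is no world in which p* fails."""
    return not possible_not_p(model)

def anselms_principle(model):
    """Mp* -> ~M~p*. Modal facts do not vary from world to world in these
    models, so checking the conditional once per model is enough."""
    return (not possible_p(model)) or necessary_p(model)

def surviving_models(first_premise):
    """Names of the models satisfying first_premise and Anselm's principle."""
    return [name for name, m in MODELS.items()
            if first_premise(m) and anselms_principle(m)]

# Hartshorne: Mp* plus Anselm's principle (modal modus ponens).
hartshorne = surviving_models(possible_p)
# Findlay: M~p* plus Anselm's principle (modal modus tollens).
findlay = surviving_models(possible_not_p)
# Hume: Mp* and M~p* together, which requires giving up Anselm's principle.
hume = [name for name, m in MODELS.items()
        if possible_p(m) and possible_not_p(m)]

print(hartshorne, findlay, hume)
```

On these assumptions, accepting the second premise leaves only the “necessary” and “impossible” models standing, and the choice of first premise decides between them; the “contingent” model is available only to someone who rejects the second premise.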

Hartshorne considered the empiricist position regarding the ontological argument as the least tenable. The second premise says, colloquially, if God is so much as logically possible, then it must be the case that God exists. Hartshorne calls this “Anselm’s principle,” or more forcefully, “Anselm’s discovery.” The discovery is that God, as unsurpassable, cannot exist with the possibility of not existing. Put differently, contingency of existence is incompatible with deity. Anselm’s formula that God is “that than which nothing greater can be conceived” means, among other things, that any abstract characteristic for which something greater can be conceived cannot properly be attributed to deity. For example, if there is something greater than being partially ignorant, then God cannot be conceived as partially ignorant. Or again, if there is something greater than interacting with some but not all others, then God cannot be conceived as a merely localized being. Applied to modality of existence, Anselm’s principle means that a deity that can fail to exist is not the greatest conceivable. If this is correct, then it is a mistake to conceive of God as possibly existing and possibly not existing. This is another way to state the second premise. One may deduce from this premise that it is impossible that God’s existence and non-existence are both logically possible. Symbolic notation presents this as ~M(Mp* and M~p*).

Hartshorne emphasized that the empiricist’s view he considers Anselm to have refuted is shared by, among others, those who consider the existence of God as a hypothesis to be established or refuted by science. Hartshorne accepts Karl Popper’s idea that empirical statements must be falsifiable by some conceivable experience (see “Charles Hartshorne: Neoclassical Metaphysics”). Anselm’s principle entails that if God exists, there could be no disconfirming empirical evidence of God’s existence. On the other hand, if God does not exist, then by parity of reasoning, there could be no confirming empirical evidence of God’s existence. If premise two is correct, the remaining options are that God exists necessarily (~M~p*) or God’s existence is impossible (~Mp*). This removes the question of God’s existence from the domain of science. Yet, this is not the same as removing the question from rational justification, unless metaphysics is impossible, a position that Hartshorne vigorously opposed. In effect, treating the existence of God as a scientific hypothesis is a failure to conceive of God as unsurpassable by any being other than God—and is therefore a changing of the subject.

Among the many criticisms of Hartshorne’s reasoning about the ontological argument, four stand out as deserving special treatment: one from J. N. Findlay, one from John Hick, one stemming from W. V. O. Quine’s reflections on modal logic, and one from H. G. Hubbeling. Each is set forth in The Philosophy of Charles Hartshorne. Hartshorne praises Findlay for most clearly stating the objection that the concrete cannot be deduced from the abstract, and that this is what the ontological argument purports to do. Definitions are abstract, but God’s existence must be concrete; from the logically weak definition of God one may not deduce the logically stronger conclusion that God exists. Put somewhat differently, if the deduction succeeds, then God’s existence must be as abstract as God’s essence. Hartshorne’s response to Findlay is to accept the principle but to appeal to the distinction between existence and “actuality,” Hartshorne’s term for existence in a particular, determinate, concrete state (see “Charles Hartshorne: Dipolar Theism,” section 2). To be sure, the ontological argument concludes to the existence of God, which is abstract, but more explicitly, it concludes to God’s existence as somehow actualized. No actual state of God (which is the concreteness of God) can be deduced by a metaphysical argument. The structure of this reasoning is analogous to Hartshorne’s argument that non-being is impossible (see “Charles Hartshorne: Neoclassical Metaphysics”). The statement “Something exists” may be necessarily true, as Hartshorne urges, although it gives no information as to what actually exists. It only says that the set of existing things is not empty. By parity of reasoning, the conclusion of Hartshorne’s modal argument can be rephrased to say that the set of actual divine states is never empty. With good reason, Hartshorne insisted that he knew very little about God. At most, his metaphysics yields only the most abstract truths about deity, although he stressed that it is a notable achievement to advance the subject of metaphysics when so few attend to its reasoning.

Hick pressed the objection that Hartshorne’s ontological argument confuses two kinds of necessity: one pertaining to propositions (logical necessity), the other pertaining to a being (ontological necessity). According to Hick, to say that God exists of necessity is to say no more than that God has the property of “aseity.” That is, God’s existence, unlike all creaturely existence, depends upon nothing outside of itself. This does not mean, Hick claims, that “God exists” is a necessary truth. To speak of God’s existence as logically necessary is, in Hick’s view, a category mistake: applying to a being a predicate that is properly a predicate of sentences. Hartshorne agrees with Hick that, excluding the case of God, all propositions asserting the existence or non-existence of an individual are logically contingent. However, in all of these cases, there is a causal explanation for the possibility of the individual’s existence which neatly explains why the proposition asserting or denying existence is not necessarily true. For example, the non-existence of x’s monozygotic twin is explained by the fact that the fertilized egg from which x came did not split; x’s existence also has a causal explanation in the union of a particular sperm and egg. Hartshorne notes that there is no analogous explanation, on Hick’s empiricist account, for why “God exists” is logically contingent. Yet, Hartshorne has a ready explanation for why the proposition is not logically contingent, an explanation moreover that Hick uses in explaining the meaning of divine necessity: neither God’s existence nor non-existence could have a causal explanation. In both The Logic of Perfection and Creative Experiencing, Hartshorne discusses other characteristics of logically contingent propositions that “God exists” lacks. For example, God’s existence includes all positive forms of existence whereas the existence of any creature within the universe excludes certain positive states of affairs.
Hartshorne says that God’s existence is not competitive. Hartshorne’s conclusion is that, on Hick’s account, “God exists” breaks the usual semantic criteria for a proposition to count as logically contingent.

Hartshorne’s response to Hick is that the meanings of modal terms must be anchored in the causal-temporal matrix. If this is true, then only particular noun-adjective combinations are logically conceivable. Numerous parodies of the modal argument—beginning with Gaunilo’s “perfect island”—consist in joining the concept of necessary existence to real or imagined localized beings. On Hartshorne’s account these ideas are improperly conceived, for they cannot withstand the application of semantic criteria that distinguish contingent and necessary truths. Attaching necessary existence to a being that is properly conceived as contingent is the reverse of the error of attaching contingent existence to a being that is properly conceived as necessary. Hartshorne counts both extremes as errors. It is no accident that it was J. S. Mill, an empiricist, who made famous the question, “Who made God?” If “God” signifies a being unsurpassable by all others, then asking for the cause of God’s existence is on a par with asking what is north of the North Pole. Both questions are grammatical, but both are also nonsensical. Of course, on Hick’s account of divine aseity, it is also a mistake to ask for the cause of God’s existence. However, Hartshorne’s theory of the semantic grounding of modal terms in temporal process provides one reason why it is a mistake.

Another important objection to Hartshorne’s modal ontological argument, especially as presented in The Logic of Perfection, arises from Quine’s attack on the intelligibility of de re modality. While Hick criticized Hartshorne’s modal argument for moving illicitly from de dicto (linguistic) to de re (ontological) conceptions of modality, Quine’s strategy is to reject the very intelligibility of de re modality. If successful, such a critique would surely devastate the modal version of the argument since, for Hartshorne, “logical modality mirrors objective modality.”

Quine’s challenge to the intelligibility of de re modality has been taken up in great detail by Goodwin in his book The Ontological Argument of Charles Hartshorne. In his foreword to the work, Hartshorne endorses Goodwin’s approach. The arguments can be summarized as follows. Quine objected to the idea of de re modality, since it involves quantification across modal operators. For example, the formulation “(∃x) (necessarily, x is greater than seven)” is logically illicit, claims Quine, because the modal operator “necessarily” is inserted within a quantifier-bound variable-predicate expression. Quine points out that we cannot generalize existentially from the legitimate de dicto formulation:

(a) Necessarily, nine is greater than seven.

to the illicit

(b) (∃x) (necessarily, x is greater than seven).

This is because “nine” in (a) is referentially opaque; it fails to denote in a singular way, and thus opens the door to counter-examples in the generalized sentence (b). For instance, Quine says that “nine” can name “the number of planets,” but it is not a property of “the number of planets” that it is necessarily greater than seven. Given his theory of contingent states of affairs, Hartshorne would not object to the notion that “the number of planets,” presumably in our solar system, is indeed a contingency. The thrust of this is that, because of referential opacity in quantified modal logic, we do not know what it means to introduce propositions of the existentially generalized form (b). However, Goodwin notes that Hartshorne is indeed committed in his modal version of the argument to such forms as:

(∃x) (necessarily, x is perfect).

Consequently, an effective Hartshornean response to Quine’s critique requires an intelligible semantics for modal logic.

Goodwin argues that Saul Kripke supplies such a semantics in the essay, “Semantical Considerations on Modal Logic.” According to Kripke, we can give an intelligible account of sentences involving quantification into modal contexts. A sentence having the form of (b) can be interpreted to say: “there is an object, x, in this world which has the property ‘greater than seven,’ and x has this property in every possible world in which x exists.” In other words, x exists in this world and in at least some possible worlds accessible from this world, and x falls under the extension of the predicate “greater than seven” in every world in which it exists. However, this only takes one so far in the provision of an (arguably) intelligible formal semantics for sentences involving quantification into modal contexts.
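The possible-worlds reading just described can be made concrete with a very small model. The following Python sketch is only an illustration under stated assumptions: the world names, the domains, and the values assigned to “the number of planets” are all hypothetical, every world is treated as accessible from every other, and nothing here is taken from Kripke’s or Goodwin’s own formalism. It shows how an object (the number nine) can satisfy “greater than seven” in every world where it exists, while a description such as “the number of planets” picks out different objects in different worlds.

```python
# Toy possible-worlds model with varying domains.
# ASSUMPTION: all names and values below are hypothetical illustrations.
WORLDS = ["actual", "w1", "w2"]

# Objects (here, numbers) that exist in each world.
DOMAIN = {
    "actual": {7, 8, 9},
    "w1": {7, 9},
    "w2": {5, 7, 9},
}

# What the description "the number of planets" denotes in each world.
NUMBER_OF_PLANETS = {"actual": 9, "w1": 7, "w2": 5}


def greater_than_seven(obj, world):
    """Extension of the predicate at a world (constant across worlds here)."""
    return obj > 7


def necessarily_of(obj, predicate):
    """De re reading: obj has the property in every world in which obj exists."""
    return all(predicate(obj, w) for w in WORLDS if obj in DOMAIN[w])


def exists_necessarily(predicate, world="actual"):
    """(Ex)(necessarily, Fx): some object of `world` has F wherever it exists."""
    return any(necessarily_of(obj, predicate) for obj in DOMAIN[world])


def description_always_satisfies(description, predicate):
    """Does whatever the description denotes in each world have F in that world?"""
    return all(predicate(description[w], w) for w in WORLDS)


if __name__ == "__main__":
    print(necessarily_of(9, greater_than_seven))
    print(exists_necessarily(greater_than_seven))
    print(description_always_satisfies(NUMBER_OF_PLANETS, greater_than_seven))
```

On this toy model the quantified sentence comes out true because it is evaluated of the object nine itself, not of any particular way of describing it; the description “the number of planets” fails the corresponding test because it denotes seven and five in the other two worlds.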

Quine replies that the very terms of this formal semantical solution to the problem of opacity raise the further question of what it means for an individual or object to exist in various possible worlds. This problem has come to be known as the problem of “trans-world identity.” Quine challenges any response to his critique of de re modality based on Kripke’s semantics by arguing that Kripke’s solution to referential opacity ushers in a semantics involving the difficulty of “essential properties.” For instance, let the value for x be C. S. Peirce, while the predicate attributed to x is “being a speculative philosopher.” Must Peirce be a speculative philosopher in any possible world in which he exists in order to be Peirce in such worlds? Could Peirce be a “seventeenth century sea captain” in some possible worlds and still intelligibly remain Peirce in such possible worlds?

It is precisely here, argues Goodwin, that Hartshorne’s ontology of temporal process can be employed, providing Kripke with intelligible criteria for making trans-world identifications. The problem of trans-world identity seems perplexing and insoluble when assuming, to use Quine’s phrase, “Aristotelian essentialism,” in which essential properties belong to substances that make no inherent reference to temporality. By contrast, Hartshorne’s process or event ontology positions the search for an intelligible criterion for trans-world identity in the much wider matrix of successive and causally efficacious temporal units of becoming. This is one reason why Hartshorne prefers to speak of “possible world-states” rather than “possible worlds” (see “Charles Hartshorne: Neoclassical Metaphysics”). Temporal inheritance becomes the essential factor in determining identity, and thus more readily settles the above questions: Peirce might well exist as, say, a professional painter in some possible world-state, since he might have been one in the history of this actual world; that is, since there may have been a juncture in Peirce’s development in which he was not particularly taken with questions of speculative philosophy, but was exposed to an environment of intense interest in artistic expression. Yet, surely he could not be, in any possible world-state, a seventeenth century sea captain, since this would have nothing in common with his succession of temporal events. To conclude the issue cautiously, perhaps we should say that, even if Hartshorne’s event-ontological criterion of temporal inheritance does not fully resolve the issue of trans-world identity, it seems to simplify it profoundly. More pointedly, this criterion directly answers Quine’s charge of the unintelligibility of solutions based on Aristotelian essentialism that appeal to temporally de-contextualized substances.

A technically sophisticated objection to Hartshorne’s modal argument, especially as expressed in The Logic of Perfection, comes from H. G. Hubbeling. He presents Hartshorne with a dilemma: the modal argument is valid if and only if the theory of temporal modalities is false. The problem is that Hartshorne’s argument is expressed in Lewis’s S5 system in which modal status is necessary. Symbolically (where L = ~M~), this is expressed as: “If Lp* then LLp*” and “If Mp* then LMp*.” Temporal modalities, however, are best expressed in Lewis’s weaker S4 system, which includes the first of these formulae as an axiom but in which the second is neither an axiom nor a theorem. Without “If Mp* then LMp*” Hartshorne’s argument is not valid, for then it could be the case that God’s existence is possible but not necessarily so. On the other hand, Hartshorne wants to ground the meaning of modal terms in temporal process. The most plausible semantics for S5, however, leaves modal concepts untethered to time.
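The difference between the two systems can be illustrated with a two-world model. The following Python sketch is a hedged illustration, not Hubbeling’s or Hartshorne’s own presentation: the world names “now” and “later,” the frames, and all function names are assumptions introduced here. The first frame is reflexive and transitive but not symmetric (the later moment is accessible from the earlier one, but not the reverse), which is the kind of frame usually associated with S4; the second makes every world accessible from every other, as in S5. The code checks both formulae under every assignment of truth values to p.

```python
from itertools import combinations

# ASSUMPTION: hypothetical two-world frames. Each world maps to the set of
# worlds accessible from it.
S4_FRAME = {"now": {"now", "later"}, "later": {"later"}}            # no way back
S5_FRAME = {"now": {"now", "later"}, "later": {"now", "later"}}     # symmetric


def box(frame, truth_set, world):
    """L: the formula holds at every world accessible from `world`."""
    return all(w in truth_set for w in frame[world])


def diamond(frame, truth_set, world):
    """M: the formula holds at some world accessible from `world`."""
    return any(w in truth_set for w in frame[world])


def worlds_where(frame, operator, truth_set):
    """Set of worlds at which operator(formula) holds."""
    return {w for w in frame if operator(frame, truth_set, w)}


def all_valuations(frame):
    """Every possible set of worlds at which p might be true."""
    worlds = sorted(frame)
    return [set(c) for r in range(len(worlds) + 1) for c in combinations(worlds, r)]


def valid_if_L_then_LL(frame):
    """'If Lp then LLp' holds at every world under every valuation."""
    for p in all_valuations(frame):
        lp = worlds_where(frame, box, p)
        if not lp <= worlds_where(frame, box, lp):
            return False
    return True


def valid_if_M_then_LM(frame):
    """'If Mp then LMp' holds at every world under every valuation."""
    for p in all_valuations(frame):
        mp = worlds_where(frame, diamond, p)
        if not mp <= worlds_where(frame, box, mp):
            return False
    return True


if __name__ == "__main__":
    for name, frame in [("S4-style", S4_FRAME), ("S5-style", S5_FRAME)]:
        print(name, valid_if_L_then_LL(frame), valid_if_M_then_LM(frame))
```

In the first frame, if p is true only at “now,” then p is possible at “now” but no longer possible at “later,” so “If Mp then LMp” fails there. This is a single countermodel on a tiny frame, offered only to make the structure of the dilemma visible; it is not a general proof about either system.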

It is to be noted, however, that Hartshorne gave other versions, both informal and formal (such as the version used above) which do not depend on S5. Hartshorne was convinced that an element of intuitive judgment that goes beyond the logical formalism is involved in assessing the argument. However, granting the element of intuitive judgment does not directly answer Hubbeling’s dilemma. What remains is whether Hubbeling’s challenge can be met from within Hartshorne’s form of dipolar theism. It seems true that S5 is the appropriate modal system for expressing the abstract point of the argument relating to God’s unique characteristic of existence in every possible state of affairs. S5’s property of complete “world accessibility” symmetry is exactly what is needed. On the other hand, S4 is applicable to the description of what Hartshorne calls God’s actuality or God’s concrete states. So, Hartshorne’s distinction between existence and actuality maps onto the S5/S4 distinction. (For more on the existence/actuality distinction, see “Charles Hartshorne: Dipolar Theism,” section 2).

2. The Global Argument

If Hartshorne is correct, the ontological argument reveals the logical status of the theistic question as metaphysical rather than empirical. The argument falls short of a proof of theism, in large measure, because it depends on the premise that the existence of God is logically possible. Hartshorne’s own arguments against classical theism show that this is not an acceptable premise. Hartshorne once commented that John Duns Scotus also concluded that the question of God’s existence is not empirical. Hartshorne added, “My quarrel with him is that I regard his form of theism as either self-inconsistent or meaningless” (Viney 1985, x). Hartshorne believed that the weak premise in the modal argument is the first one, that “God” names a possible reality. He said in his reply to Hick that all of his misgivings about believing in God rested on the suspicion, which is difficult to remove, that every form of theism masks an absurdity. At least in part, this explains Hartshorne’s efforts to defend metaphysics as both the search for necessary truths about existence and the development of a coherent dipolar theism. One can think of the global argument as the completion of this process. Discounting the modal argument, each element of Hartshorne’s cumulative case is designed to buttress the claim that the existence of God is logically possible.

The various strands of the global argument highlight what Hartshorne considered to be the theistic implications of neoclassical metaphysics. Each argument is given a familiar name that suggests precursors in the history of philosophy, but none of them has an exact equivalent in the world’s philosophical literature. In addition to the ontological argument, Hartshorne develops his own versions of the cosmological, teleological, epistemic, moral, and aesthetic arguments. In keeping with Hartshorne’s use of position matrixes, each argument is presented as a logically exhaustive set of options. We have already hinted at this style of reasoning in the modal argument where one has the choice that God’s existence is a necessity (~M~p*), an impossibility (~Mp*), or a contingency (Mp* and M~p*). Other strands of the global argument are also presented in this way: to affirm one alternative is to deny all others, and alternatively, to deny one is to affirm that only one of the others is true. In each case, Hartshorne employs what he calls “the principle of least paradox” to conclude that the rational cost of rejecting neoclassical theism is greater than the cost of accepting it. Time and again, Hartshorne acknowledged the difficulties of an unqualified verdict in favor of neoclassical theism, but he also believed that his view better answered the questions of metaphysics than those of his rivals. Hartshorne was epistemically cautious in recognizing that his method would not yield a decisive victory for his own views. As with the modal argument, Hartshorne believed that no degree of logical rigor can eliminate the need for an element of intuitive judgment. The “essential element in rational procedure in metaphysics” is to honestly face the logically possible alternatives and to weigh up the cost of accepting or rejecting them (Viney 1985, x).
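The exhaustive set of options for the modal argument can be written compactly in box-and-diamond notation. The transcription below is an assumption made for illustration: it writes the diamond for the article’s M and expands the article’s L as “not possibly not.” The three disjuncts cover every case, and they exclude one another in any system in which what is necessary is also possible.

```latex
\[
\underbrace{\neg\Diamond\neg p^{*}}_{\text{necessity}}
\;\lor\;
\underbrace{\neg\Diamond p^{*}}_{\text{impossibility}}
\;\lor\;
\underbrace{\Diamond p^{*} \land \Diamond\neg p^{*}}_{\text{contingency}}
\]
```

Denying any one disjunct leaves exactly the other two open, which is the pattern the other strands of the global argument are said to follow.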

Much of the global argument is anticipated in Hartshorne’s explanation and defense of neoclassical metaphysics (see “Charles Hartshorne: Neoclassical Metaphysics”). Consider an outline of Hartshorne’s cosmological argument. As noted above, Hartshorne argues that “Something exists” is necessarily true. The principle of contrast and Hartshorne’s defense of de re modality, if correct, imply that what exists is characterized by both contingency and necessity. The necessary, moreover, as the common element in all possibility, is abstract. If it is possible for this necessity to be divine—more precisely, the abstract pole of the divine—then it is possible for God to exist. This supports the weak premise of the modal argument that God’s existence is logically possible. To reject the conclusion, one must either deny the necessity of existence, the principle of contrast, de re modalities, the character of necessity as abstract, or the possibility that the necessary aspect of things is divine. Hartshorne’s cosmological argument differs from traditional versions in neither concluding to the existence of a prime mover, an uncaused cause, nor a wholly necessary being. Of course, none of these descriptions fits the dipolar God, and Hartshorne had no interest in defending them.

Causal principles enter Hartshorne’s cumulative case in his argument from cosmic order, which he calls “the design argument.” Hartshorne defends a metaphysic according to which the cosmos is a theater of interactions among dynamic singulars, all of which act and are acted upon. The existence of many real beings, thus defined, raises the problem of cosmic order. The question is not why there is order rather than mere chaos. For Hartshorne, chaos presupposes order as much as non-existence presupposes existence—indeed, mere chaos is indistinguishable from nonbeing. The question, rather, is how there can be order on a cosmic scale if there is only an uncoordinated set of centers of creative activity. Localized order, or order within the cosmos, can be explained by localized activity of entities within the cosmos. The order of the cosmos, however, cannot be the outcome of a coordinated effort by the many entities since their very existence, severally scattered throughout the cosmos, presupposes the cosmos as a field of activity. If there is a cosmic-ordering power that itself falls under the metaphysical principle of acting and being acted upon, then cosmic order can be explained. Moreover, as Hartshorne argues in A Natural Theology for Our Time, the explanation is not ad hoc since all real beings, localized ones and the cosmic-ordering power, fall under the same metaphysical principle. The cosmic-ordering power is not, in the words of Alfred North Whitehead, an exception to metaphysical principles, invoked to save their collapse, but is their chief exemplification.

Hartshorne allows that the expression “cosmic order” permits different values; the laws of nature must include constants as well as variables, and the values of the constants (for example, the speed of light) are not logical necessities. In this way, one may speak, with Whitehead, of different “cosmic epochs” in which the laws of nature beyond the singularities of our universe are not identical with our own. Hartshorne insists, however, that the problem of cosmic order remains. This is because our conceptions of the fundamental laws of nature are contingent and mathematically peculiar in character. For instance, an epoch such as our own with a law of gravitation specified by “mass times mass proportioned to the radius squared” is a particular nomological condition to be conceptually contrasted with, say, gravitation as “mass times mass proportioned to the radius cubed.” Basic laws of nature appear to have the logical earmarks of “contingent decrees,” and as such it is legitimate to ask for their causal explanations. Thought experiments which assert that such basic laws could be instituted by chance mechanisms beg the question of basic order. An example is Hume’s suggestion of an Epicurean universe of swerving atoms that happen to arrange themselves into the cosmic “regularities” we observe. As Hartshorne says in A Natural Theology for Our Time, talk of atoms with a definite character persisting through time is “already a tremendous order.” Recent thought experiments in cosmology such as “bubble inflation” models also seem to posit background assumptions of contingent cosmic conditions, including the operating laws of quantum mechanics which necessarily involve specific quantitative values (for example, as in the use of Planck’s constant).
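The contrast between the two gravitational laws mentioned above can be written out in conventional Newtonian notation. The symbols are assumptions supplied here for illustration and do not appear in the original: F is the attractive force, the two m terms are the masses, r is the separation between them, and G and G′ are constants of proportionality (which would carry different units in the two cases).

```latex
\[
F = G\,\frac{m_{1} m_{2}}{r^{2}}
\qquad\text{versus}\qquad
F' = G'\,\frac{m_{1} m_{2}}{r^{3}}
\]
```

Neither form is logically required; each is a mathematically definite alternative, which is the sense in which such a law has the earmarks of a “contingent decree.”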

On Hartshorne’s neoclassical theistic alternative, one arguably need not settle for any metaphysically inexplicable contingent cosmic order or a freedom-suppressing “necessitarian” universe. It is also well to remember that Hartshorne vigorously defends “indeterminism.” If determinism is false, then neither the order within the cosmos nor the order of the cosmos is absolute. Multiple real beings with varying degrees of creative power are a recipe for conflict. To be sure, the existence of multiple real beings also opens the possibility for cooperative endeavors, whether it is cooperation among localized beings or between localized beings and the cosmic designer; but multiple creativity guarantees a mixture of disharmony and harmony. The cosmic-ordering power can guarantee a cosmic order, but because of the existence of a plurality of real beings that act, and are not simply acted upon, not everything that happens can be chosen by a single individual, even a divine one. This is relevant to the problem of theodicy, for it shows that, in neoclassical metaphysics, conflicts of decisions among the creatures and between the creatures and God are possible, opening the way to tragedies that not even God can avoid.

A skeptic may embrace any of the options that Hartshorne denies, but at a cost. Hartshorne argues that each of the non-theistic options has dubious metaphysical credentials and that his solution to the problem of cosmic order is the most parsimonious. If there is no cosmic order one must explain the apparent success of science in discovering that order. If there is no cosmic-ordering power then either localized beings are being used to explain an order that their activity presupposes or there is no explanation of the order. Another atheistic option is to accept that there is a cosmic-ordering power but deny that it is divine. Hartshorne considered “panentheism” to provide a superior analogy to anything atheism can propose for the cosmic designer. However, the remaining three strands of the global argument can also be used to support the idea of such an ordering power; it is not only an agent causally affecting the world but is also affected by the world and incorporates it into the divine life, as one that perfectly knows the world (epistemic argument), perfectly preserves its achievements (moral argument), and fully appreciates the world (aesthetic argument).

In the epistemic argument, Hartshorne raises the question of the relation between reality and knowledge. In one respect, knowledge depends upon the real, for one cannot know what is not real. On the other hand, it is difficult to give an account of the real apart from some form of knowledge. As Hartshorne (Creative Synthesis 288) notes, Immanuel Kant suggested that appearance differs from reality because “ … the content of our sensory intuition differs from the content of a non-sensory intuition” (see also Kant’s Critique of Pure Reason, A249, A252). The object of the non-sensory intuition is the “noumenon.” (Hartshorne parts company with Kant in conceiving God’s knowledge as partly passive rather than as wholly active.) Taking up Kant’s point, no merely partial or fallible knowing can circumscribe the real, for the extent of errors in knowing is measured by the real—if one is mistaken about x then something about x escapes one’s knowledge. In view of these conundrums, it is tempting to say that reality is the potential content of infallible knowledge—what an epistemically unsurpassable being would know if it existed. The problem with this solution, as far as atheism is concerned, is that an infallible knower, by definition, could not possibly be mistaken. However, it would know its own existence, so one is led to posit not simply the possible existence of an infallible knower, but also its actual existence. Hartshorne drew precisely this conclusion, that reality is the actual content of infallible knowledge. He argued further, following Josiah Royce, that defects in cognitive experience are internal to experience. Hartshorne mentions confusion, inconsistency, doubt, inconstancy of beliefs, and “above all, a lack of concepts adequate to interpret our percepts and of percepts adequate to distinguish between false and true concepts” (Creative Synthesis 288).

A distinctive feature of Hartshorne’s account of perfect knowledge is that it requires both cognitive and affective components (see “Charles Hartshorne: Dipolar Theism,” part 5). God must be conceived not only as knowing all true propositions but also as knowing the creatures themselves; that is, feeling what they feel. Whatever one has been and however one has felt become transformed thereafter as an everlasting memory in God’s consciousness. This applies also to the collective life of the creatures. There is no mere numerical sum of value in God—as if value were simply a question of set membership—for the experiences of creatures become woven into the fabric of God’s undying experience. This is what Hartshorne means by “contributionism,” that the creatures enrich the divine life in a way that would not have been possible apart from their activity. In comments he made on a debate about the resurrection of Jesus, Hartshorne (Did Jesus Rise From the Dead? 140) asked, “If people can live or die for country, or other human groups, why can they not live and die for that which embraces all groups and their intrinsic values—the divine life?” Hartshorne was fond of quoting the Jewish prayer, “Help us to become co-workers with You, and endow our fleeting days with abiding worth.” The moral argument brings out the attractiveness of this ideal as the supreme aim of creaturely existence.

There are a number of ways to reject contributionism. One may deny that there is any supreme aim, theistic or nontheistic. Hartshorne argues that this robs comparative value judgments of a standard of comparison; if, as most reflective people would accept, it is possible to squander one’s life on trivial, unimportant, or immoral pursuits, then there must be a measure of the good life that is being used as a comparison. Another option is that self-interest is the supreme aim. Hartshorne follows the Buddhists in rejecting this (see “Charles Hartshorne: Neoclassical Metaphysics”). More plausible is the idea that the aim of life is to live for self and for others either during this life or in an afterlife. Hartshorne considered this laudable, but finally unsatisfactory as the supreme aim of life. First, he argued that there is at best a numerical meaning of “general welfare,” whereas neoclassical theism provides an experiential meaning in God’s experience. Second, there is the problem of mortality. In “A Free Man’s Worship,” Bertrand Russell stated the problem clearly when he proposed to build a philosophy of life upon a foundation of “unyielding despair.” The despair stems from the recognition that “the noonday brightness of human genius” and “the whole temple of man’s achievement” are destined to perish. There is, to be sure, apparent nobility in such Sisyphean labor, except that “nobility” and “tragedy” become, on this account, as if they had never been. Dipolar theism, on the other hand, accounts for the value of past achievement as an enduring aspect of the unending process of God’s life and memory. Moreover, the value of living for self and others is included in Hartshorne’s account, for the supreme “other” is God. The extent and nature of value that one contributes to God is precisely the extent and quality of value that one has contributed to others.
Hartshorne argued that contributionism captures the inclusive nature of love that one finds expressed in biblical ethics: one cannot love God if one does not love others, and one is to love God with everything one is and to love one’s neighbor as oneself.

An argument from the beauty of the world as a de facto whole rounds out Hartshorne’s cumulative case and ties it to the aesthetic motif of his philosophy. It is quite natural, and prima facie rational, to speak of enjoying the beauty of the cosmos. Most people consider it appropriate to include aesthetic predicates in descriptions of the universe, for it is endlessly interesting, mysterious, and awe inspiring. Hartshorne described science as the search for the hidden beauty of the world, and many great scientists would agree; even those who have little or no use for philosophy or religion, like Steven Weinberg who states that the universe is beautiful beyond what seems necessary. An aesthetically displeasing universe, says Hartshorne, would be either chaotic or monotonous. What we find, on the contrary, is order in the laws of nature and variety in the evolution of new arrangements of matter and levels of mind. Hartshorne speaks of the world as a de facto whole, for he means to stress its open-ended and dynamic character. If atheism is true, then it is non-divine individuals alone that enjoy the beauty of the universe as a whole, catching a glimpse of it in the slice of time that is available to them and to the species. The peek that we have of the beauty of the cosmos, moreover, reveals horizons suggestive of aesthetic riches forever beyond our grasp. Hartshorne argues that this would represent an irremediable aesthetic defect in the universe, for beauty should be enjoyed and only God could adequately enjoy the beauty of the world as a whole. Of course, what should be is not necessarily what is. Hartshorne insists, however, that unlike merely contingent defects, the lack of a divine spectator would be a necessary defect, “an eternally necessary yet ugly aspect of things” (Creative Synthesis 290). 
It is a thought without intrinsic reward or pragmatic value, best conceived as a thought experiment whose purpose is to make us realize a divine mind that can appreciate the beauty that escapes us.

The conclusions of the design and epistemic arguments, together with Hartshorne’s “psychicalism,” lend support to his aesthetic argument. As the supreme cosmic-ordering power, whose knowledge is the ultimate measure of reality, the divine, in any particular state of its life, must find within itself the entire wealth of all creative experiencing that has ever existed. This experience of a universe in process is, as Whitehead says, “beyond our imagination to conceive”; it includes (to us) the imperceptible abyss of the past as well as the infinite possibilities of the future. It is here that these lines of inference dovetail with the moral argument. God must be conceived not only as the supreme spectator appreciating the beauty of the world as a de facto whole, but also as the supremely beautiful (or sublime) object of contemplation, adoration, and worship—an endlessly unfolding cosmic experience to which we contribute. Also implicit in Hartshorne’s theology is that God is, as it were, the supreme actor in the play of existence. The various roles of the deity, as Hartshorne conceives it, are neatly summarized in the title of one of his articles: “God as Composer-Director, Enjoyer, and, in a Sense, Player of the Cosmic Drama.”

3. The Problem of Evil and Theodicy

As long as there have been theists there has been a problem of evil, whether as a believer’s lament (as in Job), as a theologian’s conundrum (as in Augustine), or as a skeptic’s argument (as in Hume). Contemporary philosophers of religion speak of two forms of the problem of evil: the logical and the evidential. The logical problem of evil raises the question whether the existence of evil, conceived as gratuitous suffering, is logically consistent with the existence of a God that is perfect in power, knowledge, and goodness. The evidential problem of evil raises the question whether its existence renders improbable that of a perfect God. Hartshorne found neither version of the problem especially troublesome for his form of theism. He held that the problem with both versions of the problem of evil, as they are usually stated, is that they pose a loaded question, presupposing a concept of divine power that, in Hartshorne’s (Philosophical Aspects of Thanatology 86) words, “is not even coherent enough to be false.” Hartshorne developed and defended a metaphysic of shared creativity in which no individual, not even a divine one, can have a monopoly of power (see “Charles Hartshorne: Neoclassical Metaphysics” and “Dipolar Theism”). He was fond of disagreeing with Einstein who said that God does not play dice. On the contrary, chance and multiple freedom are inseparable; it is no accident, said Hartshorne (Studies in the Philosophy of J. N. Findlay 230), that there are accidents. Although God has the eminent form of creative power, it is not enough to guarantee a world without accidents, wrongdoing, and tragedy. Hartshorne would say that the evidential problem of evil suffers from the additional defect of assuming that God’s existence is an empirical question. We have seen that, according to Hartshorne, this represents a failure to appreciate the logical consequences of “Anselm’s discovery.”

Much of the appeal of traditional religion is that it offers the hope that the gulf between what is and what ought to be can be bridged in a future existence. It promises that the cosmic scales of justice are finally balanced either through the mysterious operations of karma in the process of reincarnation or through the omnipotence of God in a heavenly or hellish afterlife. Hartshorne considered these to be false hopes. While he did not definitively reject the possibility of an afterlife, he showed no interest in speculating about it or defending the idea. He argued that it is the divine prerogative alone to persist through infinite variations; the self-identity (that is, the genetic identity) of a non-divine individual cannot sustain itself indefinitely. Even if there were an afterlife, there could be no guarantee that the individual would survive long enough for every injustice, or even the greatest of injustices, in that person’s life to be rectified. Moreover, an afterlife could not eliminate the risk inherent in multiple or shared creativity. Traditional accounts of the afterlife are plausible only to the extent that creaturely freedom bends to a higher moral law (karma) or will (God’s) imposed on it. The heavens, hells, and purgatories of religion are elaborately orchestrated so as to place all lesser freedoms in perfect harmony with justice. In Hartshorne’s neoclassical metaphysics—especially evident in his design argument—God has the power to insure order on a cosmic scale, a power that is tantamount to insuring a field of activity for localized individuals. Divine power does not, however, extend to insuring what decisions the creatures will make. No particular outcome can be guaranteed.

Granting that the two versions of the problem of evil do not undermine neoclassical metaphysics still leaves the question of God’s role regarding suffering and injustice. The facts that generate the problem of evil do not go away because one successfully rebuts a philosophical argument. Hartshorne claims that his theology makes better sense of “God is love” than its competitors, yet there is a great deal of suffering that is undeserved, pointless, and widespread. Evolutionary theory adds another dimension. Entire ecosystems and countless species have come and gone in the course of geologic time. Throughout this history, creatures compete for the goods that will insure their survival and very often live at each other’s expense. Nature seems entirely indifferent to comparative values; as John B. Cobb Jr. noted, “lower” species thrive at the expense of “higher” species, as when malarial mosquitoes feed on human beings. Finally, there are what Marilyn Adams calls “horrendous evils,” evils that are so pernicious that they give reason to doubt that the person’s life could be a great good to him or her on the whole. Hartshorne claims that a loving God is a necessary and indispensable character in this drama. One may ask whether this is plausible, but one must also take care not to permit the presuppositions of classical theism to color one’s judgment. Hartshorne counsels suspicion of the question whether our world is the sort that one would expect from an almighty and all-loving creator. In the context of dipolar theism the question must be rephrased: Is this the sort of world that one would expect of a deity that is perfect in power and love and that presides over a world composed of beings, each of which exercises some degree of creativity?

If Hartshorne is correct, God accounts for order on a cosmic scale. There must be, however, two aspects to this activity that are distinguishable but not separable. On the one hand, there is the ordering activity that establishes the cosmic order per se, making possible all non-divine forms of freedom. On the other hand, there is the ordering activity that lures each localized being towards greater intensity of experience. Hartshorne holds that both aspects of God’s creative ordering of the world follow aesthetic principles (see “Charles Hartshorne: Neoclassical Metaphysics”). According to these principles, the double extremes between which the divine ordering power operates are (1) unqualified unity and unqualified diversity (or chaos) and (2) ultra-complexity and ultra-simplicity (or triviality). The mere fact of an ordered cosmos does not automatically avoid the aesthetic defects of being overly chaotic or trivial. Avoidance of these extremes requires a cumulative developmental process, which is implicit in Hartshorne’s cumulative view of process. In neoclassical metaphysics, “the explanation for the contingent must be a genetic one,” as Hartshorne (82) says in Insights and Oversights of Great Thinkers. It could not be everlastingly true that there have been elephants or seahorses. Because the process is cumulative, it must also be developmental. For example, an elephant is not created de novo from a mixture of atoms and molecules; it requires a lengthy process of species development. This is why Hartshorne claimed in Omnipotence and Other Theological Mistakes that the general idea of evolution is derivable from his metaphysical principles.

God’s role in the economy of nature is not simply maintaining cosmic order, but also eliciting higher forms of order, making possible forms of experience with greater levels of unity in diversity. A law of axiology as firm as any law of nature is that varying levels of creative experience are necessarily correlated with varying levels of what can be achieved in the way of value. For example, as complex and emotionally rich as a dog’s interior life may be, it is not sufficient to produce scientific theorizing or high artistic accomplishment. What follows is that varying levels of creativity exhibit varying levels of opportunity and risk. For instance, one cannot be ironic with a dog. Irony may amuse or offend only if one’s audience can understand it. As goes creative experiencing, so goes freedom. The cost of actual or possible achievement is the risk of failure. This analysis is evident in the few comments that Hartshorne made about sin. In a 1944 symposium on world peace, Hartshorne said that much could be learned from Reinhold Niebuhr’s view that sin is not a struggle between “lower” (bodily) and “higher” (spiritual) aspects of personality. Rather, sin is a perversion of what is highest in a person, one’s sense of the divine; it is the claim to be divine, “a rebellion against our humble station in the universe” (Finkelstein and Maciever 597). This idolatry comes in many forms, religious and nonreligious, in the pernicious claims to infallibility or any attempt to place ultimate worth in something less than deity. As far as our experience goes, these are the highest and most tragic manifestations of the general principle that greater degrees of freedom necessarily accompany greater possibilities of its abuse.

Hartshorne agrees that the world is better to the extent that sin, and the suffering it brings in its wake, is not part of it. It does not follow, however, that the world is better to the extent that the possibility of sin is excluded from it. The conditions for the possibility of good or evil are the same: freedom. Indeed, Hartshorne maintained that some degree of evil is inevitable if good is to be possible. It is true that the particular evils that occur are not inevitable. Knowing this, we imagine that the cosmos could be altogether free of the blemish of evil, but this is to imagine an ideal that no single individual could bring about. One might agree with this but ask, with Hartshorne, whether there is a greater possibility of evil than might be expected from an all-loving cosmic designer. In The Zero Fallacy, Hartshorne spoke of human beings as the “bullies of the planet,” heedless of the welfare of other creatures, cruel to our own kind, and too often lacking the will to prevent such cruelty. He asked whether the seemingly unbridgeable distances between the earth and other solar systems might be a providential arrangement. In Omnipotence and Other Theological Mistakes, he allows himself an expression of doubt as to whether the “perilous experiment” of creatures free of instinctive guidance was too dangerous. He says that if he played at criticizing God, it would be at this point. Yet, Hartshorne also accepted on faith the infallible wisdom and ideal power of God. In Wisdom as Moderation, Hartshorne denies that limited intellects are in a position to know whether there is too much risk of evil in the world, for such a judgment must include a potentially infinite future. He also stressed that the justification of the world is in the world; that is, in the open-ended adventure of life itself that God’s creativity insures.

One of Hartshorne’s definitions of religion is the acceptance of our fragmentariness. We are fragmentary both in the sense that we are limited in space and time (that is, we are localized) and in the sense that our capacities for knowledge and goodness are limited (that is, we are imperfect). If something like Hartshorne’s panentheism is correct, we are also fragmentary in the sense that we are part of the divine being-in-becoming (see “Charles Hartshorne: Dipolar Theism”). For Hartshorne, God includes all but does not determine all, much like a person includes the cells of his or her body without being able to decide the details of their activity. Thus, what we do makes a difference in and to God in the sense that we can enhance or diminish in admittedly limited ways the divine enjoyment of the world—hence, the concept of tragedy in God mentioned previously. We have also seen, in the moral argument, that Hartshorne regarded the aim of consciously contributing to the divine life as the highest purpose to which we can aspire. In Wisdom as Moderation, he says, “God’s possession of us is our final achievement, not our possession of God” (90). Every creature that has ever existed or will ever exist becomes part of the inexhaustible memory of God. In Plato’s Symposium, Socrates, reporting the views of Diotima, speaks of immortality as the achievement of doing acts worthy of future generations’ remembrance. Hartshorne offers a similar kind of immortality except that the fallible and mortal memory of future generations is replaced by the infallible and unending memory of God.

A Hartshornean theodicy does not allow one to say that everything, or every evil, happens for a reason. There is no cure for the fact that the “lower” sometimes lives at the expense of the “higher” and that horrendous evils are part of this universe. On the other hand, a Hartshornean theodicy allows one to say that anything that happens, or any evil that occurs, can become part of a reason for striving to overcome evil with good, thereby depriving evil of its capacity to dishearten us. The true depth of divine power, on Hartshorne’s view, is not God’s ability to manipulate events to the best possible outcome, but God’s ability to bear the suffering of the creatures without being overcome by it. On Hartshorne’s view, God is forever seeking ways to bring good from the world no matter how bad things may get. The world-weariness that sometimes overcomes the creatures never overcomes deity. In the language of William James, Hartshorne’s God is neither a pessimist (thinking that things cannot get better) nor an optimist (thinking that things are for the best), but a kind of cosmic meliorist (thinking that things can get better). This theology may console in at least two ways. To those who are helpless and who suffer, Hartshorne claims that there is a divine co-sufferer. To those who are not helpless and who work for the welfare of others, Hartshorne maintains that they are indeed working on the side of the cosmos itself, as co-workers with God. This is what Pierre Teilhard de Chardin called “building the earth.” In this way, Hartshorne’s theism may promote a resilient spirit in the face of defeat, hope that may conquer despair, and love that holds the promise of harnessing evil.

4. Conclusion

Hartshorne’s extensive writings on the ontological argument were instrumental in generating new interest in Anselm’s reasoning and in redoubling the efforts of philosophers in exploring and evaluating the variations that it can take. By highlighting a second form of ontological argument—a modal version—that the vast majority of philosophers had ignored, Hartshorne demonstrated that it was no longer sufficient to rely on Gaunilo or Kant for a refutation of Anselm. Hartshorne benefited from the formalizations of modal systems made popular by his teacher C. I. Lewis, and was the first to publish a formalized version of the modal argument. This unprecedented accomplishment clarified the argument and helped turn attention to its modal structure.

One could argue that Hartshorne was a victim of his own success. As many philosophers had failed to read Anselm closely enough to discern a second argument in his Proslogion, so philosophers had a tendency not to read Hartshorne closely enough to understand that he never used the modal argument as a singular proof of theism. Hartshorne used the argument as a single strand in a cumulative or global argument for neoclassical theism. His way of presenting the elements of the global argument emphasized that the rational cost of rejecting the premises was, as Hartshorne argued in each case, greater than the cost of accepting the conclusion. To be sure, Hartshorne considered the modal argument an essential strand in the case for theism since it reveals, he believed, the logic of theism. If Hartshorne is correct, empirical arguments for or against the existence of God are unavailing because they misconstrue the nature of the theistic question. This idea also extends to skeptical arguments from evil that conclude either to the non-existence or to the probable non-existence of God. The problems of theodicy, for Hartshorne, concern the presence of evil in a universe in which every concrete particular has some degree of creativity, and not, as in traditional theology, one in which creativity is the unique privilege of God.

5. References and Further Reading

a. Primary Sources

i. Books in Order of Publication Date

Hartshorne, Charles. 1941. Man’s Vision of God and the Logic of Theism. Chicago: Willett, Clark and Company.
Hartshorne, Charles. 1948. The Divine Relativity: A Social Conception of God. New Haven, Connecticut: Yale University Press.
Hartshorne, Charles and William L. Reese, eds. 1953. Philosophers Speak of God. Chicago: University of Chicago Press. Republished in 2000 by Humanity Books.
Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1965. Anselm’s Discovery: A Re-examination of the Ontological Proof for God’s Existence. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1967. A Natural Theology for Our Time. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1983. Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy. Albany: State University of New York Press.
Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy. Ed. Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom. Ed. Donald W. Viney and Jincheol O. Albany: State University of New York Press.

ii. Hartshorne’s Response to his Critics

Alston, William. 1964. “Interrogations of Charles Hartshorne.” Philosophical Interrogations. Eds. Sydney Rome and Beatrice Rome. New York: Holt, Rinehart and Winston: 319-54.
Cobb, John B. Jr. and Franklin L Gamwell, eds. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
Hahn, Lewis Edwin, editor. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers, Volume XX. La Salle, Illinois: Open Court.
Kane, Robert and Stephen H. Phillips, eds. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
Sia, Santiago, ed. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

iii. Selected Articles

Hartshorne, Charles. 1944. “The Formal Validity and the Real Significance of the Ontological Argument.” The Philosophical Review 53.3: 225-45.
Hartshorne, Charles. 1945. “On Hartshorne’s Formulation of the Ontological Argument: A Rejoinder [to Elton].” Philosophical Review 54.1: 63-5.
Hartshorne, Charles. 1961. “The Logic of the Ontological Argument.” Journal of Philosophy 58.17: 471-73.
Hartshorne, Charles. 1962. Introduction. Saint Anselm: Basic Writings. 2nd ed. Trans. S. W. Deane. La Salle, Illinois: Open Court Publishing Company: 1-19.
Hartshorne, Charles. 1963. “Rationale of the Ontological Proof.” Theology Today 20.2: 278-83.
Hartshorne, Charles. 1966. “Is the Denial of Existence Ever Contradictory?” Journal of Philosophy 63.4: 85-93.
Hartshorne, Charles. 1967. “Necessity.” Review of Metaphysics 21.2: 290-96.
Hartshorne, Charles. 1967. “Rejoinder to Purtill.” Review of Metaphysics 21.2: 308-09.
Hartshorne, Charles. 1972. “Can There Be Proofs for the Existence of God?” Religious Language and Knowledge. Eds. Robert H. Ayers and William T. Blackstone. Athens: University of Georgia Press: 62-75.
Hartshorne, Charles. 1977. “John Hick on Logical and Ontological Necessity.” Religious Studies 13.2: 155-65.
Hartshorne, Charles. 1978. “A Philosophy of Death.” Philosophical Aspects of Thanatology. Vol. 2. Eds. Florence M. Hetzler and A. H. Kutscher. New York: MSS Information Corp.: 81-89.
Hartshorne, Charles. 1982. “Grounds for Believing in God’s Existence.” Meaning, Truth, and God. Ed. Leroy S. Rouner. London: University of Notre Dame Press: 17-33.
Hartshorne, Charles. 1984. “God and the Meaning of Life.” On Nature. Vol. 6. Ed. Leroy S. Rouner. Notre Dame, Indiana: University of Notre Dame Press: 154-68.
Hartshorne, Charles. 1985a. “Theistic Proofs and Disproofs: The Findlay Paradox.” Studies in the Philosophy of J. N. Findlay. Eds. Robert S. Cohen, Richard M. Martin, and Merold Westphal. Albany: State University of New York Press: 224-34.
Hartshorne, Charles. 1985b. “Our Knowledge of God.” Knowing Religiously. Vol. 7. Ed. Leroy S. Rouner. Notre Dame, Indiana: University of Notre Dame Press: 52-63.
Hartshorne, Charles. 1987. “Response to resurrection debate.” Did Jesus Rise From the Dead? The Resurrection Debate. Ed. Terry L. Miethe. San Francisco: Harper & Row: 137-42.
Hartshorne, Charles. 1989. “Metaphysical and Empirical Aspects of the Idea of God.” Witness and Existence: Essays in Honor of Schubert M. Ogden. Eds. Philip E. Devenish and George L. Goodwin. Chicago: University of Chicago Press: 177-189.
Hartshorne, Charles. 1999. “Can We Understand God?” Framing a Vision of the World: Essays in Philosophy, Science and Religion. Eds. André Cloots and Santiago Sia. Belgium: Leuven University Press: 87-97.

b. Secondary Sources

Boyd, Gregory A. 1992. Trinity and Process: A Critical Evaluation and Reconstruction of Hartshorne’s Di-Polar Theism Towards a Trinitarian Metaphysics. New York: Peter Lang.
Burrell, David B. 1982. “Does Process Theology Rest on a Mistake?” Theological Studies 43.1: 125-35.
Clarke, Bowman. 1971. “Modal Disproofs and Proofs for God.” Southern Journal of Philosophy 9.3: 247-58.
Dombrowski, Daniel A. 1996. Analytic Theism, Hartshorne, and the Concept of God. Albany: State University of New York Press.
Dombrowski, Daniel A. 2006. Rethinking the Ontological Argument: A Neoclassical Theistic Response. New York: Cambridge University Press.
Finkelstein, Louis and Robert M. Maciever, eds. 1944. Approaches to World Peace: A Symposium. New York: Conference on Science, Philosophy, and Religion in their Relation to the Democratic Way of Life.
Goodwin, George L. 1978. The Ontological Argument of Charles Hartshorne. Missoula: Montana Scholars Press.
Goodwin, George L. 1983. “The Ontological Argument in Neoclassical Context: Reply to Friedman.” Erkenntnis 20: 219-32.
Goodwin, George L. 2003. “De Re Modality and the Ontological Argument.” Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Ed. George W. Shields. Albany: State University of New York Press: 175-97.
Kane, Robert. 1984. “The Modal Ontological Argument.” Mind 93: 336-50.
Lucas, Billy Joe. 2003. “The Second Epistemic Way.” Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Ed. George W. Shields. Albany: State University of New York Press: 199-207.
Neville, Robert C. 1980. Creativity and God: A Challenge to Process Theology. New York: The Seabury Press.
Neville, Robert C. 2009. Realism in Religion: A Pragmatist’s Perspective. Albany: State University of New York Press.
Oppy, Graham. 1995. Ontological Arguments and Belief in God. New York: Cambridge University Press.
Peirce, C. S. 1934. The Collected Papers of Charles Sanders Peirce. Vol. 5, Eds. Charles Hartshorne and Paul Weiss. Cambridge: Harvard University Press.
Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
Peters, Eugene H. 1984. “Charles Hartshorne and the Ontological Argument.” Process Studies 14.1: 11-20.
Shields, George W. 1980. “Review of The Ontological Argument of Charles Hartshorne by George L. Goodwin.” The Journal of Religion 60.3: 357-59.
Shields, George W. 1980. “Hartshorne’s Modal Ontological Argument.” Dialogue 22.1-2: 45-56.
Shields, George W. 1983. “God, Modality and Incoherence.” Encounter 44.1: 27-39.
Shields, George W. 1992. “Hartshorne and Creel on Impassibility.” Process Studies 21.1: 44-59.
Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49.2: 123-134.
Shields, George W., ed. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
Sia, Santiago. 1985. God in Process Thought: A Study in Charles Hartshorne’s Concept of God. Dordrecht, the Netherlands: Martinus Nijhoff.
Sia, Santiago. 2004. Religion, Reason and God: Essays in the Philosophy of Charles Hartshorne and A. N. Whitehead. Frankfurt am Main: Peter Lang.
Sia, Santiago, ed. 1986. Word and Spirit, a Monastic Review, 8: Process Theology and the Christian Doctrine of God. Petersham, Massachusetts: St. Bede’s Publications.
Viney, Donald Wayne. 1985. Charles Hartshorne and the Existence of God. Albany: State University of New York Press.
Viney, Donald Wayne. 1986. “How to Argue for God’s Existence: Reflections on Hartshorne’s Global Argument.” The Midwest Quarterly 28.1: 36-49.
Viney, Donald Wayne. 1987. “In Defense of the Global Argument: A Reply to Professor Luft.” Process Studies 16.4: 309-311.
Viney, Donald Wayne. 2005. “A Lamp to Our Doubts: Ferré, Hartshorne, and Theistic Arguments.” Nature, Truth, and Value: Exploring the Thinking of Frederick Ferré. Eds. George Allan and Merle F. Allshouse. Lanham, Maryland: Lexington Books: 255-69.
Whitney, Barry L. 1985. Evil and the Process God. Toronto: Edwin Mellen Press.
Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40.4: 293-300.
Wood, Forest Jr. and Michael DeArmey, eds. 1986. Hartshorne’s Neo-Classical Theology. New Orleans: Tulane University Press.

c. Bibliography

Viney, Donald Wayne and Randy Ramal. 2007. “Primary Bibliography of Philosophical Works of Charles Hartshorne.” Hartshorne: A New World View: Essays by Charles Hartshorne. Ed. Herbert F. Vetter. Cambridge, Massachusetts: Harvard Square Library: 129-160. Also published in Sia, Santiago. 2004. Religion, Reason and God. Frankfurt am Main: Peter Lang: 195-223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Reformed Epistemology

Reformed epistemology is a thesis about the rationality of religious belief. A central claim made by the reformed epistemologist is that religious belief can be rational without any appeal to evidence or argument. There are, broadly speaking, two ways that reformed epistemologists support this claim. The first is to argue that there is no way to successfully formulate the charge that religious belief is in some way epistemically defective if it lacks the support of evidence or argument. The second way is to offer a description of what it means for a belief to be rational, and to suggest ways that religious beliefs might in fact meet these requirements. This has led reformed epistemologists to explore topics such as when a belief-forming mechanism confers warrant, the rationality of engaging in belief-forming practices, and when we have an epistemic duty to revise our beliefs. As such, reformed epistemology offers an alternative to evidentialism (the view that religious belief must be supported by evidence in order to be rational) and fideism (the view that religious belief is not rational, but that we have non-epistemic reasons for believing).

Reformed epistemology was first clearly articulated in a collection of papers called Faith and Rationality edited by Alvin Plantinga and Nicholas Wolterstorff in 1983. However, the view owes a debt to many other thinkers.

1. Introduction
2. The Origins of Reformed Epistemology
a. Reformed
b. Epistemology
3. Key Figures in Reformed Epistemology
4. Evidence and Rational Belief in God
5. Classical Foundationalism
a. Rejecting Classical Foundationalism
6. The Positive Case in Reformed Epistemology
7. Objections to Reformed Epistemology
8. References and Further Reading

1. Introduction

Here is an argument against the rationality of belief in God:

(1) Belief in God requires the right kind of evidence in order to be rational.

(2) No such evidence exists for belief in God.

(3) Therefore, belief in God is not rational.

The idea here is that in order for belief in God to be rational, there needs to be an appropriate relationship between belief and evidence. What is appropriate, according to those who endorse the above argument, is that the belief in question be based on good evidence. This argument is sometimes referred to as the evidentialist objection to belief in God. According to the reformed epistemologist, philosophers have historically taken premise 1 to be rather intuitive. As a result, discussion of the rationality of belief in God focused almost entirely on premise 2. Thus, philosophers who defended the rationality and justification of belief in God would have done so by responding to premise 2 and providing evidence for God’s existence. The evidentialist objection fails, they claim, because sufficient evidence does exist for rational belief in God. According to the reformed epistemologist, then, theists (historically anyway) who reject premise 2 would simply endorse the following argument:

(1) Belief in God requires the right kind of evidence in order to be rational.

(2*) Such evidence does exist for rational belief in God.

(3*) Therefore, belief in God is rational.

For the theist who defends this argument, finding the right kind of evidence that is sufficient for rational belief in God becomes their chief aim. The problem, according to the reformed epistemologist, is that such a move is unnecessary. There is, in other words, a much easier way around the evidentialist objection—the rejection of premise 1. Thus, for the reformed epistemologist the problem with the evidentialist objection lies not with 2, but with 1. Why assume that belief in God is in any way subject to the demands of 1? Belief in God, argues the reformed epistemologist, can be rational without inference from evidence or argument. If this central claim is true, 1 is undermined and the evidentialist objection (as it stands) fails.

2. The Origins of Reformed Epistemology

Reformed epistemology first appeared in the early 1980s but the view owes a debt to many other thinkers. The influences on reformed epistemology can be divided into two groups: reformed influences and influences from within epistemology.

a. Reformed

Reformed epistemology was first clearly articulated in a collection of papers called Faith and Rationality, edited by Alvin Plantinga and Nicholas Wolterstorff in 1983. The “reformed” in reformed epistemology reflects the clear influence of the reformed theological tradition on this view. Two of the leading proponents, Plantinga and Wolterstorff, taught at Calvin College, and they take inspiration from important reformed thinkers such as John Calvin and Abraham Kuyper.

The most explicit appeal to the reformed tradition is found in Alvin Plantinga’s work. Plantinga, when wondering how theistic belief might be grounded, suggests that we consider that Calvin may have been right when he said that God has created humans with an inner awareness of himself and it is this sensus divinitatis that is responsible for theistic belief. Plantinga also engages with and criticizes reformed thinkers who reject natural theology such as Karl Barth (See Plantinga 1983).

Despite the important role that reformed thought played in the early days of reformed epistemology, and, in particular, in the thinking of some of its key proponents, the central tenets of reformed epistemology do not depend on this tradition. Plantinga has tried to make this more explicit. In Warranted Christian Belief he argues that the ideas he finds in Calvin are also found in Thomas Aquinas. In fact, there is no reason to doubt that numerous traditions within Christian thought could also adopt something like the view defended by reformed epistemologists. Furthermore, the view could be easily adapted by other religions, particularly monotheistic religions.

In light of this, the word “reformed” in reformed epistemology is best thought of as describing the inspiration behind the position rather than its core claims. Objections to reformed thought, or to Christianity more generally, may leave reformed epistemology unscathed.

b. Epistemology

As well as being influenced by the reformed tradition, reformed epistemology draws on work in epistemology. The philosopher who has most clearly influenced reformed epistemologists is Thomas Reid, a Scottish Presbyterian minister. Reid’s epistemology is distinctive because of the importance he places on describing the belief-forming faculties that give rise to our beliefs. These faculties are dispositions to form certain beliefs in response to being triggered in certain ways. These dispositions can vary over time and we can gain some and lose others through training or habit. But some of our belief dispositions are innate: we are simply born with them. According to Reid these innate dispositions cannot ultimately be rationally grounded by us, but we must rely on them nonetheless.

This Reidian picture of epistemology has had a significant influence on reformed epistemology. Accordingly, reformed epistemologists argue that in order to understand whether or not our religious beliefs are rational we must consider what sorts of being we are and the innate belief dispositions that we have.

3. Key Figures in Reformed Epistemology

Though perhaps not a sufficient condition, the rejection of premise 1 above is at least a necessary condition when it comes to identifying key figures within reformed epistemology. Below, then, we discuss three philosophers who reject the idea that belief in God is rational only when inferred from good evidence. These philosophers—William Alston, Alvin Plantinga, and Nicholas Wolterstorff—are key figures within religious epistemology and were central in the development of reformed epistemology.

a. William Alston

William Alston’s first major contribution to reformed epistemology comes in a pair of essays “Religious Experience and Religious Belief” and “Christian Experience and Christian Belief” (the latter of these appears in Faith and Rationality, which is edited by Alvin Plantinga and Nicholas Wolterstorff). His aim is to argue that Christian Practice (CP) is justified. CP is the practice of forming certain kinds of beliefs in response to certain experiences. The sorts of beliefs in question are those such as “God will provide for his people” or “God will forgive the sins of the truly repentant.” They are beliefs about God and his activities and Alston calls these beliefs “M-beliefs” where M stands for manifestation (Alston 1983: 104-105).

Alston wishes to show that those who engage in CP are justified in much the same way that we are justified in engaging in a different practice—perceptual practice (PP). PP is the very familiar practice of forming certain perceptual beliefs in response to perceptual experiences.

Alston argues that there is no non-circular justification available for PP, because our only access to the physical world, of which PP gives us knowledge, is through PP itself. The only justification we have for PP is that we do not have sufficient reason for believing that it is unreliable. CP, claims Alston, is justified by the same standard. Those who claim that we need some independent reason for trusting CP are holding it to a higher epistemic standard than PP.

Alston went on to offer a book-length defense of these ideas in Perceiving God. In Perceiving God Alston spends significant time discussing objections to what he is now calling Christian Mystical Practice (CMP). He concludes that all the objections fail and that they are guilty of one of two things: epistemic imperialism or double standards. He describes epistemic imperialism as requiring that CMP be like PP in some way, if it is to be justified, without any epistemic support for that requirement. Objections are guilty of double standards when they seek to apply a standard to CMP that PP would not meet (Alston 1991: 248-250).

b. Alvin Plantinga

Alvin Plantinga has authored and edited a number of books and essays on reformed epistemology. Plantinga’s earliest work on the topic, God and Other Minds, represents an initial attempt to undermine the evidentialist objection. In God and Other Minds, Plantinga assumes that (2) is generally correct. There isn’t, according to Plantinga, sufficiently good evidence for belief in God—at least not in the way that is demanded by the evidentialist. Plantinga’s approach at this point, then, is to argue that there is a double standard with regard to (1). So while the evidence and arguments for belief in God are far from conclusive, they are, in fact, on par with other beliefs that we take to be rational. For example, as the argument goes, we take the belief that other minds exist to be rational despite the fact that philosophical arguments in its favor suffer many of the same problems that plague traditional theistic arguments. Thus, concludes Plantinga, “if my belief in other minds is rational, so is my belief in God. But obviously the former is rational; so, therefore, is the latter” (1967: 271). This is the first of Plantinga’s so called parity arguments.

In more recent literature, however, Plantinga abandons this earlier parity argument as a way to deal with the evidentialist objector. This is due in part to the fact that in God and Other Minds Plantinga assumed, like the evidentialist objector, that the way to go about discussing the rationality of religious belief was to first consider the evidence in its favor. Here is Plantinga discussing this assumption:

I was somehow both accepting but also questioning what was then axiomatic: that belief in God, if it is to be rationally acceptable, must be such that there is good evidence for it. This evidence would be propositional evidence: evidence from other propositions you believe, and it would have to come in the form of arguments. This claim wasn’t itself argued for: it was simply asserted, or better, just assumed as self-evident or at least utterly obvious. What was taken for granted has now come to be called ‘evidentialism’ (a better title would be ‘evidentialism with respect to belief in God’, but that’s a bit unwieldy). (2000: 70)

Plantinga, then, initially attempted to confront the evidentialist objection by merely pointing out its inconsistent nature. In more recent literature, however, Plantinga adopts a new, bolder approach in response to the evidentialist objection. He directly confronts the evidentialist by showing that it is motivated by a failed theory of justification—namely, classical foundationalism. Crucial to the argument, then, is the belief that the evidentialist objection arises from the influence of classical foundationalism. A detailed response to classical foundationalism is found in chapter 3 of Warranted Christian Belief. The idea presented in WCB is not that (1) is applied inconsistently, but that there is no good reason to think that (1) is true.

As well as this negative approach to challenging the evidentialist objection Plantinga also seeks to offer something more positive. In his book, Warrant and Proper Function, Plantinga seeks to offer an account of warrant—his term for whatever it is that makes the difference between true belief and knowledge. In Warranted Christian Belief Plantinga applies his account of warrant to religious belief and argues that there is no way to show that religious belief is not warranted without first assuming that it is false.

c. Nicholas Wolterstorff

Nicholas Wolterstorff’s defense of some of the central claims of reformed epistemology is perhaps less significant than the previous two figures that we looked at, but his contributions are certainly more wide reaching. His earliest contribution is his book Reason within the Bounds of Religion. In this book Wolterstorff is grappling with the question of how to be a Christian and a scholar and how one’s faith ought to relate to and impact upon one’s reasoning. Though we find no explicit formulation of reformed epistemology here, it is clear that he is attempting to develop a view in which religious beliefs are neither subordinate to nor independent of our other beliefs.

His most explicit contribution to reformed epistemology comes in the collection of essays that he edited with Alvin Plantinga called Faith and Rationality. In his paper entitled “Can belief in God be rational?” he considers what obligations rationality places upon us, and in particular whether rationality requires that we only believe in God on the basis of evidence. Wolterstorff argues that:

A person is rationally justified in believing a certain proposition which he does believe unless he has adequate reason to cease from believing it. Our beliefs are rational unless we have reason for refraining; they are not nonrational unless we have reason for believing. They are innocent until proved guilty, not guilty until proved innocent. (Wolterstorff 1983: 163)

He then turns to applying this to belief in God. He observes that people come to believe that God exists in a variety of ways such as from their parents, or in response to an overwhelming sense of guilt, or by finding peace in the midst of suicidal desperation. In many cases, belief in God seems to be immediate (that is, not based upon other beliefs) and so long as the person who forms the belief has no adequate reason to give up their belief then that belief will be rational.

More recent contributions from Wolterstorff come in his books Divine Discourse and Justice. In the former he is engaged in a philosophical discussion of the claim that God speaks, and in the latter, he is defending an account of human rights. Although these books are not about reformed epistemology they are informed by it. Wolterstorff is still engaged in showing how certain religious beliefs can be rational. Furthermore, Wolterstorff is clearly putting into practice some of the key claims of reformed epistemology. In Justice it is clear that Wolterstorff is seeking to show how some religious claims interact with the discussion of human rights—in doing this, Wolterstorff treats the religious claims as standing on equal footing with the non-religious claims. What this means in practice is that he does not attempt to justify religious claims on grounds acceptable to the non-religious, but neither does he treat religious claims as immune to criticism.

4. Evidence and Rational Belief in God

According to the reformed epistemologist, objections to the rationality of belief in God often revolve around the claim that belief in God lacks the appropriate evidence. In order to see this, we can, following Plantinga, identify two distinct types of objections—namely, the de facto and de jure objections. The de facto objection, historically anyway, is the form many religious objections traditionally take. That is, the religious skeptic often questions the reality or truth of the religious conviction before directly considering epistemic questions. De facto objections take many forms, with perhaps the problem of evil being the most well-known and discussed in philosophical literature. As the argument goes, a benevolent and omnipotent God cannot possibly exist given the amount of unnecessary or gratuitous evil.

In contrast to the de facto objection, there is an epistemic objection—or as Plantinga calls it, the de jure objection. The de jure objection ignores the ontological status of God’s existence and instead focuses on the justification and rationality of belief in God. The de jure objector asks whether belief in God is irrational, unjustifiable, or epistemically irresponsible. This objection comes in various forms as well. For some, belief in God is irrational as it is the result of some cognitive malfunction. Belief in God is so irrational, it is claimed, that it could have only been invented by mad, deluded people who base their belief on insufficient justification or argument. For others, this cognitive malfunction is akin to belief in Santa Claus and not the kind of belief an adult could justifiably believe in. Belief in Santa Claus, for which there is no evidence, is akin to belief in God, for which there is no evidence. No matter which line the de jure objector takes, what seems to unite these objectors is the idea that belief in God lacks the kind of epistemic justification necessary for rational belief. And for many de jure objectors there is the assumption, as Plantinga notes, that having a rational belief in God requires (propositional) evidence in order to have adequate epistemic support. Call this the evidentialist de jure objection. So what motivates the de jure objection, then, is the idea that belief in God both requires and lacks the appropriate evidence. The central claim of the evidentialist position is that one ought to believe only when one has the appropriate evidence. Thus if theism is indeed similar to belief in Santa Claus (for which there is no good evidence), then it seems that belief in God is indeed dubious and the nature of the evidentialist de jure objection becomes a bit clearer: belief in God is rational only if its justification depends on evidence. Theism, however, lacks the appropriate evidence and is therefore irrational.

What makes reformed epistemology unique here is the response that it gives to this criticism. The expected move would be to try to show that there is adequate evidence for theism. Instead, though, the reformed epistemologist rejects the evidentialist assumption (and on some accounts might even grant that there is insufficient inferential evidence). While there are perhaps several ways to get around the evidentialist assumption, the most well-known account is offered by Plantinga. Plantinga argues, for example, that the evidentialist assumption is undermined given that it is motivated by a failed theory of justification, namely classical foundationalism.

5. Classical Foundationalism

In order to undermine the evidentialist objection, reformed epistemologists have sought to argue against what they take to be the underlying epistemological view that motivates the objection. The view that they identify as playing this role they call Classical Foundationalism.

Classical Foundationalism holds that there are two kinds of belief: basic beliefs and non-basic beliefs. The basic beliefs are rational even when not held on the basis of other beliefs, whereas non-basic beliefs are only rational when supported by basic beliefs. The reason why classical foundationalism motivates the evidentialist objection against belief in God is because of the restrictions it puts on what can reasonably be a basic belief—on what is a properly basic belief.

According to the classical foundationalist, the only beliefs that are properly basic fall into one of the three following categories:

evident to the senses,

incorrigible, or

self-evident.

This means that any belief that does not fall into one of these categories can only be rational if it is supported by beliefs that do fall into these categories. With this framework in place it seems quite easy to formulate the evidentialist objection against belief in God. This is because belief in God does not seem to be evident to the senses, incorrigible, or self-evident. Given this, then, we can claim that belief in God is only rational if it is supported by adequate evidence, that is, by other beliefs that are evident to the senses, incorrigible, or self-evident.

It is possible to find historical examples of arguments along these lines. For example, here is J. L. Mackie discussing the rationality of belief in God:

If it is agreed that the central assertions of theism are literally meaningful, it must also be admitted that they are not directly verifiable. It follows then that any rational consideration of whether they are true or not will involve arguments… it [whether God exists] must be examined by either deductive or inductive reasoning or, if that yields no decision, by arguments to the best explanation; for in such a context nothing else can have any coherent bearing on the issue. (Mackie 1982: 4, 6)

Mackie is not alone in these demands. John Locke placed similar demands on religious belief by boldly claiming that those who do assent to (religious) belief without evidence “transgress against their own light” and disregard the very purpose of those faculties which are designed to evaluate the evidence necessary for belief.

The reformed epistemologist contends that this view has been the dominant one among theists and atheists alike, and so the question of whether or not belief in God is rational has focused on whether or not there is adequate evidence for that belief. It is for this reason that reformed epistemologists have seen their first task as being to show why classical foundationalism fails as an account of what it takes for a belief to be rational.

a. Rejecting Classical Foundationalism

The case for rejecting classical foundationalism rests on two key arguments. First, classical foundationalism classes a large number of beliefs that we typically take ourselves to know as irrational. Second, classical foundationalism is self-referentially incoherent.

The first problem raised against classical foundationalism is that it classes beliefs such as ‘the world has existed for more than five minutes’, ‘other persons exist’ and ‘humans can act freely’ as not properly basic. These beliefs, claims Plantinga, (along with a great many others) are accepted by the vast majority of rational humans; yet, the arguments for these beliefs are remarkably weak. Most people who believe these things can offer no arguments for their belief, and those who can, still seem to hold the belief with a greater degree of certainty than the argument would seem to warrant. Plantinga writes that the problem of other minds is to explain how it is that the very common belief that other humans have a mental life could be justified. Plantinga thinks that the best argument is the argument from analogy—that we observe that our own mental events such as being in pain are accompanied by certain behaviors, such as grasping the area where the pain is located, and then infer from this that when others are exhibiting similar behavior, they are also having the associated mental event. This inference from a single case hardly seems to justify the belief that there are other minds, but if it can be shown to be sufficient it would still be implausible to claim that only those who have knowledge of the argument are rational in their belief that other minds exist. This, perhaps, would not be so troubling if it were not the case that so many beliefs that do not meet the requirements set down by classical foundationalism are believed in a basic way by most rational humans. Anthony Kenny has pointed out that there are many beliefs that, although we can find some evidence for them, should not be thought of as being based upon that evidence because the evidence is believed with less strength than what it is evidence for. He suggests that the belief that Australia exists is just such a belief:

If any one of the ‘reasons’ for believing in Australia turned out to be false, even if all the considerations I could mention proved illusory, much less of my noetic structure would collapse than if it turned out that Australia did not exist. (Kenny 1983: 19)

The same goes for beliefs such as ‘I am awake’ or ‘human beings die’. If these beliefs can be rational only if they are based upon evidence, then classical foundationalism seems to suggest that we should hold many of our beliefs with much less certainty, and give up many other very strongly held beliefs.

Plantinga’s second objection is that classical foundationalism is self-referentially incoherent. Classical foundationalism itself is not self-evident, neither is it incorrigible, and it is certainly not evident to the senses. This means that if it is to meet its own standards there must be an argument for it from premises that are self-evident, incorrigible, or evident to the senses. No such argument presents itself, and it is certainly difficult to see where one would start, especially in light of some of the counterintuitive consequences of classical foundationalism highlighted above.

It’s worth noting here that not all reformed epistemologists think the connection between classical foundationalism and evidentialism is so obvious. There are two main lines of criticism that can be made to Plantinga’s arguments against classical foundationalism. The first is to question the link between classical foundationalism and the evidentialist objection, and the second is to claim that Plantinga has failed to show that classical foundationalism is an untenable position.

This first criticism can be found among Plantinga’s fellow reformed epistemologists:

[I]f [Plantinga] is saying that no one has explicitly presented [the evidentialist objection] as following from some other developed and articulated position that is probably true, but it remains to be shown that anyone has done that with respect to classical foundationalism either. But if the claim is that no other epistemological theory could plausibly serve as a reason for the evidentialist denial, that is palpably false. (Alston in Tomberlin and van Inwagen 1985: 296)

[Plantinga’s] discussion puts us in the position of seeing that the most common and powerful argument for evidentialism is classical foundationalism, and of seeing that classical foundationalism is unacceptable. But to deprive the evidentialist of his best defense is not yet to show that his contention is false. (Wolterstorff 1983: 142)

The criticism from Alston and Wolterstorff is that Plantinga has done nothing to persuade us that the evidentialist objection has no force; at best he has shown that no previous articulation of the objection is successful (supposing that it is correct that all previous versions of the argument rely on something very much like classical foundationalism).

The second response to Plantinga can again be found in Alston (Alston in Tomberlin and van Inwagen 1985: 296-299). Alston observes that Plantinga has not shown that the defender of classical foundationalism cannot argue for classical foundationalism from premises that are properly basic by her lights. Alston agrees that it is hard to see how this might be done but denies that this supports the conclusion that it cannot be done.

Plantinga’s critique of classical foundationalism noted above might be understood as a negative approach. The responses from Alston and Wolterstorff, then, are directed at this negative approach. Plantinga, however, also offers a different, more positive approach to the issue of proper basicality. He asks us to reconsider what might be classified as properly basic. Rather than select criteria, and then categorize our beliefs accordingly, we should amass examples of beliefs that we take to be properly basic, and the circumstances in which they are considered properly basic. After this process, Plantinga suggests that one could then propose criteria following reflection on these examples. Though, it’s important to keep in mind that not all of the example beliefs will qualify as genuinely properly basic (despite any initial appearances to the contrary).

But who is to decide the set of examples, and how do we weed out bad examples without any criteria? Plantinga deliberately gives no definitive answers to these questions. According to Plantinga, it is the responsibility of each community to decide what it considers to be properly basic and to take that as a starting point; there can then be an exchange between the examples and the criteria that they are used to justify, each refining the other. The claim is not that those beliefs that are held by one’s own community to be properly basic are properly basic; rather, the claim is that this is the best starting point for enquiry. It may be that your community has got it wrong about what beliefs are properly basic, but hopefully this will be revealed by further reflection.

According to the reformed epistemologist, there is no neutral starting point for philosophical enquiry, so it is up to each community to assess its own starting point, and take that as a defeasible foundation for inquiry. Communities are not free, however, to decide what beliefs are basic for them. What we believe is rarely within our own control; for example, one cannot simply decide to believe that the moon does not exist. This means that there is an objective fact about what each community does take as its starting point.

It might be objected that this is arbitrary, but Plantinga contends that there is no set of beliefs that will be entirely uncontroversial, and there are no criteria of proper basicality that are more convincing than the beliefs that most people take as properly basic. Or perhaps some will agree that this method is correct but still find it implausible that belief in God should be properly basic. In the case of perceptual beliefs the ground for them is obvious, even if how they are grounded is not clear. God, if he exists, is surely much more remote, and his existence is not the sort of thing that can be known in the basic way.

Plantinga responds by pointing out that, within the Reformed tradition at least, belief in God is considered to be grounded. According to John Calvin, one of the important figures in the Reformation, humans each have a natural tendency to believe that God exists when placed in certain circumstances; in fact, he claims that God “daily discloses himself in the whole workmanship of the universe” (Plantinga 2000: 66). Plantinga does not argue for the truth of such a position. Rather, he mentions it to show that his claim that belief in God can be properly basic is not ad hoc, but is in fact implicitly the view held by a large number of people, and by the Reformed tradition more specifically. It is not necessary that Plantinga know, or even have good reason to believe, the claims made by Calvin and others. As long as it is true that there are experiences that serve to ground belief in God, that belief will be properly basic on those occasions. It is due to this appeal to Reformed thinkers that this view has come to be known as reformed epistemology.

On the surface, reformed epistemology bears some similarity to fideism. Fideism is the claim that belief in God is not rational, but must be accepted upon faith; it is usually claimed that this belief is independent of reason, or in more extreme cases that it is opposed to reason. The reformed epistemologist will agree with the fideist that arguments are not needed to justify belief in God, but what about the relationship between reason and belief in God?

It is clear from what has already been discussed that the reformed epistemologist will not subscribe to the more extreme fideism because to believe what is properly basic is not to believe what is opposed to reason. What is, at first, less clear is whether to believe in God in the basic way is to believe independently of reason. Plantinga considers a distinction between reason and faith suggested by Abraham Kuyper (Plantinga 1983: 88), that the deliverances of reason are those beliefs that are based on argumentation and inference, whereas the deliverances of faith are beliefs that are held independently of argument and inference. On this understanding of faith, anything held in the basic way will be taken on faith. For example, this definition would suggest that 2+1=3, external objects exist and I am awake, are all held on faith. This is not the understanding of faith that the fideist has in mind, since it does not serve to draw a distinction between faith and reason. Plantinga explains that there is no reason for the reformed epistemologist to think that belief in God is independent of, or opposed to, reason:

Belief in the existence of God is in the same boat as belief in other minds, the past, and perceptual objects; in each case God has so constructed us that in the right circumstances we form the belief in question. But then the belief that there is such a person as God is as much among the deliverances of reason as other beliefs. (Plantinga 1983: 90)

Reformed epistemologists, unlike fideists, hold that religious belief is rational, but unlike the evidentialist, they deny that this rationality is due to the beliefs being based upon evidence.

6. The Positive Case in Reformed Epistemology

So far, much of what has been said here has focused on undermining a certain sort of objection to the rationality of religious belief. The second significant strand of reformed epistemology concerns providing a description of the way in which religious beliefs can be rational.

a. The Christian Mystical Practice

In Perceiving God William Alston seeks to describe and defend what he calls the Christian Mystical Practice (CMP). This is the practice of forming beliefs about God in response to certain kinds of experiences.

Alston first argues that there is no non-question-begging way to show that any basic belief forming practice is reliable; one will always have to appeal to the practice itself. In light of this we cannot require that belief forming practices enjoy independent support before we engage in them because this support will never be available. It may be that some practices can be ruled out due to being inconsistent, but no adequate reason can be found for thinking that any of our basic belief forming practices are reliable.

Instead Alston argues that it is reasonable to accept socially established practices; those practices that have demonstrated stability over a number of generations and which are deeply embedded in our psyche. Such practices provide prima facie justification for the beliefs that they produce. Furthermore, if these practices are not shown to be unreliable then the beliefs that result from them are rational.

Alston claims that CMP is one of these practices. Christians have been forming beliefs in this way for centuries, and the practice is deeply embedded in the culture. This means that engaging in the practice is prima facie justified. And as long as there are no adequate reasons for thinking that CMP is unreliable then the beliefs that result from this practice will be justified.

Alston goes on to argue that many of the reasons for thinking that CMP is unreliable exhibit one or both of two flaws: imperialism and double standards. The objection that CMP must be unreliable because most normal adults do not practice it is, Alston argues, guilty of imperialism. It imposes a standard on CMP that requires it to be more like the Sense-perceptual Practice (SP) for no good epistemic reason. Why should we expect practices that are used by the whole population to be the only ones that are reliable? An example of an objection that imposes a double standard would be requiring that the outputs of CMP be independently verifiable. Alston argues that no basic belief forming practice meets this requirement, including SP, so requiring something like this of CMP is to apply a standard that one would not apply across the board.

b. The Parity Argument

The beginnings of the parity argument can be seen in Plantinga’s early writings as far back as God and Other Minds. There, Plantinga argues that belief in other minds and belief in God are in the same epistemological dilemma; all of the arguments in their favor fall short when it comes to philosophical scrutiny. Yet, as Plantinga states, “if belief in other minds is rational, so is my belief in God. But obviously the former is rational; so, therefore, is the latter.” As Plantinga’s thinking has developed, so has his parity argument as it relates to rational belief in God. The key difference in his thinking, as he notes in Warranted Christian Belief, is that he no longer takes proofs as the only way to justify belief in God. This major shift in Plantinga’s thinking opens the door for a more daring parity argument, namely that in the same way that perceptual experiences are justified, belief in God—through the divine sense—is also justified and should thus enjoy the same epistemic status as ordinary perceptual experiences.

Plantinga’s parity argument for rational belief in God follows a specific pattern. The first goal is to highlight those beliefs that we take to be both rational and basic. In other words, it needs to be the kind of belief that is rational despite not being inferred from any evidence or argument. Further, it must be the sort of belief that if held hostage to evidential demands it would have devastating epistemological results; perceptual beliefs, it is thought, are specifically what Plantinga is looking for. Consider for example the belief that I see a clock hanging on the wall. It would be difficult to present any non-circular or non-question begging evidence to justify my belief. Yet, this is what the evidentialist demands. So if we can disregard the demands of the evidentialist in the case of perceptual beliefs, then perhaps the demands the evidentialist places on belief in God should be reconsidered as well; neither can produce the required (non-question begging) evidence, but surely in the case of our perceptual beliefs it can’t be said that we as agents are unjustified, epistemically irresponsible, or irrational in our belief. This of course raises further questions about evidential demands. This, then, is the first parallel that Plantinga and other reformed epistemologists make. The second parallel deals with the similarities between perceptual and religious experiences.

Perceptual beliefs arise from some perceptual experience; the belief arises suddenly with the cognizer having no control over the initial belief. The perceptual belief that arises from the experience is prima facie justified. Thomas Reid, whose influence on reformed epistemology is of note, argued that what we perceive is not “only irresistible, but it is immediate; that is, it is not by train of reasoning and argumentation that we come to be convinced of the existence of what we perceive.” Perceptual beliefs, according to Reid, are not inferred but immediately known by the perceiver. The parallels between perceptual beliefs and belief in God, on Plantinga’s account anyway, are important. The idea is that belief in God and perceptual beliefs are both immediate and the result of our cognitive faculties. Thus, if some perceptual belief like “I see a tree” is prima facie justified, then belief in God, if it arises in the same manner (for example, the result of some cognitive faculty), is also prima facie justified.

So what is this special faculty that gives rise to belief in God in an immediate non-inferential fashion? Plantinga uses a term that is well known in the reformed tradition: the sensus divinitatis. Calvin, whom Plantinga credits with the sensus divinitatis, claimed that one can accept and know that God exists without any argument or evidence. As a result of the workings of the sensus divinitatis, belief in God is properly basic and is not inferred from any evidence or argument. Plantinga’s position is summed up nicely here:

Calvin’s claim, then, is that God has created us in such a way that we have a strong tendency or inclination toward belief in him. This tendency has been in part overlaid or suppressed by sin. Were it not for the existence of sin in the world, human beings would believe in God to the same degree and with the same natural spontaneity that we believe in the existence of other persons, an external world, or the past. This is the natural human condition; it is because of our presently unnatural sinful condition that many find belief in God difficult or absurd. The fact is, Calvin thinks, one who does not believe in God is in an epistemically substandard position—rather like a man who does not believe that his wife exists, or thinks she is likely a cleverly constructed robot and has no thoughts, feelings, or consciousness. Although this belief in God is partially suppressed, it is nonetheless universally present. (Plantinga 1983: 66)

From this, Plantinga concludes that “there is a kind of faculty or cognitive mechanism, what Calvin calls sensus divinitatis or a sense of divinity, which in a wide variety of circumstances produces in us beliefs about God.” So in the same way that perceptual beliefs such as “I see a table” are non-inferential and properly basic, belief in God, when occasioned by the appropriate circumstances (such as one feeling a sense of guilt, dependence, beauty, and so forth), can also be properly basic because of the cognitive working of the sensus divinitatis.

On Plantinga’s reformed account then, belief in God can now be added to the list of properly basic beliefs:

I see a tree (known perceptually),
I am in pain (known introspectively),
I had breakfast this morning (known through memory), and
God exists (known through the sensus divinitatis).

This belief can be taken as properly basic if the agent’s belief has sufficient warrant.

There is another important question to be asked, however. Does it follow from this that belief in God is groundless? If I come to believe in God on the reformed model, can it be said that my belief is groundless? Plantinga argues that in the same way that “I see a tree” is properly basic but not groundless, belief in God is not groundless. Understanding what Plantinga means by “groundless” is important in realizing the distinction between evidence and grounds for belief. Perceptual experiences, such as those caused by visual experiences, are not considered to be groundless because of their reliance on the senses. Likewise, Plantinga claims that belief in God is not groundless, because it is rooted in the experience of the sensus divinitatis. These experiences, however, do not entail that the belief in question is inferential. The belief is merely occasioned by the circumstance (for example, the circumstance of beholding some majestic mountains or desert sunset) which triggers the working of the sensus divinitatis. Those who believe in God simply find themselves with this belief.

Another important point concerns defeaters against belief in God. Plantinga argues that while belief in God is properly basic, it is also open to defeat. Suppose that someone offers a defeater for the belief that God exists; then, claims Plantinga, that particular belief would have to be abandoned. It is possible however, for one to offer a defeater-defeater, which would obviously entail the belief being justifiably maintained. This is an important point in that we can now see that a properly basic belief, for Plantinga, is not some incorrigible or indubitable belief that one can always believe despite defeating evidence. It is, in other words, properly basic but open to defeat.

c. Warranted Christian Belief

Alvin Plantinga has developed an important account of how religious belief could amount to knowledge. This view is discussed in his trilogy: Warrant: The Current Debate, Warrant and Proper Function, and finally, Warranted Christian Belief. In this Warrant trilogy, Plantinga is interested in the question “What is knowledge?”, and more specifically in what it is that makes the difference between mere true belief and knowledge. He calls this, whatever it is, warrant.

Warrant is just one of a number of epistemic terms that are used in epistemology; others include justification, rationality and evidence. Warrant is of particular importance, however, because if we can answer the question “What is warrant?” then we will have an answer to the question “What is knowledge?”

Plantinga argues that warrant results from the proper functioning of your cognitive faculties:

[A] belief has warrant for me only if (1) it has been produced in me by cognitive faculties that are working properly (functioning as they ought to, subject to no cognitive dysfunction) in a cognitive environment that is appropriate for my kinds of cognitive faculties, (2) the segment of the design plan governing the production of that belief is aimed at the production of true beliefs, and (3) there is a high statistical probability that a belief produced under those conditions will be true. (Plantinga 1993: 46-47)

Key to Plantinga’s analysis of warrant is that a belief can only be warranted if it is produced by a cognitive faculty that is functioning properly, which means that it must not be diseased or broken or hindered. In order to make sense of what it means for our cognitive faculties to be functioning properly we must introduce the notion of a design plan, which determines the way our cognitive faculties are supposed to work. Just as the human heart is supposed to beat at 50-80 beats per minute while at rest, so too, there is a way that our cognitive faculties are supposed to function. This, claims Plantinga, should not be thought to necessarily invoke the notion of conscious design (by God, or anyone else); rather, he means to invoke the common idea, shared by many theists and non-theists, that parts of our bodies have a function, such as one of the functions of our legs being to allow us to move through our environment.

As well as having cognitive faculties that are functioning properly, those faculties must also be operating in the right cognitive environment, namely the one for which they are designed. This means that one might have warrant for a perceptual belief that is formed about a nearby medium sized object on a clear day, but not for a perceptual belief about a far-away object in a badly lit, smoke-filled room. It must also be that the part of the design plan governing the production of the belief in question is aimed at truth. Our faculties are designed for a number of different purposes, not just the production of true beliefs, which means that there may be times when our cognitive faculties are functioning properly in the correct environment, and yet produce a false belief, or a belief that is only accidentally true. For example, it may be the case that when a person discovers that they have a life-threatening illness they are designed in such a way that they will come to believe that they will recover, even if this is unlikely to be true; this may perhaps be the case because one is more likely to recover if one believes that one will. That would be a case of cognitive faculties functioning properly in the correct environment, but not a case of the belief being warranted because the design plan, in this instance, does not aim at truth.

The final requirement is that there is a high statistical probability that a belief that is produced by the cognitive faculty in question is likely to be true when it is functioning properly in the environment for which it was designed—which is to say that the design must be a good one. Plantinga imagines a situation in which our faculties have been designed by some lesser deity, and that this deity has done such a poor job, that even when our faculties are functioning properly, in the correct environment, according to a design plan that is aimed at truth, we still form mostly false beliefs because the design is so poor. If this was the case then our beliefs would not have warrant, even in cases where they did turn out to be true. For this reason a reliability condition is required as well.
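
The conditions discussed above can be gathered into a single schema. What follows is only an illustrative sketch: the predicate names PF, Env, and Aim and the threshold θ are notation introduced here for convenience, not Plantinga’s own, and the arrow marks a necessary condition, matching the “only if” in the quoted passage.

```latex
% W_S(b):   belief b has warrant for subject S
% PF_S(b):  the faculties that produced b in S are functioning properly
% Env_S(b): b was produced in an environment appropriate for those faculties
% Aim_S(b): the relevant segment of the design plan is aimed at true belief
% theta:    an unspecified "high" threshold (an assumption of this sketch)
\[
W_S(b) \;\rightarrow\;
\mathrm{PF}_S(b) \,\wedge\, \mathrm{Env}_S(b) \,\wedge\, \mathrm{Aim}_S(b)
\,\wedge\,
\Pr\bigl(\text{true} \mid \mathrm{PF},\ \mathrm{Env},\ \mathrm{Aim}\bigr) \ge \theta
\]
```

On this reading, the optimistic patient fails the Aim clause even though PF and Env hold, and the beliefs of creatures designed by the incompetent lesser deity satisfy the first three clauses but fail the probability clause.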

One important point to note is that Plantinga’s account is an externalist one. This means that, on Plantinga’s view, warrant involves, not just facts that the agent is aware of, but also facts that the agent may not be aware of; such as, for example, whether one’s faculties are functioning properly and facts about the environment. This point is crucial to Plantinga’s account given that whether or not a theist has warrant for her religious beliefs may depend on facts that she is unaware of.

Plantinga claims that given this view in epistemology there is no good reason to think that religious belief is not warranted. Plantinga claims that, following John Calvin, we may have been created by God with a faculty called the sensus divinitatis. Any beliefs that result from this faculty will be in a position to be warranted. So long as the faculty was designed by God for the purpose of producing true beliefs about him then this faculty will meet the requirements described above and the resulting beliefs will be warranted.

It is not Plantinga’s intention to show that this faculty exists or that this really is the way that religious beliefs come about. Instead his claim is that since this is true for all we know then one cannot reasonably claim that religious beliefs are not rational without first showing that this account is false.

7. Objections to Reformed Epistemology

Reformed epistemology has received a significant amount of attention and attracted many objections. Some of the most significant ones are described below.

a. Great Pumpkin Objection

There is a family of objections known as Great Pumpkin objections. These objections get their name from the Peanuts comic strip. In Peanuts the character Linus is a child who believes that each Halloween the Great Pumpkin will come to visit him at the pumpkin patch. What these objections have in common is that they claim that, if reformed epistemology is correct, then belief in God is no more rational than belief in the Great Pumpkin.

This kind of objection is first mentioned by Plantinga in “Reason and Belief in God” (74-78). One of the claims of reformed epistemology is that the religious believer need not offer any criteria for deciding which beliefs are reasonable starting points for forming further beliefs. Instead each community is responsible for determining its own starting points and reasoning on that basis. Plantinga supposes that someone might object to this by claiming that this method means that the community in question will have no reason to accept any belief over any other. This community could take belief in God to be properly basic, but they might instead take the belief that the earth is flat or that I can run at the speed of light if I try really hard, or the belief that the Great Pumpkin will return at Halloween to the most deserving pumpkin patches. There is no reason, so the objection goes, to choose one belief over another without first offering some criteria for determining which beliefs are rational starting points and which are not.

Plantinga points out that in other areas we are able to discriminate between two things even if we are not able to give criteria for how that discrimination is to be done. The example he gives is the meaningfulness of sentences. Plantinga observes that we can easily tell that the sentence “T’was brillig; and the slithy toves did gyre and gymble in the wabe” is meaningless even if we cannot appeal to some general criteria of meaning. Likewise, claims Plantinga, there is no reason to think that something similar will not be possible for beliefs. This example shows that there is nothing mysterious about the suggestion that we might be able to tell which candidates belong to a certain class, and which do not, without also being able to state criteria for inclusion. For these reasons this objection need not trouble the reformed epistemologist.

Michael Martin offers a more troubling version of the argument. He does not label his objection as a Great Pumpkin objection, but Plantinga refers to it as the Son of the Great Pumpkin objection. Here is how Martin phrases the objection:

Although reformed epistemologists would not have to accept voodoo beliefs as rational, voodoo followers would be able to claim that insofar as they are basic in the voodoo community they are rational and, moreover, that reformed thought was irrational in this community. Indeed, Plantinga’s proposal would generate many different communities that could legitimately claim that their basic beliefs are rational. (Martin 1990: 272)

This second objection concerns whether or not a community can make judgments about the basic beliefs of other communities in a principled way. They may be able to argue that the believers in some other community are not justified in holding some of their non-basic beliefs, because they are not adequately supported by their basic beliefs, but since the basic beliefs are not supported by other beliefs, there seems to be no way for those outside the community to criticize them. If this is correct, it is a very strange and counter-intuitive result. There are various beliefs that we think are objectionable, even if they are held in the basic way; for example, belief that the Great Pumpkin will return every Halloween, that the Earth is flat and the claims of astrology all seem to be objectionable from the epistemic point of view, whether or not they are held in the basic way.

The reformed epistemologist regards the process of assembling examples of properly basic beliefs to be the responsibility of each community, and so, it would seem, at least at first, that she is committed to a sort of epistemic relativism whereby the most one can do to criticize the beliefs of a person from a different community is to point out internal inconsistencies. This wouldn’t necessarily be a major problem, except for the fact that the sorts of communities that seem to be included are ones that hold bizarre, irrational or superstitious beliefs—beliefs like astrology, voodoo or perhaps even the Great Pumpkin belief.

The reformed epistemologist can respond to this objection by pointing out that one could challenge the basic beliefs of another community by finding a defeater. Our basic beliefs are defeasible, and therefore open to revision in light of further information. This means that just because you are permitted to treat a belief as properly basic if it seems to you that it is, it does not follow that you will continue to be permitted to hold that belief no matter what. You may gain a defeater for that belief and come to believe that it is no longer true. A person may be justified in taking a belief such as the Great Pumpkin belief as basic if she has been raised to believe that the Great Pumpkin exists, but when she comes to learn more about the world—for example, when, yet again, the Great Pumpkin fails to arrive on Halloween—she will obtain a defeater for that belief, and it will no longer be reasonable for her to hold that belief.

The reformed epistemologist is therefore not endorsing an epistemic free-for-all, since just because a belief is basic does not mean that it is immune to epistemic appraisal. It is still perfectly possible for anyone to argue against the basic beliefs of another community, and to show them that one of their beliefs is false or unjustified.

The third, and final, version of this objection claims that reformed epistemology places belief in God beyond epistemic appraisal and that its methods could be adapted to place other beliefs beyond epistemic appraisal—beliefs that are clearly irrational like belief in the Great Pumpkin. If the methods of reformed epistemology can be used to defend beliefs like these then it cannot be successful in establishing the rationality of religious belief.

Linda Zagzebski has offered an objection like this one. She claims that reformed epistemology has failed to meet the requirements of what she calls the “Rational Recognition Principle (RRP): If a belief is rational, its rationality is recognizable, in principle, by rational persons in other cultures” (Zagzebski in Plantinga et al. 2002: 120). Zagzebski directs her objection against Plantinga and writes that reformed epistemology

violates the Rational Recognition principle. It does not permit a rational observer outside the community of believers in the model to distinguish between Plantinga’s model and the beliefs of any group, no matter how irrational and bizarre—sun-worshippers, cult followers, devotées of the Greek gods . . . , assuming, of course, that they are clever enough to build their own epistemic doctrines into their models in a parallel fashion. But we do think that there are differences in the rationality of the beliefs of a cult and Christian beliefs, even if the cult is able to produce an exactly parallel argument for a conditional proposition to the effect that the beliefs of the cult are rational if true. Hence, the rationality of such beliefs must depend upon something other than their truth. (Zagzebski in Plantinga et al. 2002: 122)

A similar objection is offered by Keith DeRose in his unpublished essay “Voodoo Epistemology.” DeRose argues that the real worry for reformed epistemology is that it could be adapted to defend some very strange and clearly irrational beliefs. This, claims DeRose, shows that there is something wrong with reformed epistemology even if we cannot say exactly what it is.

This objection is not completely devastating for reformed epistemology but it does make the achievements of reformed epistemology look much less significant. Work in this area by Kyle Scott (2014) has suggested that we ought to consider the historical and social environments that beliefs occur in, arguing that only beliefs that occur in stable and enduring communities are viable candidates for being defended in the way that reformed epistemologists defend religious belief.

b. Disanalogies

An important claim made by reformed epistemology is that religious belief can be rationally held in the basic way, similar to perceptual beliefs. An objection to this is that it cannot be reasonable to hold religious beliefs in the basic way because of significant differences between perceptual beliefs and religious beliefs. The objection has been most forcefully put by Richard Grigg (1983). He does not think that theistic beliefs will turn out to be basic because of the disanalogies between theistic beliefs and more widely recognized basic beliefs.

Grigg interprets reformed epistemology as arguing that the Christian community is within its epistemic rights in holding that certain theistic beliefs are basic because these beliefs are analogous to other beliefs that are more widely regarded to be basic. Examples of these include: (1) I see a tree, (2) I had breakfast this morning, and (3) That person is angry. Grigg identifies three important disanalogies between these beliefs and theistic beliefs.

Firstly, Grigg points out that although beliefs such as (1)-(3) will often be basic, they are still constantly being confirmed:

For example, when I return home this evening, I will see some dirty dishes sitting in my sink, one less egg in my refrigerator than was there yesterday, etc. This is not to say that (2) is believed because of evidence. Rather, it is a basic belief grounded immediately by memory. But one of the reasons that I take such memory beliefs as properly basic is that my memory is almost always subsequently confirmed by empirical evidence. (Grigg 1983: 126)

This, on the other hand, is not true of theistic belief. Beliefs, such as that God created the world, Grigg suggests, are not confirmed by observation, and may even be disconfirmed if the problem of evil is a successful argument.

The second disanalogy is that there is a certain universality enjoyed by beliefs such as (1)-(3), but not by theistic beliefs. That is, when a person has a perceptual experience such as being appeared to treely, they will naturally believe something like “I see a tree”; and this is the case, claims Grigg, for the vast majority of people. The situation is not the same for theistic beliefs; take, for example, Plantinga’s suggestion that one might have an experience of being awed by the beauty of the universe and form the belief that God created the universe. Grigg claims that many people have this experience yet there is no universally shared belief that typically comes with this experience, unlike in the case of perceptual beliefs.

The third, and final, disanalogy that Grigg raises is that people have a bias towards theistic beliefs, but not usually with less controversial examples of properly basic beliefs. Grigg points out that there is a psychological benefit to be gained from believing that God exists, whereas, there will not usually be any obvious benefit for beliefs like (1)-(3).

Each of these disanalogies can be challenged. Mark Macleod points out that it is not obvious that these are genuine disanalogies. For example, religious beliefs may receive confirmation from multiple sources such as sacred writings, the testimony of other believers and further religious experiences. Although these sources are not independent of each other it is not clear that the experiences in the breakfast example above are independent either since all the supporting evidence relies on perceptual experience at some point.

The second disanalogy is problematic as well because when a person has an experience of seeing a tree they may form a wide variety of beliefs such as “I see a tree” or “that tree is about to fall over” or “it is very windy today”. Contrary to what Grigg argues, the beliefs that are formed in response to perceptual experiences are not uniform.

The third disanalogy is also not clearly a genuine disanalogy. I may derive psychological benefit from many of my perceptual beliefs such as believing that the computer screen is showing a positive number next to my bank account.

Even if the case for disanalogies between perceptual experience and religious experience can be made, this may not be a problem for reformed epistemology. Reformed epistemology should not be understood as relying on the claim that religious experience is just like perceptual experience. Rather, what reformed epistemologists have been arguing is that we ought to judge religious experience by the same standards by which we judge perceptual experience, and that religious experience stands up well when judged by those standards. Given the difference in subject matter and the alleged faculties involved, it should not be surprising to find disanalogies between religious experience and perceptual experience. To develop any disanalogies into an objection to reformed epistemology it must also be shown that the disanalogies are sufficient to show that such beliefs are not rational unless supported by further evidence.

c. Religious Diversity

According to reformed epistemology religious belief can be rational even if it is not supported by evidence. What reformed epistemologists do not claim is that these beliefs will be immune to defeat. It may be that a person’s religious beliefs are initially rational, but when they discover some new piece of information they cease to be. Some have suggested that, even if reformed epistemology is correct, there is a defeater for religious belief that ought to be apparent to most competent adults in the world today. This defeater comes from considering the facts of religious diversity. In this section we will consider two attempts to advance this sort of objection.

i. Religious Belief is Epistemically Arbitrary

Suppose, for the sake of argument at least, that all of the major religions might be equally well supported by arguments and that their adherents might all have the same sort of internally available markers for their beliefs. The scenario would be one where whatever the theist can offer in support of her beliefs, those who disagree can offer the same considerations. For example, suppose that Anne believes p and Bill believes ¬p, and that whatever evidence or arguments Anne can offer in support of p Bill can offer equally good evidence and arguments in support of ¬p. Suppose further that their beliefs are alike in all other respects, so that if Anne finds p intuitive, Bill finds ¬p intuitive; or if Anne takes p as foundational Bill takes ¬p as foundational; and so on for any other considerations that might be epistemically relevant. John Hick claims that if this is the case then it is intellectually arbitrary for the religious believer to hold that her own beliefs are true while those of other religions are false because she has no reason to treat the beliefs differently.

Richard Feldman also offers a similar objection by arguing for the following principle:

If (i) S has some good reasons (‘internal markers’) to believe P, but (ii) also knows that other people have equally good reasons (‘internal markers’) for believing things incompatible with P, and (iii) S has no reason to discount their reasons and favor her own, then S is not justified in believing P. (Feldman 2003: 88)

This principle states that even if you have good reasons for believing p, if you know that others have equally good reasons for believing something incompatible with p, and you have no reason to discount their reason then you are not justified in accepting p. This is because, claims Feldman, learning that others have equally good reasons for their incompatible beliefs undercuts your justification for p.
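
Feldman’s principle can also be set out as a conditional schema. The rendering below is a rough sketch in notation introduced here purely for illustration; R, E, Inc, D, K, and J are not Feldman’s symbols.

```latex
% R_S(P):       S has good reasons ("internal markers") for believing P
% Inc(P,Q):     P and Q are incompatible
% E(T,Q;S,P):   T's reasons for Q are as good as S's reasons for P
% K_S(...):     S knows that ...
% D_S:          S has a reason to discount the others' reasons and favor her own
% J_S(P):       S is justified in believing P
\[
\Bigl[\, R_S(P)
\;\wedge\; K_S\bigl(\exists T\,\exists Q\,[\mathrm{Inc}(P,Q) \wedge E(T,Q;S,P)]\bigr)
\;\wedge\; \neg D_S \,\Bigr]
\;\rightarrow\; \neg J_S(P)
\]
```

Read this way, the principle withdraws justification only when all three clauses of the antecedent hold together.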

Alvin Plantinga has responded to this objection by trying to show that there is nothing inconsistent about holding onto your beliefs in the face of disagreement—even in the circumstances described above.

His first point is that the internal support that a belief enjoys does not exhaust everything that can be said about the epistemic status of a belief. Two beliefs can have all the same “internal markers” and yet still not be equal from the epistemic point of view. Other relevant features include whether or not the faculty that produced the belief is functioning properly, and whether or not the belief was produced in an environment for which the faculty was designed. Furthermore, one does not need to endorse Plantinga’s epistemology in order to agree with this point. Others have suggested that external factors are relevant to the epistemic standing of a belief; such as reliability of the source of the belief, whether the belief is safe or whether the belief is sensitive. What this means is that there is no inconsistency in thinking that two incompatible beliefs are alike in purely internal support and yet for us to treat them differently. This is a very modest claim and supplies no reason to think that judging two such beliefs differently in the sorts of cases described can be justified, only that it is not contradictory to do so. This point is supposed to lay the basis for his following two points.

The second point is that if disagreement is a defeater then it would defeat too many beliefs. Plantinga labels it a “philosophical tar baby,” claiming that it would be a problem not just for him, but for his objectors as well. This is because whatever position one adopts in this debate there will be others who disagree. The Christian will believe certain claims knowing that others in similar epistemic situations disagree, as will the Hindu or the Muslim. An atheist or a pluralist will be in no better a situation since she will think that the claims of these religions are false, and know that there are others who disagree. Plantinga does not think that withholding belief avoids the problem either, since if one withholds belief there will still be disagreement concerning whether or not withholding belief is the correct epistemic attitude to adopt. This worry extends to other areas as well, such as politics and philosophy, where there is also widespread disagreement. What this is supposed to show is that claiming that disagreement is a defeater has potentially disastrous consequences, leading to a sort of skepticism. This, of course, does not show that it is wrong that disagreement defeats belief; it is only meant to show that the problem is one for everyone, not solely for the religious believer.

Plantinga’s third point is offered by way of a thought experiment:

Perhaps you have always believed it deeply wrong for a counselor to use his position of trust to seduce a client. Perhaps you discover that others disagree; they think it more like a minor peccadillo, like running a red light when there’s no traffic; and you realize that possibly these people have the same internal markers for their beliefs that you have for yours. You think the matter over more fully, imaginatively recreate and rehearse such situations, become more aware of just what is involved in such a situation (the breach of trust, the breaking of implied promises, the injustice and unfairness, the nasty irony of the situation in which someone comes to a counsellor seeking help but receives only hurt) and come to believe even more firmly the belief that such an action is wrong… (Plantinga 2012: 653)

Plantinga claims that in moral cases, such as this one, it is clear that it is reasonable to continue believing in the face of disagreement even when you believe that those who disagree enjoy the same internal markers as yourself. If it is reasonable in this case to continue to hold on to your beliefs then it cannot be true in general that one is required to give up beliefs in the face of disagreement.

Plantinga thinks that these three considerations are sufficient to defuse the charge of arbitrariness. His claim is that if we endorse something like Feldman’s principle above then we will be forced to give up many of our beliefs (possibly including beliefs about the principle itself), and in particular that this does not fit with our intuitions about what it is rational to do in the case of moral disagreements like the one Plantinga describes above.

These responses do something to help neutralize the arbitrariness charge but they do not adequately deal with it. What Plantinga has achieved is to show that we cannot always be rationally required to give up our beliefs in the face of disagreement. But that is not sufficient to respond to the problem because there are examples where it does seem arbitrary to hold on to your belief. An example often discussed in the literature is the restaurant case.

Suppose that Anne and Bill are in a restaurant with friends. The time comes to pay the bill and they both decide to figure out how much everyone owes. Anne believes that everyone owes $23, but Bill believes everyone owes $24. Each considers the other to be just as good at mental arithmetic and they have no reason to suspect that one of them is impaired on this occasion. In this example it seems clear that it would be irrational for Anne to hold on to her belief that everyone owes $23 even if it turns out that she is correct. She seems to have no good reason to prefer her own belief other than that it is her own.

What this suggests is that it cannot be either that disagreement always requires us to revise our beliefs or that it never requires us to revise our beliefs. What is needed is a more sophisticated epistemology of disagreement that lies somewhere between these two extremes. But Plantinga has given us no reason to think that religious beliefs will remain rational in the face of disagreement under this more reasonable epistemology of disagreement. What is needed here is a better understanding of the epistemic implications of disagreement and how that relates to religious disagreement. Fortunately, there is an active debate on this topic and it is likely that one’s opinion on that debate will determine whether or not one believes that this is a successful objection.

ii. Competing Belief Forming Practices

One of the central claims of reformed epistemology is that what determines whether religious belief is rational is not the evidence that a believer can present, but facts about the faculty that produced the belief. The facts of religious diversity offer a way to mount an argument that concludes that we have good reason to think that the faculty that produces religious belief is unreliable.

Before looking at a serious version of this argument it will be instructive to look at a naïve version of the argument and why it fails. This version of the argument observes the wide variety of religious beliefs in the world and notes that many of them contradict each other. Given this disagreement it seems clear that religious belief forming methods are unreliable because, even if some of the beliefs are correct, most of them must be false. This objection is not too troubling since it assumes that there is a single religious belief forming practice. That is, however, implausible. There are significant differences in the practices of different religious practitioners, so the diversity of belief is not evidence that all religious belief forming practices are unreliable.

This objection can be developed further by observing that when it comes to religious matters there are competing methods. These competing methods frequently produce contradictory beliefs. At most, one of these methods can be reliable, but if we have no independent (that is, independent of religious belief forming methods) reason to prefer one over the others then we ought to refrain from engaging in any of them.

William Alston raises this objection against his own view. He compares it to the following situation:

Consider ways of predicting the weather: various ‘scientific’ meteorological approaches, going by the state of rheumatism in one’s joints, and observing groundhogs. Again, if one employs one of these methods but has no non-question-begging reason for supposing that method to be more reliable than the others, then one has no sufficient rational basis for reposing confidence in its outputs. (Alston 1991: 271)

It seems clear, when it comes to choosing between methods for predicting the weather, that if we have several competing methods we ought not accept any of them until we find some reason to prefer one over the other.

Alston responds to this objection by pointing out that there is an important difference between the religious case and the weather prediction case. When it comes to predicting the weather we know what sort of evidence we would need to choose between these methods—we can observe which one is getting it right. Things are different for the religious case because we do not know what reasons we could have for choosing one of these methods over another. The methods in question in the religious case are our only access to the topic—independently of these methods it is difficult to see what reasons we could have for preferring one over another. In light of this, Alston suggests that one cannot be faulted for lacking reasons to prefer one’s own religious belief forming methods.

d. Sensible Evidentialism

One of the central claims of reformed epistemology is that evidentialism with respect to belief in God is misguided. Stephen Wykstra argues that reformed epistemologists (or basicalists, as he calls them) have poorly framed the debate between themselves and evidentialists. He has sought to relocate the debate about the proper basicality of belief in God by contrasting reformed epistemology not with what he calls Extravagant Evidentialism (EE) but with Sensible Evidentialism (SE).

EE is the claim that a person’s belief is only rational if it is either basic, or that person can present propositional evidence for their belief. If we use this to define basic and non-basic beliefs then beliefs that arise from testimony or memory will often be basic. Since these beliefs are basic and belief in God often derives from memory or testimony, then in most cases the EE Objection to belief in God will not amount to much.

Wykstra, however, claims that EE is not the best way to understand the notion of needing evidence. He highlights this by using the example of belief in electrons. Most adults believe in electrons, but very few hold this belief on the basis of evidence. Most of us believe in electrons because we have been told that they exist by scientists, or teachers or some other knowledgeable person. According to the reformed epistemologist this belief will often be basic, and so it will be immune to the evidentialist objection. This is only true if we understand evidentialism as a demand that evidence be produced for each belief by the believer. This fails to take into account that, although the believer in electrons need not be able to produce evidence, the belief is still in some sense in need of evidence. Wykstra asks us to consider the following possible situation:

Suppose we were to discover that no evidential case is available for electrons—say, that the entire presumed case for electrons was a fraud propagated by clever con-men in Copenhagen in the 1920s. Would we, in this event, shrug our shoulders and continue unvexedly believing in electrons? Hardly. We would instead regard our electron belief as being in jeopardy, in epistemic hot water, in (let us put it) big doxastic trouble. (Wykstra 1989: 485)

The electron belief may not need evidence to be rational in an individualistic sense, but evidence must be available somewhere in the community. The testimony is defective if it does not connect you to a person, or persons, who do have evidence for the existence of electrons. This is what Wykstra refers to as a much more sensible way of construing the notion of needing evidence. EE requires that evidence is possessed by the individual, whereas SE requires that the evidence is possessed by the believer’s community.

SE gives us a much more plausible evidentialist objection to belief in God. The sensible evidentialist constraint will be that belief in God is only epistemically adequate if the religious community has sufficient evidence for the belief that God exists. The “interesting basicalist” will then be someone who claims that belief in God is not in need of evidence even in this sense; that belief in God is based upon our native faculties. Wykstra observes that even if belief in God is derived from some God-given faculty it may still be the case that belief in God is in need of evidence. Belief in electrons is in need of evidence because our native faculties do not give us access to them, but beliefs based upon our native faculties, such as testimony, are also sometimes in need of evidence in a rather different way. Wykstra draws attention to some of the insights of Thomas Reid concerning testimony:

When brought to maturity by proper culture … [reason] learns to suspect testimony in some cases, and to disbelieve it in others … But still, to the end of life, she finds a necessity of borrowing light from testimony … And as, in many instances, Reason even in her maturity, borrows aid from testimony, so in others she mutually gives aid to it, and strengthens its authority. For, as we find good reason to reject testimony in some cases, so in others we find good reason to rely upon it with perfect security… (Wykstra 1989: 489)

According to Reid, we each have a natural tendency to believe testimony. Over time, however, we learn that not all testimony is reliable, and we learn to find reasons to give some testimony greater weight and other testimony much less. Although inferences are playing a role in forming testimonial belief, it is still testimony that gives support to the belief; inference only plays a refining role.

In light of varied religious beliefs and experiences, both across and within particular religious traditions, we must conclude that evidence is needed to discriminate between different religious beliefs. This does not mean that religious experience cannot ground belief in God. It may be that some religious faculty grounds the belief, but that the faculty is in need of refinement, just as testimony can be a basic source of knowledge but still in need of refinement. This continues to draw on the teachings of the Christian tradition: although some Christians hold that we have access to God through our native faculties, they also hold that those faculties have been marred by sin, so it should not be surprising that we can err in our knowledge of God, or that our native faculties alone are not sufficient.

This sensible evidentialist objection should not really be called an objection; perhaps the sensible evidentialist problem would be better. That is because Wykstra is not urging the reader to give up belief in God, but rather to properly acknowledge the role that evidence can and does play in knowing God. This problem seems to have played some role in motivating the later work of Alvin Plantinga where he is attempting to set out a positive account of how religious beliefs could amount to knowledge, rather than simply responding to an objection.

8. References and Further Reading

Alston, William. “Religious Experience and Religious Belief”. In Nous 16 (1982): 3-12.
- An early essay by one of the central proponents of reformed epistemology.
Alston, William. Perceiving God. Ithaca, NY: Cornell University Press, 1991.
- An important work on the epistemology of religious experience.
Baker, Deane-Peter. Tayloring Reformed Epistemology. London: SCM Press, 2007.
- An attempt to bring together the work of Charles Taylor and certain aspects of reformed epistemology. Includes a helpful description and critique of arguments for reformed epistemology.
Beilby, James. Epistemology as Theology. Burlington, VT: Ashgate Publishing, 2005.
- A detailed account of Alvin Plantinga’s reformed epistemology.
DeRose, Keith. “Voodoo Epistemology” unpublished manuscript.
- A well-known essay – despite being unpublished – that criticizes Alvin Plantinga’s reformed epistemology.
Feldman, Richard. “Plantinga on Exclusivism”. In Faith and Philosophy 20 (2003): 85-90.
- A paper arguing that it cannot be rational to hold religious beliefs when one is aware of the widespread disagreement about religion.
Grigg, Richard. “Theism and Proper Basicality: A response to Plantinga”. In International Journal for Philosophy of Religion 14 (1983): 123-127.
- An essay challenging the reformed epistemologist’s claim that there is a parity between perceptual belief and theistic beliefs.
Kenny, Anthony. Faith and Reason. New York: Columbia University Press, 1983.
- Much of this book is on religious epistemology and it engages with reformed epistemology.
Mackie, J.L. The Miracle of Theism. New York: Oxford University Press, 1982.
- An important book providing many arguments against theism.
Martin, Michael. Atheism: A Philosophical Justification. Philadelphia: Temple University Press, 1990.
- This book presents numerous arguments in favour of atheism and against theism – including against reformed epistemology.
Plantinga, Alvin. God and Other Minds. Ithaca: Cornell University Press, 1967.
- An early account of Plantinga’s parity argument which lays the foundation for reformed epistemology.
Plantinga, Alvin. Warrant and Proper Function. New York: Oxford University Press, 1993.
- A discussion of proper function which also lays the foundation for Plantinga’s Warranted Christian Belief.
Plantinga, Alvin. Warranted Christian Belief. New York: Oxford University Press, 2000.
- Arguably the most important work in reformed epistemology to date. Plantinga articulates and defends his version of the view at great length. It engages with many important debates in Philosophy of Religion.
Plantinga, Alvin. “A Defense of Religious Exclusivism” in Louis Pojman and Michael Rea (eds) Philosophy of Religion: An Anthology. Boston: Wadsworth, 2012.
- Plantinga argues that it can be reasonable to believe that your religion is correct and that others are wrong.
Plantinga, Alvin and Nicholas Wolterstorff. Faith and Rationality. Notre Dame, Indiana: University of Notre Dame Press, 1983.
- Contains many important early essays articulating and defending reformed epistemology.
Plantinga, A., Sudduth, M., Wykstra, S. and Zagzebski, L. “Warranted Christian Belief”. In Philosophical Books 43 (2002): 81-135.
- A collection of essays critically engaging with Warranted Christian Belief, along with a reply from Alvin Plantinga.
Scott, Kyle. “Return of the Great Pumpkin”. In Religious Studies 50 (2014): 297-308.
- A recent formulation of an objection to reformed epistemology along with a new response.
Sudduth, Michael. The Reformed Objection to Natural Theology. London: Ashgate, 2009.
- Deals with the objections to natural theology that are typically posed by the reformed epistemologist.
Tomberlin, James and Peter van Inwagen (eds.). Alvin Plantinga. Dordrecht: D. Reidel, 1985.
- A collection of essays examining the work of Alvin Plantinga, one of the central figures in reformed epistemology.
Wolterstorff, Nicholas. Reason within the Bounds of Religion. Grand Rapids, MI: Eerdmans, 1976.
- An exploration of how his Christian faith ought to relate to his work as a scholar.
Wolterstorff, Nicholas. Lament for a Son. Grand Rapids, MI: Eerdmans, 1987.
- Though not an academic book, some important points are made about reformed epistemology and religious epistemology in general.
Wolterstorff, Nicholas. Divine Discourse. Cambridge University Press, 1995.
- A Philosophical exploration of claims that God speaks.
Wolterstorff, Nicholas. Justice: Rights and Wrongs. Princeton University Press, 2010.
- Offers an account of rights and of justice. Engages significantly with Christian thought.
Wykstra, Stephen. “Toward a sensible evidentialism: on the notion of ‘needing evidence’.” In Philosophy of Religion, New York: Harcourt Brace Jovanovich (1989): 426-437.
- An analysis of Plantinga’s critique of evidentialism.
Zagzebski, Linda (ed.). Rational Faith: Catholic Responses to Reformed Epistemology, Notre Dame: University of Notre Dame Press, 1993.
- A response to reformed epistemology from various Catholic philosophers.

Author Information

Anthony Bolos
Email: ABolos@vcu.edu
Virginia Commonwealth University
U. S. A.

and

Kyle Scott
Email: k.scott@heythrop.ac.uk
Heythrop College
United Kingdom

Feminist Ethics and Narrative Ethics

A narrative approach to ethics focuses on how stories that are told, written, or otherwise expressed by individuals and groups help to define and structure our moral universe. Specifically, narrative ethicists take the practices of storytelling, listening, and bearing empathetic, careful witness to these stories to be central to understanding and evaluating not just the unique circumstances of particular lives, but the wider moral contexts within which we all exist. In telling stories, they suggest, we both create and reveal who we think we are as moral agents and as persons; in granting these stories uptake—that is, in giving them epistemic credibility—we help to mold and sustain the moral identities of others, as well as our own. Thus, theorists engaged in narratively-based moral scholarship take stories to be foundational for how we view the world and our place in it, arguing that they are the means through which we can make ourselves morally intelligible to ourselves and to others. At their best, narrative methodologies offer non-ideal, epistemically rich approaches—that are not grounded in strict, juridical principles—to a number of philosophical discourses, including those central to questions of morality, identity, and social justice. At their most worrisome, they appear to be merely loosely-related notions about the constitutive roles of stories in moral theory and practice that do not easily lend themselves to rigorous moral justifications, epistemic explanations, or the guiding of action, raising concerns about the theoretical and practical soundness of the whole endeavor.

Introduction
Feminist Ethics
Narrative Theories and Methodologies
Feminist Ethics and Narrative
Some Criticisms of Narrative Approaches to Ethics
Conclusion
References and Further Reading

1. Introduction

Even as a relatively new set of moral discourses and practices, narrative ethics has made its presence known. Among the areas within philosophy in which narrative has been particularly influential are biomedical ethics and feminist ethics. While this entry will only minimally touch on the former, the focus on the latter requires some qualification: While the themes, concerns, and ideas that connect feminist ethics and narrative theory are philosophically significant, this is not to suggest that all (or even most) of feminist ethics employs narrative methodologies, or that all (or most) feminist ethicists are narrativists. In fact, in addressing the oppression of women and other disadvantaged individuals and groups, a number are focused on alternative, non-narrative methodologies (for example, multicultural feminists tend to focus on interconnected systems of oppression, which may or may not be grounded in oppressive narratives) while others (for example, certain liberal feminists who tend to focus on justice-related remedies) reject the personal turn altogether. Thus, the connection between narrative ethics and feminist ethics as explored here ought not be viewed as global or as necessary, but as one that exists whenever the focus on the particular, on stories, and on phenomenologies within feminist ethics intersects with the conception of narratives as normatively constituting our moral universe. Viewed in this light, their relationship is philosophically important in the sense of sharing anti-totalizing, anti-hierarchical views of the practices of morality, as well as in the sense of emphasizing the necessity of greater inclusivity in moral discourses.

Indeed, to view feminist ethics through the lens of narrative, or to conceive of narrative ethics as an approach to feminist value theory, is not to exhaust the claims, significance, or methodologies of either one; it is simply to examine overlapping aspects of both, and how they have shaped, and continue to shape, each other. This entry, then, will focus on the complicated relationship between feminist ethics and narrative ethics. And while narrative ethics does not always neatly intersect with some of the concerns of feminist theorists, the relationship between feminist ethics and narrative ethics is nevertheless a rather dynamic one, combining the social, political, epistemic, and other insights of feminist theory with the fluid methodologies of narrative. Moreover, although a number of feminist theorists have benefitted from, and contributed to, the various insights provided by narrative approaches to ethics, no single method or theory can definitively be called “narrative feminist ethics.”

Thus, this entry will not endeavor to reduce the relationship between feminist ethics and narrative ethics to a single approach, but instead will address the ongoing discourses between narrative approaches to ethics and feminist ethics, focusing on four specific issues: (1) What are some of the central concerns of feminist ethicists? (2) What are narrative methodologies, and how do they pertain to ethics, and specifically, to feminist ethics? (3) How have the theorists engaged in feminist ethics turned to narrative, and which aspects of narrative seem to be most useful to their projects? (4) What are some general criticisms of narrative as an approach to ethics, broadly construed?

2. Feminist Ethics

Although the main purpose of this entry is not an exploration of the many nuances of the approaches to feminist ethics or the work of feminist ethicists, it is important to note how feminist ethics differs from the more “traditional” ethical theories, and importantly, how this difference makes feminist ethics responsive to narrative approaches and methodologies. To a large extent, feminist ethical theory can be understood as both a response to, and a movement against, a historical tradition of more abstract, universalist, ethical theories such as utilitarianism, deontology, and in certain respects, contractarianism and virtue theory, which tend to view the moral agent either as an autonomous, rational actor, deliberating out of a calculus of utility or duty, or else as an often disembodied and decontextualized ideal decision-maker, unburdened by the non-ideal constraints of luck (moral and otherwise), circumstance, or capability (Nagel 1979; Brennan 1999; Nussbaum 2000). Specifically, feminist ethicists contend that this top-down, juridical, principlist theorizing has largely neglected the centrality of physical, social, and psychological situatedness, power differentials, and, importantly, the voices of women whose lived experiences have simply not been part of any ongoing moral debates (Young 2005; Jaggar 1992; Walker 1997; Lindemann Nelson 2001; Held 1990; Tessman 2005). As Alison Jaggar argues, traditional ethics emphasizes male-centered issues of the public and the abstract while dismissing the private and the situated. As a result, women, and “women’s issues” that have to do with care, interdependent relationships, community, partiality, and the emotions, are de-centered, and relegated to the margins of serious intellectual (and specifically philosophical) inquiry (Tong and Williams 2014; Jaggar 1992).

While there are a significant number of subgroups of feminists (traditionally including care ethicists, Marxist feminists, liberal feminists, radical feminists, and ecofeminists, and lately divided into a greater variety of feminisms, including analytic feminism, continental feminism, radical lesbian separatist feminism, pragmatist feminism, psychoanalytic feminism, and all the intersections among them), the intent of feminist theory has been, and remains, the elimination of group and individual oppressions, and especially the silencing oppression of women, both in philosophical discourse and in the wider world (Tuana 2011). As Brennan argued, “feminist ethical theories [are] those ethical theories which share two central aims: (a) to achieve a theoretical understanding of women’s oppression with the purpose of providing a route to ending women’s oppression and (b) to develop an account of morality which is based on women’s moral experience(s)” (Brennan 1999, 860, citing Jaggar 1991).

While this entry does not address the many varieties or the latest developments within feminist ethics, it is important to note its general, and persistent, commitment to the rejection of, among other things, the sort of universalizable, uniform, acontextual “view from nowhere” that characterized much of ethical theory. As noted earlier, feminist ethics has generally pointed to the lack of serious philosophical attention to the aspects of human life where women (and other minorities) tend to predominate, thus leading to a deficit of inclusion of these actors who were subsequently made invisible to anthropocentric theory. In taking seriously, and including, the contexts, relationships, and commitments of women’s (and many marginal others’) experiences in its theorizing, feminist ethics does not deprive these individuals and groups of their moral agency. In this way, feminist ethics opens up the spaces of reasons within moral theory to marginalized others by, on the one hand, affirming the necessity of socially inclusive moral work, and on the other, challenging the socially (and otherwise) excluding practices, boundaries, and limitations of its current discourses. Finally, although not strictly a part of this discussion, it is important to note that liberal feminism, and liberal feminists like Susan Moller Okin, represent an important exception to the more particularist, subjective approaches to women’s freedom noted earlier, focusing instead on the need for women’s personal and political autonomy, promoted by the liberal state, that enables their flourishing as persons, and fighting for democratic self-determination denied to women by social and political patriarchies (Okin 1989).

What is mostly dismissed or neglected by traditional moral theorists is any engagement with non-ideal actors in non-ideal environments. Often, the default “autonomous moral agent” within moral philosophy is an otherwise unencumbered, abstract decision-maker, understood to be a man, coming to a decision that is not otherwise burdened by the messy contextuality of an actual lived life (Tong and Williams 2014; Jaggar 1992; Brennan 1999). The result was not only a simplification of what it might mean to be a situated agent in non-ideal circumstances, but also the wholesale absence of agents who were not male, not unencumbered, and certainly not abstract. In other words, those left out of the moral calculus, and indeed out of the philosophical moral imaginaries, were women, people of color, LGBTQ communities, economically underprivileged individuals, and many others. Because many of these non-standard agents are engaged with the world in ways not considered by those relying on abstract agent models in their ethical analyses, and because they are instead participants in the interdependent moral practices that define them in terms of their relationships with others, they are viewed by a great majority of traditional moral theorists as somehow beyond the scope of philosophical discourse (Held 1990; Tong 2009). Apart from neglecting the fact of men’s own situatedness and embeddedness in the particular circumstances of their lives, the exclusion of entire categories of individuals not only deprived these populations of a voice in philosophical debate, but also removed their experiences from the scope of possible normative discourse altogether. More specifically, women and minorities were effectively silenced as reliable narrators not only of the moral significance of their experiences, but also of what moral theory and practice ought to take into account as its proper subject matter.

In addition to impoverishing and narrowing the idea of what moral theory is more generally, this kind of silencing has had uniquely burdensome costs for the silenced: As the feminist political theorist Iris Marion Young noted, those whose voices and whose presence have been historically missing from public discourses are severely challenged in receiving any kind of uptake of their views even once they attempt to engage (Young 1997; McAfee 2014). In the process of democratic deliberation, for example, those who are not habituated into participating in the overly formal, abstract, and juridical moral and sociopolitical discourses would be continuously marginalized, and, in the end, dismissed. As an alternative way of engaging those on the periphery, Young argued for a kind of “communicative democracy,” allowing for a number of different communicative styles (including narrative, rhetoric, and storytelling), perspectives, and voices (McAfee 2014).

Thus, understood very broadly, feminist ethics is a response to this epistemic, moral, and sociopolitical silencing born of exclusion, and to the oppression that it underwrites. As Samantha Brennan notes, “[f]eminist ethics seeks to overcome the limits of narrow, male-centered ethics by constructing moral theories which can make sense of the experiences of women as moral agents…feminist ethics has become associated with an ethics of lived, concrete experiences which takes most seriously women’s experiences of morality” (Brennan 1999, 861). Indeed, feminist ethicists, like Margaret Urban Walker, have argued against the impartial universality of “juridical,” or top-down, ethical methodologies that reduce moral reason to rigid, acontextual deductions, and favor more situated, “expressive-collaborative” approaches to morality that expand, rather than restrict, both the spaces of moral reasons as well as the variety of moral agents (Walker 1997).

What feminist philosophers accomplish, therefore, is the broadening and deepening of what it means to be engaged in moral philosophy by introducing the epistemically and morally rich stories of what it is like to be a non-ideal agent in a non-ideal world. It is this turn toward including, confronting, and challenging the oppressions of women (and other oppressed and often silenced populations) that serves as the beginning of the intersection between narrative and feminist ethics. And because, like much of feminist moral theory, narrative approaches to ethics emphasize the contextuality, situatedness, and the shared nature of public and private life as central to moral reasoning, some leading feminist philosophers have offered a number of varying approaches to philosophical ethics that can all nevertheless be called “narrative” in significant ways.

3. Narrative Theories and Methodologies

Generally speaking, there is not a single theory of narrative ethics, nor is there a single correct way to engage in narrative analysis. However, there are a number of views and practices that have a family resemblance, and can be construed as a part of a larger, more amorphous field of narrative ethics. One such view about how narrative is to be understood as a part of moral theory is offered by Kathryn Montgomery Hunter. Hunter argues that “[i]n using the word ‘narrative’ somewhat interchangeably with ‘story’ I mean to designate a more or less coherent written, spoken, or (by extension) enacted account of occurrences, whether historical or fictional” (Hunter 1996, 306).

There are many ways to define, and engage in, a narrative approach to ethics. By a “narrative approach,” I mean a focus on the significance of context, situatedness, and, importantly, the communication of the stories people tell about themselves and others in trying to make themselves, others, and, more broadly, their world, mutually intelligible. Narrative ethicists often criticize what they consider to be a preoccupation with impartiality, distance, and universalizability at the expense of personal relationships among more traditional juridical moral theories (Walker 1997; Lindemann Nelson 2001; Rimmon-Kenan 2002). The problem, they suggest, is not merely the exclusion of so many from juridical moral discourses, but, importantly, the missing warrant for why moral actors would desire, and be motivated by, something like a good will (or a utilitarian-based outcome, or a rights-based justification) as a part of a meaningful life. Indeed, they argue, given the requirements of juridical moral thought, we are left wondering what there is to admire about such a life, why such a life is worth having, and why disinterested detachment from everything and everyone one cares about – that is, detachment from all that makes the moral life not just worthwhile but possible – is the sole path to robust moral agency. Although duties and laws might very well be a part of moral work, the “ought” of morality cannot be grounded primarily in bare, unyielding principles.

In response, narrative approaches to moral theory and practice have been put forth by a number of philosophers (especially those engaged in normative ethics and applied ethics, such as medical ethics), literary scholars, and psychologists, including Alasdair MacIntyre, Charles Taylor, Paul Ricoeur, Paul John Eakin, Hilde Lindemann Nelson, Margaret Urban Walker, Martha Nussbaum, Kathryn Montgomery Hunter, and Jerome Bruner, among others. Indeed, the philosopher Marya Schechtman has argued that narratives are not only essential to understanding what we do, but, indeed, to who we are by suggesting that only those who “weave stories of their lives” are, strictly speaking, persons. This is so, she suggests, because one’s narrative is precisely what constitutes—or, as she argues, characterizes—one’s personal identity (Schechtman 1996). Generally, narrative theorists take the personal story, or the first-person narrative, to not only be descriptively informative, but also normatively vital to connecting a particular life with the rest of a moral community (or communities), making the story, and the storyteller, both intelligible and open to normative analysis. In other words, theorists who use a narrative approach to ethics take the process of telling and hearing the stories of our lives to be doing something morally significant. For example, feminist philosopher Hilde Lindemann offers the following summary of some possible roles of stories in moral reasoning:

Narrativists have claimed, among other things, that stories of one kind or another are required: (1) to teach us our duties, (2) to guide morally good action, (3) to motivate morally good action, (4) to justify action on moral grounds, (5) to cultivate our moral sensibilities, (6) to enhance our moral perception, (7) to make actions of persons morally intelligible, and (8) to reinvent ourselves as better persons (Nelson 2001, 36).

Thus, narratives can differ teleologically. They can also be judged to be good and bad, desirable and undesirable, truthful and false. Indeed, instead of providing the sort of insight into ourselves that might be constructive and action-guiding, they can encourage dishonesty, cowardice, or can serve to indulge our fantasies in generally unhealthy, or even destructive, ways. Narratives can be “master narratives” that tell us where and how we are socially situated with respect to our duties, claims, and expectations. One can also resist harmful master narratives through a counterstory, whose purpose it is to “root out the master narratives in the tissue of stories that constitute an oppressive identity and replace them with stories that depict the person as morally worthy” (Lindemann 2001, 150). Moreover, one can resist a master narrative through a humorous re-casting of that narrative – a king with no clothes (power), Victor/Victoria (sexuality), and so on – that serves to expose the “master narrative” as unreliable, or at least of doubtful validity. And, of course, the master narratives themselves can differ: while they can oppress, they can also inform, (re)align, and guide. Counterstories, too, can be destructive as well as reparative. What matters is acquiring the ability (and desire) to listen or read closely enough, with sufficient attention and discernment, to tell the difference (Lindemann 2001). Morality, in short, is not solely within the purview of a judge who possesses the necessary moral epistemology and pronounces on a given act as “warranted” or “unwarranted,” but is something that we do together: it is a socially embodied medium of understanding and adjustment in which people engage in practices of allotting, assuming, or deflecting responsibilities of various kinds (Walker 1997). These practices create a vocabulary and resources for moral deliberation that give us recognized and socially shared ways of deciding what is good or right to do.

Since narrative approaches to ethics are not a singular, monolithic whole, the understandings and practices of what it might mean to engage in moral analysis narratively do indeed vary. Narratives can be read, heard or viewed through the mediums of film, literature, or through the oral traditions of storytelling, thus expanding one’s emotional, social, and intellectual vocabulary and perception. In this way, we become not merely better informed about being otherwise, but better equipped in addressing morally complex and difficult situations in the real world (Lindemann 2001; Nussbaum 1990). These narrative techniques can be reified by substituting a “master” model of moral reasoning (say, the Enlightenment model of detached objectivity and rationality) with the kind of normativity that is action-guiding to a particular narrative community that wishes to find justification for, and thus make moral sense of, its way of life (Lindemann 2001; MacIntyre 1984). They can also serve as methods of clarification of confusing or contradictory moral reasoning when compared to each other. In trying to work through some particularly difficult moral dilemmas, narratives can help us to see where seemingly divergent viewpoints can possibly move closer together, when they cannot, and why, without resorting to ill-fated attempts to (re)order principles and (re)interpret laws.
In short, a narrative approach to doing ethics takes its cues from the stories themselves, as they are told, heard, and (mis)understood, and although there are a number of approaches and methodologies, they tend to center around questions of who the teller is, what the teller might mean, who the intended (and unintended) audience might be, what is the effect of the story, and (perhaps less frequently) what constitutes a good story – and what might be meant, in this case, by “good.” Writing from a narrativist medical ethics perspective, Joan McCarthy suggests that some of the central tenets of narrative approaches to moral issues can be understood as the following:

1. Every moral situation is unique and unrepeatable and its meaning cannot be fully captured by appealing to law-like universal principles.

2. …[A]ny decision or course of action is justified in terms of its fit with the individual life story or stories…

3. The objective of the task of justification in 2 is not necessarily to unify moral beliefs and commitments, but is to open up dialogue, challenge received views and norms, and explore tensions between individual and shared meanings. (McCarthy 2003)

Thus, a narrativist account of moral problems, dilemmas, and general questions of moral judgment, takes seriously the multitudes of individual lives, and thus the multitudes of voices and interpretations of moral situations. What matters, then, is not so much a reduction of moral positions to a commonly-held single perspective, but an opening up of a space for reasons and dialogues with equally morally worthy others, thereby expanding the possibility of a shared, rather than a unitary and monolithic, moral universe.

One way to charitably interpret this narrative turn in ethics is to take seriously the proposition that stories simply provide the sort of flexibility of understanding and variability of perspective that deep and “thick” moral work requires. It makes possible a way to engage in moral negotiations by reminding the participants to take into account how they got to the present point, what the present circumstances are, as well as what they ought to do in the future. At their best, narrative approaches to ethics welcome voices that, as Young noted, are differently situated, possessing quite radically divergent views of where they fit within the moral and sociopolitical discourses and debates. They remind us that different participants carry the burdens of different histories, epistemologies, and moralities. In the end, the narrative collaborative methodologies see stories as not merely ways to decide among competing principles, but as self-contained, and context-rich, reasons to revise moral understandings, to negotiate solutions, and to continue seeking the ever-elusive common ground (Walker 1997).

4. Feminist Ethics and Narrative

Given the narrative emphasis on multivocality, shared discourse, and the moral significance of individual voices, it is perhaps not entirely surprising that feminist philosophers have both employed and expanded the idea of narrative within feminist philosophy. For example, Margaret Urban Walker, in “Moral Particularity,” has argued that one of the characteristics that make an agent a distinctly moral one is her desire to define herself as the protagonist of a coherent narrative (Walker 2003). The “moral persona” that emerges out of such narrative coherence, she claims elsewhere, is defined by her commitments to individuals, institutions, and values. It is this desire to self-define as a protagonist of a largely coherent narrative that makes one a moral agent (Walker 1989, 177). Walker later expanded her views to include the narrative notions of collaboration and negotiation into moral work. Perhaps as a way of challenging what she calls the “theoretical-juridical model” of ethics as exemplified by the more traditional top-down theories, her “expressive-collaborative” approach to morality turns on its head both the priorities and the presuppositions of what it means to be engaged in moral practice. As a priority, the expressive-collaborative approach tends to view the importance of moral work not as necessarily the juridical determination of “right” and “wrong” based on a set of deduced unyielding norms or laws, but more as a way to negotiate and narrate our way in the complex and imperfect social, physical, and psychological realities of being human: The “expressive-collaborative model” encourages us to view “an investigation of morality as a socially embodied medium of mutual understanding and negotiation between people over their responsibility for things open to human care and response” (Walker 1997, 173).
The distinction between this approach and the non-narrative juridical one is that while the latter emphasizes the uniformity of what is required, forbidden, or permitted in a given situation for all similarly-placed agents, the expressive-collaborative model prioritizes moral competence as strong moral self-definition, or, as Walker has argued, “the ability of morally developed persons to install and observe precedents for themselves which are both distinctive of them and binding upon them morally” (Walker 2003). In other words, her argument explicitly makes the point that the work of morality has to do with accountability and responsibility – and thus moral reliability, requiring a certain integrity in one’s relationships, sense of identity, and values. To be accountable is, to some extent, to be viewed as accountable by others, and this means that our actions have to tell a coherent story at least to the extent that they are reasonably predictable by those who are affected by them in the sorts of situations that matter morally. In the end, moral accountability is a narrative practice of making ourselves both internally and externally coherent, and, in so becoming, weaving ourselves into the fabric of a moral universe (Walker 1989).

Other feminist scholars have also turned to narrative as a way to engage with some of the central concerns within feminist ethical theory. In considering how personal identities structure our various moral discourses and concepts, Hilde Lindemann claims that these identities are “complex narrative constructions consisting of a fluid interaction of the many stories and fragments of stories surrounding the things that seem most important, from one’s own point of view and the point of view of others, about a person over time” (Nelson 2001, 20). She argues that not only are identities, and thus one’s moral standing in society, narratively constituted, but they can also be narratively damaged by oppression and oppressive practices. Indeed, the moral damage of oppressive “master narratives”—destructive especially to those who are already socially subordinated and disempowered—must be counteracted with powerful counter-narratives that just might repair these broken identities, securing individual (and sometimes group) moral agency. The two kinds of moral damage that can impact the cohesion of one’s identity—one, depriving one of important social goods, and the other, of self-respect—can be repaired by, and through, counterstories, which narratively resist, challenge, and overcome the damaging master narratives that are so inflicted by the powerful on the vulnerable (Lindemann 2011). For example, women whose moral agency is compromised because they choose childlessness against a more general pronatalist narrative can offer stories of womanhood as personhood without essentializing the woman-as-mother. As noted earlier, these counternarratives can take many forms, but their purpose remains consistent: to both expand normative spaces to include those previously excluded, and to admit, and in fact encourage, the use of narrative as a legitimate practice of engagement within the broader moral and sociopolitical discourses.

Of course, Walker and Lindemann are not the only feminist theorists to turn to narrative. Because feminist theorists are generally concerned with addressing various kinds of oppressions—and especially the oppression of women—they have often construed personal stories as fruitful ways of theorizing morally dilemmatic situations. These accounts serve a number of goals, including clarifying the harms of oppression, explaining the personal costs of cruel, myopic, or marginalizing moral reasoning, re-orienting the purely juridical and theoretical toward the non-ideal, and, among other things, motivating the development of moral thinking in ways that are inclusive of contextualities, situatedness, and burdened lives. Vivid, empathy-producing examples that tend to engage the moral imagination are often used by feminist scholars to focus on specific problems in order to show—and not to simply argue—that issues of sexism, oppression, gross power differentials, exclusion, and domination must be recognized and addressed both theoretically and practically. For example, Sandra Bartky, as a part of her analysis of objectification, offers a story of harassment, catcalls and whistles, noting that her previously unremarkable walk was now a source of identity-threatening humiliation and brutal, othering objectification (Bartky 1990; 1979). Moreover, Susan Brison, as a part of her examination of violence, identity, and the moral work of bearing witness, shares the very personal trauma of her brutal rape and assault. 
As Anita Superson notes, “Brison argues that the experience of rape should be of interest to philosophers because it raises many philosophical issues, including the metaphysical issue of the disintegration of the self, the epistemological issue of the victim’s skepticism about everyone and everything, as well as the obvious legal, moral, and political issues relating to what it is like to be a victim of rape, why rape occurs and is so prevalent in our society, what its meaning is, and so on” (Superson 2009). Moreover, Susan Estrich employed her own story not just of rape, but of a fight for her credibility as a reliable narrator of her experiences as a crime victim to police who did not take her claim to be a serious one (Superson 2009). Through this reliance on a personal narrative of trauma and victimization, she not only addresses the broader challenge of confronting the presuppositions and prejudice inherent in American rape law, but also makes a case for alternative, less abstract and rigid constructions of the notions of force and consent.

Turning to a different aspect of narrative—namely, fiction—the philosopher Martha Nussbaum argues that the narrativity of literature provides a deep and necessary source of moral knowledge that not only more sharply attunes people to the various sources of morality, but also to themselves as sensing moral beings who enter into relationships of mutual responsibilities and obligations with each other (Nussbaum 1990). Finally, Seyla Benhabib has noted that narratives are not only the central constituting elements of a self, but that “[w]e are born into webs of interlocution or into webs of narrative – from the familial and gender narratives to the linguistic one to the macronarrative of one’s collective identity. We become who we are by learning to be a conversation partner in these narratives. Although we do not choose the webs in whose nets we are initially caught or select those with whom we wish to converse, our agency consists in our capacity to weave out of those narratives and fragments of narratives a life story that makes sense for us, as unique individual selves” (Benhabib 1999, 344).

Thus, the process of challenging, re-defining, and finally re-making moral theory and practice that is so central to the project of feminist ethics can, with the help of narrative methodologies, go far toward addressing women’s oppressions, as well as the oppressions of numerous excluded others. By telling their stories—by grounding moral theorizing in personal narratives rather than in purely idealized contexts and agents—feminist scholars are able not only to motivate a deeper understanding of ethical dilemmas, but also to advocate for practical changes in the structures of marginalizing social practices by creating a more inclusive space of reasons within which to negotiate our moral understandings.

5. Some Criticisms of Narrative Approaches to Ethics

Even though embraced by a not insignificant number of feminist ethicists, narrative approaches to ethics, whether feminist or otherwise, are not without their critics. While these criticisms are diverse and multifaceted, many of them converge on the worries about narrative’s lack of moral grounding, epistemic justification, and normative guidance. Some concerns stem from a reliance on context, perspective, and circumstances of specific stories, which, for some, drift too close to relativism. Others worry about the dependence on testimony and storytelling as a basis of moral theory. A number of theorists also wonder about the theoretical and practical value of a narrative, contextualized approach for moral theory, broadly construed. Finally, some claim that a narrative approach to understanding one’s place in the moral universe is not only misguided, but unnecessary and not reflective of what matters to us morally as human beings.

First, in addressing worries about relativism, the turn of feminist ethics toward narrative and experiential pluralism might re-make moral theory merely into an account of “historically specific moral practices and traditions” (Jaggar 1991, 93). Alison Jaggar further notes that while feminist ethics is “incompatible with any form of moral relativism that condones the subordination of women or the devaluation of their moral experience…[i]t is neutral, however, between the plural and local understanding of ethics, on the one hand, and the ideal of a universal morality, on the other” (Brennan 1999, 862, citing Jaggar 1991, 94). Thus it would seem that worry about the slide of feminist narrative-based theory into moral relativism is at least prima facie warranted: Ought feminist theorists relying on narratives focus on the local and the contextualized, rather than on the abstract and universalizing, if so doing offers an expansion of new “political agendas” (Shrage 1994) while at the same time leading practitioners to accept practices and narratives that might contribute to other kinds of oppressions (Brennan 1999)? Perhaps if identity-and-moral-community-defining stories are to have any kind of moral grounding that is both useful and reasonably defensible, then Susan Sherwin’s suggestion that it is the revisable and process-dependent “community standards” that might offer something beyond a fully relativistic and situational ethics is one way out of the worries about relativism (Brennan 1999; Sherwin 1992).

Second, Diana Tietjens Meyers, concerned about the reliability of testimony, suggests that narrative theory, instead of simply looking to storytelling as its sources of normativity, must prove its credibility as an account of morality by insisting on a particular skill set of the storyteller. She argues: “To ensure respect for the diversity of morally decent lives, narrativity theory must explicate the credibility of self-narratives in terms of this repertoire of skills. Self-narratives are not all equally valid, revealing, and conducive to flourishing, but there is no property internal to self-narratives nor any interpersonal test that can rank them. The best gauge of a self-narrative’s credibility, then, is the narrator’s overall degree of mastery of the self-discovery, self-definition, and self-direction skill repertoire and the extent to which the narrator made use of this competency in constructing a particular self-narrative” (Meyers 2004, 303). Meyers claims that if narratives are simply taken at face value, we might be left with “all sorts of fictions—fairy tales, negative utopias, science fiction, romances, and horror stories—as well as autobiographical narratives” (Meyers 2004, 303). Simply because a story is good or interesting, Meyers notes, it does not guarantee that it will be anything but an exercise in wishful fiction or a flight of fancy. In order to properly address this possibility, one must acquire particular skills—introspection, volition, nurturing, communication, listening, and memory, among others—that allow one to recall relevant experiences, to imagine feasible options, and so on (Meyers 2004). Indeed, Meyers insists that “[t]o curb overactive imaginations, to overcome isolating silence, and to secure the credibility of self-narratives, the competency that keeps people attuned to themselves and alive to life’s possibilities must underwrite the processes of self-narrating” (Meyers 2004, 303).
Without this kind of rigorous self-discipline, a narrative approach to morality seems at best less than fully credible, and at worst, a methodologically compromised enterprise that confuses the interesting and the exciting with the epistemically important and the morally compelling.

Third, another kind of critique of narrative is offered by the philosopher and bioethicist Tom Tomlinson, focused on the worry about whether a narrative approach to ethics brings something distinct to moral theory that other, more traditional, approaches do not (Tomlinson 1997). Tomlinson argues that even though narrative might be methodologically important to the development of ethical reasoning, it does not offer “a mode of ethical justification that is independent from or superior to appeals to moral principles” (Tomlinson 1997, 132). On his view, narrative does not serve the sort of “central epistemic function in the discovery, justification, or application of ethical knowledge” that its supporters take it to be serving (Tomlinson 1997, 124). Instead, he argues that a focus on stories does not go far enough – or, indeed, any distance at all – toward enriching our moral epistemology. If narrative sets itself against the overstructured and sterile methodology of juridical thinking, then, Tomlinson claims, we ought to expect to find something morally valuable that is unavailable to us through principles alone. However, this does not seem to be the case: First, if one takes the kind of narrative approach that Martha Nussbaum has proposed—whereby engaging with certain kinds of literature allows for the development of a more nuanced, and empathetic, view of moral discourse (Nussbaum 1990)—and reads a novel in order to broaden one’s moral imagination, one is missing the actual encounter with a living person, and is thus epistemically and morally limited by the four corners of the text. Whatever moral “truth” is made available by the story, it seems limited situationally to the characters within that story, and does very little to speak to those who do not also share the world in which a particular moral lesson unfolds. 
And even if one were to set aside literary narrative and enter into a conversation with other people, the sort of particular knowledge one might derive through these interactions would not yield any moral knowledge that is generalizable—that translates from one story, or from one storyteller, to another. At best, Tomlinson suggests, “novels and stories become…vivid illustrations of knowledge verified through other means” (Tomlinson 1997, 125).

On Tomlinson’s account, then, narrative does not appear to have much to contribute toward assisting the ethical discourse about aligning, or at least making less attenuated, the relationship between moral principles and lived experiences. Indeed, Tomlinson sees no clear way to distinguish how a uniquely narrative approach helps with addressing ethical dilemmas from other methodologies. For example, in a case where one is torn between disclosing or withholding potentially devastating news, a narrative theorist might require a consideration of how much truth to tell, how to tell it, how one will hear what is said, who is doing the speaking, and so on. However, Tomlinson suggests that aside from the vagueness of the narrative criteria themselves, what it might mean to “interpret” this information is unclear: “Any social system of reasoned reflection involves a ‘communal dialogue’ of ‘give and take,’ including those deliberatively rooted in principles…The failure to provide any more precise account of the nature and role of ‘interpretation’ is a symptom of the tendency to wave it and ‘narrative’ as banners that fly over everything bright and beautiful being ignored by those crude and insensitive principles” (Tomlinson 1997, 127).

Moreover, Tomlinson rejects what he views as the tendency among proponents of narrative ethics to conflate the descriptive claim that one’s life is best understood as a narrative with the normative claim that one’s choices – and especially one’s moral choices – ought to be judged according to their coherence with a given life narrative. First, we do not, he claims, live a life that can be forced into coherence by a storyline – or by anything else: “we don’t live out a narrative, we create one by living a life” (Tomlinson 1997, 130). Second, even if we were to take seriously the narrative we create by “living a life,” “the [moral] question of how best to live out ‘that’ unity is not answered by the notion of narrative unity. It’s answered by appeal to extranarrative ideals that elevate some kinds of narrative over others” (Tomlinson 1997, 130). And since these ideals can be whatever one desires them to be, the resulting coherence loses any meaningful normative force. Unless one subscribes to one “extranarrative” ideal – or, indeed, to one principle – over another, the standard of narrative coherence seems to neither add anything to principlist analysis, nor offer an epistemically independent criterion of ethical reasoning, explanation, or justification.

While Tomlinson’s arguments focus on the claim that narratives do not offer any ethically or epistemically satisfying criteria that we could use in making moral choices, another kind of criticism, offered by John Arras, centers on the moral incompleteness of narrative as moral theory. Although he takes a somewhat more conciliatory, if still critical, view, his dissatisfaction with narrative as a method for doing ethics is grounded in his suspicion of narrative as a means of grounding moral justification—of finding the relationship “between the telling of a story and the establishment of a warrant for believing in the moral adequacy or excellence of a particular action, policy or character.” Having examined what he takes to be three different approaches to narrative—“as an essential element of any and all ethical analyses,” as an ahistorical rejection of the Enlightenment project, and as a postmodern attempt to substitute narrative “for the entire enterprise of moral justification”—he concludes that, while narrative seems to be an important part of ethical analysis, its ability to completely replace principles and ethical theory seems doubtful at best if what one seeks is moral justification for actions (Arras 1997, 79-85). Arras’s view, therefore, is that narrative seems to be merely supplementary to principles, and, in the end, is no threat to their moral primacy.

Finally, Galen Strawson, in “Against Narrativity,” argues that a narrative approach (to morality, to identity, and so on) is not only presumptively false from a folk-psychological, or common-sense, perspective, but is also descriptively vague and normatively unmoored. He claims that not only does he not see himself or his life in narrative terms, but that he resents the idea of such a practice altogether. In response to the urging of (feminist and other) narrative theorists to engage in moral work through narrative, Strawson wonders, “Why on earth, in the midst of the beauty of being, it should be thought to be important to do this” (Strawson 2004, 436). Indeed, he noted, “there are deeply non-Narrative people and there are good ways to live that are deeply non-Narrative” (Eakin 2006, 180-187). Moral claims about oneself, about others, or about the world more broadly, Strawson insists, do not require the reliance on stories, or on how these stories relate to one’s present and future agency and shared moral understandings.

6. Conclusion

It can be said with some certainty that narrative approaches to ethics are not without considerable controversies and passionate critiques. It also seems clear that there are significant and challenging insights offered by narrative ethicists—a number of which have been theorized, defended, and expanded upon by feminist ethicists. Indeed, it seems that feminist ethics and narrative approaches to normativity share a number of concerns, goals, and motivations that offer powerful counterstories to the largely principlist, abstract, and universalizing practices of traditional moral theory. But shared worries and a desire for a more multivocal and collaborative moral discourse neither presuppose nor require the same methodologies, and there are some clear and powerful points of disagreement both within feminist philosophy about the role of narrative in ethical theory and among narrativists themselves about what kinds of narratives ought to count as properly normative and adequately action-guiding. Because there is not a single approach to feminist ethics, and certainly no single way of engaging in narrative analysis, it is quite difficult to make any tidy generalizations, either about the theories themselves, or about their complicated relationship. Yet perhaps this is exactly the point: theorizing that tends to move away from such generalization in its own methodologies unsurprisingly escapes any attempts at totalizing definitions, in the process changing and restructuring the spaces and scope of moral discourse.

7. References and Further Reading

Arras, J. “Nice Story, But So What?” In H. L. Nelson (ed.). Stories and Their Limits: Narrative Approaches to Bioethics. New York: Routledge, 1997.
Bal, M. Narratology: Introduction to the Theory of Narrative. Toronto: University of Toronto Press, 1997.
Bartky, S. “On Psychological Oppression.” In S. L. Bartky (ed.). Femininity and Domination: Studies in the Phenomenology of Oppression. New York: Routledge, 1990. Reprinted from Philosophy and Women (Wadsworth Publishing, 1979).
Becker, L. C. “Impartiality and Ethical Theory.” Ethics 101.4 (1991): 698-700.
Benhabib, S. “Sexual Difference and Collective Identities: The New Global Constellation.” Signs: Journal of Women in Culture and Society 24.2 (1999): 335-361.
Benson, P. “Free Agency and Self-Worth.” Journal of Philosophy 91.12 (1994): 650-668.
Benson, P. “Feminist Second Thoughts about Free Agency.” Hypatia 5 (1990): 47-64.
Benson, P. “Feminist Intuitions and the Normative Substance of Autonomy.” Personal Autonomy: New Essays on Personal Autonomy and its Role in Contemporary Philosophy, Ed. James Stacey Taylor. Cambridge: Cambridge University Press, 2004.
Brennan, S. “Recent Work in Feminist Ethics.” Ethics 109.4 (1999): 858-893.
Brison, S. J. “Surviving Sexual Violence: A Philosophical Perspective.” In S. G. French, W. Teays, and L. M. Purdy (eds.). Violence Against Women: Philosophical Perspectives. Ithaca, New York: Cornell University Press, 1998.
Charon, R. “Narrative Medicine: Attention, Representation, Affiliation.” Narrative 13.3 (2005): 261-270.
Charon, R. and Montello, M. Stories Matter: The Role of Narrative in Medical Ethics. New York: Brunner-Routledge, 2002.
Christman, J. “Narrative Unity as a Condition of Personhood.” Metaphilosophy 35.5 (2004): 695–713.
Crossley, M. L. “Narrative Psychology, Trauma and the Study of Self/Identity.” Theory and Psychology 10.4 (2000): 527–546.
Damasio, A. R. The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt Brace, 1999.
Dennison, A. Uncertain Journey: A Woman’s Experience Of Living With Cancer. Newmill: Patten Press, 1996.
DesAutels, P. and Walker, M. U., eds. Moral Psychology: Feminist Ethics and Social Theory. Lanham MD: Rowman and Littlefield Publishers, Inc., 2004.
Eakin, P. J. “Narrative Identity and Narrative Imperialism: A Response to Galen Strawson and James Phelan.” Narrative 14.2 (2006): 180-187.
Eakin, P. J. How Our Lives Become Stories: Making Selves. Ithaca: Cornell University Press, 1999.
Frank, A. W. “Just Listening: Narrative and Deep Illness.” Families, Systems and Health 16.3 (1998): 197–212.
Frank, A. W. The Wounded Storyteller: Body, Illness, and Ethics. Chicago: University of Chicago Press, 1997.
Frank, A. W. “Asking the Right Question about Pain: Narrative and Phronesis.” Literature and Medicine 23.2 (2004): 209-225.
Held, V. “Feminist Transformations of Moral Theory.” Philosophy and Phenomenological Research, Fall Supplement, 1990.
Held, V. Feminist Morality: Transforming Culture, Society, and Politics. Chicago: University of Chicago Press, 1998.
Held, V. “Feminist Reconceptualizations in Ethics.” In J. Kourany (ed.). Philosophy in a Feminist Voice: Critiques and Reconstructions. Princeton: Princeton University Press, 1999.
Hooker, B. and Little, M. O. Moral Particularism. Oxford: Oxford University Press, 2000.
Hunter, K. M. Doctors’ Stories: The Narrative Structure of Medical Knowledge. Princeton, New Jersey: Princeton University Press, 1991.
Jaggar, A. “Feminist Ethics: Projects, Problems, Prospects.” In C. Card (ed.). Feminist Ethics. Lawrence: University of Kansas Press, 1991.
Jaggar, A. “Feminist ethics”. In L. Becker and C. Becker (eds.), Encyclopedia of Ethics, New York: Garland Press, (1992): 363-364.
Kleinman, A. The illness narratives: Suffering, healing and the human condition. New York: Basic Books, 1988.
Korsgaard, C. M. The Sources of Normativity. Cambridge: Cambridge University Press, 1996.
Lindemann, H., Verkerk, M., and Walker, M. U. Naturalized Bioethics: Toward Responsible Knowing and Practice. Cambridge, MA: Cambridge University Press, 2009.
Lindemann, H. Holding and Letting Go: The Social Practice of Personal Identity. New York: Oxford University Press, 2014.
Little, M. O. “On Knowing the `Why’: Particularism and Moral Theory.” The Hastings Center Report 31.4 (2001): 32-40.
Lorde, A. The Cancer Journals. San Francisco: Spinsters/Aunt Lute, 1980.
Lugones, M. (1987) “Playfulness, ‘world’-traveling, and loving perception”. Hypatia, 2: 3-19.
Lugones, M. and Spelman, M. (1983). “Have we got a theory for you! Feminist theory, cultural imperialism, and the demand for ‘the woman’s voice’”. Women’s Studies International Forum, 6(6): 573-581.
MacIntyre, A. After Virtue: A Study in Moral Theory. Indiana: University of Notre Dame Press, 1984.
MacKenzie, C. and Stoljar, N. Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self. New York: Oxford University Press, 2000.
Martin, C. “Feminism, the Self, and Narrative Ethics.” Macalester Journal of Philosophy 16:1 (2007):7-14.
Mattingly, C. Healing Dramas and Clinical Plots: The Narrative Structure of Experience. Cambridge: Cambridge University Press, 1998.
McAdams, D. P. The Stories We Live By: Personal Myths and the Making of The Self. New York: The Guilford Press, 1997.
McAfee, N. “Feminist Political Philosophy”, The Stanford Encyclopedia of Philosophy (Summer 2014 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/sum2014/entries/feminism-political/>.
McCarthy, J. “Principlism or narrative ethics: must we choose between them?” Medical Humanities 29.2 (2003): 65-67.
Merleau-Ponty, M. The Phenomenology of Perception. London and New Jersey: Routledge, 1992.
Meyers, D. T. “Narrative and Moral Life.” In Cheshire Calhoun (ed.). Setting the Moral Compass: Essays by Women Philosophers. Oxford University Press, 2004.
Meyers, D. T. Jaggar, A. (eds.). Feminists Rethink the Self. Boulder: Westview Press, 1997.
Mullan, F., Ficklen, E., and Rubin, K. (eds.). Narrative Matters: The Power of the Personal Essay in Health Policy. Baltimore: The Johns Hopkins University Press, 2006.
Nagel, T. The View From Nowhere. New York: Oxford University Press, 1989.
Nelson, H. L. Stories and Their Limits: Narrative Approaches to Bioethics. New York: Routledge, 1997.
Nelson, H. L. Damaged Identities, Narrative Repair. Ithaca: Cornell University Press, 2001.
Nelson, H. L. “Context: Backward, Sideways, and Forward.” HEC Forum: Special issue on narrative 11.1 (1999): 16-26.
Nelson, H. L. “7 Things to Do with Stories.” unpublished manuscript.
Nussbaum, M. C. Love’s Knowledge: Essays on Philosophy and Literature. New York: Oxford University Press, 1990.
Nussbaum, M. and Sen, A. The Quality of life. Oxford: Clarendon Press, 1993.
Nussbaum, M. and Glover, J. Women, culture, and development: a study of human capabilities. Oxford: Clarendon Press, 1995.
Okin, S. M. Justice, Gender and the Family. Basic Books: New York, 1989.
Ricœur, P. Time and Narrative (Temps et Récit), 3 vols. trans. Kathleen McLaughlin and David Pellauer. Chicago: University of Chicago Press, 1984.
Rimmon-Kenan, S. “The story of ‘I’: Illness and narrative identity.” Narrative 10.1 (2002): 9-19.
Rorty, R. Contingency, Irony, and Solidarity. Cambridge: Cambridge University Press, 1989.
Schechtman, M. The Constitution of Selves. Ithaca: Cornell University Press, 1996.
Shrage, L. Moral Dilemmas of Feminism: Prostitution, Adultery, and Abortion. New York: Routledge, 1994.
Sherwin, S. No Longer Patient: Feminist Ethics and Health Care. Philadelphia: Temple University Press, 1992.
Strawson, G. “Against Narrativity.” Ratio 17.4 (2004): 428-452.
Superson, A. “Feminist Moral Psychology”, The Stanford Encyclopedia of Philosophy (Winter 2009 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/entries/feminism-moralpsych/>.
Taylor, C. Sources of the Self: The Making of the Modern Identity. Cambridge: Harvard University Press, 1992.
Tessman, L. Burdened Virtues: Virtue Ethics for Liberatory Struggles. New York: Oxford University Press, 2005.
Tomlinson, T. “Perplexed about Narrative Ethics.” In H. L. Nelson (ed.). Stories and Their Limits: Narrative Approaches to Bioethics. New York: Routledge, 1997.
Tong, R. Feminist Thought: A More Comprehensive Introduction, 3rd edition, Boulder, CO: Westview Press, 2009.
Tong, R. and Williams, N. “Feminist Ethics”, The Stanford Encyclopedia of Philosophy (Fall 2014 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/fall2014/entries/feminism-ethics/>.
Tuana, N. “Approaches to Feminism”, The Stanford Encyclopedia of Philosophy (Spring 2011 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/spr2011/entries/feminism-approaches/>.
Vollmer, F. “The Narrative Self.” Journal for the Theory of Social Behaviour 35.2 (2005): 189–205.
Walker, M. U. Moral Understandings: A Feminist Study in Ethics. New York: Routledge, 1997.
Walker, M. U. Moral Contexts. Lanham: Rowman & Littlefield Publishers, 2003.
Watson, G. Free Will. Oxford: Oxford University Press, 2003.
Young, I. M. Justice and the Politics of Difference, Princeton: Princeton University Press, 1990.
Young, I. M. Intersecting Voices: Dilemmas of Gender, Political Philosophy, and Policy, Princeton: Princeton University Press, 1997.
Young, I. M. On Female Body Experience: “Throwing Like a Girl” and Other Essays. New York: Oxford University Press, 2005.

Author Information

Anna Gotlib
Email: AGotlib@brooklyn.cuny.edu
Brooklyn College of City University of New York
U. S. A.

Charles Hartshorne: Neoclassical Metaphysics

Charles Hartshorne (1897-2000) was an intrepid defender of the claims of metaphysics in a century characterized by its anti-metaphysical genius. While many influential voices were explaining what speculative philosophy could not accomplish or even proclaiming an end to it, Hartshorne was trying to show what speculative philosophy could accomplish. Metaphysics, he said, has a future as well as a past. He believed that the history of philosophy exhibits genuine, albeit halting and uneven, progress towards a comprehensive understanding of the nature of existence.

Philosophy was, for him, a dialogue that spans centuries, with partners whose wisdom has a perennial relevance. The two philosophers who most influenced him, and in whose work he found the greatest parallels with his own thinking, were Charles Sanders Peirce and Alfred North Whitehead. Hartshorne was co-editor with Paul Weiss of the first comprehensive edition of Peirce’s philosophical papers, and he served as Whitehead’s assistant during the most metaphysically creative period of the Englishman’s career.

Hartshorne considered the metaphysical views he had begun to develop in his 1923 dissertation as, to a great extent, in pre-established harmony with Whitehead’s philosophy of organism. He indicated that Whitehead helped him sharpen his ideas and gave him a better vocabulary to express them, although there remained important differences between the two philosophers. One difference is that theism was always a central element of Hartshorne’s metaphysics (addressed briefly here, but see “Charles Hartshorne: Dipolar Theism” and “Charles Hartshorne: Theistic and Anti-theistic Arguments”) whereas Whitehead was preoccupied for much of his career with a philosophy of nature and did not introduce God until he developed the speculative philosophy of his later works.

The Nature of Metaphysics
The Question of Method in Metaphysics
Neoclassical Metaphysics
Conclusion: Hartshorne’s Legacy
References and Further Reading

1. The Nature of Metaphysics

After his first book on sensation, Hartshorne’s philosophical work focused mostly on the questions of metaphysics (see “Charles Hartshorne: Philosophy and Psychology of Sensation”). In Creative Synthesis and Philosophic Method, he provides no fewer than a dozen definitions of “metaphysics” which, he argued, differ only as a matter of emphasis. Central to all of Hartshorne’s definitions is that genuinely metaphysical propositions are unconditionally necessary and non-restrictive of existential possibilities. If metaphysical propositions are true at all, they hold true of all possible world-states or state-descriptions. This means that they are propositions which are illustrated or exemplified by any conceivable observations or experiences when such observations or experiences are properly understood or reflected upon.

“Conceivable observation” is here understood in terms of Karl Popper’s notion that observation is always of the form “such and such is the case” rather than “such and such is not the case.” Cognitive definiteness is gained only by noting what is observed, rather than what is not observed, which is indefinite or infinite. Plato argues that negation is parasitic upon affirmation—“that which is not” is not contrary to what exists, but something different than what exists (Sophist 257b). In effect, quantificational criteria for identity can apply only to events that occur, not events which do not occur. The question, “How many storms did not occur?” has no definite answer. In Hartshorne’s view, there are no merely negative facts. Every negation presupposes some actually existing state of affairs. For example, to say that there are no swans in the lake is to say that every part of the lake is occupied by something other than a swan. Or, more generally, to say that swans do not exist is tantamount to saying that every location in the universe is occupied by something other than a swan. Sheer denials (claims purporting to state negative facts) represent an absence of positivity, and this is a key feature of metaphysical error. Properly metaphysical propositions are unique in never being falsified by any actual or genuinely possible states of affairs and in always being verified by actual or genuinely possible states of affairs. They represent, in effect, the kind of necessity defined since Leibniz and found in modern modal logic as “that which is common to all possibilities.”

This distinguishes genuinely metaphysical propositions from other kinds of a priori necessary propositions, such as truths of mathematics and hypothetical necessities. In Creative Synthesis and Philosophic Method (p. 162), Hartshorne maintains that mathematical propositions are non-existential, for they express relations between conceivable states of affairs. “Two apples plus two apples equals four apples” is an existential assertion containing a true mathematical relation, but “two slithy toves plus two slithy toves equals four slithy toves” is a non-existential assertion that nonetheless contains the same true mathematical relation. The bare arithmetic truth that “2 + 2 = 4” is neutral to existential instantiation. Similarly, “The number nine is not integrally divisible by two” is necessarily the case given the conventional meanings of the vocabulary of finite arithmetic. However, although no conceivable state of affairs falsifies the proposition, it is not verified by any conceivable state of affairs. And while hypothetical necessities express necessary relationships between possibilities, Hartshorne takes them to be covert denials that there are any states of affairs which falsify the relation asserted by the conditional. By contrast, genuinely metaphysical propositions are unequivocally affirmative, and their denials can only be sheer denials (as described above), expressions of utter absence or privation. The denials of metaphysical propositions are impossibilities; they are failed attempts to represent that which would never be found among possibilities.

As a prime illustration of a metaphysical truth, Hartshorne used the proposition, “Something exists.” This is properly metaphysical since it could not be falsified under any conceivable observational or experiential circumstances, yet it could be verified by every such circumstance; in fact, to assert both of these features is to assert something that is analytically true of the proposition, since any attempt to verify the proposition would posit, at minimum, a verification-event which would in turn falsify the counter-proposition that “nothing exists.” Some philosophers suggest that it is a contingent truth that something exists, as seems to be assumed by the question, “Why is there something rather than nothing?” In Creative Evolution, Henri Bergson said that one could attempt to arrive at the idea of nonbeing by imaginatively negating every true statement asserting the existence of something. Hartshorne points out, following Bergson, that this thought experiment is self-defeating. It ends in one of two ways: either there is no assertion, but only a denial, or there is an assertion that is self-referentially incoherent such as, “Nonexistence exists.” It is logically kindred to such “nonsense” propositions as “I was told something by nobody” or “I ate nothingness.” There is literally no possible state of affairs that could make “Nothing exists” true. If it is impossible for “Nothing exists” to be true then “Something exists” must be necessarily true.

If Hartshorne is correct that it is impossible for “Nothing exists” to be true, then there can be no state of affairs that meaningfully contrasts with “Something exists.” To say that it is necessary that something exists does not provide any information about any existing thing; in other words, “Something exists” is too abstract to tell one about the concretely existing things (pluralism) or thing (monism) that may exist. This observation, however, presupposes the contrast between the abstract and the concrete. A further metaphysical question, therefore, is the relation that exists between the abstract and the concrete. “Something exists” does not describe an existing thing but rather presupposes the existence of entities (or an entity) more concrete than the sentence itself; this is the case even if, per impossibile, only the sentence existed, for “Something exists” is more abstract than “Only the sentence ‘something exists’ exists.” In light of these kinds of considerations, Hartshorne concludes not only that “Something exists” is necessarily true but also that “Something concrete exists” is as well, where the adjective “concrete” is the contrary of “abstract.” There is a hint of paradox in the fact that “concreteness” is itself abstract, but this leads to another of Hartshorne’s definitions of metaphysics as the study of the abstraction “concreteness.” Indeed, Hartshorne maintains that all metaphysical mistakes are instances of what Whitehead called “the fallacy of misplaced concreteness,” that is to say, of mistaking an abstraction for what is concrete.

Conceivable propositions involve conceivable states of affairs in order for them to count intelligibly as propositions. Natural deduction systems of modern symbolic logic seem to make this supposition as in the decision of Whitehead and Russell in Principia Mathematica to make “there exists something X which either does or does not have an arbitrary one-place predicate P” axiomatic: in effect, they disallow an empty universe of discourse since an empty universe produces incoherence in the system such as counter-instances to the law of Universal Instantiation. While it is to be granted that free logics can avoid this assumption, it is also true that free logics entail difficulties precisely in determining their semantical domains. Most important, free logics that are designed to formalize ordinary language presuppose “objects” in both their inner or outer domains. Despite such monikers as “null inner domains,” such domains assume objects that are non-actual possibles. All free logics that have cognitive import for the description of “possible worlds” assume a semantical domain of objects that are conceptualized on the basis of actual objects or properties; for example, “Batman is a superhero” can be formalized in free logic, but it ultimately makes oblique reference to actualities (bat ears, masks, muscular strength, courage, and so forth) that are posed in non-actual combinations or juxtapositions. In effect, free logics can be interpreted in such a way that they do not contradict basic tenets of Hartshorne’s modal theory. Where cognitively meaningful, they assume objects as values for variables, and they formalize fictional scenarios that indirectly display the conceptual priority of actualities.

Hartshorne contrasts metaphysical propositions with empirical and contingent propositions, which are restrictive of some existential possibilities. An empirical proposition is essentially restrictive, always involving an actualization of a state of affairs that excludes other possible alternatives. For example, “Barack Obama resides in the White House during 2011” tells us about states of affairs obtaining in the White House during 2011, and it tacitly excludes the state of affairs of John McCain, his opponent in the 2008 presidential election, residing in the White House during 2011. This feature of exclusion among alternative possibilities is definitive of contingency, and, for Hartshorne, follows from Leibniz’s insight that the scope of disjunctive possibilities cannot be actualized simultaneously or conjunctively, since there are incompossible possibilities. Thus, the selection among possibilities confronted by natural processes must involve the acceptance of one alternative and the rejection of others, and this is a signature feature of empirical propositions. Hartshorne never considered the many-worlds interpretation of quantum theory, which by virtue of quantum branching into conjunctively realized alternative space-times, denies Leibniz’s principle of contingency as exclusion of alternatives. (For a critique of the so-called “actualist” account of many-worlds ontology and defense of the coherence of process philosophy and quantum theory, see Shields 2008.)

If empirical propositions are essentially restrictive, it follows that every empirical state of affairs is positive, but has negative implications. The denial of these negative implications is also an empirical state of affairs. For example, one alternative to Obama’s having won the 2008 presidential election is Hillary Clinton’s having won it. Since this alternative did not occur, the denial of this alternative, namely, “Hillary Clinton did not win the 2008 presidential election” is true of the actual world. However, if an empirical proposition is one which excludes alternatives, how is this true of negative empirical implications of such propositions? Is not a negative empirical proposition simply an assertion of an absence or privation? Hartshorne holds that this is clearly not the case. What is excluded from actualization in the above negative empirical statement is Hillary Clinton’s winning the 2008 presidential election, and this exclusion is achieved by a positive state of affairs. Positivity and exclusion of possibilities are thus features of all empirical propositions. Thus, unlike metaphysical propositions, empirical propositions have both an affirmative and a negative logical quality.

The division between metaphysics and empirical science is, in principle, clear. Hartshorne notes that, in practice, it is not always clear which statements count as empirical and which as metaphysical. It is well to keep in mind that Hartshorne uses Popper’s idea about falsifiability as a criterion of what it means to be an empirical statement and not as the guiding method of empirical science. Popper elevated falsification over verification as the proper method of science. Hartshorne does not address in a systematic way the question of the proper methods of science; even so, showing that a given statement is falsifiable is, on Hartshorne’s principles, one way in which it can be discredited as a true metaphysical idea. If a true metaphysical claim is falsified by no conceivable observation it is also the case that it is verified by every conceivable observation. Hartshorne holds that verifiability fails as a criterion for empirical statements but succeeds as a criterion for true metaphysical statements. It follows that false metaphysical ideas are falsified by every conceivable observation and verified by none.
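The criteria in this paragraph can be stated schematically. The following is only a gloss and not Hartshorne’s own notation: the set W of conceivable observations and the relations Ver (“verifies”) and Fal (“falsifies”) are symbols assumed here for illustration.

```latex
\begin{align*}
p \text{ is empirical}
  &\iff \exists w \in W\ \mathrm{Fal}(w,p) \ \wedge\ \exists w' \in W\ \neg\mathrm{Fal}(w',p) \\
p \text{ is a true metaphysical statement}
  &\iff \forall w \in W\ \mathrm{Ver}(w,p) \ \wedge\ \neg\exists w \in W\ \mathrm{Fal}(w,p) \\
p \text{ is a false metaphysical statement}
  &\iff \forall w \in W\ \mathrm{Fal}(w,p) \ \wedge\ \neg\exists w \in W\ \mathrm{Ver}(w,p)
\end{align*}
```

On this reading, falsifiability marks off the empirical, while universal verifiability marks off the true metaphysical statement, which is the asymmetry the paragraph describes.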

A nuanced issue emerges, however, when one considers particular case studies of the relationship between metaphysical and empirical propositions in Hartshorne’s theory. Some critics have urged that Hartshorne’s neoclassical positions may sometimes conflict with apparently well-corroborated empirical scientific hypotheses. Among other hypotheses, these include (i) the apparent empirical result from special relativity theory that there is no cosmic simultaneity and thus no privileged or divine time (Hartshorne’s theory of deity posits a temporal God), or (ii) the beginning of physical events in space-time a finite time ago as posited in standard hot big bang cosmologies (Hartshorne’s metaphysics of creativity posits an infinite past of cosmic epochs, the latest of which is our actual cosmos since the purported big bang event). Such apparent conflicts, however, do not actually speak to Hartshorne’s general theory of the difference between metaphysical and empirical propositions. Rather, they concern whether the specific propositions he proposes as metaphysical are in fact illustrated by any conceivable state of affairs.

While Hartshorne can be described as a kind of rationalist insofar as he maintains, like classical rationalists such as Descartes and Leibniz, that metaphysics is a matter of consistent and adequate meanings of concepts, he is hardly a dogmatic “armchair” or purely speculative philosopher who desires no engagement with the special empirical sciences—his first and thirteenth books demonstrate that he was a serious psychologist and ornithologist. His rationalism is in fact “critical” and rather severely qualified. For instance, a propos of the above comment regarding the question of the “success” of his metaphysical project, Hartshorne speaks in Creative Synthesis (Ch. II) of metaphysics as our quite “contingent ways of trying to become conscious of the non-contingent ground of contingency,” and he insists on the qualification that the notion of the a priori should hardly be conflated with the epistemic notion of “certainty.” With Whitehead, Hartshorne insists that philosophers should be epistemically wary by avoiding the “dogmatic fallacy” such as found in the confidence of the Continental rationalists. In “The Development of My Philosophy” (1970) Hartshorne declares, “All philosophizing is risky: cognitive security is for God, not for us.”

There are at least three considerations which make it clear that, at the very least, it is not obvious that Hartshorne’s neoclassical metaphysics conflicts with the above mentioned empirical hypotheses, or that he is cavalier about empirical challenges. Following Popperian distinctions, Hartshorne never claimed that his proposed metaphysics is in principle exempt from empirical disconfirmation, although it is exempt from the quite distinct notion of empirical confirmation. If a “metaphysical” proposal really does conflict with an empirical fact, then it is disconfirmed and fails to be a genuinely metaphysical proposition. No genuinely metaphysical proposition, however, could be “empirically confirmed” in the standard sense that some restrictive state of affairs as opposed to another illustrates the proposition, because this would deny the universality of the candidate metaphysical proposition’s requirement that it be illustrative of any conceivable state of affairs. This requirement does not prevent it from being the case that some states of affairs are phenomenologically “privileged” in the sense that certain metaphysical truths may be more readily apparent in special cases of phenomena. Hartshorne agrees with the early Heidegger that metaphysics can be about profoundly general concepts, yet such concepts are neither phenomenologically vacuous nor inexplicable nor utterly without discernible structure. For instance, the process metaphysical theory of the necessarily “social structure of all experience” might be seen with particular clarity via the special phenomenon of the “active concern” (Heidegger’s Sorge) of human being.

The determination of the relevant “empirical facts” (or interpretations of them) which a philosopher is forced to accept is a subtle, highly theory-dependent and much disputed matter, especially regarding the above mentioned cases of relativity theory and big bang cosmology. For example, it is not clear or agreed upon by philosophers of science that relativity physics establishes that time is “relative” even in Newton’s sense or that special relativity robs us of any objective, uniform notion of temporal modes of past, present, or future; nor is it clear that the standard big bang model, even if sound, “proves” the absolute finitude of either time or creative process as such. W. H. Newton-Smith, in The Structure of Time, argues that the notion of an “empirical” proof of a beginning of time even when granting a big bang singularity is highly problematic.

Hartshorne was cognizant of the prima facie tensions between relativity and big bang theory and his neoclassical metaphysics, and he offered plausible conciliatory suggestions. For example, consider his embrace of quantum physicist H. P. Stapp’s notion of a primordial, asymmetrically well-ordered sequence of events upon which space-time location is dependent. Stapp’s idea harmonizes the relativity of spatio-temporal observations dependent upon light-cone propagation and the ultimacy of ontological asymmetry demanded by process theory. Consider also Hartshorne’s observation that big bang theory establishes, at most, the contingent origin and present physical chronometry of time appropriate to our “cosmic epoch.” At any rate, whether or not these conciliatory suggestions are successful, Hartshorne attempted to follow through with the directives of his theory of metaphysics. As he says in Creative Synthesis, “there must be an at least possible way of harmonizing what physicists say is true of our epoch and what metaphysicians say is true of all possible epochs (since it forms the content of ideas of such generality that there is nothing we can think which is not a specialization of this content).”

2. The Question of Method in Metaphysics

That Hartshorne thought at length about questions of philosophical method can be inferred from what Paul Weiss called the systematic “machinery” at work in his metaphysics, and from the very title of one of his most important mature philosophical works, Creative Synthesis and Philosophic Method. Hartshorne’s method for neoclassical metaphysics results from both original insights and critical reflection on a wide swath of variegated influences. These range from the work of American pragmatists (especially Peirce), to phenomenology, to the speculative thought of Whitehead, to the work of analytic philosophers (with particular attention given to Popper and the logical investigations of his Harvard teachers Lewis and Sheffer as well as his University of Chicago colleague Carnap). The section titled “Reply to Everybody” published in The Philosophy of Charles Hartshorne lists no fewer than twenty-one methodological principles to be used in the proper adjudication of metaphysical claims. Among the most important of these are what could be termed the principles of “positivity,” of “dipolar contrast,” of “inclusive asymmetry,” and of Peirce’s doctrine of “position matrices or diagrams.” We explained the principle of positivity, or the rejection of purely negative facts, in the previous section, so let us turn to a discussion of the other principles.

a. Dipolarity

Hartshorne’s principle of dipolar contrast derives, in part, from the semantic “law of polarity” found in Morris R. Cohen’s A Preface to Logic. Following Cohen, Hartshorne holds that genuine metaphysical concepts are semantically interdependent. In effect, such concepts have logical contraries which cannot mean anything in utter isolation from one another. In spite of the extreme generality of metaphysical concepts, each such concept entails a polar contrast to it. Even the highly general concept “reality” requires that the concept “unreality” be assigned some meaning. To use Hartshorne’s illustration, the concept of “reality” ought to include the notion of having mental states, but the concept of “unreality” should include the notion of intentional objects of real mental acts which fail to designate anything extra-mental. Perhaps a more telling example could be found in the notion of necessity. A standard definition of necessity is “that which has no alternative,” yet alternativeness clearly invokes contingency, since a contingent state of affairs is to be characterized as “this rather than that alternative.” Hence, the semantical analysis of necessity invokes contingency. For Hartshorne, then, each metaphysical concept has a corresponding contrast: necessity requires contingency, being requires becoming, unity requires variety, and so on, for any concept that is non-restrictively general, having applicability across possible states or state-descriptions. The two interdependent contraries in each case warrant the term dipolarity.

Lack of recognition of dipolarity is, for Hartshorne, a chief difficulty in previous metaphysical theories that suppress expression of a polar contrast. In effect, they suffer from a certain conceptual poverty or “fallacy of monopolarity.” Monopolar theories allow expression of only one pole of a pair of contrasts; stated obversely, they completely deny one pole of a pair of contrasts. One example of denying dipolarity is monistic theories such as that of Spinoza, which allow causal necessity and internal relatedness, but which disallow contingency and external relatedness. At the opposite “monopolar” extreme are logical atomist theories like that of Russell, which allow causal contingency and external relatedness, but which disallow causal necessity or internal relatedness. Hartshorne asks if these contrary extremes make any more sense than supposing that doors must have hinges on both sides or on neither side. Hartshorne’s metaphysical project is guided by the observance of dipolarity and thus conceptual inclusiveness; in his view, a neoclassical process theory of reality is structurally dipolar and offers comprehensive accommodation of both necessity and contingency, both causal determination and a degree of freedom from such determination, both internal and external relations, and so forth, throughout the range of metaphysical polar contrasts.

b. Inclusive Asymmetry/Concrete Inclusion

Hartshorne’s principle of dipolarity is complemented and qualified by a principle of inclusive asymmetry or concrete inclusion. As Hartshorne points out, the principle of dipolarity does not justify metaphysical dualism. One should distinguish between asserting that a metaphysical concept requires a contrary polar conception in its definition, and asserting that two polar concepts have an equivalent metaphysical status. It may well be the case that one concept requires the other polar concept in its definition, while the other polar concept both requires the polar contrast in its definition, and yet is itself the ground or source of that polar contrast. In other words, it may be the case, as Hartshorne asserts, that dipolarity is itself grounded in a logically asymmetrical relation between the contraries.

The model for this relation can be seen in logical implication, which Hartshorne, following Peirce’s trail-blazing work on “illation” as logically fundamental or primal, takes to be the ultimate concept in formal logic and a resource for metaphysical generalization. “p implies q” means that p implies both itself and q. This can be formally expressed in the tautology that (p ⊃ q) iff [(p ⊃ p) & (p ⊃ q)]. (This result is mirrored in Lewisian systems, in which the corresponding formula, with strict implication in place of material implication, is a theorem.) However, given a standard material implication, p ⊃ q (where p and q are not equivalent in meaning), we cannot say conversely that q logically implies p. This is reflected in the fact that the correlative formula (p ⊃ q) iff [(q ⊃ q) & (q ⊃ p)] is not a tautology, for it is false whenever p and q have opposite truth values, and thus implicitly involves a species of the “fallacy of affirming the consequent.” (Analogously, the similar formula using strict implication is not a theorem.) Thus, entailment is essentially asymmetrical.
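The two truth-table claims can be checked mechanically. The following is a minimal sketch for illustration only; the helper names (`implies`, `forward`, `converse`, `is_tautology`) are assumptions introduced here and are not drawn from Hartshorne’s texts.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

def forward(p: bool, q: bool) -> bool:
    """(p > q) iff [(p > p) & (p > q)], with '>' read as material implication."""
    return implies(p, q) == (implies(p, p) and implies(p, q))

def converse(p: bool, q: bool) -> bool:
    """(p > q) iff [(q > q) & (q > p)], the correlative formula."""
    return implies(p, q) == (implies(q, q) and implies(q, p))

def is_tautology(formula) -> bool:
    """True if the formula holds on every row of the two-variable truth table."""
    return all(formula(p, q) for p, q in product([True, False], repeat=2))

# Rows on which the correlative formula fails.
counterexamples = [
    (p, q) for p, q in product([True, False], repeat=2) if not converse(p, q)
]

print(is_tautology(forward))   # the first formula holds on all four rows
print(is_tautology(converse))  # the correlative formula does not
print(counterexamples)         # exactly the rows where p and q differ
```

The failing rows are exactly those on which p and q take opposite truth values, which matches the condition stated above.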

Consider furthermore the defining power of the various connectives of standard systems of propositional logic. For Hartshorne, it is immensely significant that the defining power of propositional operators or functions “varies inversely with symmetry.” The symmetrical function of logical equivalence, as in “p if and only if q,” has the least defining power of the propositional functions, since, even when combined with negation, it can be used to produce only eight (including itself) out of the sixteen binary propositional functions. On the other hand, the directional or asymmetrical functions, which contrast with the equivalence function, are constitutive of entailment. Hartshorne points out that Peirce, and then Sheffer, were the first to see that either the negation of a conjunction (“not both”) or the conjunction of negations (“neither/nor”) is singly sufficient to define all the others.
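These counts can be confirmed by brute force. The sketch below represents each binary propositional function by its column of four truth values and closes the two projections p and q under a chosen set of connectives; the function `closure` and the variable names are assumptions made for this illustration.

```python
from itertools import product
from operator import not_

# The four rows of a two-variable truth table: (p, q) = TT, TF, FT, FF.
ROWS = list(product([True, False], repeat=2))
P = tuple(p for p, _ in ROWS)  # truth-value column of the function "p"
Q = tuple(q for _, q in ROWS)  # truth-value column of the function "q"

def closure(unary_ops, binary_ops):
    """All truth-value columns definable from p and q using the given connectives."""
    known = {P, Q}
    while True:
        new = set(known)
        for op in unary_ops:
            new |= {tuple(op(a) for a in f) for f in known}
        for op in binary_ops:
            new |= {
                tuple(op(a, b) for a, b in zip(f, g))
                for f in known
                for g in known
            }
        if new == known:  # nothing further can be defined
            return known
        known = new

equivalence_and_negation = closure([not_], [lambda a, b: a == b])
stroke_only = closure([], [lambda a, b: not (a and b)])  # "not both"
dagger_only = closure([], [lambda a, b: not (a or b)])   # "neither/nor"

print(len(equivalence_and_negation), len(stroke_only), len(dagger_only))
```

On this representation, equivalence together with negation yields eight of the sixteen columns, while either Sheffer function alone yields all sixteen.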

The Sheffer functions (the “stroke” and “dagger”) are the most definitive functions, but they possess a triadic asymmetry that yet includes dyadic symmetry. We see this, Hartshorne notes, in their truth-tabular definitions. The Sheffer stroke is false if and only if both propositional variables are true, while the Sheffer dagger (also Peirce’s ampheck) is true if and only if both propositional variables are false. In effect, the triadic relation of the stroke, that is, the truth-value product of the binary Sheffer construction p|q, which is dyadically symmetrical in terms of its propositional truth-value assignments (p is true and q is true), stands as an asymmetry in terms of its truth value (that is, it is false in relation to symmetrical truth). Hartshorne finds a metaphysically ultimate pattern here, namely, symmetry within an all-embracing asymmetry.

Hartshorne holds that the relation between dipolar metaphysical contraries exhibits this asymmetrical structure. As an illustration, consider his argument in Creative Synthesis that “becoming” logically contains its polar contrast “being,” but not the converse. Suppose there is a reality, X, that does not come to be, that is eternal, and another reality, Y, that does come to be. The total reality, XY, is not eternal; XY comes to be, for Y itself is not eternal. This shows that becoming is the more inclusive category, for it preserves itself (becoming) and its polar contrast (being). No comparable argument can show that being can include becoming without destroying the contrast. The concrete or definite, the creatively cumulative, is the inclusive element, and is the key to the abstract, not vice versa. The relation of the concrete and the abstract is neither that of sheer conjuncts, as posited by varieties of dualism, nor that of constituents of some mysterious “third” entity; rather, in consonance with both Whitehead’s ontological principle and Aristotle’s ontological priority of the actual, it is that of “the abstract in the concrete.”

In his “Logic of Ultimate Contrasts” (Creative Synthesis, Ch. VI and Zero Fallacy, Ch. VII), Hartshorne calls the concrete terms in a pair of metaphysical contraries the r-terms (correlated with Peirce’s categoreal “seconds” and “thirds”), while abstract terms are called a-terms (correlated with Peirce’s categoreal “firsts”). While he provides 21 r-terms and 21 a-terms in his table of metaphysical contraries, a few samples could be taken as illustrative, especially given his Rule of Proportionality, namely: as any one r-term stands to its contextually correlated a-term, all other r-terms stand to their contextually correlated a-terms.

Hartshorne argues that each r-term includes its correlative a-term, but not vice versa. Given the items above, we see that, for Hartshorne, the analysis of experience should be constructed so as to include the notions that objects or things experienced are independent of or externally related to the contingent acts of experience which include the objects as their necessary (but not sufficient) conditions. If correct, these conceptual relations all exhibit the essential asymmetry of entailment. Yet, there is a two-way necessity within this overall asymmetry, for while the relation of logical inclusion falls always on the r-term side of the table, a-terms nonetheless necessitate that “a class of suitable r-term correlates be non-empty.” For example, the necessary can be expressed, Hartshorne contends, as “the non-emptiness of the class of contingent states of affairs.” (This particular rumination is a key feature of Hartshorne’s revision of the ontological argument; see “Charles Hartshorne: Theistic and Anti-theistic Arguments”.)

While the detailed adjudication of each r-term/a-term relation is a complex affair that cannot be presented here, it is interesting to notice that some independent considerations of modern logic arguably shore up Hartshorne’s basic principle of r-term inclusion. For example, Hartshorne pointed to the fact that an important theorem of contemporary modal logic “mirrors” the logical inclusiveness of contingent concreta or “r-terms” in juxtaposition with abstract necessity or “a-terms,” namely, the theorem that [(Np & ~ Nq) ⊃ ~ N(p & q)], where N is a modal operator for “necessarily.” That is, the conjunction of a necessary proposition and a contingent (assertoric) proposition is itself modally contingent, so that contingency in a relevant sense “includes” necessity rather than vice versa.
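One way to see why the formula is a theorem is the short derivation below. It is a sketch that assumes a normal modal logic (the rule of necessitation and the distribution axiom K), which is an assumption about the background system rather than something specified in Hartshorne’s discussion; the symbols ∧ and ¬ stand for the “&” and “~” used above.

```latex
\begin{align*}
1.\;& \vdash (p \land q) \supset q
      && \text{propositional tautology} \\
2.\;& \vdash N\big((p \land q) \supset q\big)
      && \text{necessitation, from 1} \\
3.\;& \vdash N(p \land q) \supset Nq
      && \text{axiom K and modus ponens, from 2} \\
4.\;& \vdash \lnot Nq \supset \lnot N(p \land q)
      && \text{contraposition, from 3} \\
5.\;& \vdash (Np \land \lnot Nq) \supset \lnot N(p \land q)
      && \text{strengthening the antecedent, from 4}
\end{align*}
```

On this derivation the non-necessity of a single conjunct is enough to fix the modal status of the whole conjunction, whatever the status of the other conjunct.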

c. Position Matrices

Hartshorne also holds that metaphysical theories can be tested by subjecting them to processes of rational elimination and/or comparison of cognitive costs that begin with a formal logical elaboration of theoretical possibilities. This idea has its origin in Peirce’s doctrine of position matrices or diagrams. The point here is that no philosophical topic can be declared fully rationally adjudicated until the constituent fundamental aspects of that topic have been subjected to an exhaustive “mathematical analysis.” Much error can occur unless and until all possibilities have been foreseen and subjected to thorough rational consideration. Consider the issue of the God-world relationship in philosophical theology. Hartshorne argues that there are sixteen combinatorial possibilities for theological and atheological models of this relationship when we notice that the concepts of God and world can each be ontologically necessary, be ontologically contingent, possess these modal properties in diverse aspects, or be neither ontologically necessary nor contingent. In the following matrix, upper case letters (N and C) represent ontological modalities as applied to God and lower case letters (n and c) represent ontological modalities as applied to the world. The zero case (O) represents lack of modal status or impossibility:
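The sixteen combinations can be generated mechanically. The sketch below is an illustration, not a reproduction of Hartshorne’s own table: the order of rows and columns, the dotted labels (modeled on notations such as N.c and O.cn), and the function name `position_matrix` are all assumptions.

```python
# Modal options for God (upper case) and for the world (lower case).
# The ordering within each list is an assumption made for display purposes.
GOD = ["N", "C", "NC", "O"]
WORLD = ["n", "c", "cn", "o"]

def position_matrix():
    """Return a 4 x 4 grid of labels, one row per world option."""
    return [[f"{g}.{w}" for g in GOD] for w in WORLD]

matrix = position_matrix()
for row in matrix:
    print("  ".join(f"{cell:<5}" for cell in row))

# Flatten the grid to confirm that there are sixteen distinct positions.
positions = [cell for row in matrix for cell in row]
print(len(positions), len(set(positions)))
```

Each cell pairs one of the four options for God with one of the four options for the world, which is where the figure of sixteen comes from.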

Hartshorne’s matrix provides a method of making distinctions among various types of historically significant worldviews as well as highlighting the distinctiveness of his own position. For example: Parmenidean monism or classic Advaita Vedanta can be symbolized as N.o; early Buddhist thought is O.cn; Aristotle’s theism is N.cn; Aquinas’ theism is N.c; Stoic and Spinozistic pantheism is N.n; Laplacean atheism is O.n; John Stuart Mill’s theism and most forms of deism are C.n; William James’s theism is C.c; Jules Lequyer’s is NC.c; Bertrand Russell’s atheism is O.c. Hartshorne argued that his preferred option (NC.cn) is the most formally inclusive of the theoretical options, and that no two of the specific options are logically compossible (otherwise we would have modal incoherence or contradiction).

Hartshorne’s presentation of the position matrix representing necessity and contingency as applied to God and world developed over the course of his career. He did not come to the four-row, four-column arrangement until after his ninetieth birthday, with the help of Joseph Pickle at Colorado College. A more substantive change was in the way that Hartshorne interpreted the zeros. In Creative Synthesis, the zeros are the atheistic and acosmic positions. In later discussions, however, he interprets the zeros more broadly as “God is impossible (or has no modal status)” and the “World is impossible (or has no modal status).” To illustrate the difference between these interpretations consider the position of W. V. O. Quine. He would say that God does not exist, the world does exist, but the world has no modal status. This option cannot be represented as O.n, O.c, or as O.cn since each presupposes modal status for the world. Nor can it be represented as O.o without serious distortion, since Quine does not deny that the world exists. Another illustration of the problem is Robert Neville’s emphasis on apophatic theology. In Neville’s view, the necessary/contingent contrast is a product of God’s creative act; God cannot be characterized as either necessary or contingent, but only as indeterminate, at least prior to the act of creation. Hartshorne’s table, as presented here, finesses these issues by interpreting the zeros in a strictly formal fashion to mean “neither necessary nor contingent,” leaving open the possibility of further refinement.

Whatever one’s ultimate convictions about this particular topic, Hartshorne’s approach arguably represents an advance in metaphysical or philosophical theology since it provides a matrix that may well suggest missed possibilities in traditional or conventional ways of thinking about the topic. Furthermore, Hartshorne’s method can be extended: similar 16-fold matrices can be made for other polar contrasts such as infinite/finite, eternal/temporal, and so on. If any two matrices are combined (16 × 16), the number of formal alternatives leaps to 256. More generally, if m equals the number of contrasts one wishes to include in talking about God and the world, then 16^m is the number of formal alternatives available. Hartshorne’s matrices have no apparent antecedent in the history of philosophical theology. It is no wonder, therefore, that he considered them one of his original contributions to metaphysics.
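The growth in the number of alternatives is simple exponentiation, as the following minimal sketch shows. The function name `formal_alternatives` is an assumption introduced for illustration.

```python
from itertools import product

def formal_alternatives(m: int) -> int:
    """Number of formal alternatives when m sixteen-fold matrices are combined."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return 16 ** m

# Cross-check the formula by enumerating every pairing of cells
# drawn from two sixteen-cell matrices.
pairs = list(product(range(16), repeat=2))

print(len(pairs), formal_alternatives(2))
print([formal_alternatives(m) for m in (1, 2, 3)])
```

Three contrasts taken together already yield 4,096 formal alternatives.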

3. Neoclassical Metaphysics

Hartshorne referred to his metaphysics as “neoclassical” to emphasize its continuity with classical traditions, especially as they sprang up in antiquity from the Presocratic philosophers and from Plato and Aristotle. He was also keen to stress that his views are importantly different, or new (“neo”), in their substantive claims. He would eventually highlight the parallels of his metaphysics with ideas in early Buddhist thought. The family of metaphysical views to which Hartshorne’s ideas belong is often called process philosophy or, following Bernard Loomer, process-relational philosophy. One finds anticipations of process-relational philosophy in Peirce’s tychism, James’s pluralistic universe, and Bergson’s la durée. Hartshorne was influenced by these philosophers (with Peirce being the most dominant of the three) but his greatest debt was to Whitehead.

a. Creativity

Philosophers venture various hypotheses as to the character of the finally real constituents of existence. One remembers Parmenides’ Being, Democritus’ tiny impenetrable atoms, Aristotle’s hylomorphic ousia, Descartes’ dual substance ontology, Leibniz’s monads, and Whitehead’s actual entities. Hartshorne adopted a Whiteheadian view, sometimes speaking of “dynamic singulars” instead of “actual entities.” Dynamic singulars are instances of what Hartshorne called “creative experiencing,” an expression that suggests an activity of synthesis, a bringing together of diverse elements from an entity’s antecedent world into a unity of feeling. Hartshorne often used Whitehead’s word “prehension” to name the feelings from which a dynamic singular weaves its own experience from the welter of data from its past. The “diverse elements” from the past that are synthesized are themselves instances of creative experiencing; for this reason, Hartshorne was fond of the expression “feeling of feeling,” which is close to Whitehead’s language in Process and Reality (Ch. X, sec. II). The prime example on which both Whitehead and Hartshorne model this activity is memory. Memories are themselves experiences that may have previous experiences as component parts. Moreover, memories are active in the way that they highlight some items of experience but place other items in the background, sometimes almost forgotten. Memory also serves as a model of the way emotional tone suffuses experience, in accordance with Hartshorne’s theory of the affective continuum. Finally, in keeping with process-relational philosophy, memory is a process, a coming-to-be, and not an unchanging substance; its very existence, moreover, depends upon its relation to past events.

Hartshorne agreed with Whitehead when the latter spoke of creativity as “the category of the ultimate.” In Whitehead’s words, “the many become one and are increased by one” (Process and Reality, Ch. II, sec. II). For both Whitehead and Hartshorne, creativity is not itself a substance but rather the name for the activity that characterizes every concrete particular, from the lowliest puff of existence to God. Thomas Aquinas restricted creativity in the strict sense to deity alone. Whitehead and Hartshorne, on the other hand, treat creativity as what medieval thinkers called a transcendental, a universal concept that is not restricted to this or that kind of real thing but which identifies a thing as such as real. Another departure from traditional ideas about creativity is that, for Whitehead and Hartshorne, creativity is never from nothing (ex nihilo), whether it is God’s creativity or the creativity of individuals within the cosmos. According to Hartshorne, the “nothing” in the expression “creatio ex nihilo” would be a purely negative fact. As noted in a previous section, Hartshorne rejects the existence of such facts. Thus, Hartshorne concluded that a creative act always presupposes an antecedent world from which the novel act arises.

Hartshorne’s emphasis on creativity illustrates his commitment to the principle—summarized in the previous section—that that which comes-to-be (becoming) includes but is not included by that which is but does not come to be (being). Hartshorne insists on taking “becoming” in the strictest sense as a process that adds to the definiteness of reality something that was not included in the class of real things prior to the act of becoming. Nothing corresponds to the word “reality” considered as a single nontemporal or eternal fact; rather, reality grows with every act of becoming and is, as it were, defined by them. Hartshorne rejects the idea that there is literally “nothing new under the sun”; on the contrary, there was a time when even the sun was new. Hartshorne is not simply reaffirming the flux of Heraclitus where all concrete things change; he is affirming that reality is a growing totality, an idea that is also prominent in Peirce’s evolutionary cosmology. The growth of reality, moreover, is thoroughly temporal—time itself is the process of creation. The past is determinate, the future is a field of relatively indeterminate possibilia, and the present is the process of determination. Finally, Hartshorne argues that what comes to be, once it has become fully determinate, is a permanent fixture of all subsequent becoming, guaranteed in the final analysis by God’s memory of it. This is why Hartshorne speaks of creation as a cumulative process.

b. Psychicalism

The fact that, for Hartshorne, experience is ontologically foundational means that his metaphysics is a type of what has traditionally been known as panpsychism. Early in his career, Hartshorne used “panpsychism,” distinguishing true and false versions of the doctrine. Later he preferred “psychicalism” and he said that he did not object to David Ray Griffin’s word “panexperientialism.” Hartshorne attributed mind-like qualities to every concrete particular (that is, dynamic singular), but his metaphysics cannot be described as anthropomorphic. He accepted Leibniz’s two-fold criticism of Descartes that self-consciousness is not the only form of human experience, and that human experience is not the only form of experience. In his second book, Beyond Humanism, Hartshorne points out that a dog need not become a human in order to suffer. In keeping with the theory of the affective continuum Hartshorne conceives mind-like qualities as existing along a continuum from the simplest feelings to the most complex thoughts. He argues that it is precisely in its psychological characteristics that it is possible for a nonhuman being to be infinitely other than a human being. This is because psychological variables such as memory, feeling, and volition are infinitely variable. Memories are conceivably of any span (a few seconds, a million years, and so forth) and of any condition of vagueness or precision; feelings can be any degree of intensity or complexity; volitions, which presuppose memory and feeling, are likewise infinitely variable.

Hartshorne denied the assumption of much of modern philosophy that an experience can have only itself as an object. The errors of waking experience as well as the false impressions during dreams provide no sure ground for a global skepticism—in the words of Peirce, “as if doubting were as easy as lying.” Hartshorne maintains that the question “What if all of our experience is a dream?” is based on a faulty phenomenology of dreaming and he points to Henri Bergson’s small book, Le rêve. Bergson argued that, during dreams, perceptions are indistinct, memory is free-floating, and attention is mostly disengaged, but the connection with the world through the body is never severed. Events and concerns of the day as well as immediate stimuli from the environment regularly appear in our dreams. Hartshorne gives the example of having dreamed of a propeller airplane and, as he awoke, hearing the sound of the airplane blend imperceptibly into the sound of a fan blowing in the room. As perception is not lacking in dreams, so more generally experience is always of something not itself. What philosophers call “the given” in experience are, according to Hartshorne, the independent causal conditions of the experience. Introspection too conforms to this model: it is a present experience having the immediately previous experience as an object. Experience, at every level imaginable, is essentially social—dynamic singulars feeling the feelings of others.

Hartshorne rejects the assumption that minds are essentially non-physical entities. Even Descartes, who argued for this claim, acknowledged that certain mental qualities are experienced as spread throughout one’s body or as being in specific regions of the body. Mental and physical qualities are indeed distinguishable but it does not follow that they are separable. Descartes raised the question of the criteria for the presence of mind in a physical object, thereby making materialism the default position for anything outside one’s own consciousness; since, however, mind-like qualities are so pervasively present in varying degrees in so much of nature, Hartshorne asked for the criteria for the absence of mind. The problem, as Hartshorne sees it, is as much with the concept of mind as with the concept of the physical or of matter. He raises the question whether there is anything that positively corresponds to the concept of a merely physical entity, that is to say, a physical entity in which mind-like qualities—not simply human mind-like qualities—are wholly absent. To be sure, there are physical entities in which mind seems to be absent, but Hartshorne argues that this is no more evidence of the absence of mind than the appearance of inactivity in a physical object is evidence that there is no activity in it. Leibniz guessed otherwise and modern science is on his side; the micro-world, even where apparently “dead matter” is concerned, is buzzing with activity. The old adage, “absence of evidence is not evidence of absence” applies.

In arguing for the ubiquity of mind-like qualities, Hartshorne found inspiration in certain aspects of Leibniz’s panpsychism. With Leibniz, he distinguished parts and wholes. The parts (Hartshorne’s dynamic singulars) have mind-like qualities even if some of the wholes they compose lack them. He argues by analogy that feeling can be everywhere even though not everything feels. For instance, a flock of birds does not have feeling, but there are feelings in the individual birds. Hartshorne explains the difference using a modified version of Leibniz’s concept of a “dominant entelechy” according to which some physical systems are organized in such a way that the experiences of the dynamic singulars (the parts) can be channeled into a single more or less unified stream of experience or even conscious experience, as in the case of animals with complex nervous systems. In Hartshorne’s theory, the body not only reacts to the world around it, but also reacts to itself. We feel the feelings of at least some of our cells. As Hartshorne said, hurt my cells and you hurt me. Some organic wholes, such as plants, do not have a structure integrated enough to allow for a dominant stream of experience. Hartshorne viewed plants as having no feeling, but he attributed feelings to their individual cells. He held that the phototropism of a flower tracking the sun is more a function of the activity of the cells than of the plant as a whole. Hartshorne generalizes this analysis along Leibnizian lines to the inorganic world. Leibniz spoke of monads in inorganic substances as being in a “stupor.” Hartshorne attains a similar result in his theory of the infinite variability of mind-like qualities. There is no such thing as “mere matter,” only matter in which mind-like qualities are far removed from what is recognizably human-like, animal-like, or even cell-like. With Leibniz’s distinctions, Hartshorne is able to theorize that there is experience in every object, but not every object of experience is an experiencing object.

Despite Hartshorne’s use of Leibniz’s ideas, the dissimilarities between their versions of panpsychism are as striking as their similarities. As already noted, dynamic singulars are entities that come to exist in the creative-cumulative advance of the world; Leibniz’s monads do not come to exist within the universe but are coexistent with it. For Leibniz, God’s creation of the universe is nothing more than God’s creation of the monads that make it up. Another significant difference between the two philosophers concerns relations of cause and effect. Leibniz denied causal relations among nondivine monads, which are “sans fenêtres” (windowless); he secured the appearance of relations of cause and effect by positing a divinely imposed pre-established harmony. For Hartshorne, every dynamic singular is both a partial result of causal conditions that precede it and a partial causal condition of events that succeed it. In short, every dynamic singular is both an effect and a cause. The word “partial,” especially as regards the relation from cause to effect, is important. Hartshorne rejected determinism (see below), and this represents another departure from Leibniz. For Hartshorne, causal conditions are necessary, but not entirely sufficient, for the emergence of a dynamic singular. The individual’s response to its own causal past (the way it synthesizes the world given to it) provides an ineradicable aspect of the explanation for why it is the way it is. It acts and is not merely acted upon. According to Hartshorne, the same principle applies to God, although allowance must be made in the divine case for the modal difference between existence and actuality (see below). The twin ideas that there are real relations among dynamic singulars and that each is unique by virtue of its manner of experiencing the world highlight two features of Hartshorne’s metaphysics. First, reality has a social structure (see below, under “personal identity” for a discussion of the meaning of “social”); second, every concrete particular that “makes” the world retains at least a minimal degree of freedom.

One objection to Hartshorne’s theory is that mental qualities seem to require a central nervous system. In Beyond Humanism, Hartshorne makes several points that are crucially relevant to this objection. He notes that among animals with central nervous systems, physical and psychical qualities are correlated. Hartshorne observes, in an almost Teilhardian turn of phrase, that physical complexity is a sign of psychical complexity. The more complex is the mental life, the more complex is the nervous system that underlies it. Can one generalize beyond creatures with a nervous system? Hartshorne points out that one-celled animals manage the functions of digestion, oxygenation, and locomotion without the organs and body parts that in creatures with nervous systems make these possible. He asks whether mental function, broadly conceived, may not be analogous. Is it any more reasonable to say that a paramecium feels nothing because it lacks a central nervous system than it is to say that it cannot swim because it has neither motor nerves nor muscle cells? If it has primitive feelings, then it displays them behaviorally in the only way it could, by responding to stimuli. Hartshorne argues that the only conclusion that can be drawn from physiology is that similarity of mind between a one-celled creature and a human is limited by the dissimilarity of their bodies. Physical wholes insufficiently organized to allow a dominant stream of experience are the closest thing in Hartshorne’s philosophy to what materialists call “matter.”

An important objection to Hartshorne’s psychicalist theory is suggested in the work of Karl Popper. In his classic treatise on the mind-body problem titled The Self and Its Brain (co-authored with neuroscientist John Eccles), Popper objected to “psychicalist” or “proto-mental” conceptions of the brain’s elementary particles, arguing that such conceptions have no empirical explanatory power and are thus “metaphysical in the bad sense.” Popper maintains that elementary particles can have no “interior states” because they are “completely identical whatever their past states.” For example, any arbitrary proton selected at any time for measurement will have the same physical properties as any other proton selected at any time for measurement: its mass will be 938 MeV/c²; its charge +1; and its spin ½.

Contra Popper, it does not follow from this that elementary particles are absolutely, predicatively identical no matter what their past states. To use Hartshorne’s dipolar vocabulary, Popper is here conflating “gen-identity” (identical characteristics over time) and “strict identity.” Such properties as mass, charge and spin are gen-identical features of protons that are present in each proton-occasion. However, protons do not remain static in terms of their empirically discernible behavior over periods of time. For example, a proton P in a tritium nucleus (a hydrogen nucleus with one proton and two neutrons) has a rate of radioactive decay, whereas a distinct proton P* in a lead-206 nucleus (one of the four “stable” isotopes of lead) has no such decay, as is now familiar to us through the half-life law of radioactive decay. Notice that the behavioral differences occur precisely because of differences in physical contexts. That physical context matters to the behavior of protons is readily explicable in a Hartshornean interpretation of elementary particle-occasions, because such particle-occasions are “open” to their environments (in Whitehead’s vocabulary, the environments are their “actual worlds”) through prehension. More recent empirically well-corroborated developments in quantum physics are likewise readily explicable in Hartshorne’s psychicalist interpretation, again through the notion of prehension. One may note in this regard the phenomena (a) that “information transfer or influence” occurs between well-separated particles faster than light-cone propagation (that is, quantum entanglement) and (b) that physical states are discernibly influenced by the selection and rapidity of an observation or measurement process (that is, the quantum Zeno effect). It may well be no accident that one of the first philosopher-physicists to devise experimental tests for quantum entanglement phenomena was Abner Shimony, a student of Hartshorne’s at the University of Chicago, who has remained indebted to the “Whiteheadian paradigm.” In neuroscience, the emergence of neuroplastic phenomena in which rigorously repeated thought or “attentional” exercises have an empirically discernible effect upon brain metabolism, as shown through PET scans, also suggests a top-down causation model which again can be readily handled by a Hartshornean interpretation of particle-occasions as prehensive. Thus, Popper’s dismissive estimate of the empirical explanatory power of psychicalist or panexperientialist concepts seems to be, at the very least, seriously challenged by more recent developments in physics and neuroscience.

Hartshorne believed that his concept of the infinite variability of mind-like qualities provides the theoretical bridge to extend the categories of experience beyond the human, the animal, or even the organic. He does not deny that these speculations about the possibility of radically non-human or non-animal minds are, for the foreseeable future, of little or no use to much of science. Physics, for example, need not worry whether atoms or electrons have “feelings”; but this may simply be a way of saying that what is of interest to metaphysics is not necessarily of interest to physics. In a 1934 article in The New Frontier, Hartshorne characterized physics as the behavioristic aspect of the lowest branch of comparative psychology, or even of comparative sociology since reality, in his view, has a social structure. Hartshorne argued further that psychicalism is the metaphysic best suited to an evolutionary world-view. Psychicalism does not face the problem of the emergence of mind from what is wholly lacking in psychical qualities. Hartshorne calls the view that mind emerged in this way “temporal dualism”; it repeats all of the problems that mind-body dualism faces in relating nonphysical mind to nonmental matter, only in an evolutionary context. For Hartshorne, on the contrary, new forms of mind emerge in the process of evolution, but not mind itself.

c. Indeterminism and Freedom

The philosophy of creative becoming is inherently anti-deterministic. This is not to say that Hartshorne denied relations of cause and effect or that he rejected the laws of nature discovered through scientific investigation. It is all-too-common for philosophers to argue that the falsity of determinism implies chaoticism, the doctrine that there exists, at most, an appearance of causal regularity in the world. By way of clarification, Hartshorne noted that determinism posits absolute modal regularity in the sense that, for every set of causal conditions, it is not only the case that, then and there, there is only one effect that will occur (which may well be a truism), but there is only one effect, then and there, that can occur (note that “can” is a modal concept). As William James argued in “The Dilemma of Determinism,” if some sets of causal conditions allow for more than one possible effect, then determinism is false. Therefore, the logical contradictory of absolute regularity is non-absolute regularity, not absolute irregularity (chaoticism). Absolute irregularity is the logical contrary, not the contradictory, of determinism. For this reason, Hartshorne argues in Wisdom as Moderation that determinism and chaoticism are the extreme metaphysical positions, both of which may be false. If both are false, then some form of indeterminism must be true.
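The distinction between contradictory and contrary can be set out schematically. The following is only an illustrative sketch in notation that is not Hartshorne’s own: it assumes a set E of conceivable effects with more than one member, and it writes P(C) for the non-empty set of effects that can occur given a set of causal conditions C.

```latex
% Assumed notation (not Hartshorne's): E is the set of conceivable effects, |E| > 1;
% P(C) \subseteq E is the non-empty set of effects that can occur given conditions C.
\begin{align*}
\text{Determinism:} \quad & \forall C \;\; |P(C)| = 1 \\
\text{Indeterminism (the contradictory):} \quad & \exists C \;\; |P(C)| > 1 \\
\text{Chaoticism (a contrary):} \quad & \forall C \;\; P(C) = E
\end{align*}
```

On these assumptions, determinism and chaoticism cannot both be true, since E has more than one member, but both are false whenever some P(C) has more than one member and yet falls short of E. That middle ground is where a regular but non-absolute causal order would lie.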

Determinism has sometimes gone by the name of the doctrine of necessity, as in Peirce’s famous article “The Doctrine of Necessity Examined.” The meaning of “necessity” as it applies to determinism is that a specific effect could not have been otherwise given the causes that brought it about; in other words, causes necessitate their effects. Indeed, determinists seek to minimize the extent to which events seem contingent—that they could have been otherwise—by uncovering their causal antecedents. The deterministic theory is that all contingency in the world, which is to say, all of the variety and novelty or all deviations from absolute regularity, is apparent only. Alternate effects seem possible, but determinists claim that this is only because of our ignorance of all of the factors—the causes and the laws that link cause to effect—that explain a particular effect. Nevertheless, hidden within the seeming contingency of our ignorance is another necessity: the causal nexus of events absolutely fixes the details of our knowledge in any given situation. Of course, whether determinism or indeterminism is correct, some degree of ignorance and fallibility is an inescapable aspect of the human condition. The indeterminism espoused by Hartshorne also admits of unknown causes that limit what is possible. For example, an athlete may eat breakfast with plans of competing later in the day, not realizing that the food she is eating is contaminated and will incapacitate her. In Hartshorne’s theory, however, contingency is not merely a function of ignorance; on the contrary, sometimes there are real alternatives, no one of which the concatenation of causal conditions entirely eliminates. The incapacitated athlete, for example, may nevertheless have a variety of real alternatives for how to respond to the food poisoning.

Peirce argued, and Hartshorne agreed with him, that one cannot help but posit real alternatives: either reality as a whole could have been otherwise or contingency enters the world piecemeal or incrementally. Determinists may attempt to eliminate contingency within the universe by tracing events to their causal antecedents—to a singularity at the beginning of the universe or to an eternal decree from deity—but there remains the question of why the universe has the exact initial conditions that it has. There is no plausible modal theory that would allow one to consider the contingency of the initial conditions as a hidden form of necessity. Thus, contingency is unavoidable, or as Hartshorne says in Creative Synthesis (Ch. II), “There can be no alternative to alternativeness itself.” Hartshorne, following Peirce and James, locates the contingency of the universe not in an absolute beginning or in the divine will but within the universe’s own creative processes—in Hartshorne’s words, contingency “seeping into the world bit by bit” (Creativity in American Philosophy, Ch. X). James spoke of “pluralism’s additive world” and this is Hartshorne’s view: the coming-to-be of each dynamic singular introduces a morsel of novelty into existence and, in so doing, adds itself to the universe. Every subsequent dynamic singular must take account of this prior addition to the universe as a causal factor in its own emergence. In this way, there is a rhythm of the universe as each new subject of experience inevitably becomes a new object for a new experience.

It should now be clear that Hartshorne intended his version of indeterminism to leave ample room for the massive regularities—the order—of the world that scientists make it their business to discover, but these regularities are not absolute as determinists conceive them to be. Hartshorne turned on its head the traditional doctrine that effects are contained in their causes; for Hartshorne, it is the other way around: at the most basic metaphysical level of analysis, causes are contained in their effects. Again, Hartshorne finds a clue in the experience of memory. One’s memory-of-X includes X as an indispensable causal component, but X as partial cause of one’s memory-of-X does not contain the memory itself. Hartshorne goes further and denies that memory-of-X is contained, implicitly or virtually, in the entire set of causal conditions leading up to the memory. In short, the causal antecedents of the memory provide the necessary but not the sufficient conditions of the memory. The present memory-experience is an instance of creative experiencing; as such, it adds a novel element to reality. Nevertheless, the causal conditions are limiting factors in what experience may result from them; the causes define a field of possible experience activity. Not just any effect can result from a particular set of causal conditions and this principle is enough to block the inference from indeterminism to chaoticism. This principle also provides the metaphysical ground of developmental processes. For example, every adult human has a developmental history beginning with a fertilized egg, but no single-celled zygote and its genetic make-up is sufficient to make an adult. The countless intermediate steps of growth and education, as well as the person’s own reactions to his or her circumstances, are required to complete the process.

Since at least David Hume, philosophers have acknowledged that empirical science cannot establish the truth of determinism. There remained, however, the idea that scientific explanations presuppose or require a deterministic framework. In Hartshorne’s reckoning, Peirce disposed of this claim once and for all. First, Peirce observed that measurements can be no more fine-grained than our instruments and our proneness to error will allow. There can be no empirical or scientific meaning to the concept of an absolute measurement. Second, the far-reaching regularities in nature that a reasonable indeterminism posits are enough for the purposes of scientific theorizing; saying that the regularities are absolute, as determinism does, adds nothing. The much diminished levels of novel experiencing that Hartshorne’s metaphysics locates in the world of inorganic beings make that realm as deterministic in appearance as it needs to be for the purposes of discovering laws of nature. To be sure, those laws must be understood as stochastic, but this fits well enough with scientific judgments which are couched in terms of probabilities rather than certainties. It is worth noting that Hartshorne did not look to subatomic physics for his main support for indeterminism, for he believed that the case against determinism had already been made by Peirce and others. As far as Hartshorne was concerned, quantum indeterminacies buttress the case against determinism by showing that physics, the supposedly most materialistic of sciences, does not require determinism. Even Einstein, who rejected indeterministic interpretations of quantum phenomena, did not deny that those interpretations were scientific.

Numerous philosophers use moral freedom as an argument—perhaps the central argument—against determinism. Hartshorne agreed that moral freedom is indispensable to a proper understanding of human life but he was more interested in defending a more generalized idea of freedom that extends beyond moral decision-making and even into the nonhuman realm. Freedom in this more generalized sense, as a creative act, complements and completes Hartshorne’s indeterminism. In The Logic of Perfection (Ch. VIII), he speaks of causality as crystallized freedom and freedom as causality in the making. As we have just seen, for Hartshorne, every effect is more than, and even includes, the causal conditions that make it possible. If one analyzes the effect, abstracting from its causes, one is left with the particular way in which a dynamic singular experiences its causal antecedents, which is the measure of novelty in it. The word “experience” may call to mind a merely private epiphenomenon, but Hartshorne insists that experience has an ineliminable public aspect as it becomes a datum for subsequent experiences—a cause for future effects. In Creative Experiencing: A Philosophy of Freedom, he stresses that this idea of freedom is essentially social. Every creative act is a combination of self-determination and determination by others. The creative act, once completed in a dynamic singular, becomes part-cause of subsequent dynamic singulars. In this way, cause and effect relations are explained by the more basic principle of freedom limiting freedom.

For all of Hartshorne’s animadversions on determinism and his advocacy of a philosophy of creativity, he was under no illusions about the limits, sometimes extreme limits, on freedom in any particular situation. In The Logic of Perfection (Ch. VIII), he speaks of a present creation as adding “only its little mite” to the vast totality of the universe. Hartshorne says that a phrase like “creative experiencing” escapes redundancy because there are degrees of creativity. As Hartshorne’s indeterminism provides the metaphysical ground for developmental ideas, so the concept of freedom limiting freedom provides the ground for a meaningful concept of degrees of freedom. Freedom increases to the extent that there are more options of more complexity, allowing for greater contrasts of feeling. The development of more and more complex organisms during the course of evolution makes for new levels of organizational structure (for example, in the convolutions of brains), more varieties of experiencing, and a widening range of possibilities for creative realization. The most dramatic example of augmented freedom occurs when organisms cross the threshold from experience to conscious experience. This occurred in the evolution of the human species but it is also the natural development within each member of that species. Hartshorne remarks on how the complexities of a symphony can be appreciated by a human being but they are hopelessly beyond the understanding of creatures with simpler brains. Consciousness also makes possible moral freedom which brings with it increased opportunities for achievement and for risk of failure in the attainment of high ideals. The opportunities and the risks go hand-in-hand in such a way that one cannot be had without the other.

d. Personal Identity

The attribution of responsibility for acts worthy of praise or censure involves the concept of a person, or more fundamentally, of agency. With the problematic exception of a supernatural deity that exists outside of time, persons do not simply exist, they persist; their existence requires days, months, and years. Dynamic singulars, as momentary flashes of experience, are not persons, but in Hartshorne’s view, they are the raw materials from which persons are made. One can say that a person is a whole of which dynamic singulars are the parts. Hartshorne adopts the more refined categories of Whitehead’s philosophy in order to express, in neoclassical terms, the concept of personhood. Whitehead spoke of a nexus as any “particular fact of togetherness among actual entities.” A society is a type of nexus whose constituents prehend (feel) a common element of form. Every mammal, for example, is a society of dynamic singulars, each of which inherits from its predecessors and passes along to its successors the form of “mammal.” A society is more than a mere mathematical set, for the common form of the society is passed along—shared by prehensive relations—from one grouping of dynamic singulars to another.

In the philosophies of Whitehead and Hartshorne, the existence of a person requires that there be a special type of society, one that exhibits personal order. A personally ordered society is a sequence of dynamic singulars, no two of which are contemporaries. This is the neoclassical metaphysical account of our sense of being persons that persist through time. Both Whitehead and Hartshorne emphasize, however, that personally ordered societies are embodied. A personally ordered society is a sub-society within the larger society that is the human body. Leibniz spoke of a dominant entelechy or soul associated with each animal body, itself a collection of monads; a personally ordered society is a very rough equivalent of this (taking into account all of the caveats mentioned in the discussion of psychicalism). Each dynamic singular making up a personally ordered society inherits not only from its predecessor in the sequence but also from the dynamic singulars that make up the rest of the body. The body, one might say, is the immediate environment of the soul, or more colloquially, the self. Whitehead and Hartshorne believed that a personally ordered society does not survive without the body. Although neither philosopher definitively dismissed the possibility of a limited post-mortem existence, they did not show the slightest interest in speculating on the details of such a possibility.

Hartshorne argued that his and Whitehead’s view of personhood avoids two extremes. A person is not, as Hume seemed to think, a mere bundle of qualities existing from moment to moment, with no internal relations among its component parts. Every dynamic singular within a personally ordered society is a creative appropriation of its predecessors in the sequence and in the wider environment of its body. As noted in the previous section, Hartshorne denied determinism without denying the efficacy of causal regularities. Certain kinds of damage to the brain, for instance, are real causal factors in seriously altering or even eliminating personality. The other extreme that Hartshorne claims to avoid is the denial of external relations among the components of the self. According to Leibniz, the identity of a monad, including a dominant entelechy, is in its “concept,” which is all of the properties that ever were or will be true of it. Hartshorne maintains that a person is a product of developmental processes that are inherently open-ended, allowing for different outcomes. For this reason, Hartshorne accepted the Jamesian view that one’s character as so far formed is no absolute guarantee of one’s future behavior. It is true, as is said, that people “act in character,” but one is also part-creator of one’s character. We meet here once again, but now as applied to the problem of self-identity, the protean nature of creativity in neoclassical metaphysics. As each dynamic singular in one’s personally ordered society emerges, one is a partly new self.

On Hartshorne’s principles, personal identity is not a matter of strict or mathematical identity. The additive nature of creativity entails that identity through time, or gen-identity, is relative only—a question of “more or less” rather than “all or none.” The unity of self-identity in a person is wholly a function of the inertia that past dynamic singulars carry into the present of a personally ordered series. Hartshorne sometimes spoke of this relation as being among past and present “selves.” James said that the present thought itself is the thinker. Hartshorne would agree, for it is not the past “selves” in a personal sequence that do the thinking; the present is where thinking occurs and where particular decisions are made. For most of us, most of the time, the broad outlines of our personality remain stable, allowing us to speak of ourselves as being “the same person.” Yet, dramatic changes are possible, for the better and for the worse. The annals of both brain science and of religious conversion are full of case histories of persons who undergo changes that are sufficiently global to speak of a new person being born. It is also worth noting that Hartshorne’s metaphysics allows for the possibility that a single body could support more than one personally ordered society; this might provide the outlines of an account of multiple personality or even of aspects of the unconscious mind.

Hartshorne’s theory of personal identity is not reductionist. It is, like his indeterminism and philosophy of freedom, inherently developmental. Consider the beginnings of a human life. In most cases, conception results in a full complement of chromosomes necessary for a human person to develop. Much more must be accomplished, even within the mother’s reproductive system, to complete the process. The single-celled zygote from which we grow is genetically human, but it is arguably not the individual we associate with being a person. For example, far from being one individual rather than another, the fertilized egg has the potential to divide to produce twins or triplets. Hartshorne noted that his twin brothers, James and Henry, were very different persons despite having the same genetic make-up. Another argument against reducing personhood to genetic structure is that the nervous system and a functioning brain, which provide the physiological basis of human personhood, are not present from the moment of conception; they are the result of development both in utero and after birth. These observations do not determine the moral or legal status of the unborn, but they are relevant to those questions, for they argue against reducing personhood to genetics. To be sure, the question of abortion is complicated. When does the unborn become a person with rights and how do these rights, assuming they exist, stand vis-à-vis a woman’s manifest right to self-determination? Hartshorne was firmly on the side of allowing women to decide for themselves, apart from interference from government or religion, whether to terminate an unwanted pregnancy. His position on abortion was basically that of Roe v. Wade. What Hartshorne’s metaphysic of personal identity brings to the debate is a robust rejection of reducing personhood to genetics and a corresponding emphasis on developmental categories. In The Second Sex, Simone de Beauvoir wrote that one is not born a woman, but becomes one. 
Hartshorne would agree and generalize the thought: one is not born a person, but becomes a person.

Hartshorne drew interesting ethical implications from his metaphysics of personal identity. Most notably, he argued that a metaphysics which includes such Whiteheadian notions as prehension, personally-ordered societies of actual occasions, and transmutation of conformal feeling, could never countenance what Hartshorne calls the “illusions of egoism.” Even more plausible versions of “enlightened” ethical egoism, which allow interest in others for the sake of welfare of self, are incoherent in Hartshorne’s reckoning. Enlightened self-interest theories are based on a partially true but misleading “common sense” conception of self-identity that fails to grasp the logical distinction between being an individual and being the concrete states of an individual. The former is an abstraction from the latter. No momentary state is strictly identical with any other but there can be enough continuity to speak of an abstract, relatively unchanging, character. As Hartshorne says in The Zero Fallacy (Ch. XII), “The identity is somewhat abstract, the non-identity is concrete. Without this distinction the language of self-identity is a conceptual trap.” When this distinction is grasped, we see that the claim to have an interest in self cannot be simpliciter or absolute, since there must always be an “other,” namely, the future concrete states of the individual self, to which the interests of the self in a concrete state now must be addressed. Moreover, the fact that (psychologically normal) individuals “enjoy the enjoyment of others” is grounded in the metaphysical structure of social selves, whose dominant occasions of experience are built up and transmuted by conformal feeling of the feeling-tone in constituent neural occasions. We are, quite literally in Hartshorne’s account, “members one of another.” That is to say, a “self” is precisely a creative synthesis of feelings of others through its “perceptual mode of causal efficacy” in Whitehead’s language. 
The capacity for feeling the feelings of others, in a word “sympathy,” is basic, and thus the capacity for altruism as well as selfishness is built into the nature of being a social organism.

e. Time and Possibility

Hartshorne’s philosophy of creative experiencing is inseparable from his philosophy of time. As already explained, he posits a universe that is forever in the making by the dynamic singulars that come to be. What has already been made is the past, what has yet to be made is the future, and the present is the locus of activity where future possibility becomes past actuality. This characterization of time is in one sense circular, for the definens presupposes the definiendum; for example, “yet to be” presupposes “future.” What keeps the definition from being vacuous is Hartshorne’s concept of creativity or making. Classical ideas about creation in Christian tradition, for example, place God outside of time as its creator. According to this theory, God brings the temporal world—past, present, future—into existence but the divine act itself is not in time. From God’s eternity, what is future for us is as fully detailed as any moment that has for us become past. Hartshorne, on the other hand, finds a fundamental asymmetry in temporal relations. There is no such thing, even from a divine perspective, as a future that is as fully detailed as the past. The future, as “yet to be made,” lacks details that will not exist until the making of them. The “making of them,” as already noted, adds something to the universe that was not previously part of it. The universe, and time itself, is nothing more than this process of accumulated and accumulating acts of becoming and all that they contain.

Some commentators are tempted to see in Hartshorne’s theory of time a variation on J. M. E. McTaggart’s concept of an A-series. However, in his article on “Time” for Vergilius Ferm’s 1955 Encyclopedia of Religion, Hartshorne distinguished his own ideas from those of McTaggart. McTaggart distinguished two ways of marking time: the A-series of relations of past, present, and future and the B-series of relations of before and after. McTaggart said that if one abstracts from the A-series and B-series relations, one is left with an ordered array of events, called a C-series, without temporal order of any kind. If a C-series is like a film strip, with each frame representing an event, the A-series is analogous to frames passing in front of the light of the projector; as the light shines through a particular frame the photo on that frame is a present event, beforehand it is a future event and afterwards it is a past event. By contrast, Hartshorne’s cumulative theory of becoming entails that there is no such thing as a C-series from which A-series relations could be abstracted. To continue the analogy, there is no film running on a projector with frames yet to be viewed. In short, there are no future events. At best, and in keeping with Hartshorne’s indeterminism, there is a field of possibility that is only as detailed as the past determines it must be, all else in the field remaining essentially vague, awaiting full determination as novel dynamic singulars arise in the creative advance. By parity of reasoning, B-series relations are not fixed in eternity but are themselves results of temporal becoming. For example, the fact denoted by “Socrates died before Aristotle’s birth” could not exist until Aristotle was born. This is no mere limitation of human knowledge. 
After Socrates’ death and before Aristotle’s birth, there was no such relation as Socrates-having-died-before-Aristotle-was-born; what existed at the time of Socrates’ death was a range of recently emergent possibilities of someone or other being born after Socrates, for example: a-great-philosopher-born-fifteen-years-after-the-death-of-Socrates. As Hartshorne says in the Encyclopedia article, “Time is not a mere relation of becomings but a becoming of relations.”

Hartshorne grounded modal concepts in the temporal structure of the world. He often quoted, with approval, Peirce’s dictum that time is a particular variety of objective modality: the past is fully determinate or actual, the future is relatively indeterminate or possible, and the present is the becoming of the actual as the relatively indeterminate becomes determinate. Following the lead of both Peirce and James, Hartshorne argued that determinism denies the reality of time. As noted previously, the only objective modality where determinism is concerned is necessity. Hartshorne’s indeterminism, on the other hand, posits necessity in the direction from effect to cause; in the direction from cause to effect, however, there is an element of contingency, and this is the objective modality of the future. Determinists emphasize our ignorance of causes and the consequent inability to clearly perceive the necessary relations among all events. For the determinist, however, the ignorance includes the systematic illusion of time’s direction. From a practical point of view we cannot help but treat the illusion as reality. Aristotle remarked in Nicomachean Ethics (Bk. VI) that no one decides to have sacked Troy; however, the war (assuming its historicity) was once a matter of urgent decision in which the future was not something to be known but something to be made. For this reason, Hartshorne maintained that we act as though the future is relatively indeterminate even if we convince ourselves otherwise.

Hartshorne argued that the human capacity to form general conceptions and to frame principles that guide actions is another illustration that it is necessarily the case that we act as though determinism were false and time is real. The asymmetry between remembering a past event and planning for a future event is a powerful indication of the asymmetry of time. One may remember or misremember any amount of detail about a plan that has been carried out, but when the plan has yet to be executed, the only details that can be known are ones within the plan itself. As a script for future activity, the plan is abstract compared to its eventual realization. One may remember having taken one’s dog for a walk, including the memory of having intended to take a particular route; however, the memory of the originally intended route cannot include everything that happened on the walk: on this walk, a toddler peered at you from beside a car, a fallen branch blocked your path, you stepped on two ants, a street lamp burned out, a raccoon scurried into a sewage drain—these and countless other details were not included in the plan. Of course, what one anticipates by way of plans, intentions, or purposes, can be more or less specific. Regardless of the amount of detail, however, one’s future projects leave innumerable particulars undecided. When things go “as planned” it is not because every aspect of the plan matched some detail fixed in advance, for there are many ways that plans can be successfully fulfilled. Musicians know that every musical score leaves a great deal to be decided; different performances can be equally faithful to what the composer wrote.

Hartshorne realized that if his theory of modality as essentially temporal is correct then there can be no such thing as merely possible worlds that are not anchored in the actual world. At most, there are possible world-states; that is to say, there are ways the actual world might have been. For any given past event, there was a time when something else might have occurred in its place. We can ask “What if?” about the past in order to conceive of ways the world might have been different, even though nothing can now be done to change what occurred. The future, on the other hand, is the arena of what might yet occur given the actual history of the world up to the present. Hartshorne’s view contrasts neatly with Leibniz’s idea that possible worlds are completely detailed descriptions of universes that God might choose to create. Possible worlds, in the Leibnizian sense, contain possible persons. As Leibniz argues in his correspondence with Arnauld, when the “concept”—the complete description of a possible person—is made actual by God, the person exists; the making actual of a different “concept” (that is, altering the description in some way) would result in a different person. Hartshorne objects that persons cannot be merely possible. Contrary to Leibniz, an actual person could have had properties other than what it has and the properties that it has could have been had by others. For example, Hillary Clinton could have been elected the U.S. president in 2008 and someone besides Hillary could have been Bill Clinton’s first lady. A fictional character, on the other hand, has no reality beyond the description of it; it has enough specificity to simulate a real person, but no feat of magic could transform it into a real person. Hartshorne’s arguments clearly anticipate and dovetail with those of Saul Kripke in Naming and Necessity. 
Kripke maintains that a proper name designates the same object across possible worlds (for example, Hillary Clinton) whereas a description designates different objects from world to world (for example, “winner of the 2008 U.S. presidential election”). Kripke also suggested that “counter-factual situation” or “possible state (or history) of the world” are less misleading expressions than “possible world.” To speak of a “counter-factual” is to presuppose the factual. On these points, Hartshorne and Kripke are in full agreement.
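The contrast between a name and a description can be pictured with a small sketch. Everything in it is a hypothetical modeling choice rather than anything found in Hartshorne or Kripke: the class `Person`, the two dictionaries standing for the actual history and one counter-factual history of the same world, and the helper functions `by_name` and `by_description`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """A stand-in for an actual individual (illustrative only)."""
    name: str


clinton = Person("Hillary Clinton")
obama = Person("Barack Obama")

# Two states of one and the same world: its actual history and one
# counter-factual history. Neither is a free-floating "possible world".
actual = {"winner_2008": obama}
counterfactual = {"winner_2008": clinton}


def by_name(state: dict) -> Person:
    """A proper name picks out the same individual whatever the state."""
    return clinton


def by_description(state: dict) -> Person:
    """A description picks out whoever satisfies it in the given state."""
    return state["winner_2008"]


print(by_name(actual) is by_name(counterfactual))
print(by_description(actual) == by_description(counterfactual))
```

The name returns the same individual in both states, whereas the description returns different individuals. Note that the counter-factual state is defined only by altering a feature of the actual one, which is the sense in which the counter-factual presupposes the factual.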

On the question of the nature of possibility, Hartshorne sided closely with Peirce but parted ways with Whitehead. Peirce conceived the realm of possibilities as a continuum which, by definition, has no least member, but is infinitely divisible. There are no actual parts of a continuum, only an infinite number of ways to slice it. This idea is evident in Hartshorne’s concept of the affective continuum (see the companion article “Charles Hartshorne: Biography/Philosophy: Philosophy and Psychology of Sensation”). Whitehead, on the other hand, spoke of “eternal objects” as “forms of definiteness” that identify what a thing is. The point of calling eternal objects “eternal” is that none of them are novel; the point of calling them “objects” is that they are definite; for example, a particular shade of green is this shade and no other. To use Whitehead’s example, a leaf on a tree changes colors but any particular shade of color exhibited by the leaf does not change. Hartshorne maintains, by contrast, that the shades of color in question are neither eternal nor are they definite objects; put somewhat differently, they are definite only insofar as they are not eternal. The successive shades of color of the leaf are slices of the color continuum that exist as definite only when instantiated in the leaf. The color of the leaf at a particular moment is novel. In Hartshorne’s account, we speak of sameness of color because the gradation between any two shades may be so infinitesimally slight as to be imperceptible. He noted that observed sameness of color is not a transitive relation. An object X may appear to be the same color as Y and Y the same color as Z, but X may appear slightly different than Z. In other words, there is a threshold defined by a degree of separation on the color continuum below which real differences are not observable for creatures like us.
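The failure of transitivity can be shown with a toy model. This is a minimal sketch under stated assumptions, not a claim about actual color perception: shades are represented as integer positions on a one-dimensional continuum, and `THRESHOLD` is a hypothetical just-noticeable difference below which two shades look the same.

```python
# Hypothetical just-noticeable difference on an assumed one-dimensional
# color continuum; the number is illustrative, not empirical.
THRESHOLD = 10


def looks_same(shade_a: int, shade_b: int) -> bool:
    """Two shades look the same when they differ by less than THRESHOLD."""
    return abs(shade_a - shade_b) < THRESHOLD


# Three really different shades, each 6 units from its neighbor.
x, y, z = 100, 106, 112

print(looks_same(x, y))  # X and Y differ by 6, below the threshold
print(looks_same(y, z))  # Y and Z differ by 6, below the threshold
print(looks_same(x, z))  # X and Z differ by 12, above the threshold
```

In this model X looks the same as Y and Y looks the same as Z, yet X does not look the same as Z. Real differences below the threshold accumulate until they become observable.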

According to Hartshorne, any quality that admits of a negative instance is not eternal. There are, in short, emergent universals. In Creative Synthesis (Ch. IV), Hartshorne notes that “lover of Shakespeare” is a universal in the sense that it may be true of more than one thing but it is emergent in the sense that it could be true of nothing prior to Shakespeare. By parity of reasoning, specific qualities in the affective continuum (particular tonal qualities, particular shades of color, and so forth) emerge as the affective continuum is cut in various ways and patterns by dynamic singulars. On the other hand, the generic quality of “feeling” may be classified, on Hartshornean principles, as eternal, if not quite an “object” in the Whiteheadian sense. As previously noted, qualities that admit only of positive instances are metaphysical. A consequence of Hartshorne’s view is that similarity is not simply a function of partial identity. It is true that we count two things similar when they have a sufficient number of qualities in common. But it is also the case that qualities themselves are similar to each other, as when we observe that orange is closer to red than it is to blue. Hartshorne concludes that similarity is as metaphysically ultimate as identity.

f. The Aesthetic Motif

One of the best and earliest interpreters of Hartshorne’s philosophy, Eugene Peters, spoke of “the aesthetic motif” that runs through neoclassical theism. Peters was drawing attention to the fact that, for Hartshorne, the most inclusive values are aesthetic. Hartshorne began his career proposing, as an empirical hypothesis, that all sensations are feelings and that all feelings exist along an aesthetic continuum (see “Charles Hartshorne: Philosophy and Psychology of Sensation”). Hartshorne’s metaphysics completes and complements the empirical hypothesis by considering the value-achievement and value-enrichment of dynamic singulars as the very foundations of existence. In Divine Beauty, Daniel Dombrowski rightly says that, for Hartshorne, aesthetic experiences are not merely woven into the real, they are the real. The poet e. e. cummings wrote, “Since feeling is first / who pays any attention / to the syntax of things . . .” Hartshorne did precisely what cummings dismissed (at least in the poem): he recognized feeling as first (that is, as a metaphysical category) but he also paid close attention to the syntax of things (to understand the structure of feeling).

In his first book Hartshorne rejected the “annex view of value.” In the context of neoclassical metaphysics this means that there is no merely valueless stuff (what Whitehead called “vacuous actuality”) onto which values are projected by human or divine purposes. Our pre-reflective experiences of our bodies, our memories, and of the world are never, Hartshorne insists, of bare valueless existence. The mother hears her baby’s cries as irritating and the mother’s songs are heard as soothing by the child. The values in experience, however, are not primarily ethical but aesthetic, a fact most clearly illustrated in the animal kingdom. The experiences of subhuman creatures are productive primarily—and for most creatures, exclusively—of non-moral values. When a lion fells an antelope, it is good for the lion pride and bad for the antelope, but moral judgments are out of place. One may, it is true, stress what is adaptive in behavior and useful for the survival of the species. There remain, however, the values of living for the lions and for the antelope that derive from being aware of the world around them, of breathing, eating, and the interactions with their fellows. These creatures do not think about their worlds but they feel them. For them, aesthesis or feeling (the root of “aesthetics”) is indeed “first.” Hartshorne’s extensive study of song birds in his book Born to Sing supports this hypothesis; oscines have what in us would be called an aesthetic sense.

Hartshorne did not consider beauty to be the only aesthetic value, but “beauty” was his word of choice for what anchors his aesthetic theory. One could generalize or gloss “beauty” to mean intense satisfactory experience without distorting Hartshorne’s meaning. Much of traditional aesthetics holds that beauty is unity within diversity. Hartshorne argued, however, that another contrast is necessary to make sense of beauty, that of complexity and simplicity. This concept of beauty, along with the relation of beauty to other aesthetic values, is expressed in the Dessoir-Davis-Hartshorne Circle. (Max Dessoir and Kay Davis helped Hartshorne with the diagram.) If Hartshorne is correct, then beauty is a mean between two extremes, between order and disorder on the one hand (the vertical axis of the circles) and between complexity and simplicity on the other (the horizontal axis of the circles). Outside of the boundary of the outer circle is not merely aesthetic failure but also the failure of experience and therefore (because of Hartshorne’s psychicalism) of existence itself.

For Hartshorne, beauty (or any aesthetic quality) is not merely in “the eye of the beholder” (or the perception of the perceiver). One must take into account not only the perceiving mind but what the mind perceives. A mind of sufficient complexity, cultivation, and education is required to appreciate the elements that make for beauty in something. For example, until one knows what counterpoint is and until one is taught to listen for it, one is not in a position to be fully aware of it and one may not even be able to hear it. An adequate grasp of such things is beyond the ability of creatures with simpler nervous systems or of humans with certain kinds of brain damage. There is, in short, an intellectual component of beauty that requires a higher intellect to appreciate. This intellectual side of beauty predominates in science and mathematics, but Hartshorne argues that the twin contrasts of order / disorder and complexity / simplicity remain. In one of his articles, titled “Science as the Search for the Hidden Beauty of the World” (1982), he chronicles the ways in which ideals of beauty guide pure scientific inquiry and how the deliverances of science themselves are beautiful. Science seeks a proper balance between imagination (for example, theorizing) and observation. Hartshorne speaks of the “romance of science” as “the disclosure of a universe whose wild harmonies surpass the most vivid dreams of imagination not submitting itself to criticism and observational test.” He reminds us that Darwin closed the Origin with “a prose poem on the beauty of the web of life.”

Prediction is one of the goals of scientific inquiry, but even here, there is an aesthetic component. Too little predictability is chaotic but too much predictability is monotonous. Good science is also heuristic, meaning that it is fruitful, leading to more discoveries. But discoveries in the strict sense are not predictable and are often quite surprising. Hartshorne accuses the determinism of traditional Newtonian science of aesthetic failure for it posited absolute regularity as the ideal to the exclusion of spontaneity, chance, and freedom: the adventure of life reduced to mechanistic obedience to law. Hartshorne’s indeterminism, as we have seen, respects the rule of laws of nature but provides a balance between regularity and irregularity. Traditional theology, Hartshorne claims, was as defective from an aesthetic point of view as the traditional philosophy of nature. Classical theologians stressed divine simplicity and unity to the exclusion of complexity and variety. In an article titled “The Aesthetic Dimensions of Religious Experience” (1992) Hartshorne says, “The beauty of the world is in its partly unprogrammed spontaneities.” Hartshorne’s neoclassical theism affirms a world of multiple creative agents in interaction with each other and with God (see “Charles Hartshorne: Dipolar Theism”). In Hartshorne’s view, God is affected by the creatures and, consequently, the divine experience is a complex reality, full of all of the serendipity and tragedy that interactions with others routinely bring. If Hartshorne is correct, there is an ever changing beauty of the world as a whole that is fully appreciated only by deity and to contemplate this divine experience is to have something akin to what classical theologians called the beatific vision (see the discussion of Hartshorne’s aesthetic argument in “Charles Hartshorne: Theistic and Anti-theistic Arguments”).

4. Conclusion: Hartshorne’s Legacy

In his introduction to The Logic of Perfection, Hartshorne says that at an early age, after reading Emerson, he resolved “to trust reason to the end.” He left ample evidence that he was true to this purpose. He was, however, sensitive to the many ways in which philosophy is a frail and fallible enterprise. Communication must take place across centuries and across cultural and linguistic boundaries. There is the snobbery and inertia of traditions and what Hartshorne called “cultural lag” in the recognition of genuine insights (“Analysis and Cultural Lag in Philosophy,” 1973). There is the tendency to forget, ignore, or marginalize objections to one’s views; Hartshorne also considered it mistaken to suppose that meeting objections is sufficient for securing the rationality of one’s ideas, or, as he wrote in his correspondence with Edgar Sheffield Brightman, to merely defend one’s own “castle of ideas.” As Carnap said, it is one thing to ask what your metaphysical position commits you to, but it is something else again to ask what commits you to your metaphysical position. Despite their knowledge of formal logic, philosophers are also susceptible to the fallacy of affirming the consequent, looking only for confirmation of their views or for arguments favorable to them. There is, finally, the failure to exhaust the logically possible alternatives in considering the solutions for particular philosophical problems. Hartshorne discussed all of these obstacles, and more, to making progress in philosophy, and he took measures to remedy them in his own attempt to trust reason.

Hartshorne distinguished, with Edith Wharton, between those who light new candles and those who are mirrors reflecting the candles that are lit by others. At the close of his autobiography, he remarked that Whitehead and Peirce had done both, and he dared to hope that he had done both. Hartshorne’s own “candle” has perhaps often been missed because he expended a lot of energy reflecting the lights of Whitehead and Peirce. Hartshorne, however, was neither Whiteheadian nor Peircean. This is true not only of his range of interests and expertise (he contributed to the psychology of sensation and to the study of bird song); it is also true of his systematic presentation, development, and defense of the project of metaphysics, as well as of his own distinctive metaphysical system. He lacked neither ideas nor arguments to support those ideas. His neoclassical metaphysics is arguably one of the great intellectual achievements of the twentieth century.

5. References and Further Reading

a. Primary Sources: Books (In Order of Appearance)

Hartshorne, Charles. 1937. Beyond Humanism: Essays in the Philosophy of Nature. Chicago: Willett, Clark & Company. Republished in 1975 by Peter Smith.
Hartshorne, Charles. 1953. Reality as Social Process: Studies in Metaphysics and Religion. Boston: Beacon Press.
Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
Hartshorne, Charles. 1972. Whitehead’s Philosophy: Selected Essays, 1935-1970. Lincoln, Nebraska: University of Nebraska Press.
Hartshorne, Charles. 1976. Aquinas to Whitehead: Seven Centuries of Metaphysics of Religion. Milwaukee, Wisconsin: Marquette University Publications.
Hartshorne, Charles. 1983. Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy. Albany: State University of New York Press.
Hartshorne, Charles. 1984. Creativity in American Philosophy. Albany: State University of New York Press.
Hartshorne, Charles. 1987. Wisdom as Moderation: A Philosophy of the Middle Way. Albany: State University of New York Press.
Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy, edited by Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom, edited by Donald W. Viney and Jincheol O. Albany: State University of New York Press.
Auxier, Randall E. and Mark Y. A. Davies, editors. 2001. Hartshorne and Brightman on God, Process, and Persons: The Correspondence, 1922-1945. Nashville: Vanderbilt University Press.
Viney, Donald W., guest editor. 2001. Process Studies, Special Focus on Charles Hartshorne, 30/2 (Fall-Winter).
Viney, Donald W., editor. 2001. Charles Hartshorne’s Letters to a Young Philosopher: 1979-1995. Logos-Sophia, the Journal of the Pittsburg State University Philosophical Society, volume 11. Pittsburg, Kansas.
Viney, Donald W., guest editor. 2011. Process Studies, Special Focus Section: Charles Hartshorne, 40/1 (Spring/Summer): 91–161.
Vetter, Herbert F., editor. 2007. Hartshorne, A New World View: Essays by Charles Hartshorne. Cambridge, Massachusetts: Harvard Square Library.

b. Primary Sources: Hartshorne’s Response to his Critics

“Interrogations of Charles Hartshorne,” conducted by William Alston. 1964. Philosophical Interrogations, edited by Sydney and Beatrice Rome. New York: Holt, Rinehart and Winston: 319–354.
Cobb, John B. Jr. and Franklin L Gamwell, editors. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
Hahn, Lewis Edwin, editor. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. La Salle, Illinois: Open Court.
Kane, Robert and Stephen H. Phillips, editors. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
Sia, Santiago, editor. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

c. Primary Sources: Selected Articles

Hartshorne, Charles. 1932. “Contingency and the New Era in Metaphysics, I.” Journal of Philosophy 29/16. 4 August: 421–431.
Hartshorne, Charles. 1932. “Contingency and the New Era in Metaphysics, II.” Journal of Philosophy 29/17. 18 August: 457–469.
Hartshorne, Charles. 1934. “The New Metaphysics and Current Problems, I.” New Frontier 1/1: 24–31; “The New Metaphysics and Current Problems, II.” New Frontier 1/5: 8–14.
Hartshorne, Charles. 1935. “Metaphysics for Positivists.” Philosophy of Science 2/3. July: 287–303.
Hartshorne, Charles. 1945. Entry for “time”, pp. 787–88 in An Encyclopedia of Religion, ed. Vergilius Ferm. New York: Philosophical Library.
Hartshorne, Charles. 1964. “Thinking About Thinking Machines,” Texas Quarterly 7/1. Spring: 131–140.
Hartshorne, Charles. 1970. “The Development of My Philosophy” in John E. Smith (ed.) Contemporary American Philosophy: Second Series, London: Allen & Unwin: 211–28.
Hartshorne, Charles. 1973. “Analysis and Cultural Lag in Philosophy.” Southern Journal of Philosophy 11/2-3: 105–112.
Hartshorne, Charles. 1977. “Bell’s Theorem and Stapp’s Revised View of Space-Time.” Process Studies 7/3 (Fall): 183–191.
Hartshorne, Charles. 1978. “A Philosophy of Death.” Philosophical Aspects of Thanatology, volume 2, edited by Florence M. Hetzler and A. H. Kutscher. New York: MSS Information Corporation: 81–89.
Hartshorne, Charles. 1980. “Mysticism and Rationalistic Metaphysics.” Understanding Mysticism, edited by Richard Woods. Garden City, New York: Image: 415–421.
Hartshorne, Charles. 1981. “Concerning Abortion: An Attempt at a Rational View.” The Christian Century 98/2. 21 January: 42–45.
Hartshorne, Charles. 1982. “Science as the Search for the Hidden Beauty of the World.” The Aesthetic Dimension of Science 1980 Nobel Conference, Number 16, ed. Deane W. Curtin. New York: Philosophical Library: 85–106.
Hartshorne, Charles. 1987. “Mind and Body: A Special Case of Mind and Mind.” A Process Theory of Medicine: Interdisciplinary Essays, edited by Marcus Ford. Lewiston, New York: Edwin Mellen Press: 77–88.
Hartshorne, Charles. 1987. “A Metaphysics of Universal Freedom.” Faith and Creativity, Essays in Honor of Eugene H. Peters, edited by George Nordgulen and George W. Shields. St. Louis, Missouri: CBP Press: 27–40.
Hartshorne, Charles. 1988. “Some Principles of Procedure in Metaphysics.” The Nature of Metaphysical Knowledge, edited by G. F. McLean and Hugo Meynell. Lanham, New York: University Press of America: 69–75.
Hartshorne, Charles. 1988. “Sankara, Nagarjuna, and Fa Tsang, with Some Western Analogues.” Interpreting Across Boundaries: New Essays in Comparative Philosophy, edited by G. J. Larson and Eliot Deutsch. Princeton University Press: 98–115.
Hartshorne, Charles. 1989. “Von Wright and Hume’s Axiom.” The Philosophy of Georg Henrik von Wright, edited by Paul Arthur Schilpp and Lewis Edwin Hahn. La Salle, Illinois: Open Court: 59–76.
Hartshorne, Charles. 1990. “Hegel, Logic, and Metaphysics,” CLIO 19/4: 345–352.
Hartshorne, Charles. 1991. “An Open Letter to Carl Sagan.” The Journal of Speculative Philosophy 5/4: 227–232.
Hartshorne, Charles. 1992. “The Aesthetic Dimensions of Religious Experience.” Logic, God and Metaphysics, ed. J. F. Harris. Dordrecht: Kluwer Academic Publishers: 9–18.
Hartshorne, Charles. 1993. “Can Philosophers Cooperate Intellectually: Metaphysics as Applied Mathematics.” The Midwest Quarterly 35/1. Autumn: 8–20.
Hartshorne, Charles. 1994. “Three Important Scientists on Mind, Matter, and the Metaphysics of Religion.” The Journal of Speculative Philosophy, 8/3: 211–227.

d. Secondary Sources

Chancey, Anita. 1999. “Rationality, Contributionism, and the Value of Love: Hartshorne on Abortion.” Process Studies 28/1-2. Spring-Summer: 85–97.
Dombrowski, Daniel A. 1988. Hartshorne and the Metaphysics of Animal Rights. Albany: State University of New York Press.
Dombrowski, Daniel A. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
Easterbrook, Gregg. 1998. “A Hundred Years of Thinking About God, A Philosopher Soon to be Rediscovered,” U.S. News & World Report. February 23: 61, 65.
Fitzgerald, Paul. 1972. “Relativity Physics and the God of Process Philosophy.” Process Studies 2/4. Winter: 251–276.
Ford, Lewis S. 1968. “Is Process Theism Compatible with Relativity Theory?” Journal of Religion 48/2. April: 124–135.
Ford, Lewis S., editor. 1973. Two Process Philosophers: Hartshorne’s Encounter with Whitehead. Tallahassee, Florida: American Academy of Religion.
Griffin, David Ray, John B. Cobb Jr., Marcus P. Ford, Pete A. Y. Gunter, and Peter Ochs. 1993. Founders of Constructive Postmodern Philosophy: Peirce, James, Bergson, Whitehead, and Hartshorne. Albany: State University of New York Press.
Jesse, Jennifer G. and J. Wesley Robbins, editors. 2001. American Journal of Theology & Philosophy, memorial issue in tribute to Charles Hartshorne, 22/2. May.
Minor, William S., editor. 1969. Charles Hartshorne and Henry Nelson Wieman. Lanham, MD: University Press of America.
Myers, William, guest editor. 1998. The Personalist Forum, Special Issue on Charles Hartshorne, 14/2. Fall.
Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
Peters, Eugene H. 1976. “Philosophic Insights of Charles Hartshorne,” Southwestern Journal of Philosophy, VII, 1/17: 157–170.
Ramal, Randy, editor. 2010. Metaphysics, Analysis, and the Grammar of God: Process and Analytic Voices in Dialogue. Tübingen, Germany: Mohr Siebeck.
Reck, Andrew J. 1961. “The Philosophy of Charles Hartshorne,” Tulane Studies in Philosophy X. May: 89–108.
Reese, William L. and Eugene Freeman, editors. 1964. Process and Divinity: Philosophical Essays Presented to Charles Hartshorne: The Hartshorne Festschrift. La Salle, Illinois: Open Court Publishing Company.
Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49/2. January.
Shields, George W. 2004. “Process and Universals” in After Whitehead: Rescher on Process Metaphysics, ed. by M. Weber. Frankfurt: Ontos Verlag.
Shields, George W. 2008. “‘Beyond Enlightened Self-Interest’ Revisited: Process Philosophy and the Biology of Altruism” in Researching with Whitehead: Essays in Honor of John B. Cobb, Jr., ed. by F. Riffert and Hans-Joachim Sander. Muenchen: Verlag Karl Alber.
Shields, George W. 2008. “MWI Quantum Theory: Some Logical and Philosophical Issues,” paper presented at the Center for Philosophy and Natural Sciences, California State University-Sacramento.
Shields, George W. 2009. “Quo Vadis?: On Current Prospects for Process Philosophy and Theology,” The American Journal of Theology & Philosophy, 30/2. May.
Shields, George W. 2010. “Eternal Objects, Middle Knowledge, and Hartshorne: A Response to Malone-France,” Process Studies, 39/1. Spring/Summer: 149–165.
Shields, George W. 2010. “Panexperientialism, Quantum Theory, and Neuroplasticity” in Process Approaches to Consciousness, eds. Michel Weber and A. Weekes. Albany: State University of New York Press.
Shields, George W., editor. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
Simoni-Wastila, Henry. 1999. “Is Divine Relativity Possible? Charles Hartshorne on God’s Sympathy with the World.” Process Studies 28/1-2. Spring-Summer: 98–116.
Sprigge, T. L. S. 2006. The God of Metaphysics. Oxford: Clarendon Press.
Suchocki, Marjorie Hewitt and John B. Cobb, Jr. editors. 1992. Process Studies, Special Issue on the Philosophy of Charles Hartshorne, 21/2. Summer.
Viney, Donald Wayne. 2008. “Charles Hartshorne (1897-2000),” Handbook of Whiteheadian Process Thought, Volume 2, edited by Michel Weber and Will Desmond. Frankfurt / Paris / Lancaster: Ontos Verlag: 589–596.
Viney, Donald Wayne and Rebecca Viney. 1993. “For the Beauty of the Earth: A Hartshornean Ecological Aesthetic.” Proceedings of the Institute for Liberal Studies: Science, Technology & Religious Ideas, volume 4. Frankfort: Kentucky State University: 38–44.
Whitehead, Alfred North. 1978 [1929]. Process and Reality: An Essay in Cosmology, corrected edition, edited by David Ray Griffin and Donald W. Sherburne. New York: Free Press.
Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40/4. October: 293–300.

e. Bibliography

“Primary Bibliography of Philosophical Works of Charles Hartshorne” (compiled by Dorothy Hartshorne; corrected, revised, and updated by Donald Wayne Viney and Randy Ramal) in Herbert F. Vetter, editor, Hartshorne: A New World View: Essays by Charles Hartshorne. Cambridge, Massachusetts: Harvard Square Library, 2007: 129–160. Also published in Santiago Sia, Religion, Reason and God. Frankfurt am Main: Peter Lang, 2004: 195–223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Political Constructivism

Political Constructivism is a method for producing and defending principles of justice and legitimacy. It is most closely associated with John Rawls’ technique of subjecting our deliberations about justice to certain hypothetical constraints. Rawls argued that if all of us reason in the light of these conditions we could arrive at the same judgment about justice. Moreover, our shared judgment about justice is justified precisely because it resulted from a suitably structured deliberative process. This is constructivism’s key idea; it holds that certain complex entities are constructed from more fundamental elements.

In moral and political constructivism, the complex entities are moral and political principles or obligations, such as the principle “to each according to his merits” or the obligations created through contracts. The debates surrounding constructivism tend to concern the nature of the more fundamental elements and the process by which they get assembled. Some constructivists are more subjective insofar as they cast these elements as attitudes and values of living agents or as the settled political values of a particular society. Others are more objective insofar as they identify these elements with universal precepts of practical reason working in combination with abstract conceptions of persons and society. In each case, the constructivist holds the view that these elements, no matter how they are specified, are brought together in a set of reasons favoring one principle over another. The process by which this happens is a process of construction, since the human mind actively assembles the considerations from which a principle is formulated; it does not passively receive its formulation. Absent this active mental process, there are no criteria for guiding political action or justifying our political institutions, nor is there a way to properly assess our genuine political obligations. In order to perform these evaluative tasks, we must construct the metric of assessment. Political constructivism is a philosophical account of how this constructing happens, and of how the process confers moral authority on the resulting principles.

Introduction
A Brief History
Political Constructivism: Two Formulations
Political Constructivism and Procedures
Political Constructivism and Social Problems
Conclusion
References and Further Reading

1. Introduction

The term “constructivism” is still relatively new to political and moral theory. It emerged sometime in the second half of the twentieth century to describe John Rawls’ general approach to normative political theory. Since first appearing, it has developed into a family of positions in normative ethics, political philosophy, and metaethics. The term “political constructivism” is newer still and sometimes used to describe the approach Rawls employed in Political Liberalism, which attempts to steer clear of any controversial metaphysical suppositions by drawing heavily on the ideals and values implicit in a democratic society. More generally, it is used to describe the application of constructivism to the political domain. On this general understanding, political constructivism not only covers all of Rawls’ political works, but any political work guided by the idea that an appropriate thought process confers authority onto the resulting political principles. Moreover, since human thought creates the political principles governing our society, human thought can analyze those same principles and either affirm or refute their justification. The fact that we can analyze our principles—and by extension the policies based on them—suggests that we can reason about politics, and the constructivist maintains that our reasons should go a long way toward reconciling political debate and generating agreement in judgment.

This general idea of political constructivism is not too different from other, more familiar views, such as the claim that appropriate prices are the result of open and competitive markets, or the idea that legitimate representatives of a democratic society are the winners of open and fair elections. In each of these cases, the entity in question can be explained in terms of a more fundamental process, for instance, the decisions of various people engaged in markets or electoral processes, together with an explanation of what it means for these processes to be ‘open,’ ‘competitive,’ or ‘fair.’ Political constructivism reflects a similar idea insofar as principles of political action result from a thought-process involving elements more fundamental than the principles themselves, such as attitudes, concepts, ideals, beliefs, values, and precepts. Together, these building blocks help establish a particular set of principles as justified, appropriate, objective, or valid. As a result, constructivism is the view that the best set of political principles is the outcome of an appropriate form of thinking. Importantly, there is no criterion beyond this form of thinking by which we can assess the appropriateness of the principles.

Although constructivism begins with a simple idea, its conception of thinking, or practical reason, is ambitious. We can see this by contrasting it with two competing conceptions of practical deliberations that are more familiar in everyday experience. The first frames practical deliberations within a means-ends relationship whereby practical reason identifies the means by which certain ends are realized. For example, every day we are led by reason to conclude that eating certain foods will satisfy our hunger. In this case, the end to be achieved is immediately given by our natural desire for food; our reason simply discovers the means for satisfying that desire. A second view of practical reason is concerned not merely with identifying the means to some immediately given end, but also with ensuring that the means or action conforms to some moral principle. For example, we may prefer to lie about a particular event, but because we are committed to a principle of honesty, we tell the truth. On this view, the capacity of practical reason extends beyond its instrumental role by including within it a power to check our impulses against moral principles. Notice, however, that in this second example, the rule on which our action is based is still given. There is no implicit or explicit claim that practical reason produces the principle. Instead, practical reason passively receives its command and acts within its limits.

Political constructivism is a different view altogether. The various political principles constraining political action are not merely given to us, but rather are the products of thought. They are not products in the sense of being created from nothing, but rather constructed from various resources appropriate to political argument. Apart from these constructions, there are no moral facts or true moral judgments, nor are there ways of assessing the moral worth of a political action. It is only when deliberations are properly constrained that the resulting outcome is a principle against which our actions can be assessed as morally right or wrong.

Notice that while our political actions are assessed against a normative principle, there is no criterion beyond the deliberative process by which the rightness of the principle is assessed; it is authoritative in virtue of being the outcome of a certain kind of deliberative process or a certain form of argument. Consequently, the challenge for constructivism is to explain the appropriateness of the process without appealing to any judgment that is supposed to derive from that process; for if the thought process relied on such a judgment to assess its appropriateness, it would assume the very thing it claims to construct. It has been argued, on logical grounds, that constructivism fails to meet this challenge (Cohen 2008). But others have attempted to meet it, and in turn have created a variety of interpretations. This makes the approach difficult to define and summarize. Naturally, a great deal of philosophical debate surrounds the appropriateness of the deliberative process, especially as it concerns the metaethical themes of justification and objectivity. At any rate, despite the extensive literature on the subject, there are two general formulations of political constructivism influenced by two historical accounts of practical reason. The first is a deontological account of practical reason that is primarily associated with Kantian ethics. The second is a teleological account of practical reason that is primarily associated with social contract theory. The Kantian and social contract traditions, although offering differing accounts of practical reason, have much in common, and it would not be an exaggeration to cast constructivism as a contemporary attempt to explain the Rousseauian idea of moral freedom as acting on a law one gives oneself, complemented by the Kantian idea that the law one gives oneself issues from one’s own reason. Political constructivism tries to make this idea clear by identifying a compelling form of normative political analysis with easily understood criteria for thinking about political issues. The hope is that once we are equipped with this form of analysis, we can reason in the light of these criteria and reach agreement in political judgment; and, if not agreement, we can at least narrow our differences sufficiently to secure a just, or fair, or honorable, or decent set of political relations (Rawls 1993, 120).

This article frames political constructivism as a general way of applying constructivism to the political domain. It discusses various interpretations in light of the two general formulations noted above—deontology and teleology. Although the various interpretations discussed do not always fit easily within this distinction, it is nevertheless a useful way of examining political constructivism because deontology and teleology straddle a historical fault line for how best to think about practical reason and the justification of political principles. According to the deontological approach, practical reason is modeled on a mathematical deduction; the aim is to create an argument that should be, so far as possible, a deductive one. By contrast, a teleological account of practical reason has an instrumental form; the aim is to explain how political principles function to realize some end. Examining political constructivism in the light of these formulations exposes the key logical difference between the various interpretations of political constructivism and sets the stage for assessing whether a particular interpretation is more favorable than others.

2. A Brief History

Although the historical influences of constructivism date back to the social contract and Kantian traditions, the contemporary usage of the term seems to have originated with Ronald Dworkin’s 1973 article, “The Original Position” (Dworkin 1973). In this article, Dworkin defends a constructive model of Rawls’s reflective equilibrium over a natural model. Reflective equilibrium refers to a strategy for justifying political principles often associated with political constructivism. The important point here is that a natural model views the relation between moral principles and our more intuitive judgments about ethics as analogous to the relation between scientific laws and empirical data. On a natural model, political theory aims at discovering and describing the normative laws that explain our moral intuitions, not unlike the way natural science aims at discovering and describing the laws of nature that explain our sensory intuitions about the world outside of us. By contrast, a constructive model presents political theory as analogous to legal theory. On this view, political theory aims at constructing political principles that can account for our moral intuitions by bringing as many of those intuitions into a coherent whole with one another, not unlike a judge who, on deciding a case, constructs a legal principle that brings precedent into a coherent whole with a novel yet plausible interpretation of a legal concept.

Dworkin’s constructive model captures a feature of constructivism that has endured until the twenty-first century, namely, that political principles depend on us; they are mind-dependent and result from some interpretive work on our part. To bring this feature into sharp relief, Dworkin contrasts a constructive model to a natural model, again foreshadowing a move familiar in the literature; for it is often the case that constructivists contrast their positions to moral realism when developing their arguments. Moral realism takes many forms, but a common feature of moral realism is that it frames moral judgments in terms of our detecting moral facts. We discover these moral facts not unlike the way we discover the color red: we passively receive the datum. Moreover, these entities are simple in that they cannot be analyzed any further; there are no fundamental elements brought together into a set of considerations from which political principles are formulated. Constructivism differs from moral realism on these various points. In contrast to realism, constructivism holds that actions are judged as right or wrong by measuring those actions against principles that are themselves constructed, not detected. Moreover, these principles are justified in virtue of being constructed from more fundamental elements through an appropriate thought process.

The first major attempt to explicitly develop a constructivist position was John Rawls’ “Kantian Constructivism in Moral Theory,” first published in 1980. Prior to “Kantian Constructivism,” Rawls used the metaphor of a ‘contract’ rather than ‘construction’ to describe his theory, tending instead to use the adjective constructive to mean capable of settling moral disputes. “A refutation of intuitionism,” he writes, “consists in presenting the sort of constructive criteria that are said not to exist” (Rawls 1999a, 35). The aim of A Theory of Justice is to defend precisely these constructive criteria. In “Kantian Constructivism,” Rawls introduces the term constructivism without explanation and uses it to denote a particular kind of political argument reflecting a particular view about justification and objectivity. He writes:

Kantian constructivism holds that moral objectivity is to be understood in terms of a suitably constructed social point of view that all can accept. Apart from the procedure of constructing the principles of justice, there are no moral facts. Whether certain facts are to be recognized as reasons of right and justice, or how much they are to count, can be ascertained only from within the constructive procedure, that is, from the undertakings of rational agents of construction when suitably represented as free and equal moral persons (Rawls 1999b, 307).

This suggests a two-step process: (1) constructing a social point of view acceptable to all, and (2) constructing principles of justice from within that point of view. Somewhere between the publication of “Kantian Constructivism” and Political Liberalism, published in 1993, Rawls decides to express himself differently. When explaining political constructivism, Rawls clarifies what he takes to be constructed:

First, in this form of constructivism, what is it that is constructed? Answer: the content of a political conception of justice. In justice as fairness this content is the principles of justice selected by the parties in the original position … A second question is this: as a procedural device of representation, is the original position itself constructed? No: it is simply laid out (Rawls 1993, 103).

The two-step construction noted above is now reduced to one step, namely, constructing the principles of justice. The social point of view from which the construction takes place is not constructed, but simply laid out. The varieties of constructivism that follow the publication of Rawls’ “Kantian Constructivism” represent competing views on how these separate tasks are to be conducted. While each variant lays out a social point of view and defends competing principles, they have in common the basic idea that the appropriate set of political principles is the outcome of an appropriate form of thinking. Moral judgments are correct when they conform to these principles. The task of political argument is to join together all the relevant elements into one unified scheme of practical reason, that is, a social point of view, so that the deliberations constrained by that scheme arrive at—or construct—the proper principles of justice. Absent this scheme, there are no criteria for guiding political action or justifying our institutions.

And so, in 1980, constructivism begins to take shape as a distinctive approach to moral and political theory. In the decades following “Kantian Constructivism,” the literature on constructivism proliferates at an increasingly fast rate, and an increasing percentage of the literature focuses on moral constructivism rather than political constructivism. The publications of T. M. Scanlon, Christine Korsgaard, and Onora O’Neill begin to form a body of work that, together with John Rawls’, shapes key debates in normative ethics and political philosophy as well as in metaethics.

3. Political Constructivism: Two Formulations

The central idea behind political constructivism is that an appropriate set of political principles is constructed from suitably formed deliberations. These deliberations assemble fundamental elements—such as attitudes, concepts, ideals, beliefs, values and precepts, along with their application to certain problems or contexts in which our normative deliberations take root—into a set of reasons from which principles are formulated. This is an abstract idea that needs to be filled out with some content if it is to be fully understood. The most famous and substantial formulation of it is John Rawls’ theory of justice, which he calls justice as fairness. Justice as fairness begins with a simple idea: the most appropriate conception of justice is one that people would choose in a fair situation (Rawls 1999b, 310). A fair situation is a hypothetical choice procedure called the original position. It organizes various concepts, considered judgments, and precepts into a procedure that frames deliberations. Anyone deliberating within this procedure will reason according to these elements of rationality and reasonableness. In other words, these building blocks provide the raw material from which principles of justice are constructed.

What are these starting points? They include common precepts of rationality, such as: if one desires a particular end, it is rational to follow the means for achieving that end; if the end can be realized in more than one way, it is rational to choose the less burdensome way; if agreements between parties are mutually beneficial and each party can be given full assurance that the other will abide by the terms of the agreement, it is rational to enter into the agreement; if times are uncertain, it is rational to rank alternatives by their worst possible outcome and then pick the alternative whose worst outcome is least bad. These precepts of rationality are guided in their application toward a particular set of ends called primary goods. In addition to these precepts of rationality and their related ends, the original position attempts to model precepts of reasonableness. Reasonable people are ready to propose principles as fair terms of social cooperation and to abide by them willingly, even at the cost of their own interests in particular situations, provided that others accept those terms. Rawls models reasonableness into the original position by including within it a veil of ignorance that precludes parties from knowing their specific circumstances; a condition of publicity that ensures parties understand the public nature of the agreement; a symmetric positioning of the parties’ situation with respect to one another; formal constraints of generality and universality; and a list of traditional principles from which the parties choose (Rawls 1999a, 105–130).
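The last precept of rationality listed above, ranking alternatives by their worst possible outcome, is commonly called the maximin rule. The following is a minimal sketch of that rule in Python, offered only as an illustration. The function name `maximin_choice`, the scheme labels, and the payoff numbers are hypothetical and do not come from Rawls; they stand in for how well each social position might fare under each alternative.

```python
from typing import Mapping, Sequence


def maximin_choice(alternatives: Mapping[str, Sequence[float]]) -> str:
    """Return the alternative whose worst possible outcome is best."""
    if not alternatives:
        raise ValueError("at least one alternative is required")
    # Rank each alternative by its worst outcome, then pick the one
    # whose worst outcome is highest (ties go to the first one listed).
    return max(alternatives, key=lambda name: min(alternatives[name]))


# Hypothetical payoffs for the possible positions under each scheme.
schemes = {
    "scheme_a": [1, 5, 20],  # worst outcome: 1
    "scheme_b": [4, 6, 9],   # worst outcome: 4
    "scheme_c": [3, 8, 12],  # worst outcome: 3
}

chosen = maximin_choice(schemes)
print(chosen)  # scheme_b
```

Note that the rule ignores how good the best outcomes are: scheme_a offers the highest single payoff but is rejected because its worst outcome is the lowest.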

The precepts of rationality and reasonableness are modeled as a thought procedure anyone can enter into at any time. In A Theory of Justice, Rawls argues that anyone deliberating from within the original position will arrive at the same conclusion—they will choose the same two principles of justice. As a result, the original position realizes the general aim of constructivism by bringing together abstract precepts of rationality with a conception of persons and society in a set of reasons that supports a particular set of principles. Indeed, Rawls’s procedural argument is so well known and so well developed that constructivism is often taken to be synonymous with the idea that whatever results from a hypothetical thought experiment, such as the original position, constitutes the correct set of principles. For example, some describe the constructivist as a hypothetical proceduralist. “He endorses some hypothetical procedure as determining which principles constitute valid standards of morality” (Darwall, Gibbard, and Railton 1992, 140). Similarly, Brian Barry defines constructivism as “a theory to the effect that what comes out of a certain kind of situation is to count as just” (Barry 1991, 266). Sharon Street says that the bumper sticker slogan of constructivism is “no normative truth independent of the practical point of view” (Street 2010, 366). The works of T. M. Scanlon deepen the characterization of constructivism as a form of proceduralism, and critics have further solidified this interpretation by fixing on various weaknesses of procedural arguments. The combined effect is that proceduralism has become the default interpretation of political constructivism.

Proceduralism has taken many forms since the publication of A Theory of Justice. For example, Rawls’ later works use complex conceptions of persons and society to give the original position a more substantive form (Rawls 1993, 93). Since these conceptions are informed by the shared public values of a democratic society, the starting points of construction are more substantive than those identified by A Theory of Justice. Indeed, many of the debates and criticisms of a procedural formulation of political constructivism center on whether the starting points should be more universal and objective, as in A Theory of Justice, or more local and substantive, as in Political Liberalism.

The procedural interpretation of political constructivism is by far the most common, but it is not the only one. A second, less developed account is already present in “Kantian Constructivism” where Rawls draws a link between the original position and the practical task of political argument. Rawls begins his article in a manner consistent with a procedural formulation by noting that “What distinguishes a Kantian form of constructivism is essentially this: it specifies a particular conception of the person as an element in a reasonable procedure of construction, the outcome of which determines the content of the first principles of justice” (Rawls 1999b, 304). However, he quickly adds that the Kantian conception of justice is meant to address an impasse in our recent political history, namely, “the apparent conflict between freedom and equality in a democratic society” (Rawls 1999b, 305). This impasse, and the attempt to break it, impacts the argument’s logical structure, for principles are now justified in virtue of their breaking the impasse rather than in virtue of being the outcome of a choice procedure. Constructivism becomes “political” not because it appropriates political values, but because it engages in a practical enterprise of solving political problems. Christine Korsgaard expresses this idea when she writes, “Rawls, like Hobbes before him, thinks that justice is the solution to a problem” (Korsgaard 2003, 112). On this formulation, justice as fairness is justified if it solves the conflict between freedom and equality in a democratic society. If it does not solve the problem, it is unjustified.

This second account of constructivism might be called the practical formulation of constructivism. Together with the procedural account, it shows how political constructivism reflects two great traditions of moral and political thought: Kantian ethics and social contract theory. Much as the procedural formulation relies on a procedure of construction, Kant employed the Categorical Imperative to determine whether subjective maxims are universalizable and thus objectively valid. The Kantian Categorical Imperative specifies a moral point of view that might be described as “suitably joining together all the requirements of our (human) practical reason, both pure and empirical, into one unified scheme of practical reasons” (Rawls 1999b, 515). This scheme guides deliberations so as to construct correct moral judgments. By contrast, the social contract tradition identifies the state of nature as a structural problem in need of rectification. It is the nature of the problem that frames deliberations on the content of the contract. Once the contract is established and agreed upon, it places new obligations upon the contracting parties, thereby constructing a moral order that had previously not existed.

The development of constructivism over the past three decades reflects these two traditions. Sometimes the particular variant of constructivism emphasizes one tradition over the other; sometimes it trades on both. In any case, a critical division between the variants concerns whether the constructed set of principles is formulated and justified independently of any conception of the good those principles might later realize, or whether it is formulated and justified in relation to the good those principles might later realize. The former is deontological; the latter is teleological. Procedural formulations are typically deontological in that they are fashioned on mathematical proofs that move from widely acceptable axioms to more substantial political theorems. Practical formulations tend to be teleological insofar as the practical analysis is guided by the good that would be realized should that problem be resolved.

The procedural and practical formulations of constructivism serve as two entry points for understanding how political constructivism has been applied and might further be developed. Whether one formulation proves more successful depends on whether one can make more sense of the idea that the best political action is an action conforming to a normative law we give ourselves out of reasons we all can share. Only that variant will be “constructive” in the sense of being “capable” of settling morally charged political disputes. Or, absent the ambitious goal of actually settling disputes on reasons we all can share, the successful variant should at least fix the point at which political disagreements arise by bringing out into the light of day the reasons why people arrive at political judgments that are not only different but are in fact incommensurable.

4. Political Constructivism and Procedures

One way to describe the procedural formulation of political constructivism more thoroughly is to recall that constructivism can be characterized as a view about the nature of political argument or analysis, especially as it pertains to justification and objectivity. If political principles are to be justified as obligatory and morally authoritative, it is insufficient to derive them from a social point of view without also explaining why that social point of view is also authoritative; for absent a defense of the point of view, the purported justification of principles will appear wanting. In the course of time since Dworkin introduced the term, political philosophers have developed three general strategies for defending the elements of a procedure. They include reflective equilibrium, narrowing the scope of the investigation, and, at its most ambitious, elucidating the demands of practical reason from which normative political principles can be established.

Reflective equilibrium refers to a back and forth process that seeks coherence among the different parts of a conception of justice. These parts include the principles of justice, the conditions of the hypothetical procedure, and the firm moral judgments we make in everyday life. Once equilibrium is achieved, the different parts of the theory are justified in terms of their mutual support. The “key idea underlying reflective equilibrium is that we test various parts of our system of moral beliefs against other beliefs we hold, seeking coherence among the widest set of moral and non-moral beliefs by revising and refining them at all levels” (Daniels 1996, 2). Accordingly, the fundamental elements comprising a hypothetical procedure are justified in virtue of their supporting and being supported by the match between the outcome of the procedure (the principles of justice) and our firmly held moral intuitions, which Rawls calls considered judgments. “By going back and forth,” Rawls writes, “sometimes altering the conditions of the contractual circumstances [hypothetical procedure], at others withdrawing our [considered] judgments and conforming them to principle, I assume that eventually we shall find a description of the initial situation that both expresses reasonable conditions and yields principles which match our considered judgments duly pruned and adjusted. This state of affairs I refer to as reflective equilibrium” (Rawls 1999a, 18).

Critics have raised tough questions about a coherentist justification of political principles; for if our intuitive moral judgments form part of the justificatory process, then the resulting principles cannot serve as independent standards against which those same judgments can be assessed and found wanting (Hare 1973, 147; Nagel 1973, 228; Sandel 1998, 49). The risk of circular reasoning slips into the process and thus undermines its justificatory force. To strengthen the critical dimension of the resulting political principles, procedural constructivists have tended to move in either one of two directions. They have either conceded moral breadth in order to strengthen the justificatory core of constructivism by narrowing the scope of its investigation (James 2012; Roberts 2007), or they have refocused their attention on accounts of agency and rationality in order to more clearly elucidate the demands of practical reason (O’Neill 1996).

Rawls’ writings subsequent to A Theory of Justice can be interpreted as taking the former path. In these works, he paid increasingly close attention to liberal values by linking justification to “our deeper understanding of ourselves and our aspirations,” and bracketing “claims about the essential nature and identity of persons” (Rawls 1999b, 306–07, 388). The conditions of the original position are therefore conditions already accepted by members of a liberal democracy, or conditions such members could be made to accept because of their implicit presence within the public culture of a democratic society they share. The hope is that by localizing practical reason to a particular kind of political tradition one can simultaneously strengthen the justification of the argument for that audience. There is no attempt to provide a comprehensive normative political argument true for all peoples at all times. Instead, the program is much more modest, relying on values already at home in the subject addressed.

This strategy has been criticized on a number of grounds. Some have questioned the veracity of Rawls’s empirical claims, others worry that the search for stability within a pluralist society lowers the bar of justification too much, and still others claim that Rawls’ conception of persons remains too ideal and detached from reality (Klosko 1993; Barry 1995; O’Neill 1996). Although these criticisms are forceful objections to the usual interpretation of Rawls’ Political Liberalism, Aaron James has developed a variant of the strategy less susceptible to them. James describes political constructivism as “a methodology of substantive justification… The hope is to show, as though by something vaguely akin to mathematical demonstration, that proposed principles can be worked out, in steps which are themselves manifestly reasonable, from rudimentary and highly plausible ideas arising from within a society’s own essentially social kind of practical reason” (James 2013, 251–52). The aim is to “justify principles that tell us how existing versions of the practice would have to be reformed if they are to be justifiable” (James 2012, 29). If practices such as constitutional democracies or global free trade regimes are not inherently unjust, then this could be an attractive path to pursue, since the fundamental elements from which principles are constructed are contained within the practice itself. These elements include the practice’s participants, its purpose, and the circumstances favorable to its continuation over time. Provided the description of the practice is accurate and generally acceptable, the argument in favor of a particular set of principles should be authoritative to that practice.

One criticism of this strategy is that it turns a contingent empirical fact into an absolute constraint on one’s conception of justice, thereby undermining that conception’s critical leverage by rendering it ill equipped to determine why these practices might fall short of justice (Valentini 2011, 412). This becomes apparent in Rawls’ Law of Peoples, which “sets out guidelines for a liberal society’s foreign policy in a reasonably just Society of Peoples” (Rawls 2001, 128). The concern is that a state-centric global order (or peoples-centric global order) lends itself to certain injustices because there are no overarching institutions that can foster trust and cooperation among nations. The Law of Peoples fails to shed light on the unwelcome incentives created by a state-centric order, since it assumes from the beginning that the practice is normatively innocuous and, as a result, risks justifying an unjust, or less than fully just, status quo.

In order to avoid this outcome, one would have to either attach the fundamental elements of construction to the good realized by the practice in question, or move in the opposite direction by recovering the more abstract features of practical rationality. The former option shifts the grounds of justification toward a teleological structure of justification, which is associated with a practical formulation of constructivism discussed in the next section. The latter option alone remains within the framework of a deontological justification, and is perhaps best illustrated by some of the work of Onora O’Neill. O’Neill’s constructivism abstracts from our more richly idealized conception of persons by articulating more meager—and thus what she believes to be more easily justifiable—precepts of rationality, agency, and mutual independence (O’Neill 1988). For example, O’Neill maintains that rationality can be construed as the capacity to understand and follow some form of social life; and mutual independence can be interpreted as an agent’s capacity to develop varying sorts and degrees of dependency and interdependency. These elements help frame the question: What principles can a plurality of agents of minimal rationality and with varying degrees of dependence live by? While these minimal, formal requirements of rationality and agency might be too meager to construct substantive principles of justice entitling people to certain goods, O’Neill thinks they can inform us as to which principles a group cannot live by. The elements of construction therefore help us construct principles of obligation prohibiting those actions that undermine the capacity for agency.

O’Neill’s variant reclaims the moral breadth and universality of normative political principles by constructing them from fundamental elements that are generally weak and widely acceptable. She believes every rational person can understand and accept these fundamental elements and thus can agree on the obligations constraining their actions. Rawls suggests something similar in A Theory of Justice. Like O’Neill, Rawls thinks that the justification of justice as fairness is in part defended on generally shared and preferably weak conditions. Moreover, his published articles leading up to A Theory of Justice have been described as beginning with “as narrow and morally neutral a conception of rational agency as can plausibly be drawn” (Wolff 1977, 13). The ambition reflected in these works concerns the derivation of substantive principles from formal premises through a kind of rational choice bargaining game. O’Neill can be interpreted as developing a similar position by returning to these earlier ambitions, albeit not in the language of rational choice theory. Together with the more descriptively rich practice-based variant suggested by James, the procedural formulation of constructivism can be characterized as moving in either an abstract, more universal direction, or a substantive, more localized direction. Some have tried to bridge the two ends of the spectrum by suggesting various levels of construction. For example, Peri Roberts argues that primary constructions start from bare concepts of persons and society and formulate general principles of justice with universal scope (Roberts 2007). However, once armed with these bare concepts and general principles, the constructivist can thicken the concepts in a secondary procedure by drawing on the ideals and values of a particular society.

What is common to each of these arguments is their form. Each aims to construct an argument modeled on a mathematical demonstration. The hope is to move from generally weak and broadly acceptable axioms to more substantial political theorems via a procedure of construction. The strength of this form of constructivist argument depends not only on the plausibility of the procedure, but also on whether the appropriateness of the procedure can be specified without appealing to the kinds of normative judgments the procedure is supposed to produce; for if the appropriateness of the procedure depended on such judgments, it would assume the very thing it claims to construct. G. A. Cohen has argued that Rawls’ constructivism fails to meet this challenge because the two principles resulting from the original position depend for their justification on unarticulated background principles of justice (Cohen 2008). Cohen’s argument is based on a deeper thesis about the relationship between facts and principles. On Cohen’s view, “a principle can respond to (that is, be grounded in) a fact only because it is also a response to a more ultimate principle that is not a response to a fact” (Cohen 2008, 229). This is a logical argument. If it is correct, it strikes a notable blow against the constructivist position; for if the procedure reflects factual considerations, as it often does, then Cohen can maintain that anyone affirming a principle resulting from the procedure must also affirm a more fundamental principle surviving denial of those same facts. These fact-insensitive principles are the valid principles of justice; they are logically prior to the principles generated by a procedure and thus are not themselves constructed.

Cohen’s criticism is directed against John Rawls, but it applies to any form of constructivism that uses facts about persons and society when formulating the procedure. The general idea, already reflected in a number of other criticisms of constructivism, is that the process of constructing substantive normative principles relies upon unarticulated, non-constructed principles. Consequently, the constructivist cannot maintain the view that all political principles are constructed.

5. Political Constructivism and Social Problems

In A Theory of Justice, Rawls writes: “The theory of justice is a part, perhaps the most significant part, of the theory of rational choice” (Rawls 1999a, 15). Describing A Theory of Justice as a rational choice theory is less common than it used to be. In his Reconstruction and Critique of A Theory of Justice, Robert Paul Wolff speculates that Rawls’ “original intention must have been to write a book very much like Kenneth Arrow’s Social Choice and Individual Values” (Wolff 1977, 4). Wolff then goes on to interpret Rawls’s work in terms of a rational choice model. Similarly, in his 1989 treatise, Theories of Justice, Brian Barry interprets Rawls’ argument from two perspectives: a rational choice model and a competing approach Barry calls justice as impartiality. A rational choice characterization of A Theory of Justice views the participants of the original position as engaged in a bargaining contract concerning political principles. The failure to establish an agreement returns a person to her position or holdings prior to any cooperative arrangement, and this position is called the noncooperative baseline. Now, it is assumed that the parties are rationally motivated by their own self-interests to move beyond the noncooperative baseline and arrive at a mutually advantageous arrangement. In rational choice theory, the most mutually advantageous series of outcomes is referred to as the Pareto Frontier. It represents a series of efficient outcomes insofar as it is not possible to move away from a point on the frontier so as to improve one person’s position without worsening another’s. The deliberations within the original position represent a move away from a noncooperative baseline to a specific point on the Pareto Frontier.
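The ideas of a noncooperative baseline and a Pareto Frontier can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not a reconstruction of Rawls’, Wolff’s, or Barry’s models: the two-party payoffs, the baseline value, and the function names (`dominates`, `pareto_frontier`, `mutually_advantageous`) are all hypothetical. It first keeps the outcomes that no other outcome improves upon for one party without cost to the other, and then keeps those frontier outcomes that leave every party better off than the baseline.

```python
from typing import List, Tuple

Outcome = Tuple[float, float]  # (payoff to party 1, payoff to party 2)


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is at least as good as b for everyone and better for someone."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_frontier(outcomes: List[Outcome]) -> List[Outcome]:
    """Keep only the outcomes that no other outcome dominates."""
    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]


def mutually_advantageous(outcomes: List[Outcome], baseline: Outcome) -> List[Outcome]:
    """Frontier outcomes that leave every party better off than the baseline."""
    return [
        o for o in pareto_frontier(outcomes)
        if all(x > b for x, b in zip(o, baseline))
    ]


# Hypothetical noncooperative baseline and candidate arrangements.
baseline = (2, 2)
candidates = [(2, 2), (3, 6), (5, 5), (6, 3), (4, 4), (8, 1), (1, 9)]

print(pareto_frontier(candidates))
# [(3, 6), (5, 5), (6, 3), (8, 1), (1, 9)]
print(mutually_advantageous(candidates, baseline))
# [(3, 6), (5, 5), (6, 3)]
```

The sketch shows why efficiency alone does not settle the bargaining problem: several points on the frontier improve on the baseline for both parties, so some further consideration is needed to select a specific point among them.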

Rawls would later regret having described his theory as part of a rational choice theory, calling it a very misleading error (Rawls 1999b, 401). Nevertheless, what is particularly interesting about a rational choice characterization of A Theory of Justice is that it reflects, to some extent, the two different formulations of constructivism. On the one hand, rational choice models embody the rigor and certainty of mathematical demonstrations insofar as substantive conclusions are thought to derive from premises that, though not formal, are generally weak and widely acceptable. The procedural formulation of constructivism reflects this mathematical model. On the other hand, rational choice models are often described as solutions to problems cast as bargaining games. If the bargaining game concerns the problem of justice—or how the benefits and burdens of social cooperation are to be divided among people conceived as free and equal—then the problem itself contains normative resources for constructing the principles that will serve as the solution. The practical formulation of constructivism reflects this key idea. Notice that the two formulations locate the resources for constructing political principles in different places. The procedural account locates the fundamental elements in generally weak and broadly acceptable ideas and precepts. These building blocks are articulated independently of the good they may help bring about when assembled into principles and applied to the situation. Conversely, the practical account locates the fundamental elements of construction in relation to the good realized when the principles are applied. This is because principles of justice are conceived as solutions to problems rather than outcomes of procedures. We begin not with generally weak and widely acceptable ideas about persons and society but rather with particular problems faced by individuals. 
Moreover, it is in formulating the problem clearly that we are directed to its solution, since the problem contains resources that point the way toward it. It is with these resources that the practical account of constructivism in part begins. Consequently, the conceptual starting points are in part located in the good realized once the solution is applied and the problem resolved.

Christine Korsgaard offers a variant of this formulation by characterizing the concept of justice as a solution to a distribution problem concerning collectively created goods (C. Korsgaard 2003). A conception of justice is a particular solution to this problem; it should answer questions like: Who gets what? Who makes what? How much of what one makes should one get? Who is excluded from getting what others have made? A society can consistently answer these questions over time by referring to—implicitly or explicitly—principles of justice. These principles might assign rights or entitlements to individuals, or they might ensure fair and open access to the courts, or they might protect political voice. In each case they must express a particularly thick conception of political right by providing a fairly specific solution to the problem. For example, a conception of justice might express a libertarian set of principles, such as Nozick’s principles of acquisition and transfer; or it might express a liberal egalitarian principle, such as Rawls’ difference principle. On Korsgaard’s view, the task of practical philosophy is to move from abstract normative concepts, such as justice, to a particular normative conception, such as Rawls’s justice as fairness, “by constructing an account of the problem reflected in the concept that will point the way to a conception that solves the problem” (C. Korsgaard 2003, 116). Constructivism does this by conceiving normative concepts and principles as functional—they play a particular role in helping solve the various practical problems that arise in social life. In the absence of such problems, constructivism does not have a toehold from which to begin constructing principles of justice.

There are two important features of a practical formulation of constructivism. First, the resources for constructing principles are in part located in the practical problems humans face, or more precisely the good brought about when those problems are solved. Consequently, we must first look to the nature of the problem before we can understand why principles are obligatory and for what reasons they are authoritative. Second, “a sufficiently detailed and accurate description of the problem actually yields the solution” (C. Korsgaard 2003, 115). This is because the precepts of practical reason and conceptions of persons from which principles are constructed arise from within the problem itself. To see this, consider Korsgaard’s moral constructivism, which in its most recent formulation is primarily concerned with the problem of agency, or the question: How is it possible for a person to act autonomously and effectively over time? (C. M. Korsgaard 2009). Korsgaard begins with the observation that humans are free; it is an inescapable fact of life that we are free to choose and act. The process of acting freely is at the same time a process of constructing our identities over time. If we are to construct unified lives, we need both the freedom to act and a set of principles for determining the reasons on which we act. In the absence of freedom our choices would fail to be our own and we would cease being the authors of our lives. In the absence of action determined by principle, our choices would be arbitrary and we would fail to create unified lives reflecting identity and integrity. The problem is to articulate a concept of freedom that is also law abiding. Korsgaard adopts Kant’s Categorical Imperative as the solution, since the Categorical Imperative tells us to act in such a way that the rule on which one acts can be adopted as a law by all rational persons. Insofar as the imperative recognizes the action as being caused by the person, it preserves freedom. 
Insofar as it requires the universalization of the rule on which the action is based, it preserves lawfulness. Indeed, Korsgaard thinks the Categorical Imperative principle is constitutive of autonomous, effective agency. That is, we simply cannot understand ourselves as autonomous, unified agents without also ascribing to ourselves this particular principle of practical reason. Consequently, a sufficiently detailed and accurate description of the problem of human agency actually yields the Categorical Imperative as a solution, or so Korsgaard argues.

Korsgaard is admittedly concerned with a constructivist account of practical reason rather than a constructivist account of justice or legitimacy. Whether such a constructivist project is plausible is beyond the scope of this article (see “Constructivism in Metaethics”). But what is important about Korsgaard’s constructivism is that it articulates a notably different structure of justification than the one expressed by procedural formulations. According to proceduralism, principles are justified when they result from a suitably framed procedure, similar to the way presidents become legitimate by running in fair and open elections. By contrast, Korsgaard justifies principles in terms of their function: they solve practical problems. Moreover, principles are objective when they uniquely solve the problem, that is, when there exists no competing principle that can also solve the problem. To illustrate the point, Korsgaard draws on Rawls’ Political Liberalism and the problem of liberalism. Rawls describes the problem of liberalism as follows: “[H]ow is it possible for there to exist over time a just and stable society of free and equal citizens, who remain profoundly divided by reasonable religious, philosophical, and moral doctrines?” (Rawls 1993, xx, xxvii, 4). Korsgaard thinks Rawls’ two principles solve this problem insofar as they describe what a liberal society must do in order to be liberal (C. Korsgaard 2003, 115). Consequently, Rawls’ conception of justice is justified; it functions so as to solve the problem of liberalism. However, Rawls’ conception of justice is not the only justified conception, since other liberal conceptions can also solve the problem. It follows that Rawls’ principles are justified but not objectively so, since they do not uniquely solve the problem. Consequently, rival conceptions of liberalism are equally defensible insofar as they function equally well. 
In order to construct a uniquely objective set of principles, one must abstract a common core from the several justified sets of principles. Rawls does something like this when he identifies three abstract principles characteristic of any liberal society. These include: (1) the specification of certain basic rights, liberties and opportunities; (2) an assignment of priority to those rights with respect to claims of the general good, and (3) some measure assuring to all citizens adequate all-purpose means to make effective use of their rights (Rawls 1993, 6). Although neither Rawls nor Korsgaard makes this argument, it is possible to think of this abstract core as an objective set of principles constitutive of liberalism, since one could hardly describe a liberal society without also presupposing them as governing principles.

Another way to frame Rawls’ Political Liberalism within a practical formulation of constructivism is in terms of the moral concerns implicit in the problem of liberalism. These concerns can serve as criteria for assessing whether principles solve the problem. Again, take Rawls’ problem of liberalism. It asks how it is possible for there to exist over time a just and stable society of free and equal citizens, who remain profoundly divided by reasonable religious, philosophical, and moral doctrines (Rawls 1993, xx, xxvii, 4). Notice that the problem attaches a set of concerns to a fact about society. The relevant fact is the fact of pluralism—in a liberal society citizens are profoundly divided by reasonable conceptions of a good life. The concern is social stability given this fact. The challenge is to find a set of principles that answer this concern. In addition to the fact and concern, the problem of liberalism expresses a conception of reasonableness. Citizens are reasonable when they are both (a) ready to propose fair terms of cooperation they reasonably believe those to whom they are offered can reasonably accept and (b) appreciate certain factors, or burdens of judgment, that render it impossible to fully reconcile disagreements over all matters of value, including some matters of justice (Rawls 1993, xliv, 58–59). Certain comprehensive doctrines—the stuff of pluralism—become reasonable when those holding them recognize the social implications of the burdens of judgment, and allow the effects of this recognition to take root in one’s “attitude (including toleration) toward other comprehensive doctrines” (Rawls 1993, 375).

With these building blocks in place, we can begin to see how the problem of liberalism points the way toward a solution. The problem expresses concerns and concepts that can be formulated as criteria for assessing principles of justice. For example, the criterion of reciprocity obliges citizens to defend their political positions with reasons they honestly believe those to whom they are offered might reasonably accept (Rawls 1993, xliv). It is implicit in the concern for stability among reasonable citizens profoundly divided by reasonable conceptions of a good life. Consequently, the problem of liberalism contains within it the resources for articulating the standards against which competing conceptions of justice can be assessed. If citizens find the formulation of the problem compelling, they will simultaneously agree on these standards, since these standards are already implicit in the formulation of the problem. Conceptions of justice meeting these standards are justified because they answer the concerns reflected in the problem and thus function so as to solve the problem.

This is a powerful form of political argument. Its essential point is that the epistemic standards for assessing rival conceptions of justice are internal to the problems we encounter in social life. Analyses of these problems can uncover the standards against which principles are justified. This creates a straightforward, instrumental assessment of political principles and the public policies based on them. Principles and policies are justified when they answer the concerns implicit in the problem. The moral authority of these principles and policies is felt by anyone recognizing the problem.

The practical interpretation of constructivism is not without its difficulties, since the justification of principles hinges on the description of the problem. It is not obviously the case that people will agree on the formulation of the problem. For example, if one accepts Rawls’ description of the problem of liberalism, then one is also committed to accepting some conception of liberal justice as binding on social practices. But one might not accept Rawls’ description of the problem and thus fail to see how the principles solving Rawls’ problem are binding on his or her actions. Consequently, the practical interpretation of constructivism shifts the question of justification onto the descriptions of problems. This mirrors the way in which a proceduralist formulation shifts the question of justification onto the account of procedures. In each case, the justification of principles first requires a defense of something else, the procedure or the problem.

Korsgaard and Rawls represent different directions for addressing this difficulty. Korsgaard hopes to ground the description of agency on generally weak and widely acceptable ideas about freedom and unity. By contrast, Rawls localizes the description of the problem to a particular domain of political concern. These two directions mirror the two directions taken by those developing a procedural formulation of constructivism. In both cases the idea is to offer a better defense of the fundamental elements from which principles are constructed, for in the absence of such a defense the principles themselves will lack justificatory force.

6. Conclusion

In his Reconstruction and Critique of A Theory of Justice, Robert Paul Wolff suggests that the problem with which Rawls begins is not the impasse in our recent political history concerning the conflict between freedom and equality, but rather “the impasse in Anglo-American ethical theory at about the beginning of the 1950’s” (Wolff 1977, 11). This latter impasse concerns the debate between utilitarianism and intuitionism during the first half of the twentieth century. Wolff interprets Rawls as trying to advance normative political theory beyond this impasse by drawing on each position’s respective strengths without succumbing to their fatal flaws. The strength of utilitarianism is its straightforward assertion of human happiness as the metric by which moral right is measured. It offers a clear, plausible, and constructive criterion for settling moral disputes on reasons all can understand. Its fatal flaw, however, is that the metric itself—overall human happiness—can also serve as a reason for violating the individual autonomy and freedom of persons. Intuitionism avoids this fatal flaw by flatly asserting the inviolability of human autonomy and freedom, thus protecting individuals against those who might sacrifice human rights in order to achieve a greater good. Its fatal flaw, however, is that it offers no reason for treating autonomy and freedom as inviolable, and thus fails to explain why these features of human dignity place moral constraints on actions that might otherwise produce some valuable end.

Wolff interprets Rawls as sketching a way out of this impasse by developing an account of practical reason that grounds the metric of moral assessment on reasons all can understand. For “without rational grounds for choosing one system of ends or goals rather than another… we would be forced to retreat to the subjectivity of prudence, as utilitarianism, for all its efforts to the contrary, ultimately does; or else we would, in desperation, simply have to posit substantive objective moral principles without a suggestion of rational argument, as does intuitionism” (Wolff 1977, 20).

The impasse Wolff describes is indeed the impasse constructivism tries to break. In each of the variants described above, the aim is to provide a method of analysis by which a set of principles can be justified. This is accomplished by defending—or making plausible—the use of certain fundamental elements in the construction of a favored set of principles. Moreover, the analysis should be as clear and as easy to follow as a utilitarian analysis. Indeed, it is in the clarity of the analysis that constructivism’s greatest impact ultimately rests, since the clarity of the analysis represents a compelling form of political argument. What constructivism is ultimately concerned with is the nature of normative political argument and each variant described above can be interpreted as an effort to find a compelling form of political argument that can justify normative political principles. In short, it seeks a methodology of substantive justification (James 2013, 251). The various political principles constraining public policy are the result of this methodology. Or, to put the same point the other way around, the method of political analysis constructs the principles. Apart from these constructions, there are no moral facts or true political judgments, nor are there ways of assessing the moral appropriateness of political action. It is only when deliberations are properly constrained by a particular methodology that the resulting product is a principle against which our policies can be assessed as right or wrong. If the methodology or form of political argument is compelling, then a basis for settling fundamental political questions can be established on reasons all can understand. 
Although such a basis cannot guarantee agreement, it should at least narrow our political debates by pinpointing where disagreements arise, bringing out into the light of day the reasons why people arrive at political judgments that are not only different but are sometimes incommensurable. Consequently, the holy grail of political constructivism is not a set of principles we can all agree upon, but rather a method of normative political analysis so compelling that no clear-headed person can plausibly deny it without also appearing entirely tone-deaf to the kinds of concerns peculiar to political life. Such a method would enable society to deal with its political problems in a constructive manner by systematically building upon previous successes in an ongoing struggle to make the public domain as just as it can possibly be.

7. References and Further Reading

Barry, Brian. 1991. Theories of Justice. University of California Press.
Barry, Brian. 1995. “John Rawls and the Search for Stability.” Ethics 105 (4): 874–915.
Cohen, G. A. 2008. Rescuing Justice and Equality. Cambridge, Mass.: Harvard University Press.
Daniels, Norman. 1996. Justice and Justification: Reflective Equilibrium in Theory and Practice. Cambridge University Press.
Darwall, Stephen, Allan Gibbard, and Peter Railton. 1992. “Toward Fin de Siècle Ethics: Some Trends.” The Philosophical Review 101 (1): 115–89. doi:10.2307/2185045.
Dworkin, Ronald. 1973. “The Original Position.” The University of Chicago Law Review 40 (3): 500–533. doi:10.2307/1599246.
Hare, R. M. 1973. “Rawls’ Theory of Justice—I.” The Philosophical Quarterly 23 (91): 144–55. doi:10.2307/2217486.
James, Aaron. 2012. Fairness in Practice: A Social Contract for a Global Economy. Oxford University Press.
James, Aaron. 2013. “Political Constructivism.” In A Companion to Rawls, edited by Jon Mandle and David A. Reidy, 251–64. John Wiley & Sons.
Klosko, George. 1993. “Rawls’s ‘Political’ Philosophy and American Democracy.” American Political Science Review 87 (2): 348–59. doi:10.2307/2939045.
Korsgaard, C. 2003. “Realism and Constructivism in Twentieth-Century Moral Philosophy.” Journal of Philosophical Research 28: 99–122.
Korsgaard, Christine M. 2009. Self-Constitution: Agency, Identity, and Integrity. Oxford; New York: Oxford University Press.
Lenman, James and Yonatan Shemmer, eds. 2012. Constructivism in Practical Philosophy. Oxford: Oxford University Press.
Nagel, Thomas. 1973. “Rawls on Justice.” The Philosophical Review 82 (2): 220–34. doi:10.2307/2183770.
O’Neill, Onora. 1988. “The Presidential Address: Constructivisms in Ethics.” Proceedings of the Aristotelian Society 89: 1–17.
O’Neill, Onora. 1996. Towards Justice and Virtue: A Constructive Account of Practical Reasoning. Cambridge; New York: Cambridge University Press.
O’Neill, Onora. 2003. “Constructivism in Rawls and Kant.” In The Cambridge Companion to Rawls, edited by Samuel Freeman, 347–67.
Rawls, John. 1993. Political Liberalism. Columbia University Press.
Rawls, John. 1999a. A Theory of Justice, Revised Edition. Harvard University Press.
Rawls, John. 1999b. Collected Papers. Harvard University Press.
Rawls, John. 2001. The Law of Peoples: With “The Idea of Public Reason Revisited.” Harvard University Press.
Roberts, Peri. 2007. Political Constructivism. Routledge.
Sandel, Michael J. 1998. Liberalism and the Limits of Justice. 2nd ed. Cambridge University Press.
Street, S. 2010. “What Is Constructivism in Ethics and Metaethics?” Philosophy Compass 5 (5): 363–84.
Valentini, Laura. 2011. “Global Justice and Practice-Dependence: Conventionalism, Institutionalism, Functionalism.” Journal of Political Philosophy 19 (4): 399–418. doi:10.1111/j.1467-9760.2010.00373.x.
Wolff, Robert Paul. 1977. Understanding Rawls: A Reconstruction and Critique of A Theory of Justice. 1st ed. Princeton, N.J.: Princeton University Press.

Author Information

Michael Buckley
Email: michael.buckley@lehman.cuny.edu
City University of New York
U. S. A.

Socialism

Table of Contents

1. Socialism and Capitalism: Basic Institutional Contrasts

a. Ownership: Some Preliminaries

b. Private, State, and Social Ownership

c. Economic Systems as Hybrids

2. Socialism vs. Communism in Marxist Thought

3. Why Socialism? Economic Considerations

4. Why Socialism? Democracy

a. Scope

b. Influence

5. Why Socialism? Exploitation

a. Exploitation as Forced, Unpaid Labor

b. Eliminating Exploitation

6. Why Socialism? Freedom and Human Development

a. Formal Freedom

b. Effective Freedom

7. Why Socialism? Community and Equality

a. Why Produce? Communal vs. Market Reciprocity

b. Justice, Inequality, Community

8. Institutional Models of Socialism for the 21st Century

a. Central Planning

b. Participatory Planning

i. Parecon: Basic Features

ii. Allocation in Parecon: Economic Coordination Through Councils

iii. Evaluating Parecon

c. Market Socialism

i. Schweickart’s “Economic Democracy”

ii. Evaluating Economic Democracy

9. References and Further Reading

Author Information

Egalitarianism

Table of Contents

1. What is Egalitarianism?

2. Equality of What?

a. Welfare

b. Resources

c. Capabilities

d. Democratic/Social Equality

e. Primary Goods

f. Luck Egalitarianism

3. Equality of Opportunity

4. Anti-Egalitarianism

a. Sufficiency vs. Equality

5. Domestic or Global?

6. References and Further Reading

Author Information

Jürgen Habermas (1929—)

Table of Contents

1. Biography: Early Life to Structural Transformation

2. Enduring Themes in Formative and Transitional Work

a. Public Deliberation Over Positivist Decisionism and Technocracy

b. From Philosophical Anthropology to a Theory of Social Evolution

3. The Linguistic Turn into the Theory of Communicative Action

4. Discourse Ethics

5. Political and Legal Theory

6. References and Further Reading

a. General Introductions to Habermas

b. Introductory Books and Articles on Specific Themes

i. Biography

ii. Linguistic Turn

iii. Discourse Ethics

iv. Political Theory

c. Works Cited

d. Secondary Scholarship Beyond the Subject-Specific Recommendations Cited Above

Author Information

Veṅkaṭanātha (Vedānta Deśika) (c. 1269—c. 1370)

Table of Contents

1. Background

2. Veṅkaṭanātha’s Life, Works, and Formation

3. Veṅkaṭanātha’s Role within the History of Indian Philosophy

4. Veṅkaṭanātha’s Epistemology, Ontology, and Theology

a. Epistemological Issues

b. Cosmology and Metaphysics

5. State of the Art of Research on Veṅkaṭanātha

6. Abbreviations

7. References and Further Reading

Author Information

Ancient Aesthetics

Table of Contents