Marie Le Jars de Gournay (1565–1645)

A close friend and editor of Montaigne, Marie Le Jars de Gournay is best known for her proto-feminist essays defending equality between the sexes.  Her unusual lifestyle as a single woman attempting to earn her living through writing matched her theoretical argument on the right of equal access of women and men to education and public offices.  Gournay’s extensive literary corpus touches a wide variety of philosophical issues.  Her treatises on literature defend the aesthetic and epistemological value of metaphor in poetic speech.  Her works in moral philosophy analyze the virtues and vices of the courtier, with particular attention to the evil of slander.  Her educational writings emphasize formation in moral virtue according to the Renaissance tradition of the education of the prince.  Her social criticism attacks corruption in the court, clergy and aristocracy of the period.  In her writings on gender, Gournay marshals classical, biblical, and ecclesiastical sources to demonstrate the equality between the sexes and to promote the rights of women in school and in the workplace.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Themes
    a. Language, Literature, Aesthetics
    b. Moral Philosophy
    c. Social Criticism
    d. Philosophy of Education
    e. Gender and Equality
  4. Reception and Interpretation
  5. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. Biography

Born on October 6, 1565, Marie Le Jars belonged to a minor aristocratic family.  Her father Guillaume Le Jars hailed from a noble family in the region of Sancerre; her mother Jeanne de Hacqueville descended from a family of jurists.  Her maternal grandfather and paternal uncle had distinguished themselves as writers.  After her birth, her father purchased the estate of Gournay-sur-Aronde; the family name now included “de Gournay.”

After the death of her father in 1578, Marie Le Jars de Gournay retired with her mother and siblings to the chateau of Gournay.  An avid reader, she provided herself with her own education, centered on the classics and French literature.  By the end of her adolescence, she had become fluent in Latin, learned at least some Greek, and had become a devotee of Ronsard and the Pléiade poets.  Philosophically, she read Plutarch and other Stoic authors.  Once she discovered the Essays of Montaigne, she became his enthusiastic disciple, with special interest in the more Stoic strands of his thought.

In 1588 Gournay personally met with Montaigne; the meeting would establish a lifelong friendship.  Shortly after this encounter, Gournay wrote her novella The Promenade of Monsieur de Montaigne, Concerning Love in the Work of Plutarch.  As subsequent correspondence and meetings deepened their association, Montaigne referred to Gournay as his “adopted daughter” and increasingly shared his intellectual preoccupations with her.

After the death of her mother in 1591, Gournay found herself in straitened financial circumstances.  In 1593, the widow of the recently deceased Montaigne asked Gournay to edit a posthumous edition of the works of Montaigne.  After working for more than a year at Montaigne’s estate in the Bordeaux region, Gournay produced the new edition of the works, completed by a long preface of her own composition, in 1595.  Later in life, Gournay would produce numerous new and expanded editions of the works of Montaigne.

During the next decades, Gournay led a precarious existence in the salons and courts of Paris.  As a single woman attempting to make a living through writing, translation, and editing, she became the object of mockery as well as of fascination in the capital’s literary coteries.  Her translations from the Latin, especially of Vergil, earned her a reputation as a classical scholar.  Often modeled after Montaigne’s essays, her treatises took sides in the controversies of the day.  She praised the older poetry of the Pléiade and condemned newer, more neoclassical poetry.  She defended the centrality of free will against Augustinians who stressed predestination.  She championed a humanistic model of education, with its emphasis on the mastery of classical languages, against more scientific models.  Her work as a controversialist reached its apogee in 1610, when she defended the unpopular Jesuits, whom many French pamphleteers had blamed for the assassination of King Henri IV by a religious fanatic the same year.

Despite her controversial reputation, Gournay became influential in court circles.  She undertook writing assignments for Queen Margot, Marie de Médicis, and Louis XIII.  In recognition of her literary skill, Cardinal Richelieu granted her a state pension in 1634.  During the same period she assisted in the organization of the nascent Académie française.  A committed Catholic sympathetic to the anti-Protestant parti dévot, she still maintained close connections to more libertine members of the Parisian salons, such as Gabriel Naudé and François La Mothe Le Vayer.  She maintained a correspondence with other European female scholars, notably Anna Maria van Schurman and Bathsua Reginald Makin.

Having experienced opprobrium as a career woman devoted to professional writing, Gournay used her writings to criticize the misogyny of Parisian literary society.  Her treatises Equality Between Men and Women (1622) and Complaints of Ladies (1626) defended the equality between the sexes and argued for equal access of both genders to education and to public offices.  In 1626, she published a collection of her previous writings.  A financial and critical success, the collection was subsequently expanded and reprinted by Gournay in 1634 and 1641.  She died on July 13, 1645.

2. Works

The works of Marie Le Jars de Gournay cover a variety of literary genres.  As a translator, she published French versions of Cicero, Ovid, Tacitus, Sallust, and Vergil.  Her multi-volume translation of the Aeneid was the most celebrated of her translations of the Latin classics.

As a novelist, she wrote The Promenade of Monsieur de Montaigne, Concerning Love in the Work of Plutarch.  Written in 1588, this early work already raises Gournay’s proto-feminist concerns about the difficulties experienced by women who attempt to be the intellectual peers of men.  Her poetry, modeled after the outdated verse of Ronsard, was less successful.

Her successive editions of the works of Montaigne, first published in 1595, enhanced Montaigne’s reputation among the literary and philosophical elite of Europe.  Her repeatedly revised preface to these editions constituted an apology for the philosophical value and erudition of Montaigne’s essays.

A formidable essayist herself, Gournay focused on several issues: the nature of literature; the education of the prince; the nature of virtue and vice; the moral defects of contemporary society.  Especially controversial were her treatises defending the equality between the sexes and the right of women to pursue a humanistic education.  Equality Between Men and Women, Complaints of Ladies, and Apology for the Writing Woman are illustrative of this genre.

In 1626 Gournay published a collection of her extant writings, called The Shadow of the Damoiselle de Gournay.  In subsequent years, she revised and expanded this edition of her works.  Named The Offerings or Presents of Demoiselle de Gournay, the last collection of her works was published in 1641.  This edition of her works runs to more than one thousand closely printed pages.

3. Philosophical Themes

Gournay’s treatises study numerous philosophical issues.  Her works on literary theory defend the value of figurative speech, especially metaphor, to communicate complex metaphysical truths. Her moral theory reflects the ethics of the Renaissance courtier.  Personal honor is the preeminent virtue, calumny the major vice.  Her pioneering work on gender insists on the equality of the sexes and on the malicious prejudice which has barred women from educational and work opportunities.  Especially bold is her social criticism. Numerous essays condemn the political and religious institutions of contemporary France for their moral defects.

a. Language, Literature, Aesthetics

A prolific poet and translator, Gournay devotes numerous treatises to issues of language and literature.  Against the neoclassical purism of certain literary critics of the period, Gournay defends the value of neologism and figurative speech.  In particular she defends the aesthetic and epistemological value of metaphor in poetic discourse.  Not only does metaphor please the senses of the reader; it communicates certain truths about God, nature, and the human soul which cannot be expressed through more concise, abstract rhetoric.  Defense of Poetry provides her most extensive analysis of innovation and simile in the poetic expression of truth.

Against contemporary critics who attempt to purify the French language through increasing stress on the rules of grammar and rhetoric, Gournay argues that poetic speech is inventive by nature.  The use of neologisms, elaborate analogies, and colorful synonyms constitutes the craft of the poet.  “Every artisan practices his or her craft according to the judgment of his or her mind.  We are artisans in our own language.   In other words, we are not only bound to work according to what we have received and learned; we are even more bound to shape, enrich, and build it, in order to add riches to riches, beauties to beauties.”  Vibrant innovation rather than imitation is the duty of the poet in creating his or her discourse.

The richly figured rhetoric defended by Gournay contrasts with the purified, restrained speech promoted by the influential neoclassical literary establishment.  The attempt to purify language of complex metaphor only leads to a vitiated speech produced by grammarians rather than poets.  “We must also laugh at what happens to these overly refined scholars when they spot some metaphorical phrase that bears excellence in its construction, brilliance, or exquisite power.  Not only do they fail to notice its beauty or its value; they denounce it and preach that restraint is more preferable.”  The pedantic speech encouraged by such strictures against metaphor quickly numbs the reader by its aridity and lack of variation.

Not only is speech denuded of metaphor bound to bore the reader; it fails to communicate the complex truths which poetry is destined to express.  The abstract, purified language of the scholar is incapable of expressing the surging emotive and spiritual life of the human person.  “[This purist approach] cannot pierce right down to the bone, as is necessary for the imagination to be properly expressed.  This must be done by a lively and powerful attack…. [The purists’] principal concern is to flee not only the frequent metaphors and proverbs we formerly used, but also to abandon borrowings from foreign languages, new expressive styles of speaking, and most of the lively diction and popular expressions—all those devices which everywhere strengthen a clause by making it more striking, especially in poetry.”  The new pedantic poetry tends toward the decorative; authentic poetry, rich in its figurative devices and colorful rhetoric, is alone capable of expressing the life great poetry embodies.  “May others look for milk and honey, if that’s what they want.  We are looking for what is called spirit and life.  I call it ‘life’ with good reason, since all speech that lacks this celestial ray in its composition—this ray of powerful dexterity, suppleness, agility, capacity to soar—is dead.”

Gournay invokes the philosopher Seneca to support her thesis that authentic poetic expression of life necessarily employs vivid, figured speech.  Great poetry often enjoys the mystical air of religious revelation.  “Seneca, a philosopher, a grave Stoic, teaches us that the soul escapes from itself and soars outside of humanity in order to give birth to something high and ecstatic far above its peers and above humanity itself.”  The rule-based strictures of the neoclassical establishment, which focuses on the surface rather than the substance of speech, threaten to destroy the religious vision which is the font of poetic inspiration.  “True poetry is an Apollonian furor.  Do they [neoclassical critics] want us to be their disciples after having been those of Apollo?  Rather, do they want us to be their schoolboys, since they crank out laws for us in order for us to crank out others?”

For Gournay, her battle against the surging neoclassical aesthetic in France is neither a simple issue of taste nor nostalgia for the embroidered lyricism of the Pléiade; it is a combat for a poetry that can express the richness of the experience of life through the verbal armory of synonym, analogy, and simile.  Only in metaphorical speech can the author express the complexity of the soul’s pilgrimage as well as touch the senses and imagination of the potential reader.

b. Moral Philosophy

In many treatises Gournay presents her moral philosophy.  Centered on questions of virtue and vice, Gournay’s moral theory defends an aristocratic code of conduct tied to the virtue of honor.  With personal reputation as a supreme good, calumny emerges as a principal evil and violence practiced in defense of one’s honor as a moral duty.

Like other neo-Stoic authors of the period, Gournay admits that the nature and authenticity of virtue are elusive.  But unlike many of her contemporaries, she does not simply dismiss virtue as a mask of the vice of pride.  In Vicious Virtue, she argues that the elusiveness of virtue is tied to the hidden motivations behind virtuous acts.  While one may observe external actions, one cannot observe the occult motives inspiring the moral agent to act in an apparently virtuous manner.  “One cannot remove from humanity all the virtuous actions it practices because of coercion, self-interest, chance, or accident.  Even graver are the external virtues which follow on some vicious inclination…To eliminate all such virtuous acts would place the human race closer to the rank of simple animals than I would dare to say.”  Much, if not all, of human moral action is motivated by immoral or amoral factors.  External virtuous conduct is caused more by personal interest or accident than by conscious virtuous intention.  To eliminate all the moral actions inspired by less than virtuous motives is to eliminate practically all deliberative moral action; the only remaining activity is comparable to that manifested by non-rational animals.

Despite the fragility of virtue, Gournay identifies certain virtues and vices as central in the moral conflicts of the age.  Calumny is a particularly dangerous vice because it destroys the personal honor and social reputation which Gournay considers a paramount moral good.  Undoubtedly, Gournay’s personal experience of battling the criticisms launched against her and of the backbiting gossip of the court helped to focus her campaign against calumny.

Of Slander provides a detailed analysis of the malicious gossip which Gournay believed to be one of the principal evils of the age.  Citing Aristotle, Gournay claims that personal honor is the most estimable external possession of the human person.  “Every day we risk many goods for the sake of life and we risk our lives for the sake of a piece of honor.  This is why Aristotle calls it the greatest of external goods, just as he qualifies shame as the greatest of external evils.  Moreover, can we deny that the love of honor is necessary as the powerful author and tutor of virtue?  At the very least, it is nine-tenths of the ten parts that make up virtue.  This is because few people are capable of biting right into this fruit, which seems too bitter without this bait.”  Since honor is so central for the cultivation of virtue, the loss of personal reputation paralyzes the pursuit of the good and constitutes a severe moral loss for the individual so affected.

Not only does calumny destroy the reputation and happiness of the victimized individual; it constitutes a grave evil for the entire ambient society.  Invoking the patristic author Bernard of Clairvaux, Gournay argues that calumny and other forms of malicious gossip constitute a serious sin, which God has promised to punish with special force.  “Both the gossip and his or her voluntary audience carry around the devil, one on the tongue, the other in the ear.  This murderous thrust of the tongue transpierces three persons in one blow: the offended party, the speaker, and the listener.”  By destroying truth and the respect of persons, calumny attacks the very foundations of social life.  It abets other expressions of the contempt of persons, such as mockery and sarcasm.  Of Slander demonstrates how easily the practice of calumny causes physical violence, such as its frequent provocation of recourse to the duel by the party whose honor has been outraged.

Gournay’s adherence to an aristocratic code of honor also appears in her treatment of violence in Is Revenge Legitimate?  The treatise recognizes that Christianity would appear to ban vengeance; the believer is called to love his or her enemies and exercise forbearance in the face of evil.  But Gournay argues that only minor or ineradicable evils are to be treated this way; major injustices, such as assaults on the good name of oneself or of one’s family, demand swift reparation.  God’s greatest gift to human beings, reason itself, indicates that such moral infractions require strict retribution if the order of justice sustaining society is to be preserved.  “We must not doubt that this great God has given us Reason as the touchstone and lighthouse in this life.  He has based his moral laws on reason and reason on his moral laws….The Free Will which God has given us as the instrument of our salvation would be useless or would rather be a dangerous trap if it were not enlightened by this Reason, because this power did not have any light itself.  We must see if Reason could tolerate the entire abolition of vengeance, if justice and utility could do without it.”  For Gournay, the answer is obviously negative.  The defense of the social order of justice requires the willingness of individuals and of the state to uphold the order of justice by swiftly punishing those who have violated the honor of others.  The risks of abuse in the execution of this retribution should not blind individuals and the state to its necessity.  The alternative is anarchy.

c. Social Criticism

In her analysis of virtue and vice, Gournay attacks the corruption of prominent social institutions of the period.  She does not hesitate to criticize the moral failings of three powerful institutions: the court, the clergy, and the aristocracy.  Her critique of the vices typical of each institution serves the broader goal of the moral reform of France along the lines of the principles of the reforming council of the Catholic Church, the Council of Trent.

In Considerations on Some Tales of Court, Gournay criticizes the malicious gossip that dominates the court atmosphere.  Flowing from false personal pride, this tendency to slander other courtiers easily leads to violence.  “Slander is beloved by those who are looking for a fight.  It seems to give them a certain distinction of freedom.  But when I see the measures of security many of those who practice slander today take, I see the mark of servitude rather than of freedom.”  The treatise recounts how such contemptuous slander has recently provoked duels and civil wars in France.  Ultimately, it weakens loyalty to the throne by the ridicule it heaps upon the royal family and courtiers, thus undermining the stability of the social order itself.

Counsels to Certain Churchmen focuses on a particular abuse: laxity in the practice of sacramental confession.  In principle, sacramental confession is the occasion for the Catholic to express sorrow for sins, express the sincere resolution to avoid committing these sins in the future, and, if judged properly contrite, to receive absolution and an appropriate restitutionary penance from the confessor.  In practice, lax confessors, who mechanically grant absolution, have turned the sacrament into a “cosmetics machine.”  No moral reform or authentic repentance occurs in this conspiracy between hardened sinners and indulgent confessors.  To make confession once again an instrument of moral conversion, Gournay insists that the confessor must employ the full armory of spiritual weapons available to him in treating obdurate sinners.  “To move or strengthen a penitent toward this charity [repentant love of God], notably in what concerns abstaining from committing the offense in the future, the confessor should not spare the use of solicitude, remonstrance, threats, infliction of penalties, even on occasion the refusal of absolution.  Divine, civil, and philosophical judgments tell us that if we do not prevent a crime, its evil is imputed to us.  The mouth of Saint Paul says the same thing about our responsibility: in cases of necessity, he orders us to abandon the sinner to Satan through excommunication in order to bring the sinner to repentance.”  The moral rigorism of Gournay is evident in this exhortation to confessors.  Only a strict practice of repentance and restitution on the part of the sinner and of demanding scrutiny on the part of the confessor can make the act of sacramental confession the serious means of moral conversion it was instituted to be.

The vices of the French aristocracy are the object of attack in Of the Nothingness of the Average Courage of this Time and Of the Low Price of the Quality of the Nobility.  The primary vice of this social class is the absence of what should be their defining virtue: courage.  Gournay understands by courage the willingness to defend the weak which originally distinguished the nobility of the sword.  “Generous courage necessarily includes courtesy and benevolence, conjoined with a prudent use of courageous force.  It should not appear to vanquish the strong more than it lifts up the weak.  Among other reasons, this is because the vindication and protection of the weak is the very justification for the use of force and of its consequences.”  Even before the chivalric code, Plato had accurately defined the virtue of fortitude as “a prudent, tolerant expression of courage in order to realize what is just and helpful.”

According to Gournay, the traditional courage of the aristocracy has deteriorated into a cult of power for its own sake.  Rather than exercising its martial prerogatives on behalf of the oppressed, many nobles have become oppressors themselves by the use of violent power to advance their own interests or even whims.  “The first problem is the power and arrogance which flow from the sword hanging at the side of nobles.  Few of them manage to avoid becoming intoxicated by this power.  The second is a certain contagious illusion they pick up by imitation of others.  They start to believe that they are the important people, the eminent ones, and the leaders of a gang in court or in the provinces.  They usurp power, they strike a peasant or a simple bourgeois, they insult the first and worst-armed person they meet simply to have revenge—and any remonstrance concerning their behavior has little effect.  They make a scepter or rather a god out of power.”  This corruption of power into violent self-importance threatens the civil order, since it inaugurates lawlessness and civil wars motivated by little more than personal jealousy.

Gournay does not spare the poorer classes in her social critique.  In Of False Devotions Gournay criticizes those who believe that the performance of external devotions guarantees their salvation; it is only the cultivation of moral virtues in free cooperation with divine grace that can unite the human soul to God.  Gournay places unusual emphasis on two moral virtues involving self-reflection: integrity and probity.  “Among the virtues preeminent in rank are those of integrity and probity, because they give us a special attachment to the Creator and contain all the other considerations we owe the divine majesty.  The other virtues ally us primarily to other human beings.”  Given the self-reflective quality of these prime virtues, Gournay censures unbalanced devotionalism for its irrational, whimsical qualities.  The wish to please God through external gifts displaces the hard work of moral reform that should be the touchstone of the Christian life.

The stress on the cultivation of virtue over the pursuit of external devotion is not a uniquely Christian concern; Gournay cites Aristotle in her argument that the upright moral agent must carefully attempt to eradicate every vice.  “The Philosopher holds that a human being is vicious if he or she possesses just one vice and is not virtuous if he or she does not possess all the virtues.”  Certain Catholic popular devotions run the risk of deceiving their practitioners about the true state of their souls if they divert the devotee from moral scrutiny and repentance.  “These devourers of rosaries who called themselves devout are lying if they are covetous, envious, imposters, mockers, or slanderers, that is to say, the executioners of reputation, or if they assault some other interests of their neighbors.”  As is typical in Gournay’s scale of virtue, the mendacious destruction of another’s reputation emerges as the gravest vice.  When such vices are allied to the ostentatious practice of popular devotions, the vice is doubled by hypocrisy.

d. Philosophy of Education

Closely linked to her moral philosophy is Gournay’s educational theory.  In several treatises on the education of a prince, Gournay argues that the primary purpose of education is the formation of moral character.  The cultivation of virtue in general and of the virtues specific to one’s state in life constitutes the principal aim of instruction.  Humanistic in nature, the ideal education of the prince also entails extensive exercise in modern and classical languages.

Of the Education of the Royal Children of France outlines the primarily moral nature of authentic education.  In Gournay’s perspective, the pupil must be encouraged to cultivate moral virtue by precept and example.  The development of a moral personality is not guaranteed by nature or providence, since human beings possess a spacious free will.  “The salvation of the human race depends on what falls under its choice and free will.  Prevenient grace cannot force this choice although it does encourage the will to make the good choice and strengthens it when it consents.  Because of this we know that if we try to imprint on minds such qualities as faith, virtue, and reason—which we could otherwise call God’s commandments—the minds will conserve the impression of such qualities.”  Art must build on nature in developing the moral character of the pupil, because the adult’s exercise of freedom will be shaped by the moral dispositions encouraged in early age.  The assistance provided by God’s grace in adhering to the good should not be exaggerated, since grace does not overwhelm the moral agent’s exercise of personal freedom.  In her treatment of grace and freedom, Gournay clearly sides with her Jesuit allies against the emphasis of neo-Augustinian Catholics and Protestants on predestination.

The principal emphasis in this moral formation is the cultivation of the virtues.  Successful education should emphasize the development of virtues proper to the pupil’s future state in life as well as the development of the cardinal virtues.  Gournay’s plan for the education of the prince illustrates the mixture of generic and specific moral habits.  “Our muses or sciences should teach Prudence, Temperance, Fortitude and Justice.  Beyond that, they should teach liveliness, concentration, elegance, eloquence, good judgment, and restraint.  Because we are speaking of courtiers, they should also teach chivalry, courtesy, politeness, and a charming personal grace.”  The development of a moral character capable of leadership, diplomacy, and inspiration is the ultimate aim of such a royal pedagogy.

Since the cultivation of moral personality is the central aim of education, successful education depends to a great extent on the moral character of those chosen as teachers.  In the case of royalty, extraordinary care must be exercised in the choice of governors, teachers, and tutors.  Gournay sketches the ideal portrait of the governor chosen to supervise the education of the prince.  “We seek a governor who respects the laws of heaven and earth and who loves his country; a man of the ancient faith; a man who has never damaged the goods, honor, tranquility, or liberty of another; a man who prefers to undergo an injustice than to commit one; a man who is dutiful, well-mannered, charitable, free from pride and vanity; a disinterested man, who sees clearly and who acts in his own affairs as he advises others to do; a man whom one can believe when he is speaking about a friend, an enemy, or himself; who easily accepts obligations; whose words are without artifice, whose counsel is honest, whose resolution is constant; a man who has noble courage, diligent work habits, solid morals, a sense of moderation, even temper; someone whose self-possession protects him from the lure and applause of the world.”  This endless catalogue of ethical qualities for the ideal governor indicates the centrality of moral character for all educational personnel, since the moral personality of the prince develops in large part through emulation of those who instruct him.

In Institution of the Sovereign Prince, Gournay details the more humanistic side of her model of education.  In addition to moral education, the prince requires a literary formation.  Among the disciplines to be studied, Gournay underscores the importance of grammar, logic, philosophy, and theological doctrine.  She stresses the role of languages in this humanistic curriculum.  In addition to French, the pupil should learn Latin; ideally the pupil should master Latin as Montaigne did, by learning to speak it from the cradle.  Tutors should guide the pupil through Latin classics.  Also desirable is the study of Greek and Hebrew.  Not only will this literary formation provide the sovereign with cultural polish; it will permit him to understand more deeply the issues of polity and justice treated in depth by Holy Scripture and the classics.

This classical study will also reinforce the pedagogical effort to strengthen the pupil’s commitment to moral virtue.  A lifelong habit of serious study of the classics will encourage the monarch’s commitment to virtue.  The governed will imitate the virtue, or the lack of it, displayed in both the rule and the recreation of the one who governs.  “It is necessary for a ruler to find his relaxation and delight in the muses; otherwise, he will surrender to a life of debauchery, luxury, gambling, or gossip….If he indulges in debaucheries, luxury, and gambling, one will see soon enough that his subjects will grow morally ill through the contagion.”  The humanistic initiation into the appreciation of classical literature and art complements the moral formation that might prove too austere without the allure of the muses.

e. Gender and Equality

The most celebrated of Gournay’s treatises defend the equality between the sexes.  Equality Between Men and Women argues that the current subordination of women to men is based on prejudice; only the lack of educational opportunities explains the difference in cultural achievement between the sexes.  Complaints of Ladies explores the roots of the misogyny which has reduced women to a state of servitude.

In Equality Between Men and Women, Gournay develops a cumulative argument from classical, biblical, and ecclesiastical sources to demonstrate gender equality.  This catalogue of philosophical and theological authorities, as well as the historical achievements of women themselves, indicates that prejudice alone has caused the irrational denigration of women that has become the creed of contemporary society.

Among philosophers supporting gender equality, Plato holds pride of place.  “Plato, to whom no one denies the title of ‘divine,’ assigns them [women] the same rights, faculties, and functions in the Republic.” The treatise also marshals citations from Aristotle, Cicero, Plutarch, Boccaccio, Tasso, and Erasmus on behalf of gender equality.  The historical achievements of Sappho, Hypatia of Alexandria, and Catherine of Siena, among other women, indicate the intellectual capacity of women.

Scripture and church history provide just as ample a catalogue of citations supporting the equality of the sexes and examples of women who held offices comparable to those held by men.  From its opening book, Genesis, Holy Scripture insists that both men and women are made in the divine image; thus, they are both capable of rational reflection and are both the subject of the same rights and duties.  Several women are named as authors of biblical texts: Anne, Mary, Judith.  Deborah served as a prophet, Judith as a warrior, Thecla as a coworker with Saint Paul.  Of special interest to Gournay is the status of Mary Magdalene, who is the first disciple commissioned to announce the news of Christ’s resurrection and who bears the ancient title of ‘Apostle to the Apostles.’  Sacred tradition often depicts her preaching to the masses in Provence.  The sacerdotal ministries of church governance and preaching thus appear to be open to women as well as men.  Gournay dismisses Saint Paul’s restrictions on the teaching and preaching activities of women in church as a simple precaution against the possible temptation caused by the view of women who are “more gracious and attractive” than men.

Giving women equal access to education will quickly overcome the misogynist burdens under which they currently labor.  Deprivation of education is the sole cause of the current gap between the sexes in the area of cultural achievement.  “If the ladies arrive less frequently to the heights of excellence than do the gentlemen, it is because of this lack of good education.  It is sometimes due to the negative attitude of the teacher and nothing more.  Women should not permit this to weaken their belief that they can achieve anything.”  The path to sexual equality in the future lies in the improvement of educational opportunities for women and in the dismantling of misogynistic stereotypes which discourage women from even attempting cultural achievements.

Complaint of Ladies explores the depth of the misogyny which makes sexual equality such a distant, chimerical goal.  Gournay condemns the current social situation of women as one of tacit slavery.  “Blessed are you, Reader, if you are not of the sex to which one forbids all goods, depriving it of freedom.  One denies this sex just about everything: all the virtues and all the public offices, titles, and responsibilities.  In short, this sex has its own power taken away; with this freedom gone, the possibility of developing virtues through the use of freedom disappears.  This sex is left with the sovereign and unique virtues of ignorance, servitude, and the capacity to play the fool, if this game pleases it.”  Despite the clear philosophical, historical, scriptural, and ecclesiastical evidence for the dignity of woman and for her fundamental equality with man, the political and literary mainstream of French society continues to treat women as chattel.

Especially disturbing is the contempt with which the era’s misogynist literature treats women.  Gournay condemns the sarcastic dismissal of woman which characterizes so many of these texts.  “When I read these writings by men, I suspect that they see more clearly the anatomy of their beards than they see the anatomy of their reasons.  These tracts of contempt written by these doctors in moustaches are in fact quite handy to brush up the luster of their reputation in public opinion, since to gain esteem from the masses—this beast with several heads—nothing is easier than to mock so and so and [to compare them to] a poor crazy woman.”  If in principle men and women are clearly equal, in fact this equality will be difficult to practice in a society poisoned by a popular misogynist art, whose irrational fantasies require no further justification.

4. Reception and Interpretation

The reception of the works of Marie Le Jars de Gournay follows three distinct periods.  During her lifetime, Gournay’s writings attracted a large cultivated public.  Her polemical style and combative positions in the religious, political, and literary quarrels of the period made her a prominent essayist.  By the end of the seventeenth century, she had become virtually unread.  New editions of Montaigne had superseded her own; the antiquated quarrels over Ronsard or grace elicited little interest.  This oblivion lasted well into the twentieth century.  At the end of the century, Gournay’s works underwent a revival in literary and philosophical circles.  The major impetus for this new interest was the feminist effort to expand the canon of the humanities to include the texts of long-ignored female authors.  Gournay’s proto-feminist essays on gender equality constituted the focus of this revival.  In recent decades, they have become the object of numerous editions, translations, and commentaries.

The interpretation of the philosophy of Gournay remains largely tied to her work on the equality between the sexes and her attendant criticism of the social oppression of women.  While her pioneering work in gender theory merits such scholarly attention, it has tended to obscure her other philosophical concerns.  Gournay’s contributions to aesthetics, ethics, pedagogy, social criticism, and theology invite further discovery.  Gournay’s works have also suffered from her close association with Montaigne.  While her philosophy is clearly indebted to the mentor she reverently invokes as “the author of the Essais,” her philosophy differs from the more skeptical theories of Montaigne.  Whereas Montaigne often invokes a constellation of classical authorities to show their contradictions and to argue that many controversies have no certain solution, Gournay frequently invokes a catalogue of classical and biblical authorities to demonstrate the impressive consensus that exists among philosophical and theological authorities on a disputed topic and thus to identify the correct solution.  Gournay’s distinctive method of Catholic-humanism, in which a flood of classical and ecclesiastical authorities are harmonized to prove the truth of a contested philosophical thesis, requires further scholarly analysis.

5. References and Further Reading

The translations from French to English above are by the author of this article.

a. Primary Sources

  • Gournay, Marie Le Jars de. Les avis, ou les Présens de la Demoiselle de Gournay (Paris: s.n., 1641).
    • [This is Gournay’s last and most complete edition of her works.  An electronic version of this work is available on the Gallica section of the website for the Bibliothèque nationale de France.]
  • Gournay, Marie Le Jars de. “Préface” aux Essais de Montaigne (Paris: Tardieu-Denesle, 1828): 3-52.
    • [Gournay’s preface explains her interpretation of the essays of Montaigne and the history of her relationship to Montaigne.  An electronic version of this work is available on the Gallica section of the website for the Bibliothèque nationale de France.]
  • Gournay, Marie Le Jars de. Apology for the Woman Writing and Other Works, ed. and trans. Richard Hillman and Colette Quesnel (Chicago: University of Chicago Press, 2002).
    • [This English translation of Gournay’s writings concerning gender features a substantial biography and bibliography.]
  • Gournay, Marie Le Jars de. Preface to the Essays of Montaigne, ed. and trans. Richard Hillman and Colette Quesnel (Tempe, AZ: Medieval and Renaissance Texts and Studies, 1998).
    • [This English translation of Gournay’s preface to Montaigne features a thorough discussion of the relationship between Montaigne and Gournay; Hillman’s scholarly notes also contextualize the argument of the preface and of the essays.]

b. Secondary Sources

  • Bauschatz, Cathleen M. “Marie de Gournay’s Gendered Images for Language and Poetry,” Journal of Medieval and Renaissance Studies, 25 (1995): 489-500.
    • [Bauschatz links Gournay’s philosophy of language and art to concerns for sexual difference.]
  • Butterworth, Emily. Poisoned Words: Slander and Satire in Early Modern France (London: Legenda, 2006).
    • [In a chapter devoted to Gournay, Butterworth studies Gournay’s preoccupation with the moral problem of slander.]
  • Cholakian, Patricia Francis. “The Economics of Friendship: Gournay’s Apologie pour celle qui escrit,” Journal of Medieval and Renaissance Studies, 25 (1995): 407-17.
    • [Cholakian underscores the differences between Montaigne and Gournay concerning the capacity of women to cultivate friendship.]
  • Deslauriers, Marguerite. “One Soul in Two Bodies: Marie de Gournay and Montaigne,” Angelaki: Journal of the Theoretical Humanities, 13, 2 (2008): 5-15.
    • [Deslauriers analyzes the multiple ways in which Gournay claims to be spiritually united to Montaigne.]
  • Dykeman, Therese Boos. The Neglected Canon: Nine Women Philosophers, First to the Twentieth Century (Dordrecht, Boston: Kluwer Academic, 1999).
    • [Dykeman’s introduction to several translated essays by Gournay provides a solid biography of Gournay and develops a philosophical interpretation of her work.]
  • Franchetti, Anna Lia. L’ombre discourante de Marie de Gournay (Paris: Champion, 2006).
    • [This erudite study of the later work of Gournay argues for the Stoic influences on Gournay’s moral philosophy and philosophy of language.]
  • Lewis, Douglas. “Marie de Gournay and the Engendering of Equality,” Teaching Philosophy 22:1 (1999): 53-76.
    • [Lewis analyzes the rhetorical and argumentative strategies used by Gournay in her defense of gender equality.]
  • McKinley, Mary. “An editorial revival: Gournay’s 1617 Preface to the Essais,” Montaigne Studies 8 (1996): 145-58.
    • [By comparing the 1595 and 1617 prefaces to Montaigne, McKinley demonstrates the changes in Gournay’s intellectual convictions over two decades.]

Author Information

John J. Conley
jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Anne Le Fèvre Dacier (1647—1720)

A distinguished classicist during the reign of the French king Louis XIV, Madame Dacier achieved renown for her translation of Greek and Latin texts into French.  Her translation of Homer’s Iliad (1699) and Odyssey (1708) remains a monument of neoclassical French prose.  In defending Homer during a new chapter of the literary quarrel between the ancients and the moderns, Dacier developed her own philosophical aesthetics.  She insists on the centrality of taste as an indicator of the level of civilization, both moral and artistic, within a particular culture.  Exalting ancient Athens, she defends a primitivist philosophy of history, in which modern society represents an artistic and ethical decline from its Hebrew and Hellenic ancestors.  A proponent of Aristotle, Dacier defends the Aristotelian theory that art imitates nature, but she adds a new emphasis on the social character of the nature that art allegedly imitates.  In her philosophy of language, she explores the nature and value of metaphor in evoking spiritual truths; she also condemns the rationalist critique of language which dismisses the fictional or the analogous as a species of obscurantism.  The Bible’s robust use of metaphor has established a literary as well as a spiritual norm for Christian civilization.  Against modern censors of classical literature on the grounds of obscenity, Dacier defends the pedagogical value of the classics, especially the epics of Homer, in forming the moral character and even the piety of those who avidly study them.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Aesthetics
    1. Theory of Taste
    2. Mimesis and Nature
    3. Theory of Language
    4. Moral Pedagogy of Literature
  4. Reception and Interpretation
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born in Preuilly-sur-Claise on August 5, 1647, Anne Le Fèvre was raised in the city of Saumur in the Loire region of central France.  Her father, Tanneguy Le Fèvre, was a professor of classical languages at a local academy.  Under her father’s tutelage, Anne Le Fèvre quickly learned Latin and Greek and demonstrated a precocious skill for the translation of the classics into French.  Her adolescent marriage in 1664 to the publisher of her father’s works, Jean Lesnier, rapidly deteriorated; the embittered spouses agreed to a permanent separation.

After the death of her father in 1674, Madame Le Fèvre Lesnier enjoyed the patronage of Pierre-Daniel Huet, a royal tutor to the French dauphin and the future bishop of Avranches.  A member of the Académie française, the scholarly cleric introduced her to the controversies surrounding Descartes in contemporary French philosophy.  Originally a supporter of Cartesianism, Huet would turn decisively against it in his Critique of Cartesian Philosophy (1689).  Huet encouraged Le Fèvre Lesnier’s move to Paris and commitment to a scholarly life devoted to the translation of the classics.

Published in 1674, her first translation, an edition of Callimachus, received the acclaim of fellow classicists.  The Duke of Montausier, the overseer of the dauphin’s education, then invited Le Fèvre Lesnier to contribute translations to the series ad usum Delphini (“for the use of the dauphin”) which he had initiated.  Her editions of Publius Annius Florus (1674), Sextus Aurelius Victor (1681), Eutropius (1683), and Dictys of Crete (1684) spread the fame of the series far beyond the court circles for which it had been originally designed.  Le Fèvre Lesnier also published independent translations of Anacreon (1681), Sappho (1681), Terence (1683), Plautus (1683), and Aristophanes (1684). The emergence of a provincial woman as France’s preeminent classicist made Madame Le Fèvre Lesnier a celebrity in the literary salons of Paris.  Padua’s Academy of Ricovrati elected her to membership in 1679.

Shortly after the death of her separated husband, Anne Le Fèvre Lesnier married fellow classicist André Dacier, a former student of her father, in 1683.  A year later, the couple retired to Castres in order to devote themselves to theological study.  Originally Protestants, both Monsieur and Madame Dacier decided to embrace Catholicism.  When they formally entered the Catholic Church in 1685, Louis XIV granted the couple a royal pension in recognition of their conversion.

In the following years, Madame Dacier published new translations of Plautus, Aristophanes, and Terence and collaborated with her husband on several translations, notably new French versions of Plutarch and Marcus Aurelius.  These translations of ancient Stoic authors reflected Madame Dacier’s sympathy for neo-Stoicism and her opposition to neo-Epicureanism in the philosophical debates of the period.  The literary skill and classical erudition of Madame Dacier earned her the praise of France’s most influential literary critic, Nicolas Boileau.

In 1699, Madame Dacier published her major work, a French translation of Homer’s Iliad.  Her version of Homer’s Odyssey followed in 1708. Widely acclaimed as both faithful translations and graceful examples of French prose, the books re-ignited the long-simmering querelle des anciens et des modernes.  Siding with the “ancients,” Dacier defended the superiority of classical literature, notably the epics of Homer, over the literary products of modern France.  Supporting the “moderns,” Antoine Houdar de la Motte published his own version of Homer in 1714, in which he radically altered the text to suit modern sensibilities and in which he criticized the stylistic and moral flaws of Homer compared with the poetry of modern France.  In the same year, Madame Dacier published her major treatise on the question: Of the Causes of the Corruption of Taste.  The work lambasted La Motte’s translation of Homer and provided a point-by-point refutation of his critique of antiquity.  The lengthy treatise also permitted Dacier to declare her philosophical allegiance to Aristotle on artistic questions and to present her own philosophy of art and language.

Even the clergy divided in this new chapter of the querelle.  Supporting La Motte, Abbé Terrasson claimed that with its superior knowledge of the world, due to the philosophy of Descartes and technological progress, modern French culture had produced a superior literature. Defending Dacier, Bishop Fénelon argued that classical literature remained superior to the uneven literary achievements of modern France.

When the Jesuit Jean Hardouin proposed a new system for interpreting Homer, Madame Dacier refuted it in her second major theoretical work: Homer Defended against the Apology of Father Hardouin, or the Sequel to the Causes of the Corruption of Taste (1716).  This treatise reconfirmed her commitment to a neo-Aristotelian theory of art and literary exegesis.  It also expanded the grounds for defending the moral and artistic superiority of ancient civilization.

Madame Dacier died at the Louvre on August 17, 1720.

2. Works

The works of Madame Dacier divide into two categories: translations from classical languages and polemical treatises.

The major translations cover the genres of history, drama, lyrical poetry, and epic.  The histories include translations of the history of Rome by Publius Annius Florus (70-140); the history of Rome by Sextus Aurelius Victor (320-390); the history of Rome by Eutropius (the fourth century historian of Julian the Apostate); and the chronicle of the Trojan wars by Dictys of Crete (author of a highly fictionalized alleged eyewitness account of the war, known through a fourth-century Latin translation by Q. Septimius).  This emphasis on military and political history reflects the type of education deemed appropriate for France’s dauphin.

The dramas include translations of the comedies of Aristophanes (446-386 B.C.E.), Plautus (254-184 B.C.E.), and Terence (195-159 B.C.E.).  The lyrical poetry includes the epigrams of the Alexandrian poet Callimachus (310-240 B.C.E.), the verse of Anacreon (570-488 B.C.E.), and the love poems of Sappho (620-570 B.C.E.).  The alleged licentiousness of many of these authors appealed to the more libertine salons and sharpened the controversy over the appropriateness of a woman engaged in publication.

It was in the genre of the epic that Madame Dacier achieved her greatest fame.  Her translation of the Iliad in 1699 established her as a master of neoclassical French style as well as confirming her reputation as a preeminent classicist.  Her preface to the Iliad, staunchly defending Homer and Hellenistic civilization, helped re-launch the querelle des anciens et des modernes.  Her translation of the Odyssey in 1708 achieved similar acclaim from scholars and the general public.  Both works underwent numerous re-editions and were frequently used in Francophone secondary schools for courses in literature.

Madame Dacier wrote her two polemical treatises toward the end of her life.  On the surface, Of the Causes of the Corruption of Taste (1714) and Homer Defended against the Apology of Father Hardouin, or Sequel to the Causes of the Corruption of Taste (1716) are both occasional works.  Both books, however, transcend the immediate disputes of the quarrel over Homer; they permit Dacier to present a neo-Aristotelian theory of art, language, mimesis, and moral education.

3. Philosophical Aesthetics

The philosophical aesthetics developed by Madame Dacier appears primarily in her treatise Of the Causes of the Corruption of Taste [CCT].  Like other eighteenth-century philosophers, Dacier places the question of taste at the center of aesthetic investigation.  She considers a society’s degree of artistic taste to be linked to its degree of moral probity and political order.  In her normative judgments, Dacier praises the achievement of ancient Greece and judges modern France as decadent in comparison.  Declaring herself a partisan of Aristotle, Dacier defends the mimetic thesis that art imitates nature, but she redefines “nature” to include the psychology of the characters depicted and the predominant traits of the society mirrored in art.  Her philosophy of language defends the value of metaphorical speech against the rationalist charge of opacity.  For Dacier, classical literature possesses ethical as well as formal value inasmuch as it can encourage the formation of moral and even religious virtues in the character of the modern Christian reader.

a. Theory of Taste

For Dacier, taste is a central symptom of the general moral and political quality of society.  The capacity of a particular culture to produce and appreciate sublime works of art, especially literary works, indicates the culture’s degree of moral and civic maturity.  The decline of literary taste presages a decline in virtue among the youth who are exposed to mediocre art.  “If we tolerate false [artistic] principles spoiling the mind and judgment [of young people], there are no more resources left for them.  Bad taste and ignorance will finish off this work of leveling.  As a result, literature will be entirely lost.  And it is literature which is the source of good taste, of politeness, and of all good government” [CCT].  Dacier invokes Plato’s authority in defense of her thesis that civic virtue and vice are tied to the quality of the art and literature habitually diffused among the members of the polis.  “This is why Socrates wanted his fellow citizens to commit themselves entirely to the youth and to take great care to prepare and form good subjects for the republic” [CCT].  Through a process of empathetic imitation by its audience, great art, as exemplified by Homer’s epics, encourages the ascent of the moral, social, and political virtues central to civilization.

Fragile and fleeting, artistic taste can easily decline.  Dacier designates three principal causes of the corruption of taste: poor education; ignorance of teachers; the laziness and negligence of the pupils themselves.  Likewise, when a society abandons the humanist ideal of an educated public who reads and cherishes the classics in the original languages, the civic virtues nurtured by exposure to the classics will inevitably fade.

Dacier identifies two particular causes of the decline of literature and morality in contemporary France.  The first is the omnipresence of licentious literature.  “One factor contributing to the corruption of taste consists of these licentious shows which directly attack religion and morals.  Their soft and effeminate poetry and music communicates all of their poison to the soul and disables all the nerves of the mind” [CCT].  Not without irony, the translator of Plautus and Aristophanes condemns licentious theater for its weakening of intellectual and moral clarity among its habitual spectators.

The second cause, the vogue of sentimental novels, operates a similar destruction of heroic virtue in its poorly constructed tales of romantic love.  “The other cause consists of these frivolous and sentimental works…these false epic poems, these absurd novels produced by ignorance and love.  They transform the greatest heroes of antiquity into bourgeois damsels.  They so accustom young people to these false characters that they can only tolerate true heroes when they resemble these bizarre and extravagant personages” [CCT].  A public sated with sentimental tales of seduction will have little capacity to understand, let alone practice, the heroic civic virtues represented by the characters of Homeric or Virgilian epic.

Dacier’s analysis of the decline of taste and the related decline of civic culture is inscribed in her primitivist philosophy of history.  The most perfect examples of literary style lie in the inspired books of the Bible.  “When I read the books of Moses and other sacred authors who lived before the time of Homer, I am not astonished by the great taste which reigns in their writings, since they had the true God as their teacher.  One senses that no human production could possibly reach the divine character of these writings” [CCT].

Although pagan, ancient Egyptian culture receives a similar panegyric.  “I see that geometry, architecture, painting, sculpture, astronomy, and divination flourished among the Egyptians only a few centuries after the great flood.  I see a people convinced of the immortality of the soul and of the necessity of a religion, a people who had a very mysterious and enigmatic theology and who built temples and who gave to Greece her very cult and gods.  When I see the ancient monuments which remain from this people, I cannot doubt that good taste must have also reigned in their writings, although this baffles me and I do not know where all of this could have come from” [CCT].

The culture of ancient Greece, in particular the epics of Homer, also miraculously resisted the tendency of civilization to decline artistically and morally since biblical times.  “I see in Greece all at once a coup of genius.  I see a poet who, two hundred years after the Trojan War and against the degradation imprinted by nature into all the productions of the human mind, combines the glory of invention with that of perfection.  He gives us a sort of poem without any previous model, which he had imitated from no one, and which no one has been able to imitate since then.  This poem’s story, union and composition of its parts, harmony and nobility of diction, artful combination of truth and falsehood, magnificence of ideas, and sublimity of views has always made it considered as the most perfect work issuing from a human hand” [CCT].

The current disdain for Homer and other classical authors reflects the literary-cultural decadence affecting contemporary France.  The loss of the Renaissance humanist’s veneration of the classics indicates a moral and political, as well as artistic, decline for French society.  “Everywhere today there reigns a certain spirit [disdainful of the classics] more than capable of damaging literature and poetry.  This fact has already caused foreigners to reproach us that we are degenerating away from that good taste we had happily developed in the previous century” [CCT].  For Dacier, the only solution to this cultural decline is the neoclassical one: a renewed study of classical languages and literature, with a new literary effort to imitate classical authors in vernacular works and concomitantly an effort to renew political society through the imitation of the civic virtues exalted by Homer and similar Greco-Roman authors.

b. Mimesis and Nature

Throughout her polemical writings, Madame Dacier cites Aristotle’s Poetics as her primary authority for her thesis that art constitutes the imitation of nature.  An oft-cited secondary source for this mimetic theory is Horace’s Art of Poetry.  Against rationalists such as La Motte, Dacier insists that great art’s imitation of nature does not consist in the reproduction of what literally exists in the external physical world; rather, it mirrors the acts of the hidden soul and rightly incorporates mythology, hyperbole, and idealization into its portrait of the moral universe.

In imitating nature, literature must focus on what is true.  Even in writing fiction, the writer must so manipulate the characters and action that they acquire the qualities of verisimilitude.  “I am convinced that a writer writes the true more effectively than he or she does the false.  The mind struck by a real object feels it much more forcefully than if it were struck by an object it only creates by itself or that it does not believe to exist” [CCT].  Like all other artists, the poet must draw his or her truth from nature, even if the usual domain of the poet is the spiritual nature of the human soul in conflict rather than the physical landscape.

The poet’s imitation of nature is never the literal reproduction of preexistent physical or moral nature.  Embellishment of nature is often obligatory if the poet is to place into proper relief the character of his or her personages.  “The exceptional brilliance which the poet [Homer] has given to the valor of this hero [Achilles] has confused them [the critics of Homer].  They didn’t see that this exaggerated valor is there to bring out the nature of his character and not to hide his faults. Poets are like painters. They must make their hero more beautiful, as long as they always conserve the resemblance to the hero and they only add what is compatible with the basic character with which they have clothed their hero” [CCT].  In fashioning the hero who dominates the epic and tragic drama, the author inevitably eliminates and exaggerates certain details of human moral action in order to create a striking moral ideal.

To understand the legitimate freedom of the artist in the imitation of nature, it is crucial to grasp the distinction between history and poetry.  Whereas the historian depicts what actually happens, the poet can present the probable or the possible.  Whereas the historian focuses on the unique fact, the poet dwells on general human truths.  “History writes about only what has happened; poetry writes about what might have or must have happened, either necessarily or probably.  History reports on particular things, poetry on general things.  That is why poetry has greater moral value than does history.  General things interest all human beings while particular things are related only to one human being” [CCT].  In this neo-Aristotelian concept of poetic truth, the freedom of the artist is not unlimited.  Poetic license to embellish character or plot cannot trespass the limits of the probable.

The legitimate freedom of the poet can also be grasped by contrasting poetry with politics and other practical arts.  The truth expressed in the imaginative world of poetry differs from the truth sought in political judgment.  “Aristotle was right to say that ‘one must not judge the excellence of poetry as one judges the excellence of politics, nor even the excellence of all the other arts.’  Politics and all the other arts seek the true or the possible.  Poetry seeks the astonishing and the marvelous as long as it does not clearly shock the sense of what is probable” [CCT].

Even the other fine arts do not enjoy the freedom proper to the poet in his or her evocation of nature inasmuch as they focus their imitation of nature on specific, external objects.  “In fact, all the other imitations, those of painting, sculpture, architecture, and all the other arts, aim at the imitation of only one thing” [CCT].  Literature alone imitates the universe of the human moral agent; the legitimate license of the poet flows from the challenge of this elusive, spiritual object of mimesis.

In her refutation of La Motte and other critics of Homer, Dacier defends Homer’s use of mythology and other fictional devices in his presentation of the character of Achilles.  In particular, she defends the episode of Achilles with the Phoenix, which La Motte had dismissed as a literary absurdity.  Dacier contends that the dialogue with the Phoenix helped to enhance the moral character of the flawed hero.  “No one is more convinced than I am that everything which exists in nature is not good to be depicted just because it exists.  But I believe that what the Phoenix says [in this passage disputed by La Motte] is not in the nature of the things one should not depict.  In all times and in all nations…images depend on customs and on ways of thinking.  What Homer is doing here…is still quite natural and quite appropriate to show Achilles’s tenderness.  This flows quite logically from the tenderness which the Phoenix just showed him.  It even serves to heighten the grandeur of Achilles.  What kind of child could this be who would have his tears washed away by someone like the Phoenix, son of a king?” [CCT]  The imitation of nature depends on psychological and social context.  The depiction of a mythological character like the Phoenix helps to reveal the positive moral traits of Achilles, in particular his tenderness and his king-like dignity.  Thus, the fact that such a fictional character does not exist in physical nature does not eliminate its usefulness for illuminating the moral nature of the epic hero.  The presence of the Phoenix is also justified, in a similar manner, by the cultural context of the poem’s genesis and setting.  The dialogue between the Phoenix and the warrior is perfectly logical within the religious presuppositions of the ancient Greek world.  It is this world, not the more skeptical world of eighteenth-century France, which art must imitate in Homer’s poetry.

c. Theory of Language

Like her mimetic theory of art, Dacier’s theory of language contests the rationalist thesis that ideal speech provides a clear one-to-one correspondence between a particular object and its linguistic signifier.  Dacier insists on the necessity and value of metaphorical speech, even outside the domain of poetry.

Metaphorical speech often communicates truths which cannot be expressed by more literal speech.  The frequent use of analogy is necessary for the effective communication of moral truths which elude reduction to straightforward description.  “To depict well the objects of which one speaks, there is no method more certain than to provide images by comparison.  Does poetry alone use it?  Doesn’t eloquent oration use it just as much?  Doesn’t God use it?  Aren’t the divine Scriptures full of it?  Didn’t Our Lord use it again and again in his discourses?” [CCT]  The repeated use of metaphor in the Bible itself confirms the propriety of the recourse to metaphor in various types of religious and secular speech.

Dacier mocks the rationalists like La Motte who condemn metaphorical speech as a species of obscurantism.  “Should we say, like these literalist minds, that these [biblical] comparisons illuminate nothing and that it would have been better for the Holy Spirit to have made a plain depiction of these objects than to have had recourse to these misleading similarities?…Should we be so sure that these comparisons are imperfect and that they only serve to confuse matters rather than to clarify them?…Doesn’t one sense the awful impiety of such a position?  It is not without reason that Scripture calls impiety ignorance” [CCT].  The effort to eliminate metaphorical speech in favor of more literal language reflects the incapacity to grasp the moral realities and religious mysteries only communicated through elaborate simile.  The conceits of Scripture provide the inspired models for this metaphorical use of language to evoke the spiritual.

Rather than being inferior to clear propositional language in revealing the truth, poetry is actually more powerful than philosophy.  Homer reveals the capacity of poetry to unveil the true through the use of analogy.  “No poet has been more successful than him [Homer] in depicting objects by similarity.  Could the most philosophical discourse give a stronger and livelier picture of these objects than the images he draws in the mind through these comparisons?” [CCT]  Rather than representing confusion, metaphorical speech in the hands of a master like Homer evokes complex spiritual truths which more prosaic speech cannot express.

Dacier also defends metaphorical speech because it has the power to touch the emotions and will of the reader as well as his or her intellect.  Rather than conveying simple information, metaphorical rhetoric possesses a persuasive power absent in more literal forms of communication.  The value of metaphorical speech in the moral and religious realms lies in its capacity to shape the action of the moral agent and to convert the sinner.

d. Moral Pedagogy of Literature

For Dacier, the study of classical literature is essential to shape the moral character of the members of society, especially its governing elite.  Against the criticism that both Greek culture and literature are marked by immorality, Dacier defends the moral probity of classical Greek authors and their capacity to foster virtue in their readers.  Against the theological argument that the Greek pantheon and the authors it inspired feature immoral deities, Dacier claims that the theology of classical Greek poetry is closer to that of Christian monotheism than its modern critics would admit.

Homer epitomizes the moral value of the Greek classics.  “No philosopher has given greater precepts of morality than has Homer….Everyone [except modern critics] has recognized that the Iliad and the Odyssey are two quite perfect tableaux of human life.  With admirable variety, they represent everything that is worthy of praise or blame, that is useful or pernicious, in a word all the evils which madness can produce and all the goods which wisdom can cause” [CCT].  As evidence of Homeric passages promoting virtue, Dacier cites the prudence and wisdom apparent in King Nestor’s discourses in the Iliad and the Odyssey.

According to Dacier, La Motte and other modern critics of Homer have seriously misunderstood the moral structure of Homer’s epics and the classical Greek society they mirror.  In particular, they have misconstrued the moral nature of the epic hero Achilles.  Rather than being a model for moral imitation by the reader, Achilles is in fact a warning against the destructiveness of the vices of vanity, temerity, and arrogance with which Homer has clothed his character.  Dacier cites her philosophical guide Aristotle in this interpretation of Achilles as a salutary warning against vice.  “Did Aristotle ignore the continual emotional eruptions of Achilles?  Where did Aristotle consider them a virtue?  Undoubtedly, he made us see that the character of Achilles must represent not what a man does in anger, but rather everything anger itself can do.  Consequently, he considered this hero the brutal opposite of the man who does good” [CCT].  The modern dismissal of classical Greek literature as morally damaging is based upon such basic misinterpretations of the moral character of the epic and tragic hero.

Dacier defends the theological as well as the moral probity of Homer’s epics.  Against the common Christian charge that classical literature features a pantheon of violent, vicious deities, she insists that Homer actually provides a portrait of God and of the human soul which accords with the biblical prophets and apostles on numerous points.  “Homer recognizes one superior God, on which all the other gods are dependent.  Everywhere he supports human freedom and the concept of a double destiny so necessary to harmonize this freedom with predestination; the immortality of the soul; and punishments and rewards after death.  He recognized the great truth that human beings have nothing good which they have not received from God; that it is from God that comes all the success in what they undertake; that they must request this happy outcome by their prayers; and that the misfortune which occurs to them is called down by their folly and by the improper use they make of their freedom” [CCT].  Given their sound theology and moral psychology, the epics of Homer and similar classical Greek works of literature can nurture the theological as well as the moral virtues essential for a Christian political order.

4. Reception and Interpretation

Since the late seventeenth century, Madame Dacier has been recognized as a preeminent classicist and translator.  The essayist Madame de Lambert praised her contemporary for having contradicted anti-intellectual stereotypes of women.  “I esteem Madame Dacier infinitely.  Our sex owes her a great deal.  She has protested against the common error which condemns us to ignorance.  As much from contempt as from an alleged superiority, men have denied us all learning.  Madame Dacier is an example proving that we are capable of learning.  She has associated erudition with good manners.” [New Reflections on Women, 1727]

Dacier received philosophical as well as literary recognition.  Gilles Ménage dedicated his History of Women Philosophers (1690) to Dacier under the accolade of “the most erudite woman in the present or the past.”  In his Philosophical Dictionary (1764), Voltaire argued that “Madame Dacier was no doubt a woman superior to her sex and she has done a great service to letters.”  Countless encyclopedias of women authors cite Dacier for her erudition and scholarly productivity but her philosophical reflection has received comparatively little attention.

Recent scholarship has continued this literary rather than philosophical focus on Dacier.  Garnier (2002), Hayes (2002), and Moore (2000) examine questions of translation during the querelle des anciens et des modernes.  Bury (1999) studies Dacier in the context of the role of women intellectuals in the period.

The challenge for a philosophical interpretation of Dacier is to analyze the theories of art, mimesis, language, and education developed in her more theoretical works.  There is also the historical challenge to explore the neo-Aristotelianism she defended in the aesthetic rather than the customary metaphysical realm.  Her role in diffusing Stoic philosophy through the translations of Plutarch and Marcus Aurelius she co-authored with her husband also merits further study.

5. References and Further Reading

All translations from French to English are by the author of this article.

a. Primary Sources

  • Dacier, Anne Le Fèvre, Des causes de la corruption du goût. Paris: Rigaud, 1714.
    • A digital version of this work is available online at Gallica: Bibliothèque numérique on the webpage of the Bibliothèque nationale de France.
  • Dacier, Anne Le Fèvre, Homère défendu contre l’apologie du Père Hardouin, ou la suite aux causes de la corruption du goût. Paris: Coignard, 1716.
    • A digital version of this work is available online at Gallica: Bibliothèque numérique on the webpage of the Bibliothèque nationale de France.

b. Secondary Sources

  • Bury, Emmanuel, “Madame Dacier,” in Femmes savantes, savoirs de femmes: Du crépuscule de la Renaissance à l’aube des Lumières, ed. Colette Nativel. Geneva: Droz, 1999: 209-20.
    • The author analyzes Dacier in terms of leading women intellectuals of early modernity.
  • Garnier, Bruno, “Anne Dacier, un esprit moderne au pays des anciens,” in Portraits de traductrices, ed. Jean Delisle. Ottawa and Artois: Presse universitaire d’Ottawa and Artois Presse universitaire, 2002: 13-54.
    • The author focuses on Dacier’s methods of translation.
  • Hayes, Julie Candler, “Of Meaning and Modernity: Anne Dacier and the Homer Debate,” in Strategic Rewriting, ed. David Lee Rubin. Charlottesville, VA: Rookwood, 2002: 173-95.
    • The author studies Dacier’s principles of translation and role in the querelle des anciens et des modernes.
  • Moore, Fabienne, “Homer Revisited: Anne Le Fèvre Dacier’s Preface to Her Prose Translation of the Iliad in Early Eighteenth-Century France,” Studies in the Literary Imagination, Fall 2000; 33(2): 87-107.
    • The author analyzes the moral theories as well as the translation methods of Dacier.

Author Information

John J. Conley
E-mail: jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Agnès Arnauld (1593—1671)

An abbess of the Jansenist convent of Port-Royal, Mère Agnès Arnauld developed an Augustinian philosophy shaped by the mystical currents of the French Counter-Reformation.  Her philosophy of God depicts a deity who is radically other than his creatures.  Only a negative theology, a theology of what God is not, can explore the divine attributes.  In her ethical theory, Mère Agnès contextualizes moral virtue by analyzing those religious virtues proper to a nun in a contemplative order.  Influenced by the mystical école française, the abbess stresses self-annihilation as the summit of the nun’s life of virtue.  In her legal writings tied to the reformation of the convent, the abbess defends the spiritual freedom of women.  Women are to enjoy vocational freedom, the freedom to pursue education, and the freedom to hold opinions on disputed theological questions.  Similarly, women are to enjoy substantial freedom in the exercise of their authority as superiors of convents.  During the persecution of the convent, Mère Agnès developed a moral code of resistance to abuses of power.  She details the conditions under which cooperation with illegitimate commands of civil or ecclesiastical authority could be tolerated or rejected.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Themes
    1. Negative Theology
    2. Monastic Virtue Theory
    3. Law and Freedom
    4. Ethics of Resistance
  4. Interpretation and Relevance
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born on December 31, 1593, Jeanne Arnauld was the third daughter of Antoine Arnauld the Elder and Marie Catherine Marion Arnauld.  From birth, the prominent family of jurists had designated the infant for an abbacy in a convent.  Through negotiations with King Henri IV and a fraudulent transaction with the Vatican, in which documents attesting the candidate’s age were falsified, her maternal grandfather Simon Marion had Jeanne appointed the abbess of the Benedictine convent of Saint-Cyr in 1599.  Assuming her religious name of Mère Catherine-Agnès de Saint-Paul (commonly known as Mère Agnès), the infant abbess took with relish to the liturgical offices and other practices of the monastery.

As her elder sister’s reform of Port-Royal became increasingly stormy, Mère Agnès devoted her time to supporting Mère Angélique in the work of Port-Royal’s transformation.  Sealing her commitment to monastic reform, Mère Agnès renounced the abbacy of Saint-Cyr in 1610, was clothed in the Cistercian habit of Port-Royal in 1611, and pronounced her vows as a member of the community in 1612.  Mère Agnès assisted her sister in governing the burgeoning convent through a series of major offices: mistress of novices, subprioress, and vicar abbess.

During the decade of the 1610s, Mère Agnès emerged as one of the convent’s leading spiritual directors.  Her extensive correspondence reveals the eclectic influences on her thought: the Jesuit Jean Suffren; the Capuchin Archange de Pembroke; and the Feuillant Eustache de Saint-Paul Assaline.  François de Sales influenced the characteristic moderation expressed in Mère Agnès’s judgments.  On issues of gender and mystical states, the central reference was Teresa of Avila.  Her Path of Perfection, Interior Castle, and Autobiography are repeatedly cited.

In the 1620s, the convent became more turbulent.  With the transfer of the convent from the rural valley of the Chevreuse to the Parisian Saint-Jacques neighborhood in 1625, the convent came under the influence of new Oratorian chaplains, notably Pierre de Bérulle and Charles de Condren.  A disciple of the Platonist Pseudo-Dionysius, Bérulle encouraged an apophatic mysticism, which stressed the incapacity of the human mind to know God through image or concept.  Condren emphasized complete abandonment to God’s will, climaxed by self-annihilation.  During the Oratorian ascendancy (1625-1636), Sebastien Zamet, an episcopal overseer of the convent and an ally of the Oratorians, pushed the convent in a less austere direction.  Liturgical offices became more complicated, church decorations became more sumptuous, and nuns were encouraged to share their latest mystical insights with devout laity in the front parlors.  The original reformers quietly fumed at what they considered a regression toward conventual decadence.

During the Oratorian ascendancy, Mère Agnès composed a small treatise, Private Chaplet of the Blessed Sacrament, under the direction of Condren.  Honoring Christ in the Eucharist, each of the sixteen stanzas was devoted to one of the sixteen centuries elapsed since the Last Supper.  At Condren’s suggestion, the pious litany was expanded; the nun explained the meaning of the various apophatic titles ascribed to God. [An apophatic theology considers God to be ineffable, and it attempts to describe God in terms of what God is not.]  As the Chaplet quietly circulated among the nuns and lay benefactors of Port-Royal, a crisis erupted in 1633.  Octave Bellegarde, another episcopal supervisor of Port-Royal who disapproved of Zamet’s reform and of the Oratorians’ speculative mysticism, denounced the pamphlet as heretical and temerarious.  In June 1633, a committee of the theology faculty of the Sorbonne condemned Mère Agnès’s text as destructive of morals because of its stress on passive abandonment to God.  The Jesuit Étienne Binet seconded the condemnation, while the Jansenist Jean de Hauranne, abbé de Saint-Cyran vigorously defended the orthodoxy of the treatise.  The Vatican’s halfhearted intervention into the pamphlet war pleased neither party.  Mère Agnès’s text was withdrawn from circulation, but its theology was not condemned, nor was it placed on the Index of Forbidden Books.

During her first abbacy over Port-Royal (1636-42), Mère Agnès restored the convent to the austerity of the Angelican reform.  Her relationship with the convent’s new chaplain, however, proved less than amicable.  A close friend and disciple of Cornelius Jansen, bishop of Ypres and Louvain theologian, Saint-Cyran imported the radical Augustinian theology of Jansen into the convent.  This theology’s emphasis on practical morality and the value of occasional deprivation of the sacraments clashed with the abbess’s more exuberant mystical piety.  By the time of Saint-Cyran’s imprisonment by Richelieu (1638-43), however, Mère Agnès had become a partisan of the Jansenist movement and intensified the convent’s cult of Saint-Cyran through the circulation of his letters and conferences.

At the end of her abbacy, Mère Agnès was appointed the convent’s mistress of novices.  The nun flourished in this role of spiritual counselor both through conversation with the novices and through letters to a wide circle of correspondents.  Her letters to Jacqueline Pascal on the proper moment for entering Port-Royal express the prudence and moderation for which the nun was renowned: “Our Lord wants to purify you by this delay because you have not always desired it.  It is necessary to have a hunger and thirst for justice to expiate the disgust one once had for this vocation in earlier times.  Saint Augustine wonderfully describes this delay of God’s grace in the souls of those who desire the abundance of God’s grace, which God has postponed [L; to Jacqueline Pascal; February 25, 1650].”  Through her correspondence, the nun also participated in the philosophical and theological controversies of the day swirling around the publication of Antoine Arnauld’s Frequent Communion and Jean Brisacier’s Jansenism Confounded.

When Mère Agnès assumed her second abbacy of Port-Royal (1658-1661), the controversy over Jansenism had erupted into a crisis.  The French throne demanded that every priest, nun, and teacher in the realm sign a statement assenting to the Vatican’s condemnation of five heretical propositions found in Jansen’s massive Augustinus (published posthumously in 1640).  Using an ingenious theological distinction, Antoine Arnauld the Younger argued that members of the church were only bound to assent to church judgments of droit (concerning faith and morals); they were not bound to accept church judgments of fait (empirical fact).  The first type of judgments was essential to the church’s mission of salvation, whereas the second was not.  In the “crisis of the signature,” the Jansenists were willing to assent to the condemnation of the five theses concerning grace and freedom, but they could not assent to the erroneous judgment that Jansen had supported such heresies.  Mère Agnès indicated her refusal to give an unreserved signature to the controversial document.  “The church is attacked in truth and charity, the two columns that support it.  This is what they are trying to destroy by this unfortunate signature, which we would offer against the truth and thereby destroy the charity we should have for the dead as well as the living.  We would be subscribing to the condemnation of a holy bishop who never taught the heresies they impute to him [L; to Madame de Foix; December 10, 1662].”

As the 1660s progressed, Mère Agnès witnessed the progressive intensification of the persecution of the convent.  The convent school and novitiate were closed; the chaplains were expelled.  Foreign nuns hostile to Jansenism were imported to govern Port-Royal, now surrounded by an armed guard.  The most recalcitrant nuns, including Mère Agnès, were exiled to foreign convents.  Mère Agnès herself was placed with the Visitation nuns; she refused to sign the controversial statement although some of her own nieces in the convent eventually yielded.  By the end of the 1660s, a truce was arranged to resolve the growing scandal of an entire convent under interdict.  In 1669, the “Peace of the Church” permitted the reopening of the convent, the resumption of liturgical life, and the reopening of the convent school and novitiate.  Several uncharacteristically placid years descended on Port-Royal.

Mère Agnès Arnauld died on February 19, 1671.

2. Works

A prolific author, Mère Agnès was one of the few Port-Royal nuns to see her works published during her lifetime.  Circulating first as a devotional pamphlet, Private Chaplet of the Blessed Sacrament (1626) caused an international dispute over its controversial apophatic approach to the divine attributes.  Louvain, the Jansenists, and the Oratorians defended the work, while the Jesuits and the Sorbonne opposed it.  Working in collaboration with Mère Angélique Arnauld and Antoine Arnauld, Mère Agnès was the principal author of the Constitutions of Port-Royal (1665), the legal framework for the Angelican reform of the convent and a theological text marked by the abbess’s Oratorian insistence on the annihilation of oneself.  Her Image of a Perfect and an Imperfect Nun (1665) provides the fullest exposition of her virtue theory.  The contemplative virtues central to the monastic life, especially the spirit of adoration, are stressed; to be authentic, virtue must empty itself of all self-interest.  Her accent on the intellectual nature of religious contemplation provoked a new controversy.  Martin de Barcos (1696) and Jean Desmarets de Saint-Sorlin (1665) criticized her approach as too intellectualist; Pierre de Nicole (1679) defended her use of reason in meditation.  The Spirit of Port-Royal (1665) underscores self-annihilation in its treatment of the spiritual character of the convent community.

The posthumously published works of Mère Agnès also make a distinctive contribution to the philosophical and theological canon of Port-Royal.  An exercise in moral casuistry, Counsels on the Conduct Which the Nuns Should Maintain in the Event of a Change in the Governance of the Convent (1718) tackles the problem of moral cooperation with evil as it analyzes which actions would be legitimate and illegitimate in obeying the civil and ecclesiastical authorities who were persecuting the convent.  The two-volume Letters of Mère Agnès Arnauld, abbess of Port-Royal (1858) reflects the Augustinian axis of the abbess’s philosophy.  She repeatedly refers to the texts of Saint Augustine himself and modern Augustinian writers such as Jansen, Saint-Cyran, Antoine Arnauld, and Teresa of Avila in justifying her positions on spiritual government and theological controversy.

3. Philosophical Themes

The philosophical reflection of Mère Agnès Arnauld follows two primary avenues: philosophy of God and moral philosophy.  Influenced by the apophatic theology of the Oratorians, her philosophy of God stresses God’s alterity (otherness) and the incapacity of human concepts to penetrate the divine essence.  Her moral philosophy develops a theocentric account of the virtues central to the monastic life.  It also presents a casuistic analysis of the permissible and impermissible modes of cooperation with the persecutors of Port-Royal.

a. Negative Theology

In the Private Chaplet of the Blessed Sacrament [PC], Mère Agnès provides the most substantial expression of her apophatic theology.  A devotional treatise written in praise of Christ’s presence in the Eucharist, the Private Chaplet stresses the negative attributes of God disclosed in the eucharistic Christ.  The adorer of the Eucharist cannot penetrate the essence of the godhead, affirmed more accurately by terms expressing what he is not than by those expressing what he is.

A series of titles express this divine unknowability.  God is inaccessible. “He remains in himself, letting creatures remain in the incapacity to approach him [PC no.11].”  God is incomprehensible.  “He alone knows his ways.  He justifies to himself alone the plans he has for his creatures [PC no.12].”  God is entirely sovereign.  “He acts as the first cause without any subordination to the ends he has given himself [PC no.13].”  Other negative divine attributes include illimitability, inapplicability, and incommunicability.

Even the positive attributes ascribed to God receive an apophatic reinterpretation.  God’s holiness is entirely other than the alleged holiness of certain human creatures.  “The company God wants to keep with humanity is separate from it.  He resides only in himself.  It is not reasonable that God should approach us because we are only sin [PC no.1].”  The existence allegedly shared by both God and creatures is illusory.  Divine existence only manifests the non-being of creatures, especially peccatory human creatures.  “God is everything he wants to be and makes all other beings disappear.  As the sun blots out all other light, God exists simply to exist [PC no.4].”  The analogy of being disappears in this exaltation of divine alterity.

Throughout her writings, Mère Agnès emphasizes the rupture between God and human beings.  Analogical presentations of the divine attributes are inevitably anthropomorphic projections of human attributes into the divine essence.  As in the act of adoration before the Eucharist, the primary act of metaphysical affirmation of God is the adorer’s humble recognition of his or her utter incapacity to imagine or name the magnum mysterium that is the cause and the end of cosmic and human existence.  Only the language of negation and alterity can prevent both piety and philosophical reflection from deteriorating into imaginary projection.

b. Monastic Virtue Theory

In Image of a Perfect and an Imperfect Nun [IP], Mère Agnès analyzes the moral virtues proper to a nun committed to a strictly cloistered community.  Her distinction between the perfect and imperfect is not the one between virtue and vice.  The dividing line between authentic virtue and its subtle counterfeits lies in the difference between theocentric and anthropocentric postures of the will.

The monastic virtue of reverence illustrates the difference.  Both perfect and imperfect nuns practice their external obligations of divine worship and of reverence for their superiors.   The perfect focus on God alone, ignoring other creatures “as if they did not exist [IP, 7].”  The imperfect, however, suffer from a vacillating attention that “desires something other than God and that fears losing something other than God that pleases them [IP, 10].”  This anthropocentric turning on oneself corrupts the virtue that should be purely focused on God.

Other monastic virtues exemplify the split between anthropocentric and theocentric versions of virtue.  Perfect submission to the divine will accepts the periods of aridity and desolation which characterize spiritual maturation.  Imperfect submission, however, bridles at such deprivation and clings to sensible consolations.  Perfect zeal desires nothing other than the glory of God.  Imperfect zeal becomes fascinated with the external means used to glorify God and seeks recognition for its efforts.  Perfect repentance firmly renounces all sin and seeks solitude to reform one’s life.  Imperfect repentance vacillates and cannot give finality to its vague, contradictory desires for reform.  Recalling Pascal, Mère Agnès describes the vacillation of the imperfect soul: “Her mind is like a reed shaken by the wind, which makes it turn now this way and now that [IP, 53].”

This anti-anthropocentric account of virtue ultimately celebrates the annihilation of the self in the perfect practice of the monastic virtues.  Authentic humility entails recognition of one’s utter dependence on God for the least moral action.  “It is on this incapacity to perform the least good and to avoid the least evil without God’s help that the true nun establishes the unshakable foundation of humility [IP, 94].”  Similarly, authentic poverty acknowledges one’s utter non-existence in face of God.  “It is the knowledge that she has nothing that was hers before she was created out of nothing,  especially since the sin of Adam, who made all humanity worthy of not only losing the goods of heaven but of losing the goods of earth as well [IP, 100].”  Clearly influenced by the Oratorian spirituality of annihilation, the abbess depicts perfect virtue as a collapse of the moral agent into the divine will.

At the apex of the monastic virtues lie the contemplative virtues of solitude and adoration.  Authentic solitude permits the nun to recognize her utter uselessness in the face of God’s grandeur.  “God reduces us to be totally useless so that we might experience what the Prophet says: ‘Since the Lord is God, he has no need of our goods.’  This is to say that no matter how excellent our works may be, they provide no benefit to him; they are only advantageous for ourselves [IP, 148].”  In the act of adoration, the perfect nun experiences this self-annihilation in its fullest; she also discovers the source of this abolition of the human self in God’s operation of grace in the cross of Christ.  “She hears the voice of her Savior, who commands her to announce his death through her voluntary death to all things and to herself until he comes, which is to say, until she dies in her body.  He further tells her to find her glory and her rest only in the cross, in humiliation and privation of what she loves.  She should do so out of love for the one who dispossessed himself of his own glory, who annihilated himself, and who died for her salvation [IP, 160].”

This account of virtue is Augustinian in its stress on the utter necessity of grace for the performance of any moral action.  The “natural” virtues of prudence, fortitude, temperance, and justice are notable by their absence since they are illusory manifestations of pride.  The account is Oratorian inasmuch as it stresses the annihilation of the self as the key trait of the moral agent perfectly united to God in the practice of virtue.  It is also contemplative inasmuch as it integrates the practice of the moral virtues into the gaze of the adorer who knows through speculative experience and divine illumination how radically all good dispositions and good actions are caused by the sovereign godhead.  This architectonic contemplative gaze simultaneously recognizes the utter nothingness of the human agent distorted by sin and concupiscence.

c. Law and Freedom

As the principal author of the Constitutions of the Monastery of Port-Royal [CM], Mère Agnès crafted the basic legal structure for the reformed convent.  The piecemeal reforms effected by her sister Mère Angélique would now be embedded in a legal document recognized by ecclesiastical authority.  In the Constitutions, Mère Agnès also sketches her philosophy of freedom and rights in a gendered key.  The freedoms of women to pursue a vocation, to develop a theological culture, and to exercise limited self-government are affirmed.  In particular, the authority of the convent’s abbess to govern and instruct her nuns without external interference is underscored.

Rooted in the Angelican reform, the Constitutions emphasize the vocational freedom of women.  The convent will only accept women who have indicated their desire to pursue a monastic calling free of parental pressure.  “We should not admit any girl if she is not truly called by God.  She should show by her life and actions a true and sincere desire to serve God.  Without this we should never admit anyone for any other reason, even when it is a question of the intelligence, the wealth, or the noble title the candidate might bring [CM, 54].”  To emphasize this vocational freedom, the Constitutions abolish the dowry requirement, long traditional for choir nuns in Benedictine and Cistercian convents.  “If a poor but excellent girl, clearly called by God, presents herself for admission, we should not refuse her, although the convent would be heavily burdened.  We would then hope that God who sent her would feed her.  We should not be afraid to make such commitments as long as we choose souls carefully and only accept souls rich in virtue instead of temporal advantages [CM, 74].”  Although this policy of vocational freedom faithfully followed the canon law of the church, it shocked French aristocratic opinion, long accustomed to placing unmotivated widows and surplus daughters in convents through the gift of a dowry.

Similarly, the education of women in Port-Royal’s convent school was to respect this vocational freedom.  The school was to accept only pupils whose parents had not already designated them for the married or cloistered life.  “We will only accept those girls whose parents offer them to God in indifference: that is, indifference as to whether they have decided to become nuns or whether they have decided to return to the world [CM, 99].”  For Mère Agnès, the major purpose of the convent school was to permit the pupils to discern their personal vocations through prayer, sacramental life, and dialogue with the teaching nuns.

The Constitutions also stress the spiritual freedom of the individual nun in her times of personal prayer.  In a period when many religious orders minutely prescribed methods of meditation, Mère Agnès insisted on the spontaneity and freedom which a nun must enjoy as she advances in prayerful maturity.  “Saint Benedict’s intention was that we should give the Holy Spirit room and time to stir up in us the spirit of meditation, which consists in a sincere desire to belong to God and to do so in purity and compunction of heart….True meditation is a celestial gift and not a human one.  It is the Holy Spirit praying for us when he makes us pray [CM, 43].”  Like virtue, prayer is theocentric in its very causation.  A certain illuminism emerges in this Augustinian account of prayer.

To exercise this contemplative freedom, the nun must develop an extensive theological culture.  In a period when personal meditation on Scripture was still considered suspect, Mère Agnès stipulates a comparatively wide range of theological texts to be studied by the nuns.  In addition to biblical reading, nuns are to meditate on works from the patristic period (Augustine, the desert fathers, Dorotheus and Bernard of Clairvaux) and the modern period (François de Sales, Louis of Granada, and Teresa of Avila).

The nuns are also to enjoy limited self-governance.  The abbesses are to be elected by the nuns meeting in chapter.  The term of office was now to be fixed at three years, renewable for one additional term.  Although the bishop appoints a clerical overseer for the convent, the overseer is to be chosen from a list of three names presented by the abbess.  Similarly, the abbess is to exercise the right of approval for the chaplains and confessors who serve the convent.  The authority of the abbess in the reformed convent is especially pronounced.  She is to serve as the nuns’ principal spiritual director and to enjoy an extensive teaching role.  She is to provide lectures commenting on key monastic texts, such as the Rule of Saint Benedict.  In the conférence, one of the reformed Port-Royal’s creations, the abbess is to field the questions of her fellow nuns on both practical and speculative issues troubling the convent.

So strong was the reformed convent’s accent on theological culture and debate that critics derided the Port-Royal nuns as théologiennes.  In Mère Agnès’ perspective, the authentic nun is the woman who freely pursues a personal vocation, who strengthens this vocation through substantial theological study, who chooses her own superiors, and who pursues God in spontaneous meditation guided by the Holy Spirit.  The challenge to the patriarchal tradition of the forced vocation and the illiterate convent was evident.

Even in her legal texts, the Augustinian philosophy of Mère Agnès is evident.  The Constitutions are not a blueprint for human efforts to build the ideal convent; they reflect the work of divine grace within the reformers.  “As Saint Augustine says, we must work to conquer our vices by constant efforts and ardent prayers, but we must recognize at the same time that our efforts as well as our prayers, if there is anything good in them, are the effects of grace [CM, 273].” Corporate, no less than individual, acts of virtue have a single causation in the operation of divine grace.

d. Ethics of Resistance

As the opposition to Port-Royal intensified, Mère Agnès composed Counsels on the Conduct Which Nuns Should Maintain in the Event of a Change in the Governance of the Convent [CC].  The work attempted to prepare the nuns to negotiate the persecutions which would soon overwhelm the convent.  Mère Agnès presciently saw the exile of recalcitrant nuns, the imposition of foreign superiors, and the use of ecclesiastical interdict (barring a Christian from participation in the sacraments) as probable tactics of the new persecution.  Her Counsels functions both as a casuistical manual, which instructs nuns on acceptable and unacceptable scenarios of cooperation with hostile authorities, and as a moral exhortation, which analyzes the virtues the nuns should cultivate under duress.

If foreign superiors are imposed on the convent, the Port-Royal nuns should refuse to acknowledge their authority.  Such imposed superiors represent a violation of the convent’s constitutions, which have been duly approved by the Vatican and the French throne.  “These superiors cannot have a true authority by usurping a power that does not belong to them.  They will be intruders, even when they want to adorn themselves with the obedience due superiors [CC, 83].”  The nuns of Port-Royal have not pledged to follow generic vows of poverty, chastity, and obedience; with the approval of church and state, they have promised to live these vows in the convent of Port-Royal, ruled by its constitutions and laws.  The imposition of foreign superiors represents a serious violation of this vocational right.

In practice, the nun must distinguish between acceptable and unacceptable cooperation in the commands of these illegitimate superiors.  Material cooperation is the easiest.  The nun should quickly accept commands concerning manual labor, meals, and physical disposition of one’s space.  Even here, however, the nun must refuse commands to activities incompatible with the ethos of Port-Royal; making elaborate vestments or placing flowers on the altar, for example, would violate the convent’s austere understanding of poverty.  Moral cooperation with the illegitimate superiors should be refused.  The nun is not to reveal her convictions or feelings to the illegitimate superiors or their attendant clergy.  If a command is refused, no explanation is to be given.  Under no circumstances should the nun agree to submit to the demand of an unreserved signature on the statement assenting to the church’s condemnation of Jansen; to do so would be to deny the truth concerning grace.  Even conversations on this topic are to be avoided.

The problem of material cooperation when one is exiled to a foreign convent is comparatively easier.  A legitimate superior of a foreign convent exercises a certain authority over the entire house, including the guests who reside there.  An exiled Port-Royal nun should easily accept the host convent’s different material culture, even to the point of breaking the reformed Port-Royal’s vegetarianism, and different spiritual culture, including participation in a different version of the divine office than that used at Port-Royal.  Even here, however, a strict silence should be employed to avoid any moral cooperation with Port-Royal’s persecutors.  If an exiled nun confesses her sin to the convent’s confessor, she should not reveal anything else in her conscience except her sins, soberly described.  Interviews with the new superior should be respectful but the exiled nun should not reveal her internal state of mind.  An asceticism of the tongue is essential in this genteel campaign of resistance.

The abbess also reminds the nuns of the virtues they need to cultivate during the impending persecution.  They need to acquire the virtues of the martyrs who have preceded them.  “God clearly permits us to be consoled by the thought that we are suffering because we feared to offend him by assenting against our conscience to something we thought impossible to do without attacking the truth [CC, 104].”  In refusing to assent to the condemnation of what they believed to be Jansen’s accurate theory of grace, the nuns have become the most modern of victims: the martyr to conscience.

The persecution also gives the nuns the occasion to deepen the virtues of humility and of dependence on God’s providence.  Most strikingly, the deprivation of Holy Communion (as part of the censure of interdict) permits the nuns to discover an asacramental type of communion with Christ that seems to transcend the value of sacramental communion.  “Instead of the bread of God, we receive the word of God himself, which must be heard in our heart….We place our confidence in the promise made to us in Holy Scripture that the spiritual anointing, even greater in affliction, will teach us everything [CC, 95].” Under the brunt of persecution, the piety of the nun becomes a comparatively antinomian type of piety, no longer requiring the sacramental or sacerdotal mediation of the church to experience intimate communion with God.

4. Interpretation and Relevance

Several factors have limited the reception and interpretation of the works of Mère Agnès Arnauld as properly philosophical works.  First, a number of secondary works have focused exclusively on the controversy over the nun’s early pamphlet Private Chaplet of the Blessed Sacrament. The commentaries by Saint-Cyran (1633), Binet (1635), Armogathe (1991), and Lesaulnier (1994) indicate the longstanding interest in the church-state controversy behind this international quarrel.  This focus on the ecclesiastical politics behind the abbess’s early work has tended to devalue the more detailed positions on virtue and authority developed by Mère Agnès in her works of maturity.  Second, the Jansenist movement itself often distanced itself from the works of the nun, considered too mystical for the practical, rationalist piety of the Jansenist mainstream.  Barcos’s (1696) critique of Mère Agnès’s theories of prayer indicates the disdain of the later Jansenist movement for a mysticism-oriented philosophy too dependent on its Oratorian sources.

The contemporary philosophical retrieval of the thought of Mère Agnès is focusing more on her work as a moralist.  Her virtue theory privileges those intellectual and volitional habits that typify the way of life proper to a strictly cloistered convent.  Contemplation itself, interpreted as a loving gaze on God freed from all self-interest, becomes the keystone of the authentic virtuous life.  As Mesnard (1994) argues, the more active dimension of the life of the embattled nun merits new consideration.  Her reflections on the quandaries of material cooperation with evil constitute a casuistry for the oppressed.  The ethics of resistance she constructed during the persecution of Port-Royal remains to be explored.

5. References and Further Reading

All translations from French to English above are by the author of this article.

a. Primary Sources

  • Arnauld, Mère Agnès. Avis donnés par la Mère Cathérine Agnès de Saint-Paul, Sur la conduite que les religieuses doivent garder, au cas qu’il arrivât du changement dans le gouvernement de sa maison (N.p.: 1718).
    • [The treatise analyzes the morality of material cooperation with the opponents of the convent.]
  • Arnauld, Mère Agnès. Counsels on the Conduct Which the Nuns Should Maintain in the Event of a Change in the Governance of the Convent. 1718.
  • Arnauld, Mère Agnès. L’image d’une religieuse parfaite et d’une imparfaite, avec les occupations intérieures pour toute la journée (Paris: Charles Savreux, 1665).
    • [This work analyzes the difference between theocentric and anthropocentric virtue, with the accent placed on the Oratorian virtue of self-annihilation.]
  • Arnauld, Mère Agnès. Les constitutions du monastère de Port-Royal du Saint-Sacrement (Mons: Gaspard Migeot, 1665).
    • [Written with the collaboration of Antoine Arnauld and Mère Angélique Arnauld, this juridical document provides the legal framework for the reformed Port-Royal convent.]
  • Arnauld, Mère Agnès. Lettres de la Mère Agnès Arnauld, abbesse de Port-Royal, ed. Prosper Faugère [and Rachel Galet], 2 vols. (Paris: Benjamin Duprat, 1858).
    • [The correspondence indicates the broad Augustinian culture of the abbess as well as the principles of her methods of governance and of spiritual direction.]
  • Arnauld, Mère Agnès. Private Chaplet of the Blessed Sacrament (1626).
  • Arnauld, Mère Agnès. Spirit of Port-Royal (1665).

b. Secondary Sources

  • Armogathe, Robert. “Le chapelet secret de Mère Agnès Arnauld,” XVIIe siècle no. 170 (1990): 77-86.
    • [The article provides an excellent critical edition of the Private Chaplet and an analysis of the work’s theology of rupture.]
  • Barcos, Martin de. Les sentiments de l’abbé Philérème sur l’oraison mentale (Cologne: P. Du Marteau, 1696).
    • [The Jansenist leader criticizes Mère Agnès’s approach to meditation as too methodical and too intellectualist.]
  • Binet, Étienne. Discussion sommaire d’un livret intitulé “Le chapelet secret du très-saint Sacrement” (Paris: 1635).
    • [The Jesuit author criticizes Mère Agnès’s Private Chaplet for its alleged asacramentalism and discouragement of the cultivation of the moral virtues.]
  • Bugnion-Sécretan, Perle. Mère Agnès Arnauld, 1593-1672; Abbesse de Port-Royal (Paris: Cerf, 1996).
    • [This biography uses Mère Agnès’s correspondence to probe the abbess’s psychological life.]
  • Conley, John J. Adoration and Annihilation: The Convent Philosophy of Port-Royal (Notre Dame, IN: University of Notre Dame Press, 2009): 113-174.
    • [This chapter analyzes Mère Agnès’s Augustinian philosophy, especially its theory of virtue, freedom, and the divine attributes.]
  • Chédozeau, Bernard. “Aux sources du Traité de l’oraison de Pierre Nicole: Martin de Barcos et Jean Desmarets de Saint-Sorlin lecteurs des Occupations intérieures de la Mère Agnès Arnauld,” Chroniques de Port-Royal 43 (1994): 123-34.
    • [The article traces the influence of Mère Agnès’s spirituality on subsequent controversies over the nature of Christian contemplation.]
  • Desmarets de Saint-Sorlin, Jean. Le chemin de la paix et celui de l’inquiétude, vol. 1 (Paris: C. Audinet, 1665).
    • [The book condemns Mère Agnès’s theories of prayer as too rationalistic.]
  • Lesaulnier, Jean. “Le chapelet secret de la Mère Agnès Arnauld,” Chroniques de Port-Royal 43 (1994): 9-23.
    • [The article provides a well-documented textual study of the various versions of and controversy over the abbess’s Private Chaplet.]
  • Mesnard, Jean. “Mère Agnès femme d’action,” Chroniques de Port-Royal 43 (1994): 57-80.
    • [Unlike other commentators, the author stresses the practical rather than the mystical dimension of Mère Agnès’s work and theories.]
  • Nicole, Pierre. Traité de l’oraison (Paris: H. Josset, 1679).
    • [A leading Jansenist philosopher defends Mère Agnès’s spirituality as both doctrinally orthodox and philosophically sound.]
  • Saint-Cyran, Jean du Vergier de Hauranne, abbé de. Examen d’une apologie qui a été faite pour servir de défense à un petit livre intitulé Le chapelet secret du Très-Saint Sacrement (Paris, 1633).
    • [A defense of the Augustinian pedigree and orthodoxy of Mère Agnès’s Private Chaplet, the work marked Saint-Cyran’s inaugural alliance with the convent of Port-Royal.]
  • Timmermans, Linda. “La ‘Religieuse Parfaite’ et la théologie: L’attitude de la Mère Agnès à l’égard de la participation aux controverses,” Chroniques de Port-Royal 43 (1994): 97-112.
    • [The commentary on Image of a Perfect Nun argues that the abbess desired nuns to abstain from theological disputes; Mère Agnès’s own participation in several such public disputes is downplayed.]
  • Weaver, F. Ellen. La Contre-Réforme et les constitutions de Port-Royal (Paris: Cerf, 2002).
    • [This study stresses the link between the abbess’s vision of reformed conventual life and both the earlier Cistercian tradition and other “non-Jansenist” currents in the Counter-Reformation.]

Author Information

John J. Conley
Email: jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Modal Illusions

We often talk about how things could have been, given different circumstances, or about how things might be in the future. When we speak this way, we presume that these situations are possible. However, sometimes people make mistakes regarding what is possible or regarding what could have been the case. When what seems possible to a person is not really possible, this person is subject to a modal illusion. With a modal illusion either (i) things seem like they could have been otherwise when they could not have been otherwise or (ii) things seem as if they could not have been otherwise when they could have been otherwise. The most widely discussed cases are instances of the former. Certain impossibilities seem (at least to some people) to be possible. Because of these illusions, there are certain necessary truths (truths which could not have been false) that are mistakenly thought to be contingent. Of particular concern to philosophers working on modal illusions are certain necessary truths that are known a posteriori, and which strike some people as contingent. The most discussed examples are found in Saul Kripke’s Naming and Necessity (1972), the work that sparked the contemporary interest in modal illusions.

While many elementary necessary truths seem to be necessary, the “necessary a posteriori” do not always seem to be so. For example, it is obviously a necessary truth that two is greater than one. It does not seem that things could have been otherwise. On the other hand, it is also a necessary truth that water is composed of H2O (as Kripke (1972) explains), but this might not seem to be necessary. The proposition expressed by the sentence ‘water is H2O’ strikes some people as contingently true because it seems that water could have been composed of something else. However, water could not have been composed of anything other than H2O since that’s what water is. Anything else would not be water. We came to know the composition of water through experience and so one might think that we could have had different experiences that would have shown that water was composed of XYZ, for example, and not H2O. However, the idea that things could have been otherwise and that the proposition is merely contingently true is a modal illusion.

Table of Contents

  1. Modal Illusions
  2. The Necessary A Posteriori
  3. Ramifications
  4. Similarity Accounts
  5. Objections
    1. True Modal Beliefs and False Non-Modal Beliefs
    2. Other Examples of Modal Illusions
  6. Two-Dimensionalist Accounts
  7. Objections
    1. Other Examples of Modal Illusions
    2. The Epistemic Status of the Secondary Proposition
    3. Believing Impossibilities
  8. Possibility Accounts
  9. Objections
    1. Conceivability and Possibility
    2. Impossible Worlds
    3. Metaphysical Possibility
  10. References and Further Reading

1. Modal Illusions

Unless otherwise specified, the terms ‘necessary,’ ‘contingent,’ ‘possible,’ ‘impossible,’ and all of their cognates refer to metaphysical notions. The phrases ‘could have been,’ ‘could not have been,’ and so forth are also used in a metaphysical sense. If p is necessarily true, then p could not have been false. If p is necessarily false, then p could not have been true. The proposition expressed by the sentence ‘2 is greater than 1’ is necessarily true since it could not have been false, for example. If p is contingently true, then although p is true, it could have been false. For example, the proposition expressed by the sentence, ‘John McCain lost the 2008 U.S. Presidential election,’ is contingently true since it could have been false. McCain could have won the 2008 election. If p is possible, then either p is true or p is contingently false. The proposition expressed by the sentence ‘McCain won the 2008 election’ is false, but it is possible, since McCain could have won the 2008 election.
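
The definitions above can be restated in standard modal notation. The following is only a sketch: the operators \Box (‘necessarily’) and \Diamond (‘possibly’) are not used elsewhere in this entry and are introduced here as an assumption, along with the usual principle that whatever is true is possible.

```latex
\begin{align*}
p \text{ is necessarily true} &\iff \Box p \\
p \text{ is necessarily false} &\iff \Box \neg p \\
p \text{ is contingently true} &\iff p \wedge \Diamond \neg p \\
p \text{ is contingently false} &\iff \neg p \wedge \Diamond p \\
p \text{ is possible} &\iff \Diamond p \iff \neg \Box \neg p
\end{align*}
```

On this notation, the claim that a possible proposition is either true or contingently false amounts to the equivalence of \Diamond p with p \vee (\neg p \wedge \Diamond p), which holds given the assumption that p implies \Diamond p.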

Certainly, a person can be mistaken about the modal properties of many different types of statements or propositions. A person might mistakenly believe that a contingent truth known a priori is necessarily true. Kripke (1972) gives examples of the “contingent a priori” that may also be illusory. Consider Kripke’s example, ‘stick S is one meter,’ said of the stick used to fix the reference of ‘one meter.’ Kripke points out that ‘stick S is one meter’ is contingent because stick S could have been a different length; it could have been longer or shorter than one meter. Yet, the speaker knows that stick S is one meter a priori because stick S is being used to fix the referent of ‘one meter.’ Before one knows how long the stick actually is, one knows that it is one meter long. It strikes some people as false that stick S could have been longer or shorter than one meter since stick S is fixing the reference of ‘one meter.’ On this mistaken line of thought, stick S could have been many lengths, but it could not have been longer or shorter than one meter since ‘one meter’ refers to whatever length stick S happens to be. Those who are struck by the appearance that stick S could not have been longer or shorter than one meter are subject to a modal illusion. (However, this does not seem to be a common mistake made regarding Kripke’s examples of the “contingent a priori”. Rather, it seems that when a person doubts the Kripkean examples of the “contingent a priori”, the person believes that these truths are knowable a posteriori. One might argue that while it is necessary that stick S is one meter, one could only have known that through experience.)

There may also be “contingent a posteriori” truths that are thought to be necessary. For example, Kripke (1972, p. 139) points out that it is sometimes mistakenly thought that light could not have existed without being seen as light. “The fact that we identify light in a certain way seems to us to be crucial even though it is not necessary; the intimate connection may create an illusion of necessity.” It is merely contingently true that light is seen as light, but some might think it is necessarily true and that things could not have been otherwise.

Finally, there are certainly necessary a priori truths that strike some people as merely contingently true. Any mistake about what could have been the case or could not have been the case is a modal illusion. However, the most commonly discussed examples of modal illusions are Kripke’s examples of the “necessary a posteriori”, and therefore these will be the focus of this entry. Sections 4 through 9 below provide an overview of the most prominent explanations offered by contemporary philosophers regarding how or why a person subject to a modal illusion of the necessary a posteriori comes to make the mistake.

2. The Necessary A Posteriori

The following are the three most commonly discussed examples of modal illusions of the “necessary a posteriori”:

(a) Hesperus is Phosphorus.

(b) Water is H2O.

(c) This table is made of wood. (Said of a table originally made of wood.)

The examples above do strike many people as contingent on first consideration. However, the propositions expressed by each of the above sentences are necessary. For example, ‘Hesperus is Phosphorus’ is both necessary and knowable a posteriori. Given that Hesperus is Phosphorus, Hesperus is necessarily Phosphorus since being self-identical is a necessary property (any object is necessarily identical to itself). Yet, we came to know that Hesperus is Phosphorus through empirical means. The proposition expressed by the sentence might seem contingent to someone if that person thought that Hesperus could have been distinct from Phosphorus. (b) and (c) are also necessary since composition is a necessary property of an object or substance. But of course, we need empirical evidence to know the composition of water or this table and so both (b) and (c) are a posteriori.
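
The reasoning behind the necessity of (a) can be laid out step by step. What follows is a sketch of the standard argument for the necessity of identity; the constants h and p (for ‘Hesperus’ and ‘Phosphorus’) are illustrative labels, and the argument assumes that both names are rigid designators, that is, that each picks out the same object in every possible situation.

```latex
\begin{align*}
&1.\quad \forall x\, \Box (x = x)
  && \text{every object is necessarily self-identical} \\
&2.\quad \forall x\, \forall y\, \bigl( x = y \rightarrow ( \Box (x = x) \rightarrow \Box (x = y) ) \bigr)
  && \text{indiscernibility of identicals} \\
&3.\quad \forall x\, \forall y\, \bigl( x = y \rightarrow \Box (x = y) \bigr)
  && \text{from 1 and 2} \\
&4.\quad h = p
  && \text{established empirically} \\
&5.\quad \Box (h = p)
  && \text{from 3 and 4}
\end{align*}
```

Only step 4 rests on observation, which is why the conclusion in step 5 is necessary and yet knowable only a posteriori.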

While (a), (b), and (c) are necessary truths, the following propositions are necessarily false, but may seem to some people to be merely contingently false:

(a1) Hesperus is distinct from Phosphorus.

(b1) Water is XYZ.

(c1) This table is made of ice. (Said of a table originally made of wood.)

It might seem that Hesperus could have been distinct from Phosphorus, that water could have been composed of XYZ, or that this table could have been made of ice. A person might consider this table, think about what it could have been made of and come to the mistaken conclusion that it could have been made of ice and then conclude that the proposition expressed by the sentence ‘this table is made of ice’ is merely contingently false. But of course, this table could not have been made of ice. Given that this table is made of wood, it is necessarily made of wood. Any table made of ice would not be this same table.

Of course, some philosophers deny that these examples are necessary. In that case, there is no modal illusion to explain since what seems contingent is contingent and what seems possible is possible. However, each of the accounts considered below attempts to explain the illusion in these cases because each of them accepts the Kripkean conclusions about the necessary nature of the above examples.

3. Ramifications

The correct solution to the problem of modal illusions will have an important impact on many philosophical issues because it is common for philosophical arguments to rely upon thought experiments about what is and is not possible.  For example, in the philosophy of mind, some say that they can conceive of mental activity without any physical activity or of a mental entity existing in the absence of a physical entity. Indeed, this was part of Descartes’ argument. Descartes relied on the seeming possibility that his mind or soul could exist without his body. Descartes’ narrator claimed that he could imagine being deceived about having a body, but he could not imagine being deceived about being a thinking being. So it seems that the mind or soul could exist or could have existed without the body. If this is true, then physicalism must be false.

The possibility of a philosophical zombie is often used in arguments against a physical reduction of consciousness. Some people believe that philosophical zombies could have existed. One might imagine a being identical in every physical respect to a human being, except that this being is not conscious; there is no mental activity whatsoever. There are no emotions, thoughts, beliefs, fears, desires and so forth even though there are all the corresponding neurological events happening in the body. Moreover, the zombie exhibits all the behaviors of a person with emotions, thoughts, beliefs, fears, desires and so forth. For example, it acts angry when there are the neurological firings in the brain that normally occur when a person experiences anger. However, the zombie does not feel anger; the zombie does not feel anything! If these sorts of creatures could have existed, then mental activity does not supervene on physical activity. All the physical facts would be the same as they actually are but there would be no mental facts.

Another example many dualists use is that many people are struck by the feeling that pain could exist or could have existed without the corresponding physical activity in the body. Some say that they can imagine pain, the sensation, without the correlated neurological, physical activity in the body that occurs whenever a person has pain (call that C-Fiber stimulation). If this represents a genuine metaphysical possibility, then pain and other conscious events are not identical with, or reducible to, physical events.

Dualists use the sort of reasoning in these examples to show that there is no necessary connection between the mental and the physical. However, perhaps these are modal illusions. Perhaps zombie worlds, body-less souls, and pain in the absence of C-Fiber stimulation are not really possible. It may be the case that although a philosophical zombie seems possible it is not possible, just as XYZ-water seems possible even though it is not possible. In responding to arguments that rely on these appearances of possibility, many physicalists point to the Kripkean examples of the “necessary a posteriori”, arguing that these examples strike many people as contingent even though they are necessary. So even if it is necessary that mental events are physical events and even if it is true that mental events could not have existed without the corresponding physical events, it might seem as though they could have, just as it might seem as though water could have existed without being H2O even though it could not have.

Depending on the correct account of modal illusions, the seeming possibilities of philosophical zombies and of a purely mental world may or may not count as modal illusions. Different explanations of modal illusions have different consequences for the materialist/dualist debate because only some explanations of modal illusions will count zombie worlds and body-less souls as modal illusions.

4. Similarity Accounts

Some explanations of what modal illusions are contend that the person who is struck by the feeling that things could have been otherwise does not really have an impossible situation in mind. Instead, the situation the person considers is one in which there are similar objects or a similar substance and the situation has been re-described. This family of accounts, called Similarity Accounts, includes Kripke’s own. According to Kripke, it might seem possible that this (wooden) table could have been made of ice because we claim that we can imagine this table being made of ice. However, Kripke (1972, p. 114) says, “this is not to imagine this table as made of…ice, but to…imagine another table, resembling this one in all the external details made of…ice.” According to Kripke, the intuition that leads a person to conclude that this table could have been made of ice is not an intuition about this table but an intuition about a similar one. The intuition must be re-described.

Kripke (1972, p. 142) also argues that the necessarily false propositions ((a1), (b1), and (c1)) could not have been true but some “appropriate corresponding qualitative statement” for each is true. Kripke (1972, p. 143) claims that the sentence, ‘two distinct bodies might have occupied in the morning and the evening, respectively, the very positions actually occupied by Hesperus-Phosphorus-Venus’ is true and should replace the “inaccurate statement that Hesperus might have turned out not to be Phosphorus”. It is unclear whether Kripke wants to maintain that the person subject to the modal illusion really has that corresponding statement in mind or whether he simply wants to maintain that this corresponding statement should replace the false statement the person does have in mind. In either case, Kripke adopts a Similarity Account approach in saying that the person has the false belief because she considers a situation in which some planet similar to Hesperus is distinct from some planet similar to Phosphorus—and not a situation in which Hesperus is not Phosphorus. Similarity Accounts argue that if Hesperus could not have been distinct from Phosphorus, then when a person claims to believe that they could have been distinct, it cannot be because she has imagined a scenario or situation in which they are distinct since there is no such possible scenario or situation.

Kripke goes on to argue that there is no similar explanation about the belief that pain could have existed in the absence of C-Fiber stimulation. One can imagine that pain could have existed in the absence of C-Fiber stimulation; there is no re-description necessary because there is no other feeling that is very much like pain that the person imagines. To be a pain is to be felt as a pain, according to Kripke, and so if we imagine the sensation of pain without C-Fiber stimulation, the sensation we imagine must be pain—otherwise, what would the similar phenomenon be if not pain?

The appearance of pain without C-Fiber stimulation is not like the appearance of water without hydrogen and oxygen, according to Kripke. It is not true that to be water is to be experienced as water. A person can have all the experiences of water and yet the substance could be something else. When one imagines water composed of XYZ, according to these accounts, the person has imagined this similar substance—one that is experienced as water but is not water. However, when one imagines pain existing in the absence of C-Fiber stimulation, there is no phenomenon similar to pain that the person really imagines. One cannot have all the experiences of pain without there being pain. So Similarity Accounts are unable to explain the false intuition that pain could have existed in the absence of C-Fiber stimulation because this intuition is not false and so not a modal illusion.

5. Objections

a. True Modal Beliefs and False Non-Modal Beliefs

According to Similarity Accounts, a person believes that something impossible could have been the case because she imagines a situation that could have been the case for some similar objects or substances. It might seem that water could have been composed of XYZ because a person might imagine some substance very similar to water in all qualitative respects, but this substance will not really be water.

Consider a true modal belief, such as the belief that John McCain could have won the 2008 U.S. Presidential election. Normally, we would say that this is a belief regarding John McCain himself and not someone similar to John McCain in the relevant respects. Indeed, this is what Kripke wants to hold about true modal beliefs. Kripke (1972, p. 44) writes, “When you ask whether it is necessary or contingent that Nixon won the election, you are asking the intuitive question whether in some counterfactual situation this man would in fact have lost the election.” He adamantly opposes the idea that the intuition is about some man similar to Nixon, yet he claims that the intuition that this (wooden) table might have been made of ice is not about this table. There may be a reason to explain true and false modal intuitions in this non-uniform way, but without an argument, we have no reason to claim that our false modal intuitions are about objects similar to the objects we claim they are about while our true modal intuitions are about the very objects we claim they are about.

Such a theory is also non-uniform in how it would be extended to treat false non-modal beliefs.  The belief that New York City is the capital of New York State is a false non-modal belief. (It is a false belief, yet the belief is not at all about what could have been the case.) If a Similarity Account were extended to explain how or why a person has false beliefs more generally, the account would say that the person comes to this belief because he has an intuition that some city, similar to New York City in the relevant respects, is the capital of New York State. This is clearly an implausible explanation of such a false belief. We have no reason to believe that our common false beliefs stem from true beliefs about similar objects.

Now consider a necessary falsehood that a person mistakenly believes is true. Any mathematical falsehood would count. The mathematical falsehood that 18 squared is 314 (it is actually 324) is necessarily false; it could not have been true, but someone might mistakenly believe it is true. If a Similarity Account were extended to treat false beliefs more generally, the account would say that the person who believes that 18 squared is 314 does not really have 18 in mind but some number similar to 18 in the relevant respects. This is what the theory would say to explain any false mathematical beliefs. Because many (if not all) Similarity Accounts argue that one can never imagine impossibilities (which is Barcan Marcus’ claim in “Rationality and Believing the Impossible” (1983)), it would follow that no one could ever believe that a mathematical falsehood either could have been true or even is true. But clearly, we are capable of believing mathematical falsehoods.

b. Other Examples of Modal Illusions

In many occurrences of modal illusions, a person will come to realize that the proposition expressed by the sentence is necessary and will still be struck by the feeling that things could have been otherwise. As Alex Byrne (2007, p. 13) says, “A modal illusion, properly so-called, would require the appearance that p is possible in the presence of the conviction that p is impossible.” For example, a person who has read Kripke many times and acknowledges that water is necessarily H2O may still be struck by the appearance that water could have been XYZ. Call this a “person in the know.” A Similarity Account cannot explain the modal illusion in these cases. The subject in the know at once believes that water is necessarily composed of H2O and is struck by the feeling that things could have been otherwise. A Similarity Account would say that the intuition is about some other substance, similar to water in the relevant respects, and that the sentence ‘water could have been XYZ’ needs to be replaced.

Our subject in the know might say, “I know that it is impossible that p but it still sure seems like p could have been the case.” A Similarity Account might argue that what our subject really means is that “I know that it is impossible that p* but it still sure seems like p* could have been the case.” In that case, the account would need to explain why it is p* she has in mind in both instances. But more importantly, the account would need to explain this new illusion: if p* is possible and it strikes her as possible, why does she claim to know that it is impossible that p*? p* is possible and so according to this type of explanation, she must have a different proposition in mind. Might that be p**?

Similarity Accounts also cannot explain the illusion that this very table could have been made of ice. Imagine a person points to a wooden table and claims, “this very table is made of wood, but it could have been made of ice.” The person cannot be more specific about which table he means to consider; it is this very one in front of him, one that is made of wood. It would be absurd to say that the person is considering some other similar table that is made of ice. Our subject has said that the table he means to consider is made of wood. He could even have said, “It seems to me that any wooden table could have been made of ice.” Similarity Accounts fail to explain this illusion as well. It cannot be that our subject means to consider a specific table but mistakenly considers some similar one. He is making the claim about any wooden table, whatsoever. What is he considering in this case if the Similarity Account is correct?

6. Two-Dimensionalist Accounts

Another type of account that seeks to explain how modal illusions of the “necessary a posteriori” arise makes use of the two-dimensional semantic framework proposed by philosophers such as David Chalmers and Frank Jackson. This sort of approach aims to explain how a person might mistakenly think that a necessary proposition is contingent. As opposed to a traditional view of reference, the two-dimensional semantic framework proposes that there are two intensions of certain words. According to one common view of reference, a concept determines a function from possible worlds to referents. The function is an “intension” and it determines the “extension”. Two-Dimensionalism proposes that sometimes there are two intensions because often there is no single intension that can do all the work a meaning needs to do.

For example, Chalmers and Jackson explain that ‘water’ has two intensions. Under the common view of reference, the concept “water” determines a function from possible worlds to water/H2O. The function is an intension and determines that the extension of ‘water’ is always water/H2O. But according to the two-dimensional framework, there are two different intensions, two different functions from possible worlds to extensions. While the secondary intension of ‘water’ always picks out water/H2O, the primary intension picks out the “watery stuff” of a world: the clear, drinkable liquid that fills the lakes, rivers, and oceans in a possible world. In certain possible worlds, that stuff is composed of XYZ and so it might seem as if the proposition expressed by the sentence ‘water is XYZ’ is merely contingently false. That is an illusion caused by conflating the primary and secondary intensions of ‘water.’ The primary intension is meant to capture the “cognitive significance” of the term, which is what a person subject to a modal illusion must have in mind.

Certain sentences thus express two different propositions depending on the two different intensions of the terms in the sentence. According to Two-Dimensionalists, the primary proposition determines the epistemic property of the sentence (whether it is a priori or a posteriori) while the secondary proposition determines the modal property of the sentence (whether it is necessary or contingent). With any example of the Kripkean “necessary a posteriori”, the primary proposition is a posteriori but not necessary, while the secondary proposition is necessary but not a posteriori. The secondary proposition in this case, that water is H2O, is necessary in the standard Kripkean sense, but it is not a posteriori because the secondary intension always picks out H2O in any possible world; we do not need to do empirical investigation to know that water is water.  The primary proposition is not necessary since the watery stuff of a world could be composed of H2O, XYZ, or something else. However, it is a posteriori. We need empirical evidence to know what water is composed of in any world.
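The distinction between the two intensions can be illustrated with a toy model. The following is only a minimal sketch under simplifying assumptions, not a rendering of Chalmers’ or Jackson’s formal apparatus: the two worlds, the names WORLDS, ACTUAL, primary_intension_water, secondary_intension_water, and holds_in_all_worlds are all hypothetical, and a “world” is reduced to a record of what the watery stuff is composed of.

```python
from typing import Callable, Dict

# Hypothetical toy worlds: each records what the watery stuff is made of there.
WORLDS: Dict[str, Dict[str, str]] = {
    "actual": {"watery_stuff": "H2O"},
    "twin": {"watery_stuff": "XYZ"},
}
ACTUAL = "actual"  # assumption: the world we in fact inhabit


def primary_intension_water(world: str) -> str:
    """Pick out whatever fills the watery role in the given world."""
    return WORLDS[world]["watery_stuff"]


def secondary_intension_water(world: str) -> str:
    """Pick out, in every world, what the watery stuff is in the actual world."""
    return WORLDS[ACTUAL]["watery_stuff"]


def holds_in_all_worlds(intension: Callable[[str], str], kind: str) -> bool:
    """Toy necessity test: 'water is <kind>' is true at every world."""
    return all(intension(world) == kind for world in WORLDS)


# 'water is H2O' read through each intension.
primary_necessary = holds_in_all_worlds(primary_intension_water, "H2O")
secondary_necessary = holds_in_all_worlds(secondary_intension_water, "H2O")
```

In this sketch the primary proposition fails the necessity test because the watery stuff is XYZ in the twin world, while the secondary proposition passes it because the secondary intension returns H2O at every world. The sketch models only the modal side of the distinction; it says nothing about which proposition is a priori or a posteriori.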

Jackson (1997, p. 76) holds that the secondary proposition is “normally meant by unadorned uses of the phrase ‘proposition expressed by a sentence’” and Chalmers (1996, p. 64) too says that the secondary proposition “is more commonly seen as the proposition expressed by a statement.” Therefore, one might say that the proposition expressed by ‘Hesperus is Phosphorus’ is necessary. If it seems contingent to a person, that is a modal illusion and the illusion is explained by the fact that the primary proposition is not necessary. According to this sort of account, when a person is subject to a modal illusion and concludes that a necessary truth is contingent, she does not consider the proposition expressed. Rather, the sentence misdescribes the situation she is considering. Her mistake is not simply in concluding that the proposition is contingent but in reporting what proposition she is considering. Two-Dimensionalist Accounts have this feature in common with Similarity Accounts: the person subject to the modal illusion does not have some impossible situation in mind. The situation she has in mind is not described correctly.

Chalmers uses his Two-Dimensionalist explanation of modal illusions to argue for dualism. According to Chalmers, pain in the absence of C-Fiber stimulation is not a modal illusion. In The Conscious Mind, Chalmers (1996, p. 133) says, “with consciousness, the primary and secondary intensions coincide.” The primary intension of ‘pain’ picks out painful sensations, feelings experienced as pain, but the secondary intension of ‘pain’ also picks out painful sensations, feelings experienced as pain, since what it means to be a pain is to be experienced as a pain. It does not always pick out C-Fiber stimulation. So, painy-stuff cannot be misdescribed by the word ‘pain’ since all that it is to be a pain is to be felt as a pain. The secondary proposition—the proposition that backs the necessity or contingency of a sentence—expressed by ‘pain is C-Fiber stimulation’ is contingent. The proposition could have been false since the secondary intension of ‘pain’ picks out something other than C-Fiber stimulation in some possible worlds. The person who believes that the proposition expressed by the sentence ‘pain is C-Fiber stimulation’ is contingently true has not made a mistake.

While Jackson once used his account of modal illusions to defend a dualist theory, he now supports physicalism. Given his physicalist commitments, Jackson should hold that a person who is struck by the feeling that pain could have existed in the absence of C-Fiber stimulation is under a modal illusion. Given his Two-Dimensionalist commitments, however, it is hard to know what he would say to explain the illusion. A Two-Dimensionalist Account of the illusion that pain could have existed in the absence of C-Fiber stimulation should say that the person who believes this imagines a situation in which the primary intension of ‘pain’ picks out something just like pain, but is not pain. It is unclear how a Two-Dimensionalist could make this sort of approach work since, as Chalmers (1996, p. 133) and Kripke (1972, p. 151) have noted, ‘pain’ always picks out pain and not painy-stuff. There is no painy-stuff that is not pain.  But perhaps what Jackson wants to argue is that while we believe we are imagining a world in which there is pain and no C-Fiber stimulation, there really must be C-Fiber stimulation in that situation.

7. Objections

a. Other Examples of Modal Illusions

Consider again the person in the know who is subject to a modal illusion. Two-Dimensionalist accounts fail to explain the illusion in these cases. Chalmers argues that it might seem as if the proposition expressed by the sentence ‘water is XYZ’ is contingently false because the sentence is used to express something true in some possible worlds. Chalmers (2007, p. 67) says that the person subject to the modal illusion considers “a conceivable situation—a small part of a world” in which watery stuff (and not water) is XYZ but the subject misdescribes the situation she is considering using the term ‘water.’ According to Chalmers (1996, p. 367, footnote 32), there is a “gap between what one finds conceivable at first glance and what is really conceivable.” It might seem conceivable that water could have been XYZ, but it is not really conceivable since it is impossible. While this may be a plausible explanation in the typical cases of modal illusions, it is an implausible explanation for what happens in the case of our subject in the know. This person knows enough to recognize that there might be a situation in which the watery stuff at a world is composed of XYZ and thus makes the primary proposition expressed by the sentence ‘water is XYZ’ true, but she does not have that proposition or situation in mind. Rather, it strikes her as possible—even though she believes it is not possible—that water could have been XYZ and that the proposition expressed (the secondary proposition) is contingently false. The person in the know would explicitly consider the secondary proposition and it might still strike her as merely contingently false.

The Two-Dimensionalist explanations also fail to explain modal illusions involving ‘This table is made of wood’ or other sentences that use demonstratives. Imagine our subject is asked whether it seems that this table could have been made of ice and a certain wooden table is pointed to. If it strikes our subject as possible, she is subject to a modal illusion. Given that the table is made of wood, it could not have been made of anything else. According to a Two-Dimensionalist explanation of modal illusions, the reason it might seem as if this table could have been made of ice is that our subject has imagined a scenario in which the primary proposition expressed by the sentence ‘this table is made of ice’ is true. It is unclear what scenario or possible world would verify the sentence. If there is one table referred to when our interrogator uses the phrase ‘this table’ and points to a specific table, what might the primary intension of ‘this table’ pick out if not this very one?

Nathan Salmon (2005, p. 234) argues that in using the demonstrative and ostensively referring to the table, “I make no reference—explicit or implicit, literal or metaphorical, direct or allusive—to any … table other than the one I am pointing to.” There is no similar table our subject is asked to consider. It is stipulated when she is asked whether it seems that this very table could have been made of ice that she is to consider this very table. When asked to imagine this very table being made of ice, either one can or one cannot. If one can, the object of belief is this very table and one is subject to a modal illusion. If one comes to the conclusion that this table could have been made of ice, one has come to a conclusion about this very table. It is an incorrect conclusion, but that doesn’t mean it wasn’t this table the person considered when reasoning to this mistaken conclusion.

Finally, consider another less discussed example of the “necessary a posteriori”. Kripke (1972) argues that every person necessarily has the parents that he or she has. Still, it seems to many people as if other people could have been their parents. If it seems to a person that she could have had different parents, that person must be subject to a modal illusion. According to Two-Dimensionalist Accounts, a person makes this mistake because she imagines a possible world in which someone very much like herself has parents other than the ones she actually has. If our subject, for example, believes that ‘I am the daughter of the Queen of England’ is merely contingently false, it is because she considers a world that would verify the primary proposition. The primary proposition, ostensibly, is true in some possible worlds, worlds in which someone very much like the speaker is the daughter of the Queen of England.

It seems very unlikely that a person would mean to imagine a world in which she is the daughter of the Queen of England and instead imagines a world in which someone just like her is the daughter of the Queen of England. It seems strange that anyone would mistakenly and unknowingly use the word ‘I’ to refer to someone other than himself or herself. Furthermore, Chalmers (2006) argues that what makes the primary proposition true in certain possible worlds is not that the speakers of that world use the terms in a certain way. The way they use the terms is irrelevant. We are concerned with how we use the terms and what those terms would pick out in other possible worlds. So in this case, it is not because there is some doppelganger of our subject who uses ‘I’ to refer to herself that the sentence ‘I am the daughter of the Queen of England’ is true. It is a matter of how our subject uses the term and what the word ‘I’ would pick out in this other possible world. But given that our subject could not have been the daughter of the Queen of England (since she is not), it is unclear to whom ‘I’ refers in this possible world if not the subject herself.

b. The Epistemic Status of the Secondary Proposition

Chalmers (1996) explains that instances of the “necessary a posteriori” express two propositions; one is necessary and the other is a posteriori but not necessary. Chalmers (1996, p. 64) claims that a statement is necessarily true in the first (a priori) sense if the associated primary proposition holds in all centered possible worlds (that is, if the statement would turn out to express a truth in any context of utterance). A statement is necessarily true in the a posteriori sense if the associated secondary proposition holds in all possible worlds (that is, if the statement as uttered in the actual world is true in all counterfactual worlds).

A statement such as ‘Hesperus is Phosphorus,’ for example, is not necessary in the first, a priori, sense because the primary proposition does not hold in all possible worlds; that is, it does not express a truth in every context of utterance. However, it is necessary in the secondary sense since the secondary proposition holds in all possible worlds. The statement, as uttered in the actual world, is true in all counterfactual worlds. This is because the secondary proposition expresses something like “Venus-Hesperus-Phosphorus is Venus-Hesperus-Phosphorus.” Chalmers says that this secondary proposition is not a posteriori, however. The primary proposition is a posteriori but not necessary, while the secondary proposition is necessary but not a posteriori. If it is not a posteriori, it would be either a priori or not knowable. This example seems perhaps to be a priori since it would not take any empirical investigation to know that Venus is Venus, and certainly this is a fact that we can know.

But consider a statement such as ‘water is H2O.’ This statement is necessary in the secondary sense because the secondary proposition holds in all possible worlds. The statement, as uttered in the actual world, is true in all counterfactual worlds since the secondary intension of ‘water’ always picks out H2O. But the secondary proposition is not a posteriori. Then it is either a priori or it is not knowable at all. Since we of course can know that water is H2O, it must be knowable a priori, but it is unclear how in the world a person could know the composition of water without empirical evidence.

The objection can also be made using ‘This table is made of wood.’ The secondary proposition expressed by this sentence (said of a table actually originally made of wood) is necessary in the secondary sense because the sentence, as uttered in this world, is true in all counterfactual worlds. But again, the secondary proposition is not both necessary and a posteriori. Either it is not knowable at all or else it is knowable a priori. Since we can of course know that this table is made of wood, that must be something we can know a priori, but it is even more implausible that we can know that fact a priori than that we can know the composition of water a priori. How could we know what any table is made of without empirical evidence?

Yet Two-Dimensionalist Accounts rely on this idea to explain modal illusions of the “necessary a posteriori”. It is because one proposition is a posteriori and not necessary while the other proposition is necessary and not a posteriori that we make these modal mistakes. The proposition expressed (the necessary one) may seem contingent because the primary proposition is not necessary and because the primary proposition is not knowable a priori, one might imagine that it could have been false since one can imagine a possible world in which it is false. But if the secondary proposition is not a priori either, then we have no need to posit a primary proposition to explain the illusion.

c. Believing Impossibilities

Finally, Two-Dimensionalist Accounts assume that a person cannot imagine impossibilities, but it seems quite plausible that we can and often do imagine or believe impossibilities. We believe mathematical falsehoods, for example, which are surely impossible. Two-Dimensionalists maintain that the scenario imagined has been misdescribed and it is not an impossible scenario that the person believes to be possible. But if a person can believe that the mathematically impossible is possible, it is a natural extension to say that a person can believe other impossibilities are possible, including metaphysically impossible scenarios such as that water could have been XYZ.

Chalmers (1996, p. 97) recognizes that some mathematical falsehoods are conceivable in a sense; both Goldbach’s Conjecture and its negation are conceivable “in some sense” but “the false member of the pair will not qualify as conceivable” in Chalmers’ usage since there is no scenario that verifies the false member of the pair. Call Goldbach’s Conjecture g and its negation ¬g. When a person claims to believe ¬g, assuming g is true, the belief must be misdescribed. Chalmers (1996, p. 67) says that although one might claim to believe that Goldbach’s Conjecture is false, he is only “conceiving of a world where mathematicians announce it to be so; but if in fact Goldbach’s Conjecture is true, then one is misdescribing this world; it is really a world in which the Conjecture is true and some mathematicians make a mistake.” This might be a plausible explanation of what is going on in the Goldbach case since, at this time, we do not know which is true and which is false, but consider any very complicated mathematical proposition that is known to be true. If someone claims to believe it is false, Chalmers would have to argue that the person has misdescribed the world imagined. This is clearly not the case in most occurrences of false mathematical beliefs. The mathematician who has erred does not imagine a situation in which the complicated mathematical proposition is “announced” to be false; he believes it is false. Two-Dimensionalist Accounts cannot explain these common mathematical false beliefs.

8. Possibility Accounts

Rather than invoking a substitute object of thought and saying that there is only one sense of ‘possibility’ relevant to the discussion, another approach to modal illusions would be to maintain that there is only one object of thought under consideration but different senses of ‘possibility’ are in play. One way to do this is to hold that it is possible that water is XYZ, for example, in some non-metaphysical sense. Such Possibility Accounts deny the assumptions made by Similarity Accounts and Two-Dimensionalist accounts that one cannot believe the impossible and that when one claims to believe the impossible, one has mis-described or re-described one’s belief. Possibility Accounts argue that the person does have in mind some impossible world, or at least some impossible situation, and mistakenly believes that it is possible or could have obtained. The impossible situation might seem possible because it is possible in some other sense.

There are many occurrences of modal illusions in which there is no similar substance or object that can serve as the object of thought and explain the illusion. Possibility Accounts deny that the false modal intuition is about some other object or substance and instead claim that the belief is about a metaphysically impossible situation and that the reason it strikes many people as possible is that it is possible in an epistemic sense. Of course there are many definitions of ‘epistemic possibility.’ According to some theorists, p is epistemically possible if p is true for all one knows. According to others, p is epistemically possible if p is not known to be false. And according to others, p is epistemically possible if p cannot be known to be false a priori. It is some version of this last definition that many theorists rely on to explain modal illusions of the “necessary a posteriori” using a Possibility Account. Since all of the examples discussed here are necessary and a posteriori, they cannot be known to be false a priori.  Therefore, each example is epistemically possible. Since each example is epistemically possible, it might seem to a person that things could have been otherwise even though things could not have been otherwise. The appearance of metaphysical possibility is explained by the epistemic possibility.

This type of account claims that a person subject to a modal illusion can, and usually does, have a metaphysical impossibility in mind, but it also claims that when the person believes the proposition expressed by the sentence ‘Hesperus is distinct from Phosphorus’ is contingently false, the proposition the person thinks is contingently false is the proposition expressed by the sentence and not some other. It is not that she believes that the sentence could have expressed something else and thus could have been true. Rather, she believes of the proposition expressed that it could have been true.

Possibility Accounts are thought to be able to explain those modal illusions that the other two types of accounts cannot explain. For example, when the person in the know says, “I know that it is impossible that p but it still sure seems like p could have been the case,” the Possibility Account argues that the subject can at once know that p is (metaphysically) impossible and be struck by the feeling that p is possible if p is possible in some other sense. Consider, too, the failed attempts to explain the modal illusion that this very table could have been made of ice. If this table could have been made of ice in some other sense, then the reason one might think that it could have been made of ice (in a metaphysical sense) is clear. Possibility accounts then must be able to explain how these impossibilities are possible in some other sense.

Stephen Yablo, a prominent defender of Possibility Accounts of modal illusions, claims that while water could not have been XYZ in a metaphysical sense, water could have been XYZ in a “conceptual” sense: if p is conceptually possible, then p could have turned out to be the case. Yablo explains that if p is metaphysically possible then p could have turned out to have been the case. There are certain propositions that while metaphysically impossible are conceptually possible. Such a proposition p could not have turned out to have been the case even though it could have turned out to be the case. This explains modal illusions of the “necessary a posteriori”. All of the examples so far considered are conceptually possible even though they are metaphysically impossible. (a1), (b1), and (c1) could have turned out to be so.

Yablo insists that conceptual possibility should not be reduced to the a priori, but without reducing it, ‘conceptual possibility’ could be cashed out in any number of ways. For instance, consider again Goldbach’s Conjecture. In some sense, either g or ¬g “could turn out to be the case” since we don’t know which is true. But in another sense, only one of g and ¬g could turn out to be the case since, if g is false, it is not only necessarily false, but logically impossible. Even though we don’t know right now whether g or ¬g is true, only one could turn out to be true in a certain sense. It is not clear whether or not something such as Goldbach’s conjecture could turn out to be true.

On the other hand, Yablo (1993, pp. 29-30) argues that it is conceptually possible that “there should be a town whose resident barber shaved all and only the town’s non-shavers.” This means that it could have turned out to be the case that there is a town whose resident barber shaves all and only the town’s non-shavers. However, it certainly could not have turned out to be the case that there is a town whose resident barber shaves all and only the non-shavers. The example is different from Goldbach’s conjecture. In that case, the necessary falsehood is unknown, so in some sense, the necessary falsehood could turn out to be the case. In the barber case, however, we know that the proposition is false, so it could not turn out to be true. And if it could not turn out to be the case, then such a town is not conceptually possible, contrary to Yablo’s claims.

Other Possibility Accounts avoid this problem by defining ‘epistemic possibility’ or ‘conceptual possibility’ in another way. For example, Scott Soames says that p is epistemically possible if and only if p is a way the world could conceivably be and that p is a way the world could conceivably be if we need evidence to rule out that it is the way the world is. For example, it is epistemically possible that water is XYZ because it is conceivable that the world is such that water is composed of XYZ. We do need evidence to rule out that this is the way the world is because we need evidence to know the composition of water. For all instances of the “necessary a posteriori”, one does need evidence to rule out metaphysical impossibilities that are epistemically possible. On the other hand, one does not need evidence to rule out that the world is such that there is a town whose resident barber shaves all and only the town’s non-shavers. This is not epistemically possible and not an example of the “necessary a posteriori”.

According to the schema Soames offers to identify instances of the “necessary a posteriori”, (a) is not an example of the “necessary a posteriori”. Soames argues that the proposition expressed by the sentence ‘Hesperus is Phosphorus’ is necessary, but it is not a posteriori since the proposition expressed is something like “Venus is Venus.” Clearly we do not need empirical evidence to know that this is true and we do not need empirical evidence to rule out that the world is such that Venus is not Venus (or that Hesperus is not Phosphorus). If we do not need evidence to rule out that this is the way the world is, then it is not epistemically possible.

The problem with Soames’ account is that we did need evidence to know that Hesperus is Phosphorus. The ancients who made this discovery did not do it from the armchair; they needed empirical evidence. Soames claims that “the function of empirical evidence needed for knowledge that Hesperus is Phosphorus is not to rule out possible world-states in which the proposition is false…evidence is needed to rule out possible states in which we use the sentence … to express something false.” The ancients though did not need empirical evidence to rule out worlds in which the sentence is used to express something false. They needed evidence to know that Hesperus and Phosphorus were the same.

Furthermore, it seems that Soames could argue similarly regarding the other two examples: Soames could say that we did not need evidence to rule out a possible world-state in which the proposition that water is H2O is false, but we needed evidence to rule out possible states in which we use the sentence ‘water is H2O’ to express something false. This is similar to what the Two-Dimensionalists argue, although Soames gives rather forceful and convincing arguments against Two-Dimensionalism himself. He does not adopt this strategy for either of the other two examples. Although Soames’ general explanation is promising, it is a problem that he rejects the explanation for one important example of modal illusions of the “necessary a posteriori”.

A Possibility Account might say that a philosophical zombie is epistemically possible but not metaphysically possible or that pain in the absence of C-Fiber stimulation is epistemically possible but not metaphysically possible. This is a common position taken by those who adopt a Possibility Account. Chalmers (1996, p. 137) explains: “On this position, “zombie worlds” may correctly describe the world that we are conceiving, even according to secondary intensions. It is just that the world is not metaphysically possible.” Chalmers (1996, p. 131) claims that this is “by far the most common strategy used by materialists” and recognizes Bigelow and Pargetter (1990) and Byrne (1993) among that camp.

However, not all Possibility Accounts defend this view in the way Chalmers describes. According to some Possibility Accounts, the reason the examples of the “necessary a posteriori” strike some people as contingent is because one cannot know that their negations are false a priori. Because we cannot know that the propositions expressed by the sentences ‘philosophical zombies exist’ and ‘pain is not C-Fiber stimulation’ are false a priori, these are epistemic possibilities. Since they are epistemically possible, it might seem to some people that they are metaphysically possible even if they are not—even if physicalism is true.

On the other hand, one could adopt a Possibility Account and deny physicalism. In that case, one could allow that philosophical zombies and pain in the absence of C-Fiber stimulation are both epistemically possible and metaphysically possible. One could adopt a Possibility Account of modal illusions but deny that the dualist intuitions count as modal illusions. Accordingly, the propositions expressed by the sentences ‘philosophical zombies do not exist,’ and ‘pain is C-Fiber stimulation’ would not count as genuine instances of the “necessary a posteriori”.

9. Objections

a. Conceivability and Possibility

There is a common view that conceivability implies possibility. Gendler and Hawthorne (2002) discuss this alleged implication in detail in their introduction to Conceivability and Possibility. According to this view it cannot both be true that water could not have been XYZ and that someone might conceive that water is XYZ. If conceivability implies possibility and a person conceives that water is or could have been XYZ, then it must be possible that water could have been XYZ.  However, given Kripke’s convincing arguments, most will reject this conclusion. On the other hand, if conceivability implies possibility and water could not have been XYZ, then a person who says she conceives that water is or could have been XYZ must not really be conceiving what she claims to conceive. This motivates some to adopt a view on which the belief needs to be re-described. Given the objections to such accounts (including the strong objection that we do believe impossibilities), it seems equally objectionable to claim that the person is not really conceiving of water when she claims to conceive that water might have been XYZ.

There does not seem to be an independent reason to maintain the link between conceivability and possibility. If conceivability does not imply possibility, then it might be the case that while water could not have been XYZ, one might conceive that it could have been.  If conceivability does not imply possibility, some version of a Possibility Account would have more force. While there does not seem to be an independent reason to maintain the link between conceivability and possibility, there are many reasons to reject it. First of all, our modal intuitions are not infallible, so we would have no reason to believe that whatever seems possible is possible. To think so is to give more credit than is due to our modal intuitions. If our modal intuitions were infallible, we would be unable to explain other modal errors that we make, such as our mathematical errors. Secondly, modal justification itself is not something philosophers have come to agree upon. We are still not sure what justifies our modal knowledge and so we cannot hold, at this time, that our modal intuitions always count as knowledge. Finally, our a posteriori justification in general is fallible. Since this is so, we have good reason to think that our a posteriori justification when it comes to modal truths might also be fallible.

b. Impossible Worlds

Chalmers objects to Possibility Accounts, or what he calls “two-senses views,” because he believes such accounts are committed to incorporating impossible worlds into their metaphysics. If p is impossible, yet epistemically possible, it must be true in some possible world, but if p is metaphysically impossible, it is true in no possible world. Therefore, it seems that there are metaphysically impossible worlds in which p is true or at which p is true. The idea of countenancing worlds that are impossible strikes many philosophers as highly problematic.

However, not all Possibility Accounts, or two-senses views, are committed to impossible worlds. If the definition of ‘possibility’ relies on possible worlds, this might be a valid concern, but not all Possibility Accounts rely on such a definition. For example, Yablo makes no mention of possible worlds. According to Yablo, p is conceptually possible if p is a way the world could have turned out to be. Yablo (1996) insists that a way the world could have turned out to be is not a possible world; it is not an entity at all.  A way the world could have been or could be is analogous to a way one feels or a way a bird might build a nest, and when one talks about a way a bird might build a nest, one does not make reference to a thing.

c. Metaphysical Possibility

Perhaps the most forceful objection to a Possibility Account is that it presumes there is some sort of primitive notion of metaphysical modality that is left undefined, one that cannot be identified or analyzed in non-modal terms. Those who use the terms ‘metaphysically necessary’ or ‘metaphysically possible’ have only explained how they use the terms, but no one has given an analysis of what these terms mean. The question arises as to what may be meant by ‘water is necessarily H2O’, as it seems to raise the question, “If this does not just reduce to possible worlds or to the a priori, then what does it reduce to, if anything?”

Some have argued that these notions are vague and that, although there are examples of what most people mean by metaphysically necessary and possible, there is no clear way to decide what counts as metaphysically possible in the problematic cases, including cases that have the dualists’ concerns at their center.

This is a strong objection but perhaps not an insurmountable one. While there are no clear definitions of these terms in the literature, most philosophers who use them have a basic understanding of what they mean. There is some intuitive sense that philosophers, following Kripke, have in mind. Furthermore, philosophers and non-philosophers alike do think that, although things are one way, some things could have been otherwise. It is this notion that philosophers are referring to when they use the term ‘metaphysical possibility.’  Kripke himself recognizes that there are no good definitions for these terms and that there are no necessary and sufficient conditions spelled out for either metaphysical necessity or metaphysical possibility. Still, we have a basic understanding of these notions. If p is necessary, p could not have been otherwise and ¬p could not have been true. If p is false but possible then p could have been the case even though it is not actually the case.

10. References and Further Reading

  • Barcan Marcus, R. (1983). Rationality and Believing the Impossible. Journal of Philosophy. Vol. 80, No. 6, (June 1983). pp. 321-388.
  • Bealer, G. (2004). The Origins of Modal Error. Dialectica, Vol. 58, pp. 11-42.
  • Bealer, G. (2002). Modal Epistemology and the Rationalist Renaissance. In Gendler & Hawthorne, (Eds.), Conceivability and Possibility. Oxford: Oxford University Press.
  • Bigelow, J. & Pargetter, R. (1990). Acquaintance With Qualia. Theoria. Vol. 56. pp. 129-147.
  • Byrne, A. (2007). Possibility and Imagination. Philosophical Perspectives. 21. pp. 125-144.
  • Byrne, A. (1993). The Emergent Mind, Ph.D. Dissertation, Princeton University.
  • Chalmers, D. (2007). Propositions and Attitude Ascriptions: A Fregean Account. Nous.
  • Chalmers, D. (2006). The Foundations of Two-Dimensional Semantics. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 55-140.
  • Chalmers, D. (2002). Does Conceivability Entail Possibility? In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 145-200.
  • Chalmers, D. (1996). The Conscious Mind. Oxford, New York: Oxford University Press.
  • Della Rocca, M. (2002). Essentialism versus Essentialism. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 223-252.
  • Descartes, R. (1996). Meditations on First Philosophy, translated by J. Cottingham. Cambridge: Cambridge University Press.
  • Descartes, R. (1983). Principles of Philosophy, translated by V.R. Miller & R.P. Miller. Dordrecht: D. Reidel.
  • Evans, G. (2006). Comments on ‘Two Notions of Necessity.’ In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 176-180.
  • Fine, K. (2002). The Varieties of Necessity. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 253-282.
  • Garcia-Carpintero, M. & Macia, J. (eds.). (2006). Two-Dimensional Semantics. Oxford & New York:  Oxford University Press.
  • Gendler, T.S. & Hawthorne, J. (eds.). (2002). Conceivability and Possibility. Oxford & New York: Oxford University Press.
  • Hill, C. (2006). Modality, Modal Epistemology, and the Metaphysics of Consciousness. In S. Nichols (ed.), The Architecture of Imagination. Oxford: Oxford University Press. pp. 205-235.
  • Hill, C. (1997). Imaginability, Conceivability, Possibility, and the Mind-Body Problem. Philosophical Studies: 87: pp. 61-85.
  • Hill, C. (1991). Sensations: A Defense of Type Materialism. Cambridge: Cambridge University Press.
  • Jackson, F. (2003). Mind and Illusion. In A. O’Hear (ed.), Minds and Persons. Cambridge: Cambridge University Press. pp. 251-272.
  • Jackson, F. (1997). From Metaphysics to Ethics: A Defense of Conceptual Analysis. Oxford: Oxford University Press.
  • Kripke, S. (1972). Naming and Necessity. Cambridge, MA: Harvard University Press.
  • Kripke, S. (1971). Identity and Necessity.  In M.K. Munitz (ed.), Identity and Individuation. New York: New York University Press.
  • Loar, B. (1990). Phenomenal States. Philosophical Perspectives. Vol. 4. pp. 81-108.
  • Ludwig, K. (2003). The Mind-Body Problem: An Overview. In T. A. Warfield & S. P. Stich (eds.), The Blackwell Guide to the Philosophy of Mind. Malden, MA: Blackwell. pp. 1-46.
  • Lycan, W. G. (1995). A Limited Defense of Phenomenal Information. In T. Metzinger (ed.), Conscious Experience. Paderborn: Schöningh. pp. 243-258.
  • Salmon, N. (2005). Reference and Essence. Amherst, NY: Prometheus Books.
  • Sidelle, A. (2002). On the Metaphysical Contingency of the Laws of Nature. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 309-366.
  • Soames, S. (2006). Kripke, the Necessary A Posteriori, and the Two-Dimensionalist Heresy. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 272-292.
  • Soames, S. (2005). Reference and Description. Princeton and Oxford: Princeton University Press.
  • Sorensen, R. (2006). Meta-Conceivability and Thought Experiments. In S. Nichols (ed.), The Architecture of Imagination. Oxford: Oxford University Press. pp. 257-272.
  • Sorensen, R. (2002). The Art of the Impossible. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 337-368.
  • Sorensen, R. (1996). Modal Bloopers: Why Believable Impossibilities are Necessary. American Philosophical Quarterly, 33 (1): pp. 247-261.
  • Sorensen, R. (1992). Thought Experiments. NY: Oxford University Press.
  • Tye, M. (1995). Ten Problems of Consciousness. Cambridge, MA: MIT Press.
  • Wong, K. (2006). Two-Dimensionalism and Kripkean A Posteriori Necessity. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 310-326.
  • Yablo, S. (2006). No Fool’s Cold: Notes on Illusions of Possibility. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 327-346.
  • Yablo, S. (2001). Coulda Shoulda Woulda. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 441-492.
  • Yablo, S. (2000). Textbook Kripkeanism and the Open Texture of Concepts. Pacific Philosophical Quarterly. 81: pp. 98-122.
  • Yablo, S. (1996). How In the World? Philosophical Topics. 24. pp. 255-286.
  • Yablo, S. (1993). Is Conceivability a Guide to Possibility? Philosophy and Phenomenological Research. 53: pp. 1-42.

Author Information

Leigh Duffy
Email: duffy.leigh@gmail.com
Buffalo State College
U. S. A.

Foundationalism

Epistemic foundationalism is a view about the proper structure of one’s knowledge or justified beliefs.  Some beliefs are known or justifiably believed only because some other beliefs are known or justifiably believed.  For example, you can know that you have heart disease only if you know some other claims, such as that your doctors report this and that doctors are reliable.  The support these beliefs provide for your belief that you have heart disease illustrates that your first belief is epistemically dependent on these other two beliefs.  This epistemic dependence naturally raises the question about the proper epistemic structure for our beliefs.  Should all beliefs be supported by other beliefs?  Are some beliefs rightly believed apart from receiving support from other beliefs?  What is the nature of the proper support between beliefs?  Epistemic foundationalism is one view about how to answer these questions.  Foundationalists maintain that some beliefs are properly basic and that the rest of one’s beliefs inherit their epistemic status (knowledge or justification) in virtue of receiving proper support from the basic beliefs.  Foundationalists have two main projects: a theory of proper basicality (that is, a theory of noninferential justification) and a theory of appropriate support (that is, a theory of inferential justification).

Foundationalism has a long history.  Aristotle in the Posterior Analytics argues for foundationalism on the basis of the regress argument.  Aristotle assumes that the alternatives to foundationalism must either endorse circular reasoning or land in an infinite regress of reasons.  Because neither of these views is plausible, foundationalism comes out as the clear winner in an argument by elimination.  Arguably, the best-known foundationalist is Descartes, who takes as the foundation the allegedly indubitable knowledge of his own existence and the content of his ideas.  Every other justified belief must be grounded ultimately in this knowledge.

The debate over foundationalism was reinvigorated in the early part of the twentieth century by the debate over the nature of the scientific method.  Otto Neurath (1959; original 1932) argued for a view of scientific knowledge illuminated by the raft metaphor according to which there is no privileged set of statements that serve as the ultimate foundation of knowledge; rather, knowledge arises out of a coherence among the set of statements we accept.  In opposition to this raft metaphor, Moritz Schlick (1959; original 1932) argued for a view of scientific knowledge akin to the pyramid image in which knowledge rests on a special class of statements whose verification doesn’t depend on other beliefs.

The Neurath-Schlick debate transformed into a discussion over the nature and role of observation sentences within a theory.  Quine (1951) extended this debate with his metaphor of the web of belief in which observation sentences are able to confirm or disconfirm a hypothesis only in connection with a larger theory.  Sellars (1963) criticizes foundationalism as endorsing a flawed model of the cognitive significance of experience.  Following the work of Quine and Sellars, a number of people arose to defend foundationalism (see section below on modest foundationalism).  This touched off a burst of activity on foundationalism in the late 1970s to early 1980s.  One of the significant developments from this period is the formulation and defense of reformed epistemology, a foundationalist view that proposed as foundational such beliefs as that there is a God (see Plantinga (1983)). While the debate over foundationalism has abated in recent decades, new work has picked up on neglected topics about the architecture of knowledge and justification.

Table of Contents

  1. Knowledge and Justification
  2. Arguments for Foundationalism
    1. The Regress Argument
    2. Natural Judgment about Cases
  3. Arguments against Foundationalism
    1. The Problem of Arbitrariness
    2. The Sellarsian Dilemma
  4. Types of Foundationalist Views
    1. Theories of Noninferential Justification
      1. Strong Foundationalism
      2. Modest Foundationalism
      3. Weak Foundationalism
    2. Theories of Proper Inference
      1. Deductivism
      2. Strict Inductivism
      3. Liberal Inductivism
      4. A Theory of Inference and A Theory of Concepts
  5. Conclusion
  6. References and Further Reading

1. Knowledge and Justification

The foundationalist attempts to answer the question: what is the proper structure of one’s knowledge or justified beliefs? This question assumes a prior grasp of the concepts of knowledge and justification.  Before the development of externalist theories of knowledge (see entry on internalism and externalism in epistemology) it was assumed that knowledge required justification.  On a standard conception of knowledge, knowledge was justified true belief.  Thus investigation of foundationalism focused on the structural conditions for justification.  How should one’s beliefs be structured so as to be justified?  The following essay discusses foundationalism in terms of justification (see BonJour (1985) for a defense of the claim that knowledge requires justification).  Where the distinction between justification and knowledge is relevant (for example, weak foundationalism), this article will observe it.

What is it for a belief to be justified?  Alvin Plantinga (1993) observes that the notion of justification is heavily steeped in deontological terms, terms like rightness, obligation, and duty.  A belief is justified for a person if and only if the person is right to believe it or the subject has fulfilled her intellectual duties relating to the belief.  Laurence BonJour (1985) presents a slightly different take on the concept of justification stating that it is “roughly that of a reason or warrant of some kind meeting some appropriate standard” (pp. 5-6).  This ‘appropriate standard’ conception of justification permits a wider understanding of the concept of justification.  BonJour, for instance, takes the distinguishing characteristic of justification to be “its essential or internal relation to the cognitive goal of truth” (p. 8).  Most accounts of justification assume some form of epistemic internalism.  Roughly speaking, this is the view that a belief’s justification does not require that it meets some condition external to a subject’s perspective, conditions such as being true, being produced by a reliable process, or being caused by the corresponding fact (see entry on internalism and externalism in epistemology).  All the relevant conditions for justification are ‘internal’ to a subject’s perspective.  These conditions vary from facts about a subject’s occurrent beliefs and experiences to facts about a subject’s occurrent and stored beliefs and experiences and further to facts simply about a subject’s mind, where this may include information that, in some sense or other, a subject has difficulty bringing to explicit consciousness.  Although some externalists offer accounts of justification (see Goldman (1979) & Bergmann (2006)), this article assumes that justification is internalistic.  Externalists have a much easier time addressing concerns over foundationalism.  It is a common judgment that the foundationalist / coherentist debate takes place within the backdrop of internalism (see BonJour (1999)).

2. Arguments for Foundationalism

This section discusses prominent arguments for a general type of foundationalism.  Section 4, on varieties of foundationalism, discusses more specific arguments aimed to defend a particular species of foundationalism.

a. The Regress Argument

The epistemic regress problem has a long history.  Aristotle used the regress argument to prove that justification requires basic beliefs, beliefs that are not supported by any other beliefs but are able to support further beliefs (see Aristotle’s Posterior Analytics I.3:5-23).  The regress problem was prominent in the ancient sources on skepticism, especially Sextus Empiricus’s Outlines of Pyrrhonism and Diogenes Laertius’s “The Life of Pyrrho” in his book The Lives and Opinions of Eminent Philosophers.  In the 20th century the regress problem has received new life in the development of the coherentist and infinitist options (see BonJour (1985) and Klein (1999), respectively).

To appreciate the regress problem begin with the thought that the best way to have a good reason for some claim is to have an argument from which the claim follows.  Thus one possesses good reason to believe p when it follows from the premises q and r.  But then we must inquire about the justification for believing the premises.  Does one have a good argument for the premises?  Suppose one does.  Then we can inquire about the justification for those premises.  Does one have an argument for those claims?  If not, then it appears one lacks a good reason for the original claim because the original claim is ultimately based on claims for which no reason is offered.  If one does have an argument for those premises then either one will continue to trace out the chain of arguments to some premises for which no further reason is offered or one will trace out the chain of arguments until one loops back to the original claims or one will continue to trace out the arguments without end.  We can then begin to see the significance of the regress problem: is it arguments all the way down?  Does one eventually come back to premises that appeared in earlier arguments or does one eventually come to some ultimate premises, premises that support other claims but do not themselves require any additional support?
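
The three ways a chain of reasons can turn out may be easier to keep apart with a toy model. The following is a minimal sketch, not a piece of epistemology: the function name `trace_regress`, the helper `lookup`, the example tables, and the simplification of following only the first reason offered at each step are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def trace_regress(claim: str,
                  reasons_for: Callable[[str], List[str]],
                  max_steps: int = 50) -> str:
    """Follow one chain of reasons from `claim` and report how it ends.

    Only the first reason offered at each step is followed (a simplification).
    """
    seen = [claim]
    current = claim
    for _ in range(max_steps):
        reasons = reasons_for(current)
        if not reasons:
            # A premise for which no further reason is offered.
            return "stops at " + current
        current = reasons[0]
        if current in seen:
            # The chain has returned to a claim it already passed through.
            return "loops back to " + current
        seen.append(current)
    # A finite trace cannot show a chain is endless; it can only fail to end.
    return "no end found within %d steps" % max_steps


def lookup(table: Dict[str, List[str]]) -> Callable[[str], List[str]]:
    """Turn a finite table of reasons into a `reasons_for` function."""
    return lambda claim: table.get(claim, [])


# Hypothetical toy chains: p rests on q and r, q rests on s, and so on.
stopping = {"p": ["q", "r"], "q": ["s"], "s": []}
circular = {"p": ["q"], "q": ["s"], "s": ["p"]}


def unending(claim: str) -> List[str]:
    """Each claim n0, n1, n2, ... is supported by the next one, without end."""
    return ["n%d" % (int(claim[1:]) + 1)]


print(trace_regress("p", lookup(stopping)))
print(trace_regress("p", lookup(circular)))
print(trace_regress("n0", unending))
```

The model is silent on the philosophical question, namely whether a chain that stops, loops, or never ends can confer justification; it only distinguishes the three structures.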

Skepticism aside, the options in the regress problem are known as foundationalism, coherentism, and infinitism.  Foundationalists maintain that there are some ultimate premises, premises that provide good reasons for other claims but themselves do not require additional reasons.  These ultimate premises are the proper stopping points in the regress argument.  Foundationalists hold that the other options for ending the regress are deeply problematic and that consequently there must be basic beliefs.

Coherentists and infinitists deny that there are any ultimate premises.  A simple form of coherentism holds that the arguments for any claim will eventually loop back to the original claim itself.  As long as the circle of justifications is large enough it is rationally acceptable.  After all, every claim is supported by some other claim and, arguably, the claims fit together in such a way as to provide an explanation of their truth (see Lehrer (1997), Chs 1 & 2).

Infinitists think that both the foundationalist and coherentist options are epistemically objectionable.  Infinitists (as well as coherentists) claim that the foundationalist options land in arbitrary premises, premises that are alleged to support other claims but themselves lack reasons.  Against the coherentist, infinitists claim that coherentism simply endorses circular reasoning: no matter how big the circle, circular arguments do not establish that the original claim is true.  Positively, infinitists maintain that possessing a good reason for a claim requires that it be supported by an infinite string of non-repeating reasons (see Klein (1999)).

Foundationalists use the regress argument to set up the alternative epistemological positions and then proceed to knock down these positions.  Foundationalists argue against infinitism that we never possess an infinite chain of non-repeating reasons.  At best when we consider the justification for some claim we are able to carry this justification out several steps but we never come close to anything resembling an unending chain of justifications.  For this criticism and others of infinitism see Fumerton (1998).

Against the coherentist the foundationalist agrees with the infinitist’s criticism mentioned above that circular reasoning never justifies anything.  If p is used to support q then q itself cannot be used in defense of p no matter how many intermediate steps there are between q and p.  This verdict against simple coherentism is strong, but the foundationalist strategy is complicated by the fact that it is hard to find an actual coherentist who endorses circular reasoning (though see Lehrer (1997) Ch 1 and 2 for remarks about the circular nature of explanation).  Coherentists, rather, identify the assumption of linear inference in the regress argument and replace it with a stress on the holistic character of justification (see BonJour (1985)).  The assumption of linear inference in the regress argument is clearly seen above in the idea that the regress traces out arguments for some claim, where the premises of those arguments are known or justifiably believed prior to the conclusion being known or justifiably believed.  The form of coherentism that rejects this assumption in the regress argument is known as holistic coherentism.

Foundationalist arguments against holistic coherentism must proceed with more care.  Because holistic coherentists disavow circular reasoning and stress the mistaken role of linear inference in the regress argument, foundationalists must supply a different argument against this option.  A standard argument against holistic coherentism is that unless the data used for coherence reasoning has some initial justification it is impossible for coherence reasoning to provide justification.  This problem affected Laurence BonJour’s attempt to defend coherentism (see BonJour (1985), pp. 102-3).  BonJour argued that coherence among one’s beliefs provided excellent reason to think that those beliefs were true.  But BonJour realized that he needed an account of how one was justified in believing that one had certain beliefs, i.e., what justified one in thinking that one did indeed hold the system of beliefs one takes oneself to believe.  BonJour quickly recognized that coherence couldn’t provide justification for this belief but it wasn’t until later in his career that he deemed this problem insuperable for a pure coherentist account (see BonJour (1997) for details).

The regress problem provides a powerful argument for foundationalism.  The regress argument, though, does not resolve particular questions about foundationalism.  The regress provides little guidance about the nature of basic beliefs or the correct theory of inferential support.  As we just observed with the discussion of holistic coherentism, considerations from the regress argument show, minimally, that the data used for coherence reasoning must have some initial presumption in its favor.  This form of foundationalism may be far from the initial hope of a rational reconstruction of common sense.  Such a reconstruction would amount to setting out in clear order the arguments for various commonsense claims (for example, I have hands, there is a material world, I have existed for more than five minutes, etc.) in a way that exhibits the ultimate basis for our view of things.  We shall consider the issues relating to varieties of foundationalist views below.

b. Natural Judgment about Cases

Another powerful consideration for foundationalism is our natural judgment about particular cases.  It seems evident that some beliefs are properly basic.  Leibniz, for instance, gives several examples of claims that don’t “involve any work of proving” and that “the mind perceives as immediately as the eye sees light” (see New Essays, IV, chapter 2, 1).  Leibniz mentions the following examples:

White is not black.

A circle is not a triangle.

Three is one and two.

Other philosophers (for example, C.I. Lewis, Roderick Chisholm, and Richard Fumerton) have found examples of such propositions in appearance states (traditionally, referred to as the given).  For instance, it may not be evident that there is a red circle before one because one may be in a misleading situation (for example, a red light shining on a white circle).  However, if one carefully considers the matter one may be convinced that something appears red.  Foundationalists stress that it is difficult to see what one could offer as a further justification for the claim about how things seem to one.  In short, truths about one’s appearance states are excellent candidates for basic beliefs.

As we shall see below a feature of this appeal to natural judgment is that it can support strong forms of foundationalism.  Richard Fumerton maintains that for some cases, for example, pain states, one’s belief can reach the highest level of philosophical assurance (see Fumerton (2006)).  Other philosophers (for example, James Pryor (2000)) maintain that some ordinary propositions, such as I have hands, are foundational.

3. Arguments Against Foundationalism

This section examines two general arguments against foundationalism.  Arguments against specific incarnations of foundationalism are considered in section 4.

a. The Problem of Arbitrariness

As noted above the regress argument figures prominently in arguing for foundationalism.  The regress argument supports the conclusion that some beliefs must be justified independently of receiving warrant from other beliefs.  However, some philosophers judge that this claim amounts to accepting some beliefs as true for no reason at all, that is, epistemically arbitrary beliefs.  This objection has significant bite against a doxastic form of foundationalism (the language of ‘doxastic’ comes from the Greek word ‘doxa’ meaning belief).  Doxastic foundationalism is the view that the justification of one’s beliefs is exclusively a matter of what other beliefs one holds.  Regarding the basic beliefs, a doxastic foundationalist holds that these beliefs are ‘self-justified’ (see Pollock & Cruz (1999), 22-23).  The contents of the basic beliefs are typically perceptual reports, but importantly a doxastic foundationalist does not conceive of one’s corresponding perceptual state as a reason for the belief.  Doxastic foundationalists hold that one is justified in accepting a perceptual report simply because one has the belief.  However, given the fallibility of perceptual reports, it is epistemically arbitrary to accept a perceptual report for no reason at all.

The arbitrariness objection against non-doxastic theories must proceed with more care.  A non-doxastic form of foundationalism denies that justification is exclusively a matter of relations between one’s beliefs.  Consider a non-doxastic foundationalist who attempts to stop the regress with non-doxastic states like experiences.  This foundationalist claims that, for example, the belief that there is a red disk before one is properly basic.  This belief is not justified on the basis of any other beliefs but instead justified by the character of one’s sense experience.  Because one can tell by reflection alone that one’s experience has a certain character, the experience itself provides one with an excellent reason for the belief.  The critic of non-doxastic foundationalism argues that stopping with this experience is arbitrary.  After all, there are scenarios in which this experience is misleading.  If, for example, the disk is white but illuminated with red light then one’s experience will mislead one to think that the disk is really red.  Unless one has a reason to think that these scenarios fail to obtain, it is improper to stop the regress of reasons here.

One foundationalist solution to the arbitrariness problem is to move to epistemically certain foundations.  Epistemically certain foundations are beliefs that cannot be misleading and so cannot provide a foothold for arbitrariness concerns.  If, for instance, one’s experience is of a red disk and one believes just that one’s experience has this character, it is difficult to see how one’s belief could be mistaken in this specific context.  Consequently, it is hard to make sense of how one’s belief about the character of one’s experience could be epistemically arbitrary.  In general, many foundationalists want to resist this move.  First, relative to the large number of beliefs we have, there are few epistemically certain beliefs.  Second, even if one locates a few epistemically certain beliefs, it is very difficult to reconstruct our common-sense view of the world from those beliefs.  If the ultimate premises of one’s view include only beliefs about the current character of one’s sense experience, it is nearly impossible to figure out how to justify beliefs about the external world or the past.

Another foundationalist response to the arbitrariness argument is to note that it is merely required that a properly basic belief possess some feature in virtue of which the belief is likely to be true.  It is not required that a subject believe her belief possesses that feature.  This response has the virtue of allowing for modest forms of foundationalism in which the basic beliefs are less than certain.  Critics of foundationalism continue to insist that unless the subject is aware that the belief possesses this feature, her belief is an improper stopping point in the regress of reasons.  For a defense of the arbitrariness objection against foundationalism see Klein (1999) & (2004), and for responses to Klein see Bergmann (2004), Howard-Snyder & Coffman (2006), Howard-Snyder (2005), and Huemer (2003).

b. The Sellarsian Dilemma

The Sellarsian dilemma was first formulated in Wilfrid Sellars’s rich, but difficult, essay “Empiricism and the Philosophy of Mind.”  Sellars’s main goal in this essay is to undermine the entire framework of givenness ((1963), p. 128).  Talk of ‘the given’ was prevalent in earlier forms of foundationalism (see, for example, C.I. Lewis (1929), Ch 2).  The phrase ‘the given’ refers to elements of experience that are putatively immediately known in experience.  For instance, if one looks at a verdant golf course the sensation green is alleged to be given in experience. In a Cartesian moment one may doubt whether or not one is actually perceiving a golf course but, the claim is, one cannot rationally doubt that there is a green sensation present. Strong foundationalists appeal to the given to ground empirical knowledge.  In “Empiricism and the Philosophy of Mind” Sellars argues that the idea of the given is a myth.

The details of Sellars’s actual argument are difficult to decipher.  The most promising reconstruction of Sellars’s argument occurs in chapter 4 of BonJour’s (1985).  BonJour formulates the dilemma using the notion of ‘assertive representational content’.  Representational content is the kind of content possessed by beliefs, hopes, and fears.  A belief, a hope, or a fear could be about the same thing; one could believe that it is raining, hope that it is raining, or fear that it is raining.  These states all have in common the same representational content.  Assertive representational content is content that is presented as being true but may, in fact, be false.  A good case of assertive content comes from the Müller-Lyer illusion.  In this well-known experiment a subject experiences two vertical lines as being unequal in length even though they have the same length.  The subject’s experience presents as true the content that these lines are unequal.

Given the notion of assertive representational content BonJour reformulates the Sellarsian dilemma: either experience has assertive representational content or not.  If experience has assertive representational content then one needs an additional reason to think that the content is correct.  If, however, experience lacks this content then experience cannot provide a reason for thinking that some proposition is true.  The dilemma focuses on non-doxastic foundationalism and is used to argue that, whichever way the view is filled out, it cannot make good on the intuition that experience is a proper foundation for justification.

Let us examine each option of the dilemma, starting with the second option.  A defense of this option observes that it is difficult to understand how experience could provide a good reason for believing some claim if it failed to have representational content.  Think of the olfactory experience associated with a field of flowers in full bloom.  Apart from a formed association between that experience and its cause, it is difficult to understand how that experience has representational content.  In other words, the experience lacks any content; it makes no claim that the world is one way rather than another.  However, if that is right, how could that experience provide any reason for believing that the world is one way rather than another?  If the experience itself is purely qualitative then it cannot provide a reason to believe that some proposition is true.  In short, there is a strong judgment that apart from the representational content of experience, experience is powerless to provide reasons.

A defense of the first option of the dilemma takes us back to issues raised by the arbitrariness objection.  If experience does have assertive representational content then that content can be true or false.  If the content is possibly false, the experience is not a proper stopping point in the regress of reasons.  The whole idea behind the appeal to the given was to stop the regress of reasons in a state that did not require further justification because it was not the sort of thing that needed justification.  If experience, like belief, has representational content then there is no good reason to stop the regress of reasons with experience rather than belief.  In brief, if experience functions as a reason in virtue of its assertive representational content then there is nothing special about experience as opposed to belief in its ability to provide reasons.  Since the arbitrariness objection shows that belief is not a proper stopping point in the regress, the Sellarsian dilemma shows that experience is not a proper stopping point either.

Probably the best foundationalist response to the Sellarsian dilemma is to argue that the first option of the dilemma is mistaken; experience has assertive propositional content and can still provide a regress stopping reason to believe that some claim is true.  There are broadly two kinds of responses here depending on whether one thinks that the content of experience could be false.  On one view, experience carries a content that may be false, but this experiential content nonetheless provides a basic reason for thinking that the content is true.  For instance, it may perceptually seem to one that there is a coffee mug on the right corner of the desk.  This content may be false but in virtue of its being presented as true in experience one has a basic reason for thinking that it is true (see Pryor (2000) & Huemer (2001) for developments of this view).  The other view one might take is that experiential content (at least the kind that provides a basic reason) cannot be false.  On this view the kind of content that experience provides for a basic reason is something like this: it perceptually seems that there is a red disk before me.  Laurence BonJour (in BonJour & Sosa (2003)) develops a view like this.  On his view, one has a built-in constitutive awareness of experiential content, and in virtue of that awareness of content one has a basic reason to believe that the content is true.  For a good criticism of BonJour’s strategy, see Bergmann (2006), Chapter 2.  For a different, externalist response to the dilemma see Jack Lyons (2008).

See the Encyclopedia article “Coherentism” for more criticism of foundationalism.

4. Types of Foundationalist Views

This section surveys varieties of foundationalist views.  As remarked above foundationalists have two main projects: providing a suitable theory of noninferential justification and providing an adequate theory of proper inference.  We will examine three views on non-inferential justification and three views on inferential justification.

a. Theories of Noninferential Justification

An adequate theory of noninferential justification is essential for foundationalism.  Foundationalist views differ on the nature of noninferential justification.  We can distinguish three types of foundationalist views corresponding to the strength of justification possessed by the basic beliefs: strong, modest, and weak foundationalism.  In the following we shall examine these three views and the arguments for and against them.

i. Strong Foundationalism

Strong foundationalists hold that the properly basic beliefs are epistemically exalted in some interesting sense.  In addition to basic beliefs possessing the kind of justification necessary for knowledge (let us refer to this as “knowledge level justification”) strong foundationalists claim the properly basic beliefs are infallible, indubitable, or incorrigible.  Infallible beliefs are not possibly false.  Indubitable beliefs are not possible to doubt even though the content may be false, and incorrigible beliefs cannot be undermined by further information.  The focus on these exalted epistemic properties grows out of Descartes’ method of doubt.  Descartes aimed to locate secure foundations for knowledge and dismissed any claims that were fallible, dubitable, or corrigible.  Thus, Descartes sought the foundations of knowledge in restricted mental states like I am thinking.  Before we examine arguments against strong foundationalism let us investigate some arguments in favor of it.

Probably the most widespread argument for strong foundationalism is the need for philosophical assurance concerning the truth of one’s beliefs (see Fumerton (2006)).  If one adopts the philosophical undertaking to trace out the ultimate reasons for one’s view it can seem particularly remiss to stop this philosophical quest with fallible, dubitable, or corrigible reasons.  As Descartes realized, if the possibility that one is dreaming is compatible with one’s evidence then that evidence is not an adequate ground for a philosophically satisfying reconstruction of knowledge.  Consequently, if a philosophically satisfying perspective of knowledge is to be found it will be located in foundations that are immune from doubt.

Another argument for strong foundationalism is C.I. Lewis’s contention that probability must be grounded in certainty (see Lewis (1952); also see Pastin (1975a) for a response to Lewis’s argument).  Lewis’s argument appeals explicitly to the probability calculus but we can restate the driving intuition apart from utilizing any formal machinery. Lewis reasoned that if a claim is uncertain then it is rationally acceptable only given further information.  If that further information is uncertain then it is acceptable only given additional information.  If this regress continues without ever coming to a certainty then Lewis conjectures that the original claim is not rationally acceptable.

We can get a sense of Lewis’s intuition by considering a conspiracy theorist that has a defense for every claim in his convoluted chain of reasoning.  We might think that, in general, the theorist is right about the conditional claims—if this is true then that is probably correct—but just plain wrong that the entire chain of arguments supports the conspiracy theory.  We correctly realize that the longer the chain of reasoning the less likely the conclusion is true.  The chance of error grows with the amount of information.  Lewis’s argument takes this intuition to its limit: unless uncertainties are grounded in certainties no claim is ever rationally acceptable.
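
The intuition that long chains of merely probable steps weaken the conclusion can be illustrated with simple arithmetic.  The following is a minimal sketch, not Lewis’s own formal argument: it assumes, purely for illustration, that each step in the chain is independent of the others and holds with probability 0.9.  The function name and the numbers are hypothetical.

```python
def chain_probability(step_probabilities):
    """Probability that every step in a chain holds.

    Assumes (for illustration only) that the steps are independent,
    so the joint probability is the product of the step probabilities.
    """
    result = 1.0
    for p in step_probabilities:
        result *= p
    return result


# Each conditional claim is individually plausible (0.9), yet the
# probability that the whole chain holds shrinks as the chain grows.
for length in (1, 5, 10, 20):
    print(length, chain_probability([0.9] * length))
```

On these assumptions a chain of ten steps, each 90% likely, holds with a probability of roughly 0.35, and a chain of twenty steps with a probability of roughly 0.12.  The sketch captures only the worry about accumulating error; Lewis’s stronger claim, that nothing is rationally acceptable unless the chain ends in a certainty, is a further step.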

Let us examine several arguments against strong foundationalism.  The most repeated argument against strong foundationalism is that its foundations are inadequate for a philosophical reconstruction of knowledge.  We take ourselves to know much about the world around us from mundane facts about our immediate surroundings to more exotic facts about the far reaches of the universe.  Yet if the basic material for this reconstruction is restricted to facts about an individual’s own mind it is nearly impossible to figure out how we can get back to our ordinary picture of the world.  In this connection strong foundationalists face an inherent tension between the quest for epistemic security and the hope for suitable content to reconstruct commonsense.  Few strong foundationalists have been able to find a suitable balance between these competing demands.  Some philosophers with a more metaphysical bent aimed to reduce each statement about the material world to a logical construction of statements about an individual’s own sense experience.  This project is known as phenomenalism.  The phenomenalist’s guiding idea was that statements about the physical world were really complex statements about sensations.  If this guiding idea could be worked out then strong foundationalists would have a clear conception of how the “commonsense” picture of the world could be justified.  However, this guiding idea could never be worked out.  See, for instance, Roderick Chisholm’s (1948) article.

Another argument against strong foundationalism is David Armstrong’s ‘distinct existence’ argument ((1968), 106-7).  Armstrong argues that there is a difference between an awareness of X and X, where X is some mental state.  For instance, there is a difference between being in pain and awareness of being in pain.  As long as awareness of X is distinct from X, Armstrong argues that it is possible for one to seemingly be aware of X without X actually occurring.  For instance, an intense pain that gradually fades away can lead to a moment in which one has a false awareness of being in pain.  Consequently, the thought that one can enjoy an infallible awareness of some mental state is mistaken.

A recent argument against strong foundationalism is Timothy Williamson’s anti-luminosity argument (see Williamson (2000)).  Williamson does not talk about foundationalism but talks rather in terms of the ongoing temptation in philosophy to postulate a realm of luminous truths, truths that shine so brightly they are always open to our view if we carefully consider the matter.  Even though Williamson doesn’t mention foundationalism his argument clearly applies to the strong foundationalist.  Williamson’s actual argument is intricate and we cannot go into it in much detail.  The basic idea behind Williamson’s argument is that appearance states (for example, it seems as if there is a red item before you) permit of a range of similar cases.  Think of color samples.  There is a string of color samples from red to orange in which each shade is very similar to the next.  If appearance states genuinely provided certainty, indubitability, or the like then one should be able to always tell what state one was in.  But there are cases that are so similar that one might make a mistake.  Thus, because of the fact that appearance states ebb and flow, they cannot provide certainty, indubitability or the like.  There is a burgeoning discussion of the anti-luminosity argument; see Fumerton (2009) for a strong foundationalist response and Meeker & Poston (2010) for a recent discussion and references.

ii. Modest Foundationalism

Prior to 1975 foundationalism was largely identified with strong foundationalism.  Critics of foundationalism attacked the claims that basic beliefs are infallible, incorrigible, or indubitable.  However, around this time there was a growing recognition that foundationalism was compatible with basic beliefs that lacked these epistemically exalted properties.  William Alston (1976a; 1976b), C.F. Delaney (1976), and Mark Pastin (1975a; 1975b) all argued that a foundationalist epistemology merely required that the basic beliefs have a level of positive epistemic status independent of warranting relations from other beliefs. In light of this weaker form of foundationalism the attacks against infallibility, incorrigibility, or indubitability did not touch the core of a foundationalist epistemology.

William Alston probably did the most to rehabilitate foundationalism.  Alston provides several interrelated distinctions that illustrate the limited appeal of certain arguments against strong foundationalism and also display the attractiveness of modest foundationalism.  The first distinction Alston drew was between epistemic beliefs and non-epistemic beliefs (see 1976a).  Epistemic beliefs are beliefs whose content contains an epistemic concept such as knowledge or justification, whereas a non-epistemic belief does not contain an epistemic concept.  The belief that there is a red circle before me is not an epistemic belief because its content does not contain any epistemic concepts.  However, the belief that I am justified in believing that there is a red circle before me is an epistemic belief on account of the epistemic concept justified figuring in its content.  Alston observes that prominent arguments against foundationalism tend to run together these two beliefs.  For instance, an argument against foundationalism might require that to be justified in believing that p one must justifiedly believe that I am justified in believing that p.  That is, the argument against foundationalism assumes that epistemic beliefs are required for the justification of non-epistemic beliefs.  As Alston sees it, once these two types of belief are clearly separated we should be suspicious of any such argument that requires epistemic beliefs for the justification of non-epistemic beliefs (for details see (1976a) and (1976b)).

A closely related distinction for Alston is the distinction between the state of being justified and the activity of exhibiting one’s justification.  Alston argues in a like manner that prominent objections to foundationalism conflate these two notions.  The state of being justified does not imply that one can exhibit one’s justification.  Reflection on actual examples supports Alston’s claim.  Grandma may be justified in believing that she has hands without being in a position to exhibit her justification.  Timmy is justified in believing that he has existed for more than five minutes but he can do very little to demonstrate his justification.  Therefore, arguments against foundationalism should not assume that justification requires the ability to exhibit one’s justification.

A final, closely allied, distinction is between a justification regress argument and a showing regress argument.  Alston argues that the standard regress argument is a regress of justification that points to the necessity of immediately justified beliefs.  This argument is distinct from a showing regress in which the aim is to demonstrate that one is justified in believing p.  This showing regress requires that one prove that one is justified in believing p for each belief one has.  Given Alston’s earlier distinctions this implies that one must have epistemic beliefs for each non-epistemic belief, and further it conflates the state of being justified with the activity of exhibiting one’s justification.

With these three distinctions in place and the further claim that immediately justified beliefs may be fallible, revisable, and dubitable Alston makes quick work of the standard objections to strong foundationalism.  The arguments against strong foundationalism fail to apply to modest foundationalism and further have no force against the claim that some beliefs have a strong presumption of truth.  Reflection on actual cases supports Alston’s claim.  Grandma’s belief that she has hands might be false and revised in light of future evidence.  Perhaps, Grandma has been fitted with a prosthetic device that looks and functions just like a normal hand.  Nonetheless when she looks and appears to see a hand, she is fully justified in believing that she has hands.

Alston’s discussion of modest foundationalism does not mention weaker forms of foundationalism.  Further Alston is not clear on the precise epistemic status of these foundations.  Alston describes the ‘minimal’ form of foundationalism as simply being committed to non-inferentially justified beliefs.  However, as we shall shortly see BonJour identifies a modest and weak form of foundationalism.  For purposes of terminological regimentation we shall take ‘modest’ foundationalism to be the claim that the basic beliefs possess knowledge-adequate justification even though these beliefs may be fallible, corrigible, or dubitable.  A corollary to modest foundationalism is the thesis that the basic beliefs can serve as premises for additional beliefs.  The picture then the modest foundationalist offers us is that of knowledge (and justification) as resting on a foundation of propositions whose positive epistemic status is sufficient to infer other beliefs but whose positive status may be undermined by further information.

A significant development in modest foundationalism is the rise of reformed epistemology.  Reformed epistemology is a view in the epistemology of religious belief, which holds that the belief that there is a God can be properly basic.  Alvin Plantinga (1983) develops this view.  Plantinga holds that an individual may rationally believe that there is a God even though the individual does not possess sufficient evidence to convince an agnostic.  Furthermore, the individual need not know how to respond to various objections to theism.  On Plantinga’s view as long as the belief is produced in the right way it is justified.  Plantinga has developed reformed epistemology in his (2000) volume.  Plantinga develops the view as a form of externalism that holds that the justification conferring factors for a belief may include external factors.

Modest foundationalism is not without its critics.  Some strong foundationalists argue that modest foundationalism is too modest to provide adequate foundations for empirical knowledge (see McGrew (2003)).  Timothy McGrew argues that empirical knowledge must be grounded in certainties.  McGrew deploys an argument similar to C.I. Lewis’s argument that probabilities require certainties.  McGrew argues that every statement that has less than unit probability is grounded in some other statement.  If the probability that it will rain today is .9 then there must be some additional information that one is taking into account to get this probability.  Consequently, if the alleged foundations are merely probable then they are really no foundations at all.  Modest foundationalists disagree.  They hold that some statements may have an intrinsic non-zero probability (see for instance Mark Pastin’s response to C.I. Lewis’s argument in Pastin (1975a)).

iii. Weak Foundationalism

Weak foundationalism is an interesting form of foundationalism.  Laurence BonJour mentions the view as a possible foundationalist view in his (1985) book The Structure of Empirical Knowledge.  According to BonJour the weak foundationalist holds that some non-inferential beliefs are minimally justified, where this justification is not strong enough to satisfy the justification condition on knowledge.  Further this justification is not strong enough to allow the individual beliefs to serve as premises to justify other beliefs (see BonJour (1985), 30).  However, because knowledge and inference are fundamental features of our epistemic practices, a natural corollary to weak foundationalism is that coherence among one’s beliefs is required for knowledge-adequate justification and also for one’s beliefs to function as premises for other beliefs.  Thus for the weak foundationalist, coherence has an ineliminable role for knowledge and inference.

This form of foundationalism is a significant departure from the natural stress foundationalists place on the regress argument.  Attention on the regress argument focuses one back to the ultimate beliefs of one’s view.  If these beliefs are insufficient to license inference to other beliefs it is difficult to make good sense of a reconstruction of knowledge.  At the very least the reconstruction will not proceed in a step by step manner in which one begins with a limited class of beliefs—the basic ones—and then moves to the non-basic ones.  If, in addition, coherence is required for the basic beliefs to serve as premises for other beliefs then this form of weak foundationalism looks very similar to refined forms of coherentism.

Some modest foundationalists maintain that weak foundationalism is inadequate.  James Van Cleve contends that weak foundationalism is inadequate to generate justification for one’s beliefs (Van Cleve (2005)).  Van Cleve presents two arguments for the claim that some beliefs must have a high intrinsic credibility (pp. 173-4).  First, while coherence can increase the justification for thinking that one’s ostensible recollections are correct, one must have significant justification for thinking that one has correctly identified one’s ostensible recollection.  That is to say, one must have more than weak justification for thinking one’s apparent memory does report that p, whether or not this apparent memory is true.  Apart from the thought that one has strong justification for believing that one’s ostensible memory is as one takes it to be, Van Cleve argues it is difficult to see how coherence could increase the justification for believing that those apparent memories are true.

The second argument Van Cleve offers comes from Bertrand Russell ((1948), p. 188).  Russell observes that one fact makes another probable or improbable only in relation to a law.  Therefore, for coherence among certain facts to make another fact probable, one must have sufficient justification for believing a law that connects the facts.  Van Cleve explains that we might not require a genuine law but rather an empirical generalization that connects the two facts.  Nonetheless Russell’s point is that for coherence to increase the probability of some claim we must have more than weak justification for believing some generalization.  The problem for the weak foundationalist is that our justification for believing an empirical generalization depends on memory.  Consequently, memory must supply the needed premise in a coherence argument and it can do this only if memory supplies more than weak justification.  In short, the coherence among ostensible memories increases justification only if we have more than weak justification for believing some generalization provided by memory.

b. Theories of Proper Inference

Much of the attention on foundationalism has focused on the nature and existence of basic beliefs.  Yet a crucial element of foundationalism is the nature of the inferential relations between basic beliefs and non-basic beliefs.  Foundationalists claim that all of one’s non-basic beliefs are justified ultimately by the basic beliefs, but how is this supposed to work?  What are the proper conditions for the justification of the non-basic beliefs?  The following discusses three approaches to inferential justification: deductivism, strict inductivism, and liberal inductivism.

i. Deductivism

Deductivists hold that proper philosophical method consists in the construction of deductively valid arguments whose premises are indubitable or self-evident (see remarks by Nozick (1981) and Lycan (1988)).  Deductivists travel down the regress in order to locate the epistemic atoms from which they attempt to reconstruct the rest of one’s knowledge by deductive inference.  Descartes’ epistemology is often aligned with deductivism.  Descartes locates the epistemically basic beliefs in beliefs about the ideas in one’s mind and then deduces from those ideas that a good God exists.  Then given that a good God exists, Descartes deduces further that the ideas in his mind must correspond to objects in reality.  Therefore, by a careful deductive method, Descartes aims to reconstruct our knowledge of the external world.

Another prominent example of deductivism comes from phenomenalism.  As mentioned earlier, phenomenalism is the attempt to analyze statements about physical objects in terms of statements about patterns of sense data.  Given this analysis, the phenomenalist can deduce our knowledge of the external world from knowledge of our own sensory states.  Whereas Descartes’ deductivism took a theological route through the existence of a good God, the phenomenalist eschews theology and attempts a deductive reconstruction by a metaphysical analysis of statements about the external world.  Though this project is a momentous failure, it illustrates a tendency in philosophy to grasp for certainty.

Contemporary philosophers dismiss deductivism as implausible.  Deductivism requires strong foundationalism because the ultimate premises must be infallible, indubitable, or incorrigible.  However, many philosophers judge that the regress stopping premises need not have these exalted properties. Surely, the thought continues, we know things like I have hands and the world has existed for more than five minutes? Additionally, if one restricts proper inference to deduction then one can never expand upon the information contained in the premises.  Deductive inference traces out logical implications of the information contained in the premises.  So if the basic premises are limited to facts about one’s sensory states then one can’t go ‘beyond’ those states to facts about the external world, the past, or the future.  To accommodate that knowledge we must expand either our premises or our conception of inference.  Either direction abandons the deductivist picture of proper philosophical method.

ii. Strict Inductivism

One response to the above challenge for deductivism is to move to modest foundationalism, which allows the basic premises to include beliefs about the external world or the past.  However, even this move is inadequate to account for all our knowledge.  In addition to knowing particular facts about the external world or the past we know some general truths about the world such as all crows are black.  It is implausible that this belief is properly basic.  Further, the belief that every observed and unobserved crow is black is not implied by any properly basic belief such as this crow is black.  In addition to moving away from a strong foundationalist theory of non-inferential justification, one must abandon deductivism.

To accommodate knowledge of general truths, philosophers must allow for other kinds of inference besides deductive inference.  The standard form of non-deductive inference is enumerative induction.  Enumerative induction works by listing (that is, enumerating) the observed instances and then concluding on the basis of a sufficient sample that all the relevant instances have the target property.  Suppose, for instance, one knows that 100 widgets from the Kenosha Widget Factory each have a small k printed on them and that one knows of no counterexamples to this.  Given this knowledge, one can infer by enumerative induction that every widget from the Kenosha Widget Factory has a small k printed on it.  Significantly, this inference is liable to mislead.  Perhaps the widgets one has examined are special in some way that is relevant to the small printed k.  For example, the widgets may come from an exclusive series of widgets made to celebrate Kafka’s birthday.  Even though the inference may mislead, it is still intuitively a good inference.  Given a sufficient sample size and no counterexamples, one may infer that the sample is representative of the whole.
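The widget case can be put as a minimal sketch in Python.  This is an illustration under stated assumptions and not a formal account of induction: the function name, the dictionary representation of widgets, and the threshold min_sample are hypothetical choices made here, since the text does not say how large a "sufficient" sample must be.  The sketch shows both features of the inference described above: it is licensed by a large, counterexample-free sample, and it can still mislead.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def enumerative_induction(
    sample: Iterable[T],
    has_property: Callable[[T], bool],
    min_sample: int = 30,  # hypothetical threshold for a "sufficient" sample
) -> bool:
    """Return True if 'all instances have the property' may be inferred.

    The inference is licensed only when the sample is large enough and
    contains no counterexample.  It does not guarantee that the
    generalization is true.
    """
    observed = list(sample)
    if len(observed) < min_sample:
        return False
    return all(has_property(item) for item in observed)


def has_k(widget: dict) -> bool:
    """Check whether a widget carries the small printed k."""
    return widget["mark"] == "k"


# 100 examined widgets, every one marked with a k.
observed = [{"id": i, "mark": "k"} for i in range(100)]

# The full output of the factory (assumed for illustration) includes one
# unexamined widget without the mark.
population = observed + [{"id": 100, "mark": "none"}]

licensed = enumerative_induction(observed, has_k)
actually_true = all(has_k(w) for w in population)
print(licensed, actually_true)  # prints: True False
```

The inference is licensed by the sample even though the generalization is false of the whole population, which is the sense in which a good inductive inference may nevertheless mislead.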

The importance of enumerative induction is that it allows one to expand one’s knowledge of the world beyond the foundations.  Moreover, enumerative induction is a form of linear inference.  The premises of the induction are known or justifiably believed prior to the conclusion being justifiably believed.  This suggests that enumerative induction is a natural development of the foundationalist conception of knowledge.  Knowledge rests on properly basic beliefs and those other beliefs that can be properly inferred from the basic beliefs by deduction and enumerative induction.

iii. Liberal Inductivism

Strict inductivism is motivated by the thought that we have some kind of inferential knowledge of the world that cannot be accommodated by deductive inference from epistemically basic beliefs.  A fairly recent debate has arisen over the merits of strict inductivism.  Some philosophers have argued that there are other forms of non-deductive inference that do not fit the model of enumerative induction.  C.S. Peirce describes a form of inference called “abduction” or “inference to the best explanation.”  This form of inference appeals to explanatory considerations to justify belief.  One infers, for example, that two students copied answers from a third because this is the best explanation of the available data—they each make the same mistakes and the two sat in view of the third.  Alternatively, in a more theoretical context, one infers that there are very small unobservable particles because this is the best explanation of Brownian motion.  Let us call ‘liberal inductivism’ any view that accepts the legitimacy of a form of inference to the best explanation that is distinct from enumerative induction.  For a defense of liberal inductivism see Gilbert Harman’s classic (1965) paper.  Harman defends a strong version of liberal inductivism according to which enumerative induction is just a disguised form of inference to the best explanation.

A crucial task for liberal inductivists is to clarify the criteria that are used to evaluate explanations.  What makes one hypothesis a better explanation than another?  A standard answer is that hypotheses are rated as to their simplicity, testability, scope, fruitfulness, and conservativeness.  The simplicity of a hypothesis is a matter of how many entities, properties, or laws it postulates.  The theory that the streets are wet because it rained last night is simpler than the theory that the streets are wet because there was a massive water balloon fight between the septuagenarians and octogenarians last night.  A hypothesis’s testability is a matter of its ability to be determined to be true or false.  Some hypotheses are more favorable because they can easily be put to the test and when they survive the test, they receive confirmation.  The scope of a hypothesis is a matter of how much data the hypothesis covers.  If two competing hypotheses both entail the fall of the American dollar but only one of them also entails the fact that the Yen rose, the hypothesis that explains this other fact has greater scope.  The fruitfulness of a hypothesis is a matter of how well it can be implemented for new research projects.  Darwin’s theory of the origin of species has tremendous fruitfulness because, for one, it opened up the study of molecular genetics.  Finally, the conservativeness of a hypothesis is a matter of its fit with our previously accepted theories and beliefs.

The liberal inductivist points to the alleged fact that many of our commonsense judgments about what exists are guided by inference to the best explanation.  If, for instance, we hear the scratching in the walls and witness the disappearance of cheese, we infer that there are mice in the wainscoting.  As the liberal inductivist sees it, this amounts to a primitive use of inference to the best explanation.  The mice hypothesis is relatively simple, testable, and conservative.

The epistemological payout for accepting the legitimacy of inference to the best explanation is significant.  This form of inference is ideally suited for dealing with under-determination cases, that is, cases in which one’s evidence for a hypothesis is compatible with its falsity.  For instance, the evidence we possess for believing that the story of general relativity is correct is compatible with the falsity of that theory.  Nonetheless, we judge that we are rational in believing that general relativity is true based on the available evidence.  The theory of general relativity is the best available explanation of the data.  Similarly, epistemological under-determination arguments focus on the fact that the perceptual data we possess is compatible with the falsity of our commonsense beliefs.  If a brain-in-a-vat scenario obtained then one would have all the same sensory states and still believe that, for example, one was seated at a desk.  Nevertheless, the truth of our commonsense beliefs is the best available explanation for the data of sense.  Therefore, our commonsense beliefs meet the justification condition for knowledge.  See Jonathan Vogel (1990) for a response to skepticism along these lines and see Richard Fumerton (1992) for a contrasting perspective.

Liberal inductivism is not without its detractors.  Richard Fumerton argues that every acceptable inductive inference is either a straightforward case of induction or a combination of straightforward induction and deduction.  Fumerton focuses on paradigm cases of alleged inference to the best explanation and argues that these cases are enthymemes (that is, arguments with suppressed premises).  He considers a case in which someone infers that a person walked recently on the beach from the evidence that there are footprints on the beach and that if a person walked recently on the beach there would be footprints on the beach.  Fumerton observes that this inference fits the standard pattern of inference to the best explanation.  However, he then argues that the acceptability of this inference depends on our justification for believing that in the vast majority of cases footprints are produced by people.  Fumerton thus claims that this paradigmatic case of inference to the best explanation is really a disguised form of inference to a particular: the vast majority of footprints are produced by persons; there are footprints on the beach; therefore, a person walked on the beach recently.  The debate over the nature and legitimacy of inference to the best explanation is an active and exciting area of research.  For an excellent discussion and defense of inference to the best explanation see Lipton (2004).

iv. A Theory of Inference and A Theory of Concepts

There are non-trivial connections between a foundationalist theory of inference and a theory of concepts.  This is one of the points at which epistemology meets the philosophy of mind.  Both deductivists and strict inductivists tend to accept a thesis about the origin of our concepts.  They both tend to accept the thesis of concept empiricism, according to which all of our concepts derive from experience.  Following Locke and Hume, concept empiricists stress that we cannot make sense of any ideas that are not based in experience.  Some concept empiricists are strong foundationalists, in which case they work with a very limited range of sensory concepts (for example, C.I. Lewis); others are modest foundationalists, in which case they take concepts of the external world as disclosed in experience (that is, direct realists).  Concept empiricists are opposed to inference to the best explanation because a characteristic feature of inference to the best explanation is inference to an unobservable.  As the concept empiricist sees it, this is illegitimate because we lack the ability to think of genuine non-observables.  For a sophisticated development of this view see Van Fraassen (1980).

Concept rationalists, by contrast, allow that we possess concepts that are not disclosed in experience.  Some concept rationalists, like Descartes, held that some concepts are innate, such as the concepts God, substance, or I.  Other concept rationalists view inference to the best explanation as a way of forming new concepts.  In general, concept rationalists do not limit the legitimate forms of inference to deduction and enumerative induction.  For a discussion of concept empiricism and rationalism in connection with foundationalism see Timothy McGrew (2003).

5. Conclusion

Foundationalism is a multifaceted doctrine.  A well-worked-out foundationalist view needs to combine naturally a theory of non-inferential justification with a view of the nature of inference.  The nature and legitimacy of non-deductive inference is a relatively recent topic and there is hope that significant progress will be made on this score.  Moreover, given the continued interest in the regress problem, foundationalism proves to be of perennial interest.  The issues that drive research on foundationalism are fundamental epistemic questions about the structure and legitimacy of our view of the world.

6. References and Further Reading

  • Alston, W. 1976a.  “Two Types of Foundationalism.” The Journal of Philosophy 73, 165-185.
  • Alston, W. 1976b.  “Has foundationalism been refuted?” Philosophical Studies 29, 287-305.
  • Armstrong, D.M. 1968. A Materialist Theory of Mind.  New York: Routledge.
  • Audi, R.  The Structure of Justification.  New York: Cambridge.
  • Bergmann, Michael. 2004.  “What’s not wrong with foundationalism,” Philosophy and Phenomenological Research LXVIII, 161-165.
  • Bergmann, Michael. 2006. Justification without Awareness.  New York: Oxford.
  • BonJour, L. 1985.  The Structure of Empirical Knowledge.  Cambridge, MA: Harvard University Press.
  • BonJour, L.  1997.  “Haack on Experience and Justification.”  Synthese 112:1, 13-23.
  • BonJour, L. 1999.  “The Dialectic of Foundationalism and Coherentism.” In The Blackwell Guide to Epistemology eds. John Greco and Ernest Sosa.  Malden, MA: Blackwell, 117-142.
  • BonJour, L and Sosa, E. 2003.  Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden, MA: Blackwell.
  • Chisholm, R. 1948. “The Problem of Empiricism,” The Journal of Philosophy 45, 512-517.
  • Delaney, C.F. 1976. “Foundations of Empirical Knowledge – Again,” New Scholasticism L, 1-19.
  • Fumerton, R. 1980.  “Induction and Reasoning to the Best Explanation.”  Philosophy of Science 47, 589-600.
  • Fumerton, R. 1992.  “Skepticism and Reasoning to the Best Explanation.”  Philosophical Issues 2, 149-169.
  • Fumerton, R. 1998.  “Replies to My Three Critics.” Philosophy and Phenomenological Research 58, 927-937.
  • Fumerton, R.  2006. “Epistemic Internalism, Philosophical Assurance and the Skeptical Predicament,” in Knowledge and Reality, eds. Crisp, Davidson, and Laan. Dordrecht: Kluwer, 179-191.
  • Fumerton, R. 2009. “Luminous enough for a cognitive home.”  Philosophical Studies 142, 67-76.
  • Goldman, A. 1979.  “What is Justified Belief?” in Justification and Knowledge. Ed. George Pappas.  Dordrecht: D. Reidel, 1-23.
  • Haack, S. 1993.  Evidence and Inquiry: Towards Reconstruction in Epistemology. Malden, MA: Blackwell.
  • Harman, Gilbert. 1965. “Inference to the Best Explanation.”  The Philosophical Review 74, 88-95.
  • Howard-Snyder, Daniel. 2005.  “Foundationalism and Arbitrariness,” Pacific Philosophical Quarterly 86, 18-24.
  • Howard-Snyder, D & Coffman, E.J. 2006 “Three Arguments Against Foundationalism: Arbitrariness, Epistemic Regress, and Existential Support,” Canadian Journal of Philosophy 36:4, 535-564.
  • Huemer, Michael. 2003.  “Arbitrary Foundations?” The Philosophical Forum XXXIV, 141-152.
  • Klein, Peter.  1999.  “Human knowledge and the regress of reasons,” Philosophical Perspectives 13, 297-325.
  • Klein, Peter. 2004. “What is wrong with foundationalism is that it cannot solve the epistemic regress problem,”  Philosophy and Phenomenological Research LXVIII, 166-171.
  • Lehrer, K. 1997.  Self-Trust.  New York: Oxford.
  • Lewis, C.I. 1929.  Mind and the World Order.  New York: Dover Publications.
  • Lewis, C.I.  1952.  “The Given Element in Empirical Knowledge.” The Philosophical Review 61, 168-175.
  • Lipton, P. 2004.  Inference to the Best Explanation 2nd edition.  New York: Routledge.
  • Lycan, W. 1988.  Judgment and Justification.  New York: Cambridge.
  • Lyons, J. 2008.  “Evidence, Experience, and Externalism,” Australasian Journal of Philosophy 86, 461-479.
  • McGrew, T. 2003. “A Defense of Classical Foundationalism,” in The Theory of Knowledge, ed. Louis Pojman, Belmont, CA: Wadsworth, pp. 194-206.
  • Meeker, K & Poston, T.  2010.  “Skeptics without Borders.”  American Philosophical Quarterly 47:3, 223-237.
  • Neurath, Otto.  1959.  “Protocol Sentences.” In Logical Positivism ed. A.J. Ayer.  New York: Free Press, 199-208.
  • Nozick, R. 1981.  Philosophical Explanations.  Cambridge, MA: Harvard University Press.
  • Pastin, M. 1975a. “C.I. Lewis’s Radical Foundationalism” Nous 9, 407-420.
  • Pastin, M. 1975b. “Modest Foundationalism and Self-Warrant,” American Philosophical Quarterly 4, 141-149.
  • Plantinga, A. 1983.  “Reason and Belief in God,” in Faith and Rationality. Eds. Alvin Plantinga and Nicholas Wolterstorff.  Notre Dame, IN: University of Notre Dame Press.
  • Plantinga, A. 1993.  Warrant: The Current Debate.  New York: Oxford.
  • Plantinga, A. 2000.  Warranted Christian Belief.  New York: Oxford.
  • Pollock, J and Cruz, J. 1999.  Contemporary Theories of Knowledge 2nd edition.  New York: Rowman & Littlefield.
  • Pryor, J. 2000.  “The Skeptic and the Dogmatist.”  Nous 34, 517-549.
  • Pryor, J. 2001. “Highlights of Recent Epistemology,” The British Journal for the Philosophy of Science 52, 95-124.
    • Stresses that modest foundationalism looks better in 2001 than it looked circa 1976.
  • Quine, W.V.O.  1951. “Two Dogmas of Empiricism.”  The Philosophical Review 60, 20-43.
  • Rescher, N. 1973.  The Coherence Theory of Truth.  New York: Oxford.
  • Russell, B. 1948.  Human Knowledge.  New York: Routledge.
  • Schlick, Moritz. 1959.  “The Foundation of Knowledge.” In Logical Positivism ed. A.J. Ayer.  New York: Free Press, 209-227.
  • Sellars, Wilfrid. 1963.  “Empiricism and the Philosophy of Mind,” in Science, Perception, and Reality.  Atascadero, CA: Ridgeview Publishing Co, pp. 127-196.
  • Triplett, Timm. 1990. “Recent work on Foundationalism,” American Philosophical Quarterly 27:2, 93-116.
  • van Cleve, James. 2005.  “Why Coherence is Not Enough:  A Defense of Moderate Foundationalism,” in Contemporary Debates in Epistemology, edited by Matthias Steup and Ernest Sosa. Oxford:  Blackwell, pp. 168-80.
  • van Fraassen, Bas. 1980.  The Scientific Image.  New York: Oxford.
  • Vogel, Jonathan. 1990.  “Cartesian Skepticism and Inference to the Best Explanation.”  The Journal of Philosophy 87, 658-666.

Author Information

Ted Poston
University of South Alabama
U. S. A.

Lokayata/Carvaka—Indian Materialism

In its most generic sense, “Indian Materialism” refers to the school of thought within Indian philosophy that rejects supernaturalism.  It is regarded as the most radical of the Indian philosophical systems.  It rejects the existence of otherworldly entities such as an immaterial soul or god and the after-life.  Its primary philosophical import comes by way of a scientific and naturalistic approach to metaphysics.  Thus, it rejects ethical systems that are grounded in supernaturalistic cosmologies.  The good, for the Indian materialist, is strictly associated with pleasure and the only ethical obligation forwarded by the system is the maximization of one’s own pleasure.

The terms Lokāyata and Cārvāka have historically been used to denote the philosophical school of Indian Materialism.  Literally, “Lokāyata” means philosophy of the people.  The term was used by the ancient Buddhists until around 500 B.C.E. to refer to both a common tribal philosophical view and a sort of this-worldly philosophy or nature lore.  The term has evolved to signify a school of thought that has been scorned by religious leaders in India and remains on the periphery of Indian philosophical thought.  After 500 B.C.E., the term acquired a more derogatory connotation and became synonymous with sophistry.  It was not until between the 6th and 8th century C.E. that the term “Lokāyata” began to signify Materialist thought.  Indian Materialism has also been named Cārvāka after one of the two founders of the school.  Cārvāka and Ajita Kesakambalin are said to have established Indian Materialism as a formal philosophical system, but some still hold that Bṛhaspati was its original founder.  Bṛhaspati allegedly authored the classic work on Indian Materialism, the Bṛhaspati Sūtra.  There are some conflicting accounts of Bṛhaspati’s life, but, at the least, he is regarded as the mythical authority on Indian Materialism and at most the actual author of the since-perished Bṛhaspati Sūtra.  Indian Materialism has for this reason also been named “Bṛhaspatya.”

Table of Contents

  1. History
    1. Vedic Period
    2. Epic Period and Brāhmaṇical Systems
  2. Status in Indian Thought
    1. Contributions to Science
    2. Materialism as Heresy
  3. Doctrine
    1. Epistemology
    2. Ontology
    3. Cosmology
  4. Ethics
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1.  History

Traces of materialism appear in the earliest recordings of Indian thought.  Initially, Indian Materialism or Lokāyata functioned as a sort of negative reaction to spiritualism and supernaturalism.  During the 6th and 7th centuries C.E. it evolved into a formal school of thought and remains intact, though consistently marginalized.

a. Vedic Period

Vedic thought, in the most comprehensive sense, refers to the ideas contained within the Samhitas and the Brāhmaṇas, including the Upaniṣads.  Historians have estimated that the Vedas were written and compiled between the years 1500 B.C.E. and 300 B.C.E.  It is difficult to point to one philosophical view in the Upaniṣads, at least by Western standards; however, they are considered by scholars to comprise all of the philosophical writing of the Vedas.  The Vedas exemplify the speculative attitude of the ancient Indians, who had the extreme luxury of reflecting on the whence and whither of their existence.  The ancient Indians, also called Aryans, flourished due to the bounty of food and resources provided by the land.  Free from the burdens of political conflict and social upheaval, they were able to ponder the origin of the universe and the purpose of life.  Their meditations on such subjects have been recorded in the literature of the Vedas.

The Vedic period marked the weakest stage of the development of Indian Materialism.  In its most latent form, Materialism is evident in early Vedic references to a man who was known as Bṛhaspati and his followers.  The literature suggests that Bṛhaspati did not attempt to forward a constructive system of philosophy but rather characteristically refuted the claims of other schools of thought.  In this sense, followers of Bṛhaspati were not only skeptical but intentionally destructive of the orthodoxies of the time.  It is thought that any mention of “unbelievers” or “scoffers” in the Vedas refers to those who identified with Bṛhaspati and his materialist views.  Thus, Materialism in its original form was essentially anti-Vedic.  One of Bṛhaspati’s principal objections to orthodoxy was the practice of repeating verses of sacred texts without understanding their meaning.  However, Bṛhaspati’s ideas (“Bṛhaspatya”) could not become a coherent philosophical view without some positive import.  His followers eventually adopted the doctrine of “Svabhava,” which at this point in history signified the rejection of 1) the theory of causation and 2) the notion that there are good and evil consequences of moral actions.  “Svabhava” enhanced Bṛhaspatya by providing it with the beginnings of a metaphysical framework.  In the concluding portions of the Vedas there are violent tales of the opposition of the Bṛhaspatya people to the spiritualism of the time.  Interestingly, the following anecdote from the Taittiriya Brāhmaṇa implies that the gods were impervious to the destructive efforts of Bṛhaspati:

Once upon a time Bṛhaspati struck the goddess Gāyatrī on the head.  The head smashed into pieces and the brain split.  But Gāyatrī is immortal.  She did not die.  Every bit of her brain was alive. (Dakshinaranjan, 12)

The term “Svabhava” in Sanskrit can be translated as “essence” or “nature.”  Bṛhaspati used the term to indicate a school of thought that rejected supernaturalism and the ethical teachings that followed from supernaturalist ideologies.  Bṛhaspati and his followers were scorned and ridiculed for not believing in the eternal nature of reality and for not revering the gods and the truths they were supposed to have espoused.  It is interesting to note that while other schools have incorporated “Svabhava” as a doctrine of essences or continuity of the soul, the use of the term by Bṛhaspati was specifically meant to represent his association with philosophical naturalism.  Naturalism, in this sense, rejects a Platonic notion of essences and the dualism that is exemplified in Platonic philosophy as well as some of the Indian spiritualistic schools.  This brand of dualism is that which asserts that there are two categorically different realms of reality: the material and the immaterial.  Supernaturalism in general embraces this doctrine and holds that the latter realm is not encompassed by “nature.”  In contrast to this, Naturalism rejects the existence of the immaterial realm and suggests that all of reality is encompassed by nature.  Widely varying schools of Naturalism exist today and do not necessarily embrace the mechanistic materialism that was originally embraced by the Cārvāka.

b. Epic Period and Brāhmaṇical Systems

The major work of the Epic Period of Indian history (circa 200 B.C.E. to 200 C.E.) is the Mahābhārata.  The Great War between the Kurus and the Pandavas inspired a many-sided conversation about morality.  Conversation developed into intellectual inquiry and religion began to be replaced by philosophy.  It was around the beginning of this period that the Bṛhaspati school began to merge with the philosophical naturalism of the time.  Naturalism rejected the existence of a spiritual realm and also rejected the notion that the morality of an action can cause either morally good or evil consequences.  Naturalist underpinnings helped to further shape Indian Materialism into a free-standing philosophical system.   The term Lokāyata replaced Bṛhaspatya and scholars have speculated that this was due to the desire for a distinction between the more evolved philosophical system and its weaker anti-Vedic beginnings.   The Lokāyata remained oppositional to the religious thought of the time, namely, Jainism and Buddhism, but it was also positive in that it claimed the epistemological authority of perception.  Furthermore, it attempted to explain existence in terms of the four elements (earth, air, fire, water).  While there is little certainty about the formal development of the Lokāyata school during the Epic Period, it is suspected that its adoption of naturalistic metaphysics led to its eventual association with scientific inquiry and rationalistic philosophy.  Materialism stood out as a doctrine because it rejected the theism of the Upaniṣadic teachings as well as the ethical teachings of Buddhism and Jainism.  It stood for individuality and rejected the authority of scripture and testimony.

The Lokāyata adopted its hedonistic values during the development of the Brāhmaṇical systems of philosophy (circa 1000 C.E.).  As a reaction against the ascetic and meditative practices of the religiously devout, Indian Materialism celebrated the pleasures of the body.  People began gratifying their senses with no restraint.  Pleasure was asserted as the highest good and, according to the Lokāyata, was the only reasonable way to enjoy one’s life.  Some scholarship suggests that during this stage of its development Indian Materialism began to be referred to as “Cārvāka” in addition to “Lokāyata.”  This is contrary to the more popular view that the school was named Cārvāka after its historical founder helped to establish the Lokāyata as a legitimate philosophy.  The term Cārvāka literally means “entertaining speech” and is derived from the term charva, which means to chew or grind with one’s teeth.  It is possible that Cārvāka himself acquired the name due to his association with Indian Materialism, which then led to the school acquiring the name as well.  This is one of many areas of the history of Indian Materialism that remain open to debate.

2.  Status in Indian Thought

The perceived value of Lokāyata from within the Indian Philosophical community is as relevant a topic as its philosophical import.  If nothing else, the etymology of the term Lokāyata is evidence of the consistent marginalization of Indian Materialism.  Because of its association with hedonistic behavior and heretical religious views, followers of the spiritualistic schools of Indian philosophy (Jainism, Buddhism, Hinduism) are reticent on the subject of the materialistic tendencies present in their own systems; however, some scholars, such as Daya Krishna, have suggested that materialism is, in varying degrees, present in all Indian philosophical schools.  This is not to say that materialism replaces other ideologies—it is to say rather that notions about the priority of this-worldliness appear even in some spiritualistic schools.  While matter does not take priority over the spiritual realm in every sense, its significance is elevated more so than in other major world religions.  This observation, for some, carries little weight when examining the philosophical import of the various Indian schools of thought; however, it seems relevant when considering the evolution of Indian thought.  The original meaning of Lokāyata as prevalent among the people has become true in the sense that it is pervasive in Indian philosophical thought at large.  This is not to say that materialism is widely accepted or even that its presence is overtly acknowledged, but it is difficult to deny its far-reaching influence on Indian Philosophy as a whole.

a. Contributions to Science

The most significant influence that Materialism has had on Indian thought is in the field of science.  The spread of Indian Materialism led to the mindset that matter can be of value in itself.  Rather than a burden to our minds or souls, the Materialist view promoted the notion that the body itself can be regarded as wondrous and full of potential.  Evidence of this shift in perspective can be seen in the progress of science over the course of India’s history.  Materialist thought dignified the physical world and elevated the sciences to a respectable level.  Moreover, the Materialist emphasis on empirical validation of truth became the golden rule of the Scientific Method.  Indian Materialism pre-dated the British Empiricist movement by over a millennium.  Whereas the authority of empirical evidence carried little weight in Ancient India, modern thought began to value the systematic and cautious epistemology that first appeared in the thought of the Lokāyata.

b. Materialism as Heresy

Regardless of its positive influence on Indian thought, the fact remains that Indian Materialism is often regarded as blatant heresy against the Spiritualistic schools.  It rejects the theism of Hinduism as well as the moralism of Buddhist and Jain thought.  The anti-orthodox claims of the Materialists are seen as heretical by the religious masses and fly in the face of the piety promoted by most religious sects.  However, it is questionable whether the formal ethics of Materialism are truly practiced to their logical extent by those who claim to belong to the school.  It is suspected by many scholars that Indian Materialism today stands for an atheistic view that values science in place of supernaturalism.  More than anything, Materialists have historically expressed a view that has not found favor among the established religious and social authorities.

3.  Doctrine

There are no existing works that serve as the doctrinal texts for the Lokāyata.  The available materials on the school of thought are incomplete and have suffered through centuries of deterioration.  Mere fragments of the Bṛhaspati Sūtra remain in existence and because of their obscure nature provide little insight into the doctrine and practices of ancient Indian Materialists.  Clues about the history of Indian Materialism have been pieced together to formulate at best a sketchy portrayal of how the “philosophy of the people” originated and evolved over thousands of years.

a. Epistemology

Epistemological thought varies in Indian philosophy according to how each system addresses the question of “Pramānas” or the “sources and proofs of knowledge” (Mittal 41).  The Lokāyata (Cārvāka) school recognized perception (pratyaksa) alone as a reliable source of knowledge.  They therefore rejected two commonly held pramānas: 1) inference (anumana) and 2) testimony (sabda).  Because of its outright rejection of such commonly held sources of knowledge, the Lokāyata was not taken seriously as a school of philosophy.  The common view was that Cārvākas merely rejected truth claims and forwarded none of their own.  To be a mere skeptic during the time amounted to very low philosophical stature.

However, there are additional accounts of the Lokāyata that suggest that the epistemology was more advanced and positivistic than that of mere skepticism.  In fact, it has been compared to the empiricism of John Locke and David Hume.  The Cārvākas denied philosophical claims that could not be verified through direct experience.  Thus, the Lokāyata denied the validity of inferences that were made based upon truth claims that were not empirically verifiable.  However, logical inferences that were made based on premises that were derived from direct experience were held as valid.  It is believed that this characterization of the epistemology of the Lokāyata most accurately describes the epistemological position of contemporary Indian Materialism.

Cārvākas were, in a sense, the first philosophical pragmatists.  They realized that not all sorts of inference were problematic; in order to proceed through daily life inference is a necessary step.  For practical purposes, the Lokāyata made a distinction between inferences made based on probability as opposed to certainty.  The common example used to demonstrate the difference is the inference that if smoke is rising from a building it is probably an indication that there is a fire within the building.  However, Cārvākas were unwilling to accept anything beyond this sort of mundane use of inference, such as the mechanical inference forwarded by the Buddhists.  The Lokāyata refused to accept inferences about what has never been perceived, namely god or the after-life.

b. Ontology

The ontology of the Lokāyata rests on the denial of the existence of non-perceivable entities such as God or a spiritual realm.  Critics of this school of thought point to the fallacy of moving from the premise “the soul cannot be known” to the conclusion “the soul does not exist.”  Again, there is a pragmatic tendency in this sort of thinking.  It seems that followers of the Lokāyata were not concerned with truths that could not be verified; however, they were not entirely skeptical.  The Lokāyata posited that the world itself and all material objects of the world are real.  They held that all of existence can be reduced to the four elements of air, water, fire and earth.  All things come into existence through a mixture of these elements and will perish with their separation.  Perhaps the most philosophically sophisticated position of Indian Materialism is the assertion that even human consciousness is a material construct.  According to K. K. Mittal, the ontology of the Lokāyata is strictly set forth as follows:

  1. Our observation does not bring forth any instance of a disincarnate consciousness. For the manifestation of life and consciousness, body is an inalienable factor.
  2. That body is the substratum of consciousness can be seen in the undoubted fact of the arising of sensation and perception only in so far as they are conditioned by the bodily mechanism.
  3. The medicinal science by prescribing that certain foods and drinks (such as Brāhmighrta) have the properties conducive to the intellectual powers affords another proof and evidence of the relation of consciousness with body and the material ingredients (of food).  (Mittal 47)

Mittal reports (ibid.) that two schools of thought within the Lokāyata apparently arose out of these tenets.  One forwarded the position that there can be no self or soul apart from the body; another posited that a soul can exist alongside a body as long as the body lives, but that the soul perishes with the body.  The latter view adopted the position that the soul is pure air or breath, which is a form of matter.  Therefore, the Lokāyata collectively rejects the existence of an other-worldly soul, while sometimes accepting the notion of a material soul.

c. Cosmology

To speculate as to why the universe exists would be an exercise in futility for an Indian Materialist.  The purpose and origin of existence are not discoverable through scientific means.  Furthermore, speculation about such matters leads to anxiety and frustration, which reduce pleasure and overall contentment.  There is no teleology implicit in Indian Materialism, which is evidenced in the school’s position that the universe itself probably came into existence by chance.  Although there can be no certainty about the origin of the universe, the most probable explanation is that it evolved as a result of a series of random events.

There is also no doctrine of Creation in the Lokāyata.  The principles of karma (action) and niyati (fate) are rejected because they are derived from the notion that existence in itself is purposeful.  The fundamental principle of Indian Materialism was and remains “Svabhava” or nature.  This is not to suggest that nature itself has no internal laws or continuity.  It would be a misinterpretation of Indian Materialism to suppose that it forwards a cosmology of chaos.  Rather, it resembles most closely the naturalism forwarded by the American philosopher John Dewey.  While it posits no “creator” or teleology, Indian Materialism regards nature itself as a force that thrives according to its own law.

4. Ethics

The most common view among scholars regarding the ethic of Indian Materialism is that it generally forwards Egoism.  In other words, it adopts the perspective that an individual’s ends take priority over the ends of others.  Materialists are critical of other ethical systems for being tied to notions of duty or virtue that are derived from false, supernaturalist cosmologies.  Indian Materialism regards pleasure in itself and for itself as the only good and thus promotes hedonistic practices.  Furthermore, it rejects a utilitarian approach to pleasure.  Utilitarianism regards pleasure (both higher and lower) as the ultimate good and therefore promotes the maximization of the good (pleasure) on a collective level.  Indian Materialism rejects this move away from pure egoism.  The doctrine suggests that individuals have no obligation to promote the welfare of society and would only tend to do so if it were to ultimately benefit them as well.

It is interesting to note that the Cārvāka school has been maligned by virtually all schools of Indian philosophy not merely for its rejection of the supernatural but probably more so for its insistent rejection of anything beyond Egoistic ethics.  In fact, some scholars hold that Indian Materialism is purely nihilistic.  That is to say that an Egoistic or Hedonistic ethic is not even an essential element of the system, although such labels certainly serve as accurate descriptions of the values and practices of the Cārvāka people.  This view holds that the axiology of the Cārvāka was purely negative.  It claims nothing more than the rejection of both what we now think of as a Platonic notion of “The Good” and any notion of “god” or “gods.”

The term “nāstika” is used by almost all schools of Indian Philosophy as a critical term to refer to another school of thought that has severely breached what is thought to be acceptable in terms of both religious beliefs and ethical values.  The greatest recipient of this term is the Cārvāka school.  Commonly degraded to the same degree, the term “Cārvāka” and the more general term “nāstika” are sometimes used interchangeably simply to denote a brand of thinking that does not fall in line with the classical schools of Indian thought.  The chief insult that is imported by the term “nāstika” is that the recipient of the title has strayed dangerously away from a path toward enlightenment.  Ethical practices and one’s spiritual education in Indian culture are inextricably tied to one another.  Those who identify with the Indian Materialist school are criticized by the prominent Indian philosophical schools of thought because they are viewed as largely ignorant of both metaphysical and moral truths.  This sort of ignorance is not perceived as a grave threat to the greater good of society, but rather to the individual who is bereft of spiritual and moral knowledge.  That Indian Philosophy as a whole shows concern for the individual beliefs and practices of its members is in stark contrast to the cultural and individual relativism that is largely embraced by the West.

5. References and Further Reading

a. Primary Sources

  • Gunaratna. Tarkarahasyadīpika. Cārvāka/Lokāyata: an Anthology of Source Materials and Some Recent Studies. Ed. Debiprasad Chattopadhyaya. New Delhi: Indian Council of Philosophical Research in association with Rddhi-India Calcutta, 1990.
  • The Mahābhārata. Trans. and Ed. James L. Fitzgerald.  Chicago: University of Chicago Press, 2004.
  • The Rāmāyaṇa of Vālmīki : an Epic of Ancient India.  Ed. Robert Goldman and Sally J. Sutherland.  Trans. Robert Goldman.  Princeton: Princeton University Press, 1984.
  • The Hymns of the Rgveda. Ed. Jagdish L. Shastri.  Trans. Ralph T. H. Griffith.  New Revised Edition.  Delhi: Motilal Banarsidass, 1973.

b. Secondary Sources

  • Chattopadhyaya, Debiprasad.  Lokāyata; a Study in Ancient Materialism. Bombay: People’s Publishing House, 1959.
  • Daksinaranjan, Sastri.  A Short History of Indian Materialism.  Calcutta: The Book Company, Ltd., 1957.
  • Dasgupta, Surendranath.  A History of Indian Philosophy.  Vol. V.  Cambridge: Cambridge University Press, 1955.
  • Flint, Robert.  Anti-theistic theories: being the Baird lecture for 1877. Edinburgh and London: W. Blackwood and Sons, 1879.
  • Garbe, Richard.  The Philosophy of Ancient India.  Chicago: Open Court Publishing Company, 1899.
  • Grimes, John A.  A Concise Dictionary of Indian Philosophy: Sanskrit Terms Defined in English. New and Revised Edition.  Albany: State University of New York Press, 1996.
  • Halbfass, Wilhelm. Tradition and Reflection: Explorations in Indian Thought. Albany, NY: State University of New York Press, 1991.
  • Hopkins, Edward Washburn.  Ethics of India.  New Haven: Yale University Press, 1924.
  • Mittal, Kewal Krishan.  Materialism in Indian Thought.  New Delhi: Munshiram Manoharlal Publishers Pvt. Ltd., 1974.
  • Radhakrishnan, Sri.  Indian Philosophy. Vols. I & II.  New York: Macmillan, 1927-1929.
  • Raju, P. T. The Philosophical Traditions of India.  Pittsburgh: University of Pittsburgh Press, 1972.
  • Raju, P. T.  Structural Depths of Indian Thought. Albany, NY: State University of New York  Press, 1985.
  • Ranganathan, Shyam.  Ethics and The History of Indian Philosophy. Delhi: Motilal Banarsidass Publishers Pvt. Ltd., 2007.
  • Sharma, Ishwar Chandra.  Ethical Philosophies of India. Lincoln, NE: Johnsen Publishing Company, 1965.
  • Smart, Ninian.  Doctrine and Argument in Indian Philosophy.  London: Allen and Unwin, 1964.
  • Vanamamalai, N.  “Materialist Thought in Early Tamil Literature.” Social Scientist, 2.4 (1973): 25-41.

Author Information

Abigail Turner-Lauck Wernicki
Email: awernicki@racc.edu
Holy Family University
U. S. A.

Mathematical Structuralism

The theme of mathematical structuralism is that what matters to a mathematical theory is not the internal nature of its objects, such as its numbers, functions, sets, or points, but how those objects relate to each other. In a sense, the thesis is that mathematical objects (if there are such objects) simply have no intrinsic nature. The structuralist theme grew most notably from developments within mathematics toward the end of the nineteenth century and on through to the present, particularly, but not exclusively, in the program of providing a categorical foundation to mathematics.

Philosophically, there are a variety of competing ways to articulate the structuralist theme. These invoke various ontological and epistemic themes. This article begins with an overview of the underlying idea, and then proceeds to the major versions of the view found in the philosophical literature. On the metaphysical front, the most pressing question is whether there are or can be “incomplete” objects that have no intrinsic nature, or whether structuralism requires a rejection of the existence of mathematical objects altogether. Each of these options yields distinctive epistemic questions concerning how mathematics is known.

There are ontologically robust versions of structuralism, philosophical theories that postulate a vast ontology of structures and their places; on the other hand, there are versions of structuralism amenable to those who prefer desert landscapes, denying the existence of distinctively mathematical objects altogether; and also there are versions of structuralism in between those two—postulating an ontology for mathematics, but not a specific realm of structures. The article sketches the various strengths of each option and the challenges posed for them.

Table of Contents

  1. The Main Idea
  2. Taking on the Metaphysics: The Ante Rem Approach
  3. Getting by without Ontology: Structuralism without (Ante Rem) Structures
  4. References and Further Reading

1. The Main Idea

David Hilbert’s Grundlagen der Geometrie [1899] represents the culmination of a trend toward structuralism within mathematics.  That book gives what, with some hindsight, we might call implicit definitions of geometric notions, characterizing them in terms of the relations they bear to each other.  The early pages contain phrases such as “the axioms of this group define the idea expressed by the word ‘between’ . . .” and “the axioms of this group define the notion of congruence or motion.”  The idea is summed up as follows:

We think of . . . points, straight lines, and planes as having certain mutual relations, which we indicate by means of such words as “are situated,” “between,” “parallel,” “congruent,” “continuous,” etc.  The complete and exact description of these relations follows as a consequence of the axioms of geometry.

Hilbert also remarks that the axioms express “certain related fundamental facts of our intuition,” but in the book—in the mathematical development itself—all that remains of the intuitive content are the diagrams that accompany some of the theorems.

Mathematical structuralism is similar, in some ways, to functionalist views in, for example, philosophy of mind.  A functional definition is, in effect, a structural one, since it, too, focuses on relations that the defined items have to each other.  The difference is that mathematical structures are more abstract, and free-standing, in the sense that there are no restrictions on the kind of things that can exemplify them (see Shapiro [1997, Chapter 3, §6]).

There are several different, and mutually incompatible, philosophical ideas that can underlie and motivate mathematical structuralism.  Some philosophers postulate an ontology of structures, and claim that the subject matter of a given branch of mathematics is a particular structure, or a class of structures.  An advocate of a view like this would articulate what a structure is, and then say something about the metaphysical nature of structures, and how they and their properties can become known.  Other structuralist views deny the existence of structures, and develop the underlying theme in other ways.

Let us define a system to be a collection of objects together with certain relations on those objects.  For example, an extended family is a system of people under certain blood and marital relations: father, aunt, great niece, son-in-law, and so forth.  A work of music is a collection of notes under certain temporal and other musical relations.  To get closer to mathematics, define a natural number system to be a countably infinite collection of objects with a designated initial object and a one-to-one successor relation that satisfies the principle of mathematical induction and the other axioms of arithmetic.  Examples of natural number systems are the Arabic numerals in their natural order, an infinite sequence of distinct moments of time in temporal order, the strings on a finite (or countable) alphabet arranged in lexical order, and, perhaps, the natural numbers themselves.
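The definition of a natural number system can be illustrated with a small sketch.  The names `System`, `segment`, and `occupant` are hypothetical, introduced here only for illustration, and the two example systems (numeral strings and stroke strings) are simplified stand-ins for the examples in the text.  A program can inspect only a finite initial segment, so nothing below verifies the induction principle; it merely shows two different collections of objects filling the same roles.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class System:
    """A candidate natural number system: an initial object plus a successor map."""
    initial: Any
    successor: Callable[[Any], Any]

    def segment(self, length: int) -> List[Any]:
        """Return the first `length` objects, starting from the initial object."""
        objects = []
        current = self.initial
        for _ in range(length):
            objects.append(current)
            current = self.successor(current)
        return objects


# Arabic numerals (as strings) in their natural order, starting with "0".
numerals = System(initial="0", successor=lambda s: str(int(s) + 1))

# Strings over a one-letter alphabet: "", "|", "||", ...
strokes = System(initial="", successor=lambda s: s + "|")


def occupant(system: System, role: int) -> Any:
    """Return the object filling the given role (0 = the initial object)."""
    return system.segment(role + 1)[-1]


if __name__ == "__main__":
    # Different objects occupy the same role in different systems.
    print(occupant(numerals, 2))
    print(occupant(strokes, 2))
```

On this sketch, the numeral "2" and the string "||" are different objects that play the same role, which is the sense in which the two systems share a form.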

To bring Hilbert [1899] into the fold, define a Euclidean system to be three collections of objects, one to be called “points,” a second to be called “lines,” and a third to be called “planes,” along with certain relations between them, such that the axioms are true of those objects and relations, so construed.  Otto Blumenthal reports that in a discussion in a Berlin train station in 1891, Hilbert said that in a proper axiomatization of geometry, “one must always be able to say, instead of ‘points, straight lines, and planes’, ‘tables, chairs, and beer mugs’” (“Lebensgeschichte” in Hilbert [1935, 388-429]; the story is related on p. 403).  In a much-discussed correspondence with Gottlob Frege, Hilbert wrote (see Frege [1976], [1980]):

Every theory is only a scaffolding or schema of concepts together with their necessary relations to one another, and that the basic elements can be thought of in any way one likes.  If in speaking of my points, I think of some system of things, e.g., the system love, law, chimney-sweep . . . and then assume all my axioms as relations between these things, then my propositions, e.g., Pythagoras’ theorem, are also valid for these things . . . [A]ny theory can always be applied to infinitely many systems of basic elements.

A structure is the abstract form of a system, which ignores or abstracts away from any features of the objects that do not bear on the relations.  So, the natural number structure is the form common to all of the natural number systems.  And this structure is the subject matter of arithmetic.  The Euclidean-space-structure is the form common to all Euclidean systems.  The theme of structuralism is that, in general, the subject matter of a branch of mathematics is a given structure or a class of related structures—such as all algebraically closed fields.

A structure is thus a “one over many,” a sort of universal.  The difference between a structure and a more traditional universal, such as a property, is that a property applies to, or holds of, individual objects, while a structure applies to, or holds of, systems.  Structures are thus much like structural universals, whose existence remains subject to debate among metaphysicians (see, for example, Lewis [1986], Armstrong [1986], Pagès [2002]).  Indeed, one might think of a mathematical structure as a sort of free-standing structural universal, one in which the nature of the individual objects that fill the places of the structure is irrelevant (see Shapiro [2008, §4]).

Any of the usual array of philosophical views on universals can be adapted to structures.  One can be a Platonic ante rem realist, holding that each structure exists and has its properties independent of any systems that have that structure.  On this view, structures exist objectively, and are ontologically prior to any systems that have them (or at least ontologically independent of such systems).  Or one can be an Aristotelian in re realist, holding that structures exist, but insisting that they are ontologically posterior to the systems that instantiate them.  Destroy all the natural number systems and, alas, you have destroyed the natural number structure itself.  A third option is to deny that structures exist at all.  Talk of structures is just a convenient shorthand for talk of systems that have a certain similarity.

In a retrospective article, Paul Bernays [1967, 497] provides a way to articulate the latter sort of view:

A main feature of Hilbert’s axiomatization of geometry is that the axiomatic method is presented and practiced in the spirit of the abstract conception of mathematics that arose at the end of the nineteenth century and which has generally been adopted in modern mathematics.  It consists in abstracting from the intuitive meaning of the terms . . . and in understanding the assertions (theorems) of the axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.  Thus, an axiom system is regarded not as a system of statements about a subject matter but as a system of conditions for what might be called a relational structure . . . [On] this conception of axiomatics, . . . logical reasoning on the basis of the axioms is used not merely as a means of assisting intuition in the study of spatial figures; rather, logical dependencies are considered for their own sake, and it is insisted that in reasoning we should rely only on those properties of a figure that either are explicitly assumed or follow logically from the assumptions and axioms.

Advocates of these different ontological positions concerning structures take different approaches to other central philosophical concerns, such as epistemology, semantics, and methodology.  Each such view has it relatively easy for some issues and finds deep, perhaps intractable problems with others.  The ante rem realist, for example, has a straightforward account of reference and semantics:  the variables of a branch of mathematics, like arithmetic, analysis, and set theory, range over the places in an ante rem structure.  Each singular term denotes one such place.  So the language is understood at face value.  But the ante rem realist must account for how one obtains knowledge of structures, so construed, and she must account for how statements about ante rem structures play a role in scientific theories of the physical world.  As suggested by the above Bernays passage, the nominalistic, eliminative structuralist has it easier on epistemology.  Knowing a truth of, say, real analysis, is knowing what follows from the description of the theory.   But this sort of structuralist must account for the semantics of the reconstrued statements, how they are known, how they figure in science, and so forth.

2. Taking on the Metaphysics: The Ante Rem Approach

To repeat, the ante rem structuralist holds that, say, the natural number structure and the Euclidean space structure exist objectively, independent of the mathematician, her form of life, and so forth, and also independent of whether the structures are exemplified in the non-mathematical realm.  That is what makes them ante rem.  The semantics of the respective languages is straightforward:  The first-order variables range over the places in the respective structure, and a singular term such as ‘0’ denotes a particular place in the structure.

So, on this view, the statements of a mathematical theory are to be read at face value.  The grammatical structure of the mathematical language reflects the underlying logical form of the propositions.  For example, in the arithmetic equation, 3×8=24, the numerals ‘3’, ‘8’, and ‘24’, at least seem to be singular terms—proper names.  On the ante rem view, they are singular terms.  The role of a singular term is to denote an individual object.  On the ante rem view, these numerals denote places in the natural number structure.  And, of course, the equation expresses a truth about that structure.  In this respect, then, ante rem structuralism is a variation on traditional Platonism.

For this perspective to make sense, however, one has to think of a place in a structure as a bona fide object, the sort of thing that can be denoted by a singular term, and the sort of thing that can be in the range of first-order variables.  To pursue the foregoing analogy with universals, a place in a structure is akin to a role or an office, one that can be occupied by different people or things.  So, the idea here is to construe an office as an object in its own right, at least with respect to the structure.

There is, of course, an intuitive difference between an object and a place in a structure, between an office-holder and an office.  Indeed, the ante rem view depends on that very distinction, in order to characterize structures in the first place (in terms of systems).  Yet, we also think of the places in ante rem structures as objects in their own right.

This is made coherent by highlighting and enforcing a distinction in linguistic practice.  It is a matter of keeping track of what we are talking about at any given time.  There are two different orientations involved in discussing structures and their places.  First, a structure, including its places, can be discussed in the context of systems that exemplify the structure.  For example, one might say that the current vice president used to be a senator, or that the white king’s bishop in one game was the white queen’s bishop in another game.  For a more mathematical example, in the system of Arabic numerals, the symbol ‘2’ plays the two-role (if we think of the structure as starting with one), while in the system of Roman numerals, the string ‘II’ plays that role.  Call this the places-are-offices perspective.  This office-orientation presupposes a background ontology that supplies objects that fill the places of the structures.  In the case of political systems, the background ontology is people (who have met certain criteria, such as being of a certain age and being duly elected); in the case of chess games, the background ontology is small, moveable objects: pieces with certain colors and shapes.  In the case of arithmetic, anything at all can be used as the background ontology: anything at all can play the two-role in a natural number system.  This is what is meant by saying that mathematical structures, like this one, are “free-standing.”

In contrast to the places-are-offices perspective, there are contexts in which the places of a given structure are treated as objects in their own right.  We say that the vice president presides over the senate, and that a bishop that is on a black square cannot move to a white square, without intending to speak about any particular vice president or chess piece.  Such statements are about the roles themselves.  Call this the places-are-objects perspective.  Here, the statements are about the respective structure as such, independent of any exemplifications it may have.  The ante rem structuralist proposes that we think of typical statements in pure mathematics as made in the places-are-objects mode.  This includes such simple equations as 3×8=24, and more sophisticated statements, for example, that there are infinitely many prime numbers.

To be sure, one can think of statements in the places-are-objects mode as simply generalizations over all systems that exemplify the structure.  This is consonant with the above passage from Bernays [1967], suggesting that we understand “the assertions (theorems) of [an] axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.” However, the ante rem structuralist takes the mathematical statements, in places-are-objects mode, as being about the structure itself.  Indeed, on that view, the structure exists, and so we can talk about its places directly.

So, for the ante rem structuralist, in the places-are-objects mode, singular terms denoting places are bona fide singular terms, and variables ranging over places are bona fide variables, ranging over places.  Places are bona fide objects.

The ante rem structuralist envisions a smooth interplay between places-are-offices statements and places-are-objects statements.  When treating a structure in the places-are-offices mode, the background ontology sometimes includes places from other structures.  We say, for example, that the finite von Neumann ordinals exemplify the natural number structure (under the ordinal successor relation, in which the successor of an ordinal α is α∪{α}).  In the places-are-objects mode, for set theory, the variables range over the places of the iterative hierarchy, such places construed as objects.  The von Neumann ordinals are some of those places-cum-objects.  We think of those objects as forming a system, under the ordinal successor relation.  That system exemplifies the natural number structure.  And in that system, the set-cum-object {∅,{∅}} is in the two-role (beginning with the empty set, as zero).  We sometimes write {∅,{∅}}=2.  From the ante rem perspective, a symbol denoting {∅,{∅}} is construed in the places-are-objects mode, vis-à-vis the iterative hierarchy, and “2” is in the places-are-offices mode, vis-à-vis the natural number structure.  So construed, “{∅,{∅}}=2” is not actually an identity, but more like a predication.  It says that a certain von Neumann ordinal plays a certain role in a given natural number system.  In the Zermelo system, {{∅}}=2.
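The contrast between the von Neumann and Zermelo systems can be made concrete with a short sketch using Python's built-in `frozenset` to model hereditarily finite sets.  The function names are hypothetical, chosen for illustration; the point is only that the two-role is filled by different sets in the two systems, so that "= 2" behaves like a predication relative to a system and not like an identity.

```python
# The empty set, which plays the zero-role in both systems.
EMPTY = frozenset()


def von_neumann_successor(a: frozenset) -> frozenset:
    """Successor of a is the union of a with the singleton of a."""
    return a | frozenset([a])


def zermelo_successor(a: frozenset) -> frozenset:
    """Successor of a is the singleton of a."""
    return frozenset([a])


def occupant(initial, successor, role: int):
    """Apply the successor `role` times to find what fills that role."""
    current = initial
    for _ in range(role):
        current = successor(current)
    return current


# The sets that play the two-role in each system.
vn_two = occupant(EMPTY, von_neumann_successor, 2)
z_two = occupant(EMPTY, zermelo_successor, 2)

if __name__ == "__main__":
    # Same role, different occupants.
    print(vn_two == z_two)
```

Here `vn_two` has two elements (the empty set and its singleton) while `z_two` has one (the singleton of the empty set), so the comparison is false even though each is "2" in its own system.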

Sometimes, the background ontology for the places-are-offices perspective consists of places of the very structure under discussion.  It is commonly noted, for example, that the even natural numbers exemplify the natural number structure.  That is, we consider a system whose objects are the even natural numbers (themselves construed from the places-are-objects mode), under the relation symbolized by “+2”.  That system exemplifies the natural number structure.  In that system, 6 is in the three-role.  Of course, it would be confusing to write 6=3, but if care is taken, it can be properly understood, remembering that it is not an identity.
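A brief sketch can make the even-number example concrete.  It assumes that the system starts with 0 in the zero-role, and the names `even_successor`, `to_even`, and `occupant` are hypothetical.  The check covers only a finite range, so it illustrates the claim without proving it.

```python
def even_successor(n: int) -> int:
    """The successor relation of the even-number system, symbolized by '+2'."""
    return n + 2


def to_even(role: int) -> int:
    """Map each role (0, 1, 2, ...) to the even number that fills it."""
    return 2 * role


def occupant(role: int) -> int:
    """Find the occupant of a role by iterating the '+2' successor from 0."""
    current = 0
    for _ in range(role):
        current = even_successor(current)
    return current


def preserves_structure(limit: int) -> bool:
    """Check, up to `limit`, that to_even respects the initial object and successor."""
    if to_even(0) != 0:
        return False
    return all(to_even(n + 1) == even_successor(to_even(n)) for n in range(limit))


if __name__ == "__main__":
    print(occupant(3))
    print(preserves_structure(1000))
```

In this system `occupant(3)` is 6, which is the sense in which 6 is in the three-role without being identical to 3.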

Trivially, the natural number structure itself, construed from the places-are-objects mode, exemplifies the natural number structure.  In that case, 6 plays the six-role.  Some philosophers might think that this raises a problem analogous to Aristotle’s Third Man argument against Plato.  It depends on the relationship between an ante rem structure and the systems that exemplify it.  In short, the Third Man arguments are problematic if one holds that the reason why a given collection of objects, under certain relations, is a natural number system is that it exemplifies the natural number structure.  This invokes something like the Principle of Sufficient Reason.  If something is so, then there must be a reason why it is so, and the cited reason must be, in some metaphysical sense, prior to the something.  From that perspective, one cannot hold that the natural number structure itself exemplifies the natural number structure because it exemplifies the natural number structure.  That would be a circular reason.  The ante rem structuralist is free to reject this instance of the Principle of Sufficient Reason.  She simply points out that the reason the natural number structure exemplifies the natural number structure is that it has a distinguished initial position and satisfies the relevant principles (for an opposing construal, see Hand [1993]).

The ante rem structuralist should say something about the metaphysical nature of a structure, and how it is that mathematical objects are somehow constituted by a structure.  Consider the following slogan from Shapiro [1997, p. 9]:

Structures are prior to places in the same sense that any organization is prior to the offices that constitute it.  The natural number structure is prior to “6,” just as “baseball defense” is prior to “shortstop” or “U.S. Government” is prior to “Vice President.”

What is this notion of priority? For the non-mathematical examples such as baseball defenses and governments, one might characterize the priority in terms of possible existence.  To say that A is prior to B is to say that B could not exist without A.  No one can be a shortstop independent of a baseball defense; no one can be vice president independent of a government (or organization).  Unfortunately, this articulation of the priority does not make sense of the mathematical cases.  The ante rem structuralist follows most ontological realists in holding that the mathematical structures and their places exist of necessity.  It does not make sense to think of the natural number structure existing without its places, nor for the places to exist without the structure.

The dependence relation in the slogans for ante rem structuralism is that of constitution.  Each ante rem structure consists of some places and some relations.  A structure is constituted by its places and its relations, in the same way that any organization is constituted by its offices and the relations between them.  The constitution is not that of mereology.  It is not the case that a structure is just the sum of its places, since, in general, the places have to be related to each other via the relations of the structure.  An ante rem structure is a whole consisting of, or constituted by, its places and its relations.

3. Getting by without Ontology: Structuralism without (Ante Rem) Structures

Some philosophers find the existence of ante rem structures extravagant.  For such thinkers, there are other ways to preserve the structuralist insights.  One can take structures to exist, but only in the systems that exemplify them.  Metaphysically, the idea is to reverse the priority cited above: structures are posterior to the systems that exemplify them—although, again, it may prove difficult to articulate the relevant notion of priority.  This would be an Aristotelian, in re realism.  On a view like this, the only structures that exist are those that are exemplified.  I do not know of any philosophers of mathematics who articulate such a view in detail.  I mention it, in passing, in light of the connection between structures and traditional universals.

Another, perhaps ontologically cleaner, option is to reject the existence of structures, in any sense of “existence.”  On such a view, apparent talk of structures is only a façon de parler, a way of talking about systems that are structured in a certain way.  The view is sometimes dubbed eliminative structuralism.

The eliminativist can acknowledge the places-are-objects orientation when discussing structures or, to be precise, when discussing structured systems, but he cannot understand such statements literally (without adopting an error theory).  For the eliminativist, the surface grammar of places-are-objects statements does not reflect their underlying logical form, since, from that perspective, there are no structures and there are no places to which one can refer.

The ante rem structuralist and the eliminativist agree that statements in the places-are-objects mode imply generalizations concerning systems that exemplify the structure.  We say, for example, that the vice president presides over the senate, and this entails that all vice presidents preside over their respective senates.  The chess king can move one square in any direction, so long as the move does not result in check.  This entails that all kings are so mobile, and so immobile.  Of course, the generalizations themselves do not entail that there are any vice presidents or chess kings—nor do they entail that there are any structures.

The eliminative structuralist holds that places-are-objects statements are just ways of expressing the relevant generalizations, and he accuses the ante rem structuralist of making too much of their surface grammar, trying to draw deep metaphysical conclusions from that.  The same goes for typical statements in pure mathematics.  Those, too, should be regimented as generalizations over all systems that exemplify the given structure or structures.  For example, the statement “For every natural number n there is a prime p>n” is rendered:

In any natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, any sentence Φ in the language of arithmetic gets regimented as something like:

(Φʹ)      In any natural number system S, Φ[S],

where Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and interpreting the non-logical terminology in terms of the relations of S.

In a similar manner, the eliminative structuralist paraphrases or regiments—and deflates—what seem to be substantial metaphysical statements, the very statements made by his philosophical opponents.  For example, “the number 2 exists” becomes “in every natural number system, there is an object in the 2-place”.  Or “real numbers exist” becomes “every real number system has objects in its places.”  These are trivially true, analytic if you will—not the sort of statements that generate heated metaphysical arguments.

The sailing is not completely smooth for the eliminative structuralist, however.  As noted, this view takes the places-are-offices perspective to be primary—paraphrasing places-are-objects statements in those terms.  Places-are-offices statements, recall, presuppose a background of objects to fill the places in the systems.  For mathematics, the nature of these objects is not relevant.  For example, as noted, anything at all can play the two-role in a natural number system.  Nevertheless, for the regimented statements to get their expected truth-values, the background ontology must be quite extensive.

Suppose, for example, that the entire universe consists of no more than 10^{100,000} objects.  Then there are no natural number systems (since each such system must have infinitely many objects).  So for any sentence Φ in the language of arithmetic, the regimented sentence Φʹ would be vacuously true.  So the eliminativist would be committed to the truth of (the regimented version of) 1+1=0.

In other words, a straightforward, successful eliminative account of arithmetic requires a countably infinite background ontology.  And it gets worse for other branches of mathematics.  An eliminative account of real analysis demands an ontology whose size is that of the continuum; for functional analysis, we’d need the powerset of that many objects.  And on it goes.  The size of some of the structures studied in mathematics is staggering.
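The vacuity worry can be illustrated with a minimal sketch. It assumes a toy model in which the available natural number systems are simply listed; the names `natural_number_systems` and `regimented` are hypothetical and serve only to show that a generalization over an empty domain comes out true whatever it says.

```python
# Assumption: in a finite universe there are no natural number systems,
# since each such system would need infinitely many objects.
natural_number_systems = []

def regimented(phi):
    """Truth value of 'in any natural number system S, phi[S]'."""
    return all(phi(S) for S in natural_number_systems)

def one_plus_one_is_zero(S):
    # phi[S] for the false sentence 1+1=0; never evaluated here,
    # because there is no system S to evaluate it on.
    return S["add"](S["one"], S["one"]) == S["zero"]

print(regimented(one_plus_one_is_zero))  # True (vacuously)
print(regimented(lambda S: False))       # True (vacuously)
```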

Even if the physical universe does exceed 10^{100,000} objects, and, indeed, even if it is infinite, there is surely some limit to how many physical objects there are (invoking Cantor’s theorem that the powerset of any set is larger than it).  Branches of mathematics that require more objects than the number of physical objects might end up being vacuously trivial, at least by the lights of the straightforward, eliminative structuralist.  This would be bad news for such theorists, as the goal is to make sense of mathematics as practiced.  In any case, no philosophy of mathematics should be hostage to empirical and contingent facts, including features of the size of the physical universe.

In the literature, there are two eliminativist reactions to this threat of vacuity.  First, the philosopher might argue, or assume, that there are enough abstract objects for every mathematical structure to be exemplified.  In other words, we postulate that, for each field of mathematics, there are enough abstract objects to keep the regimented statements from becoming vacuous.

Some mathematicians, and some philosophers, think of the set-theoretic hierarchy as the ontology for all of mathematics.  Mathematical objects—all mathematical objects—are sets in the iterative hierarchy.  Less controversially, it is often thought that the iterative hierarchy is rich enough to recapitulate every mathematical theory.  Penelope Maddy [2007, 354] writes:

Set theory hopes to provide a dependable and perspicuous mathematical theory that is ample enough to include (surrogates for) all the objects of classical mathematics and strong enough to imply all the classical theorems about them.  In this way, set theory aims to provide a court of final appeal for claims of existence and proof in classical mathematics . . . Thus set theory aims to provide a single arena in which the objects of classical mathematics are all included, where they can be compared side-by-side.

One might wonder why it is that a foundational theory only needs “surrogates” for each mathematical object, and not the real things.  For a structuralist, the answer is that in mathematics the individual nature of the objects is irrelevant.  What matters is their relations to each other (see Shapiro [2004]).

An eliminative structuralist might maintain that the theory of the background ontology for mathematics—set theory or some other—is not, after all, the theory of a particular structure.  The foundation is a mathematical theory with an intended ontology in the usual, non-structuralist sense.  In the case of set theory, the intended ontology is the sets.  Set theory is not (merely) about all set-theoretic systems—all systems that satisfy the axioms.  So, the foundational theory is an exception to the theme that mathematics is the science of structure.  But, the argument continues, every other branch of mathematics is to be understood in eliminative structuralist terms.  Arithmetic is the study of all natural number systems—within the iterative hierarchy.  Euclidean geometry is the study of all Euclidean systems, and so forth.  There are thus no structures—ante rem or otherwise—and, with the exception of sets, or whatever the background ontology may be, there are no mathematical objects either.  Øystein Linnebo [2008] articulates and defends a view like this.  Although there is not much discussion of the background ontology, Paul Benacerraf’s classic [1965] can be read in these terms as well.  Benacerraf famously argues that there are no numbers—talk of numbers is only a way to talk about all systems of a certain kind, but he seems to have no similar qualms about sets.

Of course, this ontological version of eliminative structuralism is anathema to a nominalist, who rejects the existence of abstracta altogether.  For the nominalist, sets and ante rem structures are pretty much on a par: neither is wanted.

The other prominent eliminative reaction to the threat of vacuity is to invoke modality.  In effect, one avoids (or attempts to avoid) a commitment to a vast ontology by inserting modal operators into the regimented generalizations.  To reiterate the above example, the modal eliminativist renders the statement “For every natural number n there is a prime p>n” as something like:

In any possible natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, let Φ be any sentence in the language of arithmetic; Φ gets regimented as:

In any possible natural number system S, Φ[S],

or, perhaps,

Necessarily, for any natural number system S, Φ[S],

where, again, Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and by interpreting the non-logical terminology in terms of the relations of S.

The difference with the ontological, eliminative program, of course, is that here the variables ranging over systems are inside the scope of a modal operator.  So the modal eliminativist does not require an extensive rich background ontology.  Rather, she needs a large ontology to be possible.  Geoffrey Hellman [1989] develops a modal program in detail.
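The contrast can be put schematically.  The following is only a sketch: N(S) is a hypothetical abbreviation for “S is a natural number system,” not notation drawn from the authors discussed, and the box and diamond stand for whatever primitive modality the modalist adopts.

```latex
% Ontological (non-modal) regimentation of an arithmetic sentence \Phi:
\forall S \,\bigl( N(S) \rightarrow \Phi[S] \bigr)

% Modal regimentation: the quantifier over systems lies inside the operator:
\Box\, \forall S \,\bigl( N(S) \rightarrow \Phi[S] \bigr)

% The first is vacuously true for every \Phi when \neg \exists S\, N(S).
% The second avoids vacuity provided that a system is at least possible:
\Diamond\, \exists S \, N(S)
```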

The central, open problem with this brand of eliminativist structuralism concerns the nature of the invoked modality.  Of course, it won’t do much good to render the modality in terms of possible worlds.  If we do that, and if we take possible worlds, and possibilia, to exist, then modal eliminative structuralism would collapse into the above, ontological version of eliminative structuralism.  Not much would be gained by adding the modal operators.  So the modalist typically takes the modality to be primitive—not defined in terms of anything more fundamental.  But, of course, this move does not relieve the modalist of having to say something about the nature of the indicated modality, and having to say something about how we know propositions about what is possible.

Invoking metaphysical possibility and necessity does not seem appropriate here.  Intuitively, if mathematical objects exist at all, then they exist of necessity.   And perhaps also intuitively, if mathematical objects do not exist, then their non-existence is necessary.  Physical and conceptual modalities are also problematic for present purposes.

Hellman mobilizes the logical modalities for his eliminative structuralism.  Our arithmetic sentence Φ becomes

In any logically possible natural number system S, Φ[S],

or

It is logically necessary that for any natural number system S, Φ[S].

In contemporary logic textbooks and classes, the logical modalities are understood in terms of sets.  To say that a sentence is logically possible is to say that there is a certain set that satisfies it.  Of course, this will not do here, for the same reason that the modalist cannot define the modality in terms of possible worlds.  It is especially problematic here.  It does no good to render mathematical ‘existence’ in terms of logical possibility if the latter is to be rendered in terms of existence in the set-theoretic hierarchy.  Again, the modalist takes the notion of logical possibility to be a primitive, explicated by the theory as a whole.  For more on this program, see Hellman [1989], [2001], [2005].

To briefly sum up and conclude, the parties to the debate over how to best articulate the structuralist insights agree that each of the major versions has its strengths and, of course, each has its peculiar difficulties.  Negotiating such tradeoffs is, of course, a stock feature of philosophy in general.  The literature has produced an increased understanding of mathematics, of the relevant philosophical issues, and how the issues bear on each other, and the discussion shows no signs of abating. For additional discussion see “The Applicability of Mathematics.”

4. References and Further Reading

  • Armstrong, D. [1986], “In defence of structural universals,” Australasian Journal of Philosophy 64, 85-88.
    • As the title says.
  • Awodey, S. [1996], “Structure in mathematics and logic:  a categorical perspective,” Philosophia Mathematica (3) 4, 209-237.
    • Articulates a connection between structuralism and category theory.
  • Awodey, S. [2004], “An answer to Hellman’s question:  ‘Does category theory provide a framework for mathematical structuralism?’,” Philosophia Mathematica (3) 12, 54-64.
    • Continuation of the above.
  • Awodey, S. [2006], Category theory, Oxford, Oxford University Press.
    • Readable presentation of category theory.
  • Benacerraf, P. [1965], “What numbers could not be,” Philosophical Review 74, 47-73; reprinted in Philosophy of mathematics, edited by P. Benacerraf and H. Putnam, Englewood Cliffs, New Jersey, Prentice-Hall, 1983, 272-294.
    • Classic motivation for the (eliminative) structuralist perspective.
  • Bernays, P. [1967], “Hilbert, David” in The encyclopedia of philosophy, Volume 3, edited by P. Edwards, New York, Macmillan publishing company and The Free Press, 496-504.
  • Chihara, C. [2004], A structural account of mathematics, Oxford, Oxford University Press.
    • Account of the application of mathematics in “structural” terms, but without adopting a structuralist philosophy.
  • Frege, G. [1976], Wissenschaftlicher Briefwechsel, edited by G. Gabriel, H. Hermes, F. Kambartel, and C. Thiel, Hamburg, Felix Meiner.
  • Frege, G. [1980], Philosophical and mathematical correspondence, Oxford, Basil Blackwell.
  • Hale, Bob, [1996], “Structuralism’s unpaid epistemological debts,” Philosophia Mathematica (3) 4, 124-143.
    • Criticism of modal eliminative structuralism.
  • Hand, M. [1993], “Mathematical structuralism and the third man,” Canadian Journal of Philosophy 23, 179-192.
    • Critique of ante rem structuralism, on Aristotelian grounds.
  • Hellman, G. [1989], Mathematics without numbers, Oxford, Oxford University Press.
    • Detailed articulation and defense of modal eliminative structuralism.
  • Hellman, G. [2001], “Three varieties of mathematical structuralism,” Philosophia Mathematica (3) 9, 184-211.
    • Comparison of the varieties of structuralism, favoring the modal eliminative version.
  • Hellman, G. [2005], “Structuralism,” Oxford handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 536-562.
    • Comparison of the varieties of structuralism, again favoring the modal eliminative version.
  • Hilbert, D. [1899], Grundlagen der Geometrie, Leipzig, Teubner; Foundations of geometry, translated by E. Townsend, La Salle, Illinois, Open Court, 1959.
  • Hilbert, D. [1935], Gesammelte Abhandlungen, Dritter Band, Berlin, Julius Springer.
  • Lewis, D. [1986], “Against structural universals,” Australasian Journal of Philosophy 64, 25-46.
    • As the title says.
  • Linnebo, Øystein [2008], “Structuralism and the Notion of Dependence,” Philosophical Quarterly 58, 59-79.
    • An ontological eliminative structuralism, using set theory as the (non-structural) background foundation.
  • MacBride, F. [2005], “Structuralism reconsidered,” Oxford handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 563-589.
    • Philosophically based criticism of the varieties of structuralism.
  • Maddy, P. [2007], Second philosophy: a naturalistic method, Oxford, Oxford University Press.
  • McLarty, C. [1993], “Numbers can be just what they have to,” Nous 27, 487-498.
    • Connection between category theory and the philosophical aspects of structuralism.
  • Pagès, J. [2002], “Structural universals and formal relations,” Synthese 131, 215-221.
    • Articulation and defense of structural universals.
  • Resnik, M. [1981], “Mathematics as a science of patterns: Ontology and reference,” Nous 15, 529-550.
    • Philosophical articulation of structuralism, with focus on metaphysical issues.
  • Resnik, M. [1982], “Mathematics as a science of patterns: Epistemology,” Nous 16, 95-105.
    • Philosophical articulation of structuralism, with focus on epistemological issues.
  • Resnik, M. [1992], “A structuralist’s involvement with modality,” Mind 101, 107-122.
    • Review of Hellman [1989] focusing on issues concerning the invoked notion of modality.
  • Resnik, M. [1997], Mathematics as a science of patterns, Oxford, Oxford University Press.
    • Detailed articulation of a realist version of structuralism.
  • Shapiro, S. [1997], Philosophy of mathematics: structure and ontology, New York, Oxford University Press.
    • Elaborate articulation of structuralism, with focus on the various versions; defense of the ante rem approach.
  • Shapiro, S. [2004], “Foundations of mathematics:  metaphysics, epistemology, structure,” Philosophical Quarterly 54, 16-37.
    • The role of structuralist insights in foundational studies.
  • Shapiro, S. [2008], “Identity, indiscernibility, and ante rem structuralism:  the tale of i and -i,” Philosophia Mathematica (3) 16, 285-309.
    • Treatment of the identity relation, from an ante rem structuralist perspective, and the metaphysical nature of structures.

Author Information

Stewart Shapiro
Email: shapiro.4@osu.edu
The Ohio State University, U.S.A. and
University of St. Andrews, United Kingdom

Connectionism

Connectionism is an approach to the study of human cognition that utilizes mathematical models, known as connectionist networks or artificial neural networks.  Often, these come in the form of highly interconnected, neuron-like processing units. There is no sharp dividing line between connectionism and computational neuroscience, but connectionists tend more often to abstract away from the specific details of neural functioning to focus on high-level cognitive processes (for example, recognition, memory, comprehension, grammatical competence and reasoning). During connectionism’s ideological heyday in the late twentieth century, its proponents aimed to replace theoretical appeals to formal rules of inference and sentence-like cognitive representations with appeals to the parallel processing of diffuse patterns of neural activity.

Connectionism was pioneered in the 1940s and had attracted a great deal of attention by the 1960s. However, major flaws in the connectionist modeling techniques were soon revealed, and this led to reduced interest in connectionist research and reduced funding. But in the 1980s connectionism underwent a potent, permanent revival. During the later part of the twentieth century, connectionism would be touted by many as the brain-inspired replacement for the computational artifact-inspired ‘classical’ approach to the study of cognition. Like classicism, connectionism attracted and inspired a major cohort of naturalistic philosophers, and the two broad camps clashed over whether or not connectionism had the wherewithal to resolve central quandaries concerning minds, language, rationality and knowledge. More recently, connectionist techniques and concepts have helped inspire philosophers and scientists who maintain that human and non-human cognition is best explained without positing inner representations of the world. Indeed, connectionist techniques are now very widely embraced, even if few label themselves connectionists anymore. This is an indication of connectionism’s success.

Table of Contents

  1. McCulloch and Pitts
  2. Parts and Properties of Connectionist Networks
  3. Learning Algorithms
    1. Hebb’s Rule
    2. The Delta Rule
    3. The Generalized Delta Rule
  4. Connectionist Models Aplenty
    1. Elman’s Recurrent Nets
    2. Interactive Architectures
  5. Making Sense of Connectionist Processing
  6. Connectionism and the Mind
    1. Rules versus General Learning Mechanisms: The Past-Tense Controversy
    2. Concepts
    3. Connectionism and Eliminativism
    4. Classicists on the Offensive: Fodor and Pylyshyn’s Critique
      1. Reason
      2. Productivity and Systematicity
  7. Anti-Representationalism: Dynamical Systems Theory, A-Life and Embodied Cognition
  8. Where Have All the Connectionists Gone?
  9. References and Further Reading
    1. References
    2. Connectionism Freeware

1. McCulloch and Pitts

In 1943, neurophysiologist Warren McCulloch and a young logician named Walter Pitts demonstrated that neuron-like structures (or units, as they were called) that act and interact purely on the basis of a few neurophysiologically plausible principles could be wired together and thereby be given the capacity to perform complex logical calculation (McCulloch & Pitts 1943). They began by noting that the activity of neurons has an all-or-none character to it – that is, neurons are either ‘firing’ electrochemical impulses down their lengthy projections (axons) towards junctions with other neurons (synapses) or they are inactive. They also noted that in order to become active, the net amount of excitatory influence from other neurons must reach a certain threshold and that some neurons must inhibit others. These principles can be described by mathematical formalisms, which allows for calculation of the unfolding behaviors of networks obeying such principles. McCulloch and Pitts capitalized on these facts to prove that neural networks are capable of performing a variety of logical calculations. For instance, a network of three units can be configured so as to compute the fact that a conjunction (that is, two complete statements connected by ‘and’) will be true only if both component statements are true (Figure 1). Other logical operations involving disjunctions (two statements connected by ‘or’) and negations can also be computed. McCulloch and Pitts showed how more complex logical calculations can be performed by combining the networks for simpler calculations. They even proposed that a properly configured network supplied with infinite tape (for storing information) and a read-write assembly (for recording and manipulating that information) would be capable of computing whatever any given Turing machine (that is, a machine that can compute any computable function) can.

Figure 1: Conjunction Network. We may interpret the top (output) unit as representing the truth value of a conjunction (that is, activation value 1 = true and 0 = false) and the bottom two (input) units as representing the truth values of each conjunct. The input units each have an excitatory connection to the output unit, but for the output unit to activate, the sum of the input unit activations must still exceed a certain threshold. The threshold is set high enough to ensure that the output unit becomes active just in case both input units are activated simultaneously. Here we see a case where only one input unit is active, and so the output unit is inactive. A disjunction network can be constructed by lowering the threshold so that the output unit will become active if either input unit is fully active. [Created using Simbrain 2.0]
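The conjunction and disjunction networks just described can be sketched in a few lines. This is an illustration only: the connection weights of 1 and the thresholds of 1.5 and 0.5 are assumptions chosen to make the arithmetic work, not values taken from McCulloch and Pitts or from the figure.

```python
def threshold_unit(inputs, weights, threshold):
    """All-or-none unit: active (1) only if the summed input exceeds the threshold."""
    net = sum(a * w for a, w in zip(inputs, weights))
    return 1 if net > threshold else 0

def conjunction(p, q):
    # Threshold high enough that both inputs must be active together.
    return threshold_unit([p, q], [1, 1], threshold=1.5)

def disjunction(p, q):
    # Lowered threshold: either input alone suffices.
    return threshold_unit([p, q], [1, 1], threshold=0.5)

for p in (0, 1):
    for q in (0, 1):
        print(p, q, conjunction(p, q), disjunction(p, q))
```

Only the threshold differs between the two functions, which mirrors the remark that a disjunction network is obtained from the conjunction network by lowering the threshold.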

Somewhat ironically, these proposals were a major source of inspiration for John von Neumann’s work demonstrating how a universal Turing machine can be created out of electronic components (vacuum tubes, for example) (Franklin & Garzon 1996, Boden 2006). Von Neumann’s work yielded what is now a nearly ubiquitous programmable computing architecture that bears his name. The advent of these electronic computing devices and the subsequent development of high-level programming languages greatly hastened the ascent of the formal classical approach to cognition, inspired by formal logic and based on sentence and rule (see Artificial Intelligence). Then again, electronic computers were also needed to model the behaviors of complicated neural networks.

For their part, McCulloch and Pitts had the foresight to see that the future of artificial neural networks lay not with their ability to implement formal computations, but with their ability to engage in messier tasks like recognizing distorted patterns and solving problems requiring the satisfaction of multiple ‘soft’ constraints. However, before we get to these developments, we should consider in a bit more detail some of the basic operating principles of typical connectionist networks.

2. Parts and Properties of Connectionist Networks

Connectionist networks are made up of interconnected processing units which can take on a range of numerical activation levels (for example, a value ranging from 0 to 1). A given unit may have incoming connections from, or outgoing connections to, many other units. The excitatory or inhibitory strength (or weight) of each connection is determined by its positive or negative numerical value. The following is a typical equation for computing the influence of one unit on another:

Influence_{iu} = a_i * w_{iu}

This says that for any unit i and any unit u to which it is connected, the influence of i on u is equal to the product of the activation value of i and the weight of the connection from i to u. Thus, if a_i = 1 and w_{iu} = 0.02, then the influence of i on u will be 0.02. If a unit has inputs from multiple units, the net influence of those units will just be the sum of these individual influences.
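As a minimal sketch of this arithmetic (the function names are hypothetical, and the three-unit example values are made up for illustration):

```python
def influence(a_i, w_iu):
    """Influence of unit i on unit u: activation of i times the weight from i to u."""
    return a_i * w_iu

def net_influence(activations, weights):
    """Net influence on a unit: the sum of the individual influences."""
    return sum(influence(a, w) for a, w in zip(activations, weights))

print(influence(1, 0.02))  # 0.02

# Three sending units: one excitatory, one inactive, one inhibitory.
print(net_influence([1, 0, 1], [0.02, 0.5, -0.01]))
```

An inactive unit (activation 0) contributes nothing regardless of its weight, and a negative weight subtracts from the total.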

One common sort of connectionist system is the two-layer feed-forward network. In these networks, units are segregated into discrete input and output layers such that connections run only from the former to the latter. Often, every input unit will be connected to every output unit, so that a network with, for instance, 100 units in each layer will possess 10,000 inter-unit connections. Let us suppose that in a network of this very sort each input unit is randomly assigned an activation level of 0 or 1 and each weight is randomly set to a level between -0.01 and 0.01. In this case, the activation level of each output unit will be determined by two factors: the net influence of the input units; and the degree to which the output unit is sensitive to that influence, something which is determined by its activation function. One common activation function is the step function, which sets a very sharp threshold. For instance, if the threshold on a given output unit were set through a step function at 0.65, the level of activation for that unit under different amounts of net input could be graphed out as follows:

Figure 2: Step Activation Function

Thus, if the input units have a net influence of 0.7, the activation function returns a value of 1 for the output unit’s activation level. If they had a net influence of 0.2, the output level would be 0, and so on. Another common activation function has more of a sigmoid shape to it; that is, graphed out, it looks something like this:

Figure 3: Sigmoid Activation Function

Thus, if our net input were 0.7, the output unit would take on an activation value somewhere near 0.9.
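The two activation functions can be sketched as follows. The step threshold of 0.65 comes from the text; the logistic form of the sigmoid and its `gain` and `midpoint` parameters are assumptions, since the exact curve graphed in Figure 3 is not specified. A gain of 3 is used here only because it puts a net input of 0.7 near 0.9.

```python
import math

def step(net_input, threshold=0.65):
    """Sharp threshold: fully active at or above the threshold, inactive below it."""
    return 1 if net_input >= threshold else 0

def sigmoid(net_input, gain=1.0, midpoint=0.0):
    """Logistic curve; gain and midpoint are illustrative parameters."""
    return 1.0 / (1.0 + math.exp(-gain * (net_input - midpoint)))

print(step(0.7))                         # 1
print(step(0.2))                         # 0
print(round(sigmoid(0.7, gain=3.0), 2))  # 0.89
```

Unlike the step function, the sigmoid yields graded activation values, so small changes in net input produce small changes in output.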

Now, suppose that a modeler sets the activation values across the input units (that is, encodes an input vector) of our 200-unit network so that some units take on an activation level of 1 and others take on a value of 0. In order to determine what the value of a single output unit would be, one would have to perform the procedure just described (that is, calculate the net influence and pass it through an activation function). To determine what the entire output vector would be, one must repeat the procedure for all 100 output units.
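A rough sketch of this procedure for the network described above, with 100 input units, 100 output units, random binary inputs, and weights between -0.01 and 0.01. The variable names, the fixed random seed, and the choice of the step function are assumptions made for the sake of a repeatable illustration.

```python
import random

random.seed(0)  # fixed seed so that the sketch is repeatable
N_IN, N_OUT = 100, 100

# Encode an input vector: each input unit is randomly set to 0 or 1.
input_vector = [random.choice([0, 1]) for _ in range(N_IN)]

# weights[i][u] is the weight on the connection from input i to output u.
weights = [[random.uniform(-0.01, 0.01) for _ in range(N_OUT)]
           for _ in range(N_IN)]

def step(net_input, threshold=0.65):
    return 1 if net_input >= threshold else 0

def forward(inputs, weights, activation=step):
    """Compute every output unit: net influence passed through the activation function."""
    n_out = len(weights[0])
    return [activation(sum(inputs[i] * weights[i][u] for i in range(len(inputs))))
            for u in range(n_out)]

output_vector = forward(input_vector, weights)
print(len(output_vector))                # 100
print(sum(len(row) for row in weights))  # 10000
```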

As discussed earlier, the truth-value of a statement can be encoded in terms of a unit’s activation level. There are, however, countless other sorts of information that can be encoded in terms of unit activation levels. For instance, the activation level of each input unit might represent the presence or absence of a different animal characteristic (say, “has hooves,” “swims,” or “has fangs”) whereas each output unit represents a particular kind of animal (“horse,” “pig,” or “dog”). Our goal might be to construct a model that correctly classifies animals on the basis of their features. We might begin by creating a list (a corpus) that contains, for each animal, a specification of the appropriate input and output vectors. The challenge is then to set the weights on the connections so that when one of these input vectors is encoded across the input units, the network will activate the appropriate animal unit at the output layer. Setting these weights by hand would be quite tedious given that our network has 10,000 weighted connections. Researchers would discover, however, that the process of weight assignment can be automated.

3. Learning Algorithms

a. Hebb’s Rule

The next major step in connectionist research came on the heels of neurophysiologist Donald Hebb’s (1949) proposal that the connection between two biological neurons is strengthened (that is, the presynaptic neuron will come to have an even stronger excitatory influence) when both neurons are simultaneously active.  As it is often put, “neurons that fire together, wire together.” This principle would be expressed by a mathematical formula which came to be known as Hebb’s rule:

Change of weight_{iu} = a_i * a_u * lrate

The rule states that the weight on a connection from input unit i to output unit u is to be changed by an amount equal to the product of the activation value of i, the activation value of u, and a learning rate. [Notice that a large learning rate conduces to large weight changes and a smaller learning rate to more gradual changes.] Hebb’s rule gave connectionist models the capacity to modify the weights on their own connections in light of the input-output patterns they have encountered.
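A minimal sketch of a single application of the rule to a tiny two-by-two network follows. The function names and the learning rate of 0.5 are assumptions made for illustration.

```python
def hebb_delta(a_i, a_u, lrate):
    """Weight change for one connection under Hebb's rule."""
    return a_i * a_u * lrate

def hebb_update(weights, inputs, outputs, lrate):
    """Apply Hebb's rule once to every input-to-output connection (in place)."""
    for i, a_i in enumerate(inputs):
        for u, a_u in enumerate(outputs):
            weights[i][u] += hebb_delta(a_i, a_u, lrate)
    return weights

w = [[0.0, 0.0], [0.0, 0.0]]
hebb_update(w, inputs=[1, 0], outputs=[0, 1], lrate=0.5)
print(w)  # [[0.0, 0.5], [0.0, 0.0]]
```

Only the connection between the active input unit and the active output unit changes; every other product contains a zero.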

Let us suppose, for the sake of illustration, that our 200-unit network started out life with connection weights of 0 across the board. We might then take an entry from our corpus of input-output pairs (say, the entry for donkeys) and set the input and output values accordingly. Hebb’s rule might then be employed to strengthen connections from active input units to active output units. [Note: if units are allowed to have weights that vary between positive and negative values (for example, between -1 and 1), then Hebb’s rule will strengthen connections between units whose activation values have the same sign and weaken connections between units with different signs.] This procedure could then be repeated for each entry in the corpus. Given a corpus of 100 entries and 10,000 applications of the rule per entry, a total of 1,000,000 applications of the rule would be required for just one pass through the corpus (called an epoch of training). Here, clearly, the powerful number-crunching capabilities of electronic computers become essential.

Let us assume that we have set the learning rate to a relatively high value and that the network has received one epoch of training. What we will find is that if a given input pattern from the training corpus is encoded across the input units, activity will propagate forward through the connections in such a way as to activate the appropriate output unit. That is, our network will have learned how to appropriately classify input patterns.
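One epoch of Hebbian training can be sketched on a much smaller network. Everything specific here is an assumption: the six features, the three-entry corpus, the learning rate of 0.5, and the step threshold of 0.65. The features are deliberately chosen so that no two animals share one, which is the easy case for Hebb’s rule.

```python
FEATURES = ["has mane", "neighs", "has snout", "oinks", "has fangs", "barks"]
ANIMALS = ["horse", "pig", "dog"]

# Hypothetical corpus: (input vector over FEATURES, output vector over ANIMALS).
corpus = [
    ([1, 1, 0, 0, 0, 0], [1, 0, 0]),  # horse
    ([0, 0, 1, 1, 0, 0], [0, 1, 0]),  # pig
    ([0, 0, 0, 0, 1, 1], [0, 0, 1]),  # dog
]

LRATE = 0.5       # relatively high learning rate
THRESHOLD = 0.65  # step-function threshold on each output unit

# All weights start at 0; weights[i][u] runs from feature i to animal u.
weights = [[0.0] * len(ANIMALS) for _ in FEATURES]

applications = 0
for inputs, targets in corpus:  # one pass through the corpus is one epoch
    for i, a_i in enumerate(inputs):
        for u, a_u in enumerate(targets):
            weights[i][u] += a_i * a_u * LRATE  # Hebb's rule
            applications += 1

def classify(inputs):
    """Return the animals whose output units reach the threshold."""
    active = []
    for u, animal in enumerate(ANIMALS):
        net = sum(a_i * weights[i][u] for i, a_i in enumerate(inputs))
        if net >= THRESHOLD:
            active.append(animal)
    return active

print(applications)                  # 54, that is, 3 entries * 18 connections
print(classify([1, 1, 0, 0, 0, 0]))  # ['horse']
print(classify([0, 0, 0, 0, 1, 1]))  # ['dog']
```

The count of rule applications scales as entries times connections, which is how the figure of 1,000,000 arises for 100 entries and 10,000 connections.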

As a point of comparison, the mainstream approach to artificial intelligence (AI) research is basically an offshoot of traditional forms of computer programming. Computer programs manipulate sentential representations by applying rules which are sensitive to the syntax (roughly, the shape) of those sentences. For instance, a rule might be triggered at a certain point in processing because a certain input was presented – say, “Fred likes broccoli and Sam likes cauliflower.” The rule might be triggered whenever a compound sentence of the form p and q is input and it might produce as output a sentence of the form p (“Fred likes broccoli”). Although this is a vast oversimplification, it does highlight a distinctive feature of the classical approach to AI, which is the assumption that cognition is effected through the application of syntax-sensitive rules to syntactically structured representations. What is distinctive about many connectionist systems is that they encode information through activation vectors (and weight vectors), and they process that information when activity propagates forward through many weighted connections.

In addition, insofar as connectionist processing is in this way highly distributed (that is, many processors and connections simultaneously shoulder a bit of the processing load), a network will often continue to function even if part of it gets destroyed (if connections are pruned). The same kind of parallel and distributed processing that enables this kind of graceful degradation also allows connectionist systems to respond sensibly to noisy or otherwise imperfect inputs. For instance, even if we encoded an input vector that deviated from the one for donkeys but was still closer to the donkey vector than to any other, our model would still likely classify it as a donkey. Traditional forms of computer programming, on the other hand, have a much greater tendency to fail or completely crash due to even minor imperfections in either programming code or inputs.
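Both points can be made concrete with a small sketch. The weights below are hypothetical and set by hand (six input features feeding two output units), and the helper name `most_active_output` is an assumption of this example. It is meant only to show why spreading the load over several connections makes a single missing connection or a slightly corrupted input tolerable.

```python
def most_active_output(weights, inputs):
    """Return the index of the output unit receiving the most net input."""
    n_out = len(weights[0])
    net = [sum(a * row[u] for a, row in zip(inputs, weights)) for u in range(n_out)]
    return net.index(max(net))

# Hypothetical hand-set weights: rows are 6 input features, columns are
# 2 output units (0 = "donkey", 1 = "robin").
weights = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
donkey = [1, 1, 1, 0, 0, 0]

intact = most_active_output(weights, donkey)

# Noisy input: one donkey feature is missing and one robin feature is spuriously on.
noisy = most_active_output(weights, [1, 1, 0, 0, 0, 1])

# Damaged network: prune one of the connections that supports "donkey".
pruned_weights = [row[:] for row in weights]
pruned_weights[0][0] = 0
pruned = most_active_output(pruned_weights, donkey)
```

In all three cases the donkey unit remains the most active, since the remaining connections still carry enough of the load.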

The advent of connectionist learning rules was clearly a watershed event in the history of connectionism. It made possible the automation of vast numbers of weight assignments, and this would eventually enable connectionist systems to perform feats that McCulloch and Pitts could scarcely have imagined. As a learning rule for feed-forward networks, however, Hebb’s rule faces severe limitations. Particularly damaging is the fact that the learning of one input-output pair (an association) will in many cases disrupt what a network has already learned about other associations, a process known as catastrophic interference. Another problem is that although a set of weights often exists that would allow a network to perform a given pattern association task, its discovery is frequently beyond the capabilities of Hebb’s rule.

b. The Delta Rule

Such shortcomings led researchers to investigate new learning rules, one of the most important being the delta rule. To train our network using the delta rule, we start it out with random weights and feed it a particular input vector from the corpus. Activity then propagates forward to the output layer. Afterwards, for a given unit u at the output layer, the network takes the actual activation of u and its desired activation and modifies weights according to the following rule:

$$\Delta w_{iu} = \text{learning rate} \times (\text{desired}_u - a_u) \times a_i$$

That is, to modify a connection from input i to output u, the delta rule computes the product of the difference between the desired activation of u and the actual activation (the error score), the activation of i, and a (typically very small) learning rate. Thus, assuming that unit u should be fully active (but is not) and input i happens to be highly active, the delta rule will increase the strength of the connection from i to u. This will make it more likely that the next time i is highly active, u will be too. If, on the other hand, u should have been inactive but was not, the connection from i to u will be pushed in a negative direction. As with Hebb’s rule, when an input pattern is presented during training, the delta rule is used to calculate how the weights from each input unit to a given output unit are to be modified, a procedure repeated for each output unit. The next item in the corpus is then input to the network and the process repeats, until the entire corpus (or at least that part of it that the researchers want the network to encounter) has been run through. Unlike Hebb’s rule, the delta rule typically makes small weight changes, meaning that several epochs of training may be required before a network achieves competent performance. Again unlike Hebb’s rule, however, the delta rule will in principle always slowly converge on a set of weights that will allow for mastery of all associations in a corpus, provided that such a set of weights exists. Famed connectionist Frank Rosenblatt called networks of the sort lately discussed perceptrons. He also proved the foregoing truth about them, which became known as the perceptron convergence theorem.
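A minimal sketch of delta-rule training may help to fix ideas. Several details here are assumptions made for the sake of a runnable example rather than features of any particular historical model: the output unit is a simple threshold unit, a third input that is always on serves as a bias, the task is the logical OR mapping (for which a suitable set of weights is known to exist), and the names `step` and `delta_train` are hypothetical.

```python
import random

def step(net):
    """Threshold activation: the unit is fully active if net input is positive."""
    return 1.0 if net > 0 else 0.0

def delta_train(corpus, n_in, n_out, learning_rate=0.1, max_epochs=1000, seed=0):
    """Train with the delta rule; return (weights, epochs used, errors in last epoch)."""
    rng = random.Random(seed)
    # Start out with small random weights, weights[i][u] from input i to output u.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, desired in corpus:
            for u in range(n_out):
                a_u = step(sum(inputs[i] * weights[i][u] for i in range(n_in)))
                error = desired[u] - a_u          # the error score for unit u
                if error != 0:
                    errors += 1
                for i in range(n_in):
                    # change of weight = learning rate * (desired - actual) * input activation
                    weights[i][u] += learning_rate * error * inputs[i]
        if errors == 0:                            # a full epoch with no mistakes
            return weights, epoch, errors
    return weights, max_epochs, errors

# Logical OR; the third input is a bias unit that is always on (an assumption).
or_corpus = [
    ([0, 0, 1], [0]),
    ([0, 1, 1], [1]),
    ([1, 0, 1], [1]),
    ([1, 1, 1], [1]),
]
weights, epochs_used, errors = delta_train(or_corpus, n_in=3, n_out=1)
outputs = [step(sum(x[i] * weights[i][0] for i in range(3))) for x, _ in or_corpus]
```

Because the weight changes are small, several passes through the corpus are typically needed before an epoch goes by without a single error.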

Rosenblatt believed that his work with perceptrons constituted a radical departure from, and even spelled the beginning of the end of, logic-based classical accounts of information processing (1958, 449; see also Bechtel & Abrahamsen 2002, 6). Rosenblatt was very much concerned with the abstract information-processing powers of connectionist systems, but others, like Oliver Selfridge (1959), were investigating the ability of connectionist systems to perform specific cognitive tasks, such as recognizing handwritten letters. Connectionist models began around this time to be implemented with the aid of Von Neumann devices, which, for reasons already mentioned, proved to be a blessing.

There was much exuberance associated with connectionism during this period, but it would not last long. Many point to the publication of Perceptrons by prominent classical AI researchers Marvin Minsky and Seymour Papert (1969) as the pivotal event. Minsky and Papert showed (among other things) that perceptrons cannot learn some sets of associations. The simplest of these is a mapping from truth values of statements p and q to the truth value of p XOR q (where p XOR q is true, just in case p is true or q is true but not both). No set of weights will enable a simple two-layer feed-forward perceptron to compute the XOR function. The fault here lies largely with the architecture, for feed-forward networks with one or more layers of hidden units intervening between input and output layers (see Figure 4) can be made to perform the sorts of mappings that troubled Minsky and Papert. However, these critics also speculated that three-layer networks could never be trained to converge upon the correct set of weights. This dealt connectionists a serious setback, for it helped to deprive connectionists of the AI research funds being doled out by the Defense Advanced Research Projects Agency (DARPA). Connectionists found themselves at a major competitive disadvantage, leaving classicists with the field largely to themselves for over a decade.
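The XOR problem is small enough to examine directly. The sketch below makes two illustrative moves, both under assumptions of our own choosing: it searches a coarse grid of weights and biases for a single threshold output unit (which illustrates, but of course does not prove, that no setting works), and it hand-wires a network with two hidden units, one detecting "p or q" and one detecting "p and q". The names `two_layer` and `three_layer` and the particular weight values are hypothetical.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def step(net):
    """Threshold activation: 1 if net input is positive, else 0."""
    return 1 if net > 0 else 0

def two_layer(x1, x2, w1, w2, bias):
    """Input units connected directly to a single threshold output unit."""
    return step(w1 * x1 + w2 * x2 + bias)

def three_layer(x1, x2):
    """Hand-wired network with two hidden units between input and output."""
    h_or = step(x1 + x2 - 0.5)       # active if at least one input is on
    h_and = step(x1 + x2 - 1.5)      # active only if both inputs are on
    return step(h_or - h_and - 0.5)  # "or, but not and"

# Coarse grid of candidate weights and biases: -5.0 to 5.0 in steps of 0.5.
grid = [k / 2 for k in range(-10, 11)]
two_layer_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(two_layer(x1, x2, w1, w2, b) == target for (x1, x2), target in XOR.items())
]
three_layer_correct = all(
    three_layer(x1, x2) == target for (x1, x2), target in XOR.items()
)
```

The grid search comes up empty, while the hand-wired network with hidden units gets all four cases right.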

c. The Generalized Delta Rule

In the 1980s, as classical AI research was hitting doldrums of its own, connectionism underwent a powerful resurgence thanks to the advent of the generalized delta rule (Rumelhart, Hinton, & Williams 1986). This rule, which is still the backbone of contemporary connectionist research, enables networks with one or more layers of hidden units to learn how to perform sets of input-output mappings of the sort that troubled Minsky and Papert. The simpler delta rule (discussed above) uses an error score (the difference between the actual activation level of an output unit and its desired activation level) and the incoming unit’s activation level to determine how much to alter a given weight. The generalized delta rule works roughly the same way for the layer of connections running from the final layer of hidden units to the output units. For a connection running into a hidden unit, the rule calculates how much the hidden unit contributed to the total error signal (the sum of the individual output unit error signals) rather than the error signal of any particular unit. It adjusts the connection from a unit in a still earlier layer to that hidden unit based upon the activity of the former and based upon the latter’s contribution to the total error score. This process can be repeated for networks of varying depth. Put differently, the generalized delta rule enables backpropagation learning, where an error signal propagates backwards through multiple layers in order to guide weight modifications.
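The backward flow of error can be sketched for a small network trained on XOR. Everything specific in this example is an assumption chosen for brevity: sigmoid units, a squared-error measure, three hidden units, bias inputs that are always on, a learning rate of 0.5 and the function name `train_xor`. Depending on the random starting weights, such a network may or may not fully master XOR within the allotted epochs, so the sketch records the total error per epoch rather than promising a particular outcome.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(n_hidden=3, learning_rate=0.5, epochs=2000, seed=1):
    """Train a three-layer network on XOR; return the total error for each epoch."""
    rng = random.Random(seed)
    # w_hidden[j]: weights from the two inputs plus a bias into hidden unit j.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    # w_out: weights from each hidden unit plus a bias into the single output unit.
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    corpus = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for (x1, x2), desired in corpus:
            # Forward pass.
            inputs = (x1, x2, 1.0)
            hidden = [sigmoid(sum(w * a for w, a in zip(w_j, inputs)))
                      for w_j in w_hidden]
            hidden_b = hidden + [1.0]
            out = sigmoid(sum(w * a for w, a in zip(w_out, hidden_b)))
            total_error += 0.5 * (desired - out) ** 2
            # Error signal at the output unit (as in the simple delta rule,
            # scaled by the slope of the sigmoid).
            delta_out = (desired - out) * out * (1 - out)
            # Each hidden unit's share of the error, passed back through its
            # (not yet updated) connection to the output unit.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(n_hidden)]
            # Weight changes: learning rate * error signal * incoming activation.
            for j in range(n_hidden + 1):
                w_out[j] += learning_rate * delta_out * hidden_b[j]
            for j in range(n_hidden):
                for i in range(3):
                    w_hidden[j][i] += learning_rate * delta_hidden[j] * inputs[i]
        history.append(total_error)
    return history

history = train_xor()
```

Comparing early and late entries of `history` gives a rough picture of how the error changes as the backward-propagated signal reshapes the hidden-layer weights.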

Figure 4: Three-layer Network [Created using Simbrain 2.0]

4. Connectionist Models Aplenty

Connectionism sprang back onto the scene in 1986 with a monumental two-volume compendium of connectionist modeling techniques (volume 1) and models of psychological processes (volume 2) by David Rumelhart, James McClelland and their colleagues in the Parallel Distributed Processing (PDP) research group. Each chapter of the second volume describes a connectionist model of some particular cognitive process along with a discussion of how the model departs from earlier ways of understanding that process. It included models of schemata (large scale data structures), speech recognition, memory, language comprehension, spatial reasoning and past-tense learning. Alongside this compendium, and in its wake, came a deluge of further models.

Although this new breed of connectionism was occasionally lauded as marking the next great paradigm shift in cognitive science, mainstream connectionist research has not tended to be directed at overthrowing previous ways of thinking. Rather, connectionists seem more interested in offering a deeper look at facets of cognitive processing that have already been recognized and studied in disciplines like cognitive psychology, cognitive neuropsychology and cognitive neuroscience. What are highly novel are the claims made by connectionists about the precise form of internal information processing. Before getting to those claims, let us first discuss a few other connectionist architectures.

a. Elman’s Recurrent Nets

Over the course of his investigation into whether or not a connectionist system can learn to master the complicated grammatical principles of a natural language such as English, Jeffrey Elman (1990) helped to pioneer a powerful, new connectionist architecture, sometimes known as an Elman net. This work posed a direct challenge to Chomsky’s proposal that humans are born with an innate language acquisition device, one that comes preconfigured with vast knowledge of the space of possible grammatical principles. One of Chomsky’s main arguments against Skinner’s behaviorist theory of language-learning was that no general learning principles could enable humans to produce and comprehend a limitless number of grammatical sentences. Although connectionists had attempted (for example, with the aid of finite state grammars) to show that human languages could be mastered by general learning devices, sentences containing multiple center-embedded clauses (“The cats the dog chases run away,” for instance) proved a major stumbling block. To produce and understand such a sentence requires one to be able to determine subject-verb agreements across the boundaries of multiple clauses by attending to contextual cues presented over time. All of this requires a kind of memory for preceding context that standard feed-forward connectionist systems lack.

Elman’s solution was to incorporate a side layer of context units that receive input from and send output back to a hidden unit layer. In its simplest form, an input is presented to the network and activity propagates forward to the hidden layer. On the next step (or cycle) of processing, the hidden unit vector propagates forward through weighted connections to generate an output vector while at the same time being copied onto a side layer of context units. When the second input is presented (the second word in a sentence, for example), the new hidden layer activation is the product of both this second input and activity in the context layer – that is, the hidden unit vector now contains information about both the current input and the preceding one. The hidden unit vector then produces an output vector as well as a new context vector. When the third item is input, a new hidden unit vector is produced that contains information about all of the previous time steps, and so on. This process provides Elman’s networks with time-dependent contextual information of the sort required for language-processing. Indeed, his networks are able to form highly accurate predictions regarding which words and word forms are permissible in a given context, including those that involve multiple embedded clauses.
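The role of the context units can be sketched by tracing hidden-unit vectors through a short sequence. This is a forward pass only, under several assumptions: the weights are random and untrained, the output layer is omitted, words are one-hot vectors, hidden units use a tanh activation, and the names `make_elman` and `elman_hidden_states` are hypothetical. The aim is just to show that the same input yields a different hidden vector depending on what preceded it.

```python
import math
import random

def make_elman(n_in, n_hidden, seed=0):
    """Return random (untrained) input-to-hidden and context-to-hidden weights."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_context = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
    return w_in, w_context

def elman_hidden_states(sequence, w_in, w_context):
    """Return the hidden-unit vector produced at each step of the sequence."""
    n_hidden = len(w_in)
    context = [0.0] * n_hidden          # context units start out silent
    states = []
    for inputs in sequence:
        hidden = [
            math.tanh(
                sum(w * a for w, a in zip(w_in[j], inputs))
                + sum(w * c for w, c in zip(w_context[j], context))
            )
            for j in range(n_hidden)
        ]
        states.append(hidden)
        context = list(hidden)          # copy the hidden vector onto the context units
    return states

# Three hypothetical one-hot "words".
A, B, C = [1, 0, 0], [0, 1, 0], [0, 0, 1]
w_in, w_context = make_elman(n_in=3, n_hidden=4)

b_after_a = elman_hidden_states([A, B], w_in, w_context)[-1]
b_after_c = elman_hidden_states([C, B], w_in, w_context)[-1]
b_alone = elman_hidden_states([B], w_in, w_context)[-1]
```

The three hidden vectors for B differ from one another, which is the sense in which the hidden layer carries information about preceding context as well as the current input.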

While Chomsky (1993) has continued to self-consciously advocate a shift back towards the nativist psychology of the rationalists, Elman and other connectionists have at least bolstered the plausibility of a more austere empiricist approach. Connectionism is, however, much more than a simple empiricist associationism, for it is at least compatible with a more complex picture of internal dynamics. For one thing, to maintain consistency with the findings of mainstream neuropsychology, connectionists ought to (and one suspects that most do) allow that we do not begin life with a uniform, amorphous cognitive mush. Rather, as mentioned earlier, the cognitive load may be divided among numerous, functionally distinct components. Moreover, even individual feed-forward networks are often tasked with unearthing complicated statistical patterns exhibited in large amounts of data. Indeed, just how complicated this process can be is indicated by the fact that analyzing how connectionist systems manage to accomplish the impressive things that they do has turned out to be a major undertaking unto itself (see Section 5).

b. Interactive Architectures

There are, it is important to realize, connectionist architectures that do not incorporate the kinds of feed-forward connections upon which we have so far concentrated. For instance, McClelland and Rumelhart’s (1989) interactive activation and competition (IAC) architecture and its many variants utilize excitatory and inhibitory connections that run back and forth between the units in different groups. In IAC models, weights are hard-wired rather than learned and units are typically assigned their own particular, fixed meanings. When a set of units is activated so as to encode some piece of information, activity may shift around a bit, but as units compete with one another to become most active through inter-unit inhibitory connections, activity will eventually settle into a stable state. The stable state may be viewed as the network’s reaction to the stimulus, which, depending upon the process being modeled, might be a semantic interpretation, a classification or a mnemonic association. The IAC architecture has proven particularly effective at modeling phenomena associated with long-term memory (content addressability, priming and language comprehension, for instance). The connection weights in IAC models can be set in various ways, including on the basis of individual hand selection, simulated evolution or statistical analysis of naturally occurring data (for example, co-occurrence of words in newspapers or encyclopedias (Kintsch 1998)).
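The settling process can be illustrated with a deliberately stripped-down sketch. It is not McClelland and Rumelhart’s actual set of equations: the update rule, the parameter values, the clipping of activations to the range 0 to 1 and the function name `settle` are all assumptions made for this example. Three units with fixed meanings receive slightly different amounts of external support and inhibit one another through hard-wired connections.

```python
def settle(external, inhibition=1.0, decay=0.1, rate=0.1, steps=200):
    """Let mutually inhibiting units compete; return their final activations."""
    acts = [0.0] * len(external)
    for _ in range(steps):
        total = sum(acts)
        new_acts = []
        for a, ext in zip(acts, external):
            # Net input: external support, minus inhibition from every other
            # unit, minus a decay term pulling the unit back toward rest.
            net = ext - inhibition * (total - a) - decay * a
            # All units are updated together, and activations stay within [0, 1].
            new_acts.append(min(1.0, max(0.0, a + rate * net)))
        acts = new_acts
    return acts

# Hypothetical external support for three competing interpretations.
final = settle([0.6, 0.5, 0.4])
winner = final.index(max(final))
```

Activity shifts around during the early steps, but the unit with the most external support ends up strongly active while its competitors are suppressed.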

An architecture that incorporates similar competitive processing principles, with the added twist that it allows weights to be learned, is the self-organizing feature map (SOFM) (see Kohonen 1983; see also Miikkulainen 1993). SOFMs learn to map complicated input vectors onto the individual units of a two-dimensional array of units. Unlike feed-forward systems that are supplied with information about the correct output for a given input, SOFMs learn in an unsupervised manner. Training consists simply in presenting the model with numerous input vectors. During training the network adjusts its inter-unit weights so that each unit becomes highly ‘tuned’ to a specific input vector and the two-dimensional array is divided up in ways that reflect the most salient groupings of vectors. In principle, nothing more complicated than a Hebbian learning algorithm is required to train most SOFMs. After training, when an input pattern is presented, competition yields a single clear winner (for example, the most highly active unit), which is called the system’s image (or interpretation) of that input.
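A single training step of this kind can be sketched as follows. The sketch assumes a commonly used update scheme in which the winning unit and its neighbors on the grid move their weight vectors toward the current input. The grid size, learning rate, neighborhood radius and the names `make_map`, `best_matching_unit` and `train_step` are illustrative choices, not details drawn from the sources cited above.

```python
import random

def make_map(rows, cols, dim, seed=0):
    """Return a grid of units, each with a random weight vector of length dim."""
    rng = random.Random(seed)
    return {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}

def squared_distance(v, w):
    return sum((a - b) ** 2 for a, b in zip(v, w))

def best_matching_unit(som, x):
    """Competition: the winner is the unit whose weights lie closest to the input."""
    return min(som, key=lambda pos: squared_distance(som[pos], x))

def train_step(som, x, learning_rate=0.5, radius=1):
    """Tune the winner and its grid neighbors toward input x; return the winner."""
    winner = best_matching_unit(som, x)
    for pos, w in som.items():
        grid_distance = abs(pos[0] - winner[0]) + abs(pos[1] - winner[1])
        if grid_distance <= radius:
            # Neighbors are adjusted less strongly than the winner itself.
            strength = learning_rate / (1 + grid_distance)
            for k in range(len(w)):
                w[k] += strength * (x[k] - w[k])
    return winner

som = make_map(rows=3, cols=3, dim=2)
x = [0.9, 0.1]
winner_before = best_matching_unit(som, x)
distance_before = squared_distance(som[winner_before], x)
winner = train_step(som, x)
distance_after = squared_distance(som[winner], x)
```

No correct output is supplied at any point. Repeating the step over many input vectors is what gradually tunes individual units to particular regions of the input space.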

SOFMs were coming into their own even during the connectionism drought of the 1970s, thanks in large part to Finnish researcher Teuvo Kohonen. Ultimately it was found that with proper learning procedures, trained SOFMs exhibit a number of biologically interesting features that will be familiar to anyone who knows a bit about topographic maps (for example, retinotopic, tonotopic and somatotopic) in the mammalian cortex. SOFMs tend not to allow a portion of the map to go unused; they represent similar input vectors with neighboring units, which collectively amount to a topographic map of the space of input vectors; and if a training corpus contains many similar input vectors, the portion of the map devoted to the task of discriminating between them will expand, resulting in a map with a distorted topography. SOFMs have even been used to model the formation of retinotopically organized columns of contour detectors found in the primary visual cortex (Goodhill 1993). SOFMs thus reside somewhere along the upper end of the biological-plausibility continuum.

Here we have encountered just a smattering of connectionist learning algorithms and architectures, which continue to evolve. Indeed, despite what in some quarters has been a protracted and often heated debate between connectionists and classicists (discussed below), many researchers are content to move back and forth between, and also to merge, the two approaches depending upon the task at hand.

5. Making Sense of Connectionist Processing

Connectionist systems generally learn by detecting complicated statistical patterns present in huge amounts of data. This often requires detection of complicated cues as to the proper response to a given input, the salience of which often varies with context. This can make it difficult to determine precisely how a given connectionist system utilizes its units and connections to accomplish the goals set for it.

One common way of making sense of the workings of connectionist systems is to view them at a coarse, rather than fine, grain of analysis — to see them as concerned with the relationships between different activation vectors, not individual units and weighted connections. Consider, for instance, how a fully trained Elman network learns how to process particular words. Typically nouns like “ball,” “boy,” “cat,” and “potato” will produce hidden unit activation vectors that are more similar to one another (they tend to cluster together) than they are to “runs,” “ate,” and “coughed”. Moreover, the vectors for “boy” and “cat” will tend to be more similar to each other than either is to the “ball” or “potato” vectors. One way of determining that this is the case is to begin by conceiving activation vectors as points within a space that has as many dimensions as there are units. For instance, the activation levels of two units might be represented as a single point in a two-dimensional plane where the y axis represents the value of the first unit and the x axis represents the second unit. This is called the state space for those units. Thus, if there are two units whose activation values are 0.2 and 0.7, this can be represented as the point where these two values intersect (Figure 5).

Figure 5: Activation of Two Units Plotted as Point in 2-D State Space

The activation levels of three units can be represented as the point in a cube where the three values intersect, and so on for other numbers of units. Of course, there is a limit to the number of dimensions we can depict or visualize, but there is no limit to the number of dimensions we can represent algebraically. Thus, even where many units are involved, activation vectors can be represented as points in high-dimensional space and the similarity of two vectors can be determined by measuring the proximity of those points in high-dimensional state space. This, however, tells us nothing about the way context determines the specific way in which networks represent particular words. Other techniques (for example, principal components analysis and multidimensional scaling) have been employed to understand such subtleties as the context-sensitive time-course of processing.
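Measuring proximity in state space is straightforward to sketch. The three-unit activation vectors below are invented purely for illustration (they are not taken from any trained network), and the function name `state_space_distance` is likewise hypothetical. Euclidean distance is used here as one simple measure of proximity.

```python
import math

def state_space_distance(v, w):
    """Euclidean distance between two activation vectors of equal length."""
    return math.dist(v, w)

# Hypothetical hidden-unit activation vectors for four words.
vectors = {
    "boy":  [0.9, 0.8, 0.1],
    "cat":  [0.8, 0.9, 0.2],
    "ball": [0.7, 0.2, 0.1],
    "runs": [0.1, 0.2, 0.9],
}

boy_cat = state_space_distance(vectors["boy"], vectors["cat"])
boy_ball = state_space_distance(vectors["boy"], vectors["ball"])
boy_runs = state_space_distance(vectors["boy"], vectors["runs"])
```

With these made-up values, "boy" lies closest to "cat", somewhat farther from "ball" and farthest from "runs", mirroring the kind of clustering described above. The same calculation applies unchanged to vectors with hundreds of units.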

One of the interesting things revealed about connectionist systems through these sorts of techniques has been that networks which share the same connection structure but begin training with different random starting weights will often learn to perform a given task equally well and to do so by partitioning hidden unit space in similar ways. For instance, the clustering in Elman’s models discussed above will likely obtain for different networks even though they have very different weights and activities at the level of individual connections and units.

At this point, we are also in a good position to understand some differences in how connectionist networks code information. In the simplest case, a particular unit will represent a particular piece of information – for instance, our hypothetical network about animals uses particular units to represent particular features of animals. This is called a localist encoding scheme. In other cases an entire collection of activation values is taken to represent something – for instance, an entire input vector of our hypothetical animal classification network might represent the characteristics of a particular animal. This is a distributed coding scheme at the whole animal level, but still a local encoding scheme at the feature level. When we turn to hidden-unit representations, however, things are often quite different. Hidden-unit representations of inputs are often distributed without employing localist encoding at the level of individual units. That is, particular hidden units often fail to have any particular input feature that they are exclusively sensitive to. Rather, they participate in different ways in the processing of many different kinds of input. This is called coarse coding, and there are ways of coarse coding input and output patterns as well. The fact that connectionist networks excel at forming and processing these highly distributed representations is one of their most distinctive and important features.

Also important is that connectionist models often excel at processing novel input patterns (ones not encountered during training) appropriately. Successful performance of a task will often generalize to other related tasks. This is because connectionist models often work by detecting statistical patterns present in a corpus (of input-output pairs, for instance). They learn to process particular inputs in particular ways, and when they encounter inputs similar to those encountered during training they process them in a similar manner. For instance, Elman’s networks were trained to determine which words and word forms to expect given a particular context (for example, “The boy threw the ______”). After training, they could do this very well even for sentence parts they had not encountered before. One caveat here is that connectionist systems with numerous hidden units (relative to the amount of variability in the training corpus) tend to use the extra memory to ‘remember by rote’ how to treat specific input patterns rather than discerning more abstract statistical patterns obtaining across many different input-output vectors. Consequently, in such cases performance tends not to generalize to novel cases very well.

As we have seen, connectionist networks have a number of desirable features from a cognitive modeling standpoint. There are, however, also serious concerns about connectionism. One is that connectionist models must usually undergo a great deal of training on many different inputs in order to perform a task and exhibit adequate generalization. In many instances, however, we can form a permanent memory (upon being told of a loved one’s passing, for example) with zero repetition (this was also a major blow to the old psychological notion that rehearsal is required for a memory to make it into long-term storage). Nor is there much need to fear that subsequent memories will overwrite earlier ones, a process known in connectionist circles as catastrophic interference. We can also very quickly detect patterns in stimuli (for instance, the pattern exhibited by “J, M, P…”) and apply them to new stimuli (for example, “7, 10, 13…”). Unfortunately, many (though not all) connectionist networks (namely many back-propagation networks) fail to exhibit one-shot learning and are prone to catastrophic interference.

Another worry about back-propagation networks is that the generalized delta rule is, biologically speaking, implausible. It certainly does look that way so far, but even if the criticism hits the mark we should bear in mind the difference between computability theory questions and learning theory questions. In the case of connectionism, questions of the former sort concern what sorts of things connectionist systems can and cannot do and questions of the latter address how connectionist systems might come to learn (or evolve) the ability to do these things. The back-propagation algorithm makes the networks that utilize it implausible from the perspective of learning theory, not computability theory. It should, in other words, be viewed as a major accomplishment when a connectionist network that utilizes only biologically plausible processing principles (for example, activation thresholds and weighted connections) is able to perform a cognitive task that had hitherto seemed mysterious. It constitutes a biologically plausible model of the underlying mechanisms regardless of whether or not it came to possess that structure through hand-selection of weights, Hebbian learning, back-propagation or simulated evolution.

6. Connectionism and the Mind

The classical conception of cognition was deeply entrenched in philosophy (namely in empirically oriented philosophy of mind) and cognitive science when the connectionist program was resurrected in the 1980s. Nevertheless, many researchers flocked to connectionism, feeling that it held much greater promise and that it might revamp our common-sense conception of ourselves. During the early days of the ensuing controversy, the differences between connectionist and classical models of cognition seemed to be fairly stark. Connectionist networks learned how to engage in the parallel processing of highly distributed representations and were fault tolerant because of it. Classical systems were vulnerable to catastrophic failure due to their reliance upon the serial application of syntax-sensitive rules to syntactically structured (sentence-like) representations. Connectionist systems superimposed many kinds of information across their units and weights, whereas classical systems stored separate pieces of information in distinct memory registers and accessed them in serial fashion on the basis of their numerical addresses.

Perhaps most importantly, connectionism promised to bridge low-level neuroscience and high-level psychology. Classicism, by contrast, lent itself to dismissive views about the relevance of neuroscience to psychology. It helped spawn the idea that cognitive processes can be realized by any of countless distinct physical substrates (see Multiple Realizability). The basic idea here is that if the mind is just a program being run by the brain, the material substrate through which the program is instantiated drops out as irrelevant. After all, computationally identical computers can be made out of neurons, vacuum tubes, microchips, pistons and gears, and so forth, which means that computer programs can be run on highly heterogeneous machines. Neural nets are but one of these types, and so they are of no essential relevance to psychology. On the connectionist view, by contrast, human cognition can only be understood by paying considerable attention to the kind of physical mechanism that instantiates it.

Although these sorts of differences seemed fairly stark in the early days of the connectionism-classicism debate, proponents of the classical conception have recently made great progress emulating the aforementioned virtues of connectionist processing. For instance, classical systems have been implemented with a high degree of redundancy, through the action of many processors working in parallel, and by incorporating fuzzier rules to allow for input variability. In these ways, classical systems can be endowed with a much higher level of fault and noise tolerance, not to mention processing speed (see Bechtel & Abrahamsen 2002). We should also not lose sight of the fact that classical systems have virtually always been capable of learning. They have, in particular, long excelled at learning new ways to efficiently search branching problem spaces. That said, connectionist systems seem to have a very different natural learning aptitude – namely, they excel at picking up on complicated patterns, sub-patterns, and exceptions, and apparently without the need for syntax-sensitive inference rules. This claim has, however, not gone uncontested.

a. Rules versus General Learning Mechanisms: The Past-Tense Controversy

Rumelhart and McClelland’s (1986) model of past-tense learning has long been at the heart of this particular controversy. What these researchers claimed to have shown was that over the course of learning how to produce past-tense forms of verbs, their connectionist model naturally exhibited the same distinctive u-shaped learning curve as children. Of particular interest was the fact that early in the learning process children come to generate the correct past-tense forms of a number of verbs, mostly irregulars (“go” → “went”). Later, performance drops precipitously as they pick up on certain fairly general principles (for example, adding “-ed”) and over-apply them even to previously learned irregulars (“went” may become “goed”). Lastly, performance increases as the child learns both the rules and their exceptions.

What Rumelhart and McClelland (1986) attempted to show was that this sort of process need not be underwritten by mechanisms that work by applying physically and functionally distinct rules to representations. Instead, all of the relevant information can be stored in superimposed fashion within the weights of a connectionist network (really three of them linked end-to-end). Pinker and Prince (1988), however, would charge (inter alia) that the picture of linguistic processing painted by Rumelhart and McClelland was extremely simplistic and that their training corpus was artificially structured (namely, that the proportion of regular to irregular verbs varied unnaturally over the course of training) so as to elicit u-shaped learning. Plunkett and Marchman (1993) went a long way towards remedying the second apparent defect, though Marcus (1995) complained that they did not go far enough since the proportion of regular to irregular verbs was still not completely homogeneous throughout training. As with most of the major debates constituting the broader connectionist-classicist controversy, this one has yet to be conclusively resolved. Nevertheless, it seems clear that this line of connectionist research does at least suggest something of more general importance – namely, that an interplay between a structured environment and general associative learning mechanisms might in principle conspire so as to yield complicated behaviors of the sort that lead some researchers to posit inner classical processes.

b. Concepts

Some connectionists also hope to challenge the classical account of concepts, which embody knowledge of categories and kinds. It has long been widely held that concepts specify the singularly necessary and jointly sufficient conditions for category membership – for instance, “bachelor” might be said to apply to all and only unmarried, eligible males. Membership conditions of this sort would give concepts a sharp, all-or-none character, and they naturally lend themselves to instantiation in terms of formal inference rules and sentential representations. However, as Wittgenstein (1953) pointed out, many words (for example, “game”) seem to lack these sorts of strict membership criteria. Instead, their referents bear a much looser family resemblance relation to one another. Rosch & Mervis (1975) later provided apparent experimental support for the related idea that our knowledge of categories is organized not in terms of necessary and sufficient conditions but rather in terms of clusters of features, some of which (namely those most frequently encountered in category members) are more strongly associated with the category than others. For instance, the ability to fly is more frequently encountered in birds than is the ability to swim, though neither ability is common to all birds. On the prototype view (and also on the closely related exemplar view), category instances are thought of as clustering together in what might be thought of as a hyper-dimensional semantic space (a space in which there are as many dimensions as there are relevant features). In this space, the prototype is the central region around which instances cluster (exemplar theory essentially does away with this abstract region, allowing only for memory of actual concrete instances). 
There are clearly significant isomorphisms between concepts conceived of in this way and the kinds of hyper-dimensional clusters of hidden unit representations formed by connectionist networks, and so the two approaches are often viewed as natural allies (Horgan & Tienson 1991). This way of thinking about concepts has, of course, not gone unchallenged (see Rey 1983 and Barsalou 1987 for two very different responses).
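
To make the geometric picture concrete, the following minimal Python sketch treats category instances as points in a small feature space. The three features, the four instances, and the distance-based similarity measures are illustrative assumptions only; they are not data or models taken from Rosch & Mervis (1975) or from any particular connectionist network. The prototype is computed as the mean of the stored instances, whereas the exemplar approach compares a new item directly with each stored instance.

```python
import math

# Hypothetical feature vectors: (flies, swims, sings), each coded 0 or 1.
BIRDS = {
    "robin":   (1, 0, 1),
    "sparrow": (1, 0, 1),
    "duck":    (1, 1, 0),
    "penguin": (0, 1, 0),
}

def distance(a, b):
    """Euclidean distance between two points in feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prototype(instances):
    """Prototype view: the central point, here the mean of all instances."""
    n = len(instances)
    dims = len(instances[0])
    return tuple(sum(v[i] for v in instances) / n for i in range(dims))

def prototype_typicality(item, instances):
    """Closeness to the prototype (a higher value means more typical)."""
    return -distance(item, prototype(instances))

def exemplar_similarity(item, instances):
    """Exemplar view: summed similarity to every stored instance,
    with no abstract central region involved."""
    return sum(math.exp(-distance(item, v)) for v in instances)

vectors = list(BIRDS.values())
center = prototype(vectors)
print("prototype:", center)
for name, vec in BIRDS.items():
    print(name,
          round(prototype_typicality(vec, vectors), 3),
          round(exemplar_similarity(vec, vectors), 3))
```

On these made-up values both measures rank the robin as a more typical bird than the penguin, even though no single feature is shared by every instance.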

c. Connectionism and Eliminativism

Neuroscientist Patricia Churchland and philosopher Paul Churchland have argued that connectionism has done much to weaken the plausibility of our pre-scientific conception of mental processes (our folk psychology). Like other prominent figures in the debate regarding connectionism and folk psychology, the Churchlands appear to be heavily influenced by Wilfrid Sellars’ view that folk psychology is a theory that enables predictions and explanations of everyday behaviors, a theory that posits the internal manipulation of sentence-like representations of the things that we believe and desire. The classical conception of cognition is, accordingly, viewed as a natural spinoff of this folk theory. The Churchlands maintain that neither the folk theory nor the classical theory bears much resemblance to the way in which representations are actually stored and transformed in the human brain. What leads many astray, say Churchland and Sejnowski (1990), is the idea that the structure of an effect directly reflects the structure of its cause (as exemplified by the homuncular theory of embryonic development). Thus, many mistakenly think that the structure of the language through which we express our thoughts is a clear indication of the structure of the thoughts themselves. The Churchlands think that connectionism may afford a glimpse into the future of cognitive neuroscience, a future wherein the classical conception is supplanted by the view that thoughts are just points in hyper-dimensional neural state space and sequences of thoughts are trajectories through this space (see Churchland 1989).

A more moderate position on these issues has been advanced by Daniel Dennett (1991), who largely agrees with the Churchlands regarding the broadly connectionist character of our actual inner workings. He also maintains, however, that folk psychology is for all practical purposes indispensable. It enables us to adopt a high-level stance towards human behavior wherein we are able to detect patterns that we would miss if we restricted ourselves to a low-level neurological stance. In the same way, he claims, one can gain great predictive leverage over a chess-playing computer by ignoring the low-level details of its inner circuitry and treating it as a thinking opponent. Although an electrical engineer who had perfect information about the device’s low-level inner workings could in principle make much more accurate predictions about its behavior, she would get so bogged down in those low-level details as to make her greater predictive leverage useless for any real-time practical purposes. The chess expert wisely forsakes some accuracy in favor of a large increase in efficiency when he treats the machine as a thinking opponent, an intentional agent. Dennett maintains that we do the same when we adopt an intentional stance towards human behavior. Thus, although neuroscience will not discover any of the inner sentences (putatively) posited by folk psychology, a high-level theoretical apparatus that includes them is an indispensable predictive instrument.

On a related note, McCauley (1986) claims that whereas it is relatively common for one high-level theory to be eliminated in favor of another, it is much harder to find examples where a high-level theory is eliminated in favor of a lower-level theory in the way that the Churchlands envision. However, perhaps neither Dennett nor McCauley is being entirely fair to the Churchlands in this regard. What the Churchlands foretell is the elimination of a high-level folk theory in favor of another high-level theory that emanates out of connectionist and neuroscientific research. Connectionists, we have seen, look for ways of understanding how their models accomplish the tasks set for them by abstracting away from neural particulars. The Churchlands, one might argue, are no exception. Their view that sequences of thoughts are trajectories through a hyperdimensional landscape abstracts away from most neural specifics, such as action potentials and inhibitory neurotransmitters.

d. Classicists on the Offensive: Fodor and Pylyshyn’s Critique

When connectionism reemerged in the 1980s, it helped to foment resistance to both classicism and folk psychology. In response, stalwart classicists Jerry Fodor and Zenon Pylyshyn (1988) formulated a trenchant critique of connectionism. One imagines that they hoped to do for the new connectionism what Chomsky did for the associationist psychology of the radical behaviorists and what Minsky and Papert did for the old connectionism. They did not accomplish that much, but they did succeed in framing the debate over connectionism for years to come. Though their criticisms of connectionism were wide-ranging, they were largely aimed at showing that connectionism could not account for important characteristics of human thinking, such as its generally truth-preserving character, its productivity, and (most important of all) its systematicity. Of course they had no qualms with the proposal that vaguely connectionist-style processes happen, in the human case, to implement high-level, classical computations.

i. Reason

Unlike Dennett and the Churchlands, Fodor and Pylyshyn (F&P) claim that folk psychology works so well because it is largely correct. On their view, human thinking involves the rule-governed formulation and manipulation of sentences in an inner linguistic code (sometimes called mentalese). [Incidentally, one of the main reasons why classicists maintain that thinking occurs in a special ‘thought language’ rather than in one’s native natural language is that they want to preserve the notion that people who speak different languages can nevertheless think the same thoughts – for instance, the thought that snow is white.] One bit of evidence that Fodor frequently marshals in support of this proposal is the putative fact that human thinking typically progresses in a largely truth-preserving manner. That is to say, if one’s initial beliefs are true, the subsequent beliefs that one infers from them are also likely to be true. For instance, from the belief that the ATM will not give you any money and the belief that it gave money to the people before and after you in line, you might reasonably form a new belief that there is something wrong with either your card or your account. Says Fodor (1987), if thinking were not typically truth-preserving in this way, there wouldn’t be much point in thinking. Indeed, given a historical context in which philosophers throughout the ages frequently decried the notion that any mechanism could engage in reasoning, it is no small matter that early work in AI yielded the first fully mechanical models and perhaps even artificial implementations of important facets of human reasoning. On the classical conception, this can be done through the purely formal, syntax-sensitive application of rules to sentences insofar as the syntactic properties mirror the semantic ones. 
Logicians of the late nineteenth and early twentieth century showed how to accomplish just this in the abstract, so all that was left was to figure out (as von Neumann did) how to realize logical principles in artifacts.
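
The idea that truth-preservation can be secured by purely formal, syntax-sensitive rule application can be illustrated with a minimal Python sketch. The tuple encoding of sentences, the atom names, and the function names below are hypothetical conveniences and are not drawn from Fodor's work or from any particular AI system. The point is only that the procedure inspects the shape of each sentence, never its meaning, and yet moves from true premises to a true conclusion in the ATM example.

```python
# Sentences are nested tuples; atomic sentences are plain strings.
# ("if", A, B) stands for "if A then B"; ("and", A, B) for "A and B".

def holds(sentence, derived):
    """Check a sentence against what has been derived, by form alone."""
    if isinstance(sentence, tuple) and sentence[0] == "and":
        return holds(sentence[1], derived) and holds(sentence[2], derived)
    return sentence in derived

def forward_chain(beliefs):
    """Apply modus ponens (from "if A then B" and A, derive B)
    repeatedly until no new sentence can be added."""
    derived = set(beliefs)
    changed = True
    while changed:
        changed = False
        for s in list(derived):
            if isinstance(s, tuple) and s[0] == "if":
                _, antecedent, consequent = s
                if holds(antecedent, derived) and consequent not in derived:
                    derived.add(consequent)
                    changed = True
    return derived

# The ATM example, with made-up atom names.
beliefs = {
    "atm_refuses_me",
    "atm_paid_others",
    ("if", ("and", "atm_refuses_me", "atm_paid_others"),
     "card_or_account_problem"),
}
conclusions = forward_chain(beliefs)
print("card_or_account_problem" in conclusions)
```

Nothing in `forward_chain` depends on what the atoms are about; replacing every atom with an arbitrary label would leave its behavior unchanged, which is the sense in which the process is syntax-sensitive.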

F&P (1988) argue that connectionist systems can only ever realize the same degree of truth-preserving processing by implementing a classical architecture. This would, on their view, render connectionism a sub-cognitive endeavor. One way connectionists could respond to this challenge would be to create connectionist systems that support truth-preservation without any reliance upon sentential representations or formal inference rules. Bechtel and Abrahamsen (2002) explore another option, however, which is to situate important facets of rationality in human interactions with the external symbols of natural and formal languages. Bechtel and Abrahamsen argue that “the ability to manipulate external symbols in accordance with the principles of logic need not depend upon a mental mechanism that itself manipulates internal symbols” (1991, 173). This proposal is backed by a pair of connectionist models that learn to detect patterns during the construction of formal deductive proofs and to use this information to decide on the validity of arguments and to accurately fill in missing premises.

ii. Productivity and Systematicity

Much more attention has been paid to other aspects of F&P’s (1988) critique, such as their claim that only a classical architecture can account for the productivity and systematicity of thought. To better understand the nature of their concerns, it might help if we first consider the putative productivity and systematicity of natural languages.

Consider, to start with, the following sentence:

(1)  “The angry jay chased the cat.”

The rules governing English appear to license (1), but not (2), which is made from (modulo capitalization) qualitatively identical parts:

(2)  “Angry the the chased jay cat.”

We who are fluent in some natural language have knowledge of the rules that govern the permissible ways in which the basic components of that language can be arranged – that is, we have mastery of the syntax of the language.

Sentences are, of course, also typically intended to carry or convey some meaning. The meaning of a sentence, say F&P (1988), is determined by the meanings of the individual constituents and by the manner in which they are arranged. Thus (3), which is made from the same constituents as (1), conveys a very different meaning.

(3)  “The angry cat chased the jay.”

Natural language expressions, in other words, have a combinatorial syntax and semantics.

In addition, natural languages appear to be characterized by certain recursive rules which enable the production of an infinite variety of syntactically distinct sentences. For instance, in English one such rule allows any two grammatical statements to be combined with ‘and’. Thus, if (1) and (3) are grammatical, so is this:

(4)  “The angry jay chased the cat and the angry cat chased the jay.”

Sentence (4) too can be combined with another, as in (5) which conjoins (4) and (3):

(5)  “The angry jay chased the cat and the angry cat chased the jay, and the angry cat chased the jay.”

Earlier we discussed another recursive principle which allows for center-embedded clauses.

One who has mastered the combinatorial and recursive syntax and semantics of a natural language is, according to classicists like F&P (1988), thereby capable in principle of producing and comprehending an infinite number of grammatically distinct sentences. In other words, their mastery of these linguistic principles gives them a productive linguistic competence. It is also reputed to give them a systematic competence, in that a fluent language user who can produce and understand one sentence can produce and understand systematic variants. A fluent English speaker who can produce and understand (1) will surely be able to produce and understand (3). It is, on the other hand, entirely possible for one who has learned English from a phrase-book (that is, without learning the meanings of the constituents or the combinatorial semantics of the language) to be able to produce and understand (1) but not its systematic variant (3).
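
The contrast between a combinatorial competence and a phrase-book competence can be sketched in a few lines of Python. The toy lexicon, the single sentence frame, and the function names are assumptions made purely for illustration; F&P (1988) offer no such program. A speaker modeled by the grammar gets the systematic variant for free, and the recursive conjunction rule can be reapplied without limit, whereas the phrase-book speaker knows only the strings that happen to be listed.

```python
from itertools import product

# Hypothetical toy lexicon.
NOUNS = ["jay", "cat"]
ADJECTIVES = ["angry"]
VERBS = ["chased"]

def simple_sentences():
    """Every sentence of the form 'The ADJ NOUN VERB the NOUN.'"""
    out = []
    for adj, subj, verb, obj in product(ADJECTIVES, NOUNS, VERBS, NOUNS):
        if subj != obj:
            out.append(f"The {adj} {subj} {verb} the {obj}.")
    return out

def conjoin(a, b):
    """Recursive rule: any two grammatical sentences may be joined by 'and'.
    The result is itself a sentence, so the rule can be applied again."""
    return f"{a[:-1]} and {b[0].lower()}{b[1:]}"

# A phrase-book speaker has memorized (1) as an unanalyzed whole.
PHRASE_BOOK = {"The angry jay chased the cat."}

grammar_sentences = simple_sentences()
s1, s3 = grammar_sentences          # sentences (1) and (3) from the text
s4 = conjoin(s1, s3)                # sentence (4)
longer = conjoin(s4, s3)            # conjoins (4) and (3), as (5) does

print(s3 in grammar_sentences)      # systematic variant is available
print(s3 in PHRASE_BOOK)            # but not to the phrase-book speaker
print(longer)
```

Because `conjoin` accepts its own output, the set of licensed sentences has no upper bound, which is the sense of productivity at issue.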

Thinking, F&P (1988) claim, is also productive and systematic, which is to say that we are capable of thinking an infinite variety of thoughts and that the ability to think some thoughts is intrinsically connected with the ability to think others. For instance, on this view, anyone who can think the thought expressed by (1) will be able to think the thought expressed by (3). Indeed, claims Fodor (1987), since to understand a sentence is to entertain the thought the sentence expresses, the productivity and systematicity of language imply the productivity and systematicity of thought. F&P (1988) also maintain that just as the productivity and systematicity of language is best explained by its combinatorial and recursive syntax and semantics, so too is the productivity and systematicity of thought. Indeed, they say, this is the only explanation anyone has ever offered.

The systematicity issue has generated a vast debate (see Bechtel & Abrahamsen 2002), but one general line of connectionist response has probably garnered the most attention. This approach, which appeals to functional rather than literal compositionality (see van Gelder 1990), is most often associated with Smolensky (1990) and with Pollack (1990), though for simplicity’s sake discussion will be restricted to the latter.

Pollack (1990) uses recurrent connectionist networks to generate compressed, distributed encodings of syntactic strings and subsequently uses those encodings to either recreate the original string or to perform a systematic transformation of it (e.g., from “Mary loved John” to “John loved Mary”). Pollack’s approach was quickly extended by Chalmers (1990), who showed that one could use such compressed distributed representations to perform systematic transformations (namely moving from an active to a passive form) of even sentences with complex embedded clauses. He showed that this could be done for both familiar and novel sentences. What this suggests is that connectionism might offer its own unique, non-classical account of the apparent systematicity of thought processes. However, Fodor and McLaughlin (1990) argue that such demonstrations only show that networks can be forced to exhibit systematic processing, not that they exhibit it naturally in the way that classical systems do. After all, on a classical account, the same rules that license one expression will automatically license its systematic variant. It bears noting, however, that the classical approach may itself need to impose some ad hoc constraints in order to work. Aizawa (1997) points out, for instance, that many classical systems do not exhibit systematicity. On the flipside, Matthews (1997) notes that systematic variants that are licensed by the rules of syntax need not be thinkable. Waskan (2006) makes a similar point, noting that thinking may be more and less systematic than language and that the actual degree to which thought is systematic may be best accounted for by, theoretically speaking, pushing the structure of the world ‘up’ into the thought medium, rather than pushing the structure of language ‘down’. This might, however, come as cold comfort to connectionists, for it appears to merely replace one competitor to connectionism with another.

7. Anti-Representationalism: Dynamical Systems Theory, A-Life and Embodied Cognition

As alluded to above, whatever F&P may have hoped, connectionism has continued to thrive. Connectionist techniques are now employed in virtually every corner of cognitive science. On the other hand, despite what connectionists may have wished for, these techniques have not come close to fully supplanting classical ones. There is now much more of a peaceful coexistence between the two camps. Indeed, what probably seems far more important to both sides these days is the advent and promulgation of approaches that reject or downplay central assumptions of both classicists and mainstream connectionists, the most important being that human cognition is largely constituted by the creation, manipulation, storage and utilization of representations. Many cognitive researchers who identify themselves with the dynamical systems, artificial life and (albeit to a much lesser extent) embodied cognition approaches endorse the doctrine that one version of the world is enough. Even so, practitioners of the first two approaches have often co-opted connectionist techniques and terminology. In closing, let us briefly consider the rationale behind each of these two approaches and their relation to connectionism.

Briefly, dynamical systems theorists adopt a very high-level perspective on human behavior (inner and/or outer) that treats its state at any given time as a point in high-dimensional space (where the number of dimensions is determined by the number of numerical variables being used to quantify the behavior) and treats its time course as a trajectory through that space (van Gelder & Port 1995). As connectionist research has revealed, there tend to be regularities in the trajectories taken by particular types of system through their state spaces. As paths are plotted, it is often as if the trajectory taken by a system gets attracted to certain regions and repulsed by others, much like a marble rolling across a landscape can get guided by valleys, roll away from peaks, and get trapped in wells (local or global minima). The general goal is to formulate equations like those at work in the physical sciences that will capture such regularities in the continuous time-course of behavior. Connectionist systems have often provided nice case studies in how to characterize a system from the dynamical systems perspective. However, whether working from within this perspective in physics or in cognitive science, researchers find little need to invoke the ontologically strange category of representations in order to understand the time course of a system’s behavior.
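
The marble-and-landscape image can be made concrete with a deliberately tiny Python sketch. The one-dimensional landscape V(x) = (x^2 - 1)^2, the step size, and the function names are assumptions chosen for illustration; real dynamical models of behavior involve many variables and are not taken from van Gelder & Port (1995). The state moves downhill at each time step, so its trajectory is drawn toward one of two wells (at x = 1 and x = -1) and away from the peak between them (at x = 0).

```python
def slope(x):
    """Derivative of the hypothetical landscape V(x) = (x**2 - 1)**2."""
    return 4 * x * (x**2 - 1)

def trajectory(x0, dt=0.01, steps=2000):
    """Euler-integrate dx/dt = -dV/dx and return every state visited."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - dt * slope(xs[-1]))
    return xs

# Starting points on either side of the peak end up in different wells.
for start in (0.2, -0.2, 1.8):
    path = trajectory(start)
    print(f"start={start:+.1f}  end={path[-1]:+.4f}")
```

Note that the description of the system consists entirely of a state variable and an equation governing its change over time; nothing in it is treated as standing for anything else.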

Researchers in artificial life primarily focus on creating artificial creatures (virtual or real) that can navigate environments in a fully autonomous manner. The strategy generally favored by artificial life researchers is to start small, with a simple behavior repertoire, to test one’s design in an environment (preferably a real one), to adjust it until success is achieved, and then to gradually add layers of complexity by repeating this process. In one early and influential manifesto of the ‘a-life’ movement, Rodney Brooks claims, “When intelligence is approached in an incremental manner, with strict reliance on interfacing to the real world through perception and action, reliance on representation disappears” (Brooks 1991). The aims of a-life research are sometimes achieved through the deliberate engineering efforts of modelers, but connectionist learning techniques are also commonly employed, as are simulated evolutionary processes (processes that operate over both the bodies and brains of organisms, for instance).

8. Where Have All the Connectionists Gone?

There may be fewer today who label themselves “connectionists” than there were during the 1990s. Fodor & Pylyshyn’s (1988) critique may be partly responsible for this shift, though it is probably more because the novelty of the approach has worn off and the initial fervor died down. Also to blame may be the fact that connectionist techniques are now very widely employed throughout cognitive science, often by people who have very little in common ideologically. It is thus increasingly hard to discern among those who utilize connectionist modeling techniques any one clearly demarcated ideology or research program. Even many of those who continue to maintain an at least background commitment to the original ideals of connectionism might nowadays find that there are clearer ways of signaling who they are and what they care about than to call themselves “connectionists.” In any case, whether connectionist techniques are limited in some important respects or not, it is perfectly clear that connectionist modeling techniques are still powerful and flexible enough to have been widely embraced by philosophers and cognitive scientists, whether they be mainstream moderates or radical insurgents. It is therefore hard to imagine any technological or theoretical development that would lead to connectionism’s complete abandonment. Thus, despite some early fits and starts, connectionism is now most assuredly here to stay.

9. References and Further Reading

a. References

  • Aizawa, K. (1997). Explaining systematicity. Mind and Language, 12, 115-136.
  • Barsalou, L. (1987). The instability of graded structure: Implications for the nature of concepts. In U. Neisser (Ed.), Concepts and conceptual development: Ecological and intellectual factors in categorization. Cambridge, UK: Cambridge University Press, 101-140.
  • Bechtel, W. & A. Abrahamsen. (1991). Connectionism and the mind: An introduction to parallel processing in networks. Cambridge, MA: Basil Blackwell.
  • Bechtel, W. & A. Abrahamsen. (2002). Connectionism and the mind: An introduction to parallel processing in networks, 2nd Ed. Cambridge, MA: Basil Blackwell.
    • Highly recommended introduction to connectionism and the philosophy thereof.
  • Boden, M. (2006). Mind as machine: A history of cognitive science. New York: Oxford.
  • Brooks, R. (1991). Intelligence without representation. Artificial Intelligence, 47, 139-159.
  • Chalmers, D. (1990). Syntactic transformations on distributed representations. Connection Science, 2, 53-62.
  • Chomsky, N. (1993). On the nature, use and acquisition of language. In A. Goldman (Ed.), Readings in Philosophy and Cognitive Science. Cambridge, MA: MIT, 511-534.
  • Churchland, P.M. (1989). A neurocomputational perspective: The nature of mind and the structure of science. Cambridge, MA: MIT.
  • Churchland, P.S. & T. Sejnowski. (1990).  Neural representation and neural computation. Philosophical Perspectives, 4, 343-382.
  • Dennett, D. (1991). Real Patterns. The Journal of Philosophy, 88, 27-51.
  • Elman, J. (1990). Finding Structure in Time. Cognitive Science, 14, 179-211.
  • Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT.
  • Fodor, J. & B. McLaughlin. (1990). Connectionism and the problem of systematicity: Why Smolensky’s solution doesn’t work. Cognition, 35, 183-204.
  • Fodor, J. & Z. Pylyshyn. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.
  • Franklin, S. & M. Garzon. (1996). Computation by discrete neural nets. In P. Smolensky, M. Mozer, & D. Rumelhart (Eds.) Mathematical perspectives on neural networks (41-84). Mahwah, NJ: Lawrence Erlbaum.
  • Goodhill, G. (1993). Topography and ocular dominance with positive correlations. Biological Cybernetics, 69, 109-118.
  • Hebb, D.O. (1949). The Organization of Behavior. New York: Wiley.
  • Horgan, T. & J. Tienson (1991). Overview. In Horgan, T. & J. Tienson (Eds.) Connectionism and the Philosophy of Mind. Dordrecht: Kluwer.
  • Kintsch, W. (1998). Comprehension: A Paradigm for Cognition. Cambridge: Cambridge University Press.
  • Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59-69.
  • Marcus, G. (1995). The acquisition of the English past tense in children and multilayered connectionist networks. Cognition, 56, 271-279.
  • Matthews, R. (1997). Can connectionists explain systematicity? Mind and Language, 12, 154-177.
  • McCauley, R. (1986). Intertheoretic relations and the future of psychology. Philosophy of Science, 53, 179-199.
  • McClelland, J. & D. Rumelhart. (1989). Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Cambridge, MA: MIT.
    • This excellent hands-on introduction to connectionist models of psychological processes has been replaced by: R. O’Reilly & Y. Munakata. (2000). Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT. Companion software called Emergent.
  • McCulloch, W. & W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
  • Miikkulainen, R. (1993). Subsymbolic Natural Language Processing. Cambridge, MA: MIT.
    • Highly recommended for its introduction to Kohonen nets.
  • Minsky, M. & S. Papert. (1969). Perceptrons: An introduction to computational geometry. Cambridge, MA: MIT.
  • Pinker, S. & A. Prince. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193.
  • Pollack, J. (1990). Recursive distributed representations. Artificial Intelligence, 46, 77-105.
  • Plunkett, K. & V. Marchman. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48, 21-69.
  • Rey, G. (1983). Concepts and stereotypes. Cognition, 15, 237-262.
  • Rosch, E. & C. Mervis. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573-605.
  • Rumelhart, D., G. Hinton, & R. Williams. (1986). Learning internal representations by error propagation. In D. Rumelhart & J. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition, Vol. 1. Cambridge, MA: MIT, 318-362.
  • Selfridge, O. (1959). Pandemonium: A paradigm for learning. Rpt. in J. Anderson & E. Rosenfeld (1988), Neurocomputing: Foundations of research. Cambridge, MA: MIT, 115-122.
  • Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist networks. Artificial Intelligence, 46, 159–216.
  • van Gelder, T. (1990). Compositionality: A connectionist variation on a classical theme. Cognitive Science, 14, 355-384.
  • van Gelder, T. & R. Port. (1995). Mind as motion: Explorations in the dynamics of cognition. Cambridge, MA: MIT.
  • Waskan, J. (2006). Models and Cognition: Prediction and explanation in everyday life and in science. Cambridge, MA: MIT.
  • Wittgenstein, L. (1953). Philosophical Investigations. New York: Macmillan.

b. Connectionism Freeware

  • BugBrain provides an excellent, accessible, and highly entertaining game-based hands-on tutorial on the basics of neural networks and gives one a good idea of what a-life is all about as well. BugBrain comes with some learning components, but they are not recommended.
  • Emergent is research-grade software that accompanies O’Reilly and Munakata’s Computational explorations in cognitive neuroscience (referenced above).
  • Simbrain is a fairly accessible, but somewhat weak, tool for implementing a variety of common neural network architectures.
  • Framsticks is a wonderful program that enables anyone with the time and patience to evolve virtual stick creatures and their neural network controllers.

Author Information

Jonathan Waskan
Email: waskan@illinois.edu
University of Illinois at Urbana-Champaign
U. S. A.