A deductive argument is said to be valid if and only if it takes a form that makes it impossible for the premises to be true and the conclusion nevertheless to be false. Otherwise, a deductive argument is said to be invalid.
A deductive argument is sound if and only if it is valid and all of its premises are actually true. Otherwise, a deductive argument is unsound.
According to the definition of a deductive argument (see Deduction and Induction), the author of a deductive argument always intends that the premises provide the sort of justification for the conclusion whereby if the premises are true, the conclusion is guaranteed to be true as well. Loosely speaking, if the author’s process of reasoning is a good one, that is, if the premises actually do provide this sort of justification for the conclusion, then the argument is valid.
In effect, an argument is valid if the truth of the premises logically guarantees the truth of the conclusion. The following argument is valid, because it is impossible for the premises to be true and the conclusion nevertheless to be false:
Elizabeth owns either a Honda or a Saturn.
Elizabeth does not own a Honda.
Therefore, Elizabeth owns a Saturn.
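The claim that the premises cannot be true while the conclusion is false can be checked mechanically by running through every combination of truth values. The following is a minimal sketch, assuming the argument is read propositionally, with h standing for "Elizabeth owns a Honda" and s for "Elizabeth owns a Saturn," and with "either ... or" encoded as inclusive disjunction. The function name is_valid and this encoding are illustrative choices, not part of the original presentation.

```python
from itertools import product

def is_valid(premises, conclusion, num_vars):
    """Return True if no truth assignment makes every premise true
    and the conclusion false (a brute-force truth table)."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # this row is a counterexample
    return True

# Assumed encoding: h = "Elizabeth owns a Honda", s = "Elizabeth owns a Saturn"
premises = [
    lambda h, s: h or s,   # Elizabeth owns either a Honda or a Saturn.
    lambda h, s: not h,    # Elizabeth does not own a Honda.
]
conclusion = lambda h, s: s  # Therefore, Elizabeth owns a Saturn.

disjunctive_syllogism_valid = is_valid(premises, conclusion, 2)
```

Nothing in the check depends on whether Elizabeth in fact owns either car. Only the way the premises and conclusion are related matters.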
It is important to stress that the premises of an argument do not actually have to be true in order for the argument to be valid. An argument is valid if the premises and conclusion are related to each other in the right way so that if the premises were true, then the conclusion would have to be true as well. We can recognize in the above case that, even if one of the premises is actually false, if they had been true the conclusion would have been true as well. Consider, then, an argument such as the following:
All toasters are items made of gold.
All items made of gold are time-travel devices.
Therefore, all toasters are time-travel devices.
Obviously, the premises in this argument are not true. It may be hard to imagine these premises being true, but it is not hard to see that if they were true, their truth would logically guarantee the conclusion’s truth.
It is easy to see that the previous example is not an example of a completely good argument. A valid argument may still have a false conclusion. When we construct our arguments, we must aim to construct ones that are not only valid, but sound. A sound argument is one that is not only valid, but begins with premises that are actually true. The example given about toasters is valid, but not sound. However, the following argument is both valid and sound:
In some states, no felons are eligible voters, that is, eligible to vote.
In those states, some professional athletes are felons.
Therefore, in some states, some professional athletes are not eligible voters.
Here, not only do the premises provide the right sort of support for the conclusion, but the premises are actually true. Therefore, so is the conclusion. Although it is not part of the definition of a sound argument, because sound arguments both start out with true premises and have a form that guarantees that the conclusion must be true if the premises are, sound arguments always end with true conclusions.
It should be noted that both invalid, as well as valid but unsound, arguments can nevertheless have true conclusions. One cannot reject the conclusion of an argument simply by discovering a given argument for that conclusion to be flawed.
Whether or not the premises of an argument are true depends on their specific content. However, according to the dominant understanding among logicians, the validity or invalidity of an argument is determined entirely by its logical form. The logical form of an argument is that which remains of it when one abstracts away from the specific content of the premises and the conclusion, that is, words naming things, their properties and relations, leaving only those elements that are common to discourse and reasoning about any subject matter, that is, words such as “all,” “and,” “not,” “some,” and so forth. One can represent the logical form of an argument by replacing the specific content words with letters used as place-holders or variables.
For example, consider these two arguments:
All tigers are mammals.
No mammals are creatures with scales.
Therefore, no tigers are creatures with scales.
All spider monkeys are elephants.
No elephants are animals.
Therefore, no spider monkeys are animals.
These arguments share the same form:
All A are B;
No B are C;
Therefore, No A are C.
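One way to see that this form leaves no room for a counterexample is to treat A, B, and C as sets and search for an assignment that makes both premises true and the conclusion false. The sketch below makes several assumptions of its own: "All A are B" is read as set inclusion (so it holds vacuously when A is empty), "No B are C" is read as disjointness, and the search covers only a three-element domain. An exhaustive search of a small domain illustrates the point rather than proving it, and the function names are hypothetical.

```python
from itertools import combinations, product

def subsets(domain):
    """All subsets of the domain, as frozensets."""
    return [frozenset(c) for r in range(len(domain) + 1)
            for c in combinations(domain, r)]

def find_counterexample(domain_size=3):
    """Search for sets A, B, C with true premises and a false conclusion."""
    domain = range(domain_size)
    for A, B, C in product(subsets(domain), repeat=3):
        all_a_are_b = A <= B            # All A are B
        no_b_are_c = B.isdisjoint(C)    # No B are C
        no_a_are_c = A.isdisjoint(C)    # No A are C
        if all_a_are_b and no_b_are_c and not no_a_are_c:
            return A, B, C
    return None  # no counterexample in this domain

counterexample = find_counterexample()
```

The search tries every way of filling in A, B, and C, whether with tigers, elephants, or anything else, which is what it means for validity to depend on form rather than content.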
All arguments with this form are valid. Because they have this form, the examples above are valid. However, the first example is sound while the second is unsound, because its premises are false. Now consider:
All basketballs are round.
The Earth is round.
Therefore, the Earth is a basketball.
All popes reside at the Vatican.
John Paul II resides at the Vatican.
Therefore, John Paul II is a pope.
These arguments also have the same form:
All A’s are F;
X is F;
Therefore, X is an A.
Arguments with this form are invalid. This is easy to see with the first example. The second example may seem like a good argument because the premises and the conclusion are all true, but note that the conclusion’s truth isn’t guaranteed by the premises’ truth. It could have been possible for the premises to be true and the conclusion false. This argument is invalid, and all invalid arguments are unsound.
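To show that a form is invalid, a single interpretation with true premises and a false conclusion is enough. The sketch below builds such an interpretation from the basketball example. The particular set members are illustrative placeholders, and reading "All A's are F" as set inclusion is an assumption of the sketch.

```python
# Illustrative interpretation of the form, based on the basketball example.
basketballs = {"basketball_1", "basketball_2"}   # A
round_things = basketballs | {"earth"}           # F
x = "earth"                                      # X

premise_1 = basketballs <= round_things   # All A's are F
premise_2 = x in round_things             # X is F
conclusion = x in basketballs             # X is an A

# True premises with a false conclusion: the form is invalid.
is_counterexample = premise_1 and premise_2 and not conclusion
```

Because the pope argument shares this form, the same interpretation shows that its premises do not guarantee its conclusion, even though the conclusion happens to be true.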
While it is accepted by most contemporary logicians that logical validity and invalidity are determined entirely by form, there is some dissent. Consider, for example, the following arguments:
My table is circular. Therefore, it is not square shaped.
Juan is a bachelor. Therefore, he is not married.
These arguments, at least on the surface, have the form:
x is F;
Therefore, x is not G.
Arguments of this form are not valid as a rule. However, it seems clear in these particular cases that it is, in some strong sense, impossible for the premises to be true while the conclusion is false. Logicians respond to these complications in various ways. Some might insist, although this is controversial, that these arguments actually contain implicit premises such as “Nothing is both circular and square shaped” or “All bachelors are unmarried,” which, while themselves necessary truths, nevertheless play a role in the form of these arguments. It might also be suggested, especially with the first argument, that while (even without the additional premise) there is a necessary connection between the premise and the conclusion, the sort of necessity involved is something other than “logical” necessity, and hence that this argument (in the simple form) should not be regarded as logically valid. Lastly, especially with regard to the second example, it might be suggested that because “bachelor” is defined as “adult unmarried male,” the true logical form of the argument is the following universally valid form:
x is F and not G and H;
Therefore, x is not G.
The logical form of a statement is not always as easy to discern as one might expect. For example, statements that seem to have the same surface grammar can nevertheless differ in logical form. Take for example the two statements:
(1) Tony is a ferocious tiger.
(2) Clinton is a lame duck.
Despite their apparent similarity, only (1) has the form “x is an A that is F.” From it one can validly infer that Tony is a tiger. One cannot validly infer from (2) that Clinton is a duck. Indeed, one and the same sentence can be used in different ways in different contexts. Consider the statement:
(3) The King and Queen are visiting dignitaries.
It is not clear what the logical form of this statement is. Either there are dignitaries that the King and Queen are visiting, in which case the sentence (3) has the same logical form as “The King and Queen are playing violins,” or the King and Queen are themselves the dignitaries who are visiting from somewhere else, in which case the sentence has the same logical form as “The King and Queen are sniveling cowards.” Depending on which logical form the statement has, inferences may be valid or invalid. Consider:
The King and Queen are visiting dignitaries. Visiting dignitaries is always boring. Therefore, the King and Queen are doing something boring.
Only if the statement is given the first reading can this argument be considered to be valid.
Because of the difficulty in identifying the logical form of an argument, and the potential deviation of logical form from grammatical form in ordinary language, contemporary logicians typically make use of artificial logical languages in which logical form and grammatical form coincide. In these artificial languages, certain symbols, similar to those used in mathematics, are used to represent those elements of form analogous to ordinary English words such as “all”, “not”, “or”, “and”, and so forth. The use of an artificially constructed language makes it easier to specify a set of rules that determine whether or not a given argument is valid or invalid. Hence, the study of which deductive argument forms are valid and which are invalid is often called “formal logic” or “symbolic logic.”
In short, a deductive argument must be evaluated in two ways. First, one must ask if the premises provide support for the conclusion by examining the form of the argument. If they do, then the argument is valid. Then, one must ask whether the premises are true or false in actuality. Only if an argument passes both these tests is it sound. However, if an argument does not pass these tests, its conclusion may still be true, even though the argument gives no support for its truth.
Note: there are other, related, uses of these words that are found within more advanced mathematical logic. In that context, a formula (on its own) written in a logical language is said to be valid if it comes out as true (or “satisfied”) under all admissible or standard assignments of meaning to that formula within the intended semantics for the logical language. Moreover, an axiomatic logical calculus (in its entirety) is said to be sound if and only if all theorems derivable from the axioms of the logical calculus are semantically valid in the sense just described.
The author of this article is anonymous. The IEP is actively seeking an author who will write a replacement article.
Collective Intentionality
The idea that a collective could be bearer of intentional states such as belief and intention is likely to raise some eyebrows, especially in certain Anglo-American and European philosophical circles. The dominant picture in these circles is that intentionality is a feature of individual minds/brains. On the face of it, groups don’t have minds or brains. How could they have intentional states?
Despite the initial skepticism, there is a growing number of philosophers turning their attention to the issue of collective intentionality. The focus of these recent discussions has been primarily on the notions of collective intention and belief. Philosophers of action theory have been interested in collective intentions because of their interest in understanding collective or group agency. Individual intentions shape and inform individual actions. My intention guides my daily activities, structures my desires in a variety of ways, and facilitates coordination with both my future self and others around me. But we do not always act alone and it is coordination with others that raises interesting issues regarding the possibility of collective intentions. Many philosophers believe that individual intentions alone will not explain collective action and that joint action requires joint (sometimes called shared or collective in the literature) intentions. An exception to this trend is Seamus Miller who has argued that collective or joint action can be understood in terms of collective ends that are not intentions. Because his positive account of joint action does not appeal to collective intentionality, his work will not be highlighted in this article.
Interest in the notion of collective belief has been motivated, in part, by concerns over how to understand our collective belief ascriptions and the role they play in social scientific theory and everyday contexts. We often attribute beliefs, desires, and other propositional attitudes to groups like corporations. What do these ascriptions mean? Are they to be taken literally?
A common response to the questions that arise concerning our practice of ascribing intentional states to groups is to say that these ascriptions are mere fictions. When we say, “The Federal Reserve believes that interest rates ought to remain low,” this does not mean that the Federal Reserve literally has a belief. Rather, we are speaking metaphorically. According to this account, our ascriptions of intentional states to groups, though useful, are, strictly speaking, false.
Although this account has common-sense appeal, it has not been appealing to philosophers working in this area for a variety of reasons. First, our practice of attributing responsibility to organizations (consider, for instance, current tobacco lawsuits) seems to presuppose that organizations literally have intentional states. For we could not hold them legally and morally responsible for an action unless they intended to commit the act. Since we do not hold organizations metaphorically responsible (much to the dismay of tobacco companies), the attributions on which our ascriptions of responsibility rest should be, at least initially, considered non-metaphorical.
Further, our ascriptions of intentional states to groups have a surprising explanatory power. They allow us to predict and explain the actions of groups. Although false ascriptions could be explanatorily powerful (just as false theories are sometimes explanatorily powerful), explanatory power is prima facie evidence that our ascriptions are not simply false. We might also note that if the instrumentalist about collective intentionality is correct, then we, the media, social scientists, lawyers, political scientists, etc. are continually disseminating falsehoods. This seems to be an odd result and again, prima facie, evidence that our ascriptions are not mere metaphors.
It should be noted that rejection of the metaphorical approach to our collective intentional state ascriptions does not necessarily commit one to the view that when we are ascribing intentional states to groups those ascriptions are true in virtue of the fact that there is a collective or group mind that is the bearer of these states. In rejecting the metaphorical approach one need not also reject an individualistic approach. As we shall see there are alternative accounts that hold that these ascriptions are true, not in virtue of there being a group mind, but in virtue of the fact that the individuals within the group have certain intentional states. Summative accounts are of this kind.
2. Summative Accounts
Summative accounts of collective attitude ascription argue that these ascriptions are a short-hand way of referring to the fact that most members have the attitude (and the content) ascribed to the group. This is the view espoused by Anthony Quinton in ‘Social Objects’ (1975). These accounts have been labeled summative by Margaret Gilbert (1989) because they try to analyze group attitude ascriptions in terms of the sum of individual attitudes with the same content as that ascribed to the group.
There are a variety of summative accounts on offer. For the purposes of this article I will focus on two types, the simple summative account (SSA) and the complex summative account (CSA), identified by Margaret Gilbert in her (1987) article “Modelling Collective Belief.” According to the simple summative account:
Group G believes that p if and only if all or most of the members believe that p.
A simple summative account of group intention would substitute ‘intends’ in the formulation above. Gilbert (1987, 1989, 1994) has argued persuasively that this analysis is insufficient. Consider a case in which every member of the philosophy department believes that eating meat is immoral, but the members do not express this opinion because they are afraid of the response they will receive from their colleagues and students. In this context, it is unlikely that we would attribute to the philosophy department the belief that eating meat is immoral. It is possible, of course, to construct a context in which it would be appropriate to attribute such a belief to the philosophy department–perhaps, if the philosophy department were engaged in a discussion of animal rights. But in such a context the beliefs of the individuals would no longer be secret. Presumably, at least some of the members would express their opinions.
This example suggests that group belief depends on certain epistemic features of individuals. The complex summative account acknowledges these epistemic features by introducing the notion of common knowledge. CSA requires that members of the group recognize or know that most of the members in the group believe that p. Thus, CSA is committed to the conceptual truth of the following:
A group G believes that p if and only if (1) most of the members of G believe that p, and (2) it is common knowledge in G that (1).
Gilbert (1989, 1994) has argued that the CSA is too weak. Consider the following example: A company has formed two committees and coincidentally the committees have exactly the same membership. One committee has been formed in order to develop an office dress code. Call this committee the Dress Code committee. The other committee has been formed to assess the recently installed phone system. Call this committee the Phone committee. Now imagine that (a) every member of the Dress Code committee personally believes that spandex pants are inappropriate apparel for the office and this is common knowledge within the Dress Code committee, and (b) the same goes mutatis mutandis for each member of the Phone committee. It seems compatible with (a) and (b) that (c) the Dress Code committee believes spandex is inappropriate, and (d) the Phone committee does not believe that spandex is inappropriate office apparel. Yet the conditions of the CSA have been met for both. Gilbert provides a similar example in (1996, 199). The addition of common knowledge, according to Gilbert, does not provide sufficient conditions for group belief. Although Angelo Corlett (1996) has criticized examples of this sort and has provided a defense of a simple summative account, most theorists agree with Gilbert that the account is insufficient.
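The structure of this objection can be made concrete with a rough model of the two summative accounts. The sketch below is only an illustration under stated assumptions: "most" is taken to mean strictly more than half, common knowledge is crudely represented as a set of propositions attached to the group, and the class names, function names, and member names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Member:
    name: str
    beliefs: set = field(default_factory=set)

@dataclass
class Group:
    name: str
    members: list
    # Propositions whose majority acceptance is common knowledge in the group
    # (a crude stand-in for condition (2); an assumption of this sketch).
    common_knowledge: set = field(default_factory=set)

def most_believe(group, p):
    """Condition (1): more than half of the members believe p."""
    believers = sum(1 for m in group.members if p in m.beliefs)
    return believers > len(group.members) / 2

def ssa_believes(group, p):
    """Simple summative account: condition (1) alone."""
    return most_believe(group, p)

def csa_believes(group, p):
    """Complex summative account: condition (1) plus common knowledge of it."""
    return most_believe(group, p) and p in group.common_knowledge

p = "spandex pants are inappropriate office apparel"
staff = [Member(n, {p}) for n in ("Ana", "Ben", "Cy")]  # hypothetical members
dress_code = Group("Dress Code committee", staff, {p})
phone = Group("Phone committee", staff, {p})

verdicts = (csa_believes(dress_code, p), csa_believes(phone, p))
```

On this model the two committees receive the same verdict, because the account looks only at the members' beliefs and what is common knowledge among them. Anything that distinguishes the committees, such as their different remits, is invisible to it, which is the gap Gilbert's example exploits.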
In addition to being too weak, many, including Gilbert, believe that both the CSA and SSA are too strong. On summative accounts it is conceptually necessary for most of the members of G to believe that p in order for G to believe that p. This seems too strong. Indeed, there seem to be contexts in which no group member has the attitude ascribed to the group. Imagine a group of politicians who do not personally believe that partial birth abortion should be outlawed, but because of the pressure exerted by their constituents they vote to ban partial birth abortion. Ascriptions of belief to the group of politicians would probably be made on the basis of this vote and, thus, we would ascribe the belief that partial birth abortion should be banned to the group of politicians even though no individual politician personally believes this proposition.
Group intentions, too, are not easily understood in terms of the summation of individual intentions to perform some action. Consider this example given by John Searle (1990, 403). Imagine a group of people sitting on the grass enjoying a sunny afternoon. Suddenly it grows dark and starts to rain. They all get up and run for shelter. In this scenario each individual has the intention “I am running to shelter” and these intentions are had independently of one another. Now imagine a situation in which their running to the shelter is part of a performance. Suppose they are a group of actors and this is part of a scene in a play. Thus, at one point in the play they perform the same actions done by the individuals in the above scenario. According to Searle, the performance by the actors involves a collective intention in the form “we intend to do x.” This collective intention is different from the individual intentions had by the individual actors and it is not captured by summing up individual intentions in the form “I intend to x.”
The reason why collective intentions cannot be reduced to individual intentions, argues Searle, is that no set of I-intentions, even supplemented with mutual beliefs, will add up to a we-intend. Collective intentions involve a sense of acting and willing something together. Individual intentions involved in this enterprise are derived from collective intentions, and the individual intentions that are derived from the collective intention will often have a different content from that of the collective intention. Michael Bratman (1999, 111) also stresses the inadequacy of summative accounts of group intentions. Consider a case in which you have an intention to paint the house and I have an intention to paint the same house and this is common knowledge between us. The set of intentional states is not enough to guarantee that our actions are coordinated in any manner so that we are painting the house together. Indeed, the complex summative account does not rule out the possibility of our painting the same house at the same time but independently of one another (avoiding the other by chance). The set of individual intentional states identified by the complex summative account is not going to play any role in coordinating our behavior so that painting the house is something we do together. Intentions, either collective or individual, do, by their nature, play a role in planning and coordination (Bratman, 1999, 1987). So, according to this line of reasoning, summative accounts, even of the complex kind, cannot be an adequate account of the nature of collective intention.
3. Non-Summative Accounts
a. Searle
In “Collective Intentions and Actions” (1990) and in The Construction of Social Reality (1995) John Searle defends an account of collective intentionality that is non-summative, but remains individualistic. Searle specifies that anything we say about collective intention must meet the following conditions of adequacy:
1. It must be consistent with the fact that society is nothing over and above the individuals that comprise it. All consciousness and intentionality is in the minds of individuals, specifically in individual brains.
2. It must be consistent with the fact that all intentionality could be had by a brain in a vat.
Searle’s first criterion of adequacy denies that groups themselves can be intentional agents and advocates a form of individualism. The second criterion is motivated by atomism. According to this condition, all intentionality, individual or collective, is independent of what the real world is like, since a radical mistake is possible. These two conditions entail that collective intentions exist in individual brains. Thus Searle’s position allows for the possibility of a single person having the collective intention “we intend to do x.”
…I could have all the intentionality I do have even if I am radically mistaken, even if the apparent presence and cooperation of other people is an illusion, even if I am suffering a total hallucination, even if I am a brain in a vat. (1990, 117)
How is it possible for an individual to have an intention of the form “We intend to J”? Searle contends that this capacity is biologically primitive. Indeed, he suggests that it is shared by a variety of other species. This capacity presupposes other Background capacities (the Background is a technical term for Searle referring to conditions necessary for certain cognitive activities and language). In particular, it presupposes a Background sense of the other as a candidate for cooperative agency (1990, 414).
Collective intentionality plays a large role in Searle’s overall account of social reality. In The Construction of Social Reality (1995) collective intentionality is that which confers a function on artifacts and changes them into social facts. Pieces of paper function as money because we intend them to do so. Just as individual intentionality has the ability to change the world via speech acts, collective intentionality has, according to Searle, the ability to create social facts.
Searle’s account of collective intention has been criticized for a variety of reasons. First, Tollefsen (2002d) notes that it rests on the controversial assumption that externalist theories of content individuation are false. According to standard externalist reasoning, if a brain in a vat is not in the proper water environment (either in causal contact with water or able to theorize about water) then it cannot have beliefs or intentions about water. The content of a belief is determined by external rather than merely internal aspects. If this is correct then a brain in a vat could not have we-intentions. Further, there are some who argue that one could not even have a concept of another agent if he or she is not part of a social practice of interpretation (Davidson, for instance, 1992). If these views are correct it would be difficult to say how a brain in a vat could have a we-concept at all. One cannot simply assume that these theories are false without a lengthy discussion and refutation. To the extent that Searle’s account rests on a controversial thesis in the philosophy of mind and language it is problematic.
Others (Meijers 2001, Gilbert 1998) have argued that Searle’s account fails to capture the normative relations that are an integral part of collective intentions. When we form a collective intention, we create obligations and expectations among us. The football players in one of Searle’s examples are obligated to perform certain actions given that they have formed a collective intention to execute a pass play. As Gilbert notes (1989, 1994), if one of the players fails to do his or her part, the other players have a right to rebuke their teammate. This rebuke is evidence of the normativity involved in joint action. When we form a collective intention we make commitments and incur obligations. Searle’s account, because it essentially allows for solipsistic we-intentions, fails to acknowledge the normativity involved in collective intentionality. For Gilbert and Meijers, the normativity of collective intentionality is essential to the phenomenon.
Searle himself acknowledges that it is because of the special nature of collective intentions that we are able to distinguish between the two cases of individuals running for cover in the example above. There is something about collective intentions that coordinates individual, independent actions into a joint action. But isolated, perhaps even solipsistic, we-intentions do not, in themselves, seem to be enough to direct and coordinate the individual intentional actions of which the joint action is composed. Suppose, for instance, that none of the actors knew of the other actors’ we-intentions. It would seem to be a complete accident that they acted together. Indeed, it would seem as fortuitous as a group of individuals that just happen to get up at the same time and run for cover.
b. Bratman
The problems with Searle’s account point to the fact that whatever individual intentional states underlie collective intentions, they should be interrelated in a significant way. Michael Bratman provides an account of collective intention in terms of the intentions of the individual participants and their interrelations. His analysis provides a rational reconstruction of what it is for two people to intend to do something together. We should note that Bratman uses the term “shared intention” rather than collective intention.
We need to be careful with this phrase as there are several senses in which one can “share” an intention. You and I, for instance, can both intend to wash the dishes and thus we share, in some sense, the intention to wash the dishes. But these intentions are consistent with our washing the dishes independently of one another. Here is another way to distinguish between the weak and the strong sense of sharing. You and I each have a quarter in our pockets. In this case, one might say that we share “quarter possession.” This is the weak sense of sharing. This sense of sharing is to be distinguished from a case in which we share a quarter between us. The weak sense of sharing does not aid us in understanding how people can perform actions together. With this caution in mind, I will use collective intention and shared intention interchangeably to refer to the type of intention that is thought to be crucial for understanding collective actions. The weak sense of shared intention noted above is not a candidate.
Bratman begins his discussion of collective intention by identifying the role that collective or shared intentions play. First, shared intentions help to coordinate our intentional actions. For instance, our shared intention of washing the dishes will guide each of our intentional actions towards satisfying the goal of washing the dishes. Thus, someone will wash the dishes before rinsing them and someone will rinse them before drying them. Second, our shared intention will coordinate our actions by making sure that our own personal plans of action meld together. If I plan to do the washing, then I will check with your plan and see if there is any conflict. Third, shared intentions act as a backdrop against which bargaining and negotiation occur. Conflicts about who does the washing and who does the drying will be resolved by considering the fact that we share the intention to do the dishes. Thus, shared intention unifies and coordinates individual intentional actions by tracking the goals accepted by each individual.
Consider a case in which you and I intend to wash the dishes together. If this intention is a shared intention then it is not a matter of you having an intention to wash the dishes and me having an intention to wash the dishes. Nor is it a matter of each of us having an atomistically conceived we-intention to wash the dishes. Such coincident intentions do not ensure that each of us knows of the other’s intention and that we are committed to the joint action of washing the dishes together. Further, an explicit promise made to each other does not seem to ensure that we share an intention either, because I might be lying to you and have no intention of washing the dishes with you. Thus, explicit promises are not sufficient for shared intention. Nor are they necessary for shared intention. Bratman provides an example from Hume to highlight this. “Consider Hume’s example of two people in a row boat who row together ‘tho they have never given promises to each other.’ Such rowers may well have a shared intention to row the boat together” (Bratman, 1993, 98-99).
What do shared intentions consist in according to Bratman? Bratman shares Searle’s commitment to individualism in that he does not think that shared intentions are the intentions of a plural agent, nor are they to be understood solely in terms of individual intentional states. Shared intentions, according to Bratman, are to be identified with the state of affairs consisting of a set of interrelated individual intentional states. What set of individual attitudes are interrelated in appropriate ways such that the complex consisting of such attitudes would, if functioning properly, do the jobs of shared intention?
Here is a somewhat simplified version of Bratman’s answer to this question. We intend to wash the dishes if and only if:
a. I intend that we wash the dishes.
b. You intend that we wash the dishes.
I intend that we wash the dishes in accordance with and because of 1a and 1b; you intend likewise.
1 and 2 are common knowledge between us.
It should be noted that the focus in this article is on Bratman’s account of the shared intention that underlies joint intentional action. In “Shared Cooperative Activity” (1999) Bratman provides an account of the shared intention that underlies more cooperative ventures; it involves conditions 1-3 and some additional conditions that rule out coercion.
As a first approximation, this complex of intentional attitudes above seems plausible. But consider a case in which we each intend to wash the dishes together and we each do so in part because of the other’s intention. However, I intend to wash the dishes with Palmolive and you intend to wash them with Joy. All of this is common knowledge and we will not compromise. Is there a collective intention present? It seems not. In this case we do not have our subplans coordinated in the appropriate way. Recall that one of the jobs that shared intention has is to coordinate our individual plans and goals. In the example above our individual subplans are in conflict and this would prevent us from achieving our goal of getting the dishes washed.
Bratman avoids this counterexample by adding a clause about participants’ subplans. It is not necessary that our subplans match, but they must mesh. So, if my subplan is to wash the dishes with Palmolive, and your subplan is to wash them with hot water, and I have no preference about the water temperature, then our subplans mesh though they don’t match exactly. But if we have subplans to wash the dishes with completely different types of dish detergent then our subplans do not mesh. Bratman reformulates the account in the following way:
We intend to J if and only if:
1. (a) I intend that we J and (b) you intend that we J.
2. I intend that we J in accordance with and because of 1a and 1b, and meshing subplans of 1a and 1b; you intend the same.
3. 1 and 2 are common knowledge.
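The difference between subplans that match and subplans that merely mesh can be made concrete with a small sketch. It is a toy model, not Bratman's own formalism. It assumes a subplan can be represented as a mapping from aspects of the activity to chosen options, with a missing aspect meaning "no preference." The names `Participant`, `subplans_mesh`, `subplans_match`, and `shared_intention` are hypothetical, and condition 2 is flattened into a boolean plus an actual mesh check.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    intends_that_we_j: bool            # condition 1a or 1b
    because_of_both_intentions: bool   # the "because of 1a and 1b" part of condition 2
    # Assumed representation: aspect -> chosen option; a missing aspect means no preference.
    subplan: dict = field(default_factory=dict)


def subplans_match(p: dict, q: dict) -> bool:
    """Subplans match when they specify exactly the same options."""
    return p == q


def subplans_mesh(p: dict, q: dict) -> bool:
    """Subplans mesh when they never disagree on an aspect both of them specify."""
    return all(p[aspect] == q[aspect] for aspect in p.keys() & q.keys())


def shared_intention(me: Participant, you: Participant, common_knowledge: bool) -> bool:
    cond1 = me.intends_that_we_j and you.intends_that_we_j
    cond2 = (me.because_of_both_intentions
             and you.because_of_both_intentions
             and subplans_mesh(me.subplan, you.subplan))
    cond3 = common_knowledge
    return cond1 and cond2 and cond3


me = Participant(True, True, {"detergent": "Palmolive"})
hot_water_partner = Participant(True, True, {"water": "hot"})
joy_partner = Participant(True, True, {"detergent": "Joy"})
```

On this model, `me` and `hot_water_partner` have subplans that mesh without matching, so the three conditions can all hold, while `me` and `joy_partner` conflict over the detergent and condition 2 fails.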
This account of collective intentions rejects the atomism of Searle’s account. Because a shared intention is the complex of attitudes of individuals and their interrelations, an individual cannot have a shared intention. As we have seen, on Searle’s account one can have a shared intention, even if one is a brain in a vat. On Bratman’s view the intentions of individuals are interrelated and reflexive in a way that makes solipsistic we-intentions impossible.
Bratman’s account of collective or shared intentions has been criticized in a variety of ways. Both Searle and Bratman attempt to avoid the specter of the collective mind. Searle places we-intentions in the mind of individuals. Bratman avoids positing a plural agent by trying to explain collective intentions in terms of individual attitudes with common contents that are distinctively social in the sense that solitary individuals could not have them. But how is it possible for me to have an intention with the form “we-intend” or with the form “I intend that we do J”? There seem to be certain features of intention itself that would rule out both Searle’s and Bratman’s ways of understanding the notion of joint intention. This line of argument has been developed, in slightly different ways, in recent papers by Annette Baier (1997), Frederick Stoutland (1997), and J. David Velleman (1997). Normally, when I intend to do something, the action I intend to do is under my control. And in normal cases of shared intention (cases where there is no coercion or where I am not in control of your actions), the other agent is seen as being in control of his or her own actions. Further, when I intend to do something, this intention settles, in some sense, what I will do. In Bratman’s terms, I have set a plan or course of action for myself. But how, then, can I intend that we do something? There is something in this scenario that is out of my control. My intention that we J cannot settle what we will do, because you have an equally important role in settling what will be done. Thus, I cannot intend that we J.
Stoutland (1997) puts the problem a bit differently by emphasizing that Bratman’s attempt to identify a set of individual intentions with common contents is impossible. Because intention makes an implicit reference to the subject that fulfills the intention, there are no intentions with common content. “Art can intend to go to a film and Mary can intend to do the same; but their intentions do not have common content, since Art’s intention is his going to the film and Mary’s is her going to the film.” (1997, 56). Likewise, it would seem impossible for me to have a Searlian we-intention. Because intention makes an implicit reference to the subject that is responsible for fulfilling the intention and I am not a we, I cannot have a we-intention. In cases of joint action I am not the subject that is responsible for fulfilling the intention. In order to be responsible I would have to have the actions of others under my direct control. But I do not. Therefore, I cannot have a we-intention.
In “I intend that we J” (1999) Bratman alters his account of shared intention in an attempt to meet this challenge. Basically, Bratman introduces the technical notion of intending that. This is supposed to be like ordinary intention except that it does not require that the individual with the intention also be the individual who fulfills the intention. I can intend that my children go to college, for instance. On this understanding of intention it seems possible for an individual to have the intention that we X. This way of avoiding the objection has seemed to some to be problematic. First, to intend that my children go to college is simply to intend to do something that brings it about that my children go to college. And these actions (whatever they might be) are under my direct control. This is not so in the case of my intending that we X. Further, Bratman seems to have changed the subject. Intentions are normally intentions to do something. It is intentions to act that explain behavior at the individual level. If collective actions presuppose intention in the way that individual agency does, then it would seem to be the same sort of intention that is presupposed. But according to Stoutland and others, Bratman doesn’t give us an account of these intentions.
Like Searle, Bratman has been accused of ignoring the normativity of collective intentions. For Gilbert and Meijers, there is a normativity involved in collective intentionality that suggests that collective intentions and other intentional states are essentially commitments of a sort. Consider Gilbert’s (1989) example of walking together. We form an intention to walk together and begin our journey. Halfway through the walk you veer off to the left and start walking away from me. If we intended to walk together, this behavior is not only odd but justifiably subject to rebuke. The behavior will be considered to be a violation of some sort of commitment that we made. There seems to be a sense in which you ought not to have done this and I have the right to rebuke you. “Hey” I can say, “we are walking together. Where are you going?” I can take offense at your behavior and, according to Gilbert, my offense is justified and its justification derives from the normative commitments that are inherent in the collective intention.
Bratman’s account of collective or shared intentionality does not involve a normative element. For him, cognitive attitudes and their interrelations are enough to explain collective intentionality. Although he admits that certain shared activities will involve obligations, he stresses that it is possible to have a shared intention that does not involve promises or obligations. That is, there is nothing essentially normative about collective intentionality. He does, however, make a further distinction between weak and strong shared intentions, in which the latter involves binding agreement. This normativity inherent in a binding agreement, however, is explained in terms of additional moral principles like Scanlon’s (1998) “principle of fidelity.”
c. Gilbert
Margaret Gilbert’s account of collective intentions and other intentional states like belief aims, in part, to explain the nature of this normative phenomenon without having to postulate additional normative principles. Her account of collective intentionality is also part of a larger project to provide a conceptual analysis of certain group concepts. In On Social Facts (1989), in addition to providing an analysis of the concept of a group belief and intention, she also provides an account of the concept of a social group and the concept of social convention. In doing so, she claims to be uncovering the “core” of such concepts and legitimizing the use of these “everyday” concepts within the social sciences.
Gilbert’s account of collective intentionality is closely linked to her account of the concept of a social group. Briefly, our everyday concept of a social group is, according to Gilbert, the concept of a plural subject of belief or action. A plural subject is an entity, or as Gilbert puts it, “a special kind of thing, a ‘synthesis sui generis’” (1996, 268) formed when individuals bond or unite in a particular way. This “special kind of thing” can be the subject to which intentional action and psychological attributes are attributed. We can formulate the conceptually necessary and sufficient conditions for the existence of plural subjects in the following way:
Individuals A1…An form a plural subject of X-ing (for some action X or psychological attribute X) if and only if A1…An form a joint commitment to X-ing as a body.
It will be helpful to begin by considering what is involved in a joint commitment to act as a body or as a single individual. We will then consider the plural subject framework as it applies to psychological states like belief.
A joint commitment to act as a body is a commitment made by a collection of individuals to perform some present or future action as would a single individual. Joint commitments are formed when each of a number of people expresses his or her willingness to participate in the relevant joint commitment with the others. Each person understands that only when all of the relevant people have agreed to participate in the joint commitment will the joint commitment be formed. Once everyone has agreed, a pool of wills is formed and individuals are then jointly committed. Once the joint commitment is established, each individual is individually obligated to do his or her part to make it the case that they act as a body.
Consider a case in which Joe’s construction company agrees to build a house for Mrs. Wilbur. The members of the company do not each individually agree to build Mrs. Wilbur a house. This would lead to the proliferation of Wilbur abodes. They each individually agree, however, to make it the case that the house is built by the construction company and express their willingness to do so on the condition that every other member do the same. This expression of willingness need not be simultaneous. The members may express their willingness over time. Nor do they need to express their willingness verbally. In many cases, silence is an adequate expression of intention. They must, however, in order for the joint commitment to come into existence, communicate in some way and at some point in time their intention to do their part in building the house as a body with others.
Because joint commitments are joint, they cannot simply be reduced to an aggregate of individual commitments. A joint commitment gives rise to certain obligations and entitlements. Members of the group have a right to expect that other members will follow through on their commitments. Sam and Tammy are entitled to expect that Joe will do his part to make it the case that the construction company builds a house for Mrs. Wilbur. If Joe is doing something to frustrate the building process, Sam and Tammy are justified in rebuking him.
A joint commitment can only be rescinded if every member party to the joint commitment agrees to rescind it. The persistence of the joint commitment in the face of an individual’s rescinding his or her individual commitment explains why the members of the construction company have a right to rebuke Joe when he is not doing his part. If Joe says, “I’ve had enough of this mindless labor,” and walks off the site, the joint commitment remains in full force because there has been no agreement among the members to rescind the joint commitment. This does not mean, of course, that the individual commitment Joe makes cannot be broken. It does mean, however, that if he breaks his individual commitment, even for a good reason, this does not nullify the joint commitment and its associated obligations.
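The formation and rescission conditions just described can be sketched as a small state model. This is only an illustration under simplifying assumptions. Willingness, walking off, and agreeing to rescind are each treated as a simple recorded event. The class `JointCommitment` and its method names are hypothetical and are not drawn from Gilbert's texts.

```python
class JointCommitment:
    """Toy model: formed only when all are willing, rescinded only when all agree."""

    def __init__(self, members):
        self.members = frozenset(members)
        self.willing = set()            # members who have expressed willingness
        self.agreed_to_rescind = set()  # members who have agreed to rescind
        self.defected = set()           # members who broke their individual commitment

    def _check(self, member):
        if member not in self.members:
            raise ValueError(f"{member} is not a party to this commitment")

    def express_willingness(self, member):
        self._check(member)
        self.willing.add(member)

    def walk_off(self, member):
        # Breaking an individual commitment does not touch the joint commitment.
        self._check(member)
        self.defected.add(member)

    def agree_to_rescind(self, member):
        self._check(member)
        self.agreed_to_rescind.add(member)

    def formed(self):
        return self.willing == self.members

    def in_force(self):
        return self.formed() and self.agreed_to_rescind != self.members

    def may_rebuke(self, member):
        # The others are entitled to rebuke a defector while the commitment stands.
        return self.in_force() and member in self.defected


company = JointCommitment(["Joe", "Sam", "Tammy"])
for person in ("Joe", "Sam", "Tammy"):
    company.express_willingness(person)
company.walk_off("Joe")
```

After Joe walks off, `company.in_force()` is still true and `company.may_rebuke("Joe")` holds. The commitment lapses only once Joe, Sam, and Tammy have all called `agree_to_rescind`.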
According to Gilbert, the obligations which arise from a joint commitment are of a special kind and they differ from other forms of obligations in the following ways: First, although each individual in the group must be “willing” to be jointly committed, this notion of willingness does not, according to Gilbert, rule out coercion. A person can be coerced into being part of a joint commitment and yet it still remains a commitment to which a person is obligated. Gilbert wants to show that joint commitments arise in various environments and under various circumstances. Often joint commitments are coerced because the person who is doing the coercion needs the commitment of others in order to carry through with their actions.
A second aspect that distinguishes the obligations of a joint commitment from other types of obligation is that the interdependence of the commitments makes it the case that no one member can rescind a joint commitment. For example, Al’s commitment to travel with Doris cannot be dissolved by Al changing his mind. This feature was already noted above.
Third, in becoming party to a joint commitment a person has a reason to act. It is a reason that remains whether or not his or her beliefs or external circumstances change. Joe is obligated to every other member of Joe’s construction company to act in accordance with the joint commitment to building a house. This commitment acts as a reason and, if reasons are causes, joint commitments can often explain why individuals act in particular circumstances. It is a reason that remains and will bind him to acting appropriately until the group as a whole decides to release one another from this obligation.
Finally, the people party to a joint commitment are aware of the obligations they have to one another. They could not be held responsible for violation of such obligations unless they were aware of these obligations. The fact that every other member has committed herself to the joint commitment is common knowledge, and there is also common knowledge of the obligations, expectations, and entitlements that arise from such commitments.
Having discussed the notion of forming a joint commitment to act as a body, we are now in a position to apply the plural subject schema to belief:
Individuals A1…An form a plural subject of believing that p if and only if A1…An form a joint commitment to believe that p as a body.
Recall that joint commitments are commitments of groups, not individuals. They arise, in the case of joint action, when each individual expresses his willingness to do his part provided that every other individual commits to doing her part to bring it about that they perform some action as a body. Gilbert simply extends this analysis of joint action to group belief. Individuals express their willingness to do their part to make it the case that they believe as a body. These commitments and expectations are common knowledge. This set of reciprocal intentions and commitments sets up the pool of wills and certain obligations and entitlements then come into play. But what is required in doing one’s part to make it the case that they believe that p as a body?
Gilbert makes it clear that members do not have to themselves believe that p. This allows her to avoid the pitfalls of the summative accounts. They also do not have to act as if they personally believe that p. Doing one’s part in the context of a joint belief, then, seems to involve at least not saying anything contrary to the group belief while speaking as a member of the group or acting contrary to the group belief while acting in one’s capacity as a group member. One who participates in a joint commitment to believe that p thereby accepts an obligation to do what he can to bring it about that any joint endeavors among the members of the group be conducted on the assumption that p is true. He is entitled to expect others’ support in bringing this about. Further, if one does believe something that is inconsistent with p, one is required at least not to express that belief baldly. The committee members would have a right to rebuke one of their own if, in acting as a member of the committee, he or she expressed views that were contrary to the group view without prefacing his or her remarks with “I personally believe that…”
According to Gilbert, then, when individuals form a plural subject of belief, (i.e., when they become party to a joint commitment to believe that p as a body), there is group belief that p. Note that she provides necessary and sufficient conditions for the existence of a plural subject of belief. But Gilbert recognizes in later work (1994) that there may be cases in which we want to say that a group has a belief, yet they do not meet the existence conditions for a plural subject of belief. This recognition leads her to say that what she is giving is an analysis of the core notion of group belief and that other cases of group belief will be extensions of this core notion. Thus, we end up with the following statement of the conceptually sufficient conditions for group belief:
There is a group belief that p if some persons constitute the plural subject of a belief that p. Such persons collectively believe that p.
Unique to Gilbert’s account is the assertion that under certain circumstances individuals form a plural subject and this subject is the legitimate subject of intentional state ascriptions. Recall that Bratman and Searle deny that there is a collective entity that is the appropriate subject of intentional state ascription. Her account, then, is less individualistic than Searle’s and Bratman’s.
Gilbert’s account of collective intentionality has been criticized on the following grounds. First, Tollefsen (2002) has argued that Gilbert’s analysis is circular. This can be seen if we consider what it means to commit to doing one’s part to make it the case that the group believe as a body or act as a body. Gilbert claims that the notion of a group of individuals acting together to constitute a body is primitive and it guides the actions and thoughts of individuals in the group. It is this notion that tells them what their part is and what they are committed to doing. It is from this concept, for instance, that one knows that she must not say p, without prefacing her remarks appropriately, when she is acting as a member of a group that believes not p. To do so would be to disrupt the unity within the group and break their semblance of being “one body.”
But this notion seems to be just the notion of a plural subject. For a collection of individuals to believe as a body or act as a body is for them to act or believe as a subject, a subject constituted by a plurality of individuals. Indeed Gilbert says as much in the following passage:
I do, of course, posit a mechanism for the construction of social groups (plural subjects of belief or action). And this mechanism can only work if everyone involved has a grasp of a subtle conceptual scheme, the conceptual scheme of plural subjects. Given that all have this concept, then the basic means for bringing plural subject-hood into being is at their disposal. All that anyone has to do is to openly manifest his willingness to be part of a plural subject of some particular attribute (1989, 416)
Plural subjects are formed when each of a set of individual agents expresses willingness to constitute, with the others, the plural subject of a goal, belief, principle of action, or other such thing, in conditions of common knowledge. The conceptually necessary conditions for plural subjecthood, then, contain the notion of plural subjecthood. As a conceptual analysis of our core notion of group belief (the belief of a plural subject), Gilbert’s analysis seems circular.
Gilbert (in correspondence) has responded to this charge by arguing that for her the concept of a plural subject is a technical notion. It is not, as Tollefsen suggests, simply the notion of a subject comprised of individuals but of a subject formed on the basis of joint commitments. So her analysis of plural subjecthood does not contain the technical notion of a plural subject and her analysis is not circular. The passage above, however, suggests that, at the very least, the formation of plural subjects presupposes that the participants have an understanding of the technical concept of plural subjecthood and an understanding of joint commitments. Since both notions are very technical, it seems psychologically implausible that everyday folk have even an implicit understanding of these concepts.
Tuomela (1992) charges Gilbert with circularity as well. Gilbert argues that joint commitments are to be analyzed in terms of individuals expressing their willingness to be jointly committed with others. But this analysis leaves the concept of joint commitment unanalyzed. Gilbert does, however, say a great deal more about the notion of joint commitment than this suggests. In particular, her most recent work (2003) provides a more detailed explanation of joint commitment. Expressions of willingness come in as conditions for the formation of a joint commitment, not as part of an analysis of the notion of joint commitment. If Gilbert’s analysis of joint commitment does not appeal to the notion of a joint commitment then it seems she has avoided Tuomela’s objection.
Tuomela (1998) has also argued that Gilbert’s account is somewhat limited. Her account of group intentionality is an account of what we mean when we say “We believe that p,” where “we” is a small, unstructured group like a reading club, a poetry discussion group, or a committee with no formal decision method. She claims that she is giving an analysis of our core meaning of group belief. But the paradigm cases of attribution of intentional states to groups seem to be those in which the subject is an organization like a corporation. This is particularly true when one reflects on our practice of praising and blaming the actions of corporations, states, governments, etc. Yet it is unclear how Gilbert’s account extends to organizations. It seems obvious that not every member of the organization (take, for instance, IBM) would have to openly express his or her willingness to do his or her part in bringing it about that IBM believes, as a body, that profits are lower this year than last in order for it to be true that IBM believes that. Does the person on the assembly line have to express his willingness to be jointly committed in the way described? It seems that not even an implicit expression of willingness (a failure to speak up) would make sense of this. To the extent that Gilbert’s account does not seem to extend to a range of other types of groups to which the intentional idiom extends, Tuomela argues that it remains inadequate.
There may be ways, however, of extending Gilbert’s analysis to account for the beliefs of large organizations. Gilbert suggests that one might explain corporate beliefs, for instance, by claiming that the core notion of group belief applies to the board of directors and there is a convention in place that makes the board’s beliefs the beliefs of the corporation. Gilbert has used the plural subject framework to provide an account of convention (1989).
d. Tuomela
Raimo Tuomela (1992, 1995) develops an account of collective belief that he calls the positional account of group beliefs. This account relies on the notions of rule-based social positions and tasks that are defined by the rules in force in a collective and emphasizes the role of positional beliefs. “Positional beliefs are views that a position-holder has qua a position-holder or has internalized and accepted as a basis of his performances of aforementioned kinds of social tasks” (1995, 312). Strictly speaking, positional beliefs are not beliefs at all but acceptances. His account of collective belief attempts to encompass not only the beliefs of small, organized groups but those of organizations as well. Tuomela also provides an analysis of shared we-beliefs (called non-normative or merely factual group beliefs). Shared we-beliefs are not, according to Tuomela, proper group (collective) beliefs. Collective belief does not require that any particular member actually believe that p, whereas in the case of a we-belief each member believes that p and it is common knowledge that each member believes that p. In this respect shared we-beliefs are, according to Tuomela, those characterized by the summative accounts. They are able to capture certain social phenomena but cannot explain collective belief in cases like corporations or groups where individuals do not themselves believe the proposition in question. For our purposes we will be focusing on Tuomela’s account of group (collective) belief proper.
In Chapter Seven of The Importance of Us (1995) and in “Group Beliefs” (1992, Synthese 91: 285-318), Tuomela provides the following analysis of our concept of collective belief.
(BG) G believes that p in the social and normative circumstances C if and only if in C there are operative members A1…Am in G with respective positions P1…Pm such that
(1) the agents A1…Am, when they are performing their social tasks in their positions P1…Pm and due to their exercising the relevant authority system in G, (intentionally) jointly accept p as the view of G, and because of this exercise of the authority system they ought to continue to accept or positionally believe that p;
(2) there is a mutual belief among the operative members to the effect that (1);
(3) because of (1) the full-fledged and adequately informed non-operative members of G tend to tacitly accept, or at least ought to accept, p as members of G;
(4) there is a mutual belief in G to the effect that (3).
This account relies heavily on a distinction between operative and non-operative members, acceptance and belief, and the notion of correct social and normative circumstances. I will consider each of these features in turn.
Operative members are those members who are responsible for the group belief having the content that it does. In the case of a corporation, the board of directors may be the operative members, whereas those who work on the assembly line or in the credit department, for instance, are non-operative members. Which members are operative is determined by the rules and regulations of the corporation. Such rules and regulations are part of the social and normative circumstances referred to in Tuomela’s analysis.
The relevant social and normative circumstances involve tasks and social roles and rules, either formal (resembling laws or statutes) or informal (based on informal group agreements). So, for instance, corporations have certain rules that define the roles and tasks of their members. The rules are formal in some cases and are to be found in the corporate handbook or charter. These rules often specify which members are operative and define the relation between operative members and non-operative members. In addition, they make clear the chain of authority and decision-making procedures. “Indeed, in the case of typical formal collectives (like corporations), certain position-holders are required by the constitutive rules of the collective to set goals and accepts views for the collective” (1998, 308).
According to Tuomela’s analysis, then, one of the necessary conditions for our concept of group belief to apply is that operative members have certain intentional states. In this respect he shares something with Gilbert’s view and individualism in general. It is a further question whether Tuomela’s account can be viewed as intentionalistic and, if so, whether his analysis suffers from circularity. I consider this issue below. For now we can note that, for Tuomela, the intentional states of individuals must be embedded in the right social and normative circumstances. So group belief statements are not analyzed solely in terms of statements about individual intentional states on Tuomela’s view. Tuomela therefore breaks from strong analytical individualism.
Tuomela’s account also relies on the distinction between accepting a proposition and believing it. Tuomela stresses the difference between accepting and believing by noting that accepting is an action, whereas certain beliefs are “non-actional” or experiential. Perceptual beliefs seem to be of this kind: the agent is in some way passive. He concludes on this basis that at least experiential believing is different from accepting a proposition. As for non-perceptual beliefs, Tuomela goes on to argue that they are also different from accepting a proposition. Typically, when someone is said to believe that p, she does so if and only if she accepts p as true (given a certain disquotational account of truth). Tuomela points out, however, that this need not always be the case. Someone might accept a proposition but not believe it. “A person may, for instance, accept as true that he (or his body) is a probabilistically fluctuating bunch of hadrons and leptons without really believing it to be true in the experiential sense, let alone having that conviction. His acceptance would then be “cognitive” acceptance in the sense that he would be willing to operate on the assumption in question, to concretely act on it and to use it as a premise in his reasoning, and so on.” (1995, 309)
As we have seen, traditional summative accounts, which require that all or some of the members believe that p, are too strong. Tuomela attempts to avoid this problem by requiring only that operative members accept that p. No member actually has to believe that p. The operative members have, in Tuomela’s view, positional beliefs. Positional beliefs are views a position-holder has accepted as a basis for his performance of certain kinds of social tasks. These positional beliefs are different from personal beliefs. For instance, a member of the board of directors might personally believe that it is wrong for the company to fire 10,000 employees, yet accept the view that the company should do so and act on it, given that he holds a position of authority in the company. Positional views, then, need not be truth-related. We may accept false propositions and therefore adopt positional views that we know to be false.
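The contrast between a summative test and the positional conditions of (BG) can be illustrated with a rough sketch. It is a toy rendering, not Tuomela's own formulation. Each of conditions (2) through (4) is collapsed into a single boolean supplied by the caller, the summative test is taken in its weakest form (at least one member believes that p), and all names (`Member`, `summative_group_belief`, `positional_group_belief`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    operative: bool              # fixed by the group's rules, part of circumstances C
    accepts_p_for_group: bool    # positional acceptance of p as the view of G
    personally_believes_p: bool  # the member's own belief


def summative_group_belief(members):
    """Weakest summative test: at least one member personally believes that p."""
    return any(m.personally_believes_p for m in members)


def positional_group_belief(members,
                            mutual_belief_among_operatives,  # stands in for condition (2)
                            nonoperatives_tend_to_accept,    # stands in for condition (3)
                            mutual_belief_in_group):         # stands in for condition (4)
    operatives = [m for m in members if m.operative]
    # Condition (1): there are operative members and they jointly accept p for G.
    cond1 = bool(operatives) and all(m.accepts_p_for_group for m in operatives)
    return (cond1 and mutual_belief_among_operatives
            and nonoperatives_tend_to_accept and mutual_belief_in_group)


corporation = [
    Member("director_1", operative=True, accepts_p_for_group=True, personally_believes_p=False),
    Member("director_2", operative=True, accepts_p_for_group=True, personally_believes_p=False),
    Member("line_worker", operative=False, accepts_p_for_group=False, personally_believes_p=False),
]
```

With this data, `summative_group_belief(corporation)` is false, while `positional_group_belief(corporation, True, True, True)` is true. On the positional account a group can believe that p even when no member personally does.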
Tollefsen (2002) has argued that Tuomela’s account suffers from the same problem of circularity from which Gilbert’s account suffers. Consider condition (1) of Tuomela’s analysis.
(1) The agents A1…Am, when they are performing their social tasks in their positions P1…Pm, and due to their exercising the relevant authority system in G, (intentionally) jointly accept p as the view of G, and because of this exercise of the authority system they ought to continue to accept or positionally believe that p.
The operative members must intentionally and jointly accept P as the view of the group, where joint acceptance simply means that each operative member accepts p as the view of the group and this is common knowledge. But what are we to make of the reference to “the view of the group”? On an ordinary understanding of what it is to have a view on some issue is to have an opinion or a belief. The “view” of the group, then, seems to be simply the belief of the group. If so, one of the necessary and sufficient conditions for group belief appears to make reference to the notion of a group belief. Tuomela’s analysis, then, is circular. There is a group belief that p if and only if operative members accept p as the group belief. But group belief (the view of the group rather than the view of its individual members) is the concept that the analysis is supposed to illuminate by providing necessary and sufficient conditions for its application. It is hard to see how to make sense of the view of the group without appealing to notions like the belief of the group, the goal of the group, what the group intends, and so on.
The circularity issues raised by Gilbert’s and Tuomela’s accounts might be avoided if we simply give up the methodology of conceptual analysis. Indeed, Tuomela insists that he is not engaged in conceptual analysis but is providing truth conditions for our ascriptions. Thus, although his account is circular, it is not viciously so. We can view these accounts, then, as offering us a sort of identity theory of collective intentionality. Indeed, this is how Bratman viewed his account of collective (shared) intention. Group belief and intention play a certain role. What these theorists have done is identify a complex of interrelated intentional states of individuals that plays that role. One could, then, conclude that collective belief and/or intention is that complex of attitudes.
The problem with this approach is that one might wonder whether there might not be other ways in which these roles could be realized. Might there not be other combinations of individual attitudes and public acts and conditions, combinations that even in our world would function together in the ways that realize the roles of shared intention? The problem is analogous to the one facing type identity theories in the philosophy of mind. If mental states are multiply realized by different sorts of physical states, then type identity is false. Analogously, if collective intentional states are multiply realizable, then identifying them with the complex of individual states is also problematic. Collective intentional states could plausibly be realized by a variety of different configurations of individual intentional states. Indeed, Tuomela’s voluminous work on group intentionality supports this. He provides different accounts of group intentional states depending on the particular group in question (e.g. normative vs. non-normative group belief). And we have also seen that Gilbert acknowledges that the conditions she identifies for group intentional states are sufficient but not necessary. This leaves open the possibility that group beliefs and other attitudes could be realized by other sets of individual intentional states. At the most, then, these accounts tell us about ways in which group attitudes can be realized, but they do not provide us with an account of what group attitudes are.
We are left with the same question that plagues token-token identity theories in the philosophy of mind. The token identity thesis states that for every token instance of a mental state, there will be some token neuro-physiological event with which that token instance is identical. But what is it about these token mental states that makes them all tokens of the same type? If Sue and Eric both believe that Columbus is the capital of Ohio, then what is it that they have in common that makes their different neurophysiological states the same belief?
We can formulate the same question with respect to group intentional states. If GM and the Federal Reserve are both ascribed the belief that interest rates should be cut, what do these two groups have in common that makes it appropriate to ascribe to them the same belief? Tuomela would point to the fact that they both meet the conditions he specifies for proper group belief. But what if the members of GM meet the conditions of normative group belief and the members of the Federal Reserve Board meet the conditions for non-normative group belief? Do they share the same belief? And we are left with the further question of what it is about these particular configurations of intentional states that makes it appropriate to call them beliefs or intentions at all. Why is collective intentionality a species of intentionality? The work of Pettit (2002), Tollefsen (2002c), and Velleman (1997) attempts to fill this lacuna by showing that certain groups count as intentional agents given standard accounts of intentionality. Rather than analyze the concept of collective intentions or beliefs, these theorists have attempted to show that our everyday concept of belief and intention extends naturally to certain groups. Gilbert (2002), also, has recently attempted to flesh out the strong analogy between individual beliefs and group beliefs.
4. Internal Debates: Belief vs. Acceptance
Among those who acknowledge that collectives can be the subject of intentional state ascription, there is a debate raging over which types of intentional states are appropriately attributable to collectives. There are some, like Margaret Gilbert and Tollefsen, who argue that it is appropriate to attribute to groups a wide range of intentional states, including beliefs. Others, like K. Brad Wray (2002), Raimo Tuomela (2000), and Anthonie Meijers (1999), have argued that, although groups may accept a proposition, they cannot believe. The nature of belief, according to these philosophers, is such that groups cannot be believers. The latter camp has been labeled by Gilbert as the rejectionists because they reject the possibility of group belief. For ease, I refer to the former camp as the believers.
In “Collective Belief and Acceptance” (2002), Wray identifies four differences between acceptance and belief.
You can accept things that you do not believe but you cannot believe what you do not accept. (Rejection of the entailment thesis)
“Acceptance often results from a consideration of one’s goals, and thus results from adopting a policy to pursue a particular goal.” (2002, p. 7).
Belief is a disposition to feel that something is true.
Belief is involuntary, whereas acceptance is voluntary.
Wray then proceeds to show that the examples that Gilbert gives of group belief (1989), (1994), are actually instances of acceptance. Because group attitudes are formed against the background of goals, because they are formed voluntarily, and because their formation does not entail that members believe the content of the attitude, group views are more aptly described as instances of acceptance. Both Wray (2000) and Meijers (1999) develop an acceptance-based account of collective attitudes.
There have been various attempts to respond to this line of argument. Much rests on the merits of the original distinction between acceptance and belief and on exploring the analogy between groups and individuals. Tollefsen (2003b), for instance, argues that the issue of voluntarism concerning belief is not as clear cut as rejectionists make it out to be. The assertion that we cannot will to believe is an empirical assertion and not a conceptual assertion about the nature of belief. Perhaps, then, individuals cannot will to believe because of our epistemic limits, but this does not rule out the possibility that collective agents can will to believe. Gilbert (2002) has argued that rejectionists beg the question with respect to collective belief. They assume that collective belief must have all the features of individual belief in order for it to be genuine belief but this just privileges individual belief without argument. It may be that collective belief, although a species of belief, is unique in certain respects.
5. The Role of Collective Intentionality
We have already seen that some theorists focus on the role of collective intentions in organizing and coordinating collective action. And in Searle’s account of social reality, collective intentions confer status functions on artifacts and turn them into social facts. Money is money because we accept it and intend it to be. Others have explored the role that collective intentionality, either collective intentions or beliefs, plays in jurisprudence, economics, politics, and moral theory. Gilbert (2001), for instance, argues that her account of collective intentionality provides a better account of social rules than H.L.A. Hart’s. Social rules are to be understood as the joint commitments of a society. This explains why we are justified in rebuking those who violate social rules. Maria Cristina Redondo (2001) argues that Searle’s account of social facts, an account grounded in collective intentionality, supports a version of legal positivism. Ota Weinberger (2001) develops the relationship between discussions of collective intentionality and the notion of the “general will” or the “will of the people.” Weinberger argues that the “general will” should be understood in terms of institutional processes that are collectively accepted within the community.
6. References and Further Reading
Bratman, M. 1987. Intentions, Plans, and Practical Reason. Cambridge, MA: Harvard University Press.
Bratman, M. 1992. “Practical Reasoning and Acceptance in a Context.” Mind 101: 1-15.
Bratman, M. 1993. “Shared Intention.” Ethics 104: 97-113.
Bratman, M. 1999. Faces of Intention. Cambridge: Cambridge University Press.
Cohen, L.J. 1992. An Essay on Belief and Acceptance. Oxford, U.K.: Clarendon Press.
Corlett, A. 1996. Analyzing Social Knowledge. Maryland: Rowman and Littlefield.
Davidson, D. 1992. The Second Person. Midwest Studies in Philosophy XVII: 255-265.
Gilbert, M. 1987. Modelling Collective Belief. Synthese, vol. 73. Reprinted in (1996). Chapter 7.
Gilbert, M. 1989. On Social Facts. New York: Routledge.
Gilbert, M. 1993. “Agreements, Coercion, and Obligation.” Ethics. 103: 679-706
Gilbert, M. 1994. “Remarks on collective belief” in Frederick Schmitt ed. Socializing Epistemology. Maryland: Rowman & Littlefield.
Gilbert, M. 1996. Living Together. Maryland: Rowman & Littlefield.
Gilbert, M. 1996. “Concerning Sociality: The Plural Subject as Paradigm” in J. Greenwood (ed.), The Mark of the Social. Maryland: Rowman and Littlefield.
Gilbert, M. (2000) Sociality and Responsibility. Blue Ridge Summit: Rowman and Littlefield.
Gilbert, M. 2001. “Social Rules as Plural Subject Phenomena” in Lagerspetz et al.
Gilbert, M. 2002. “Belief and Acceptance as Features of Groups.” Protosociology, Volume 16, 35-69.
Gilbert, M. 2003. “The Structure of the Social Atom: Joint Commitment and the Foundation of Human Social Behavior” in Schmitt, F. ed. Socializing Metaphysics. Maryland: Rowman and Littlefield.
Hindriks, F. 2002. “Social Groups, Collective Intentionality, and Anti-Hegelian Skepticism,” in Realism in Action: Essays in the Philosophy of Social Science, Matti Sintonen, Petri Ylikoski, and Kaarlo Miller (eds.), Dordrecht: Kluwer Academic Publishers.
Hindriks, F.A. (2002) “Social Ontology, Collective Intentionality, and Ockhamian Skepticism” in Meggle (2002), 125-49.
Lagerspetz, E. Heikki Ikaheimo, and Jussi Kotkavirta, eds. 2001 On the Nature of Social and Institutional Reality. Finland, SoPhi.
Lewis, D. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Meggle, G. (ed.) (2002) Social Facts and Collective Intentionality, Frankfurt am Main: Hansel-Hohenhausen.
Meijers, A. (1994). Speech Acts, Communication, and Collective Intentionality: Beyond Searle’s Individualism. Leiden.
Meijers, A. (1999) Belief, Cognition, and the Will. Tilburg: Tilburg University Press, 59-71.
Meijers, A. (2003) “Can Collective Intentionality be Individualized?” American Journal of Economics and Sociology 62, 167-93.
Miller, S. 2001. Social Action, Cambridge University Press.
Pettit, P. (2003) “Groups with Minds of their Own” in Schmitt F. (ed) Socializing Metaphysics, Rowman and Littlefield, pp. 167-93.
Quinton, A. 1975. “Social Objects.” Proceedings of the Aristotelian Society 75: 67-87.
Redondo, M. 2001. “On Normativity in Legal Contexts,” in Lagerspetz et al.
Scanlon, T. 1998. What We Owe to Each Other. Cambridge, MA: Harvard University Press.
Schmitt, F. (ed). 2003. Socializing Metaphysics. Maryland: Rowman and Littlefield.
Searle, J. 1990. “Collective Intentions and Actions.” In Intentions in Communication, P.Cohen, J. Morgan, and M.E. Pollack, eds. Cambridge, MA: Bradford Books, MIT press.
Searle, J. 1995. The Construction of Social Reality. New York, N.Y.: Free Press.
Stoutland, F. 1997. “Why Are Philosophers of Action so Anti-Social?” in Alanen, Heinamaa, and Wallgreen eds. Commonality and Particularity in Ethics. New York, N.Y.: St. Martin’s Press.
Tollefsen, D. 2002. “Collective Intentionality and the Social Sciences.” Philosophy of the Social Sciences 32 (1): 25-50.
Tollefsen, D. 2002b. “Challenging Epistemic Individualism.” Protosociology, volume 16, pp. 86-117. June 2002.
Tollefsen, D. 2002c. “Organizations as True Believers.” Journal of Social Philosophy, vol 33 (3): pp. 395-411.
Tollefsen, D. 2002d. Interpreting Organizations. Dissertation. Ohio State University.
Tollefsen, D. 2003a. “Collective Epistemic Agency.” Southwest Philosophy Review, vol. 20 (1), pp. 55-66.
Tollefsen, D. 2003b. “Rejecting Rejectionism.” Protosociology, volume 18. pp. 389-408.
Tollefsen, D. 2004. “Joint Action and Joint Attention.” Under review Philosophy of the Social Sciences
Tuomela, R. 1992. Group Beliefs. Synthese 91: 285-318.
Tuomela, R. 1993. “Corporate Intention and Corporate Action.” Analyse und Kritik 15: 11-21.
Tuomela, R. 1995. The Importance of Us. Stanford: Stanford University Press.
Tuomela, R. 2000. Cooperation: A Philosophical Study. Philosophical Studies Series, Kluwer Academic Publishers, Dordrecht.
Tuomela, R. 2002. The Philosophy of Social Practices. Cambridge, UK: Cambridge University Press.
Tuomela, R. (2003). The Philosophy of Social Practices. Cambridge, UK: Cambridge University Press.
Tuomela, R. 2004. “We-Intention Revisited.” forthcoming in Philosophical Studies.
Velleman, D. 1997. “How to Share an Intention.” Philosophy and Phenomenological Research LVII: 29-50.
Weinberger, O. 2001. “Democracy and theory of institutions,” in Lagerspetz et al.
Wray, B. 2000. “Collective Belief and Acceptance.” Synthese 00: 1-15.
Special Relativity: Proper Time, Coordinate Systems, and Lorentz Transformations
This supplement to the main Time article explains some of the key concepts of the Special Theory of Relativity (STR). It shows how the predictions of STR differ from classical mechanics in the most fundamental way. Some basic mathematical knowledge is assumed.
The essence of the Special Theory of Relativity (STR) is that it connects three distinct quantities to each other: space, time, and proper time. ‘Time’ is also called coordinate time or real time, to distinguish it from ‘proper time’. Proper time is also called clock time, or process time, and it is a measure of the amount of physical process that a system undergoes. For example, proper time for an ordinary mechanical clock is recorded by the number of rotations of the hands of the clock. Alternatively, we might take a gyroscope, or a freely spinning wheel, and measure the number of rotations in a given period. We could also take a chemical process with a natural rate, such as the burning of a candle, and measure the proportion of candle that is burnt over a given period.
Note that these processes are measured by ‘absolute quantities’: the number of times a wheel spins on its axis, or the proportion of candle that has burnt. These give absolute physical quantities and do not depend upon assigning any coordinate system, as does a numerical representation of space or real time. The numerical coordinate systems we use firstly require a choice of measuring units (meters and seconds, for example). Even more importantly, the measurement of space and real time in STR is relative to the choice of an inertial frame. This choice is partly arbitrary.
Our numerical representation of proper time also requires a choice of units, and we adopt the same units as we use for real time (seconds). But the choice of a coordinate system, based on an inertial frame, does not affect the measurement of proper time. We will consider the concept of coordinate systems and measuring units shortly.
Proper time can be defined in classical mechanics through cyclic processes that have natural periods – for instance, pendulum clocks are based on counting the number of swings of a pendulum. More generally, any natural process in a classical system runs through a sequence of physical states at a certain absolute rate, and this is the ‘proper time rate’ for the system.
In classical physics, two identical types of systems (with identical types of internal construction, and identical initial states) are predicted to have the same proper time rates. That is, they will run through their physical states in perfect correlation with each other.
This holds even if two identical systems are in relative constant motion with respect to each other. For instance, two identical classical clocks would run at the same rate, even if one is kept stationary in a laboratory, while the other is placed in a spaceship traveling at high speed.
This invariance principle is fundamental to classical physics, and it means that in classical physics we can define: Coordinate time = Proper time for all natural systems. For this reason, the distinction between these two concepts of time was hardly recognized in classical physics (although Newton did distinguish them conceptually, regarding ‘real time’ as an absolute temporal flow, and ‘proper time’ as merely a ‘sensible measure’ of real time; see his Scholium).
However, the distinction only gained real significance in the Special Theory of Relativity, which contradicts classical physics by predicting that the rate of proper time for a system varies with its velocity, or motion through space. The relationship is very simple: the faster a system travels through space, the slower its internal processes go. At the maximum possible speed, the speed of light, c, the internal processes in a physical system would stop completely. Indeed, for light itself, the rate of proper time is zero: there is no ‘internal process’ occurring in light. It is as if light is ‘frozen’ in a specific internal state.
At this point, we should mention that the concept of proper time appears more strongly in quantum mechanics than in classical mechanics, through the intrinsically ‘wave-like’ nature of quantum particles. In classical physics, single point-particles are simple things, and do not have any ‘internal state’ that represents proper time, but in quantum mechanics, the most fundamental particles have an intrinsic proper time, represented by an internal frequency. This is directly related to the wave-like nature of quantum particles. For radioactive systems, the rate of radioactive decay is a measure of proper time. Note that the amount of decay of a substance can be measured in an absolute sense. For light, treated as a quantum mechanical particle (the photon), the rate of proper time is zero, and this is because it has no mass. But for quantum mechanical particles with mass, there is always a finite ‘intrinsic’ proper time rate, represented by the ‘phase’ of the quantum wave. Classical particles do not have any correlate of this feature, which is responsible for quantum interference effects and other non-classical ‘wave-like’ behavior.
2. The STR Relationship Between Space, Time, and Proper Time
STR predicts that motion of a system through space is directly compensated by a decrease in real internal processes, or proper time rates. Thus, a clock will run fastest when it is stationary. If we move it about in space, its rate of internal processes will decrease, and it will run slower than an identical type of stationary clock. The relationship is precisely specified by the most profound equation of STR, usually called the metric equation (or line metric equation). The metric equation is:
(1) c²Δτ² = c²Δt² − Δr²
This applies to the trajectory of any physical system. The quantities involved are:
Δ is the difference operator.
Δτ is the amount of proper time elapsed between two points on the trajectory.
Δt is the amount of real time elapsed between two points on the trajectory.
Δr is the amount of motion through space between two points on the trajectory.
c is the speed of light, and its numerical value depends on the units we choose for space and time.
The meaning of this equation is illustrated by considering simple trajectories depicted in a space-time diagram.
Figure 1. Two simple space-time trajectories.
If we start at an initial point on the trajectory of a physical system, and follow it to a later point, we find that the system has covered a certain amount of physical space, Δr, over a certain amount of real time, Δt, and has undergone a certain amount of internal process or proper time, Δτ. As long as we use the same units (seconds) to represent proper time and real time, these quantities are connected as described in Equation (1). Proper time intervals are shown in Figure 1 by blue dots along the trajectories. If these were trajectories of clocks, for example, then the blue dots would represent seconds ticked off by the clock mechanism.
In Figure 1, we have chosen to set the speed of light as 1. This is equivalent to using our normal units for time, i.e. seconds, but choosing the units for space as c meters (instead of 1 meter), where c is the speed of light in meters per second. This system of units is often used by physicists for convenience, and it appears to make the quantity c drop out of the equations, since c = 1. However, it is important to note that c is a dimensional constant, and even if its numerical value is set equal to 1 by choosing appropriate units, it is still logically necessary in Equation (1) for the equation to balance dimensionally. For multiplying an interval of time, Δt, by the quantity c converts from a temporal quantity into a spatial quantity. Equations of physics, just like ordinary propositions, can only identify objects or quantities of the same physical kinds with each other, and the role of c as a dimensional constant remains crucial in Equation (1), for the identity it states to make any sense.
Trajectories in Figure 1
Trajectory 1 (green) is for a stationary particle, hence Δr = 0 (it has no motion through space), and putting this value in Equation (1), we find that: Δτ = Δt. For a stationary particle, the amount of proper time is equal to the amount of coordinate time.
Trajectory 2 (red) is for a moving particle, and Δr > 0. We have chosen the velocity in this example to be: v = c/2, half the speed of light. But: v = Δr/Δt (distance traveled in the interval of time). Hence: Δr = ½cΔt. Putting this value into Equation (1), we get: c²Δτ² = c²Δt² − (½cΔt)², or: Δτ = √(¾)Δt ≈ 0.87Δt. Hence the amount of proper time is only about 87% of coordinate time. Even though this trajectory is very fast, proper time is still only slowed down a little.
Trajectory 3 (black) is for a particle moving at the speed of light, with v = c, giving: Δr = cΔt. Putting this in Equation (1), we get: c²Δτ² = c²Δt² − (cΔt)² = 0. Hence for a light-like particle, the amount of proper time is equal to 0.
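The three cases above can be checked numerically. The following is a minimal Python sketch, assuming SI units; the helper name proper_time is introduced here purely for illustration. It solves Equation (1), c²Δτ² = c²Δt² − Δr², for the proper time given an interval of coordinate time and the distance covered.

```python
import math

C = 299_792_458.0  # speed of light in metres per second (SI value)

def proper_time(dt, dr, c=C):
    """Solve Equation (1), c²Δτ² = c²Δt² − Δr², for Δτ.

    dt: coordinate (real) time elapsed, in seconds
    dr: distance covered through space, in metres
    """
    interval_sq = (c * dt) ** 2 - dr ** 2
    if interval_sq < 0:
        # Δr > cΔt would mean motion faster than light.
        raise ValueError("Δr exceeds cΔt")
    return math.sqrt(interval_sq) / c

dt = 1.0  # one second of coordinate time
for label, v in [("Trajectory 1 (v = 0)", 0.0),
                 ("Trajectory 2 (v = c/2)", C / 2),
                 ("Trajectory 3 (v = c)", C)]:
    # For uniform motion, Δr = vΔt.
    print(f"{label}: proper time = {proper_time(dt, v * dt):.4f} s")
```

For one second of coordinate time, the sketch reports about 1.0000, 0.8660, and 0.0000 seconds of proper time, in line with the three trajectories described above.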
Now from the classical point of view, Equation (1) is a surprise – indeed, it seems bizarre! For how can mere motion through space directly and precisely affect the rate of physical processes occurring in a system? We are used to the opposite idea, that motion through space, by itself, has no intrinsic effect on processes. This is at the heart of the classical Galilean invariance or symmetry. But STR breaks this rule.
We can compare this situation with classical physics, where (for linear trajectories) we have two independent equations:
(2.a) Δτ = Δt
(2.b) Δr = vΔt for some real number v
Equation (2.a) just means that the rate of proper time in a system is invariant – and we measure it in the same units as coordinate time, t.
Equation (2.b) just means that every particle or system has some finite velocity or speed, v, through space, with v defined by: v = Δr/Δt.
There is no connection here between proper time and spatial motion of the system.
The fact that (2) is replaced by (1) in STR is very peculiar indeed. It means that the rate of internal process in a system like a clock (whether it is a mechanical, chemical, or radioactive clock) is automatically connected to the motion of the clock in space. If we speed up a clock in motion through space, the rate of internal process slows down in a precise way to compensate for the motion through space.
The great mystery is that there is no apparent mechanism for this effect, called time dilation. In classical physics, to slow down a clock, we have to apply some force like friction to its internal mechanism. In STR, the physical process of a system is slowed down just by moving it around. This applies equally to all physical processes. For instance, a radioactive isotope decays more slowly at high speed. And even animals, including human beings, should age more slowly if they move around at high speed, giving rise to the Twin Paradox.
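The size of the slowing follows directly from Equation (1). For uniform motion at speed v we have Δr = vΔt, so c²Δτ² = c²Δt² − v²Δt², and hence Δτ = Δt√(1 − v²/c²). The sketch below, in Python, applies this to a pair of twins. The function names and the figures (10 years, 0.8c) are assumptions chosen for illustration, and the traveller's turnaround is idealized away, so this is a simplified picture rather than a full treatment of the Twin Paradox.

```python
import math

def dilation_factor(beta):
    """Rate of proper time relative to coordinate time, Δτ/Δt = √(1 − v²/c²).

    beta is the speed expressed as a fraction of the speed of light (v/c).
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    return math.sqrt(1.0 - beta ** 2)

def elapsed_proper_time(coordinate_time, beta):
    """Proper time accumulated over a given coordinate time at constant speed."""
    return coordinate_time * dilation_factor(beta)

# Hypothetical twins: one stays at rest for 10 years of coordinate time,
# the other moves at a constant 0.8c over the same coordinate interval.
home_twin = elapsed_proper_time(10.0, 0.0)
travelling_twin = elapsed_proper_time(10.0, 0.8)
print(f"home twin ages {home_twin:.1f} years")
print(f"travelling twin ages {travelling_twin:.1f} years")
```

Under these assumptions the stationary twin ages 10 years while the travelling twin ages about 6, since √(1 − 0.8²) = 0.6.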
In fact, time dilation was already recognized by Lorentz and Poincaré, who developed most of the essential mathematical relationships of STR before Einstein. But Einstein formulated a more comprehensive theory, and, with important contributions by Minkowski, he provided an explanation for the effects. The Einstein-Minkowski explanation appeals to the new concept of a space-time manifold, and interprets Equation (1) as a kind of ‘geometric’ feature of space-time. This view has been widely embraced in 20th-century physics. By contrast, Lorentz refused to believe in the ‘geometric’ explanation, and he thought that motion through space has some kind of ‘mechanical’ effect on particles, which causes processes to slow down. While Lorentz’s view is dismissed by most physicists, some writers have persisted with similar ideas, and the issues involved in the explanation of Equation (1) continue to be of deep interest, to philosophers at least.
But before moving on to the explanation, we need to discuss the concepts of coordinate systems for space and time, which we have been assuming so far without explanation.
3. Coordinate Systems
In physics we generally assume that space is a three dimensional manifold and time is a one dimensional continuum. A coordinate system is a way of representing space and time using numbers to represent points. We assign a set of three numbers, (x,y,z), to characterize points in space, and one number, t, to characterize a point in time. Combining these, we have general space-time coordinates: (x,y,z,t). The idea is that every physical event in the universe has a ‘space-time location’, and a coordinate system provides a numerical description of the system of these possible ‘locations’.
Classical coordinate systems were used by Descartes, Galileo, Newton, Leibniz, and other classical physicists to describe space. Classical space is assumed to be a three dimensional Euclidean manifold. Classical physicists added time coordinates, t, as an additional parameter to characterize events. The principles behind coordinate systems seemed very intuitive and natural up until the beginning of the 20th century, but things changed dramatically with the STR. One of Einstein’s first great achievements was to reexamine the concept of a coordinate system, and to propose a new system suited to STR, which differs from the system for classical physics. In doing this, Einstein recognized that the notion of a coordinate system is theory dependent. The classical system depends on adopting certain physical assumptions of classical physics – for instance, that clocks do not alter their rates when they are moved about in space. In STR, some of the laws underpinning these classical assumptions change, and this changes our very assumptions about how we can measure space and time. To formulate STR successfully, Einstein could not simply propose a new set of physical laws within the existing classical framework of ideas about space and time: he had to simultaneously reformulate the representation of space and time. He did this primarily by reformulating the rules for assigning coordinate systems for space and time. He gave a new system of rules suited to the new physical principles of STR, and reexamined the validity of the old rules of classical physics within this new system.
A key feature Einstein focused on is that a coordinate system involves a system of operational principles, which connect the features of space and time with physical processes or ‘operations’ that we can use to measure those features. For instance, the theory of classical space assumes that there is an intrinsic distance (or length) between points of space. We may take distance itself to be an underlying feature of ‘empty space’. Geometric lines can be defined as collections of points in space, and line segments have intrinsic lengths, prior to any physical objects being placed in space. But of course, we only measure (or perceive) the underlying structure of space by using physical objects or physical processes to make measurements. Typically, we use ‘straight rigid rulers’ to measure distances between points of space; or we use ‘uniform, standard clocks’ to measure the time intervals between moments of time. Rulers and clocks are particular physical objects or processes, and for them to perform their measurement functions adequately, they must have appropriate physical properties.
But those physical properties are the subject of the theories of physics themselves. Classical physics, for example, assumes that ordinary rigid rulers maintain the same length (or distance between the end-points) when they are moved around in space. It also assumes that there are certain types of systems (providing ‘idealized clocks’) that produce cyclic physical processes, and maintain the same temporal intervals between cycles through time, even if we move these systems around in space.
These assumptions are internally consistent with principles of measurement in classical physics. But they are contradicted in STR, and Einstein had to reformulate the operational principles for measuring space and time, in a way that is internally consistent with the new physical principles of STR.
We will briefly describe these new operational principles shortly, but there are some features of coordinate systems that are important to appreciate first.
a. Coordinates as a Mathematical Language for Time and Space
The assignment of a numerical coordinate system for time or space is thought of as providing a mathematical language (using numbers as names) for representing physical things (time and space). In a sense, this language could be ‘arbitrarily chosen’: there are no laws about what names can be used to represent things. But naturally there are features that we want a coordinate system to reflect. In particular, we want the assignment of numbers to directly reflect the concepts of distance between points of space, and the size of intervals between moments of time.
We perform mathematical operations on numbers, and we can subtract two numbers to find the ‘numerical distance’ between them. For numbers are really defined as certain structures, with features such as continuity, and we want to use the structures of number systems to represent structural features of space and time.
For instance, we assume in our fundamental physical theory that any two intervals of time have intrinsic magnitudes, which can be compared to each other. The ‘intrinsic temporal distance’ between two moments, t1 and t2, may be the same as that between two quite different moments, t3 and t4. We naturally want to assign numbers to times so that ordinary numerical subtraction corresponds to the ‘intrinsic temporal distance’ between events. We choose a ‘uniform’ coordinate system for time to achieve this.
Figure 2. A coordinate system for time gives a mathematical language for a physical thing.
Numbers are used as names for moments of time.
4. Cartesian Coordinates for Space
Time is simple because it is one-dimensional. Three-dimensional space is much more complex. Because space is three dimensional, we need three separate real numbers to represent a single point. Physicists normally choose a Cartesian coordinate system to represent space. We represent points in this system as: r = (x,y,z), where x, y, and z are separate numerical coordinates, in three orthogonal (perpendicular) directions.
The numerical structure with real-number points (x,y,z) is denoted in mathematics as R3. Three dimensional space itself (a physical thing) is denoted as: E3. A Cartesian coordinate system is a special kind of mapping between points of these two structures. It makes the intrinsic spatial distance between two points in E3 be directly reflected by the ‘numerical distance’ between their numerical coordinates in R3.
The numerical distances in R3 are determined by a numerical function for length. A line from the origin, (0,0,0), to the point r = (x,y,z), which is called the vector r, has its length given by the Pythagorean formula:
|r| = √(x²+y²+z²).
More generally, for any two points, r1 = (x1, y1, z1), and: r2 = (x2, y2, z2), the distance function is:
|r2 – r1| = √((x2 – x1)²+ (y2 – y1)²+ (z2 – z1)²)
The special feature of this system is that the lengths of lines in the x, y, or z directions alone are given directly by the values of the coordinates. E.g. if: r = (x,0,0), then the vector to r is a line purely in the x-direction, and its length is simply: |r| = |x|. If r1 = (x1,0,0), and: r2 = (x2,0,0), then the distance between them is just: |r2 – r1| = |x2 – x1|. As well, a Cartesian coordinate system treats the three directions, x, y, and z, in a symmetric way: the angle between any pair of these directions is the same, 90°. For this reason, a Cartesian system can be rotated, and the same form of the general distance function is maintained in the rotated system.
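The claim that rotation preserves the form of the distance function can be illustrated numerically. The following is a minimal sketch, not part of the original exposition: the function names distance and rotate_z, the sample points, and the choice of a rotation about the z-axis are all illustrative assumptions.

```python
import math

def distance(r1, r2):
    """Cartesian (Pythagorean) distance between two points of R3."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(r1, r2)))

def rotate_z(r, angle):
    """Rotate the point r = (x, y, z) about the z-axis by `angle` radians."""
    x, y, z = r
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

# Two sample points (hypothetical values chosen for easy arithmetic).
r1 = (1.0, 2.0, 3.0)
r2 = (4.0, 6.0, 3.0)

d_original = distance(r1, r2)                              # sqrt(9 + 16 + 0)
d_rotated = distance(rotate_z(r1, 0.7), rotate_z(r2, 0.7))  # same formula, rotated axes
print(d_original, d_rotated)
```

The same Pythagorean formula is applied before and after the rotation, and the two results agree up to floating-point rounding.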
In fact, there are spatial manifolds which do not have any possible Cartesian coordinate system – e.g. the surface of a sphere, regarded as a two dimensional manifold, cannot be represented by using Cartesian coordinates. Such spaces were first studied as geometric systems in the 19th century, and are called non-classical or non-Euclidean geometries. However, classical space is Euclidean, and by definition:
Euclidean space can be represented by Cartesian coordinate systems.
We can define alternative, non-Cartesian, coordinate systems for Euclidean space; for instance, cylindrical and spherical coordinate systems are very useful in physics, and they use mixtures of linear or radial distance, and angles, as the numbers to specify points of space. The numerical formulas for distance in these coordinate systems appear quite different from the Cartesian formula. But they are defined to give the same results for the distances between physical points. This is the most crucial feature of the concept of distance in classical physics:
Distance between points in classical space (or between two events that occur at the same moment of time) is a physical invariant. It does not change with the choice of coordinate system.
The form of the numerical equation for distance changes with the choice of coordinate system; but this is done deliberately to preserve the physical concept of distance.
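As an illustration of this invariance, the sketch below computes the distance between two points once from their spherical coordinates and once from the corresponding Cartesian coordinates. It assumes the common physics convention (r, polar angle θ, azimuthal angle φ); the function names and sample values are hypothetical and not taken from the text.

```python
import math

def to_cartesian(r, theta, phi):
    """Spherical (r, polar angle theta, azimuth phi) -> Cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cartesian_distance(p, q):
    """Pythagorean distance from Cartesian coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

def spherical_distance(p, q):
    """Distance computed directly from spherical coordinates (law of cosines)."""
    r1, th1, ph1 = p
    r2, th2, ph2 = q
    # Cosine of the angle between the two radial directions.
    cos_angle = (math.sin(th1) * math.sin(th2) * math.cos(ph1 - ph2)
                 + math.cos(th1) * math.cos(th2))
    return math.sqrt(r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * cos_angle)

a = (2.0, 0.5, 1.0)    # (r, theta, phi), sample values
b = (3.0, 1.2, -0.4)

d_sph = spherical_distance(a, b)
d_cart = cartesian_distance(to_cartesian(*a), to_cartesian(*b))
print(d_sph, d_cart)
```

The two formulas look quite different, but they return the same number for the same pair of physical points.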
5. Choice of Inertial Reference Frame
A second crucial concept is the idea of a reference frame. A reference frame specifies all the trajectories that are regarded as stationary, or at rest in space. This defines the property of remaining at the same place through time. But the key feature of both classical mechanics and STR is that no unique reference frame is determined. Any object that is not accelerating can be regarded as stationary ‘in its own inertial frame’. It defines a valid reference frame for the whole universe. This is the natural reference frame ‘from the point of view’ of the object, or ‘relative to the object’. But there are many possible choices, because given any particular reference frame, any other frame defined to give everything a constant velocity relative to the first frame is also a valid choice.
The class of possible (physically valid) reference frames is objectively determined, because acceleration is absolutely distinguished from constant motion. Any object that is not accelerating may be regarded as defining a valid reference frame. But the specific choice of a reference frame from the range of possibilities is regarded as arbitrary or conventional. This choice must be made before a coordinate system can be defined to represent distances in space and time. Even after we have chosen a reference frame, there are still innumerable choices of coordinate systems. But the reference frame settles the definition of distances between events, which must be defined as the same in any coordinate system relative to a given reference frame.
The idea of the conventionality of the reference frame is partly evident already in the choice of a Cartesian coordinate system: for it is an arbitrary matter where we choose the origin, or point: 0 = (0,0,0), for such a system. It is also arbitrary which directions we choose for the x, y, and z axes – as long as we make them mutually perpendicular. We are free to rotate a given set of axes, x, y, z, to produce a new set, x’, y’, and z’, and this gives another Cartesian coordinate system. Thus, translations and rotations of Cartesian coordinate systems for space still leave us with Cartesian systems.
But there is a further transformation, which is absolutely central to classical physics, and involves both time and space. This is the Galilean velocity transformation, or velocity boost. The essential point is that we need to apply a spatial coordinate system through time. In pure classical geometry, we do not have to take time into account: we just assign a single coordinate system, at a single moment of time. But in physics we need to apply a coordinate system for space at different moments of time. How do we know whether the coordinate system we apply at one moment of time represents the same coordinate system we use at a later moment of time?
The principles of classical physics mean that we cannot measure ‘absolute location in space’ across time. The reason is the fundamental classical principle that the laws of nature do not distinguish between two inertial frames moving relative to each other at a constant speed. This is the classical Galilean principle of ‘relativity of motion’. Roughly stated, this means that uniform motion through space has no effect on physical processes. And if motion in itself does not affect processes, then we cannot use processes to detect motion.
Newton believed that the classical conception of space requires there to be absolute spatial locations through time nonetheless, and that some special coordinate systems or physical objects will indeed be at ‘absolute rest’ in space. But in the context of classical physics, it is impossible to measure whether any object is at absolute rest, or is in uniform motion in space. Because of this, Leibniz denied that classical physics requires any concept of absolute position in space, and argued that only the notion of ‘relative’ or ‘relational’ space is required. In this view, only the relative positions of objects with regard to each other are considered real. For Newton, the impossibility of measuring absolute space does not prevent it from being a viable concept, and even a logically necessary concept. There is still no general agreement about this debate between ‘absolute’ and ‘relative’ or ‘relational’ conceptions of space. It is one of the great historical debates in the philosophy of both classical and relativistic physics. However, it is generally accepted that classical physics makes absolute space undetectable. This means, at least, that in the context of classical physics there is no way of giving an operational procedure for determining absolute position (or absolute rest) through time.
However, absolute acceleration is detectable. Accelerations are always accompanied by forces. This means that we can certainly specify the class of coordinate systems which are in uniform motion, or which do not accelerate. These special systems are called inertial systems, or inertial frames, or Galilean frames. The existence of inertial frames is a fundamental assumption of classical physics. It is also fundamental in STR, and the notion of an inertial frame is very similar in both theories.
The laws of classical physics are therefore specified for inertial coordinate systems. They are equally valid in any inertial frame. The same holds for the laws of STR. However, the laws for transforming from one inertial frame to another are different for the two theories. To see how this works, we now consider the operational specification of coordinate systems.
6. Operational Specification of Coordinate Systems for Classical Space and Time
In classical physics, we can define an ‘operational’ measuring system, which allows us to assign coordinates to events in space and time.
Classical Time. We imagine measuring time by making a number of uniform clocks, synchronizing them at some initial moment, checking that they all run at exactly the same rates (proper time rates), and then moving clocks to different points of space, where we keep them ‘stationary’ in a chosen inertial frame. We subsequently measure the times of events that occur at the various places, as recorded by the different clocks at those places.
Of course, we cannot assume that our system of clocks is truly stationary. The entire system of clocks placed in uniform motion would also define a valid inertial frame. But the laws of classical physics mean that clocks in uniform inertial motion run at exactly the same rates, and so the times recorded for specific events turn out to be exactly the same, on the assumptions of the classical theory, for any such system of clocks.
Classical Space. We imagine measuring space by constructing a set of rigid measuring rods or rulers of the same length, which we can (imaginatively at least) set up as a grid across space, in an inertial frame. We keep all the rulers stationary relative to each other, and we use them to measure the distances between various events. Again, the main complication is that we cannot determine any absolutely stationary frame for the grid of rulers, and we can set up an alternative system of rulers which is in relative motion. This results in assigning different ‘absolute velocities’ to objects, as measured in two different frames. However, on the assumptions of the classical theory, the relative distance between any two objects or events, taken at any given moment of time, is measured to be the same in any inertial frame. This is because, in classical physics, uniform motion in itself does not alter the lengths of material objects, or the forces between systems of objects. (Accelerations do alter lengths.)
7. Operational Specification of Coordinate Systems for STR Space and Time
In STR, the situation is in many ways very similar to classical physics: there is still a special concept of inertial frames, acceleration is absolutely detectable, and uniform velocity is undetectable. According to STR, the laws of physics still are invariant with regard to uniform motion in space, very much like the classical laws.
We also specify operational definitions of inertial coordinate systems in STR in a similar way to classical physics. However, the system sketched above for assigning classical coordinates fails, because it is inconsistent with the physical principles of STR. Einstein was forced to reconstruct the classical system of measurement to obtain a system which is internally consistent with STR.
STR Time. In STR, we can still make uniform clocks, which run at the same rates when they are held stationary relative to each other. But now there is a problem synchronizing them at different points of space. We can start them off synchronized at a particular common point; but moving them to different points of space already upsets their synchronization, according to Equation (1).
However, while synchronizing distant clocks is a problem, they nonetheless run at the same intrinsic rates as each other when held in the same inertial frame. And we can ensure two clocks are in a common inertial frame as long as we can ensure that they maintain the same distance from each other. We see how to do this next.
Given we have two clocks maintained at the same distance from each other, Einstein showed that there is indeed a simple operational procedure to establish synchronization. We send a light signal from Clock 1 to Clock 2, and reflect it back to Clock 1. We record the time it was sent on Clock 1 as t0, and the time it was received again as a later time, t2. We also record the time it was received at Clock 2 as t1’ on Clock 2. Now symmetry of the situation requires that, in the inertial frame of Clock 1, we must assume that the light signal reached Clock 2 at a moment halfway between t0 and t2, i.e. at the time: t1 = ½(t0 + t2). This is because, by symmetry, the light signal must take equal time traveling in either direction between the clocks, given that they are kept at a constant distance throughout the process, and they do not accelerate. (If the light signal took longer to travel one way than the other, then light would have to move at different speeds in different directions, which contradicts STR.)
Hence, we must resynchronize Clock 2 to make: t1’ = t1. We simply set the hands on Clock 2 forwards by: (t1 – t1’), i.e. by: ½(t0 + t2) – t1’. (Hence, the coordinate time on Clock 2 at t1’ is changed to: t1’ + (½(t0 + t2) – t1’) = ½(t0 + t2) = t1.)
This is sometimes called the ‘clock synchronization convention’, and some philosophers have argued about whether it is justified. But there is no real dispute that this successfully defines the only system for assigning simultaneity in time, in the chosen reference frame, which is consistent with STR.
Some deeper issues arise over the notion of simultaneity that it seems to involve. From the point of view of Clock 1, the moment recorded at: t1 = ½(t0 + t2) must be judged as ‘simultaneous’ with the moment recorded at t1’ on Clock 2. But in a different inertial frame, the natural coordinate system will alter the apparent simultaneity of these two events, so that simultaneity itself is not ‘objective’ in STR, except relative to a choice of inertial frame. We will consider this later.
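The synchronization procedure can be sketched as a small calculation. The function below is an illustration only: its name and the sample readings are hypothetical, and it takes the reception time on Clock 1's scale to be the midpoint of the send and return times, t0 + ½(t2 – t0), which reduces to ½(t2 – t0) when Clock 1 reads t0 = 0 at emission.

```python
def einstein_sync_offset(t0, t2, t1_prime):
    """Correction to add to Clock 2's reading so that it agrees with Clock 1.

    t0       -- time on Clock 1 when the light signal is sent
    t2       -- time on Clock 1 when the reflected signal returns
    t1_prime -- time shown on (unsynchronized) Clock 2 at reflection
    """
    t1 = t0 + 0.5 * (t2 - t0)   # midpoint of the round trip, by symmetry
    return t1 - t1_prime        # amount to move Clock 2's hands forwards

# Sample readings (hypothetical): the round trip takes 4 units on Clock 1.
offset = einstein_sync_offset(t0=10.0, t2=14.0, t1_prime=11.5)
print(offset)
```

A positive result means Clock 2 must be set forwards; a negative result means it must be set back.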
STR Space. In STR, we can measure space in a very similar way as in classical physics. We imagine constructing a set of rigid measuring rods or rulers, which are checked to be the same length in the inertial frame of Clock 1, and we extend this out into a grid across space. We have to move the rulers around to start with, but when we have set up the grid, we keep them all stationary in the chosen inertial frame of Clock 1.
We then use this grid of stationary measuring rods to measure the distances between various events. The main assumption is that identical types of measuring rods (which are the same lengths when we originally compare them at rest with Clock 1), maintain the same lengths after being moved to different places (and being made stationary again with regard to Clock 1). This feature is required by STR.
The main complication, once again, is that we cannot determine any absolutely stationary frame for the grid of rulers. We can set up an alternative system of rulers, which are all in relative motion in a different inertial frame. As in classical physics, this results in assigning different ‘absolute velocities’ to most trajectories in the two different frames. But in this case there is a deeper difference: on the assumptions of STR, the lengths of measuring rods alter according to their velocities. This is called space dilation (it is more commonly known as length contraction), and it is the counterpart of time dilation.
Nonetheless, Einstein showed that perfectly sensible operational definitions of coordinate measurements for length, as well as time, are available in STR. But both simultaneity and length become relative to specified inertial frames.
It is this confusing conceptual problem, which involves the theory dependence of measurement, that Einstein first managed to unravel, as the prelude to showing how to radically reconstruct classical physics.
8. Operationalism
Unraveling this problem requires us to specify ‘operational principles’ of measurement, but this does not require us to embrace an operational theory of meaning. The latter is a form of positivism, and it holds that the meaning of ‘time’ or ‘space’ in physics is determined entirely by specifying the procedures for measuring time or space. This theory is generally rejected by philosophers and logicians, and it was rejected by Einstein himself in his mature work. According to operationalism, STR changes the meanings of the concepts of space and time from the classical conception. However, many philosophers would argue that ‘time’ and ‘space’ have a meaning for us which is essentially the same as for Galileo and Newton, because we identify the same kinds of things as time and space; but relativity theory has altered our scientific beliefs about these things – just as the discovery that water is H2O has altered our understanding of the nature of water, without necessarily altering the meaning of the term ‘water’. This semantic dispute is ongoing in the philosophy of science. Having clarified these basic ideas of coordinate systems and inertial frames, we now turn back to the notion of transformations between coordinate systems for different inertial frames.
9. Coordinate Transformations and Object Transformations
Physics uses two different concepts of transformations. It is important to distinguish these carefully.
Coordinate transformations: Taking the description of a given process (such as a trajectory), described in one coordinate system, and transforming to its description in an alternative coordinate system.
Object transformations: Taking a given process, described in a given coordinate system, and transforming it into a different process, described in the same coordinate system as the original process.
The difference is illustrated in the following diagram for the simplest kind of transformation, translation of space.
Figure 3. Object, Coordinate, and Combined Transformations.
The transformations in Figure 3 are simple space translations.
Figure 3 (B) shows an object transformation. The original trajectory (A) is moved in space to the right, by 4 units. The new coordinates are related to the original coordinates by: x(new particle) = x(original particle) + 4.
Figure 3 (C) shows a coordinate transformation: the coordinate system is moved to the left by 4 units. The new coordinate system, x’, is related to the original system, x, by: x’(original particle) = x(original particle) + 4. The result ‘looks’ the same as (B).
Figure 3 (D) shows a combination of the object transformation (B) and a coordinate transformation, which is the inverse of that in (C), defined by: x’’(original particle) = x(original particle) – 4. The result of this looks the same as the original trajectory in (A), because the coordinate transformation appears to ‘undo’ the effect of the object transformation.
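The four cases of Figure 3 can be mimicked with a short sketch. The representation of a trajectory as a list of (t, x) events, the function names, and the sample trajectory are illustrative assumptions, not taken from the figure itself.

```python
def object_translate(trajectory, shift):
    """Object transformation: move the process itself by `shift` in x."""
    return [(t, x + shift) for (t, x) in trajectory]

def coordinate_translate(trajectory, origin_shift):
    """Coordinate transformation: describe the same process in a system whose
    origin has been moved by `origin_shift` (negative means to the left)."""
    return [(t, x - origin_shift) for (t, x) in trajectory]

# (A) a particle at rest at x = 1 (hypothetical sample trajectory).
original = [(0, 1), (1, 1), (2, 1)]

case_b = object_translate(original, 4)        # (B) object moved right by 4
case_c = coordinate_translate(original, -4)   # (C) coordinates moved left by 4
case_d = coordinate_translate(case_b, 4)      # (D) object move, then inverse
                                              #     coordinate move
print(case_b, case_c, case_d)
```

Cases (B) and (C) produce the same numerical description although only (B) changes the process, and (D) reproduces the description in (A).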
10. Valid Transformations
There is an intimate connection between these two kinds of transformations. This connection provides the major conceptual apparatus of modern physics, through the concept of physical symmetries, or invariance principles, and valid transformations.
The deepest features of laws or theories of physics are reflected in their symmetry properties, which are also called invariances under symmetry transformations. Laws or theories can be understood as describing classes of physical processes. Physical processes that conform to a theory are valid physical processes of that theory. Of course, not all (logically) possible processes that we can imagine are valid physical processes of a given theory. Otherwise the theory would encompass all possible processes, and tell us nothing about what is physically possible, as opposed to what is logically conceivable.
Symmetries of a theory are described by transformations that preserve valid processes of the theory. For instance, time translation is a symmetry of almost all theories. This means that if we take a valid process, and transform it, intact, to an earlier or later time, we still have a valid process. This is equivalent to simply setting the ‘temporal origin’ of the process to a later or earlier time.
Other common symmetries are:
Rotations in space (if we take a valid process, and rotate it to another direction in space, we end up with another valid process).
Translations in space (if we take a valid process, and move it to another position in space, we end up with another valid process).
Velocity transformations (if we take a valid process, and give it a uniform velocity boost in some direction in space, we end up with another valid process).
These symmetries are valid both in classical physics and in STR. In classical physics, they are called Galilean symmetries or transformations. In STR they are called Lorentz transformations. However, although the symmetries are very similar in both theories, the Lorentz transformations in STR involve features that are not evident in the classical theory. In fact, this difference only emerges for velocity boosts. Translations and rotations are identical in both theories. This is essentially because velocity boosts in STR involve transformations of the connection between proper time and ordinary space and time, which does not appear in classical theory.
The concept of valid coordinate transformations follows directly from that of valid object transformations. The point is that when we make an object transformation, we begin with a description of a process in a coordinate system, and end up with another description, of a different process, given in the same coordinate system. Now instead of transforming the processes involved, we can do the inverse, and make a transformation of the coordinate system, so that we end up with a new coordinate description of the original process, which looks exactly the same as the description of the transformed process in the original coordinate system.
This gives an alternative way of regarding the process, and its transformed image: instead of taking them as two different processes, we can take them as two different coordinate descriptions of the same process.
This is connected to the idea that certain aspects of the coordinate system are arbitrary or conventional. For instance, the choice of a particular origin for time or space is regarded as conventional: we can move the origins in our coordinate description, and we still have a valid system. This is only possible because the corresponding object transformations (time and space translations) are valid physical transformations.
Physicists tend to regard coordinate transformations and valid object transformations interchangeably and somewhat ambiguously, and the distinction between the two is often blurred in applied physics. While this doesn’t cause practical problems, it is important when learning the concepts of the theory to distinguish the two kinds of transformations clearly.
11. Velocity Boosts in STR and Classical Mechanics
STR and classical mechanics have exactly the same symmetries under translations of time and space, and rotations of space. They also both have symmetries under velocity boosts: both theories hold that, if we take a valid physical process, and give it a uniform additional velocity in some direction, we end with another valid physical process. But the transformation of space and time coordinates, and of proper time, are different for the two theories under a velocity boost. In classical physics, it is called a Galilean transformation, while for STR it is called a Lorentz transformation.
To see how the difference appears, we can take a stationary trajectory, and consider what happens when we apply a velocity boost in either theory.
Figure 4. Classical and STR Velocity Boosts give different results.
In both diagrams, the green line is the original trajectory of a stationary particle, and it looks exactly the same in STR and classical mechanics. Proper time events (marked in blue) are equally spaced with the coordinate time intervals in both cases.
If we transform the classical trajectory by giving the particle a velocity (in this example, v = c/2) towards the right, the result (red line) is very simple: the proper time events remain equally spaced with coordinate time intervals. The same sequence of proper time events takes the same amount of coordinate time to complete. The classical particle moves a distance: Δx = v·Δt to the right, where Δt is the coordinate time duration of the original process.
But when we transform the STR particle, a strange thing happens: the proper time events become more widely spaced than the coordinate time intervals, and the same sequence of proper time events takes more coordinate time to complete. The STR particle moves a distance: Δx’ = v·Δt’ to the right, where: Δt’ > Δt, and hence: Δx’ > Δx.
The transformations of the coordinates of the (proper time) points of the original processes are shown in the following table.
Table 1. Example of Velocity Transformation.
We can work out the general formula for the STR transformations of t’ and x’ in this example by using Equation (1). This requires finding a formula for the transformation of time-space coordinates:
(t, 0) → (t’, x’)
We obtain this by applying Equation (1) in the (t’,x’) coordinate system, giving:
(1’) Δτ² = Δt’² – Δx’²/c²
It is crucial that this equation retains the same form under the Lorentz transformation. In this special case, we have the additional facts that:
(i) Δτ = Δt, and: (ii) Δx’ = vΔt’
We substitute (i) and (ii) in (1’) to get:
Δt² = Δt’² – v²Δt’²/c² = Δt’²(1 – v²/c²)
This rearranges to give:
Δt’ = Δt/√(1 – v²/c²)
and:
Δx’ = vΔt/√(1 – v²/c²)
We can see that: Δx’/Δt’ = v. This is a special case of a Lorentz transformation for this simplest kind of trajectory. Note that if we think of this as a coordinate transformation which generates the appearance of this object transformation, we need to move the new coordinate system in the opposite direction to the motion of the object. I.e. if we define a new coordinate system, (x’,t’), moving at –v (i.e. to the left) with regard to the original (x,t) system, then the original trajectory (which appeared stationary in (x,t)) will appear to be moving with velocity +v (to the right) in (x’,t’). In general, object transformations correspond to the inverse coordinate transformations.
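The difference between the two boosts of a stationary trajectory can be made concrete with numbers. The sketch below assumes the standard STR result that a process lasting proper time Δτ takes coordinate time Δt’ = Δτ/√(1 – v²/c²) after the boost, and uses units in which c = 1; the function names are hypothetical, and the printed values are not a reproduction of Table 1.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def classical_boost_stationary(dt, v):
    """Galilean boost of a stationary process of duration dt: (dt', dx')."""
    return dt, v * dt

def str_boost_stationary(dtau, v):
    """Lorentz boost of a stationary process of proper duration dtau."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    dt_new = gamma * dtau          # coordinate time is stretched by gamma
    return dt_new, v * dt_new      # the particle moves at v for that time

v = 0.5 * C  # the example velocity used in the text
for tau in range(4):
    print(tau, classical_boost_stationary(tau, v), str_boost_stationary(tau, v))

dt_str, dx_str = str_boost_stationary(1.0, v)
```

For v = c/2 the stretching factor is 1/√0.75, roughly 1.155, so each unit of proper time takes about 15% more coordinate time than in the classical case, while the ratio of distance to coordinate time remains v.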
12. Lorentz Transformations for Velocity Boost V in the x-direction
The previous transformation is only for points on the special line where: x = 0. More generally, we want to work out the formulae for transforming points anywhere in the coordinate system:
(t, x) → (t’, x’)
The classical formulas are Galilean transformations, and they are very simple.
Galilean Velocity Boost:
(t, x) → (t, x+vt)
t’ = t
x’ = x+vt
The STR formulas are more general Lorentz transformations. The Galilean transformation is simple because time coordinates are unchanged, so that: t = t’. This means that simultaneity in time in classical physics is absolute: it does not depend upon the choice of coordinate system. We also have that distance between two points at a given moment of time is invariant, because if: x2 – x1 = Δx, then: x’2 – x’1 = (x2+vt) – (x1+vt) = Δx. Ordinary distance in space is the crucial invariant quantity in classical physics.
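A small numerical sketch of the Galilean boost shows both invariances at once. The function name and the sample events are illustrative assumptions.

```python
def galilean_boost(event, v):
    """Galilean velocity boost of an event (t, x): t' = t, x' = x + v*t."""
    t, x = event
    return (t, x + v * t)

# Two events at the same moment of time (hypothetical sample values).
e1 = (2.0, 1.0)
e2 = (2.0, 4.0)

b1 = galilean_boost(e1, 0.5)
b2 = galilean_boost(e2, 0.5)

still_simultaneous = (b1[0] == b2[0])   # time coordinates are unchanged
distance_before = e2[1] - e1[1]
distance_after = b2[1] - b1[1]          # the v*t terms cancel
print(still_simultaneous, distance_before, distance_after)
```

Both events are shifted by the same amount v·t, so their separation at that moment is unchanged.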
But in STR, we have a complex interdependence of time and space coordinates. This is seen because the transformation formulas for both t’ and x’ are functions of both x and t. I.e. there are functions f and g such that:
t’ = f(x,t) and: x’ = g(x,t)
These functions represent the Lorentz transformations. To give stationary objects a velocity V in the x-direction, these general functions are found to be:
Lorentz Transformation:
t’ = (t + Vx/c²)/√(1 – V²/c²)
x’ = (x + Vt)/√(1 – V²/c²)
The factor 1/√(1 – V²/c²) is called γ, letting us write these equations more simply as:
t’ = γ(t + Vx/c²)
x’ = γ(x + Vt)
We can equally consider the corresponding coordinate transformation, which would generate the appearance of this object transformation in a new coordinate system. It is essentially the same as the object transformation – except it must go in the opposite direction. For the object transformation, which increases the velocity of stationary particles by the speed V in the x direction, corresponds to moving the coordinate system in the opposite direction. I.e. if we define a new coordinate system, and call it (x’,t’), and place this in motion with a speed –V (i.e. V in the negative-x-direction), relative to the (x,t) coordinate system, then the original stationary trajectories in (x,t)-coordinates will appear to have speed V in the new (x’,t’) coordinates.
Because the Lorentz transformation of processes leaves us with valid STR processes, the Lorentz transformation of an STR coordinate system leaves us with a valid coordinate system. In particular, the form of Equation (1) is preserved by the Lorentz transformation, so that we get: Δτ² = Δt’² – Δx’²/c². This can be checked by substituting the formulas for t’ and x’ back into this equation, and simplifying; the resulting equation turns out to be identical to Equation (1).
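The substitution check described here can be written out explicitly. The following is a sketch under two assumptions: that the boost takes the standard form t’ = γ(t + Vx/c²), x’ = γ(x + Vt), with γ = 1/√(1 – V²/c²), and that Equation (1) is the proper-time relation Δτ² = Δt² – Δx²/c².

```latex
\begin{aligned}
\Delta t'^{2} - \frac{\Delta x'^{2}}{c^{2}}
  &= \gamma^{2}\left(\Delta t + \frac{V\,\Delta x}{c^{2}}\right)^{2}
     - \frac{\gamma^{2}}{c^{2}}\left(\Delta x + V\,\Delta t\right)^{2} \\
  &= \gamma^{2}\left[\Delta t^{2} + \frac{2V\,\Delta t\,\Delta x}{c^{2}}
     + \frac{V^{2}\,\Delta x^{2}}{c^{4}}
     - \frac{\Delta x^{2}}{c^{2}} - \frac{2V\,\Delta t\,\Delta x}{c^{2}}
     - \frac{V^{2}\,\Delta t^{2}}{c^{2}}\right] \\
  &= \gamma^{2}\left(1 - \frac{V^{2}}{c^{2}}\right)
     \left(\Delta t^{2} - \frac{\Delta x^{2}}{c^{2}}\right) \\
  &= \Delta t^{2} - \frac{\Delta x^{2}}{c^{2}} \;=\; \Delta\tau^{2}.
\end{aligned}
```

The cross terms in Δt·Δx cancel, and the remaining factor γ²(1 – V²/c²) equals 1 by the definition of γ, which is why the form of the equation is unchanged.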
13. Galilean Transformation of Coordinate System
One useful way to visualize the effect of a transformation is to make an ordinary space-time diagram, with the space and time axes drawn perpendicular to each other as usual, and then to draw the new set of coordinates on this diagram. In these diagrams, the space axes represent points which are measured to have the same time coordinates, and similarly, the time axes represent points which are measured to have the same space coordinates. When we make a velocity boost, these lines of simultaneity and same-position are altered.
This is shown first for a Galilean velocity boost, where in fact the lines of simultaneity remain the same, but the lines representing position are rotated:
Figure 5. Galilean Velocity Boost.
In Figure 5, the (green) horizontal lines are lines of absolute simultaneity. They have the same coordinates in both t and t’.
The (blue) vertical lines are lines with the same x-coordinates.
The (gray) slanted lines are lines with the same x’-coordinates.
The spacing of the x’ coordinates is the same as the x coordinates, which means that relative distances between points are not affected.
The solid black arrow represents a stationary trajectory in (x,t).
An object transformation of +V moves it onto the green arrow, with velocity: v = c/2 in the (x,t)-system.
A coordinate transformation of +V, to a system (x’,t’) moving at +V with regard to (x,t), makes this green arrow appear stationary in the (x’,t’) system.
This coordinate transformation makes the black arrow appear to be moving at –V in (x’,t’) coordinates.
14. Lorentz Transformation of Coordinate System
In a Lorentz velocity boost, the time and space axes are both rotated, and the spacing is also changed.
Figure 6. Rotation of Space and Time Coordinate Axes by a Lorentz Velocity Boost. Some proper time events are marked in blue.
To obtain the (x’,t’)-coordinates of a point defined in (x,t)-coordinates, we start at that point, and: (i) move parallel to the green lines, to find the intersection with the (red) t’-axis, which is marked with the t’-coordinates; and: (ii) move parallel to the red lines, to find the intersection with the (green) x’-axis, which is marked with the x’-coordinates. The effect of this transformation on a solid rod or ruler extending from x=0 to x=1, and stationary in (x,t), is shown in more detail below.
Figure 7. Lorentz Velocity Boost. Magnified view of Figure 6 shows time and space dilation. The gray rectangle represents a unit of the space-time path of a rod (Rod 1) stationary in (x,t). The dark green lines represent a Lorentz (object) transformation of this trajectory, which is a second rod (Rod 2) moving at V in (x,t) coordinates. This is a unit of the space-time path of a stationary rod in (x’,t’).
15. Time and Space Dilation
Figure 7 shows how both time and space dilation effects work. To see this clearly, we need to consider the volumes of space-time that an object like a rod traces out.
The (gray) rectangle PQRS represents a space-time volume, for a stationary rod or ruler in the original frame. It is 1-meter long in original coordinates (Δx = 1), and is shown over 1 unit of proper time, which corresponds to one unit of coordinate time (Δt = 1).
The rectangle PQ’R’S’ (green edges) represents a second space-time volume, for a rod which appears to be moving in the original frame. This is how the space-time volume of the first rod transforms under a Lorentz transformation.
We may interpret the transformation as either: (i) a Lorentz velocity boost of the rod by velocity +V (object transformation), or equally: (ii) a Lorentz transformation to a new coordinate system, (x’,t’), moving at –V with regard to (x,t). Note that:
The length of the moving rod measured in x is now shorter than that of the stationary rod: Δx = 1/γ. This is space dilation.
The coordinate time between proper time events on the moving rod measured in t is now longer than for the stationary rod (Δt = γ). This is time dilation.
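As a rough numerical illustration of the two effects just listed, the following Python sketch computes γ and the resulting coordinate length and coordinate time for a unit rod. It assumes the standard definition γ = 1/√(1 − V²/c²), with the velocity given as the fraction β = V/c. The function names are illustrative only and are not part of the original presentation.

```python
import math

def lorentz_factor(beta: float) -> float:
    """Return gamma = 1 / sqrt(1 - beta^2), where beta = V/c (assumed |beta| < 1)."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def moving_rod_length(rest_length: float, beta: float) -> float:
    """Coordinate length of a rod moving at beta: rest_length / gamma."""
    return rest_length / lorentz_factor(beta)

def coordinate_time(proper_time: float, beta: float) -> float:
    """Coordinate time elapsed while the moving rod ages by proper_time: gamma * proper_time."""
    return proper_time * lorentz_factor(beta)

beta = 0.5  # the velocity c/2 used for the green arrow in the text
print("gamma        =", lorentz_factor(beta))
print("rod length   =", moving_rod_length(1.0, beta))
print("elapsed time =", coordinate_time(1.0, beta))
```

The rod length comes out smaller than 1 and the elapsed coordinate time larger than 1, by the same factor γ.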
The need to fix the new coordinate system in this way can be worked out by considering the moving rod from the point of view of its own inertial system.
As viewed in its own inertial coordinate system, the green rectangle PQ’R’S’ appears as the space-time boundary for a stationary rod. In this frame:
PS’ appears stationary: it is a line where: x’ = 0.
PQ’ appears as a line of simultaneity, i.e. it is a line where: t’=0.
R’S’ is also a line of simultaneity in t’.
Points on R’S’ must have the time coordinate: t’=1, since it is at the time t’ when one unit of proper time has elapsed, and for a stationary object the coordinate-time interval equals the proper-time interval (Δt’ = Δτ).
The length of PQ’ must be one unit in x’, since the moving rod appears the same length in its own inertial frame as the original stationary rod did.
Time and space dilation are often referred to as ‘perspective effects’ in discussions of STR. Objects and processes are said to ‘look’ shorter or longer when viewed in one inertial frame rather than in another. It is common to regard this effect as a purely ‘conventional’ feature, which merely reflects a conventional choice of reference frame. But this is rather misleading, because time and space dilation are very real physical effects, and they lead to completely different types of physical predictions than classical physics.
However, the symmetrical properties of the Lorentz transformation make it impossible to use these features to tell whether one frame is ‘really moving’ and another is ‘really stationary’. For instance, if objects get shorter when they are placed in motion, then why do we not simply measure how long objects are, and use this to determine whether they are ‘really stationary’? The details in Figure 7 reveal why this does not work: the space dilation effect is reversed when we change reference frames. That is:
Measured in Frame 1, i.e. in (x,t)-coordinates, the stationary object (Rod 1) appears longer than the moving object (Rod 2). But:
Measured in Frame 2, using (x’,t’)-coordinates, the moving object (Rod 2) appears stationary, while the originally stationary object (Rod 1) moves. But now the space dilation effect appears reversed, and Rod 2 appears longer than Rod 1!
The reason this is not a real paradox or inconsistency can be seen from the point of view of Frame 2, because now Rod 1 at the moment of time t’ = 0 stretches from the point P to Q’’, rather than from P to Q, as in Frame 1. The line of simultaneity alters in the new frame, so that we measure the distance between a different pair of space-time events. And PQ’’ is now found to be shorter than PQ’, which is the length of Rod 2 in Frame 2.
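The reversal can be checked numerically. The sketch below assumes the standard Lorentz transformation in units where c = 1, namely x’ = γ(x − Vt) and t’ = γ(t − Vx). The helper names are hypothetical. It measures a unit rod at rest in one frame along a line of simultaneity (t’ = 0) of a frame moving at velocity v, which is the construction that replaces PQ by PQ’’ in the text.

```python
import math

def boost(x: float, t: float, v: float):
    """Map the event (x, t) to (x', t') in a frame moving at velocity v (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (x - v * t), g * (t - v * x)

def rod_length_in_moving_frame(rest_length: float, v: float) -> float:
    """Length of a rod at rest in the unprimed frame, measured at t' = 0 in the moving frame."""
    # Near end: worldline x = 0, which meets the line t' = 0 at the event (0, 0).
    x_near, _ = boost(0.0, 0.0, v)
    # Far end: worldline x = rest_length, which meets t' = 0 where t = v * x.
    x_far, t_far = boost(rest_length, v * rest_length, v)
    assert abs(t_far) < 1e-12  # the two events are simultaneous in the moving frame
    return x_far - x_near

V = 0.6
rod1_seen_from_frame2 = rod_length_in_moving_frame(1.0, V)    # Frame 2 moves at +V relative to Frame 1
rod2_seen_from_frame1 = rod_length_in_moving_frame(1.0, -V)   # Frame 1 moves at -V relative to Frame 2
print(rod1_seen_from_frame2, rod2_seen_from_frame1)
```

Both values come out close to 0.8, which is 1/γ for V = 0.6: each frame assigns the other frame's rod the same shortened length, because each uses its own line of simultaneity to pick out the pair of events whose separation is measured.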
There is no answer, within STR, as to which rod ‘really gets shorter’. Similarly there is no answer as to which rod ‘really has faster proper time’ – when we switch to Frame 2, we find that Rod 2 has a faster rate of proper time with regard to coordinate time, reversing the time dilation effect apparent in Frame 1. In this sense, we could consider these effects a matter of ‘perspective’ – although it is more accurate to say that in STR, in its usual interpretation, there are simply no facts about absolute length, or absolute time, or absolute simultaneity, at all.
However, this does not mean that time and space dilation are not real effects. They are displayed in other situations where there is no ambiguity. One example is the twins’ paradox, where proper time slows down in an absolute way for a moving twin. And there are equally real physical effects resulting from space dilation. It is just that these effects cannot be used to determine an absolute frame of rest.
16. The Full Special Theory of Relativity
So far, we have only examined the most basic part of STR: the valid STR transformations for space, time, and proper time, and the way these three quantities are connected together. This is the most fundamental part of the theory. It represents relativistic kinematics. It already has very powerful implications. But the fully developed theory is far more extensive: it results from Einstein’s idea that the Lorentz transformations represent a universal invariance, applicable to all physics. Einstein formulated this in 1905: “The laws of physics are invariant under Lorentz transformations (when going from one inertial system to another arbitrarily chosen inertial system)”. Adopting this general principle, he explored the ramifications for the concepts of mass, energy, momentum, and force.
The most famous result is Einstein’s equation for energy: E = mc². This involves the extension of the Lorentz transformation to mass. Einstein found that when we Lorentz transform a stationary particle with original rest-mass m0, to set it in motion with a velocity V, we cannot regard it as maintaining the same total mass. Instead, its mass becomes larger: m = γm0, with γ defined as above. This is another deep contradiction with classical physics.
Einstein showed that this requires us to reformulate our concept of energy. In classical physics, kinetic energy is given by: E = ½ mv². In STR, there is a more general definition of energy, as: E = mc². A stationary particle then has a basic ‘rest mass energy’ of m0c². When it is set in motion, its energy is increased purely by the increase in mass, and this is kinetic energy. So we find in STR that:
Kinetic Energy = mc²-m0c² = (γ-1)m0c²
For low velocities, with: v << c, it is easily shown that: (γ-1)c² is very close to ½v², so this corresponds to the classical result in the classical limit of low energies. But for high energies, the behavior of particles is very different. The discovery that there is an underlying energy of m0c² simply from rest-mass is what made nuclear reactors and nuclear bombs possible: they convert tiny amounts of rest mass into vast amounts of thermal energy.
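The low-velocity claim follows from the series expansion γ − 1 = ½(v/c)² + ⅜(v/c)⁴ + …, whose first term is the classical expression. A minimal numerical sketch, assuming energies are expressed in units of the rest energy m0c² and velocities as β = v/c (the function names are illustrative):

```python
import math

def kinetic_relativistic(beta: float) -> float:
    """Relativistic kinetic energy in units of m0*c^2: gamma - 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta) - 1.0

def kinetic_classical(beta: float) -> float:
    """Classical kinetic energy (1/2) m0 v^2 in units of m0*c^2: beta^2 / 2."""
    return 0.5 * beta * beta

for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
    ratio = kinetic_relativistic(beta) / kinetic_classical(beta)
    print(f"v = {beta:5.3f} c   relativistic / classical = {ratio:.6f}")
```

The ratio stays very close to 1 at small β and grows without bound as β approaches 1, which is the divergence from classical behavior described above.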
The main application Einstein explored first was the theory of electromagnetism, and his most famous paper, in which he defined STR in 1905, is called “On the Electrodynamics of Moving Bodies”. In fact, Lorentz, Poincaré and others already knew that they needed to apply the Lorentz transformation to Maxwell’s theory of classical electromagnetism, and had succeeded a few years earlier in formulating a theory which is extremely similar to Einstein’s in its predictions. Some important experimental verification of this was also available before Einstein’s work (most famously, the Michelson-Morley experiment). But his theory went much further. He radically reformulated the concepts that we use to analyse force, energy, momentum, and so forth. In this sense, his new theory was primarily a philosophical and conceptual achievement, rather than a new experimental discovery of the kind traditionally regarded as the epitome of empirical science.
He also attributed his universal ‘principle of relativity’ to the very nature of space and time itself. With important contributions by Minkowski, this gave rise to the modern view that physics is based on an inseparable combination of space and time, called space-time. Minkowski treated this as a kind of ‘geometric’ entity, based on regarding our Equation (1) as a ‘metric equation’ describing the geometric nature of space-time. This view is called the ‘geometric explanation’ of relativity theory, and this approach led Einstein even deeper into modern physics, when he applied this new conception to the theory of gravity, and discovered a generalised theory of space-time.
The nature of this ‘geometric explanation’ of the connection between space, time, and proper time is one of the most fascinating topics in the philosophy of physics. But it involves the General Theory of Relativity, which goes beyond STR.
17. References and Further Reading
The literature on relativity and its philosophical implications is enormous – and still growing rapidly. The following short selection illustrates some of the range of material available. Original publication dates are in brackets.
Bondi, Hermann. 1962. Relativity and Common Sense. Heinemann Educational Books.
A clear exposition of basic relativity theory for beginners, with a minimum of equations. Contains useful discussions of the Twins Paradox and other topics.
Einstein, Albert. 1956 (1921). The Meaning of Relativity. (The Stafford Little Lectures of Princeton University.) Princeton University Press.
Einstein’s account of the principles of his famous theory. Simple in parts, but mainly a fairly technical summary, requiring a good knowledge of physics.
Epstein, Lewis Carroll. 1983. Relativity Visualized. Insight Press. San Francisco.
A clear, simple, and rather unique introduction to relativity theory for beginners. Epstein illustrates the functional relationships between space, time, and proper time in a clear and direct way, using novel geometric presentations.
Grunbaum, Adolf. 1963. Philosophical Problems of Space and Time. Knopf, New York.
A collection of original studies by one of the seminal philosophers of relativity theory, this covers an impressive range of issues, and remains an important starting place for many recent philosophical studies.
Lorentz, H. A., A. Einstein, H. Minkowski and H. Weyl. 1923. The Principle of Relativity. A Collection of Original Memoirs on the Special and General Theory of Relativity. Trans. W. Perrett and G.B. Jeffery. Methuen. London.
These are the major figures in the early development of relativity theory, apart from Poincaré, who simultaneously with Lorentz formulated the ‘pre-relativistic’ version of electromagnetic theory, which contains most of the mathematical basis of STR, shortly before Einstein’s paper of 1905. While Einstein deeply admired Lorentz – despite their permanent disagreements about STR – he paid no attention to Poincaré.
Newton, Isaac. 1686. Mathematical Principles of Natural Philosophy.
Every serious student should read Newton’s “Definitions” and “Scholium”, where he introduces his concepts of time and space.
Planck, Max. 1998 (1909). Eight Lectures on Theoretical Physics.
Planck elegantly summarizes the revolutionary discoveries that characterized the first decade of 20th Century physics. Lecture 8 is one of the earliest accounts of relativity theory. This classic work shows Planck’s penetrating vision of many fundamental themes that soon came to dominate physics.
Reichenbach, Hans. 1958 (1928). The Philosophy of Space and Time. Dover, New York.
An influential early study of the concepts of space and time, and the relativistic revolution. Although Reichenbach’s approach is underpinned by his positivistic program, which is rejected today by philosophers, the central issues are of continuing interest.
Russell, Bertrand. 1977 (1925). ABC of Relativity. Unwin Paperbacks, London.
An early popular exposition of the meaning of relativity theory by one of the most influential 20th century philosophers, this presents key philosophical issues with Russell’s characteristic simplicity.
Schilpp, P.A. (Ed.) 1949. Albert Einstein: Philosopher-Scientist. The Library of Living Philosophers.
A classic collection of papers on Einstein and relativity theory.
Spivak, M. 1979. A Comprehensive Introduction to Differential Geometry. Publish or Perish. Berkeley.
An advanced mathematical introduction to the modern approach to differentiable manifolds, which developed in the 1960’s. Philosophical interest lies in the detailed semantics for coordinate systems, and the generalizations of concepts of geometry, such as the tangent vector.
Tipler, Paul A. 1982. Physics. Worth Publishers Ltd.
An extended introductory textbook for undergraduates. Chapter 35, “Relativity Theory”, is a typical modern introduction to relativity theory.
Torretti, Roberto. 1983/1996. Relativity and Geometry. Dover, New York.
An excellent source for the specialist philosopher, summarizing history and concepts of both the Special and General Theories, with extended bibliography. Combines excellent technical summaries with detailed historical surveys.
Wangsness, Roald K. 1979. Electromagnetic Fields. John Wiley & Sons Ltd.
This is a typical advanced modern undergraduate textbook on electromagnetism. The final chapter explains how the structure of electrodynamics is derived from the principles of STR.
Of the writings of the Presocratics, only quotations embedded in the works of later authors have survived. These quotations, along with reports about the Presocratics and imitations of their works, were compiled by Hermann Diels (1848-1922) into a collection, Die Fragmente der Vorsokratiker, first published in 1903 and later revised by Walther Kranz and subsequent editors. This complete edition of the works of Presocratic authors has become standard in the field of ancient philosophy, and the works of the Presocratics are therefore normally referred to by DK (Diels-Kranz) numbers. In Diels-Kranz, each author is assigned a number, and within that author’s number, entries are divided into three groups labeled alphabetically:
a. testimonia: ancient accounts of the author’s life and doctrines
b. ipsissima verba (literally, exact words, sometimes also termed “fragments”): the exact words of the author
c. imitations: works which take the author as a model
Within each of these three groups, individual fragments or testimonia are assigned sequential numbers. So, for example, since Protagoras is the eightieth author in Diels-Kranz, the third testimony concerning him, a generally unreliable short biography by Hesychius, would be referred to as DK80a3.
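The numbering scheme can be made concrete with a small parser. This is only a sketch of the convention described above (author number, group letter, item number); the names `DKRef`, `GROUPS` and `parse_dk` are hypothetical, and the tolerance for optional spaces and upper-case group letters is an assumption about how such references are sometimes typed rather than a rule of the edition.

```python
import re
from typing import NamedTuple

# Group letters as described in the text: a = testimonia, b = exact words, c = imitations.
GROUPS = {
    "a": "testimonia",
    "b": "ipsissima verba (fragments)",
    "c": "imitations",
}

# "DK", then the author number, the group letter, and the item number.
_DK_PATTERN = re.compile(r"^DK\s*(\d+)\s*([abc])\s*(\d+)$", re.IGNORECASE)

class DKRef(NamedTuple):
    author: int  # position of the author in Diels-Kranz
    group: str   # "a", "b" or "c"
    item: int    # sequential number within the group

def parse_dk(reference: str) -> DKRef:
    """Split a reference such as 'DK80a3' into its three components."""
    match = _DK_PATTERN.match(reference.strip())
    if match is None:
        raise ValueError(f"not a recognizable DK reference: {reference!r}")
    return DKRef(int(match.group(1)), match.group(2).lower(), int(match.group(3)))

ref = parse_dk("DK80a3")
print(ref.author, GROUPS[ref.group], ref.item)
```

For the example in the text, the parser yields author 80 (Protagoras), the testimonia group, and item 3.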
Diels, Hermann and Walther Kranz. Die Fragmente der Vorsokratiker. Zurich: Weidmann, 1985.
Freeman, Kathleen. Ancilla to the Pre-Socratic Philosophers. Cambridge: Harvard Univ Pr., 1983 (reprint edition).
This book is a complete English translation of the ‘b’ passages–the so-called ‘fragments’–from Die Fragmente der Vorsokratiker.
Rudolf Carnap, a German-born philosopher and naturalized U.S. citizen, was a leading exponent of logical positivism and was one of the major philosophers of the twentieth century. He made significant contributions to philosophy of science, philosophy of language, the theory of probability, inductive logic and modal logic. He rejected metaphysics as meaningless because metaphysical statements cannot be proved or disproved by experience. He asserted that many philosophical problems are indeed pseudo-problems, the outcome of a misuse of language. Some of them can be resolved when we recognize that they are not expressing matters of fact, but rather concern the choice between different linguistic frameworks. Thus the logical analysis of language becomes the principal instrument in resolving philosophical problems. Since ordinary language is ambiguous, Carnap asserted the necessity of studying philosophical issues in artificial languages, which are governed by the rules of logic and mathematics. In such languages, he dealt with the problems of the meaning of a statement, the different interpretations of probability, the nature of explanation, and the distinctions between analytic and synthetic, a priori and a posteriori, and necessary and contingent statements.
Rudolf Carnap was born on May 18, 1891, in Ronsdorf, Germany. In 1898, after his father’s death, his family moved to Barmen, where Carnap studied at the Gymnasium. From 1910 to 1914 he studied philosophy, physics and mathematics at the universities of Jena and Freiburg. He studied Kant under Bruno Bauch and later recalled how a whole year was devoted to the discussion of The Critique of Pure Reason. Carnap became especially interested in Kant’s theory of space. Carnap took three courses from Gottlob Frege in 1910, 1913 and 1914. Frege was professor of mathematics at Jena. During those courses, Frege expounded his system of logic and its applications in mathematics. However, Carnap’s principal interest at that time was in physics, and by 1913 he was planning to write his dissertation on thermionic emission. His studies were interrupted by World War I and Carnap served at the front until 1917. He then moved to Berlin and studied the theory of relativity. At that time, Albert Einstein was professor of physics at the University of Berlin.
After the war, Carnap developed a new dissertation, this time on an axiomatic system for the physical theory of space and time. He submitted a draft to physicist Max Wien, director of the Institute of Physics at the University of Jena, and to Bruno Bauch. Both found the work interesting, but Wien told Carnap the dissertation was pertinent to philosophy, not to physics, while Bauch said it was relevant to physics. Carnap then chose to write a dissertation under the direction of Bauch on the theory of space from a philosophical point of view. Entitled Der Raum (Space), the work was clearly influenced by Kantian philosophy. Submitted in 1921, it was published the following year in a supplemental issue of Kant-Studien.
Carnap’s involvement with the Vienna Circle developed over the next few years. He met Hans Reichenbach at a conference on philosophy held at Erlangen in 1923. Reichenbach introduced him to Moritz Schlick, then professor of the theory of inductive science at Vienna. Carnap visited Schlick—and the Vienna Circle—in 1925 and the following year moved to Vienna to become assistant professor at the University of Vienna. He became a leading member of the Vienna Circle and, in 1929, with Hans Hahn and Otto Neurath, he wrote the manifesto of the Circle.
In 1928, Carnap published The Logical Structure of the World, in which he developed a formal version of empiricism arguing that all scientific terms are definable by means of a phenomenalistic language. The great merit of the book was the rigor with which Carnap developed his theory. In the same year he published Pseudoproblems in Philosophy asserting the meaninglessness of many philosophical problems. He was closely involved in the First Conference on Epistemology, held in Prague in 1929 and organized by the Vienna Circle and the Berlin Circle (the latter founded by Reichenbach in 1928). The following year, he and Reichenbach founded the journal Erkenntnis. At the same time, Carnap met Alfred Tarski, who was developing his semantical theory of truth. Carnap was also interested in mathematical logic and wrote a manual of logic, entitled Abriss der Logistik (1929).
In 1931, Carnap moved to Prague to become professor of natural philosophy at the German University. It was there that he made his important contribution to logic with The Logical Syntax of Language (1934). His stay in Prague, however, was cut short by the Nazi rise to power. In 1935, with the aid of the American philosophers Charles Morris and Willard Van Orman Quine, whom he had met in Prague the previous year, Carnap moved to the United States. He became an American citizen in 1941.
From 1936 to 1952, Carnap was a professor at the University of Chicago (with the year 1940-41 spent as a visiting professor at Harvard University). He then spent two years at the Institute for Advanced Study at Princeton before taking an appointment at the University of California at Los Angeles.
In the 1940s, stimulated by Tarskian model theory, Carnap became interested in semantics. He wrote several books on semantics: Introduction to Semantics (1942), Formalization of Logic (1943), and Meaning and Necessity: A Study in Semantics and Modal Logic (1947). In Meaning and Necessity, Carnap used semantics to explain modalities. Subsequently he began to work on the structure of scientific theories. His main concerns were (i) to give an account of the distinction between analytic and synthetic statements and (ii) to give a suitable formulation of the verifiability principle; that is, to find a criterion of significance appropriate to scientific language. Other important works were “Meaning Postulates” (1952) and “Observation Language and Theoretical Language” (1958). The latter sets out Carnap’s definitive view on the analytic-synthetic distinction. “The Methodological Character of Theoretical Concepts” (1956) is an attempt to give a tentative definition of a criterion of significance for scientific language. Carnap was also interested in formal logic (Introduction to Symbolic Logic, 1954) and in inductive logic (Logical Foundations of Probability, 1950; The Continuum of Inductive Methods, 1952). The Philosophy of Rudolf Carnap, ed. by Paul Arthur Schilpp, was published in 1963 and includes an intellectual autobiography. Philosophical Foundations of Physics, ed. by Martin Gardner, was published in 1966. Carnap was working on the theory of inductive logic when he died on September 14, 1970, at Santa Monica, California.
2. The Structure of Scientific Theories
In Carnap’s opinion, a scientific theory is an interpreted axiomatic formal system. It consists of:
a formal language, including logical and non-logical terms;
a set of logical-mathematical axioms and rules of inference;
a set of non-logical axioms, expressing the empirical portion of the theory;
a set of meaning postulates stating the meaning of non-logical terms, which formalize the analytic truths of the theory;
a set of rules of correspondence, which give an empirical interpretation of the theory.
The sets of meaning postulates and rules of correspondence may be included in the set of non-logical axioms. Indeed, meaning postulates and rules of correspondence are not usually explicitly distinguished from non-logical axioms; only one set of axioms is formulated. One of the main purposes of the philosophy of science is to show the difference between the various kinds of statements.
The Language of Scientific Theories
The language of a scientific theory consists of:
a set of symbols and
rules to ensure that a sequence of symbols is a well-formed formula, that is, correct with respect to syntax.
Among the symbols of the language are logical and non-logical terms. The set of logical terms includes logical symbols, e.g., connectives and quantifiers, and mathematical symbols, e.g., numbers, derivatives, and integrals. Non-logical terms are divided into observational and theoretical. They are symbols denoting physical entities, properties or relations such as ‘blue’, ‘cold’, ‘warmer than’, ‘proton’, ‘electromagnetic field’. Formulas are divided into: (i) logical statements, which do not contain non-logical terms; (ii) observational statements, which contain observational terms but no theoretical terms; (iii) purely theoretical statements, which contain theoretical terms but no observational terms; and (iv) rules of correspondence, which contain both observational and theoretical terms.
Classification of statements in a scientific language

| type of statement | observational terms | theoretical terms |
| --- | --- | --- |
| logical statements | No | No |
| observational statements | Yes | No |
| purely theoretical statements | No | Yes |
| rules of correspondence | Yes | Yes |
Observational language contains only logical and observational statements; theoretical language contains logical and theoretical statements and rules of correspondence.
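The fourfold classification depends only on which kinds of non-logical terms occur in a statement, so it can be sketched as a simple lookup. The vocabulary below reuses the example terms given earlier (‘blue’, ‘cold’, ‘warmer than’, ‘proton’, ‘electromagnetic field’). The function name and the representation of a statement as a list of its non-logical terms are assumptions made purely for illustration.

```python
from typing import Iterable, Set

# Toy vocabulary taken from the examples in the text.
OBSERVATIONAL: Set[str] = {"blue", "cold", "warmer than"}
THEORETICAL: Set[str] = {"proton", "electromagnetic field"}

def classify_statement(non_logical_terms: Iterable[str]) -> str:
    """Classify a statement by the kinds of non-logical terms it contains."""
    terms = set(non_logical_terms)
    unknown = terms - OBSERVATIONAL - THEORETICAL
    if unknown:
        raise ValueError(f"terms not in the toy vocabulary: {sorted(unknown)}")
    has_observational = bool(terms & OBSERVATIONAL)
    has_theoretical = bool(terms & THEORETICAL)
    if has_observational and has_theoretical:
        return "rule of correspondence"
    if has_observational:
        return "observational statement"
    if has_theoretical:
        return "purely theoretical statement"
    return "logical statement"  # no non-logical terms at all

print(classify_statement([]))                  # no non-logical terms
print(classify_statement(["warmer than"]))     # observational terms only
print(classify_statement(["proton"]))          # theoretical terms only
print(classify_statement(["cold", "proton"]))  # both kinds of term
```

As Carnap himself concedes later in this section, the hard part in practice is deciding which vocabulary a given term belongs to, and the sketch simply takes that decision as given.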
The distinction between observational and theoretical terms is a central tenet of logical positivism and at the core of Carnap’s view on scientific theories. In his book Philosophical Foundations of Physics (1966), Carnap bases the distinction between observational and theoretical terms on the distinction between two kinds of scientific laws, namely empirical laws and theoretical laws.
An empirical law deals with objects or properties that can be observed or measured by means of simple procedures. This kind of law can be directly confirmed by empirical observations. It can explain and forecast facts and be thought of as an inductive generalization of such factual observations. Typically, an empirical law that deals with measurable physical quantities can be established by measuring such quantities in suitable cases and then interpolating a simple curve between the measured values. For example, a physicist could measure the volume V, the temperature T and the pressure P of a gas in diverse experiments, and he could find the law PV=RT, for a suitable constant R.
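The gas-law example can be sketched as a curve-fitting exercise. The measurements below are synthetic: they are generated from an assumed constant (8.314, the molar gas constant in SI units, for one mole of gas) with small invented measurement errors, and are not data from any experiment. The least-squares fit through the origin is one simple choice of curve-fitting procedure; the text does not prescribe a particular method.

```python
# Synthetic "measurements", generated only for illustration.
R_ASSUMED = 8.314                           # used solely to manufacture the data
temperatures = [250.0, 300.0, 350.0, 400.0] # T in kelvin
volumes = [0.020, 0.025, 0.030, 0.020]      # V in cubic metres
errors = [0.002, -0.002, 0.001, -0.001]     # invented relative errors in P
pressures = [
    R_ASSUMED * T / V * (1.0 + e)
    for T, V, e in zip(temperatures, volumes, errors)
]

def fit_gas_constant(pressures, volumes, temperatures):
    """Least-squares estimate of R in the one-parameter model P*V = R*T."""
    numerator = sum(T * P * V for P, V, T in zip(pressures, volumes, temperatures))
    denominator = sum(T * T for T in temperatures)
    return numerator / denominator

print("estimated R:", fit_gas_constant(pressures, volumes, temperatures))
```

The estimate lands close to the constant used to generate the data, which is the sense in which an empirical law is an inductive generalization from measured values.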
A theoretical law, on the other hand, is concerned with objects or properties we cannot observe or measure but only infer from direct observations. A theoretical law cannot be justified by means of direct observation. It is not an inductive generalization but a hypothesis reaching beyond experience. While an empirical law can explain and forecast facts, a theoretical law can explain and forecast empirical laws. The method of justifying a theoretical law is indirect: a scientist does not test the law itself but, rather, the empirical laws that are among its consequences.
The distinction between empirical and theoretical laws entails the distinction between observational and theoretical properties, and hence between observational and theoretical terms. The distinction in many situations is clear, for example: the laws that deal with the pressure, volume and temperature of a gas are empirical laws and the corresponding terms are observational; while the laws of quantum mechanics are theoretical. Carnap admits, however, that the distinction is not always clear and the line of demarcation often arbitrary. In some ways the distinction between observational and theoretical terms is similar to that between macro-events, which are characterized by physical quantities that remain constant over a large portion of space and time, and micro-events, where physical quantities change rapidly in space or time.
3. Analytic and Synthetic
To the logical empiricist, all statements can be divided into two classes: analytic a priori and synthetic a posteriori. There can be no synthetic a priori statements. A substantial aspect of Carnap’s work was his attempt to give precise definition to the distinction between analytic and synthetic statements.
In The Logical Syntax of Language (1934), Carnap studied a formal language that could express classical mathematics and scientific theories, for example, classical physics. Carnap would have known Kurt Gödel’s 1931 article on the incompleteness of mathematics. He was, therefore, aware of the substantial difference between the two concepts of proof and consequence: some statements, despite being a logical consequence of the axioms of mathematics, are not provable by means of these axioms. He would not, however, have been able to take account of Alfred Tarski’s essay on semantics, first published in Polish in 1933. Tarski’s essay led to the notion of logical consequence being regarded as a semantic concept and defined by means of model theory. These circumstances explain how Carnap, in The Logical Syntax of Language, gave a purely syntactic formulation of the concept of logical consequence. However, he did define a new rule of inference, now called the omega-rule, but formerly called the Carnap rule:
From the infinite series of premises A(1), A(2), … , A(n), A(n+1) ,…, we can infer the conclusion (x)A(x)
Carnap defines the notion of logical consequence in the following way: a statement A is a logical consequence of a set S of statements if and only if there is a proof of A based on the set S; it is admissible to use the omega-rule in the proof of A. In the definition of the notion of provable, however, a statement A is provable by means of a set S of statements if and only if there is a proof of A based on the set S, but the omega-rule is not admissible in the proof of A. (A formal system which admits the use of the omega-rule is complete, so Gödel’s incompleteness theorem does not apply to such formal systems.)
Carnap then proceeded to define some kinds of statements: (i) a statement is L-true if and only if it is a logical consequence of the empty set of statements; (ii) a statement is L-false if and only if all statements are a logical consequence of it; (iii) a statement is analytic if and only if it is L-true or L-false; (iv) a statement is synthetic if and only if it is not analytic. Carnap thus defines analytic statements as logically determined statements: their truth depends on logical rules of inference and is independent of experience. Thus, analytic statements are a priori while synthetic statements are a posteriori, because they are not logically determined.
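The shape of these definitions can be illustrated in a much simpler setting than Carnap's. The sketch below swaps his syntactic notion of consequence (with the omega-rule) for truth tables over a toy propositional language. In that setting a formula follows from the empty set exactly when it is a tautology, and every formula follows from it exactly when it is a contradiction. The function name and the encoding of formulas as Python functions of Boolean arguments are assumptions of the sketch, not Carnap's apparatus.

```python
from itertools import product
from typing import Callable

def classify_formula(formula: Callable[..., bool], n_vars: int) -> str:
    """Classify a propositional formula by exhaustive truth-table evaluation."""
    values = [bool(formula(*row)) for row in product([False, True], repeat=n_vars)]
    if all(values):
        return "L-true (analytic)"    # true under every assignment
    if not any(values):
        return "L-false (analytic)"   # false under every assignment
    return "synthetic"                # truth value depends on the assignment

print(classify_formula(lambda p: p or not p, 1))
print(classify_formula(lambda p: p and not p, 1))
print(classify_formula(lambda p, q: p and q, 2))
```

The first formula is classified as L-true, the second as L-false, and the third as synthetic, mirroring clauses (i) to (iv) above.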
Carnap maintained his definitions of statements in his article “Testability and Meaning” (1936) and his book Meaning and Necessity (1947). In “Testability and Meaning,” he introduced semantic concepts: a statement is analytic if and only if it is logically true; it is self-contradictory if and only if it is logically false. In any other case, the statement is synthetic. In Meaning and Necessity, Carnap first defines the notion of L-true (a statement is L-true if its truth depends on semantic rules) and then defines the notion of L-false (a statement is L-false if its negation is L-true). A statement is L-determined if it is L-true or L-false; analytic statements are L-determined, while synthetic statements are not L-determined. This is very similar to the definitions Carnap gave in The Logical Syntax of Language but with the change from syntactic to semantic concepts.
In 1951, Quine published the article “Two Dogmas of Empiricism,” in which he disputed the distinction made between analytic and synthetic statements. In response, Carnap partially changed his point of view on this problem. His first response to Quine came in “Meaning Postulates” (1952), where Carnap suggested that analytic statements are those which can be derived from a set of appropriate sentences that he called meaning postulates. Such sentences define the meaning of non-logical terms, and thus the set of analytic statements is not equal to the set of logically true statements. Later, in “Observation Language and Theoretical Language” (1958), he expressed a general method for determining a set of meaning postulates for the language of a scientific theory. He further expounded on this method in his reply to Carl Gustav Hempel in The Philosophy of Rudolf Carnap (1963), and in Philosophical Foundations of Physics (1966). Suppose the number of non-logical axioms is finite. Let T be the conjunction of all purely theoretical axioms, C the conjunction of all correspondence postulates, and TC the conjunction of T and C. The theory is equivalent to the single axiom TC. Carnap formulates the following problem: how can we find two statements, say A and R, so that A expresses the analytic portion of the theory (that is, all consequences of A are analytic) while R expresses the empirical portion (that is, all consequences of R are synthetic)? The empirical content of the theory is formulated by means of a Ramsey sentence (a discovery of the English philosopher Frank Ramsey). Carnap’s solution to the problem builds a Ramsey sentence on the following instructions:
Replace every theoretical term in TC with a variable.
Add an appropriate number of existential quantifiers at the beginning of the sentence.
Look at the following example. Let TC(O₁, …, Oₙ, T₁, …, Tₘ) be the conjunction of T and C; in TC there are observational terms O₁, …, Oₙ and theoretical terms T₁, …, Tₘ. The Ramsey sentence (R) is
∃X₁ … ∃Xₘ TC(O₁, …, Oₙ, X₁, …, Xₘ)
Every observational statement which is derivable from TC is also derivable from R and vice versa, so that R expresses exactly the empirical portion of the theory. Carnap proposes the statement R → TC as the only meaning postulate; this became known as the Carnap sentence. Note that every empirical statement that can be derived from the Carnap sentence is logically true, and thus the Carnap sentence lacks empirical consequences. So, a statement is analytic if it is derivable from the Carnap sentence; otherwise the statement is synthetic. The requirements of Carnap’s method can be summarized as follows: (i) non-logical axioms must be explicitly stated, (ii) the number of non-logical axioms must be finite and (iii) observational terms must be clearly distinguished from theoretical terms.
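The two-step construction of the Ramsey sentence, and of the Carnap sentence built from it, is mechanical enough to sketch as string manipulation. The toy theory below (one theoretical term, ‘Charged’, and one observational term, ‘Deflects’) is invented for illustration, as are the function names. The sketch treats formulas as plain text and assumes that no term in the theory is already named X1, X2, and so on.

```python
import re
from typing import Sequence

def ramsey_sentence(tc: str, theoretical_terms: Sequence[str]) -> str:
    """Replace each theoretical term with a variable and prefix existential quantifiers."""
    body = tc
    variables = []
    for index, term in enumerate(theoretical_terms, start=1):
        variable = f"X{index}"
        # Step 1: replace every whole-word occurrence of the theoretical term.
        body = re.sub(rf"\b{re.escape(term)}\b", variable, body)
        variables.append(variable)
    if not variables:
        return tc
    # Step 2: add one existential quantifier per introduced variable.
    prefix = " ".join(f"∃{v}" for v in variables)
    return f"{prefix} ({body})"

def carnap_sentence(tc: str, theoretical_terms: Sequence[str]) -> str:
    """Build the conditional R → TC from the theory TC."""
    return f"({ramsey_sentence(tc, theoretical_terms)}) → ({tc})"

toy_theory = "∀x (Charged(x) → Deflects(x))"  # hypothetical one-axiom theory
print(ramsey_sentence(toy_theory, ["Charged"]))
print(carnap_sentence(toy_theory, ["Charged"]))
```

The Ramsey sentence retains the observational vocabulary and says only that something plays the role the theory assigns to the theoretical term. The Carnap sentence then adds that, if anything does, the named theoretical term does.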
4. Meaning and Verifiability
Perhaps the most famous tenet of logical empiricism is the verifiability principle, according to which a synthetic statement is meaningful only if it is verifiable. Carnap sought to give a logical formulation of this principle. In The Logical Structure of the World (1928) he asserted that a statement is meaningful only if every non-logical term is explicitly definable by means of a very restricted phenomenalistic language. A few years later, Carnap realized that this thesis was untenable because a phenomenalistic language is insufficient to define physical concepts. Thus he chose an objective language (“thing language”) as the basic language, one in which every primitive term is a physical term. All other terms (biological, psychological, cultural) must be defined by means of basic terms. To overcome the problem that an explicit definition is often impossible, Carnap used dispositional concepts, which can be introduced by means of reduction sentences. For example, if A, B, C and D are observational terms and Q is a dispositional concept, then
(x)[Ax → (Bx ↔ Qx)]
(x)[Cx → (Dx ↔ ~Qx)]
are reduction sentences for Q. In “Testability and Meaning” (1936) Carnap revised the new verifiability principle in this way: all terms must be reducible, by means of definitions or reduction sentences, to the observational language. But this proved to be inadequate. K. R. Popper showed not only that some metaphysical terms can be reduced to the observational language and thus fulfill Carnap’s requirements, but also that some genuine physical concepts are forbidden. Carnap acknowledged that criticism and in “The Methodological Character of Theoretical Concepts” (1956) sought to develop a further definition. The main philosophical properties of Carnap’s new principle can be outlined under three headings. First of all, the significance of a term becomes a relative concept: a term is meaningful with respect to a given theory and a given language. The meaning of a concept thus depends on the theory in which that concept is used. This represents a significant modification in empiricism’s theory of meaning. Secondly, Carnap explicitly acknowledges that some theoretical terms cannot be reduced to the observational language: they acquire an empirical meaning by means of the links with other reducible theoretical terms. Thirdly, Carnap realizes that the principle of operationalism is too restrictive. Operationalism was formulated by the American physicist Percy Williams Bridgman (1882-1961) in his book The Logic of Modern Physics (1927). According to Bridgman, every physical concept is defined by the operations a physicist uses to apply it. Bridgman asserted that the curvature of space-time, a concept used by Einstein in his general theory of relativity, is meaningless, because it is not definable by means of operations. Bridgman subsequently changed his philosophical point of view, and admitted there is an indirect connection with observations.
Perhaps influenced by Popper’s criticism, or by the problematic consequences of a strict operationalism, Carnap changed his earlier point of view and freely admitted a very indirect connection between theoretical terms and the observational language.
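The reduction sentences for Q given earlier in this section fix the dispositional concept only under the test conditions A and C, which is why they fall short of an explicit definition. A minimal sketch of that partial determination follows; the function name q_status and the encoding of observations as booleans are assumptions made for illustration.

```python
def q_status(a, b, c, d):
    """What the two reduction sentences say about Qx for a single object x.

    (x)[Ax → (Bx ↔ Qx)]  : if A holds, Q has the same truth value as B.
    (x)[Cx → (Dx ↔ ~Qx)] : if C holds, Q has the opposite truth value to D.
    Returns True or False when Q is settled, None when Q is left undetermined,
    and raises ValueError when the observations contradict the two sentences.
    """
    verdicts = set()
    if a:
        verdicts.add(b)
    if c:
        verdicts.add(not d)
    if len(verdicts) > 1:
        raise ValueError("observations are inconsistent with the reduction sentences")
    return verdicts.pop() if verdicts else None

print(q_status(True, True, False, False))   # test condition A met, B observed
print(q_status(False, False, True, True))   # test condition C met, D observed
print(q_status(False, True, False, True))   # neither test condition met
```

When neither A nor C holds, nothing follows about Q. The error case shows that the pair of reduction sentences also excludes some combinations of observations, so together they are not empirically empty.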
5. Probability and Inductive Logic
A variety of interpretations of probability have been proposed:
Classical interpretation. The probability of an event is the ratio of the favorable outcomes to the possible outcomes. For example: a die is thrown with the result that “the score is five”. There are six possible outcomes with only one favorable; thus the probability of “the score is five” is one sixth.
Axiomatic interpretation. The probability is whatever fulfils the axioms of the theory of probability. In the early 1930s, the Russian mathematician Andrei Nikolaevich Kolmogorov (1903-1987) formulated the first axiomatic system for probability.
Frequency interpretation, now the favored interpretation in empirical science. The probability of an event in a sequence of events is the limit of the relative frequency of that event. Example: throw a die several times and record the scores; the relative frequency of “the score is five” is about one sixth; the limit of the relative frequency is exactly one sixth.
Probability as a degree of confirmation. This was an approach supported by Carnap and students of inductive logic. The probability of a statement is the degree of confirmation the empirical evidence gives to the statement. Example: the statement “the score is five” receives a partial confirmation by the evidence; its degree of confirmation is one sixth.
Subjective interpretation. The probability is a measure of the degree of belief. A special case is the theory that the probability is a fair betting quotient – this interpretation was supported by Carnap. Example: suppose you bet that the score would be five; you bet a dollar and, if you win, you will receive six dollars: this is a fair bet.
Propensity interpretation. This is a proposal of K. R. Popper. The probability of an event is an objective property of the event. For example: the physical properties of a die (the die is homogeneous; it has six sides; on every side there is a different number between one and six; etc.) explain the fact that the limit of the relative frequency of “the score is five” is one sixth.
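Three of the readings above can be compared numerically on the die example. The sketch below is an illustration under stated assumptions: the throws are simulated with a seeded pseudo-random generator rather than observed, “receive six dollars” is read as the total returned on a win (stake included), and all variable names are hypothetical.

```python
import random
from fractions import Fraction

FACES = range(1, 7)

# Classical: favorable outcomes over possible outcomes.
classical = Fraction(sum(1 for f in FACES if f == 5), len(FACES))

# Frequency: relative frequency in a long (here simulated) sequence of throws.
rng = random.Random(0)  # fixed seed so the run is repeatable
throws = [rng.randint(1, 6) for _ in range(60_000)]
relative_frequency = throws.count(5) / len(throws)

# Betting quotient: stake 1 dollar, receive 6 dollars in total on a win.
stake, payout = 1, 6
betting_quotient = Fraction(stake, payout)
# Expected net gain of the bet when the chance of winning is the classical value.
expected_gain = classical * (payout - stake) - (1 - classical) * stake

print(classical, round(relative_frequency, 3), betting_quotient, expected_gain)
```

The bet is fair in the sense that its expected gain is zero exactly when the betting quotient equals the probability of winning.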
Carnap devoted himself to giving an account of the probability as a degree of confirmation. The philosophically most significant consequences of his research arise from his assertion that the probability of a statement, with respect to a given body of evidence, is a logical relation between the statement and the evidence. Thus it is necessary to build an inductive logic; that is, a logic which studies the logical relations between statements and evidence. Inductive logic would give us a mathematical method of evaluating the reliability of an hypothesis. In this way inductive logic would answer the problem raised by David Hume’s analysis of induction. Of course, we cannot be sure that an hypothesis is true; but we can evaluate its degree of confirmation and we can thus compare alternative theories.
In spite of the abundance of logical and mathematical methods Carnap used in his own research on inductive logic, he was not able to formulate a theory of the inductive confirmation of scientific laws. In fact, in Carnap’s inductive logic, the degree of confirmation of every universal law is always zero.
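Why a universal law ends up with zero confirmation can be illustrated with the rule for singular predictions from Carnap’s λ-continuum, c = (nᵢ + λ/k)/(n + λ). That formula is not stated in this article; it is brought in here as an assumption from The Continuum of Inductive Methods, and the function names are hypothetical. The sketch computes the confirmation that each of the next m individuals has a property, given that all n observed individuals had it.

```python
from fractions import Fraction

def next_has_property(n_pos, n, lam=2, k=2):
    """Confirmation that the next individual has the property, after n_pos
    positive instances among n observed (assumed λ-continuum rule)."""
    return (Fraction(n_pos) + Fraction(lam, k)) / (n + lam)

def next_m_all_have_property(n, m, lam=2, k=2):
    """Confirmation that each of the next m individuals has the property,
    given that all n observed individuals had it (chain of single steps)."""
    c = Fraction(1)
    for j in range(m):
        c *= next_has_property(n + j, n + j, lam, k)
    return c

for m in (1, 10, 100, 10_000):
    print(m, float(next_m_all_have_property(10, m)))
```

The value shrinks as m grows and has no positive lower bound, so a law covering an unbounded domain receives confirmation zero however many favorable instances have been observed.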
Carnap tried to employ the physical-mathematical theory of thermodynamic entropy to develop a comprehensive theory of inductive logic, but his plan never progressed beyond an outline stage. His works on entropy were published posthumously.
6. Modal Logic and the Philosophy of Language
The following table, which is an adaptation of a similar table Carnap used in Meaning and Necessity, shows the relations between modal properties such as necessary and impossible and logical properties such as L-true, L-false, analytic, synthetic. The symbol N means “necessarily”, so that Np means “necessarily p” or “p is necessary.”
Modal and logical properties of statements

Modalities          | Formalization | Logical status
p is necessary      | Np            | L-true, analytic
p is impossible     | N~p           | L-false, contradictory
p is contingent     | ~Np & ~N~p    | factual, synthetic
p is not necessary  | ~Np           | Not L-true
p is possible       | ~N~p          | Not L-false
p is not contingent | Np v N~p      | L-determined, not synthetic
Carnap identifies the necessity of a statement p with its logical truth: a statement is necessary if and only if it is logically true. Thus modal properties can be defined by means of the usual logical properties of statements. Np, i.e., “necessarily p”, is true if and only if p is logically true. He defines the possibility of p as “it is not necessary that not p”. That is, “possibly p” is defined as ~N~p. The impossibility of p means that p is logically false. It must be stressed that, in Carnap’s opinion, every modal concept is definable by means of the logical properties of statements. Modal concepts are thus explicable from a classical point of view (meaning “using classical logic”, e.g., first-order logic). Carnap was aware that the symbol N is definable only in the meta-language, not in the object language. Np means “p is logically true”, and the last statement belongs to the meta-language; thus N is not explicitly definable in the language of a formal logic, and we cannot eliminate the term N. More precisely, we can define N only by means of another modal symbol we take as a primitive symbol, so that at least one modal symbol is required among the primitive symbols.
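The identification of necessity with logical truth can be sketched for propositional sentences by checking every assignment of truth values. This is a simplification made for illustration, not Carnap’s own apparatus: sentences are modeled as Python functions of a valuation, and the names l_true, necessary, possible and status are assumptions.

```python
from itertools import product

def valuations(atoms):
    """Every assignment of truth values to the given atomic sentences."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def l_true(sentence, atoms):
    """A sentence is L-true if it holds under every valuation."""
    return all(sentence(v) for v in valuations(atoms))

def necessary(sentence, atoms):   # Np
    return l_true(sentence, atoms)

def possible(sentence, atoms):    # ~N~p
    return not necessary(lambda v: not sentence(v), atoms)

def status(sentence, atoms):
    """Classify a sentence as necessary, impossible or contingent."""
    if necessary(sentence, atoms):
        return "necessary (L-true)"
    if not possible(sentence, atoms):
        return "impossible (L-false)"
    return "contingent (factual)"

atoms = ["p"]
print(status(lambda v: v["p"] or not v["p"], atoms))
print(status(lambda v: v["p"] and not v["p"], atoms))
print(status(lambda v: v["p"], atoms))
```

Here necessary is a function about sentences rather than a connective occurring inside them, which mirrors the remark that N is defined in the meta-language.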
Carnap’s formulation of modal logic is very important from a historical point of view. Carnap gave the first semantic analysis of a modal logic, using Tarskian model theory to explain the conditions in which “necessarily p” is true. He also solved the problem of the meaning of the statement (x)N[Ax], where Ax is a sentence in which the individual variable x occurs. Carnap showed that (x)N[Ax] is equivalent to N[(x)Ax] or, more precisely, he proved we can assume its equivalence without contradictions.
From a broader philosophical point of view, Carnap believed that modalities did not require a new conceptual framework; a semantic logic of language can explain the modal concepts. The method he used in explaining modalities was a typical example of his philosophical analysis. Another interesting example is the explanation of belief-sentences which Carnap gave in Meaning and Necessity. Carnap asserts that two sentences have the same extension if they are equivalent, i.e., if they are both true or both false. On the other hand, two sentences have the same intension if they are logically equivalent, i.e., their equivalence is due to the semantic rules of the language. Let A be a sentence in which another sentence occurs, say p. A is called “extensional with respect to p” if and only if the truth value of A does not change if we substitute the sentence p with an equivalent sentence q. A is called “intensional with respect to p” if and only if (i) A is not extensional with respect to p and (ii) the truth value of A does not change if we substitute the sentence p with a logically equivalent sentence q. The following examples arise from Carnap’s assertions:
The sentence A v B is extensional with respect to both A and B; we can substitute A and B with equivalent sentences and the truth value of A v B does not change.
Suppose A is true but not L-true; therefore the sentences A v ~A and A are equivalent (both are true) and, of course, they are not L-equivalent. The sentence N(A v ~A) is true and the sentence N(A) is false; thus N(A) is not extensional with respect to A. On the contrary, if C is a sentence L-equivalent to A v ~A, then N(A v ~A) and N(C) are both true: N(A) is intensional with respect to A.
There are sentences which are neither extensional nor intensional; for example, belief-sentences. Carnap’s example is “John believes that D”. Suppose that “John believes that D” is true; let A be a sentence equivalent to D and let B be a sentence L-equivalent to D. It is possible that the sentences “John believes that A” and “John believes that B” are false. In fact, John can believe that a sentence is true while believing that a logically equivalent sentence is false. To explain belief-sentences, Carnap defines the notion of intensional isomorphism. In broad terms, two sentences are intensionally isomorphic if and only if their corresponding elements are L-equivalent. In the belief-sentence “John believes that D” we can substitute D with an intensionally isomorphic sentence C.
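The first two examples can be checked mechanically with a toy model. The sketch assumes a language with two atomic sentences, a fixed “actual” assignment of truth values, and N read as truth under every assignment; all names are hypothetical. Belief-sentences are left out, since nothing in this model captures what John believes.

```python
from itertools import product

ATOMS = ["a", "b"]
STATES = [dict(zip(ATOMS, vs)) for vs in product([True, False], repeat=len(ATOMS))]
ACTUAL = {"a": True, "b": False}  # assumed actual state of affairs

def equivalent(p, q):
    """Same truth value in the actual state."""
    return p(ACTUAL) == q(ACTUAL)

def l_equivalent(p, q):
    """Same truth value in every state."""
    return all(p(s) == q(s) for s in STATES)

def N(p):
    """'Necessarily p', read as 'p is L-true'."""
    return all(p(s) for s in STATES)

def A(s): return s["a"]                       # true, but not L-true
def B(s): return s["b"]
def A_or_not_A(s): return s["a"] or not s["a"]
def C(s): return s["b"] or not s["b"]         # L-equivalent to A v ~A

def or_B(p):
    """The context '... v B', evaluated in the actual state."""
    return p(ACTUAL) or B(ACTUAL)

# A and A v ~A are equivalent but not L-equivalent.
print(equivalent(A, A_or_not_A), l_equivalent(A, A_or_not_A))
# Disjunction is extensional: the substitution leaves the truth value unchanged.
print(or_B(A) == or_B(A_or_not_A))
# N is not extensional: the same substitution changes the truth value.
print(N(A), N(A_or_not_A))
# N is intensional: substituting an L-equivalent sentence changes nothing.
print(l_equivalent(A_or_not_A, C), N(A_or_not_A) == N(C))
```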
7. Philosophy of Physics
The first and the last books Carnap published during his lifetime were concerned with the philosophy of physics: his doctoral dissertation (Der Raum, 1922) and Philosophical Foundations of Physics (ed. by Martin Gardner, 1966). Der Raum deals with the philosophy of space. Carnap recognizes the difference between three kinds of theories of space: formal, physical and intuitive. Formal space is analytic a priori; it is concerned with the formal properties of space, that is, with those properties which are a logical consequence of a definite set of axioms. Physical space is synthetic a posteriori; it is the object of natural science, and we can know its structure only by means of experience. Intuitive space is synthetic a priori, and is known via a priori intuition. According to Carnap, the distinction between three different kinds of space is similar to the distinction between three different aspects of geometry: projective, metric and topological respectively.
Some aspects of Der Raum remain very interesting. First, Carnap accepts a neo-Kantian philosophical point of view. Intuitive space, with its synthetic a priori character, is a concession to Kantian philosophy. Second, Carnap uses the methods of mathematical logic; for example, the characterization of intuitive space is given by means of Hilbert’s axioms for topology. Third, the distinction between formal and physical space is similar to the distinction between mathematical and physical geometry. This distinction, first proposed by Hans Reichenbach and later accepted by Carnap, became the official position of logical empiricism on the philosophy of space.
Carnap also developed a formal system for space-time topology. He asserted (1925) that space relations are based on the causal propagation of a signal, while the causal propagation itself is based on the time order.
Philosophical Foundations of Physics is a clear and approachable survey of topics from the philosophy of physics based on Carnap’s university lectures. Some theories expressed there are not those of Carnap alone, but they belong to the common heritage of logical empiricism. The subjects dealt with in the book include:
The structure of scientific explanation: deductive and probabilistic explanation.
The philosophical and physical significance of non-Euclidean geometry; the theory of space in the general theory of relativity. Carnap argues against Kantian philosophy, especially against the synthetic a priori, and against conventionalism. He gives a clear explanation of the main properties of non-Euclidean geometry.
Determinism and quantum physics.
The nature of scientific language. Carnap deals with (i) the distinction between observational and theoretical terms, (ii) the distinction between analytic and synthetic statements and (iii) quantitative concepts.
As a sample of the content of Philosophical Foundations of Physics we can briefly look at Carnap’s thought on scientific explanation. Carnap accepts the classical theory developed by Carl Gustav Hempel. Carnap gives the following example to explain the general structure of a scientific explanation:
(x)(Px → Qx)
Pa
———
Qa
where the first statement is a scientific law, the second is a description of the initial conditions, and the third is the description of the event we want to explain. The last statement is a logical consequence of the first and the second, which are the premises of the explanation. A scientific explanation is thus a logical derivation of an appropriate statement from a set of premises, which state universal laws and initial conditions. According to Carnap, there is another kind of scientific explanation, probabilistic explanation, in which at least one universal law is not a deterministic law, but a probabilistic law. Again, Carnap’s example is:
fr(Q,P) = 0.8
Pa
———
Qa
where the first sentence means “the relative frequency of Q with respect to P is 0.8”. Qa is not a logical consequence of the premises; therefore this kind of explanation determines only a certain degree of confirmation for the event we want to explain.
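The contrast between the two schemas can be checked by brute force on a small finite domain. The sketch assumes a domain of five individuals that all have P, weights every admissible extension of Q equally, and uses hypothetical names throughout; it illustrates the logical point and is not Carnap’s own confirmation function.

```python
from fractions import Fraction
from itertools import combinations

DOMAIN = ["a", "b", "c", "d", "e"]
P = set(DOMAIN)  # assumption: every individual in the domain has P

def q_extensions(size):
    """All candidate extensions of Q with the given number of members."""
    return [set(members) for members in combinations(DOMAIN, size)]

def share_where_qa(models):
    """Fraction of the given models in which Qa holds."""
    return Fraction(sum("a" in q for q in models), len(models))

# Deductive case: (x)(Px → Qx) holds, so Q must contain every member of P.
law_models = [q for q in q_extensions(5) if P.issubset(q)]

# Probabilistic case: fr(Q,P) = 0.8, so Q contains exactly 4 of the 5 P-individuals.
freq_models = [q for q in q_extensions(4)
               if Fraction(len(q & P), len(P)) == Fraction(4, 5)]

print(share_where_qa(law_models))   # share of law models in which Qa holds
print(share_where_qa(freq_models))  # share of frequency models in which Qa holds
```

Under the universal law Qa holds in every model of the premises, so it follows deductively. Under the frequency statement it holds in only some of them, and the share of such models matches the stated frequency under the equal-weight assumption.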
8. Carnap’s Heritage
Carnap’s work has stimulated much debate. A substantial scholarly literature, both critical and supportive, has developed from examination of his thought. With respect to the analytic-synthetic distinction, Ryszard Wojcicki and Marian Przelecki – two Polish logicians – formulated a semantic definition of the distinction between analytic and synthetic. They proved that the Carnap sentence is the weakest meaning postulate, i.e., every meaning postulate entails the Carnap sentence. As a result, the set of analytic statements which are a logical consequence of the Carnap sentence is the smallest set of analytic statements. Wojcicki and Przelecki’s research is independent of the distinction between observational and theoretical terms, i.e., their suggested definition also works in a purely theoretical language. They also dispense with the requirement for a finite number of non-logical axioms.
The tentative definition of meaningfulness that Carnap proposed in “The Methodological Character of Theoretical Concepts” has been proved untenable. See, for example, David Kaplan, “Significance and Analyticity” in Rudolf Carnap, Logical Empiricist and Marco Mondadori’s introduction to Analiticità, Significanza, Induzione, in which Mondadori suggests a possible correction of Carnap’s definition.
With respect to inductive logic, I mention only Jaakko Hintikka’s generalization of Carnap’s continuum of inductive methods. In Carnap’s inductive logic, the probability of every universal law is always zero. Hintikka succeeded in formulating an inductive logic in which universal laws can obtain a positive degree of confirmation.
In Meaning and Necessity, 1947, Carnap was the first logician to use a semantic method to explain modalities. However, he used Tarskian model theory, so that every model of the language is an admissible model. In 1972 the American philosopher Saul Kripke was able to prove that a full semantics of modalities can be attained by means of possible-worlds semantics. According to Kripke, not all possible models are admissible. J. Hintikka’s essay “Carnap’s heritage in logical semantics” in Rudolf Carnap, Logical Empiricist, shows that Carnap came extremely close to possible-worlds semantics, but was not able to go beyond classical model theory.
The omega-rule, which Carnap proposed in The Logical Syntax of Language, has come into widespread use in metamathematical research over a broad range of subjects.
9. References and Further Reading
The Philosophy of Rudolf Carnap (1963) contains the most complete bibliography of Carnap’s work. Listed below are Carnap’s most important works, arranged in chronological order.
a. Carnap’s Works
1922 Der Raum: Ein Beitrag zur Wissenschaftslehre, dissertation, in Kant-Studien, Ergänzungshefte, n. 56
1925 “Über die Abhängigkeit der Eigenschaften des Raumes von denen der Zeit” in Kant-Studien, 30
1928 Scheinprobleme in der Philosophie, Berlin : Weltkreis-Verlag
1928 Der Logische Aufbau der Welt, Leipzig : Felix Meiner Verlag (English translation The Logical Structure of the World; Pseudoproblems in Philosophy, Berkeley : University of California Press, 1967)
1929 (with Otto Neurath and Hans Hahn) Wissenschaftliche Weltauffassung: Der Wiener Kreis, Vienna : A. Wolf
1929 Abriss der Logistik, mit besonderer Berücksichtigung der Relationstheorie und ihrer Anwendungen, Vienna : Springer
1932 “Die physikalische Sprache als Universalsprache der Wissenschaft” in Erkenntnis, II (English translation The Unity of Science, London : Kegan Paul, 1934)
1934 Logische Syntax der Sprache (English translation The Logical Syntax of Language, New York : Humanities, 1937)
1935 Philosophy and Logical Syntax, London : Kegan Paul
1936 “Testability and meaning” in Philosophy of Science, III (1936) and IV (1937)
1938 “Logical Foundations of the Unity of Science” in International Encyclopaedia of Unified Science, vol. I n. 1, Chicago : University of Chicago Press
1939 “Foundations of Logic and Mathematics” in International Encyclopaedia of Unified Science, vol. I n. 3, Chicago : University of Chicago Press
1942 Introduction to Semantics, Cambridge, Mass. : Harvard University Press
1943 Formalization of Logic, Cambridge, Mass. : Harvard University Press
1947 Meaning and Necessity: a Study in Semantics and Modal Logic, Chicago : University of Chicago Press
1950 Logical Foundations of Probability, Chicago : University of Chicago Press
1952 “Meaning postulates” in Philosophical Studies, III (now in Meaning and Necessity, 1956, 2nd edition)
1952 The Continuum of Inductive Methods, Chicago : University of Chicago Press
1954 Einführung in die Symbolische Logik, Vienna : Springer (English translation Introduction to Symbolic Logic and its Applications, New York : Dover, 1958)
1956 “The Methodological Character of Theoretical Concepts” in Minnesota Studies in the Philosophy of Science, vol. I, ed. by H. Feigl and M. Scriven, Minneapolis : University of Minnesota Press
1958 “Beobachtungssprache und theoretische Sprache” in Dialectica, XII (English translation “Observation Language and Theoretical Language” in Rudolf Carnap, Logical Empiricist, Dordrecht, Holl. : D. Reidel Publishing Company, 1975)
1966 Philosophical Foundations of Physics, ed. by Martin Gardner, New York : Basic Books
1977 Two Essays on Entropy, ed. by Abner Shimony, Berkeley : University of California Press
b. Other Sources
1962 Logic and Language: Studies Dedicated to Professor Rudolf Carnap on the Occasion of his Seventieth Birthday, Dordrecht, Holl. : D. Reidel Publishing Company
1963 The Philosophy of Rudolf Carnap, ed. by Paul Arthur Schilpp, La Salle, Ill. : Open Court Pub. Co.
1970 PSA 1970: Proceedings of the 1970 Biennial Meeting of the Philosophy of Science Association: In Memory of Rudolf Carnap, Dordrecht, Holl. : D. Reidel Publishing Company
1971 Analiticità, Significanza, Induzione, ed. by Alberto Meotti and Marco Mondadori, Bologna, Italy : il Mulino
1975 Rudolf Carnap, Logical Empiricist. Materials and Perspectives, ed. by Jaakko Hintikka, Dordrecht, Holl. : D. Reidel Publishing Company
1986 Joëlle Proust, Questions de Forme: Logique et Proposition Analytique de Kant à Carnap, Paris, France : Fayard (English translation Questions of Forms: Logic and Analytic Propositions from Kant to Carnap, Minneapolis : University of Minnesota Press)
1990 Dear Carnap, Dear Van: The Quine-Carnap Correspondence and Related Work, ed. by Richard Creath, Berkeley : University of California Press
1991 Maria Grazia Sandrini, Probabilità e Induzione: Carnap e la Conferma come Concetto Semantico, Milano, Italy : Franco Angeli
1991 Erkenntnis Orientated: A Centennial Volume for Rudolf Carnap and Hans Reichenbach, ed. by Wolfgang Spohn, Dordrecht; Boston : Kluwer Academic Publishers
1991 Logic, Language, and the Structure of Scientific Theories: Proceedings of the Carnap-Reichenbach Centennial, University of Konstanz, 21-24 May 1991, Pittsburgh : University of Pittsburgh Press; [Konstanz] : Universitätsverlag Konstanz
1995 L’eredità di Rudolf Carnap: Epistemologia, Filosofia delle Scienze, Filosofia del Linguaggio, ed. by Alberto Pasquinelli, Bologna, Italy : CLUEB