| CARVIEW |
Specifically, the theory used is Tarski’s theory of concatenation (TC), with the following axioms:
Concatenation is associative, there are at least two different atomic symbols, and two different ways of dividing a text in two have a common beginning, a common ending and differ only in whether the middle part is the end of first half or the beginning the second half. That’s it. This seemingly meager theory, however, turns out to be undecidable, and indeed essentially so.
Before defining what we mean by undecidable, some preliminaries first. The intended model of TC (playing the role of ) is the set of all (finite non-empty) words over the alphabet
:
. These will be called the standard texts. The term
designates the text
, the term
the text
. Clearly each element of the intended model is designated by some constant term. The role of numerals (
) is played by a class of constant terms called standard names (
). We can easily define inclusion of texts, and limited quantification (
) is replaced by quantification over subtexts (
). The equivalent of the crucial schema
(where, for example,
is the numeral
) is provable, as are elementary facts about equality of texts. (Transposing the argument to a different setting brings out nicely the exact value of each step in the proof.)
Now we can define the notions corresponding to primitive and general recursivity. The class of elementarily discernible relations on the set of standard texts is the smallest class (a) including the relations of being identical with
, being identical with
, identity of two texts and the ternary relation of z being the concatenation of x and y, (b) closed under adding a redundant argument, identifying two arguments, permuting the order of arguments, negation, conjunction, and quantification limited to subtexts (thus
yields $S(t,y,\ldots) = \forall x \subseteq t R(x,y,\ldots)$). The class
of generally discernible relations adds to these inductive conditions closure under dual quantification:
is GD if there are two GD relations
such that
and
. The second condition is clearly equivalent to
, so we have a procedure for deciding whether for any given argument
holds or not: going through the standard texts one by one, we will sooner or later hit upon a suitable witness playing the role of
or
(WLOG we can assume that dual quantification has been used only once in defining a GD relation, as the last step). The definition of a generally discernible function is obvious, and we can easily observe that the counter-image of a GD set under a GD function is GD, and the class of GD functions is closed under composition. Every GD relation
can be represented by a formula
in a given consistent extension T of TC, that is, each instance
holds iff
is provable in T, where
is the standard name of the standard text
, and every ED relation
can be strongly represented a formula
, that is,
is represented by
and
by
. The above-mentioned schema is appealed to in proving the limited quantification induction step of these representation theorems. (It should be noted here that the original paper only proves this for
true in the standard model, the extension to arbitrary consistent extensions and the normal form lemma it relies on is proved in the paper Undecidability and Concatenation.)
We also need to code the formulas and other syntactic objects of TC within TC (in other words, as sequences of a’s and b’s). Since our theory is a theory of texts, we don’t have to rely on any specifically arithmetical tricks like the Chinese remainder theorem here, we simply enumerate the symbols used in TC and assign the codes to them (variables are written as, for example,
, so if the code of
is
and the code of
is
, then the code of this variable is
). Thus each formula (as a word over the alphabet
) can also play the role of a text (word over the alphabet
). Substitution can then be defined in full generality, but Grzegorczyk simplifies matters by assuming WLOG that the free variable we’re substituting for occurs only once in the formula, which makes the definition of substitution particularly simple.
The last step is the self-referential one. Assume that the consistent theory T containing TC is GD, i.e. the set of the codes of the formulas of T is GD (hence so is its complement). Substitution and coding are GD operations, therefore the operation that assigns the code of the formula to the code of the formula
containing one free variable is also GD, and so is the counter-image of the complement of T under this operation, i.e. the set
of (the codes of) formulas
such that
. By the representation theorem, there is a formula
defining this set: for any standard text
,
. Setting (the code of)
for
, we get
, but by definition of
also
, contradiction. Therefore, T is not GD. And since
, where
is the conjunction of the axioms of TC and
is the set of logically valid formulas of first-order logic (in our language), we get that
is not generally discernible. QED.
More explicitly: we again have a set of voters , each voting by imposing a linear order
on a set of alternatives
(we will call any such
-tuple of linear orders a “profile”). This time, however, our goal is more modest: we only wish to pick a single winning alternative from
(i.e. provide a function
from the set of voter preferences
to
, called the collective choice procedure). Our demands are:
- every candidate can win, given a suitable set of voter preferences (non-imposition)
- knowing the votes of other voters does not give anyone an incentive to change their own vote (stability / non-manipulability)
That is, given the preferences of all voters except for , the winning alternative when
votes according to his sincere preferences is preferred by
to the alternatives which would win if
voted otherwise. We will demonstrate that again, any function which satisfies these two demands is dictatorial in the sense of simply returning the top preference of some particular voter. (The proof is based on the paper Stability of Aggregation Procedures, Ultrafilters and Simple games by P. Bateau, J.-M. Blin and B. Monjardet. I tried to make the argument as informal as possible without losing the rigour. For the definition of an ultrafilter, see the previous post.)
First, a quick ultrafilter lemma:
Lemma (a characterization of ultrafilters):
Every family of subsets of a non-empty set
satisfying the following conditions is an ultrafilter:
Proof: if , then we would have both
and
, therefore
(non-triviality). Now let us assume that
, but
. Then
, but
. Therefore
has the finite intersection property and is an ultrafilter.
Now, back to social choice theory. Let’s say that for a profile , the collective choice is
. We would like know what happens when some of the voters change their preferences. For notational convenience, let
be the set of voters which prefer
to
.
Lemma (about stable collective choice):
For any stable collective choice procedure , if
and
, then
.
Proof: first, we will define a function which takes a profile as an argument and returns the set of all profiles which result from applying the follow operation to
:
- if voter
prefers
to
in
, the pair
,
gets moved to the beginning of his ranking in this order in
- if voter
prefers
to
in
, the pair
,
gets moved to the beginning of his ranking in either order in
- every other ranking in the profile remains the same
In other words, and
get moved to the front of
and
doesn’t drop in ranking relative to
. Likewise,
is the set of all profiles which result from successively applying this operation to every voter.
It is not difficult use the stability of to prove that since
, we have
(meaning: every profile which results from applying the operation to
picks out
). By applying this to each voter, we obtain the same result for
.
Now, by the hypothesis of this theorem, (
places at least as many constraints on the operation as
). Thus,
. But if it were the case that
for some
, we would have
, which is a contradiction, since
and
have a non-empty intersection.
Combining this lemma with the assumption of non-imposition yields the weak Pareto property: if everyone prefers to
,
cannot be the collective choice.
Armed with these results, we can now explore the structure of “preventing families”. A set of voters is said to be a member of the preventing family
for a collective choice procedure iff
can never get chosen by the procedure when all members of
prefer
to
(the members of
can prevent
from being chosen by unanimously preferring
).
Lemma (about preventing families):
is non-empty: trivial
: follows easily from the weak Pareto property
: by the previous property and the theorem about collective choice
: easy application of the weak Pareto property
It would seem reasonable to demand that be the same for every
. Fortunately, we do not have to, as the following theorem shows:
Lemma (neutrality of stable collective choice procedures):
for any
.
Proof: let us first show that (hence
) for any
. Pick
and
and consider a profile
such that
are the top three choices of every voter and:
for
for
for
for
Then by the weak Pareto property,
, but
and
. Thus,
and by property (2) of preventing families,
.
The same reasoning applied to
and the profile:
for
for
for
for
shows that
.
Applying this result to properties (3) and (4) of preventing families, we see that the unique preventing family in fact satisfies the first two conditions of the ultrafilter lemma. The last step of the proof of the Gibbard-Satterthwaite theorem will be to show that it also satisfies the last one.
Lemma:
The preventing family is an ultrafilter.
Proof: the first two conditions of the ultrafilter lemma are satisfied. As for the last one, let us assume that , but
. We could then construct a profile
such that:
for
for
for
otherwise
By the weak Pareto property, . But since
,
.
In view of the fact that every ultrafilter on a finite set is principal (consists precisely of its subsets containing a given element), this concludes the proof of the Gibbard-Satterthwaite theorem: every stable, non-imposed collective choice procedure for at least three alternatives is dictatorial. Conversely, a free ultrafilter on an infinite set corresponds to a stable non-dictatorial collective choice procedure (or, more accurately, a collective choice function).
]]>Let us first define a more general notion. A filter on a set
is a family of subsets of
such that:
(non-triviality)
(superset property)
(finite intersection property)
An ultrafilter is a filter such that for all
:
(maximality)
By non-triviality and the finite intersection property, only one of these options can hold, hence every ultrafilter on is maximal in the sense that it cannot be extended to a larger filter on
. A trivial example of an ultrafilter is provided by the family
of subsets of
which contain a given element
. Such ultrafilters are called “principal”. It is easy to show that every ultrafilter on a finite set is principal. On the other hand, the existence of non-principal ultrafilters (also called “free”) cannot be proven in ZF. Proving their existence requires adding a non-constructive choice principle to the axioms of ZF. In particular, assuming that every filter can be extended to an ultrafilter is equivalent to the Boolean prime ideal theorem, a principle strictly weaker than the Axiom of Choice but still independent of ZF. Applying this to, for example, the filter of cofinite subsets of an infinite set yields a free ultrafilter.
The intuitive interpretation of an (ultra)filter is that it specifies which subsets are to be considered “large”. A property holds for “almost all” elements of a set if the set of elements for which it holds is contained in the relevant (ultra)filter – for example, if its complement is finite. (This is a generalization of the measure-theoretic notion of “almost everywhere”, i.e. on a set of full measure. Indeed, an ultrafilter can be thought of as a finitely additive zero-one measure.) This interpretation suggests a natural connection to social choice theory – if “almost all” voters prefer A to B, we would ideally like to rank A higher than B.
And indeed, this is what Arrow’s conditions imply (the following proof is based on these notes by A. McLennan [pdf]). Arrow considers a situation in which we have a set of voters , each linearly ordering a set of three or more alternatives
according to his preferences (we will symbolize the
-th voter’s ordering by
). The task is to find a reasonable procedure (“social welfare function”) for aggregating their preferences into a single linear ordering of social preference (
). The natural conditions which Arrow imposes on this procedure are:
- If every voters prefers x to y, the society prefers x to y. (unanimity)
- The social preference between x and y depends only on the voters’ preferences between x and y, i.e. not on their ranking of other alternatives. (independence of irrelevant alternatives)
We will say that a set of voters is decisive for x over y if, whenever all the voters in
prefer x to y and the rest of the voters
prefer y to x, the society prefers x to y. First let us prove that if
is a decisive coalition for some x over some y, it is decisive for every alternative over any other. If
is decisive for x over y, then let us consider a set of voter preferences such that
for
and
for
. Because
is decisive for x over y, we have
, unanimity implies
and transitivity means that
. Therefore (by IIA),
is also decisive for x over z. A symmetric argument shows that
is also decisive for t over z for any t,z.
We now wish to prove that the family of decisive sets does in fact form an ultrafilter. First, non-triviality immediately follows from unanimity. Second, the intersection property. Let us assume that and
are decisive and consider the set of voter preferences defined thus:
for
for
for
for
Then by the decisiveness of
and
by the decisiveness of
, hence
by transitivity and
is decisive by IIA.
Third, maximality. Consider the set of preferences for
and
for
. If neither
nor
were decisive, then x cannot be socially preferred to y, nor y to z, but by unanimity
.
And finally, the superset property follows from the fact that if and
is decisive,
cannot be decisive (by the finite intersection property and non-triviality).
If is finite, we can conclude the proof of Arrow’s theorem by noting that every ultrafilter on
is principal, i.e. for some
,
is a decisive coalition and the social preference is always simply a copy of a’s preferences. Therefore, every social welfare function for a finite number number of voters at least three alternatives which satisfies unanimity and IIA is dictatorial in this sense.
For an infinite number of voters, the existence of such a function cannot be disproven and in fact follows from the Axiom of Choice or the weaker Boolean Prime Ideal Theorem (although calling it a “procedure” is somewhat hyperbolic due to the non-constructive nature of free ultrafilters). If there are only two alternatives to choose from, Arrow’s result is complemented by May’s theorem, which states that simple majority voting is the canonical choice.
And an interesting, if only tangentially related factoid to end with: for exactly three choices and a large number of voters, the relative prevalence of the Condorcet effect (cycles in social preference which prevent us from applying a simple majority rule) tends to according to this paper by G.-T. Guilbaud [pdf].
Well, I don’t know the first thing about economics, but it was the first thing to come to my mind. And it turns out my hunch was correct, there is indeed a portion of economics which can be axiomatized in essentially the same way as thermodynamics, as shown by E. Smith and D.K. Foley in their paper Classical thermodynamics and economic general equilibrium theory. (It may be worthwhile to think about how you would draw an analogy between the two before continuing.)
The role of thermodynamic systems in the formalism is taken over by the so-called quasi-linear neoclassical economies, which means we assume that each agent is assigned a vector of n+1 commodities
(a commodity bundle) and tries to maximize his (quasiconcave, differentiable) ordinal utility function
by exchanging goods with others. Quasi-linearity requires that each agent’s utility depend on one of the commodities
(called the “linear commodity” or “money”) as:
(where
). We will only be concerned with economic exchange, which corresponds to assuming the conservation of commodities (energy1). If any two such (groups of) agents are brought into “thermal contact” (are allowed to exchange their commodities), they will ideally keep trading their commodities until they reach a state of equilibrium where no two agents can both increase their utility by trading their commodities – a Pareto optimal state. The final Pareto optimal state given some initial endowment of goods is in general non-unique, but as it turns out, in a quasi-linear economy, the allocation of all non-linear commodities is the same for all such Pareto optimal allocations. In such an economic equilibrium, there is an economy-wide price for every commodity (otherwise, one could get richer by buying cheap and selling dear). And this price is uniquely defined by the distribution of commodities (it’s the common value of the agents’ marginal utilities at that distribution).
In fact, in a quasi-linear economy, we can aggregate the agents’ individual endowments and postulate that the equilibrium state maximizes their total utility from the non-linear goods, and the price of commodity
is the intensive variable conjugate to the (obviously extensive) total amount of the commodity
in the economy. The total utility is equivalent to entropy, the total amount of commodity
is equivalent to the extensive variable
(say, internal energy), and the price of commodity
is equivalent to the (entropic) conjugate intensive variable to
(say, the inverse of the temperature). An economic reinterpretation can also be found for the Legendre transforms of internal energy (see the paper).
1 The authors draw an analogy between and energy/volume, but given that there are
different commodities, an analogy between
and the number of moles
seems more natural to my (lay) eye, given the absence of any a priori difference between the non-linear commodities and the obvious analogy between the nature of the two quantities. It would also mean we don’t have to assume that the utility function is increasing in the commodity or worry about the third law of thermodynamics (I don’t see any meaningful way to interpret it economically and the authors don’t even mention it).
(The native Evenki are a Tungusic people related to the Manchu.)
]]>I mentioned all of those mostly so that you can, if you wish, think them through and come up with examples / counter-examples. The only ordering principles relevant here are those pertaining to any.
Let’s look at some examples from [1] (the asterisk marks ungrammaticality):
If any members contributes, I’ll be happy.
If every member contributes, I’ll be happy.
* If Jane wins, anybody is happy.
If Jane wins, everybody is happy.
Jane can win any match.
Jane can win every match.
* Jane has won any match.
Jane has won every match.
Jane hasn’t won any match.
Jane hasn’t won every match.
* Mary believes that Jane will win any match.
John has not dated any girl.
John has not dated every girl.
* Any girl has not been dated by John.
Every girl has not been dated by John.
I included a plenty of examples for a reason – try analyzing some of them for yourself to see how (G.any/every) along with the rules for preference determine the meaning of these sentences. Some of them are straightforward, some of them probably need some commentary. The difference in meaning between sentences like “Jane can win any/every match” is seen as a matter of talking about all possible matches vs. talking about actual matches only. Perhaps a clearer example is seen in [3]:
Any candidate will be considered on his merits.
Every candidate will be considered on his merits.
There is clearly a difference in meaning between, again involving a difference in the range of quantification. Either we first choose a particular future (the actual one) and quantify over individuals in it, or we first quantify over individuals in various possible futures and claim that all of them will be considered on their own merits (or would be if such a future were to occur). We can say that every candidate happens to be left-handed, but we can’t really say that any candidate happens to be left-handed. On the other hand, “any” is not assumed to have any special precedence over “intensional operators” like “hope”, hence in sentences like “Mary hopes that Jane will win any/every match” Hintikka claims that “the choice of the ‘world’ is absorbed to the intensional operator and hence precedes the choice of an individual involved either in (G.every) or (G.any)”.
You may have noticed a regularity in the un/grammaticality of these any constructions. In fact, studying such constructions led Hintikka to formulate the following principle, which he calls the any-thesis:
The word ‘any’ is acceptable (grammatical) in a given context X – and Y – Z if and only if an exchange of ‘any’ for ‘every’ results in a grammatical expression which is not identical in meaning with X – any Y – Z.
This is a well-formed condition, as the rules of game semantics for English unambiguously determine whether the sentences are equivalent or not (even when one of them is ungrammatical according to this rule). I’ll give you some time to ponder on the linguistic implications of this thesis. Meanwhile, a technical note: in [3], Hintikka relaxes the condition that the resulting expression be grammatical, as it can fail to be for completely unrelated reasons (for example not being a suitable antecedent for a pronoun, as in “if Bill has every donkey, he beats it”).
The obvious thing to notice is that in defining grammaticality in terms of meaning, it goes against the classical generativist idea of syntax (grammar) being independent of semantics. But even if semantics is taken into account, it is usually in the form of some simple, algorithmic rules. That is, we still have an algorithm which can tell us whether any given sentence belongs to the set of grammatical English sentences or not (such a set is then called “recursive”). In contrast, the any-thesis would imply that the set of grammatical English sentences is not recursive. Indeed, it is not even recursively enumerable! (That would mean having an algorithm which is guaranteed to confirm that a grammatical sentence indeed is grammatical but which is not guaranteed to terminate if it is not.)
Why? Well, the set of theorems of first-order predicate logic is not recursive (assuming we have at least one two-place predicate), and having a procedure to determine the logical equivalence of sentence of the form “if any X, then Y” and “if every X, then Y” would enable us to create a decision procedure for first-order predicate logic (details in [1]). So the set of grammatical sentences of the fragment of English we are considering is not recursive. What’s more, if we have a recursive test for sentences which would be grammatical if it were not for the any-thesis, then it cannot even be recursively enumerable – since there is a recursive enumeration of the ungrammatical sentences of our fragment (why?), it would contradict our weaker result, as a set is recursive iff it and its complement (in a recursive set) are both recursively enumerable (intuitively speaking, one could run both enumerations in parallel and every sentence is bound to pop up in one or the other). By a similar argument, it follows that the whole set of grammatical English sentences is not recursively enumerable either.
But this flatly contradicts the basic principles of generativism and what Hintikka calls the “recursive paradigm”. If the thesis is true, it follows that grammar cannot be conceived as based on generative rules as Chomsky would have us believe. And even if you disagree with this particular thesis (i.e. you disagree with some ordering principle proposed by Hintikka), it demonstrates that you cannot a priori exclude such rules from consideration, so there doesn’t appear to be any a priori reason to assume that the generativist enterprise can be successful and restrict ourselves to generativist accounts only, as we cannot rule out the existence of some similar semantic-based criterion of grammaticality. At the very least, the Chomskian must not only present valid counter-examples against the any-thesis, but also convincing arguments that such rules simply cannot operate in natural languages.
There is much more to discuss about game semantics and Hintikka, but I think I’ll end my attempt at an introduction here, for now at any rate. I hope it has been at least somewhat informative. If anyone’s interested in Hintikka’s polemic with Chomsky, check out article [2] (available in its entirety on Google Books). For a bit more about the “recursive” vs. “strategic” paradigm, try [4] (partly available online).
As for my opinion about this whole thing, I dunno. On the one hand, it seems unrealistic to assume that people would judge that grammaticality of complicated sentences including any by testing, on some level, their equivalence with the corresponding every construction. On the other hand, the concept of grammaticality partly breaks down when we’re dealing with sufficiently complicated sentences, so the set of “all” grammatical English sentences is to a large extent an artifact of the theory anyway, a matter of how we choose to extrapolate from the actual data which consists of grammaticality judgements of relatively simple sentences. In line with this, one way of interpreting the result (assuming the thesis holds for sentences of “normal” length) which is not completely unrealistic psychologically would be to assume that we do have a procedure which raises a red flag if it notices that the any construction is equivalent to the corresponding every construction and does so quite reliably for constructions of reasonable complexity and to extrapolate our notion of grammaticality from this. And Hintikka’s way of extrapolating seems quite elegant.
References (all papers by Hintikka):
[1] Quantifiers in Natural Languages: Some Logical Problems (1977), in Game-Theoretical Semantics
[2] On the Any-Thesis and the Methodology of Linguistics (1980), in Paradigms for Language Theory and Other Essays
[3] Rejoinder to Peacocke (1979), in Game-Theoretical Semantics
[4] Paradigms for Language Theory (1990), in Paradigms for Language Theory and Other Essays
]]>The rules
Now that we have an idea of what game semantics looks like, let’s see how one could apply it to English. The natural rules that correspond to our rules for propositional connectives () can be adapted by simply replacing the formal symbol with the corresponding English conjunctive construction. This presupposes that we know how to get rid of pronominal reference from one conjunct to the other (or that we limit ourselves to some fragment of English where this can be done easily enough, for example we require that pronominal antecedents be proper names) and how to transform a negative sentence into a positive one (we’re going from the outside in, not generating something from the inside out). As for universal and existential quantification, the corresponding English words which Hintikka sets up analogical rules for are some / a(n) and every / any / each.
For existential quantification it is:
(G.some) When the game has reached a sentence of the form
X – some Y who Z – W,
a person may be chosen by myself. Let the proper name of that person be ‘b‘. Then the game is continued with respect to the sentence
X – b – W, b is a Y, and b Z.
An analogical rule (G.a(n)) can be devised for a(n).For universal quantification:
(G.every) When the game has reached a sentence of the form
X – every Y who Z – W,
a person may be chosen by Nature. Let the proper name of that person be ‘d‘. Then the game is continued with respect to the sentence
X – d – W if d is a Y and if d Z.
The rule (G.any) is analogical. Here we assume that Y and Z are singular and that the ‘who’ is the subject of ‘who Z’. The rules could be modified to deal with more general situations, but Hintikka does not go into that here.
The dependencies
So, we have defined some transformations which allow us to change an English sentence into a simpler English sentence. The natural question is, which order to we apply them in, and, since we are dealing with a game, what is their “informational status” (i.e. dependence, in the sense I talked about last time). I will discuss the first question in the next post (where we’ll finally get to the relevance of all this to linguistics itself), let us now concentrate on the latter. Hintikka argues that there are English sentences which do not exhibit a linear ordering of quantification. He provides an example:
Some novel by every novelist is mentioned in some survey by every critic.
This he interprets as being true only if the novel depends only on the novelist and the survey only on the critic. I.e. it is true in a state of affairs where
The bestselling book by every author is referred to in the longest essay by every critic
but not in a state of affairs where
The bestselling book by every author is referred to in the obituary essay on him by every critic.
I don’t know how about you, but I’d probably be quite willing to assent to the original statement even in such a situation. I would probably interpret it is “for every author and every critic, there is a book by the former which is mentioned in an essay by the latter” (but don’t mind my opinions). Against this, Hintikka gives an argument which I’m not sure I like, namely he appeals to the principle of charity and claims that even though the “literal meaning” of the sentence is as he says, people tend to interpret it the other way, because that’s the way “which is most likely to make the intended meaning of [the] utterance true” – just like the natural interpretation of “someone is hit by a car every week on this street” is simply an instance of the principle of charity, rather than an exception to the left-to-right ordering principle of English quantifiers.
Consider the sentence
John has not shown any of his paintings to some of his friends.
The natural interpretation is clearly that there are friends of John’s to whom he has shown none of paintings. But this seems to violate the left-to-right ordering principle (we would expect the scope of “any” to be greater than the scope of “some”). The principle of charity line of argument doesn’t seem to be applicable here, as it would not be anything out of ordinary if there were there were no paintings of John’s which he has shown to all of his friends. Partially-ordered quantification comes to the rescue, as we can simply interpret the two quantifiers to be independent of each other, which however turns out to be equivalent to the natural interpretation. Hintikka makes it clear that this is not due to some special behaviour of the “not”, as a similar analysis can be given for “John has shown all his paintings to some of his friends” (think it through yourself).
The other argument given is that “some novel by John is mentioned in some survey by every critic” clearly follows from the original sentence, but the “linear” interpretation (“there is one novel by John which is mentioned in some survey by every author”) is, according to Hintikka, “patently counter-intuitive”. Similarly with “some novel by every author is mentioned by every critic”. It is also argued that this nesting be arbitrarily complicated, that is that every sentence with branching quantifiers corresponds to some grammatical English sentence. Myself, I wouldn’t go as far as calling them patently counter-intuitive but I admit that in this case Hintikka’s interpretation seems acceptable. What I think Hintikka neglects, however, is that his sentences are a bit too abstract – I would say that their interpretation can also be expected to depend on stress (and possibly other prosodic features, too). I don’t think it’s possible to interpret the sentence “some novel by John is mentioned in some survey by every critic” (italics denoting stress) the way Hintikka does.
The implications
But however you feel about his particular examples, it seems undeniable that there are English sentences which can be naturally interpreted his way (for me, it would probably be sentences like “a particular book of every author is mentioned in a particular essay of every critic”). At the very least, even if you are unimpressed by any of his examples, they show that there is no a priori reason to assume that classical first-order predicate calculus can in general adequately capture quantification in natural language.
The importance of these findings (if true) is summarized at the end of Hintikka’s 1973 paper Quantifiers vs. Quantification Theory (according to the editor the first paper to discuss game semantics for natural language):
“Since our results suggest that the whole of f.p.o. quantification theory is built into the semantics of English quantifiers, it follows that the semantics of a relatively small fragment of English, viz. of English quantifiers plus a few supporting constructions, is a much subtler and much more complicated subject than anyone seems to have suspected. In the eyes of a logician, it seems to be powerful beyond the wildest dreams of linguists and of philosophers of language.”
Next time, I’ll try to wrap up this little introduction to Hintikka with his thesis concerning the interplay of syntax and semantics.
]]>A philosopher/logician I have long wanted to familiarize myself with is Jaakko Hintikka, one of the originators of game semantics – a semantics for logic based on game theory. So I picked up Game-Theoretical Semantics (edited by Esa Saarinen) at my library and plunged in. It turns out Hintikka has some interesting things to say about both formal logic and natural language. I originally intended to make a single post, but it seems best to split it into two parts, one dealing with the semantics itself, the other with applying it to natural language.
As you see, I am terrible at writing introductory fluff, so let’s get down to business. This post should be understandable to anyone with at least a cursory knowledge of formal logic. Just to refresh your memory: correspond to, respectively, “or”, “and”, “there is an x such that”, “for all x” and “not”. The basic idea behind game semantics is quite straight-forward: the truth-value of a sentence is determined by the outcome of a tug-of-war between ‘Myself’ (trying to prove the sentence true) and ‘Nature’ (trying to prove the sentence false). We fix a domain of individuals
and to every sentence
, we associate a recursively defined game
. If the sentence
is of the form:
, I choose
or
and the game is continued with respect to it
, Nature chooses
or
and the game is continued with respect to it
, I choose a member of
, give it a proper name if does not already have one (
, for example), and the game is continued with respect to
, Nature chooses a member of
, give it a proper name if does not already have one (
, for example), and the game is continued with respect to
, the roles of the two players are switched
The rules are called, naturally enough, (G.), (G.
), (G.E), (G.U), (G.
) respectively. Implication can be translated into disjunction and negation or a new rule can be added (in
, I choose
or
). Likewise, we can abandon the negation rule in favour of rewriting sentences using De Morgan’s laws.
Using this process, we ultimately arrive at an atomic sentence (a sentence like
which we do not further analyze logically). If
is true, I have won and Nature lost; if
is false, vice versa (rule
). Finally, the original sentence
is true iff I have a winning strategy in
. (In game theory, a strategy is a function describing which option one is going to choose given the previous ones, a winning strategy is naturally a strategy which guarantees you victory.)
Thus we have defined a semantics for first-order predicate calculus which determines which sentences are true and which are false. That this definition is reasonable should be obvious. Assuming the Axiom of Choice, one can easily use induction to prove that this definition of truth is equivalent to Tarski’s (truth = satisfaction by every assignment of objects to variables). The problematic step is where we have to jump from “for every , I have a winning strategy for
” to “I have a winning strategy for
” – without the Axiom of Choice, there needn’t be any, and sentences true according to Tarski would no longer have to be true according to Hintikka, resulting in a non-classical logic.
In fact, the easy modifiability to deal with non-classical logics is one of the main advantages of game semantics. For one thing, we might require the strategies to be computable functions. For another, the semantics works for a more general theory than classical logic if we relax a condition we have been tacitly assuming all along. Namely, the requirement of perfect information – that both players are informed about the previous choices made by their adversary (and remember their own) and their subsequent choices depend on them. The theory we get when we no longer assume such a dependence is called finite partially-ordered quantification or branching quantification. A typical formula with a branching quantifier might look like this:
This implies that the choice of depends only on
and the choice of
only on
(the branching may of course be further nested). We can make this dependence explicit by using the so-called Skolem functions
:
This sentence is not equivalent to any sentence of regular first-order predicate logic. On the other hand, the sentence:
is equivalent simply to:
.
This, obviously, is an exception rather than the norm – it can be proven that (validity in) this logic of finite partially-ordered quantification is recursively equivalent to (validity in) second-order logic, i.e. there is an algorithm to turn a sentence in second-order logic (where we can quantify over predicates in addition to individuals) into an equivalent sentence in this “branching” logic. And second-order logic is, of course, a much stronger theory than first-order logic.
What implications all this has for natural language (English in particular) will be discussed next time.
]]>Instead, I’d like to write about some not too advanced maths and logic, language(s), analytic philosophy and political theory. Possibly more, possibly less.
It goes without saying that you’re encouraged to leave comments of any kind. As much as writing one’s thoughts down in a coherent manner helps to expose their hidden inconsistencies, feedback or not, the point of having a blog isn’t really to just keep throwing browserfuls of text into an insatiable electronic abyss, is it?
So enjoy this blog, write comments and let’s hope that I won’t look back at this post in a couple of weeks and laugh at how naive I was to think that I would keep updating it semi-regularly.
Cheers.
]]>