A defense of syntactic freedom; or, First-order logic is better than you think.

In a recent email exhortation, Ian Horrocks asks us to restrict ourselves to well-understood principles and make sure that we are always working inside some subset of first-order logic (FOL). I entirely agree, but Ian and I come to different conclusions about what this means. Since my view is perhaps more radical than Ian's, and certainly less well documented, I explain it here and say why some of the concerns that have been expressed about its internal coherence are unwarranted.

What counts as first order?

First, there is no such thing as the 'correct' or 'classical' or 'normal' first-order logic. There are probably as many variations of first-order logic as there are logicians, and among the dozens, perhaps hundreds, of logic textbooks that have been published almost no two use exactly the same logical system. The syntax can be written in a large variety of ways (infix, prefix, postfix; graphically like Peirce or like Frege; etc.); there are many widely different notions of first-order proofs (natural deduction systems, Gentzen tableaux, resolution-style proof systems, etc.); and there are even several semantic options. Some versions allow function symbols, others do not; some use only a few connectives, some use only one, others allow many; and so on. I emphasize this rather elementary point only to provide some pre-emptive answer to the frequently heard complaint that any new idea violates the norms of 'normal' first-order logic. Of course all these varieties are profoundly alike in a certain sense (modulo a few well-known distinctions, such as the presence or absence of equality), but the issue we need to face is how best to characterize this common quality. What makes a logic first-order?

One kind of answer, the one most often found in elementary textbooks, is purely syntactic. It characterizes first-order languages by their conformity to a kind of syntactic layering: there are individual symbols and relation symbols, and a language is first-order if it respects this layering by keeping these categories distinct and only allowing quantification over the individuals. In contrast, higher-order languages (which are well known to be less tractable in many ways) have a syntax which blurs or even eliminates this distinction by allowing relations to hold between other relations and allowing quantification over relations. On this view, then, the first-order nature of FOL is essentially a syntactic restriction. This has the merit of being very easy to state and trivial to check. However, it misses the essential point.

Another kind of answer would be semantic: a language is first-order if it has a conventional Tarskian model theory in which individual names denote things in the universe and relation names denote relations over the universe. This is better, but it ignores the fact that one can give such a semantics to a notation which most people would consider non-first-order, such as type theory.

I wish to suggest a different answer, based on the relationship between the
semantic and proof-theoretic qualities of the language. The reason why FOL plays
such a central role in the foundations of mathematics and in knowledge representation
is not a simple syntactic restriction. FOL has a fundamental logical
character which makes it, in a sense, the canonical computational representational
logic: it is *the strongest possible relational logic which is both complete
and compact*. Relational means that the logic mentions entities and relations
between them. Completeness means that proof theory and the semantics are in
correspondence. Without this, of course, one could erect all kinds of semantic
fantasies but they would have no bearing on the actual proofs that could be
obtained. Compactness means that proofs can use only finite resources: any conclusion
from an infinite set of assumptions must be derivable from some finite
subset of them. This means that proofs are things that can actually be constructed
and implemented by mortal people and physical machines.

Now, these are very general conditions; and, moreover, they make a certain
basic sense. They are not arbitrary or capricious. Any logic which is going
to be used by finite agents to represent content *must* be first-order
in this sense, and so it seems interesting to characterize the nature of this
strongest logic in this way.

This allows for any surface syntactic form, any of the many interdefinable
combinations of syntactic features, and any finitistic proof theory to be counted
as FOL. When it comes to semantics, the variety of options available at the
syntax and proof-theory levels reduces to a single common picture which can
also be characterized very elegantly. The semantics of FOL is a mathematical
theory of how the language can be interpreted in a world, and it is the *simplest
possible* such theory. That is, it makes the *minimal* assumptions
about the nature of the world, consistent with its intended function of assigning
a truth-value to every expression of the logic. Call this the principle of *semantic
agnosticism*; the theory of truth should make the assumptions needed to
determine the truth of any expression, but no more.

Applied to any logic with connectives such as *and*, *or*, *not*
and *implies*, for example, this requires that the semantics provide
truth-tables for the connectives, in order to determine the truth-values of expressions
built up using the connectives. This is not to say that these truth-tables are
the full meanings of these words; only that they are what the formal semantics
is required to specify. Perhaps to assert (p implies q) is to say more than
((not p) or q); no matter, because that extra meaning in the word 'implies',
if there is any, does not affect the truth-value of the expression in any particular
world; and the logical semantics is concerned only with truth in this strict
sense of correspondence to a world.
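The point about 'implies' can be made concrete. The following sketch (the formula encoding and function names are illustrative, not part of any standard machinery) computes truth-values from the truth-tables alone, and checks that (p implies q) and ((not p) or q) receive the same truth-value in every world:

```python
# A minimal sketch: truth-tables for the connectives, used to compute the
# truth-value of a compound expression once the atoms' values are fixed.
# Formulas are nested tuples: ('not', p), ('and', p, q), ('or', p, q),
# ('implies', p, q); atoms are strings looked up in the valuation.

def evaluate(expr, valuation):
    """Return the truth-value of `expr` under `valuation` (atom -> bool)."""
    if isinstance(expr, str):
        return valuation[expr]
    op = expr[0]
    if op == 'not':
        return not evaluate(expr[1], valuation)
    left = evaluate(expr[1], valuation)
    right = evaluate(expr[2], valuation)
    if op == 'and':
        return left and right
    if op == 'or':
        return left or right
    if op == 'implies':
        # The material truth-table: (p implies q) iff ((not p) or q).
        return (not left) or right
    raise ValueError(f"unknown connective: {op}")

# Whatever extra meaning 'implies' may carry, it makes no truth-value
# difference in any of the four possible worlds over {p, q}:
for p in (True, False):
    for q in (True, False):
        world = {'p': p, 'q': q}
        assert evaluate(('implies', 'p', 'q'), world) == \
               evaluate(('or', ('not', 'p'), 'q'), world)
```

This is exactly the sense in which the formal semantics is agnostic: it fixes only what is needed to settle truth-values, and is silent about any further meaning the connectives may have.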

Developing FOL from semantic agnosticism.

In this section we show that applying the agnosticism principle rigorously to the task of designing a model theory for a relational language almost completely determines both the model theory and the syntax of the language. As a deliberate policy, whenever there is a choice, we will prefer the option which imposes the fewest constraints on the language.

Applied to the quantifiers,
this principle requires that a world specify a set of entities to quantify over
(to make sense of 'for all'), and for each predication of a relation expression
to some arguments, whether or not the predication is true. This amounts to assigning
to each thing which is denoted by any relational expression, a set of n-tuples:
those sequences of arguments which make a predication of this relation true.
We know these sequences are finite, because the language is compact. Thus, the agnosticism
principle requires that an interpretation define a nonempty set U, and a mapping
IEXT from some subset R of U to sets of finite sequences over U (subsets of U*), and
that any expression which occurs in the relation position in any atomic sentence
denotes an entity in the subset R; *and that is all*. *Any such structure
will suffice to assign a unique truth-value to any expression of a first-order
language.* Notice that nothing has been said here about the nature of the
entities in the universe set. It is an essential part of the nature of FOL that
it imposes no conditions on its universe other than that it be nonempty; any such
condition would limit its generality.
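Gathered in one place, the structure just described can be written out as follows (the satisfaction clause for atomic sentences is the one implicit in the text, made explicit here):

```latex
\[
  U \neq \emptyset, \qquad
  \mathrm{IEXT}\colon R \to \mathcal{P}(U^{*}), \qquad R \subseteq U
\]
% An atomic sentence whose relation expression denotes r and whose
% arguments denote a_1, ..., a_n is true in the interpretation iff:
\[
  r(a_1,\dots,a_n)\ \text{is true}
  \quad\Longleftrightarrow\quad
  \langle a_1,\dots,a_n\rangle \in \mathrm{IEXT}(r).
\]
```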

This condition has an inherent awkwardness, however, considered as a semantic condition, since it refers to the syntactic forms of the language: IEXT is required to be defined on the denotations of expressions which occur in a relational position in an atomic sentence. One could work within this awkwardness, but the language is usually tidied up in order to eliminate this condition. One way to do this is to declare a syntactic category of relation symbols, impose a syntactic restriction that all atomic expressions must begin with a relation symbol, and require all relation symbols to have denotations in the domain of IEXT. A simpler strategy, which amounts to a special case of the above, is to require that IEXT be defined on the entire universe. These are closely related: one can obtain the second from the first by assigning empty extensions to non-relation symbols. Since the second is simpler and requires no extra syntactic restrictions to be placed on the language, we will adopt it.
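The second strategy can be sketched as a toy interpretation (all names here are illustrative assumptions, not a standard API): IEXT is treated as total on the universe by giving every entity the empty extension unless one is stated, so relation expressions need no special syntactic category.

```python
# A toy interpretation in which IEXT is total on the universe: every
# entity has an extension, empty unless explicitly given. Entities that
# denote relations are ordinary members of the universe.

class Interpretation:
    def __init__(self, universe, denotation, extensions):
        assert universe, "the universe must be nonempty"
        self.universe = universe      # U, a nonempty set
        self.denotation = denotation  # maps names to entities in U
        self.extensions = extensions  # stated extensions (partial) ...

    def iext(self, entity):
        # ... but IEXT is total: unstated extensions are simply empty.
        return self.extensions.get(entity, set())

    def holds(self, relation_name, argument_names):
        """Truth-value of the atomic sentence relation(arg1, ..., argn)."""
        rel = self.denotation[relation_name]
        args = tuple(self.denotation[a] for a in argument_names)
        return args in self.iext(rel)

# Usage: a three-element universe in which 'loves' denotes an entity
# whose extension contains the single pair (#a, #b).
I = Interpretation(
    universe={'#a', '#b', '#L'},
    denotation={'a': '#a', 'b': '#b', 'loves': '#L'},
    extensions={'#L': {('#a', '#b')}},
)
```

Note that `I.holds('a', ['b'])` is simply false rather than ill-formed: using a non-relation name in relation position needs no syntactic prohibition, because its empty extension makes every such predication false.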

but not that it specify anything further about the nature of those entities. This