# Universal Grammar


When people study language typology, they study the ways in which languages vary. But typology is about more than noting that different languages use different words, or that certain languages use similar sounds. We study the ways in which the structural features of languages differ (or are similar), and many typologists go further, asking what the limits of linguistic structural variation are.

English speakers will know that in a simple transitive clause we start with the subject, followed by the verb, followed by the object, e.g. ‘Bob (S = subject) likes (V = verb) pizza (O = object)’, i.e. English typically has SVO word order. But are there other ways of arranging such a structure? Logically there are six: SVO, SOV, VSO, VOS, OSV, OVS. The next question a typologist will ask is how languages are distributed across these possibilities. As a null hypothesis we might expect to find roughly equal numbers of languages in each group, but this is not what we find at all. SVO and SOV together account for around 85% of all languages (with SOV being somewhat more frequent than SVO). Adding VSO languages brings the total to around 95%. The question is: why is the distribution of languages so skewed?
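The six logically possible orderings are simply the permutations of the three elements S, V, and O. A quick sketch (illustrative only; the frequency figures are the rough percentages cited above, not precise data):

```python
from itertools import permutations

# Enumerate every logically possible basic word order:
# all orderings of subject (S), verb (V), and object (O).
orders = ["".join(p) for p in permutations("SVO")]
print(orders)  # ['SVO', 'SOV', 'VSO', 'VOS', 'OSV', 'OVS']

# Under the null hypothesis, each order would cover roughly
# 1/6 (about 17%) of the world's languages -- but in fact
# SOV + SVO alone account for around 85%, and adding VSO
# brings the total to around 95%.
null_share = 1 / len(orders)
print(f"null-hypothesis share per order: {null_share:.0%}")
```

Nothing here depends on Python specifically; the point is just that the space of logical possibilities is tiny (3! = 6), which makes the skew in the actual distribution all the more striking.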

Three broad types of answers suggest themselves as candidates (at least to my mind):

1)     It could be down to chance – the distribution of languages today may represent a highly skewed sample. If we came back in 1,000 years we might see a completely different distribution. This approach is obviously not taken by language typologists. There is certainly something interesting about the distribution which demands an explanation. To write the pattern off as due to chance would be to miss potentially significant insights into the ways languages are structured and shaped.

2)     The formal aspects of human language (perhaps as encoded by Universal Grammar) constrain the surface forms that human languages can take, i.e. variation is not limitless, though it may appear vast.

3)     The functional pressures that act on speakers and hearers every time they use language will affect which forms languages will prefer to take, i.e. structures that are easier to say and to comprehend will be preferred and so will come to dominate amongst the languages of the world.

Given the great success of generative linguistics over the past few decades, (2) is a very popular approach to take. However, many feel intuitively that the approach in (3) is ultimately more satisfactory as an explanation. Personally, I’m inclined to think that if we can explain surface variation in terms of performance preferences, this is a good thing, because it means there is less for the formal approach to account for. Furthermore, formal aspects of language are most often thought to be all-or-nothing affairs: if a grammar rules out a particular structure, that structure cannot exist, whereas if performance factors disfavour a particular structure, that structure will be either non-existent or merely rare.

But are (2) and (3) incompatible? You might think so, given the distinction that’s often made between competence and performance. Many would not consider performance factors as relating to language proper – they are extra-linguistic and not something the linguist should be looking at. But the fact is that all the (overt) language that we use to construct theories of both competence and performance is being ‘performed’ in some way (spoken, written, or signed). I think there may well be limits on variation set by the formal properties of human languages (which will account for some of the totally unattested structures), but others will be set by performance. And still others may be set by physics and biology more generally (here I’m thinking mainly of phonological typological patterns).

For now then it may be useful to adopt either (2) or (3) as an approach to language typology with the aim of seeing how far they can go, but always with the ultimate aim of putting the two together in the end for a more comprehensive account of why languages are the way they are.

When children learn their native language(s), they receive very little, if anything, in the way of explicit instruction. The utterances a child is exposed to are not perfect – they contain false starts, repetitions, and various other errors. Furthermore, children are not told what is grammatical and what is not. This latter point is especially important because it means that there is no negative evidence. Yet somehow a child is able to glean from such data the grammatical rules of their language(s). Consider the following:

(1) What did you say that Bill thought that John saw?

(2) *What did you say that Bill met the man who saw?

Despite perhaps never having encountered utterances like (1) or (2), English speakers know that (1) is a grammatical sentence of English whilst (2) is not. But where did this knowledge come from? Another way of putting the question is to ask how we can know so much given how little we have to learn from. This is Plato’s Problem.

In modern linguistics, the solution to Plato’s Problem is to say that humans come equipped (i.e. it is in our genetic endowment) with certain bits of knowledge, e.g. we innately know how to analyse certain types of data in our environment. If we believe that any of these bits of knowledge that we make use of in language acquisition are specific to language, we arrive at the idea of Universal Grammar (UG).

I haven’t written anything for a while since I’ve been so busy recently (been working a lot on the typology of relative clauses – perhaps I’ll post something about that soon). This evening I watched an interview (on YouTube) from the late 1970s (1977, I think) with Chomsky. The interview is from a series called “Men of Ideas” produced by the BBC.

It’s a great interview – stimulating and perceptive questions and, of course, stimulating and perceptive answers! Many things caught my attention, one of which was that Chomsky spoke of two factors playing a role in language design, namely the biological endowment (i.e. Universal Grammar (UG) – the species- and domain-specific cognitive ‘organ’ dealing with language) and linguistic experience (i.e. the primary linguistic data from which we acquire our native language(s)). The idea was that all humans are born with a capacity for language, i.e. UG is innate in humans, provided by our genetic makeup. The data we encounter as children is so scant and degenerate (full of false starts, sentence fragments, etc.) that it would be virtually impossible to acquire a grammar in the short amount of time that it takes any normal child to do so the world over… unless we came pre-programmed for the task. The idea was that UG was this pre-programming. UG was thought to be richly specified with linguistic principles (all genetically encoded) that would help children in the task of language acquisition by severely constraining the hypotheses any child would entertain when acquiring a grammar to generate the data the child was exposed to. That was then.

Nowadays, Chomsky speaks not of two factors, but of three factors of language design. UG and the primary linguistic data are the first and second factors respectively. The third factor is made up of general principles of data analysis and efficient computation. The idea is that children can bring these domain-general (i.e. not exclusively related to language) tools to language acquisition. The third factor allows the first factor, i.e. UG, to be made much smaller. In other words, UG is no longer thought to be as richly specified as it once was. In fact, the aim is to make UG as small as possible. This is desirable for a number of reasons, but a particularly pertinent reason concerns the evolution of language, i.e. the evolution of the capacity for language in humans. As an 'organ’ of the mind, UG is a biological entity, and as such it must have evolved (though not necessarily through direct selection, as Chomsky points out in the interview!). Given that chimpanzees do not have UG, UG must have evolved some time in the last 5-7 million years or so. It is therefore unlikely that something as rich and complex as UG as it was originally conceived could have evolved in such an evolutionarily short space of time. The third factors, however, need not be specific to language, nor do they need to be specific to humans. Therefore, it is conceptually desirable if we can explain the design of language in terms of third factors. This is, in fact, viewed as the only source of principled explanation in Chomskyan syntax nowadays.

Importantly, although UG is far smaller than it was and may only consist of very few things (a recursive structure building operation at the very least), it is nevertheless still thought to exist. The UG hypothesis in its modern incarnation is thus still very different from approaches which deny the existence of UG altogether.

Anyway, if you’re interested, I suggest reading Chomsky’s (2005) paper:

Chomsky, N. (2005). Three Factors in Language Design. Linguistic Inquiry 36(1), 1–22.
