#comparative method

Grimm’s Law (also called the First Germanic Sound Shift) refers to changes which affected the stop consonants in what became the Germanic subgroup of the Indo-European language family (Proto-Germanic being the ancestor of all Germanic languages, e.g. Gothic, German, Yiddish, Swedish, Icelandic, Dutch, Afrikaans, Old English, English). There are in fact three series of changes, each of which altered some aspect of the articulation of the IE stop consonants whilst retaining the same number of distinctions (number of phonemes).

Law A:             IE /p t k/          >          Gmc /f θ x/

Law B:             IE /b d g/         >          Gmc /p t k/

Law C:             IE /bh dh gh/   >          Gmc /β ð γ/ (which later became /b d g/)

Exactly when this happened is not known, but we can at least work out when the Laws may have taken effect relative to each other, e.g. Law A cannot have happened after Law B, because otherwise the new /p t k/ produced by Law B would have been caught by Law A as well, and we would expect IE /b d g/ to show up as /f θ x/ in Germanic.
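The ordering argument can be checked mechanically. What follows is only a minimal sketch: it treats each Law as a simple context-free substitution over a list of segments, and the names `LAW_A`, `LAW_B`, `LAW_C`, `apply_law` and `apply_in_order` are made up for this illustration. It ignores conditioning environments and exceptions entirely.

```python
# Each Law as a plain segment-to-segment mapping (an illustrative simplification).
LAW_A = {"p": "f", "t": "θ", "k": "x"}
LAW_B = {"b": "p", "d": "t", "g": "k"}
LAW_C = {"bh": "β", "dh": "ð", "gh": "γ"}


def apply_law(segments, law):
    """Replace every segment the law targets; leave all others untouched."""
    return [law.get(seg, seg) for seg in segments]


def apply_in_order(segments, laws):
    """Apply the laws one after another, each to the output of the previous one."""
    for law in laws:
        segments = apply_law(segments, law)
    return segments


# The nine IE stops discussed above.
ie_stops = ["p", "t", "k", "b", "d", "g", "bh", "dh", "gh"]

a_first = apply_in_order(ie_stops, [LAW_A, LAW_B, LAW_C])
b_first = apply_in_order(ie_stops, [LAW_B, LAW_A, LAW_C])

print(a_first)  # ['f', 'θ', 'x', 'p', 't', 'k', 'β', 'ð', 'γ']
print(b_first)  # ['f', 'θ', 'x', 'f', 'θ', 'x', 'β', 'ð', 'γ']
```

With Law A first, all nine distinctions survive. With Law B first, IE /b d g/ merge with IE /p t k/ and only six distinctions are left, which is not what Germanic shows.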

For example:

Latin pater > English father, German Vater (German orthographic <v> is pronounced /f/)

Greek tri > English three

Latin cord- > English heart (English /h/ descends from earlier /x/)

Sanskrit bhratar > English brother, German Bruder

These are standard but selective examples. Standard in the sense that you’ll find them in textbooks; selective in that we cannot simply look at one language and expect it to faithfully represent changes which happened hundreds of years ago. Latin, Greek and Sanskrit have undergone changes since Proto-Indo-European, and English and German have undergone changes since Proto-Germanic. Modern German shows evidence of a Second Germanic Sound Shift which changed the Germanic stop consonants again! English did not undergo this change as it had already separated from the language that was to become German (compare three and drei, daughter and Tochter, etc.).

siberian-khatru-72:

possessivesuffix:

siberian-khatru-72:

max1461:

siberian-khatru-72:

max1461:

max1461:

I think it’s worth remembering that, for language families like IE and Semitic, the comparative method alone did not give us >5000 year old reconstructible proto-languages. The comparative method gave us 1500-2000 years, and we applied it to textual sources that were already >2000-3000 years old. Based on families with confident proto-language reconstructions that don’t have significant pre-modern written attestation, I think 3000 years is a better rule of thumb for the maximum time-depth at which the comparative method is really effective. Of course that’s just a rule of thumb—if someone can actually demonstrate an older relationship with systematic sound correspondences in core vocabulary and morphology then I’ll change my tune.

@kaumnyakte-ultra

True, but IIRC still only three or four thousand years. And the fact that huge chunks of the family are spoken on relatively isolated islands basically provides the ideal environment for the comparative method to succeed. We’re extremely spoiled by Austronesian, in comparison to like, the Amazon (which is what originally got me thinking about this), which is one of the least friendly environments possible for historical linguistics.

image

This is from Blust’s “The Austronesian languages”.

So, out-of-Taiwan expansion was already underway by 4800 BP, and the breakup of Proto-Austronesian in Taiwan must have occurred even earlier.

Also, the idea that island environment impedes language contact is a myth; even in Polynesia contact was widespread. I doubt that there is even one Oceanic language that does not have loans from other Oceanic languages.

Fair enough, that’s quite a bit older than I thought it was.

With regard to island environments, I wasn’t talking about language contact between already-differentiated varieties as much as I was talking about the fact that forming large dialect continua is more difficult, so subgrouping is in some sense cleaner and things are more closely aligned to the neogrammarian model with linearly orderable sound changes. But maybe this is not really true either, I’m not sure.

Well, sound changes are always linearly orderable. It’s just that in dialect continua the order of changes may be different for different varieties, since the changes themselves spread by contact. In clear-cut subgroups the order would be identical for all languages; such subgroups result from a bottleneck effect, usually during migrations - and there were plenty of migrations in the Amazon.

If you look at the most successful applications of the comparative method to modern languages - Austronesian, Bantu and Algonquian - you’ll find that there are few clear-cut subgroups in these families. Algonquian has an Eastern Algonquian subgroup, but hardly anything else; Bantu is divided into “zones” which are not subgroups, and Austronesian does have Malayo-Polynesian and Oceanic, but most really conservative languages outside of Taiwan belong to “Western Malayo-Polynesian”, which again is not a subgroup.

Of course, if you do have clear-cut subgroups, you can (and must) compare reconstructed intermediate protolanguages, which immediately adds one or two thousand years to your supposed time limit. Uralic reconstruction is based on comparing Proto-Finnic, Proto-Mansi, Proto-Samoyed, etc. Each of these low-level reconstructions is pretty solid, except perhaps Proto-Permic. I think that reconstructed Proto-Finnic is more useful for Uralic reconstruction than Gothic is for Indo-European.

Do I need to now start crossposting here discussions I just got done posting on Twitter…

What does “the comparative method being effective” mean exactly? Identifying a relationship at all? Identifying enough regular correspondences to sketch a reconstruction? Being actually certain that the reconstruction is broadly correct? The first clearly works at least up to 6000 years, with sufficient finesse probably more. The second clearly works at least up to 4000–5000 years.

The third is, yes, much more trouble. Even in IE we keep having debates over things like laryngeal theory and glottalic theory, large parts of them not depending on the correspondences per se but the phonetic typology of the assumed reconstructions and sound changes. Frankly I think this is actually fundamentally uncertain for any bottom-level proto-languages, no matter if 5000 or 500 years old: there are too many possibilities for isomorphic reconstructions. But add any solid outgroup evidence — a relationship that is known but not necessarily reconstructed — and a lot can be resolved. Sometimes loanword evidence might work as outgroup evidence too (very much the case for Finnic: e.g. Baltic loanword evidence will resolve that core *ht ~ South Estonian *tt is < *kt), but further back in history, any identifiable proto-node is ever more likely to not have been close enough to any other proto-node for this to work.

Intermediate proto-language uncertainty will still remain in figuring out what innovations are shared because they occurred before the split-up of Proto-Intermedic, and which are shared because they’re areal Common Intermedic, though that does at least amount to knowing that there was a given innovation in a given direction.

Yes, there is clearly a continuum as we go from more recent to more deep relationships, and the further we go, the less we can reconstruct.

But it is actually very misleading to frame this in terms of absolute time (“a cut-off point for the comparative method”). There are two reasons for this.

First, you can date the breakup of a proto-language only by comparison with an archaeological dating, and/or by glottochronology (if you accept this method) - and in both cases you need a decent reconstruction. That is, you cannot assign a date to a language family unless you’ve already applied the comparative method. So by definition you cannot know the depth of a protolanguage that you cannot reconstruct.

Second, and more important, our ability to make a sufficiently detailed reconstruction depends not only on absolute time, but also on many other factors: level of documentation of daughter languages, number of daughter languages, rates of lexical replacement, availability of more or less conservative languages, possibilities for internal reconstruction (the more non-trivial morphophonology, the better), structure of the family (the more intermediate nodes in the tree, the better), etc. And of course, it depends on how much time and effort was put into attempts to reconstruct a proto-language - the main advantage of Indo-European is not the availability of ancient languages, but the sheer number of linguists engaged in reconstruction.

The meaninglessness of various figures for “the cut-off point of the comparative method” was shown long ago by Manaster Ramer in his paper on “Uses and Abuses of Mathematics in Linguistics”, and it is rather sad to see the same old notion of the “time limit for the comparative method” repeated again and again.

Right, I very much don’t want to claim these are consistent points for how far comparison works; they’re examples of roughly how old some known / reconstructed relationships have turned out to be (i.e. clearly more than 3000 years). Sufficiently hard cases that are younger but unworkable might also exist. Though how would we know exactly? Very good point too that relationships are identified first and their age determined only afterwards.

Manaster Ramer has made other good points on this too. In the Twitter discussion alluded to above, de Carvalho recommended his 2000 paper with Baxter: “Beyond lumping and splitting: probabilistic issues in historical linguistics”.

Finally, the idea that our methods allow us to ‘prove’ language relationships to a certain limit, beyond which the responsible scientist must refrain from speculation, reflects a nineteenth-century inductivist ideology of science which is now rightly discredited. In the inductivist view, scientists carefully observe facts, their minds uncontaminated by preconceived notions or hypotheses; and they prove new scientific results by applying a fixed code of valid inductive principles to their observations. (A ‘method’ in the narrow sense, as in ‘the comparative method’, is a code of this kind.) As long as scientists unswervingly follow this procedure, it is believed, the truth of their results is assured, and the store of legitimately proven scientific knowledge is gradually increased. But speculations not firmly grounded in observation undermine the legitimacy of the whole process, and pollute the inquiry from that point on.
This view, though too rigid to follow in practice, and now largely abandoned by philosophers of science, still survives among the defence mechanisms of our field. By suggesting that hypotheses about deep linguistic relationships are forever beyond the reach of legitimate scientific inquiry, it is now doing a disservice by unnecessarily and prematurely discrediting some of the most interesting lines of inquiry open to us. We urgently need more discriminating defences which will protect us without exacting this high price.
