Thursday, 23 May 2013

Historical Linguistics Has Limitations

I'm finishing up my post on fusion, feature, fission, etc, in the series on historical linguistics, and I'll put that up shortly.  But before I do, I wanted to post something I've been thinking about recently.

There have been some proposals - in fact, many proposals, and they keep on coming - trying to establish a common language family in Eurasia, or Afro-Eurasia, or the Americas, or the world.  People seem to love this kind of research.  Media publications seem to love it, too, with the New York Times and BBC website regularly publishing overviews of these kinds of spurious linguistics.  It looks a lot like science.  It is not science.  The people involved are usually non-linguists applying a non-standard framework to linguistic problems - a phylogeographic model employed by epidemiologists and biologists, for instance, which has graphs and cool jargon and other sciency accoutrements.  It looks convincing to people who don't know anything about linguistics or reconstruction, and it tends to treat languages as if they are something other than languages.  But languages are languages; they aren't genetic entities like living organisms, but 'genetic' entities, with a primarily metaphorical association to the idea of genetic relationship. Treat them as something other than languages and you'll only do it wrong.

It's a shame that these things are so popular.  Even biologists and geneticists are taken in by them, and sometimes try to establish connections between haplogroups and these ridiculous macro-families.  Dienekes' Anthropology Blog, which is a generally reliable source for information on human population genetics, seems to endorse this kind of bullshit, as do the authors of some of its comments.  Archaeologists also seem to support it.  Apparently, Colin Renfrew, famed archaeologist and linguistics-abuser, gave the go-ahead to a recent amateurish attempt to link up Indo-European, Uralic, Eskimo-Aleut, Dravidian, and other Eurasian language families into one macro-family.  The work in the paper was awful, but no one seems to care.  These people seem to value having a cool result over an accurate, realistic, and sensible one.

The fact is, ten thousand years seems to be long enough to obliterate connections between languages.  This isn't surprising, given the nature of language change, but apparently it is surprising to some people, those who want to establish connections over vast spans of time.  Look at the Americas: the languages of the Americas show exceptional diversity, and within Amazonia alone there are dozens of language isolates and minor families with no demonstrable relationship to any others.  Assuming they come from one or two - or even five, or six, or ten - original languages that entered through Beringia between 15,000 and 12,500 years ago, such a span of time is clearly sufficient to eradicate detectable genetic connections between languages.

This is something to be borne in mind when looking at these papers.  I've seen some endorsements - on blogs and in comments, but by academics and researchers - of the idea that there ought to be one common language family outside of Africa that descends from the common language of the population that first left Africa tens of thousands of years ago.  But think about that for a minute and it becomes completely absurd.

First, even if it were true, it can only ever be an assumption, because unless you have some written records from 60,000 years ago, it is in principle impossible to reconstruct or establish this language using existing languages.  Sixty thousand years would almost certainly be long enough to eradicate any morphological traces - not just words or phonemes, but the structure and nature of the language.  Was it agglutinating?  Did it use consonantal roots and interwoven vowels like Afroasiatic, or did it rely on vowel changes such as ablaut to establish tense and number like Indo-European languages?  We can never have an answer to this, even in principle.

Secondly, and perhaps more importantly, the first evidence for people leaving Africa comes from about 125,000 years ago, and there is better established evidence from around 60,000 years ago, implying that people left Africa between 125,000 and 60,000 years ago.  That's a 65,000-year gap.  If ten thousand years, or fifteen thousand, can obliterate traces of connections between languages, what could 65,000 do?  I hardly think we'd be looking at a single migrating population here, and even if there were an initial migration that took its merry time working its way through the Levant and Arabia, the extreme time-depth is still long enough to have turned this one language into many hundreds or thousands before its speakers started to penetrate India or central Asia.  One language?

One language my aching arse.

Oh, and third, assuming you need a third reason to reject this bunk, there seem to have been loads of languages and language families that have disappeared from the earth in the last 60,000 years.  Almost all the pre-Indo-European languages of Europe seem to have disappeared from the map; Basque survives, and Etruscan is at least attested in inscriptions.  Lots of non-Indo-European and non-Dravidian languages must have died in South Asia, and likewise non-Sino-Tibetan languages must have disappeared in China and Tibet.  The languages of the people associated with Hoabinhian tools in southeast Asia also surely went the way of the dodo.  Without the evidence of all these lost languages, who is to say we'd have anything like an accurate reconstruction?

Look at the Indo-European example from the now-published post on fusion.  The Celtic and Hellenic branches both have [b] where proto-Indo-European had [gʷ], a reconstruction based on evidence from other branches, including Indo-Aryan [g] and Germanic [k].  Only [gʷ] makes sense of all of these sound changes, but if some of the branches were missing, we'd be tempted to reconstruct [b], [k], or [g] for proto-Indo-European, or for any other hypothetical macro-protolanguage - or, alternatively, we wouldn't have the Indo-European language family to reconstruct at all.  If we only had Celtic and Hellenic languages to work with - if, say, western Eurasia had been wholly divided between speakers of languages descended from proto-Celtic and proto-Greek, with all the other sub-families dying off before being written down - we'd reconstruct [bows] or something similar for proto-Indo-European.  And here's the thing about languages like Burushaski and Elamite, languages without documented or living relatives: they must have had similar histories of change that have been obscured by the absence of relatives against which to compare them.
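
For anyone who likes to see this sort of reasoning spelled out mechanically, here is a rough sketch in Python.  It is a toy, not the comparative method: the table of which proto-sound can turn into which reflex is hand-picked by me to fit the four correspondences above, and the function names are made up for the illustration.  All it is meant to show is that the same procedure gives a different answer once the evidence from some branches is gone.

```python
# Toy table: which reflexes each candidate proto-sound is allowed to yield.
# ASSUMPTION: hand-picked for this one example; not a real model of sound change.
REACHABLE = {
    "gʷ": {"gʷ", "b", "g", "k"},
    "b": {"b", "p"},
    "g": {"g", "k"},
    "k": {"k"},
}


def consistent_candidates(reflexes):
    """Return the candidate proto-sounds that can account for every attested reflex."""
    attested = set(reflexes.values())
    return sorted(c for c, outcomes in REACHABLE.items() if attested <= outcomes)


def simplest_candidate(reflexes):
    """Among the consistent candidates, pick the one that needs the fewest changes."""
    def changes_needed(candidate):
        return sum(1 for sound in reflexes.values() if sound != candidate)
    return min(consistent_candidates(reflexes), key=changes_needed)


# The four correspondences mentioned in the post.
all_branches = {"Celtic": "b", "Hellenic": "b", "Indo-Aryan": "g", "Germanic": "k"}
# The hypothetical world where only Celtic and Hellenic were ever written down.
survivors_only = {"Celtic": "b", "Hellenic": "b"}

print(consistent_candidates(all_branches))   # only the labiovelar fits all four
print(simplest_candidate(survivors_only))    # with two branches, plain b wins
```

With all four branches in play, the labiovelar is the only candidate left standing.  Take away Indo-Aryan and Germanic and the laziest hypothesis, a plain [b] that never changed, fits the data perfectly, and nothing in the surviving evidence tells you it is wrong.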

Trying to extend the analysis of language to cover time-depths of 10,000 or more years is like using radiocarbon to establish continental drift.  That's not what it's designed for, and it really couldn't, even in principle, do that.  Historical linguistics has to work with what we've got, which is a few thousand existing languages, a bunch of historically documented ones, and a reliable, realistic understanding of the human vocal apparatus.

It cannot do what these people want it to do.  And that's fine, because, like radiocarbon, the evidence from historical linguistics can tell us fantastic, fascinating, amazing, incredible things.  That ought to be enough.
