Linguistic aspects of the Aryan non-invasion theory

Dr. Koenraad ELST



4. Linguistic paleontology

4.1. Hot and cold climate

One of the main reasons for 19th‑century philologists to exclude India as a candidate for Urheimat status were the findings of a fledgling new method called linguistic paleontology. The idea was that from the reconstructed vocabulary, one could deduce which flora, fauna and artefacts were familiar to the speakers of the proto‑language, hence also their geographical area of habitation. The presence in the common vocabulary of words denoting northern animals like the bear, wolf, elk, otter and beaver seemed to indicate a northern Urheimat; likewise, the absence of terms for the lion or elephant seemed to exclude tropical countries like India.

It should be realized that virtually all IE‑speaking areas are familiar with the cold climate and its concomitant flora and fauna. Even in hot countries, the mountainous areas provide islands of cold climate, e.g. the foothills of the Himalaya have pine trees rather than palm trees, apples rather than mangoes. Indians are therefore quite familiar with a range of flora and fauna usually associated with the north, including bears (Sanskrit rksha, cfr. Greek arktos), otters (udra, Hindi ûd/ûdbilâw) and wolves (vrka). Elks and beavers do not live in India, yet the words exist, albeit with a different but related meaning: rsha means a male antelope, babhru ("brownie") a mongoose. The shift of meaning may have taken place in either direction: it is perfectly possible that emigrants from India transferred their term for "mongoose" to the first beavers which they encountered in Russia.

4.2. Early Vedic flora terms

When the Hittites settled in Anatolia, they found an advanced civilization and adopted numerous lexical and grammatical elements from it. By contrast: "It was different with the Indo-Aryan tribes arriving in India: with the Harappan civilization probably already in decline, they could very well preserve the full range of their traditions including their remarkably archaic language. The influence of non-Indo-European languages is just beginning to be visible (e.g. the retroflex series). The Aryan ideology of 'hospitality' and 'truth' is very vivid, as in Ancient Iran". (Zimmer 1990/1:151)

The same conservation of IE heritage seems to be in evidence in their vocabulary. As we saw, Austro-Asiatic is plausibly argued to have contributed many words to IA, yet only little in the semantic range where substratum influence is usually the largest, viz. indigenous flora. In that field, the early Vedic vocabulary has been screened for linguistic origins by Jean Haudry (2000:148), who argues that the foreign origin of IA is indicated not just by non-IE etymologies but also by artificial IA coinages based on IE vocabulary. Admittedly, a few are simply IE:

* bhûrja, birch;

* parkatî, ficus infectoria, cfr. Latin quercus, "oak";

* dâru, "wood", cfr. Gk. doru, Eng. tree;

* pitu-dâru, a type of pine, cfr. Lat. pituita, a type of pine.

A few (but here, Haudry is apparently not trying for exhaustiveness) are loans:

* shimshapa, dalbergia sissoo ("from a West-Asian language");

* pîlu, an unspecified tree ("probable loan from Dravidian").

But all the others are Indian coinages on an IE base:

* nyagrodha, ficus indica, "downward-growing";

* ashvattha, ficus religiosa, "horse food" (at least "probably", according to Haudry);

* vikankata, flacourtia sapida, "stinging in all directions";

* shamî, prosopis spicigera, "hornless";

* ashvaghna, nerium odoratum, "horse-slayer";

* târshtâghna, an unidentified tree, "evil-killer";

* spandana, an unidentified tree, "trembler";

* dhava, grislea tomentosa or anogeissus latifolia, "trembler";

* parna, butea frondosa, "feather", hence "leaf" (metonymic);

* svadhiti, an unidentified tree, "hatchet" (metaphoric).

It is of course remarkable that they didn't borrow more terms from the natives, as if they had invaded an uninhabited country and had to invent names from scratch. But the main question here is: does "artificial coinage" indicate that the referents of these words were new to the Indo-Aryans? It would seem that, on the contrary, artificial coinage pervades the whole IE vocabulary.

It is true that new phenomena are often indicated with such descriptive terms, e.g. French chemin de fer ("iron road") for "railway"; or in classical Sanskrit, loha, "the red one", to designate the then new metal "iron". Yet, far from being confined to new inventions, "artificial" coinage is the typical PIE procedure for creating names for natural species. Thus, PIE *bheros, "brown", has yielded the animal names Skt. bhâlu, Eng. bear, and with reduplication, *bhebhrus, Eng. beaver (perhaps also Gk. phrunos, "toad"), all meaning simply "the brown one". Similarly, *kasnos, Eng. hare, Skt. shasha, means "the grey one"; *udros, Eng. otter (whence also Gk. hydra, water-snake) means "the water-animal" (a general meaning which it has at least partly preserved in Skt. udra); lynx is from *leukh, "to be bright"; frog from *phreu-, "to jump". The deer, Lat. cervus, Dutch hert, is "the horned one", cfr. Lat. cornu, "horn". The bear, Slavic medv-ed, Skt. madhv-ad, is also the "honey-eater". Some of the said animals known by descriptive terms are inhabitants of the northern zone; following the AIT argument that such coinages indicate immigration, we would have to conclude that the Urheimat definitely did not know otters, beavers, bears, hares and lynxes.

The Indo-Europeans certainly knew the species homo, and had no need to be told about its existence by natives of some invaded country. Well, Latin homo/hom-in-is is an artificial derivative of hum-u-s, "soil", hence "earth-dweller" (cfr. Hebrew adam, "man", and adamah, "earth"), as opposed to the heaven-dwellers or gods, which gives us a glimpse of the philosophy of the PIE-speaking people. The Iranian-Armenian term for this species, mard, is another philosophical circumlocution, "mortal". The Sanskrit term manuSa, and possibly even puruSa, is a patronym: "descendant of Manu" and "descendant of Puru" (cfr. Urdu âdmî, "man", i.e. "son of Adam"), with Manu itself apparently derived from *man-, "mind". Not one of these is a truly simple term; all are artificially coined from more elementary semantic matter.

In so basic a vocabulary as the numerals, we encounter artificial coinages: PIE *oktou is a dual form meaning "twice four" (cfr. Avestan ashti, "four fingers", and perhaps Kartvelian *otxo, "four"); *kmtom, "hundred", is a derivative of *dkm, "ten", through *dkmtom. And there are connections between the numerals and the real world: five is related to finger; nine is related to new; *dkm, "ten" is related to in-dek-s, "pointing finger", Greek deik-numi, "to point out", etc.

Likewise for family terminology. The daughter, according to a popular etymology, is the "milkmaid", cfr. Skt. dugdha, "milk" (though the semantic connection could also be through "suckling" > "child"). The Roman children, liberi, were the "free ones", as opposed to the serf section of the extended household (cfr. conversely Persian: â-jâta, "born unto", "own progeny" > âzâd, "free"). Even the word *pa-ter, "father", literally "protec-tor" is a more artificial construction than, e.g., Gothic atta (best known through its diminutive Attila), a primitive term present in very divergent languages (as in the pater patriae epithets Ata-türk and Keny-atta).

Such descriptive formations are common in IE, but Sanskrit is often the only IE language in which the descriptive origin of words is still visible, which may indicate its high age. In all other IE languages, "wolf" is exclusively the name of this animal; in Sanskrit, vRka is still part of a continuum with the verb vRk, "to tear" (likewise for pRdâku, below). Very primitive is the fact that no suffix is used, as in the "tear-er", but the root itself is both verbal root and nominal stem.

Not the descriptive term, but rather the etymologically isolated term which only appears in the lexicon to designate a species, is an indicator of the newness and strangeness of the species to the speakers of the language concerned, because it would probably be borrowed by newcomers from the natives of the habitat of the species. Thus, tomato has no descriptive value and no etymological relatives in the IE languages, because it was borrowed wholesale as the name of this vegetable from the Amerindian natives of the tomato-growing regions. That Sanskrit matsya, "fish", is derivable from an IE root mad, "wet", while Greek ichthys and Italo-Germanic piscis/fish have no PIE etymology, indicates a substratum influence on the European branches of IE, not on the Indian one. Proposals of a link between ichthys and Greek chthôn, PIE *dhghom, "earth" (hence "nether world", including the submarine sphere?) are doubtful, and even if valid, they would only confirm our finding that description, i.c. of the fish as a "netherworlder", is a common formula for coining words in IE.

In Haudry's own list of simple inherited IE words, bhûrja is an artificial coinage, meaning "the bright one", the birch being exceptional in colour, viz. white. The same may hold for parkatî/quercus, which seems related to the word for "lightning", cfr. the personified lightning-god, Baltic Perkunas, Skt. Parjanya. It is certainly true of a general Vedic word for "tree", which Haudry also mentions: vanaspati, "lord of the jungle". Such metaphoric circumlocution is what the Nordic poets called a kenning, and it is omnipresent not only in the earliest known IE poetry traditions, but even in the formation of IE words themselves.

4.3. The linguistic horse

The word *ekw-o-s, "horse", is a later formation in PIE. The oldest vocabulary had athematic stems (e.g. Latin lex from leg-s); the thematic stems (e.g. Latin corv-u-s, "raven") belong to a later generation of PIE words. Simple roots are older than roots which have been lengthened with an extra (mostly gender-specific) vowel -a or -o; the development of the latter category, with its own declension, had also been completed before the disintegration of PIE. To take two momentous inventions, the IE words for "fire" (*egnis, *pûr, *âter) belong to the older category, while the words for "wheel" (*rot-o-s, *kwekwl-o-s) belong to the younger type, which indicates that the wheel was newly invented or newly adopted from neighbouring peoples by the Proto-Indo-Europeans, whereas the use of fire was already an ancient heritage.

Coming to livestock: *gwou-s, "cow", and *su-s, "pig" (whence the diminutive *su-in-o, "swine") belong to the older category, while *ekw-o-s, "horse", belongs to the younger category. Some scholars deduce from this that the pig and the cow were domesticated earlier than the horse, which happens to tally with the archaeological data. But it might just as well be interpreted as an indication that the horse was not only not domesticated by the earliest Proto-Indo-Europeans, but was simply not known to them; after all, the inhabitants of the areas where horses were available for domestication, must have known the horse since much earlier, as a wild animal on a par with the wolf and the deer. We shouldn't give too much weight to this, but it seems that if the term for "horse" is a younger formation, this might indicate that the horse was not native to the Urheimat, and that the Proto-Indo-Europeans only got acquainted with it (as with the wheel) shortly before their dispersal. In that case, India (if as horseless as usually claimed) was a better candidate for Urheimat status than the horse-rich steppes.

This cannot be taken as more than a small indication that the horse was not part of the scenery in the PIE homeland. There are many newer-type formations for age-old items, e.g. the species lup-u-s, "wolf", was most certainly known to the first PIE-speakers, whatever their homeland. But in the present case, another argument for the late origin of *ekw-o-s has been added (by Lehmann 1967:247), viz. its somewhat irregular development in the different branches of IE, e.g. the appearance from nowhere of the aspiration in Greek hippos.

The only convincing attempt to give *ekwos roots in the basic PIE vocabulary, is through the Greek word ôkus, from *oku/eku, "fast", interpreting the name of the horse as "the fast one". Another cognate word, mentioned by Lehmann (1967:247), could be Balto-Slavic *ashu, "sharp". If this is so, those who see artificial coinage of Indian tree names in Sanskrit as proof of the speakers' unfamiliarity with the trees in question, should also deduce that this artificial coinage indicates the foreignness of the horse to the original PIE-speakers in their Urheimat. Conversely, if the irregularities in the various evolutes of *ekwos are taken to indicate that it "was borrowed, possibly even independently in some of the dialects" (Lehmann 1967:247), this would again confirm that the horse was a newcomer in the expanding PIE horizon. Because that PIE horizon started expanding from horseless India?

If this is not really a compelling argument, at least the converse is even more true: any clinching linguistic evidence for a horse-friendly Urheimat is missing. We should now reckon with the possibility that the Proto-Indo-Europeans only familiarized themselves with the horse towards the time of their dispersion. A possible scenario: during some political or economic crisis, adventurers from overpopulated India, speaking PIE dialects, settled in Central Asia where they acquainted themselves with the horse. More than the local natives, they were experienced at domesticating animals (even the elephant, judging from RV 9:47:3 which mentions an elephant decorated for a pageant), and they domesticated the horse. While communicating some specimens back to the homeland, they used the new skill to speed up their expansion westward, where their dialects became the European branches of the IE family. The horse became the prized import for the Indian elite, which at once explains both its rarity in the bone record and its exaltation in the Vedic literary record.

The terms for cart and the parts of a cart (wheel, axle) famously belong to the common PIE vocabulary, giving linguistic-paleontological support to the image of the PIE-speaking pioneers leaving their homeland in ox-drawn carts and trekking to their Far West. This cart was also known in Harappa. But unlike the wheel and its parts, the spoked wheel seems to be a later invention, at least according to the same criterion: felloe and spoke are not represented in the common PIE lexicon. The fast horse-drawn chariot with spoked wheels was a post-PIE innovation; its oldest available specimen was reportedly found in Sintashta in the eastern Urals and dated to ca. the turn of the 2nd millennium BC, synchronous with the declining years of Harappa. It remains possible that the 99% of non-excavated Harappan sites will also yield some specimens, but so far no Harappan chariots have been found; nor has any identifiably Vedic chariot, for that matter.

Yet, the Rg-Veda does mention chariots, though not everywhere and all the time. If the internal chronology of the Rg-Veda developed by Shrikant Talageri (2000) is approximately right (and much of it is uncontroversial, e.g. putting books 8 to 10 later than the "family books", 2 to 7), we can discern an early Vedic period in which the spoked wheel was unknown, or at least unmentioned, and a later period when it was very much present in the Vedic region and culture. In that case, the Vedic Aryans had lived in India well before the chariot was imported there (if not locally invented, so far unattested but not unlikely given Harappa's edge in technology). This implies that they either invaded India in an earlier period, without the aid of the horse-drawn spoked-wheel chariot, the tank or Panzer of antiquity; or that they were native to India.

4.4. Positive evidence from linguistic paleontology

In assessing the linguistic-paleontological evidence, it has been shown above that the fauna terms provide no proof for a northern Urheimat. Thomas Gamkrelidze and Vyaceslav Ivanov (1995:420-431 and 442-444), in their bid to prove their Anatolian Urheimat theory, have gone a step further and tried to find positive proof, viz. terms for hot‑climate fauna in the common IE vocabulary.

Thus, they relate Sanskrit pRdâku with Greek pardos and Hittite parsana, all meaning "leopard", an IE term lost in some northern regions devoid of leopards (note that the meaning in Sanskrit is still transparent, viz. "the spotted one", and that this description is also applied to the snake, while a derivative of the same root, pRshati, means "spotted deer"). The word lion is found as a native word, in regular phonetic correspondence, in Greek, Italic, Germanic and Hittite, and with a vaguer meaning "beast" in Slavic and Tocharian. It could be a Central Asian acquisition of the IE tribes on their way from India; alternatively, it is not unreasonable to give it deeper roots in IE by linking it with a verb *reu-, Skt. rav‑, "howl, roar", considering that alternation r/l is common in Sanskrit (e.g. the double form plavaga/pravaga, "monkey", or the noun plava, "frog" related to the verb pravate, "jump").

A word for "monkey" is common to Greek (kepos) and Sanskrit (kapi), and Gamkrelidze and Ivanov argue for its connection with the Germanic and Celtic word "ape". For "elephant", they even found two distinct IE words (1995:443): Sanskrit ibha, "male elephant", corresponding to Latin ebur, "ivory, elephant"; and Greek elephant‑ corresponding to Gothic ulbandus, Tocharian *alpi, "camel". In the second case, the "camel" meaning may be the original one, if we assume a migration through camel‑rich Central Asia to Greece, where trade contacts with Egypt made the elephant known once more; the word may be a derivative from a word meaning "deer", Greek elaphos. In the case of ibha/ebur, however (which Gamkrelidze and Ivanov connect with Hebrew shen-habbim, "tusk-of-elephant", "ivory"), we have a straightforward linguistic‑paleontological argument for an Urheimat with elephants.

To be sure, linguistic paleontology is no longer in fashion: "The long dispute about the reliability of this 'linguistic paleontology' is not yet finished, but approaching its inevitable end -- with a negative result, of course." (Zimmer 1990/1:142) Yet, to the extent that it does retain some validity, it no longer militates against the OIT, and even provides some modest support to it.

5. Exchanges with other language families

5.1. Souvenirs of language contacts

One of the best keys to the geographical itinerary of a language is the exchange of lexical and other elements with other languages. Two types of language contact should be distinguished. The first type of language contact is the exchange of vocabulary and other linguistic traits, whether by long‑distance trade contact, by contiguity or by substratum influence, between languages which are not necessarily otherwise related.

Perhaps more than by proven contact, a language can also be "rooted" in a region by a second type of "contact", viz. genetic kinship with a local language. To be sure, just like languages with which contacts were established, cognate languages may have moved, and their place of origin overwhelmed by newcomers. Still, in the present discussion it would count as a weighty argument if it could be shown that IE was genetically related to a West-European or an East-Asian language. This would "pull" the likely Homeland in a westerly or easterly direction. In Europe, the kinship would have to be with Basque, but this remains a language isolate, so this solid proof for a westerly homeland is missing. How about the Asian connection?

5.2. Sumerian and Semitic

If we discount coincidence, a few look-alikes between Sumerian and PIE may be assumed to be due to contact, though in the first millennium of writing, "Indo-European is not documented in the earliest Mesopotamian records". (Anthony 1991:197, contra the Anatolian homeland theory of Renfrew 1987 et al.) Also, these look-alikes are so few and phonetically so elementary that sheer coincidence might really be sufficient explanation. To borrow some examples from Gamkrelidze and Ivanov (1995), Sumerian agar, "irrigated territory", may be related to PIE *agr-o (Latin ager, Skt. ajraH), and may have been borrowed in either direction. Sumerian tur, "yard, enclosure for cattle", could be related with PIE *dhwer, Greek thyra, English door. Sumerian ngud/gud/gu, "bull, cow" (cfr. Skt. go, English cow), should be seen together with the Egyptian word ng3w, "a type of bull"; the latter type of semantic relation ("bull" to "type of bull"), narrowing down from the general to the particular, is often indicative of borrowing (cfr. from French chauffeur, "a driver" to English chauffeur, "the driver in your employ", or below, from PIE *Hster- "star", to Akkadian Ishtar, "planet Venus"): Egyptian borrowed from Sumerian, which in turn borrowed from IE. Sumerian kapazum, "cotton", already mentioned, may be from Austro-Asiatic *kapas as well as from Skt. karpâsa.

Kinship of Sumerian with IE is practically excluded (though there are vague indications of Sumerian-Munda kinship, fitting into the theory of the migration of both Sumerian and Austro-Asiatic from "Sundaland", the Sunda shelf to the south and east of Vietnam, when it was submerged by the post-Glacial rising tide in ca. 6000 BC, cfr. Oppenheimer 1998), because there is just no above-coincidence similarity in phonetic or grammatical features. If some words are related, it must be due to borrowing in the context of trade relations. The main geographical candidates for PIE regions trading with Sumer would be Anatolia and the Indus basin. Then again, being the main language of civilization in ca. 3000 BC, some Sumerian terms may have been used in long-distance trade with the Pontic area, the more conventional Urheimat candidate. Note however that the trade links between Sumer and the Harappan civilization ("Meluhha" in Mesopotamian texts) are well-attested, along with the presence of Indo-Aryans in Mesopotamia, e.g. the names Arisena and Somasena in a tablet from Akkad dated to ca. 2200 BC (Sharma 1995:36 w.ref. to Harmatta 1992:374). It doesn't follow that these Indo-Aryans in Mesopotamia originated in the Indus Valley, but it is not excluded either.

Far more important is the linguistic relation between IE and Semitic (and indirectly also the Chadic, Kushitic and Hamitic branches of the Afro‑Asiatic family, assumed to be the result of a pre‑4th‑millennium migration of early agriculturists from West Asia into North Africa). Semitic has frequently been suspected of kinship with IE, even by scholars skeptical of "Nostratic" mega-connections. Most remarkable are the common fundamental grammatical traits: Semitic, like IE, has grammatically functional vowel changes, grammatical gender, three numbers (singular, plural and a vestigial dual), declension, and conjugational categories including participles and medial and passive modes. Many of these grammatical elements are shared only by Afro-Asiatic and IE, setting them off as a pair against all other language families. The two also share most of their range of phonemes, even more so if we assume PIE laryngeals to match Semitic aleph, he and 'ayn, and if we take into account that the fricatives seemingly so typical of Semitic are often evolutes of stops (e.g. modern Hebrew Avraham from Abraham, thus transliterated in the Septuagint), just like Persian or Germanic developed fricatives from PIE stops (e.g. hafta c.q. seven from *septM). Moreover, if we count PIE laryngeals as consonants, two-consonant IE roots come closer to the typical three-consonant shape of Semitic roots.

One way to imagine how Semitic and IE went their separate ways has been offered by Bernard Sergent (1995:398 and 432), who is strongly convinced of the two families' common origin. He combines the linguistic evidence with archaeological and anthropological indications that the (supposedly PIE-speaking) Kurgan people in the North-Caspian area of ca. 4000 BC came from the southeast, a finding which might otherwise be cited in support of their ultimate Indian origin. Thus, the Kurgan people's typical grain was millet, not the rye and wheat cultivated by the Old Europeans, and in ca. 5000 BC, millet had been cultivated in what is now Turkmenistan (it apparently originates in China), particularly in the mesolithic culture of Jebel. From there on, the archaeological traces become really tenuous, but Sergent claims to discern a link with the Zarzian culture of Kurdistan, 10,000 to 8500 BC. In short, he suggests that the Kurgan people had come along the eastern coast of the Caspian Sea, not from the southeast (India) but the southwest, near Mesopotamia, where PIE may have had a common homeland with Semitic.

However, those who interpret the archaeological data concerning the genesis of agriculture in the Indus site of Mehrgarh as being the effect of a diffusion from West Asia, may well interpret a possible kinship of IE with Semitic as proving their own point: along with its material culture, Mehrgarh's language may have been an offshoot of a metropolitan model, viz. a Proto‑Semitic-speaking culture in West Asia. This would mean that the Indus area was indeed the homeland of the original PIE, but that in the preceding millennia, PIE had been created by the interaction of Proto‑Semitic-speaking colonists from West Asia with locals.

A less heady theory holds that there is no genetic kinship between IE and Semitic, the lexical connection being too meagre, and that there has only been some contact. Oft-quoted is the seeming Semitic origin of the numerals 6 and 7 (Hebrew shisha, shiva, Arabic sitta, sab'a), conceivably borrowed at the time when counting was extended beyond the fingers of a single hand for the first time. Contact with Akkadian and even Proto‑Semitic is attested by a good handful of words, esp. some terms for utensils and animals. Examples of borrowing in the opposite direction, from PIE/IE to Semitic, include Semitic *qarn, "horn" (e.g. Hebrew qeren), from PIE khr-n (cfr. Latin cornu, Sanskrit shrnga), derived within PIE from a root kher-, "top, head" (Greek kar); and the well-known Semitic names of the planet Venus, Ishtar/'Ashtoret/'Ashtarte, from PIE *Hster-, "star" (with Semitic feminine suffix -t), derived within PIE from the root *as, "to burn, glow". Some scholars also claim that Akkadian sisi, Hebrew sus, "horse", is derived from Indo-Iranian ashwa.

Some terms are in common only with the Western IE languages, e.g. Semitic gedi corresponding to IE *ghed-, still recognizable in Latin haedus, English goat; IE *taur-o-s, "bull", Semitic *Taur-, from Proto-Semitic cu-r-; and IE woi-no-/wei-no, "wine", West-Semitic uain-, Hebrew yayin. In this case we should reckon with a common origin in a third language, possibly the Balkanic Old-European culture or its last stronghold, Minoic Crete. The transformation of demonstratives into the definite article in most Western IE languages has also been related, vaguely and implausibly, to Semitic influence. However, this testimony is a bit too slender for concluding that the Western Indo-Europeans had come from the East and encountered the Semites or at least Semitic influence on their way to the West. And the word *peleku, "axe", apparently related to Semitic (Arabic) falaqa, "to split", is only attested in the Eastern Greek-Armenian-Aryan subgroup of PIE, possibly a later loan to that group in its homeland after the northwestern branches had left it.

The general fact of IE-Semitic contact, as in the case of Sumerian, dimly favours an IE homeland with known trade relations with the Middle East, esp. Anatolia or India, over a Russian or European one. But given the very early civilizational lead of Semitic-speaking centres (e.g. Ebla, Syria, 3200 BC), the effect of truly long-distance trade to northern backwaters cannot be discounted. So, the evidence of Semitic or Sumerian contacts does not clinch the issue, though it certainly does not preclude an Indian homeland for PIE.

More promising though far more complicated is the analysis by Nichols (1997) of the transmission of loans in and around Mesopotamia, also taking the three Caucasian families into account. Of the latter, the two northern ones show little lexical exchange with IE, which pleads against a Pontic homeland. On the basis of these "loanword trajectories" through different languages, esp. of Mesopotamian cultural terms including those discussed above, Nichols (1997:127) finds that in the 4th millennium BC, "Abkhaz-Circassian and Nakh-Daghestanian are in approximately their modern locations, and Kartvelian and IE are to the east". More precisely, Kartvelian is "likely to have emanated from somewhere to the south-east of the Caspian" while the "locus of IE was farther east and farther north" (1997:128), which can only be Bactria.

Whether Bactria was the homeland in its own right or merely a launching-pad for Indians trekking west remains to be seen. But if Nichols' findings, as yet based on a limited corpus of data, could be corroborated further, it would generally help the OIT.

5.3. Uralic

A case of contact on a rather large scale which is taken to provide crucial information on the Urheimat question, is that between early IE and Uralic. It was a one‑way traffic, imparting some Tocharian, dozens of Iranian and also a few seemingly Indo‑Aryan terms to either Proto‑Uralic or Proto‑Finno‑Ugric (i.e. mainstream Uralic after Samoyedic split off). Among the loans from Indo-Iranian or Indo-Aryan, we note sapta, "seven, week", asura, "lord", sasar, "sister", shata, "hundred". (Rédei 1988) The Iranian influence is uncontroversial and easily compatible with any IE Urheimat scenario because for long centuries Iranian covered the area from Xinjiang to Eastern Europe, occupying the whole southern frontier of the Uralic speech area. But how do the seemingly Indo‑Aryan words fit in?

At first sight, their presence would seem to confirm the European Urheimat theory: on their way from Europe, the Indo‑Iranian tribes encountered the Uralic people in the Ural region and imparted some vocabulary to them. This would even remain possible if, as leading scholars of Uralic suggest, the Uralic languages themselves came from farther east, from the Irtysh river and Balkhash lake area.

The question of the Uralic homeland obviously has consequences. Karoly Rédei (1988:641) reports on the work of a fellow Hungarian scholar, Peter Hajdu (1950s and 60s): "According to Hajdu, the Uralic Urheimat may have been in western Siberia. The defect of this theory is that it gives no explanation for the chronological and geographical conditions of its contacts between Uralians (Finno-Ugrians) and Indo-Europeans (Proto-Aryans)." Not at all: Hajdu's theory explains nicely how these contacts may have taken place in Central Asia rather than in Eastern Europe, and with Indo-Iranian rather than with the Western branches of IE. V.V. Napolskikh (1993) has supported the Irtysh/Balkhash homeland theory of Uralic with different types of evidence from that given by Hajdu, and now that the genetic aspect of population movements is being revaluated (Cavalli-Sforza 1996), the Asiatic physical features of isolated Uralic populations like the Samoyeds could also be included as pointers to an easterly homeland.

In that case, three explanations are equally sustainable. One rather facile scenario is the effect of long-distance trade between an Indian metropolis and the northerly backwaters, somewhat like the entry of Arabic and Persian words in distant European languages during the Middle Ages (e.g. tariff, cheque, bazar, douane, chess). More interesting is the possibility that these words were imparted to Uralic by IA‑speaking emigrants from India.

One occasion for mass emigration, which the OIT sees as a carrier of IA languages, was the catastrophe which led to the abandonment of the Harappan cities in about 2000 BC. This must have triggered migrations in all directions: to Maharashtra, to India's interior and east, to West Asia by Mitannic true Indo-Aryans as well as by the "sanskritized" non-IA Kassites. (I disagree with R.S. Sharma 1995:36 that Mesopotamian inscriptions from the 16th century BC "show that the Kassites spoke the Indo-European language", though they do mention the Vedic gods "Suryash" and "Marutash"; for samples of the non-IE Kassite lexicon, vide Van Soldt 1998.) And so, one of these groups went to the Pontic region. Along the way, some members ended up in a Uralic-speaking environment, imparting a bit of IA terminology but getting assimilated over time, just like their Mitannic cousins. The Uralic term orya, "slave", from ârya, may indicate that their position was not as dignified as that of the Mitannic horse trainers.

A third possibility is that the linguistic exchange which imparted Sanskrit-looking words to Uralic took place at a much earlier stage, that of Indo-Iranian, i.e. before the development of typical iranianisms such as the softening of [s] to [h]. Even the stage before Indo-Iranian unity, viz. when Indo-Iranian had not yet replaced the PIE kentum forms with its own satem forms and the PIE vowels a/e/o with its own uni-vowel a, may already have witnessed some lexical exchanges with Uralic. As Asko Parpola (1995:355) has pointed out, among the IE loans in Uralic, we find a few terms in kentum form which are exclusively attested in the Indo-Iranian branch of IE, e.g. Finnish kehrä, "spindle", from PIE *kettra, attested in Sanskrit as cattra. While it is of course also possible that words like *kettra once did exist in branches other than Indo-Iranian but disappeared in the intervening period, what evidence we have points to pre-satem Indo-Iranian.

The continuous IE-Uralic contact may stretch back even further: to the stage of PIE. Thus, there is nothing pointing to any specifics of the Indo-Iranian branch in the Uralic loans *nime, "name", *wete, "water", or *wige, "to transport" (cfr. Latin vehere): these may have been borrowed from united PIE or from other proto-branches, e.g. from proto-Germanic. Even more intimate common items concern the pronominal system, e.g. *m- marking the first and *t- the second person singular, *t- the demonstrative, *ku/kw the interrogative.

And the process of borrowing stretches back even farther than that: to the stage of laryngeal PIE. No fewer than 27 Uralic loans from IE have been identified where original PIE laryngeals are in evidence, mostly adapted as [k], e.g. Finnish kulke, "go, walk" (Koivulehto 1991:46) from PIE kwelH-, whence Skt. carati, "goes, walks". Sometimes the resulting sound is [sh], in most cases weakened later on to [h], e.g. Finnish puhdas, "clean", from PIE pewH-, "to clean", with perfect participle suffix -t-, cfr. Skt. pûta, "cleaned" (Koivulehto 1991:93). If this is correct, PIE and proto-Uralic came into contact even before they fragmented, i.e. in their respective homelands. In that case, PIE cannot have been located far from Central Asia, and Northwest India would probably fit the bill, especially if the Uralians ultimately arrived in the Ob-Irtysh basin from a more southerly region such as Sogdia.

A third partner in this relationship must also be taken into account, though its connection with Uralic looks older and deeper than that of PIE: Dravidian. Witzel (1999/1:349) acknowledges the "linguistic connections of Dravidian with Uralic". Both are families of agglutinative languages with flexive tendencies, abhorring consonant clusters and favouring the stress on the first syllable. Sergent (1997:65-72) maps out their relationship in some detail, again pointing to the northwest outside India as the origin of Dravidian. We may ignore Sergent's theory of an African origin of Dravidian for now, and limit our attention to his less eccentric position that a Proto-Dravidian group at one point ended up in Central Asia, there to leave substratum traces discernible even in the IE immigrant language Tocharian. The most successful lineage of Dravidians outside India was the one which mixed its language with some Palaeo-Siberian tongue, yielding the Uralic language family.

Looking around for a plausible location for this development, we find that Siberia may have been a peripheral part where the resulting language could survive best in relative isolation, but that its origins may have been in a more crossroads-like region such as Bactria-Sogdia. The Dravidians moved south to Baluchistan and then east into Sindh and Gujarat (avoiding confrontation with the Proto-Indo-Europeans in Panjab), while the Uralians moved north, and those who stayed behind were absorbed later into the expanding PIE community. The interaction of the three may perhaps be illustrated by the word *kota/koTa, "tent, house" in Uralic and in Dravidian, and also in Sanskrit and Avestan but not in any other branch of IE: perhaps Dravidian gave it to Uralic as a birth gift, and later imparted it to those IE languages it could still reach when in India. If this part of the evidence leaves it as conjectural that India was the habitat of the Proto-Indo-Europeans, it does at least argue strongly for some Central-Asian population centre, most likely Bactria-Sogdia, as the meeting-place of Proto-Uralic, Proto-Dravidian and PIE, before IE and Uralic would start their duet of continuous (one-way) linguistic interaction on their parallel migrations westward.

5.4. Sino-Tibetan

To prove an Asian homeland for IE, it is not good enough to diminish the connections between IE and more westerly language families. To anchor IE in Asia, the strongest argument would be genetic kinship with an East-Asian language family. However, in the case of Sino-Tibetan, all we have is loans, early but apparently not PIE. The early dictionaries suggested a connection between Tibetan lama, written and originally pronounced as blama, and Sanskrit brahma (S.C. Das 1902:900); blama is derived from bla-, "upper, high" (as in (b)La-dakh, "high mountain-pass"), and doesn't Sanskrit bRh-, root of brahma, mean "to grow", i.e. "to become high", close enough to the meaning of Tibetan bla-? But more such look-alikes to build a case for profound kinship were never found.

On the other hand, early contact between members of the two families is well-attested, though not in India. A well‑known set of transmitted terms was in the sphere of cattle‑breeding, all from IE (mostly Tocharian) to Chinese: terms for horse (ma < *mra, cfr. mare); hound (quan, cfr. Skt. shvan); honey (mi, cfr. mead, Skt. madhu); bull (gu, cfr. Skt. go); and, more recently, lion (shi, Iranian sher). This doesn't add new information on the Urheimat question, for the IE-speaking cattle-breeders in Northwest China could have come from anywhere, but it confirms our image of the relations between the tea-drinking Chinese farmers (till today, milk is a rarity in the Chinese diet) and the milk-drinking "barbarians" on their borders.

The first one to point out some common vocabulary between IE and Chinese was Edkins 1871. Since then, the attempt has become more ambitious. The old racial objection has been overruled: there is no reason why the early Indo-Europeans should have been fair-haired Caucasians (though such types have been found in large numbers in Xinjiang), and at any rate languages are known to cross racial frontiers, witness the composition of the Turkish language community, from Mongoloid in pre-Seljuq times to indistinguishable from Armenians or Syrians or Bulgarians today. Also, unlike modern Chinese, archaic Chinese was similar to IE in the shape of its words: monosyllabic roots with consonant clusters, and probably not yet with different tones except for a pitch accent, traces of which also exist in Sanskrit and Greek.

Pulleyblank (1993) claims to have reconstructed a number of rather abstract similarities in the phonetics and morphology of PIE and Sino‑Tibetan. Though he fails to back it up with any lexical similarities, he confidently dismisses as a "prejudice" the phenomenon that "for a variety of reasons, the possibility of a genetic relationship between these two language families strikes most people as inherently most improbable". He believes that "there is no compelling reason from the point of view of either linguistics or archaeology to rule out the possibility of a genetic connection between Sino‑Tibetan and Indo‑European. Such a connection is certainly inconsistent with a European or Anatolian homeland for the Indo‑Europeans but it is much less so with the Kurgan theory", esp. considering that the Kurgan culture "was not the result of local evolution in that region but had its source in an intrusion from an earlier culture farther east". (1993:106, emphasis mine)

This is of course very interesting, but: "It will be necessary to demonstrate the existence of a considerable number of cognates linked by regular sound correspondences. To do so in a way that will convince the doubters on both sides of the equation will be a formidable task." (1993:109) That task has been tackled by Chang Tsung‑tung (1988), though he sees his rich harvest of ca. 1500 common Sino-IE words not as proof of genetic kinship but as the result of IE superstratum influence imparted to a native Tibeto-Burmese dialect when the first Chinese state was "established by IE conquerors" (1988:34), identified by tradition with the culture heroes, esp. the Yellow Emperor, said to have been enthroned in 2697 BC.

Among Chang's findings, we may note e.g. Chinese sun, "grandson", cfr. son; pi (archaic *peit), "must, duty", cfr. bid, Latin fides, "trust", Greek peithô, "persuade"; gei (*kop), "give", cfr. give; gu (*kot), "bone", cfr. Latin costa, "rib"; lie (*leut), "inferior", cfr. lit-tle; ye (*lop), "leaf", cfr. leaf; bao (*bak), "thin", cfr. few, Latin paucus, "few"; zhi (*teig), "show, point at", cfr. in-dex, Greek deiknumi; shi (*zieg), "see", cfr. see, sight.

If verified, Chang's list is really impressive, but it doesn't decide the IE homeland question, except to confirm trivially that it was not China, since IE was brought there by foreign invaders; but whence? Most remarkable in Chang's list is the high number of Northwest-IE and specifically Germanic cognates: "Germanic preserved the largest number of cognate words" (1988:32). Likewise, Gamkrelidze and Ivanov (1995:832) trace a Germanic itinerary through Central Asia, leading to contact with Yeniseian, the northwesternmost branch of Sino-Tibetan, and they cite a (very tenuous) etymology as proof: "And in some Ancient European dialects, in particular Germanic, borrowings from Yeniseian must be assumed in such words as *hus, 'house' (...), cfr. Yeniseian qus, 'tent, house'."

In their scenario, this is rather weird, taking the Proto-Germanic tribe from Iran to northeastern Central Asia and then back west to Europe. Eurocentric expansion models would simply let some Germanic warriors, after their ethnogenesis in Europe, strike east all through Central Asia, a scenario already widely accepted for Tocharian. In an OIT, it all falls into place: on their way from India, somewhere near the Aral Lake, the Proto-Germanic tribe lost one adventurous clan branching off to the east and settling in China. But I readily admit that this OIT-serving scenario looks too good to be true. It would also imply that the ethnogenesis of the Germanic tribes, including their distinctive vocabulary, took place in Central Asia rather than in their historic North-European habitat. This is not very plausible.

5.5. Austronesian

Even more unexpected and eccentric than the Chinese connection is the case for early contact or kinship between IE and Austronesian. According to Southworth (1979:205): "The presence of other ethnic groups, speaking other languages [than IE, Dravidian or Munda], must be assumed (...) numerous examples can be found to suggest early contact with language groups now unrepresented in the subcontinent. A single example will be noted here. The word for 'mother' in several of the Dardic languages, as well as in Nepali, Assamese, Bengali, Oriya, Gujarati, and Marathi (...) is âî (or a similar form). The source of this is clearly the same as that of classical Tamil ây, 'mother'. These words are apparently connected with a widespread group of words found in Malayo‑Polynesian (cf. Proto‑Austronesian *bayi ...) and elsewhere. The distribution of this word in Indo‑Aryan suggests that it must have entered Old Indo‑Aryan very early (presumably as a nursery word, and thus not likely to appear in religious texts), before the movement of Indo‑Aryan speakers out of the Panjab. In Dravidian, this word is well‑represented in all branches (though amma is perhaps an older word) and thus, if it is a borrowing, it must be a very early one."

Next to âyî, "mother", Marathi has the form bâî, "lady", as in Târâ-bâî, LakSmî-bâî, etc.; the same two forms are attested in Austronesian. So, we have a nearly pan‑Indian word, attested from Nepal and Kashmir to Maharashtra and Tamil Nadu, and seemingly related to Austronesian. For another example: "Malayo‑Polynesian shares cognate forms of a few [words which are attested in both Indo‑Aryan and Dravidian], notably Old Indo‑Aryan phala‑ ['fruit'], Dravidian palam ['ripe fruit'], etc. (cf. Proto‑Austronesian *palam, 'to ripen a fruit artificially'...), and the words for rice." (Southworth 1979:206)

Austronesian seems to have very early and very profound links with IE. In the personal pronouns (e.g. Proto-Austronesian *aku, cfr. ego), the first four numerals (e.g. Malay dua for "two", though one theory holds that the proto-Austronesian form is *dusa, whence duha, dua) and other elementary vocabulary (e.g. the words for "water" and "land"), the similarity is too striking to be missed. Remarkable lexical similarities had been reported since at least the 1930s, and they have been presented by Isidore Dyen (1970), whose comparisons are sometimes not too obvious but satisfy the linguistic requirement of regularity.

At the same time, this lexical similarity or exchange is not backed up by grammatical similarities: in contrast with the elaborate categories of IE grammar, Austronesian grammar looks much less complicated, the textbook example being the "childlike" plural by reduplication, as in Malay orang, "man", orang‑orang, "men". If the connection is real, we may be dealing with a case of heavy pidginization: a mixed population (colluvies gentium) adopting lexical items from another language but making up a grammar from scratch. Then again, genetically related languages may become completely different in language structure (e.g. English vs. Sanskrit, Chinese vs. Tibetan): Dyen therefore saw no objection to postulating a common genetic origin rather than an early large‑scale borrowing.

Dyen cannot be accused of an Indian homeland bias either for IE or for Austronesian. For the latter, "Dyen's lexicostatistical classification of Austronesian suggested a Melanesian homeland, a conclusion at variance with all other sources of information (...) heavy borrowing and numerous shifts in and around New Guinea have obviously distorted the picture", according to Bellwood (1994). For IE, he didn't feel qualified to question the AIT consensus. It is in spite of his opinions about the Austronesian and IE homelands that he felt forced to face facts concerning IE-Austronesian similarities. And frankly, I wouldn't know how to reconcile these data either, short of the unsportsmanlike course of dismissing them as "probably coincidence".

The dominant opinion as reported by Bellwood is that Southeast China and Taiwan (ultimately "Sundaland"?) are the homeland from where Austronesian expanded in all seaborne directions. Hence its adstratum presence in Japanese, and this is a rather hard nut to crack for an Indian homeland theory (defended by Talageri 1993:129) of Austronesian. For another alternative: suppose the Indo-Europeans and the Austronesians shared a homeland somewhere in southern China or Southeast Asia. An entry of the Indo‑Europeans into India from the east, arriving by boat from Southeast Asia, is an interesting thought experiment, if only to free ourselves from entrenched stereotypes. Why not counter the Western AIT with an Eastern AIT?

6. Glottochronology

Among the methods once used to map out the history of IE, one which has gone out of fashion is glottochronology, i.e. estimating the rate of change in a language, and deducing a given text corpus's age from the amount of difference with the language's present state (or state at a known later time) divided by the rate of change. In a few trivial cases, the assumption remains valid, e.g. it is impossible for the Rg-Veda to have been composed over a period of a thousand years, because no language remains that stable for so long, i.e. no language has a rate of change approximating zero (unless it is a classical language artificially maintained, like classical Sanskrit or classical Chinese, or alternatively, unless the Vedic hymns were linguistically updated at the time of their final compilation and editing).

Likewise, trivial glottochronology allows us to say that the time-lapse between Rg-Veda and Avesta must be longer than approximately zero. It is often said that with a few phonetic substitutions, an Avesta copy in Devanagari script (as is actually used by the Parsis: Kanga and Sontakke 1962) could be read as if it were Vedic Sanskrit. Having tried the experiment, I disagree: there is already a considerable distance between the two languages, including a serious morphological erosion in Avestan as compared to Vedic. Indeed, in the introduction to his authoritative translation of Zarathushtra's Gâthâs, Insler (1975:1) writes: "The prophet's hymns are laden with ambiguities resulting both from the merger of many grammatical endings and from the intentionally compact and often elliptical style..." (emphasis mine) Having evolved from a common starting-point, the Avestan language represents a younger stage of Indo-Iranian, a linguistic fact matched by the religious difference between the Rg-Veda, which initially knows nothing of a Deva/Asura conflict, and the Avesta where (as in younger Vedic literature) this conflict has come centre-stage.

Though a glottochronological intuition remains legitimate, the attempt to define a universal rate of change has been abandoned. A test of the common assumptions behind much glottochronological reasoning has been carried out on a group of languages with a well-known history: the Romance languages. It was found that according to the glottochronological assumptions, Italian and French separated to become different languages in 1586 AD, Romanian and Italian in 1130 AD, etc.: a full millennium later than in reality. (Haarmann 1990:2) If this is an indication of a general bias in our estimates, the intuitive or supposedly scientific estimates of the age at which PIE split (3,000 BC), at which Indo-Iranian split (1,500 BC) etc., are probably much too low as well. And it so happens that the OIT tends to imply a higher chronology, with the Rg-Veda falling in the Harappan or even the pre-Harappan period.
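For readers who want to see how sensitive such datings are to the assumed rate of change, here is a minimal sketch of the classic lexicostatistical formula, t = ln(c) / (2 ln(r)). In it, c is the fraction of a basic word list that two languages still share and r is the fraction a single language is assumed to retain per millennium. The formula and the commonly quoted default of r = 0.86 for a 100-item list are the textbook version of the method, not necessarily the exact procedure used in the Romance test reported above; the function name and the sample figure c = 0.70 are purely hypothetical illustrations.

```python
import math


def separation_time_millennia(shared_fraction, retention_rate=0.86):
    """Estimate the time since two languages split, in millennia.

    Textbook glottochronology: t = ln(c) / (2 * ln(r)).
    shared_fraction (c): share of the basic word list still cognate
        between the two languages.
    retention_rate (r): share of the list one language is assumed to
        keep per millennium (0.86 is a commonly quoted value for a
        100-item list; it is an assumption, not a law).
    The factor 2 reflects that both languages change independently.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    c = 0.70  # hypothetical cognate share, for illustration only
    for r in (0.80, 0.86, 0.90):
        t = separation_time_millennia(c, r)
        print(f"r = {r:.2f}: split about {t:.2f} millennia ago")
```

The same cognate count yields an earlier split as soon as the assumed retention rate is raised, which is the direction of correction suggested by the Romance test: if real languages retain more than the supposed universal rate allows, the dates obtained are too recent.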

The AIT itself gets into difficulties, having to cram a lot of Old IA history into the period between the decline of Harappa and the life of the Buddha, especially if both the Vedic period and the invasion-to-Veda period have to be lengthened. And they may really have to. Winternitz already wrote (1907:288):

"We cannot explain the development of the whole of this great literature if we assume as late a date as round about 1200 BC or 1500 BC as its starting-point."

He consequently opted for "2000 or 2500 BC" as the beginning of Vedic literature. And this beginning came a long while after the invasion, for according to Kuiper (1967, 1997:xxiv, quoted with approval by Witzel 1999/1:388) -

"between the arrival of the Aryans (...) and the formation of the oldest hymns of the RV a much longer period must have elapsed than normally thought".

On the other hand, one is struck by the living presence of the Iranian ethnic groups first mentioned in the Rg-Veda (dâsas, dasyus and paNis, who were not "dark-skinned pre-Aryan aboriginals" but Iranians, as shown e.g. by Parpola 1995:367 ff.) as late as the Greco-Roman period. Herodotos, Strabo and others know of such Iranian peoples as the Parnoi and the Dahae. Alexander encountered an Indian king called Poros, apparently the same name as carried by the Vedic patriarch Puru, a very rare name in classical Hinduism. As a matter of intuitive glottochronology, one wonders: a thousand years after their mention in the Rg-Veda, isn't this stability of nomenclature already a sign of unusual conservatism, given that cultures and names change continually? For the Iranian tribes, isn't staying around to be noted by Herodotos already a big achievement, considering that nations continually perish, merge, change names, move out or otherwise drop off the radar screen? Isn't it consequently unlikely that the Rg-Veda should be, say, another thousand years older, making the lifespan of these names and tribes even more exceptional?

In fact, tribal identities can last even longer, and it is again the Rg-Veda which provides ethnonyms which have remained in use till today, i.e. 3200 years later by even the most conservative estimate. The Vedic king sudâs faced and defeated a coalition of tribes among which we recognize Iranian ethnonyms still in use, including the paktha, bhalâna (both 7:18:7) and parshu (RV 7:83:1, 8:6:46). The first is Pakhtoon, Pashtu or Pathan, the second is still found in Bolan, the mountain pass in Baluchistan; and these two embolden us to identify the third as the eponymous founders of the Persian province of Fars. Whatever the date of the Rg-Veda, if the Pathans could retain their tribal name and identity till today, the dâsas and paNis could certainly do so until the Greco-Roman period. Glottochronology is no longer an obstacle standing in the way of the higher chronology required by most versions of the OIT.

7. Conclusion

We have just looked into the pro and contra of some prima facie indications for an Out-of-India theory of IE expansion. Probably none of these can presently be considered as decisive evidence against the AIT. But at least it has been shown that the linguistic evidence does not necessitate the AIT. One after another, the classical proofs of a European origin have been discredited, usually by scholars who had no knowledge of or interest in an alternative Indian homeland theory.

It is too early to say that linguistics has proven an Indian origin for the IE family. But we can assert with confidence that the oft‑invoked linguistic evidence for a European Urheimat and for an Aryan invasion of India is wanting. We have not come across linguistic data which are incompatible with the OIT. In the absence of a final judgment by linguistics, other approaches deserve to be taken more seriously, unhindered and uninhibited by fear of that large‑looming but in fact elusive "linguistic evidence for the AIT".


