I recently acquired The Horse, the Wheel, and Language by the anthropologist David Anthony. The subject of this large book is the Indo-European expansion, which is contentious and tricky. Nineteenth-century ruminations on the topic produced the Aryan concept and introduced it into the German academic world at a nationalistic time, as the German state coalesced out of the principalities left by the destruction of the Holy Roman Empire, providing one of the roots from which Nazism grew. Even today many non-academics interested in the subject are white racists searching for their "Aryan" roots. Connections with fascism may be found among many of the twentieth-century scholars of Indo-European religion and society, including, possibly, Georges Dumézil, a French scholar whose claims about Indo-European society gave impetus to structuralism in anthropology - a strange, quasi-cognitivist movement defined by a not particularly scientific method for uncovering universals in social organisation and myth (and more) based on "structures" in the brain. This was very influential, and it is one of the many schools of thought whose history forms much of the useless dreck that one studies on a course in social anthropology.
Indo-European studies is inherently interesting, even if you're not a Nazi. It's also by far the most developed and well-studied area of comparative ethnology, to the extent that it is often considered a separate subject from that found in anthropology departments. (One of my favourite books on Indo-European ethnology was written by a Classicist.) "Indo-European" is a term of comparative linguistics (in German it's sometimes called "Indogermanisch", which might explain a lot), and it refers to the fact that languages from Icelandic to Assamese and from Latvian to Hindi are related, as linguists put it, "genetically".* This language family is the one that has effectively conquered the earth, with the prominent exception of east Asia.
Countries with Indo-European national or majority languages may be found on every inhabited continent, and most of the world's most widely spoken languages, whether in terms of native or second-language speakers, are Indo-European ones. Moreover, Indo-European languages are not the only pieces of Indo-European culture carried by these people in their migrations (see below). We find all sorts of common tropes in poetry, mythology, and, to some extent, social structure, in the earliest attested Indo-European-speaking societies, whether in India, Iran, or Ireland. The Aryan myth had a foundation in Indo-European thought, it seems; words related to the Sanskrit term "Arya", meaning "noble", have been identified in several branches of the Indo-European world. Even the name of Éire has been claimed to derive from it, though that etymology is disputed; the name of Iran certainly does.
So how do linguists discover the relationships between languages? They manage to do it thanks to the fortunate fact that the links between languages are fairly systematic. Sounds change in fairly consistent ways depending on the other sounds (or "phonemes") in the language, and there is some pressure to limit the number of phonemes in ordinary use (too many, and confusion results). This allows historical linguists to reconstruct the plausible past pronunciation of bits of vocabulary that are shared between related languages. Pronunciation and vocabulary items are not the only things that can be reconstructed, either. Similarities in verb conjugations and noun declensions in Sanskrit, ancient Greek, and Latin were among the things that first tipped European scholars off to the notion that the languages are related. In fact, a very strange-looking "language" that reflects the origins of all sub-families of Indo-European can be constructed, and this is called Proto-Indo-European, or PIE for short.
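To make "fairly systematic" a little more concrete, here is a minimal sketch in Python that tabulates how the first sound of some Latin words lines up with the first sound of their English cognates. The word pairs are standard textbook examples of the Germanic consonant shift (Grimm's law); the function names and the use of spelling as a stand-in for pronunciation are my own simplifying assumptions, and real historical linguistics works with phonetic transcriptions and much more data.

```python
from collections import Counter

# Latin/English cognate pairs, chosen by hand as textbook illustrations.
# Assumption: spelling is used as a rough proxy for the initial sound.
PAIRS = [
    ("pater", "father"), ("pes", "foot"), ("piscis", "fish"),
    ("tres", "three"), ("tu", "thou"), ("tenuis", "thin"),
    ("cornu", "horn"), ("centum", "hundred"), ("cor", "heart"),
]

# Letter pairs that spell a single sound at the start of a word.
DIGRAPHS = ("th",)


def initial_sound(word):
    """Return the first sound of a word, treating 'th' as one unit."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def correspondences(pairs):
    """Count how often each (Latin initial, English initial) pairing occurs."""
    return Counter((initial_sound(a), initial_sound(b)) for a, b in pairs)


table = correspondences(PAIRS)
for (latin, english), count in sorted(table.items()):
    print(f"Latin {latin}- : English {english}-  ({count} pairs)")
```

The point is that the pairings do not scatter at random: in this small sample Latin p- keeps lining up with English f-, t- with th-, and c- with h-. It is recurring correspondences of this kind, rather than mere resemblance between words, that linguists treat as evidence of common descent.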
PIE seems to have been spoken (or rather, the real language corresponding to it was, as the academic reconstruction can't really be called a spoken language) in the area of the Pontic-Caspian steppe, the region north of and between the Black Sea ("Pontus") and the Caspian Sea, between Europe and Asia. The problem with finding the geographic origin of a proto-language is that archaeological evidence cannot be directly correlated with a language except in two ways: one is if the language is attested in written form in the area at the time (in which case the identification is very secure), and the other is if the proto-language has reconstructible, original terminology that fits with the archaeological data. The latter is called the Wörter und Sachen method ("words and things"), and while it is quite reliable, it isn't as reliable as having written attestations.
An example of the Wörter und Sachen method might be that if a bronze sword were found in an archaeological site and the reconstructed language had words for "bronze" and "sword" - words that could be found in all (or most) sub-families and that showed the expected sound changes (thereby ruling out the possibility that they are loanwords) - then that might be evidence for the idea that the speakers of the proto-language also produced the sword and, therefore, had a connection to the site. When there are many such pieces of evidence, then we have a reasonable claim that the archaeological site and the language family are linked. It is on the basis of such evidence that the Indo-European Pontic-Caspian steppe hypothesis has been founded, and it seems very secure. David Anthony's book is an attempt to show the data for this argument as clearly as possible.
So, the speakers of PIE came from eastern Europe/Central Asia, and may have domesticated the horse and spread out by that means. But there are some conflicting claims made by other archaeologists (including the now-defunct Anatolian hypothesis, which I won't discuss here). I don't have the expertise or the reading in genetics to assess their claims directly, but they are interesting to consider. Sir Barry Cunliffe, for instance, claims that despite the seemingly late arrival of Celtic, Latin, and Germanic languages in Britain, and of Indo-European languages generally in Europe, there doesn't seem to have been much in the way of real migration of human beings.
Cunliffe claims that the archaeological and genetic data support the view that instead of Belgic, Roman, and Anglo-Saxon warriors coming over the seas, slaughtering the indigenes, and imposing not only their languages but also their genes, there has actually been genetic continuity in Great Britain, and continental Europe as a whole, since the Neolithic - a time when no one in Western Europe, at least, spoke an Indo-European language. This is apparently based on the argument that people can speak a language without being closely genetically related to its other speakers, as we see in the situation in much of Anglophone and Francophone Africa. The white invaders seldom intermarried with the locals, but the locals now speak English and French, and drive cars, use paper money, and so on. The language and certain aspects of the culture followed but European genes made a small impact, if any. Europeans were the ruling class; intermarriage was not the norm. The same might have applied to Indo-European-speaking invaders.
We find a similar situation in the Austronesian world, and it is a little easier to understand because Austronesian languages are primarily spoken on islands, many of them isolated by thousands of miles of ocean. "Austronesian" is a linguistic term like Indo-European, and it refers to languages spoken natively throughout the Philippines, Indonesia, Madagascar, and the Pacific, from Hawaii to the east coast of Africa and from Taiwan to New Zealand. The languages are clearly related and show amazing similarities (and some important differences) over the area. One word, lima, meaning the number 5, may be found from Taiwan to Hawaii and beyond: Hawaiian lima, Malay lima, Fijian lima, Minang (West Sumatra) limo, Paiwan (southern Taiwan) lima. The proto-Austronesian language (PAn) appears to have been spoken in Taiwan, and all other non-Formosan (non-Taiwanese) Austronesian languages are members of one sub-family of Austronesian, called Malayo-Polynesian.
The genetic data shows that non-Austronesian-speaking peoples contributed a huge amount of genetic material to current populations in many Austronesian-speaking regions. Proto-Oceanic peoples (i.e., those who populated the south Pacific) interbred with the local population of New Guinea and the already populated parts of island Melanesia to produce the Polynesian and Melanesian Malayo-Polynesian-speaking peoples, who are dark-skinned and frizzy-haired, unlike the aboriginal people of Taiwan. The same thing happened in Indonesia, where "Papuan"-looking indigenes with frizzy hair and dark skin produced offspring with the Malayo-Polynesian speakers who invaded their lands. Genetics tells a very different tale to linguistics in this part of the world; studies have shown no sizeable genetic difference between Malayo-Polynesian-speaking Timorese and Timorese people speaking non-Austronesian languages, for instance.
In Madagascar, the population is mixed with descendants of Bantu-speaking Africans (Bantu being a language sub-family of Niger-Congo, the dominant language family of sub-Saharan Africa), as well as Arabs and French colonists. Likewise, the speakers of Indo-Aryan languages mixed with the indigenous, probably Dravidian-speaking, peoples of the Indian subcontinent and produced the modern population of India in a similar way, and it is more than reasonable to expect that the same thing happened in Europe.** This is especially likely in the example of the speakers of the Germanic sub-family, which appears to have a non-Indo-European substrate.
So if there weren't mass migrations of people taking control of the land and introducing themselves and, by consequence, their languages and culture, then what was happening? Is it reasonable to say that there was an Indo-European expansion if the Indo-European speakers were often a minority in the places they went to and contributed little genetically to the resulting Indo-European-speaking population? This is controversial, of course, and it's possible that there were mass migrations of Indo-European speakers, but assuming for the moment that there weren't, I'd still believe that it would be reasonable to speak of an Indo-European expansion. Kind of.
What expanded wasn't a population of people that took over everything, however, but a population of "representations", or "memes", or "traits" (or however you want to label these things). A human population is necessary for these things to spread - they are just bits of information encoded in individual human brains, after all - but that population doesn't have to have been one that replaced others wherever it went and displayed absolute, or even majority, genetic continuity with the speakers of the proto-language - or, to extend the reasoning, the original users of the particular technology.***
It is important to remember that language, and other things of that nature, can only indicate a historical relationship of some kind, but not necessarily what the nature of the relationship is. It tells us that people met and spoke and had a relationship of sufficient duration and impact to leave linguistic evidence. But, simply, the fact that you speak English, or that you pass English on to your children, does not mean that you are the genetic descendant of Anglo-Saxon warriors, and speaking an Indo-European language doesn't make you Aryan.
* There are several non-Indo-European languages attested in Europe, including Basque, the Iberian languages (spoken before the mass arrival of Celts or Celtic culture), Etruscan, Finnish, Hungarian, Mordvin, and others. But most of Europe speaks an Indo-European language, as does about half of the world's population. In India the languages of the southern cone are Dravidian languages, unrelated to Indo-European and likely indigenous to the subcontinent. The situation in the Caucasus and Greater Persia is more complicated due to Turkic migrations and the survival of some non-Indo-European language families.
** Racial terms are controversial and fuzzy at best, but the genetic evidence does point to a small population of fair-skinned invaders mixing with a large population of dark-skinned native people in northwest India - who seem to have had a developed and intricate social life themselves, based on evidence from the cities of the ancient Harappan civilization, which was pre-Indo-Aryan.
*** There are some indications of loanwords in proto-Indo-European, even. The PIE word for sun, *sóh₂wl̥, appears to come from an expression including the word "lamp" ("lamp of *Dyeus", *Dyeus being the hypothesised sky-father deity [i.e., Zeus, Diespiter, etc.], is the phrase M. L. West plumps for), and the word doesn't seem to obey the usual rules of PIE derivation. West says, "...as a lamp is an item of material culture, we must reckon with the possibility that the old Indo-European word was originally a loan from another language. That might explain its unusual morphology."