Distribution of R1b1c Subclades + Predictability of Haplogroup by Haplotype

By Dr David Faux - 7 June 2007

[This paper uses the 'old' nomenclature of the ISOGG R-2007 tree.]

The following is my take on whether it is possible to predict R1b1c subclades by knowing only the haplotype.
What I am about to provide is by virtue of my access to the EthnoAncestry commercial database, plus some data from research being done by Dr. Wilson and colleagues from across Europe.

R1b1c1-3, 5, 8

Some day a severe pruning will be required to sever these apparently private SNPs from the phylogenetic tree, for as far as I know, no one tested by any commercial DNA testing company has tested positive for R1b1c1-3, 5, or 8.

R1b1c7

This subclade is amenable to prediction by virtue of a very "recognizable" pattern of Y-STR scores (e.g., DYS390 = 25; DYS385a, b = 11, 13; DYS392 = 14). As many know, David Wilson and Gareth Henson worked with EthnoAncestry to discover the linkage between the NW Irish (Ui Neill) haplotype and the Y-SNP M222 (a SNP coincidentally discovered by EA's Dr. Cavalleri when he was a student). It is found in NW Ireland, elsewhere in Ireland, in Scotland and western England, and very thinly scattered as far east as Norway.
That leaves us with M153 - R1b1c4, M167 - R1b1c6, S21 - R1b1c9, and S28 - R1b1c10. These four remain elusive to those hoping to search databases, mine the data, and arrive at a pattern of Y-STR scores that will act as quasi-SNPs and serve to predict these haplogroups.
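As a rough illustration of what "prediction by pattern" means here, the sketch below counts how many of the marker values quoted above a given haplotype shares. Only the four values come from the text; the function name, the dictionary layout, and the sample haplotype are hypothetical, and a full match is a hint that M222 testing is worthwhile, not a substitute for the SNP test.

```python
# Marker values quoted in the text for the M222 (R1b1c7) pattern.
M222_MOTIF = {"DYS390": 25, "DYS385a": 11, "DYS385b": 13, "DYS392": 14}


def motif_matches(haplotype, motif):
    """Return (markers matching the motif, motif markers actually tested)."""
    tested = [marker for marker in motif if marker in haplotype]
    hits = sum(1 for marker in tested if haplotype[marker] == motif[marker])
    return hits, len(tested)


# Hypothetical haplotype, for illustration only.
sample = {"DYS390": 25, "DYS385a": 11, "DYS385b": 13, "DYS392": 14, "DYS19": 14}
hits, tested = motif_matches(sample, M222_MOTIF)
print(f"{hits}/{tested} motif markers matched")
```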

R1b1c4

This haplotype is seldom seen outside Iberia and western France.

R1b1c6

This haplotype is centered in Iberia but fans out to include SW England, Southern Ireland, parts of France and a few outliers (but still a small percentage of the R1b1c population). Their modals appear to be the same as R1b1c*.

R1b1c9*, a, b

There is, however, one exception - a grouping within the subclade S21+. What some have called "Frisian", with DYS390 = 23, if seen with DYS492 = 13, is in fact very likely to be S21+. However, what I think many fail to realize is that about half of the S21+ group cannot be predicted in this manner. Curiously, I have two Shetlanders with aboriginal (Norse) surnames who with 37 markers fit the S21 pattern but test negative; however, another who is Western Atlantic Modal Haplotype does test S21+. The subclade is clearly bimodal, "Frisian" and "other". A DYS492 value of 13 or 14 is strongly suggestive of all varieties of S21+.

Sub-groupings of R1b1c9 can, however, generally be predicted. R1b1c9a is seen when testing of DYS439 returns a null value. R1b1c9b has a fairly distinctive pattern of Y-STR scores.
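The rules of thumb for S21 described above can be written down as a small checklist. This is a minimal sketch under stated assumptions: the function name and hint wording are hypothetical, and a null DYS439 result is represented as `None`, which is simply a convention chosen for this example. An empty result says nothing about S21 status, since about half of S21+ cannot be recognized this way, and a positive hint can still be wrong, as the Shetland cases show.

```python
def s21_hints(haplotype):
    """Collect STR-based hints about S21 (R1b1c9) status; hints only, not proof."""
    hints = []
    dys390 = haplotype.get("DYS390")
    dys492 = haplotype.get("DYS492")

    if dys390 == 23 and dys492 == 13:
        hints.append("Frisian pattern: S21+ very likely")
    elif dys492 in (13, 14):
        hints.append("DYS492 of 13 or 14: suggestive of S21+")

    # Assumption: a null (no-result) DYS439 is stored as None.
    if "DYS439" in haplotype and haplotype["DYS439"] is None:
        hints.append("null DYS439: consistent with R1b1c9a")

    return hints


# Hypothetical haplotypes, for illustration only.
print(s21_hints({"DYS390": 23, "DYS492": 13}))
print(s21_hints({"DYS390": 24, "DYS492": 12}))
```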

R1b1c9 of whatever stripe is to date found in the heaviest concentration in the Saxon - Frisian region (where it approaches 75% of the R1b1c in that location), tapers off slightly into Scandinavia (where it is still the predominant R1b1c haplogroup), and falls somewhat precipitously as one travels to the west (except in England and Lowland Scotland where it can make up 50% of the R1b1c). The strong showing in Italy may reflect the footprint of Germanics such as the Lombards, or perhaps the remnants of those who over-wintered in that region while R1b1c* was basking in the Franco - Cantabrian area. To date a lack of data means that we don't know if eastern R1b1c (e.g., Hungarian; Ashkenazi; Anatolian) is R1b1c*, R1b1c9, or R1b1c10.

R1b1c10

It is now quite evident that there is no way to predict R1b1c10 from a haplotype, even at 67 markers. Ron Scott's database of extended haplotypes for SNP tested R1b is a good starting point and my thanks to him for allowing me to use his R1b Extended Haplotypes template for this work. The data can be seen at www.davidkfaux.org/R1b1c10_Data.htm.
I will keep it updated as new information becomes available.

Most S28+ Y-STR markers are modal for R1b1c. I (being S28+) have the very unusual DYS444 = 14 (12 being a strong modal) but of the totality of the R1b1c10 with extended haplotypes, only one (ancestor from Kent County) shares this with me. There is no consistency whatsoever within the R1b1c10 haplogroup subclade - they "look" no different compared to R1b1c*, or R1b1c6 for that matter.

Considering that England, Ireland and Scotland are highly over-represented in the numbers tested it is interesting that no R1b1c10 have come to light yet in Ireland, and those from Scotland only from Orkney and the east coast. Similarly, the findings in England are to date almost exclusively found within the Danelaw, and generally within a few miles of the North Sea.

In the distribution of this haplogroup subclade there is a very definite "hotspot" in Switzerland, Alpine Germany, and Northern Italy. Although few Swiss have tested, to the best of my knowledge all have been R1b1c10. The Italians are all from locations within a few miles of the Swiss border and that applies to Germany also (although with less certainty). The majority of R1b1c whom EA has tested from this area are S28+, although we are looking at relatively small sample sizes. From this "epicenter" the haplogroup radiates out through the middle of France to the Bay of Biscay (France being a mixture of R1b1c*, 4, 6, 9 and 10 - at the moment I don't know which predominates). When this information is added to the French findings from the middle of the country as far west as the Bay of Biscay, plus one from Southern Poland and another from Greece, this distribution is, in my view, unmistakably that of the La Tene Celts via their documented expansion from the Marne, Moselle, and Bohemian clusters in the 4th Century BC. I suspect very strongly that the R1b1c10 in Switzerland and vicinity are the descendants of the Celtic Helvetii tribe.

There is an "isolated" enclave in Southern Scandinavia that has shown itself in the English Danelaw, those with aboriginal (place) surnames in Orkney (characteristic of Norse families), and coastal Eastern Scotland (only in those places known to have been settled by the Vikings). It appears that the English R1b1c10 are best explained as Danish Viking descendants of the Celtic Cimbri tribe from the northern part of the Jutland Peninsula. Those in Scotland likely arrived in Viking times from the Vestfold / Vik area of Southeast Norway. New archaeological research shows that there were Danish Viking settlements on Anglesey in Wales - but it was also the Druidic stronghold of the Continental Celts at the time of the Roman invasion - so I am really not clear on the origin of the S28 there (along with a lesser percentage of S21- R1b1c9, but both being far eclipsed by R1b1c*). The I1b2a and R1a1 findings from that area may help to pinpoint the location of origin of the population on Anglesey. I have always wondered why the Norse name of Onguls stuck, rather than the Celtic Mon, as a place name there.

Further research may confirm that R1b1c10 is one of the largest haplogroups found in Central Europe north of the Alps. Perhaps this holds true until interfacing with the large haplogroup R1b1c* groups of the west (plus their likely kindred R1b1c4, 5, and 7); R1b1c9 and I1a in the north; and the I1b and R1a1 populations of the Slavic - speaking world in the east. To date there has not been an exception (time will likely cure this) of S28+ being found outside the known areas of 4th Century La Tene Celtic migrations and settlement. Hence for the present we might tentatively term it the "La Tene Celt marker". Perhaps some will find this a tad presumptuous or simplistic - but nothing ventured, nothing gained. In this instance there does seem to be a noteworthy correspondence between the distribution patterns of archaeological assemblages of the Hallstatt and La Tene eras, and this particular genetic marker. Based on this observation I predict that ancient DNA testing of, for example, the La Tene cemeteries in Bohemia (home of the Boii tribe), but also the burial places of the Helvetii and related tribes of the Alpine regions, will predominantly test R1b1c10 (S28), as will the present day R1b1c populations of these domains - they being descendants of the "Ancient Celts".

R1b1c*

A very typical reaction after testing negative for all R1b1c subclade SNPs (R1b1c1 to R1b1c10), and being assigned to the "asterisk" category, is to expect testing companies or experts on this list to interpret the meaning of this result. Clearly this is going to be a more difficult assignment than "interpreting" a M222-R1b1c7 (Ui Neill) result since there is a very consistent Y-STR pattern associated with this subclade. Some may require the services of professionals such as those listed on the www.isogg.org site in order to make sense of the findings. Many who have the time and ability to persevere through times of frustration (inevitable when working in genetic genealogy), can take their knowledge of being R1b1c* and use this as a crucial piece of information to construct a likely scenario relating to Y - origins back to the Iron or even Bronze Age. No one said it would be easy, or that there will be one and only one crystal clear interpretation available.

Some will be fortunate and fit into a haplotype pattern that is very robust and geographically rooted such as "Southwest Irish" even though they are R1b1c* and no corresponding SNP has yet been found for their clade. Ken Nordtvedt lists the modal haplotypes of 17 clades which he or others had identified via mining the YHRD, Ysearch, and Sorenson databases, and which are found within R1b1c*. This data can be found here. Information on some of these can be found in various locations on www.worldfamilies.net. It strikes me that Kevin Campbell's new study on the "Geographic Patterns of Haplogroup R1b in the British Isles" (see www.isogg.org) can offer some possible hints about subclade distribution for this location. Eventually the unknown quantity of R1b1c6, 7, 9 and 10 could be teased out of this and other studies to provide a somewhat clearer picture of R1b1c*.

At this point we must be honest and say that for some (e.g., adoptees) there may as yet be no way to surmount this hurdle and identify any likely geographical origin. Perhaps when testing say 100 markers becomes routine, one or more markers (such as DYS492 = 13 for R1b1c9) will emerge to show correspondence with others in a geographically meaningful way. Perhaps population geneticists will locate a new SNP, which can be offered by a commercial testing company, and that will give coherence to the findings.

This leads into another important topic. Is there any guarantee that everyone who is R1b1c* will some day be "converted" to something like R1b1c18? Unfortunately no. Look how long geneticists have been exploring R1a1 and have not come up with anything downstream of M17 (except a few SNPs that are little more than "Family SNPs" or sub-tribal markers for a very restricted part of Central Asia). Nothing from India to Siberia to Norway to Greece. We assume that when full Y-chromosome scans become available (perhaps as early as this year but price will possibly be an impediment for some time) that "our" SNP will magically appear. It is expected that more useful SNPs will indeed emerge in this fashion, but some folk may have to settle for what I have termed "Family SNPs" whose time depth may not be more than a few generations (or in some cases are personal to the individual).

How useful is spending numerous hours (as I did prior to being informed that I was S28+) obsessively combing through databases (e.g., Sorenson) and seeking haplotypes with a low Genetic Distance from one's own (e.g., a 23/25 "match")? Unless the haplotype is rare, or has a series of off-modal markers, the question of whether the similarity can best be explained as identical by descent or identical by state (convergence) may not be answerable.
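For readers unfamiliar with the arithmetic behind a "23/25 match", here is a minimal sketch of how two haplotypes might be compared over the markers they share. The function name and sample values are hypothetical, and the summed step difference is only one simple convention for Genetic Distance; testing companies and databases may score multi-copy or fast-mutating markers differently. Note that nothing in the calculation can distinguish identity by descent from convergence.

```python
def compare(a, b):
    """Compare two haplotypes over their shared markers.

    Returns (identical markers, markers compared, summed step difference).
    """
    shared = sorted(set(a) & set(b))
    matches = sum(1 for marker in shared if a[marker] == b[marker])
    step_distance = sum(abs(a[marker] - b[marker]) for marker in shared)
    return matches, len(shared), step_distance


# Hypothetical haplotypes, for illustration only.
mine = {"DYS390": 24, "DYS391": 11, "DYS392": 13}
other = {"DYS390": 23, "DYS391": 11, "DYS392": 13, "DYS393": 13}

matches, compared, distance = compare(mine, other)
print(f"{matches}/{compared} match, step distance {distance}")
```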

Many will find nothing more than a random assortment of matches to people from Poland to Portugal and nothing that stands out to indicate a likely ancestral point of origin. It should be noted that if SNP tested many of these "matches" east of say France would turn out to be either S21+ or S28+ (as will be noted in another post, the haplotypes of these subclades, with the exception of "Frisian", cannot be differentiated from R1b1c*). Hence if we were to take the known or probable R1b1c9 and R1b1c10 haplotypes out of the mix we will be better able to lift the veil that has obscured any ability to put a face to the geographic distribution of R1b1c*.

Although speaking about archaeological evidence, the words of Barry Cunliffe (actually Sir Barrington Cunliffe) of the University of Oxford may ring true for genetic evidence concerning R1b1c*. Since I anticipate criticism of what I am about to say I would like to quote his words. He said that in attempting to "construct a European protohistory", "we will inevitably be drawn into simplification and generalization, laying ourselves open to criticism from the purists, but better the attempt to create a whole, however imperfect, than to be satisfied with the minute examination of only a part" ("The Ancient Celts", 1997, Preface). The "whole" in this case would be R1b1c*. In other words setting out reasonable hypotheses is infinitely better than throwing up one's hands in despair due to scattered and incomplete evidence.

I would like to propose that the distribution of R1b1c* has changed little in 3500 years, and that movements over the years were largely within a circumscribed area (e.g., Iberia to Ireland; Belgium to England). It may approximate what is shown in Map 1 of the above work by Cunliffe. Here his archaeologically based "Atlantic Bronze Age System" of 1300 to 700 BC includes Portugal, Northern Spain, the entire Atlantic facade of France through Belgium, plus all of Britain and Ireland. This appears remarkably like the "heart" of R1b1c* land, and R1b1c* would be expected to decrease to the south and east where it would encounter R1b1c9 (Nordic Bronze Age encompassing Scandinavia and the coastal areas of Germany and the northern third of the country) and R1b1c10 (the northern Urnfield Culture, particularly the Northern Alpine zone) in increasing numbers. Even today there is relatively little R1b1c* in, for example, Norway or Switzerland, where R1b1c9 and R1b1c10 respectively strongly predominate. It must be noted that whatever migrations have occurred (some west to east movements may go back to immediate post Ice Age times), there is still a scattering of R1b1c* in, for example, Norway - just very much less than what is seen in, say, Ireland. However, if one's ancestors are from the Southwest of England, an ancestor from Norway, while theoretically possible, is very improbable.

So, in summary, what is the answer to the frequently asked question, "where did my R1b1c* ancestors originate"? One can only work with a probability model, and some scenarios will be more probable than others, but the strongest likelihood is that if your ancestors are from, say, England, Ireland or Spain, they were probably aboriginal to the region, and that movements over the years were largely within this circumscribed area (e.g., Iberia to Ireland; Belgium to England). Some of this flux is reflected in the archaeological and historical (e.g., Roman sources) record. Whether there is substance to this speculation, only time and further research will tell.

If and when "your" SNP is found, then some revisions to the models you have created may be necessary. Still, it is best to advance with the law of parsimony operating at the helm to guide hypotheses to test. If one is R1b1c* (after testing all 10 subclade SNPs), and one's ancestors have resided in East Anglia for the past 300 years, and the surname suggests an even earlier origin there, one should not expect to be criticized for coming to the conclusion that one's Y-ancestors were most likely members of the Iceni tribe that gave the Romans so much grief. I realize that this would be problematic to those who are brick walled in, for example, the Americas, and hence some people will have to delay this inquiry until they can better pinpoint a place of origin in Europe.

The goal as I see it is to clarify and demystify haplogroup R1b1c* at the macro and micro levels - something quite attainable by tapping into the many resources that are often available without ever having to leave the comfort of one's home - thanks to Google searches.

I get the sense that many would ultimately hope to identify a geographic area and even tribal affiliation for their (SNP tested) haplotype. In some cases it will be possible, with some care and critical thinking, to do this using a knowledge of SNP (e.g., S28+) status, plus surname, plus geographic residence of ancestors in the Middle Ages, to trace the perambulations of a particular Y-chromosome back to the Bronze Age. What is needed is a multidisciplinary study using not only genetic, but also archaeological, historical, and linguistic evidence to "tell the story". It took me months of weighing two 4-inch ringbinders full of printed (largely primary source) material to construct the story relating to my paternal lineage. I hope to inspire others to do what I did (being somewhat obsessive compulsive helps) since, when the evidence all converges, you know that you will be writing something that will pertain not only to your immediate family, but also to those who share your Y - heritage. Ultimately the history of Europe can be assembled from these efforts. I have written an 80 page monograph on the Cimbri tribe. Perhaps someone can tackle the Wends, the Suevi, the Veneti, the Hellenes and so on. The writings of most Ancient authors (e.g., Pytheas of Massalia) are online, and books on the archaeology of Central Europe and all other relevant subjects can be found in bookstores or online. Now with molecular biology / genetic genealogy as a powerful ally, and routine ancient Y-DNA analysis just around the corner, what was once impossible is within the grasp of those who are unafraid to risk criticism from the naysayers and do the in-depth research required at this point.

If anyone is curious as to how one might go about this admittedly daunting and intimidating enterprise, you are welcome to read my efforts at www.davidkfaux.org/dnaprofile2.html for information back to Viking times; and www.davidkfaux.org/Cimbri-Chronology.pdf for a tribal history back to the Bronze Age.

The only thing that I would need help with is my very hesitant attempt to correlate an archaeological horizon such as the Unetice Culture with one or more haplogroups. It would be a useful effort if by some miracle we could arrive at some consensus on, for example, the probable haplogroup tapestry of the Ostrogoths or the Quadi. Too ambitious perhaps. Maybe we should just wait until aDNA across Europe is tested with SNaPshot minisequencing (as described in a recent article on Southern Siberia). Nonetheless, to best understand change or stability through the ages we must test people alive today who, on balance of probabilities, may be the descendants of these groups. Exciting stuff.

The very best of luck to those who decide to explore the many faces of R1b1c*.

- David Faux