Introduction

The population of the UK today is culturally diverse, with 8% of its 54 million inhabitants belonging to ethnic minorities, and over one million classifying themselves as ‘Black or Black British’ in the 2001 census. These people owe their origins to immigration from the Caribbean and Africa beginning in the mid-20th century; before this time, the population has been seen as typically Western European, and its history has been interpreted in terms of more local immigration, including that of the Saxons, Vikings and Normans.1 However, in reality, Britain has a long history of contact with Africa (reviewed by Fryer2). Africans were first recorded in the north 1800 years ago, as Roman soldiers defending Hadrian's wall – ‘a division of Moors’. Some historians suggest that Vikings brought captured North Africans to Britain in the 9th century. After a hiatus of several hundred years, the influence of the Atlantic slave trade began to be felt, with the first group of West Africans being brought to Britain in 1555. African domestic servants, musicians, entertainers and slaves then became common in the Tudor period, prompting an unsuccessful attempt by Elizabeth I to expel them in 1601. By the last third of the 18th century, there were an estimated 10 000 black people in Britain,3 mostly concentrated in cities such as London.

Has this presence left a genetic trace among people regarded as ‘indigenous’ British? In principle, Y-chromosomal haplotyping offers a means to detect long-established African lineages. Haplotypes of the non-recombining region of the Y, defined by slowly mutating binary markers such as SNPs, can be arranged into a unique phylogeny.4, 5, 6 These binary haplotypes, known as haplogroups (hg), show a high degree of geographical differentiation, reflecting the powerful influence of genetic drift on this chromosome. Some clades of the phylogeny are so specific to particular continents or regions that they have been used to assign population-of-origin to individual DNA samples,7 and to quantify the origins of the components of admixed populations using simple allele-counting methods.8, 9, 10

Studies of British genetic diversity, generally sampling on the criterion of two generations of residence, have found no evidence of African Y-chromosomal lineages,11, 12, 13, 14 suggesting that they either never became assimilated into the general population or have been lost by drift. However, here, we describe a globally rare and archetypically African sublineage in Britain and show that it has been resident there for at least 250 years, representing the first genetic trace of an appreciable African presence that has existed for several centuries.2

Materials and methods

DNA samples

British and US males were recruited under appropriate informed consent, and DNA was extracted from buccal samples as described.15 R-surnamed British males were sampled randomly using information from electoral rolls, and a specific questionnaire was used to exclude close patrilineal relatives.

Binary marker typing

A set of 11 binary markers (M9, M69, M89, M145, M170, M172, M173, M201, P25, SRY10831 and 12f2) was typed using the SNaPshot minisequencing procedure (Applied Biosystems) as described,15 and the additional hgA-specific markers M91 and M31 were typed by DNA sequencing using published primers.5

Y-STR typing

A set of 17 Y-chromosomal short tandem repeats (Y-STRs: DYS19, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS434, DYS435, DYS436, DYS437, DYS438, DYS439, DYS460, DYS461, DYS462) was typed as described.16 HgA1 chromosomes were typed with an additional 51 Y-STRs (DYF386S1, DYF390S1, DYS406S1, DYS472, DYS476, DYS480, DYS481, DYS485, DYS487, DYS488, DYS490, DYS491, DYS492, DYS494, DYS495, DYS497, DYS505, DYS508, DYS511, DYS525, DYS530, DYS531, DYS533, DYS537, DYS540, DYS549, DYS554, DYS556, DYS565, DYS567, DYS568, DYS569, DYS570, DYS572, DYS573, DYS575, DYS576, DYS578, DYS579, DYS580, DYS583, DYS589, DYS590, DYS594, DYS617, DYS618, DYS636, DYS638, DYS640, DYS641 and DYS643) according to Lim et al.17 and with an additional nine (DYS385a/b, DYS425, DYS426, DYS447, DYS448, Y-GATA-H4.1 and YCAIIa/b) according to Parkin et al.18

Network analysis and dating

Median joining networks19 were constructed within the program Network 4.1.0.9 (www.fluxus-engineering.com/sharenet.htm) using variance-based weighting as described.20 Weighting for hgA1 was based on variance in all hgA1 chromosomes, whereas weighting for the R surname network was based on variance observed in 291 British hgR1b chromosomes.

Time-to-most-recent-common-ancestor (TMRCA) was estimated within Network from the ρ-statistic, using a 35-year generation time,15 and a mean per-locus, per-generation mutation rate of 2 × 10⁻³.21, 22, 23
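The ρ-based age calculation can be sketched as follows. This is a minimal illustration, not the Network program's own code: it assumes the commonly cited form of the estimator, in which ρ is the mean number of mutational steps from the root haplotype to each sampled chromosome and its standard error is computed from the number of sampled descendants of each branch. The function name `rho_tmrca` and the branch counts in the usage lines are hypothetical and are not read off the networks in this study.

```python
from math import sqrt

def rho_tmrca(branches, n_samples, n_loci, mu=2e-3, generation_years=35):
    """Estimate TMRCA (in years) and its standard error from a rooted network.

    branches: list of (mutations_on_branch, sampled_descendants_of_branch).
    mu: mean per-locus, per-generation mutation rate (value stated in the text).
    generation_years: generation time in years (value stated in the text).
    """
    # rho: average number of mutations separating each sample from the root.
    rho = sum(steps * desc for steps, desc in branches) / n_samples
    # Assumed standard-error formula: sqrt(sum(steps * desc^2)) / n.
    sigma = sqrt(sum(steps * desc * desc for steps, desc in branches)) / n_samples
    # One mutation across the whole haplotype is expected every
    # 1 / (mu * n_loci) generations.
    years_per_mutation = generation_years / (mu * n_loci)
    return rho * years_per_mutation, sigma * years_per_mutation

# Hypothetical input: 7 chromosomes typed at 17 loci, with two one-step
# branches from the root carried by 2 and 1 chromosomes respectively.
age, sd = rho_tmrca([(1, 2), (1, 1)], n_samples=7, n_loci=17)
print(f"TMRCA ~ {age:.0f} +/- {sd:.0f} years")
```

Because the expected time per mutation scales with 1/(μ × number of loci), typing more loci shortens the time represented by each observed mutational step and so narrows the estimate.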

Results

As part of a survey of British Y-chromosome diversity, we recruited a set of 421 males, who described themselves as British, and whose paternal grandfathers were born in Britain. The Y chromosomes of these males were typed using a set of 11 binary markers,15 including M145 (defining superhaplogroup DE) and M89 (defining superhaplogroup F). All chromosomes carried the derived allele at one or other of these two markers, with a single exception, in male GB1757, which could in principle belong to hgA, B or C (see phylogeny in Figure 1). Further testing, including the markers M91 and M31, gave the surprising result that it belonged to hgA, within the sublineage A1.

Figure 1

Distribution of Y chromosomes belonging to hgA. Distribution of hgA among 4516 Y chromosomes in and around Africa, taken from the literature (see text). The colored sector of each pie chart is proportional to the frequency of subgroups of hgA, defined according to the phylogeny lower left. The gray sector in Cyprus indicates hgA chromosomes that were not further subclassified.32 The phylogeny includes selected markers referred to in the text.

HgA is the deepest-rooting clade of the Y phylogeny, and shows a particularly specific localization to the African continent (Figure 1), which is compatible with an African origin for modern human Y chromosomes. It constitutes 5.4% of a composite sample of 3551 Africans,4, 8, 10, 24, 25, 26, 27, 28, 29, 30 whereas in non-African indigenous populations only seven cases have been described, from Turkey,31 Cyprus,32 Sardinia33, 34 and Oman.35 Extensive surveys of Western European populations have failed to find any examples of these chromosomes.13, 36, 37, 38, 39 The subhaplogroup A1 was first reported in a single individual among a sample of 44 males from Mali.4 Subsequently, this scarce Western African hg has been found in only 25 more males (Figure 1): 2/64 Moroccan Berbers,24 3/766 African Americans,40, 41 2/39 Mandinka from Gambia/Senegal, 1/55 Malian Dogon,30 1/201 Cape Verde Islanders, 14/276 males from Guinea-Bissau,10 and 2/39 males from Niger (F Cruciani and R Scozzari, unpublished data).

The British male carrying the hgA1 chromosome knew of no familial African connection, and he displays a typical European appearance. To investigate the relationship of his Y chromosome with African examples, we compared its 10-locus Y-STR haplotype (see Supplementary Information) with those of 10 other available hgA1 chromosomes,10, 24, 40 (F Cruciani and R Scozzari, unpublished data). Figure 2a shows a median-joining network of these haplotypes: they are diverse, and all are unique. Although the British haplotype is peripheral, it lies equidistant (four mutational steps) from Niger and Guinea-Bissau haplotypes, and similar distances (2–4 steps) exist between other haplotypes in the network. This is compatible with a Western African origin for the British chromosome, but does not point to a particular population. A search of the Y Chromosome Haplotype Reference Database (http://www.yhrd.org) using the British haplotype (11 loci) finds no matches among 15 815 chromosomes worldwide, emphasizing its rarity. Also, when the haplotypes of the other hgA1 chromosomes are used in similar searches, they find only self-matches in the populations from which they derive, underlining the scarcity and African-specificity of hgA1.
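As a rough illustration of what a distance in "mutational steps" means, the sketch below counts single-repeat differences between two Y-STR haplotypes under a simple stepwise model. It is not the median-joining algorithm used by Network, and the repeat counts are invented for the example. The locus list follows the Figure 2 legend, where DYS389II is entered as DYS389II minus DYS389I because the DYS389II amplicon contains the DYS389I repeat block; the function names are hypothetical.

```python
def to_network_loci(raw):
    """Return a copy of the haplotype with DYS389II replaced by DYS389II-I."""
    hap = dict(raw)
    # Subtract DYS389I so a change in that block is not counted twice.
    hap["DYS389II-I"] = hap.pop("DYS389II") - hap["DYS389I"]
    return hap

def step_distance(raw_a, raw_b):
    """Sum of absolute repeat-count differences across shared loci."""
    a, b = to_network_loci(raw_a), to_network_loci(raw_b)
    return sum(abs(a[locus] - b[locus]) for locus in a)

# Hypothetical repeat counts, for illustration only.
hap_a = {"DYS19": 15, "DYS390": 21, "DYS391": 10, "DYS392": 11, "DYS393": 13,
         "DYS437": 14, "DYS438": 10, "DYS439": 12, "DYS389I": 13, "DYS389II": 30}
hap_b = dict(hap_a, DYS19=16, DYS389I=12, DYS389II=29)

print(step_distance(hap_a, hap_b))
```

In this example the two haplotypes differ at DYS19 and DYS389I only; without the subtraction, the single DYS389I change would also appear as a difference at DYS389II.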

Figure 2

Diversity of Y-STR haplotypes of chromosomes belonging to hgA1 and within the R surname. (a) Relationships of Y-STR haplotypes within hgA1. Weighted median joining network containing the 10-locus Y-STR haplotypes (DYS19, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS389I, DYS389II-I) of eleven hgA1 chromosomes. Circles represent haplotypes, with area proportional to frequency and colored according to population. (b) Relationships of Y-STR haplotypes carried by men with the R surname. Weighted median joining network containing the 17-locus Y-STR haplotypes of 18 men named R. Circles represent haplotypes, with area proportional to frequency and colored according to hg, as shown in the phylogeny to the left. The node used as a root for dating hgA1 chromosomes is indicated with an asterisk.

How long has this archetypically African Y chromosome been in England? To address this question, our strategy was to seek patrilineally related individuals who would share the hg, but whose Y-STR haplotype diversity could be used to estimate a TMRCA. To do this, we exploited the relationship between surnames and Y chromosome haplotypes,15, 42, 43, 44 noting that the upper bound of any estimated age would be limited by the fact that hereditary English surnames did not exist before the 11th century.45

The hgA1-bearing male bears a locative surname, which we refer to here as R, deriving from an East Yorkshire village.46 Only 121 people carried this name in 1998 (http://www.spatial-literacy.org/uclnames), and it still has a strong East Yorkshire focus. We recruited 18 apparently unrelated men carrying this name (or a close variant spelling, carried by 50 individuals) and typed a set of 11 binary markers and 17 Y-STRs,15 supplemented with the binary marker M31, allowing us to identify hgA1.

Figure 2b shows a median-joining network of 17-locus Y-STR haplotypes (see Supplementary Information) of Y chromosomes carried by 18 R-surnamed males. The chromosomes belong to three hgs, and include four clusters, indicating either multiple foundation or historical non-paternity within the name. However, a total of seven of the males carry hgA1 chromosomes, belonging to three closely related Y-STR haplotypes, and based on the ρ-statistic within Network, having a TMRCA of 440±330 years.

As an empirical adjunct to TMRCA calculations, we undertook extensive genealogical research to ask if the seven R-surnamed males carrying hgA1 chromosomes could be connected into a single genealogy with a historically verifiable MRCA. This research resolved the males into two well-supported genealogies (Figure 3a), with MRCAs born in 1788 and 1789, respectively. However, although both of these ancestors were resident in Yorkshire, evidence could not be found for a familial relationship between them. Patterns of forename usage in the two genealogies are quite distinct, which argues against a very recent connection. We recruited 12 unrelated R-surnamed men from the USA, hoping that the presence of hgA1 would indicate that the chromosome had been associated with the surname before emigration from Britain. However, none of these men carried a chromosome from this hg (data not shown), so the approach was uninformative.

Figure 3

Genealogical relationships and Y-STR haplotypes of hgA1 R-surnamed men. (a) Genealogies of R-surnamed men bearing hgA1 Y chromosomes (left). To the right are their 17-locus Y-STR haplotypes, with differences highlighted by boxes. (b) Additional difference revealed at DYS537; haplotypes based on the other 59 Y-STRs (see text and Supplementary Information) are identical in all males.

Finally, we exploited a new resource of multiple novel Y-STRs47 in an attempt to refine TMRCA for the two genealogies. We typed the hgA1 chromosomes with an additional 60 Y-STRs (see Materials and methods), bringing the total to 77. Surprisingly, this analysis revealed only one new mutation – a single repeat decrease at Y-STR DYS537 in male GB1758 (Figure 3b). Applying the ρ-statistic within Network, as described above, to the 73-locus Y-STR haplotypes (excluding the bilocal DYS385a/b and YCAIIa/b; see Supplementary Information) in the two genealogical clusters yields a TMRCA of 140±80 years, which, adding the average age of the living individuals (52 years), equates to a likely oldest date of 1734 for the coalescence of the two genealogies. The TMRCA range overlaps that obtained using 17 markers (see above), and suggests that only a small number of generations separates the two genealogies from their common ancestor.
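The conversion from a TMRCA to a calendar date can be made explicit with a short sketch. The sampling year is not stated in the text; 2006 is an assumption chosen here because it is consistent with the stated date of 1734. The function name is hypothetical.

```python
def oldest_coalescence_year(sampling_year, mean_age, tmrca, sd):
    """Earliest plausible birth year of the common ancestor.

    The TMRCA is counted back from the birth of the living men, so their
    mean age is subtracted first, then the upper end of the TMRCA range.
    """
    mean_birth_year = sampling_year - mean_age
    return mean_birth_year - (tmrca + sd)

SAMPLING_YEAR = 2006  # assumption: not given in the text

print(oldest_coalescence_year(SAMPLING_YEAR, mean_age=52, tmrca=140, sd=80))
```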

Discussion

Our study shows that a globally rare Y-chromosome type, belonging to the deepest-rooting African branch of the Y-phylogeny, has been present in Northern England since at least the mid-18th century. HgE3a is by far the most frequent Y-chromosomal lineage in Africa, existing at 48% in a continent-wide sample of 1122 chromosomes,30 so we would expect any substantial past immigration from Africa to Britain to have left examples of chromosomes belonging to this common hg. However, a survey of 1772 Y chromosomes from the British Isles found none,13 and they are also absent from our control sample of 421 chromosomes. The general rarity of African lineages may reflect a low level of initial introgression, later loss through drift, or sampling bias – for example, the large British survey13 sampled from small towns, in which the descendants of early British Africans, who were concentrated in cities, may be depleted.
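To give a sense of how strongly a zero count constrains the frequency of hgE3a, the sketch below computes an exact one-sided binomial upper bound on a frequency when no carriers are seen among n sampled chromosomes. This calculation is not part of the study; it assumes random sampling and, for the second step, that past African immigrants carried hgE3a at the continent-wide frequency of 48% quoted above. The function name is hypothetical.

```python
def upper_bound_zero_observed(n, confidence=0.95):
    """Largest frequency p for which seeing 0 carriers in n draws still has
    probability of at least (1 - confidence), i.e. (1 - p)**n = 1 - confidence.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

# 0 hgE3a chromosomes among the 1772 surveyed in the British Isles.
bound = upper_bound_zero_observed(1772)

# Simple allele counting: scale by the assumed hgE3a frequency in the
# source population to bound the overall African paternal contribution.
african_share = bound / 0.48

print(f"hgE3a frequency below {bound:.2%}; African share below {african_share:.2%}")
```

Under these assumptions the bound describes the sampled locations only, which matters given the sampling bias towards small towns noted above.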

Admixture between populations of African and European origin is often sex-biased, with a greater proportion of the African component of the hybrid population being contributed by females.48 Assuming an equal number of males and females of African origin migrating to Britain, we might therefore expect mitochondrial DNA to reveal a stronger signal of African admixture than the Y chromosome. There is little published evidence, but a study of mitochondrial DNA sequence diversity among 100 ‘white Caucasian’ British49 does contain one haplotype, which represents an hg L1c sequence (defined according to Salas et al.50), with a probable origin in West Central Africa.51 This could represent a possible maternal counterpart to the Y-lineage we describe here.

The presence of a very rare type of Y chromosome in two distinct branches of a genealogy that coalesce before the late 18th century demonstrates clearly that the chromosome must have been introduced before this time. However, the upper limit on the time of linkage with the R surname is less certain. In principle, the association could have been formed many generations earlier, with the descendants of any earlier-branching lineages having drifted to extinction, or not having been sampled in our study. As the pattern of Y-diversity in the surname suggests either multiple origins for the name, or non-paternities introducing diverse Y chromosomes, the hgA1 lineage need not have been a founding type at the time of surname establishment (about 700 years ago45). The contributor of the chromosome to the surname may himself have been a first-generation immigrant African, or an admixed and phenotypically European man carrying an African Y chromosome introduced into Britain some time earlier.

The remarkable Y chromosome present in the R surname provides the first genetic evidence of a long-lived African presence within Britain. It represents a cautionary tale for those who would predict population-of-origin from a Y hg,7 and emphasizes the complex nature of the history of human migration.