Sinhalese Genetics: Abstracts and Summaries

The Sinhalese are a major ethnic group living in Sri Lanka, the South Asian island nation that was called Ceylon until 1972. Their language, Sinhala, is part of the Indo-Aryan linguistic family. Sinhala means "lion people".

The Sinhalese originally came from northern India, as shown by genetic and linguistic connections. After they arrived in Sri Lanka, some of the Sinhalese apparently intermarried with the indigenous Vedda people, but any such lineages are minor. The data also show that they intermarried with Tamils, certainly to a more substantial degree than with the Vedda. By and large, Sinhalese are related to Bengalis, with some elements of Tamil-, Gujarati-, and Punjabi-connected ancestry. They don't seem to have any ancestry from Southeast Asia.

Some admixture studies show a larger Sinhalese-Tamil ancestral connection than others.

The Sinhalese outnumber the Tamils within Sri Lanka. Thankfully, the long-standing civil war between the Sri Lankan government and the Tamil Tigers has ended. This would be a good time for the Sinhalese majority and the Tamils to reflect on their partially common roots so that a violent war will not occur again.

Common paternal (Y-DNA) haplogroups among Sinhalese men include R2a (found in about 38%) and R1a1a (found in about 13%). R2 haplogroups are found in high frequencies among Sinhalese, Romanies, and West Bengalis, which supports the view that these groups shared some ancestry thousands of years ago. About 10% of Sinhalese belong to the haplogroup (paragroup) F*. Haplogroup H is also important among the Sinhalese, but I don't know the percentage.

Major studies of Sinhalese

Sarabjit S. Mastana. "Molecular Anthropology: Population and Forensic Genetic Applications." Anthropology Today: Trends, Scope and Applications, Anthropologist Special Volume No. 3 (2007) guest-edited by Veena Bhasin and M. K. Bhasin: Chapter 29 on pages 373-383. Page 380 contains a section with the headline "The Mystery of Sinhalese Origins: An Alu perspective" and below is an excerpt from it:

"[...] Many researchers have attempted to untangle the mystery of the Sinhalese origins as they seem to have genomic contributions from many areas of India [...] but results with conventional systems are contradictory and partial. New genetic markers may be able to provide a perspective on the origin of the Sinhalese. In order to address these, we analysed the above mentioned 30 Alu polymorphisms in a sample of 121 Sinhalese collected from Colombo, Sri Lanka (Papiha et al., 1996b; Papiha and Mastana, 1999). In addition, Alu frequency data from Bengali (89) and Tamil (101), North and Western Indian populations (from the above study) were used for evaluation of genetic variation, affinities and genetic admixture. [...] Overall pattern of genetic relationships points towards substantial Bengali contribution as shown in DA distance derived dendrogram (Fig. 6) and admixture analyses. [...] When three parental populations were used Bengali contribution remained strong (50-66%) followed by North Western (20-23%) and rest contributed by Tamil."
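The phrase "rest contributed by Tamil" in the excerpt above leaves the Tamil share implicit. As a rough illustration only, the sketch below computes the widest possible range for that remainder, assuming the three contributions sum to 100% and that the quoted Bengali and North Western ranges can vary independently. Both assumptions are mine and are not stated in the excerpt, and the function name `remainder_range` is hypothetical; the paired estimates in the paper itself may be narrower.

```python
def remainder_range(*ranges):
    """Return (low, high) percent left over after the given (low, high) ranges.

    Assumes all contributions sum to 100% and that each quoted range
    varies independently (an assumption, not something the excerpt states).
    """
    total_low = sum(low for low, _ in ranges)
    total_high = sum(high for _, high in ranges)
    # The smallest remainder pairs with the largest stated shares, and vice versa.
    return (max(0.0, 100.0 - total_high), max(0.0, 100.0 - total_low))


# Ranges quoted in the excerpt: Bengali 50-66%, North Western 20-23%.
tamil_low, tamil_high = remainder_range((50, 66), (20, 23))
print(f"Implied Tamil contribution: {tamil_low:.0f}-{tamil_high:.0f}%")
```

Under these assumptions the implied Tamil contribution falls somewhere between 11% and 30%.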

S. S. Papiha, Sarabjit S. Mastana, C. A. Purandare, R. Jayasekara, and R. Chakraborty. "Population genetic study of three VNTR loci (D2S44, D7S22, and D12S11) in five ethnically defined populations of the Indian subcontinent." Human Biology 68:5 (October 1996): pages 819-835. The Sinhalese were found to have only a small amount of northwestern Indian ancestry. The article's data uphold the idea that Sinhalese are largely descended from Bengalis and descended to a lesser extent from Tamils, Gujaratis, and Punjabis. Excerpts from the Abstract:

"Using RFLP (restriction fragment length polymorphism) analysis, we have characterized the genotypic variation of three VNTR (variable number of tandem repeat) loci (D2S44, D7S22, and D12S11) with probes YNH24, g3, and MS43a, respectively, for 288 individuals from 5 genetically well-defined ethnic groups (Brahmins, Maratha, Gujarati Patel, Sinhalese, and Moors) of the Indian subcontinent. The distributions of VNTR alleles at the binned level were examined among the five populations, and the genetic affinities obtained using the VNTR data were compared with serogenetic data on 22 blood group and protein loci previously reported from our laboratory. For classical genetic markers the Sinhalese show slight affinity with the populations of western India. However, the genetic affinity results considerably parallel the results for VNTR loci and 25 combined VNTR/blood group/protein loci, suggesting that the Sinhalese show the least affinity with the populations of western India. These results confirm the findings of a recent study of genetic relationships of the populations of Sri Lanka based on admixture analysis. [...]"

G. N. Malavige, T. Rostron, S. L. Seneviratne, S. Fernando, S. Sivayogan, A. Wijewickrama, and G. S. Ogg. "HLA analysis of Sri Lankan Sinhalese predicts North Indian origin." International Journal of Immunogenetics 34:5 (2007): pages 313-315. Sinhalese people carry the rare allele HLA-A*02 at a frequency of 7.4%, similar to the frequency of 6.7% among North Indian peoples. The allele is believed to originate in northern India and this study shows the partial northern roots of the Sinhalese.

Vajira H. W. Dissanayake, Victoria Giles, Rohan W. Jayasekara, Harshalal R. Seneviratne, Noor Kalsheker, Fiona Broughton Pipkin, and Linda Morgan. "A study of three candidate genes for pre-eclampsia in a Sinhalese population from Sri Lanka." The Journal of Obstetrics and Gynaecology Research 35:2 (April 2009): pages 234-242. First published online on November 12, 2008. With the assistance of medical specialists, the scientists tested members of three ethnic groups of Sri Lanka (Sinhalese, Sri Lankan Tamils, and Moors) along with white European people from the United Kingdom. They used the genetic samples to compare their allele frequencies and found that the Sri Lankans closely matched each other. Excerpt from the "Results and Discussion" section:

"In all genes haplotype and allele frequencies were comparable within the three Sri Lankan populations, but differed significantly from those in the white Western European population. [...]"

Vajira H. W. Dissanayake, Lakshini Y. Weerasekera, C. Gayani Gammulla, and Rohan W. Jayasekara. "Prevalence of genetic thrombophilic polymorphisms in the Sri Lankan population--implications for association study design and clinical genetic testing services." Experimental and Molecular Pathology 87:2 (October 2009): pages 159-162. First published electronically on July 8, 2009. This article is consistent with the notion that Sinhalese are closely related to other Sri Lankans. The frequencies of the alleles observed were found to be very similar between Sinhalese, Sri Lankan Tamils, and Moors and they were also similar to those in some ethnic groups from southern India. Excerpts from the Abstract:

"We investigated the prevalence of genotypes/alleles of single nucleotide polymorphisms (SNP) and haplotypes defined by them in three genes in which variations are associated with venous thromboembolism in 80 Sinhalese, 80 Sri Lankan Tamils and 80 Moors in the Sri Lankan population and compared the SNP data with that of other populations in Southern India and haplotype data with that of HapMap populations. [...]"

Ruwan J. Illeperuma, Samudi N. Mohotti, Thilini M. De Silva, Neil D. Fernandopulle, and W. D. Ratnasooriya. "Genetic profile of 11 autosomal STR loci among the four major ethnic groups in Sri Lanka." Forensic Science International: Genetics 3:3 (June 2009): pages e105-e106. This is another study that concluded that Sinhalese people are closely linked by ancestry with other Sri Lankans. This and other studies show that the link extends to Sri Lankan Moors. That is interesting because the Moors were originally Arabs from further west, so they must have intermarried with the locals. Excerpts from the Abstract:

"Allele frequencies and statistical parameters of forensic interest are presented for 11 autosomal microsatellites (CSF1PO, TPOX, TH01, D16S539, D13S317, D7S820, F13A, F13B, FESFPS, vWA and LPL) of four ethnic groups in Sri Lanka. A total of 513 unrelated individuals from Sinhalese, Sri Lankan Tamil, Indian Tamil and Sri Lankan Moor population groups were included. [...] All the 11 microsatellites were found to be highly polymorphic, with the combined power of exclusion being greater than 0.99999, in all four ethnic groups. Overall data analysis suggests that a single combined genetic database could be used for genetic-based identification purposes for the four ethnic groups."

Mikiko Soejima and Yoshiro Koda. "Denaturing high-performance liquid chromatography-based genotyping and genetic variation of FUT2 in Sri Lanka." Transfusion 45:12 (December 2005): pages 1934-1939. Sri Lankan Tamils and Sinhalese were genotyped for FUT2, a secretor gene, using denaturing high-performance liquid chromatography (DHPLC) analysis. Excerpt from the Conclusion:

"[...] the genetic backgrounds of two Sri Lankan populations are quite similar, with little genetic flow from neighboring East and Southeast Asian populations to Sri Lanka."

Mikiko Soejima and Yoshiro Koda. "Population differences of two coding SNPs in pigmentation-related genes SLC24A5 and SLC45A2." International Journal of Legal Medicine 121:1 (January 2007): pages 36-39. First published electronically on July 18, 2006. Excerpts from the Abstract:

"The two genes SLC24A5 and SLC45A2 were recently identified as major determinants of pigmentation in humans and in other vertebrates. The allele p.A111T in the former gene and the allele p.L374F in the latter gene are both nearly fixed in light-skinned Europeans, and can therefore be considered ancestry informative marker (AIMs). [...] Here, we generate new allelic data for these two genes from samples of Chinese, Uygurs, Ghanaians, South African Xhosa, South African Europeans, and Sri Lankans (Tamils and Sinhalese). Our data confirm the earlier results and furthermore demonstrate that the SLC45A2 allele is a more specific AIM than the SLC24A5 allele because the former clearly distinguishes the Sri Lankans from the Europeans."

Gautam K. Kshatriya. "Genetic affinities of Sri Lankan populations." Human Biology 67:6 (December 1995): pages 843-866. This study of genetic distance used a series of alleles in multiple populations. Its contentions, especially its analysis showing Bengalis not being close to Sinhalese, differ from those of some other studies. Excerpts from the Abstract:

"[...] Both analyses give a similar picture, indicating that present-day Sinhalese and Tamils of Sri Lanka are closer to Indian Tamils and South Indian Muslims. They are farthest from Veddahs and quite distant from Gujaratis and Punjabis of northwest India and Bengalis of northeast India. [...] The study of genetic admixture revealed that the Sinhalese of Sri Lanka have a higher contribution from the Tamils of southern India (69.86% +/- 0.61) compared with the Bengalis of northeast India (25.41% +/- 0.51), [...]"

Lanka Ranaweera, Supannee Kaewsutthi, Aung Win Tun, Hathaichanoke Boonyarit, Samerchai Poolsuwan, and Patcharee Lertrit. "Mitochondrial DNA history of Sri Lankan ethnic people: their relations within the island and with the Indian subcontinental populations." Journal of Human Genetics 59 (2014): pages 28-36. First published online on November 7, 2013. This is a comprehensive study of mtDNA haplogroups in 271 unrelated individuals from Sri Lanka and South India comprising the Sinhalese, Tamil, and Vedda peoples. The Tamils were subdivided into Sri Lankan Tamils and Indian Tamils, and the Sinhalese were subdivided into Up-country and Low-country Sinhalese. The Tamil and Sinhalese subgroups all have higher mtDNA haplotype diversity than the Vedda do.
Table 2 lists all the haplogroups found in the sample populations, divided by ethnicity.
 • Mitochondrial haplogroups found among Up-country Sinhalese in the study are: Ma, M/N, M2a1, M12a1b, M33a1b/M35+199, M35a, M36a, M45, M52, M65a, M66, D4a, G3a1'2, H5, Nb, N1a1'2, R8/U4'9, R5a2a, R5a2b, R6a, R30b/R8a1a3, U1a'c, U2b, U6, U7, U7a (found in 7 of them, or 11.67%), P4a, and W.
 • Those found among Low-country Sinhalese are: Ma, M/N, M3, M6a, M18a, M30f, M33a1b/M35+199, M41, G3b1, HV2, R5a2b, R7a'b, R30b/R8a1a3, R30b (found in 6 of them, or 15%), U5a, U6, U7a, and T1a1'3.
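The percentages in the two list items above are simple carrier counts divided by subgroup sample sizes. The sketch below shows that arithmetic in both directions. The sample sizes it works with (60 Up-country and 40 Low-country Sinhalese) are not stated in this summary; they are back-calculated here from the quoted counts and percentages and should be treated as an assumption. The function names are hypothetical.

```python
def haplogroup_frequency(count, sample_size):
    """Percentage of a sample carrying a haplogroup, rounded to 2 decimals."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return round(100.0 * count / sample_size, 2)


def implied_sample_size(count, percent):
    """Back-calculate the sample size implied by a count and its percentage."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return round(count * 100.0 / percent)


# Figures quoted in the list: U7a in 7 Up-country Sinhalese (11.67%),
# R30b in 6 Low-country Sinhalese (15%).
up_country_n = implied_sample_size(7, 11.67)   # inferred, not stated in the text
low_country_n = implied_sample_size(6, 15.0)   # inferred, not stated in the text

print(up_country_n, haplogroup_frequency(7, up_country_n))
print(low_country_n, haplogroup_frequency(6, low_country_n))
```

Recomputing the frequency from the inferred sample size is a quick consistency check on figures copied from a table.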

Excerpts from the Abstract:

"[...] The haplotypes and analysis of molecular variance revealed that Vedda people's mitochondrial sequences are more related to the Sinhalese and Sri Lankan Tamils' than the Indian Tamils' sequences. MtDNA haplogroup analysis [on HVS-1 and HVS-2] revealed that several West Eurasian haplogroups as well as Indian-specific mtDNA clades were found amongst the Sri Lankan populations. Through a comparison with the mtDNA HVS-1 and part of HVS-2 of Indian database, both Tamils and Sinhalese clusters were [more] affiliated with Indian subcontinent populations than Vedda people who are believed to be the native population of the island of Sri Lanka."

Excerpts from the body of the paper:

"[...] In general, Sinhalese (Up-country and Low-country) and Tamil (Sri Lankan and Indian) subgroups exhibited relatively higher haplotype diversity (0.861-1.000) than did those of the Vedda (0.503-0.965). [...] The majority of Sinhalese and Tamil subgroups form close genetic proximities among themselves on both PC [principal component] axes. Major exception to this clustering is found in SU-Thu. It was evident that Up-country Sinhalese are genetically closer to Sri Lankan Tamils. On the other hand, Sri Lankan Tamil subgroups were closer to each other when compared with Indian Tamils. [...] Up-country Sinhalese, Low-country Sinhalese and Sri Lankan Tamils exhibited similar frequencies of haplogroup M (41.67-43.59%) [...] Haplogroup U was mostly found in Vedda (29.33%) and Up-country Sinhalese (23.33%) [...]"

Toomas Kivisild, Siiri Rootsi, Mait Metspalu, S. Mastana, K. Kaldma, J. Parik, Ene Metspalu, M. Adojaan, H.-V. Tolk, V. Stepanov, M. Gölge, E. Usanga, S. S. Papiha, C. Cinnioglu, R. King, Luigi Luca Cavalli-Sforza, Peter A. Underhill, and Richard Villems. "The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations." American Journal of Human Genetics 72:2 (February 2003): pages 313-332. Part of this study shows a genetic link deep in time between the Sinhalese and the Romani people who migrated from India to Europe. According to this study, both Romani and Sinhalese men carry the Y-DNA haplogroup H in high frequencies. This study furthermore notes that about 10 percent of Sinhalese men have the Y-DNA haplogroup F*. This is apparently also the study that shows that 13 percent of Sinhalese men carry the Y-DNA haplogroup R1a1a (R-M17).

S. Sengupta, L. Zhivotovsky, R. King, S. Mehdi, C. Edmonds, C. Chow, A. Lin, M. Mitra, et al. "Polarity and Temporality of High-Resolution Y-Chromosome Distributions in India Identify Both Indigenous and Exogenous Expansions and Reveal Minor Genetic Influence of Central Asian Pastoralists." The American Journal of Human Genetics 78:2 (2006): pages 202-221. Part of this study reportedly shows a genetic link deep in time between the Sinhalese and the Sinti Roma people who migrated from India to Europe.

N. Saha. "Blood genetic markers in Sri Lankan populations–reappraisal of the legend of Prince Vijaya." American Journal of Physical Anthropology 76:2 (June 1988): pages 217-225. This study uses older scientific techniques and its results on the proportions of Sinhalese ancestry do not accord with more modern studies. The three populations tested here were Sinhalese, Sri Lankan Tamils, and Sri Lankan Muslims. Excerpt from the Abstract:

"[...] The allelic frequencies of all the polymorphic systems were similar in these populations without any significant differences. A close look at the present results and earlier investigations on 13 polymorphic loci controlled by 37 alleles did not reveal any genetic characteristics in the present-day Sinhalese population that are distinct from those in the Tamils of Sri Lanka. As such, genetic evidence linking the legendary origin of the Sinhalese population to East India (Prince Vijaya) is lacking."

D. F. Roberts, C. K. Creen, and K. P. Abeyaratne. "Blood Groups of the Sinhalese." Man, New Series, 7:1 (March 1972): pages 122-127. This is a study that used older genetic techniques. Blood samples were drawn from 157 Sinhalese (133 adults and 24 children) with the aim of comparing their gene frequencies with other peoples of Sri Lanka and peoples currently residing in southern India and central India. Excerpts:

"The Sinhalese of Ceylon pose an interesting genetic problem. They speak a fundamentally Aryan language, and trace their descent from ancestors from the central latitudes of India who came to the island about 500 B.C. [...] Culturally there is considerable affinity between Sinhalese and other groups of Middle India, from whom they are apparently separated by the great Dravidian language block of southern India. From the limited genetical information available, in the gene frequencies for some characters the Sinhalese are very similar to other populations throughout India, e.g. the isoenzyme systems (Roberts et al. 1972). In others, however, they appear to be at the limit of a cline across India from north to south, for example of diminishing frequencies of blood group gene B and increasing O (Mourant et al. 1958; Mourant 1962). The problem then is how to reconcile the occurrence of a regular gradient in frequency with the cultural and traditional evidence. [...]"

