Han Chinese Genetics: Abstracts and Summaries

Family Tree DNA: Genetic Testing Service
DNA testing will show how you're connected to other families and ethnic groups living in Asia and the world. People descended from any ethnic group of China in their mtDNA and/or Y-DNA lines who have tested with Family Tree DNA are welcome to join the "China / Chinese DNA Project" that's administered by Ivan Shim, Mike Sastrosudharmo, and Owen Lu. It has many hundreds of members.

The Han Chinese people are the founders of China as a nation and a culture. They are divided into numerous subgroups including Cantonese, Hakka, Sichuanese, Hunanese, Wu, etc. Their languages and dialects are members of the Sino-Tibetan language family.

The Han originated in China's Central Plain (Zhōngyuán) region and were descendants of the Hua and Xia tribes that farmed the lands near the Yellow River (Huáng Hé). Around 1600 B.C.E., during the early period of unified China's rule by kings of the Shang dynasty, the Hua and Xia combined to form the Huaxia ethnicity. They later came to call themselves the Han, after the ruling Han imperial dynasty (206 B.C.E. to 220 C.E.).

The Han people did not originally live as far south as Guangdong or as far southwest as Sichuan, nor in the far northern areas of today's China. In later times, many Han men moved southward and northward into lands of other cultures and intermarried with local women, including those from the so-called Yue peoples of the south and the Dian peoples of the southwest, and China grew politically to encompass those new lands. Han culture became dominant in southern China after this expansion, and the descendants of Han-Yue intermarriages came to regard themselves as Han. Although the Han are a coherent ethnicity on the paternal side, carrying a core group of Y-chromosomal haplogroups across the geographic span of the ethnicity, some genetic differences between the Northern Han and Southern Han persist to the present day, because Southern Han are somewhat shifted towards Southeast Asians and carry some different mtDNA haplogroups. Nevertheless, Razib Khan pointed out that the Southern Han Chinese "are not closer to Southeast Asians than they are to North Chinese (the furthest southern dialect groups, such as those of Guangdong, are about equidistant to Vietnamese)."

The Y-DNA haplogroup O2-M122 is very common in the Han Chinese population and had a presence in prehistoric China, as did Q1a1a1-M120, which is also found among Mongols. Other branches of Q1a are found among Central Asians, Siberians, Amerindians, and Northern Europeans.

Branches of Y-DNA haplogroup C - including C-M217 (also known as C2-M217, formerly C3-M217) and C-F2613 - are also found among Han Chinese. The subclade C-CTS4660 is found among some Hans from Fujian province in southeastern China.

Most Han Chinese lineages are of East Eurasian origin, and in autosomal tests most Chinese people score entirely within the East Asian and Southeast Asian categories. However, some male lineages originating from Central-South Eurasia or West Eurasia have been detected in some groups of northern Han, including:

  • R1a1, which is particularly common in Eastern Europe, Central Asia, and South Asia;
  • R2a, which is especially found among South Asians and also found among some Central Asians;
  • G2a, which is fairly common in Southern Europe, Asia Minor, and the Caucasus;
  • J1, which is especially common in the Middle East among Arabs and among Jews of Israelite origin;
  • J2a, which is prevalent among Middle Easterners, Italians, southern Spaniards, Pakistanis, and northwestern Indians.

    Major studies of Han Chinese

    Charleston W. K. Chiang, Serghei Mangul, Christopher R. Robles, and Sriram Sankararaman. "A Comprehensive Map of Genetic Variation in the World's Largest Ethnic Group-Han Chinese." Molecular Biology and Evolution 35:11 (November 1, 2018): pages 2736-2750.
          A comprehensive autosomal DNA study of the genomes of 11,670 Han women from 19 of China's provinces plus one autonomous region and all four direct-controlled municipalities. Excerpts from the Abstract:

    "[...] We identified previously unrecognized population structure along the East-West axis of China, demonstrated a general pattern of isolation-by-distance among Han Chinese, and reported unique regional signals of admixture, such as European influences among the Northwestern provinces of China. [...]"

    Jieming Chen, Houfeng Zheng, Jin-Xin Bei, Liangdan Sun, Wei-hua Jia, Tao Li, Furen Zhang, Mark Seielstad, Yi-Xin Zeng, Xuejun Zhang, and Jianjun Liu. "Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation." The American Journal of Human Genetics 85:6 (December 11, 2009): pages 775-785.
          This genome-wide autosomal DNA study examined 6,580 Han Chinese samples from ten provinces in China (including Sichuan province in the southwest and Guangdong province in the south) plus 1,050 Han Chinese "from the Chinese metropolises of Beijing and Shanghai" and 570 Chinese living in Singapore. Those studied speak diverse languages including Mandarin (multiple dialects), Cantonese, Hakka, and Teochew.

    Excerpts from the Abstract:

    "[...] our study revealed a one-dimensional 'north-south' population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused 'outliers,' probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. [...]"

    Excerpts from the Discussion section:

    "[...] The inclusion of samples from the Sichuan province, in western China, did not reveal evidence for an east-west stratification of the Han Chinese population. This is likely to result from an underrepresentation of samples from the more western provinces. The absence of east-west stratification could also be the result of the recent mass migration of peoples from central and southern regions, such as Hunan and Hubei, of China to Sichuan during the late Ming and early Qing dynasties. [...]"

    Bo Wen, Hui Li, Daru Lu, Xiufeng Song, Feng Zhang, Yungang He, Feng Li, Yang Gao, Xianyun Mao, Liang Zhang, Ji Qian, Jingze Tan, Jianzhong Jin, Wei Huang, Ranjan Deka, Bing Su, Ranajit Chakraborty, and Li Jin. "Genetic evidence supports demic diffusion of Han culture." Nature 431 (September 16, 2004): pages 302-305.
          Individuals from 23 Han populations were sampled, yielding 1,289 Y-DNA samples and 1,119 mtDNA samples. Han people across virtually all of China are associated with a core group of Han Y-DNA haplogroups. Page 302 notes in particular that Northern Han and Southern Han men share O3-M122 and O3e-M134 and that these are found in high frequencies among both groups. By contrast, the Y-DNA haplogroup Q-M120 is reported to have a low percentage among Han Chinese men. Furthermore, according to the authors, the haplogroups O1b-M110, O2a1-M88, and O3d-M7 are found among about 4% of Southern Hans, and are also found among non-Han groups of southern China, but are not present in Northern Hans.

    Excerpts from the Abstract:

    "[...] Here we show, by systematically analysing Y-chromosome and mitochondrial DNA variation in Han populations, that the pattern of the southward expansion of Han culture is consistent with the demic diffusion model, and that males played a larger role than females in this expansion. [...]"

    Excerpt from page 302:

    "On the maternal side, however, the mtDNA haplogroup distribution showed substantial differentiation between northern Hans and southern Hans (Supplementary Table 3). The overall frequencies of the northern East Asian-dominating haplogroups (A, C, D, G, M8a, Y and Z) are much higher in northern Hans (55%, 49-64%) than are those in southern Hans (36%, 19-52%). In contrast, the frequency of the haplogroups that are dominant lineages (B, F, R9a, R9b and N9a) in southern natives is much higher in southern [Hans] (55%, 36-72%) than it is in northern Hans (33%, 18-42%)."

    Fuzhong Xue, Yi Wang, Shuhua Xu, Feng Zhang, Bo Wen, Xuesen Wu, Ming Lu, Ranjan Deka, Ji Qian, and Li Jin. "A spatial analysis of genetic structure of human populations in China reveals distinct difference between maternal and paternal lineages." European Journal of Human Genetics 16 (January 23, 2008): pages 705-717.
          This study explores differences between northern Chinese and southern Chinese people based on mtDNA data from 3,435 individuals representing 91 populations and Y-chromosomal data from 5,790 individuals representing 143 populations. They included Han and non-Han individuals from every province of China. 596 Han men from southern China had their Y-DNA tested, as did 985 Han men from northern China, yielding a total of 1,581 Han Y-DNA samples.
          According to Table 3, the Y-chromosome haplogroup frequencies among Han men in southern China, in decreasing order, include 27.58% carrying O3*, 24.16% carrying O3e, 15.60% carrying K*, 14.09% carrying O1-M119, 7.55% carrying O2a-M95, and 6.71% carrying C-M130, with 1.34% listed as the frequency for D-M1, F*, and P*-M45. Table 3 also lists the frequencies for Han men in northern China: 29.54% have O3*, 25.58% have O3e, 17.67% have K*, 8.53% have C-M130, 7.61% have F*, 4.26% have P*-M45, 3.15% have O1-M119, 2.23% have D-M1, and only 1.42% have O2a-M95. Table 2 compares Han Y-DNA frequencies to those of Tibetans and other East Eurasian peoples.
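
The Table 3 percentages quoted above are easier to compare side by side. The following is a minimal Python sketch, not part of the study, that ranks the gap between southern and northern Han frequencies. It covers only the six haplogroups whose values are stated unambiguously for both regions above; the function name `frequency_gaps` is an illustrative choice.

```python
# Y-DNA haplogroup frequencies (%) among Han men, as quoted above from
# Table 3 of Xue et al. (2008). Only haplogroups with an unambiguous
# value for both regions are included.
south = {"O3*": 27.58, "O3e": 24.16, "K*": 15.60,
         "O1-M119": 14.09, "O2a-M95": 7.55, "C-M130": 6.71}
north = {"O3*": 29.54, "O3e": 25.58, "K*": 17.67,
         "O1-M119": 3.15, "O2a-M95": 1.42, "C-M130": 8.53}


def frequency_gaps(south, north):
    """Return (haplogroup, south minus north) pairs, largest absolute gap first."""
    shared = south.keys() & north.keys()
    gaps = {hg: round(south[hg] - north[hg], 2) for hg in shared}
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = frequency_gaps(south, north)
for haplogroup, gap in ranked:
    # A positive gap means the haplogroup is more frequent in the south.
    print(f"{haplogroup:8s} {gap:+.2f}")
```

On these figures the two largest gaps are O1-M119 and O2a-M95, both more frequent in the south, while O3*, O3e, K*, and C-M130 each differ by about two percentage points.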

    Excerpts from the Abstract:

    "[...] Our results highlight a distinct difference between spatial genetic structures of maternal and paternal lineages. A substantial genetic differentiation between northern and southern populations is the characteristic of maternal structure, with a significant uninterrupted genetic boundary extending approximately along the Huai River and Qin Mountains north to Yangtze River. On the paternal side, however, no obvious genetic differentiation between northern and southern populations is revealed."

    Excerpts from the Results section:

    "[...] when only Han populations are included, genetic boundaries between the northern and southern populations start to emerge (Figure 3b and d). Such division is statistically significant with the maternal lineages, but much weaker with the paternal lineages. [...] Table 1 shows the distribution of northern and southern dominating haplogroups of mtDNA. Haplogroups A, C, D*, D5, D5a, G, M7c, M8a, M9, N*, and Z are identified as NDH, with much higher frequencies in north than in south significantly. Haplogroups, M*, B*, B4, B4a, B4b1, B5*, B5a, F*, F1a, F1b, F1c, F2a, M7*, M7a, M7b*, M7b1, M7b2, R*, R9a, R9b, and R9c are identified as SDH, with much higher frequencies in south than in north. Most of the major haplogroups derived from M lineage are NDH except for M7, whereas most of the major haplogroups derived from N lineage are SDH except for N9 and A.

    [...] southern Hans and northern Hans share similar frequencies of Y-chromosome haplogroups (Table 3), which are characterized by carrying the M89, O3*, O3e, and K* mutations that are prevalent in almost all Han populations studied [...]"

    Pengyu Chen, Jian Wu, Li Luo, Hongyan Gao, Mengge Wang, Xing Zou, Yingxiang Li, Gang Chen, Haibo Luo, Limei Yu, Yanyan Han, Fuquan Jia, and Guanglin He. "Population Genetic Analysis of Modern and Ancient DNA Variations Yields New Insights Into the Formation, Genetic Structure, and Phylogenetic Relationship of Northern Han Chinese." Frontiers in Genetics 10 (October 30, 2019): article number 1045.
          These scientists genotyped 3,089 Northern Han Chinese people, giving special focus to 20 Hans from Shanxi Province, and compared them to other ancient and modern populations. They took into consideration Y-DNA lineages and mtDNA lineages in addition to autosomal DNA. Excerpts from the Discussion section:

    "[...] Analysis from the haplogroup distribution of Neolithic Chinese populations showed that the significant association of genetic continuity between ancient populations from Yellow River Valley sites (Mogou, Taojiazhai, and Hengbei) and modern northern-Han Chinese [...] Whole-genome high-density SNP data illustrate that Shanxi Han Chinese inherited 25.2% their ancestry from Yakut-related population and 74.8% from She-related population. Ancient autosomal genetic variation subsequently shows a two-way admixture from ancient North East Asian (45% ancestry from DevilsGate Hunter-Gatherer-related population) and ancient South Asian (55% ancestry Oakaie-related ancient population). [...] Han Chinese may be originated from the admixture between the ancient Tibeto-Burman population and a local pre-Sinitic population which may have been linguistically Altaic in the Neolithic time when agriculture emerged in Yangtze and Yellow River Basins. [...]"

    Yong-Bin Zhao, Ye Zhang, Quan-Chao Zhang, Hong-Jie Li, Ying-Qiu Cui, Zhi Xu, Li Jin, Hui Zhou, and Hong Zhu. "Ancient DNA Reveals That the Genetic Structure of the Northern Han Chinese Was Shaped Prior to 3,000 Years Ago." PLOS One 10:5 (May 4, 2015): e0125676.
          Table 1 lists the haplogroups of ancient samples from the Hengbei archaeological site in China's Jiang County in Shanxi Province. Their Y-DNA haplogroups included N, O2a, O3a, O3a3, and Q1a1 (a.k.a. Q-M120). (Those O haplogroups are also called O-M122, O-M95, and O-M175.) Their mtDNA haplogroups included A, B, C, D4, D5, F, G, M, M7, M8, M9a, M10, N9a, and R.

    Excerpts from the Abstract:

    "[...] Many genetic studies have shown that Han Chinese can be divided into two distinct groups: northern Han Chinese and southern Han Chinese. The genetic history of the southern Han Chinese has been well studied. However, the genetic history of the northern Han Chinese is still obscure. In order to gain insight into the genetic history of the northern Han Chinese, 89 human remains were sampled from the Hengbei site which is located in the Central Plain and dates back to a key transitional period during the rise of the Han Chinese (approximately 3,000 years ago). We used 64 authentic mtDNA data obtained in this study, 27 Y chromosome SNP data profiles from previously studied Hengbei samples, and genetic datasets of the current Chinese populations and two ancient northern Chinese populations to analyze the relationship between the ancient people of Hengbei and present-day northern Han Chinese. We used a wide range of population genetic analyses, including principal component analyses, shared mtDNA haplotype analyses, and geographic mapping of maternal genetic distances. The results show that the ancient people of Hengbei bore a strong genetic resemblance to present-day northern Han Chinese and were genetically distinct from other present-day Chinese populations and two ancient populations. These findings suggest that the genetic structure of northern Han Chinese was already shaped 3,000 years ago in the Central Plain area."

    Excerpts from the Discussion:

    "[...] In the paternal lineage, HB contained the haplogroups or sub-haplogroups N, O*, O2a, O3 and Q1a1. The total frequencies of these haplogroups reached high levels (66%–100%) in current Han Chinese. Haplogroup Q1a1, which was predominant in HB [Hengbei], is highly specific to the Han Chinese. Haplogroup O3, the second highest frequency (33.34%) in HB, occupies the highest frequencies in almost all current Han Chinese populations (32.5%-76.92%). Moreover, in the PCA plot, HB groups closely with the Han Chinese. [...]"

    Y. B. Zhao, Y. Zhang, H. J. Li, Y. Q. Cui, H. Zhu, and H. Zhou. "Ancient DNA evidence reveals that the Y chromosome haplogroup Q1a1 admixed into the Han Chinese 3,000 years ago." American Journal of Human Biology 26:6 (November-December 2014): pages 813-821. First published electronically on August 18, 2014.
          These scientists note that "Y chromosome haplogroup Q1a1 is found almost only in Han Chinese populations." The goal of their study was to try to find Q1a1 among ancient samples, since this had not been done before. For the first time, they were able to fully examine the Y-DNA of 27 ancient males "that were excavated from the presumed geographic source of the Han Chinese and dated to approximately 3,000 years ago". Their Y-DNA haplogroups turned out to be N, O*, O2a, O3a, and Q1a1. They concluded that 3,000 years ago was the approximate timeframe for the introduction of Q1a1 into the Han gene pool.

    Yun-Zhi Huang, Horolma Pamjav, Pavel Flegontov, Vlastimil Stenzl, Shao-Qing Wen, Xin-Zhu Tong, Chuan-Chao Wang, Ling-Xiang Wang, Lan-Hai Wei, Jing-Yi Gao, Li Jin, and Hui Li. "Dispersals of the Siberian Y-chromosome haplogroup Q in Eurasia." Molecular Genetics and Genomics 293:1 (2018): pages 107-117. First published online on September 7, 2017.
          They collected samples from 16 modern Chinese male carriers of haplogroup Q to supplement other data.

    Excerpts from the Discussion:

    "Subclade Q1a1a1-M120 was found specifically in the Han Chinese with a low frequency (Zhong et al. 2011). Our results suggested that subclade Q1a1a1-M120 had migrated from Mongolia to China during the Neolithic period, and spread over China with the ancestors of Han Chinese (Fig. 3; Table 1; ESM_1). Previous studies showed that Q1a1a1-M120 had migrated from north-western China to the Central Plain as nomads, and merged into the northern Han Chinese farmers at approximately 2.5-3 KYA (Zhao et al. 2010, 2014, 2015; Yan et al. 2014). Therefore, we supposed that the ancient nomads with Q1a1a1-M120 had migrated to south-eastward from north-western China and were assimilated by the Han Chinese farmers (Zhao et al. 2015)."

    Xiaotian Yao, Senwei Tang, Beilei Bian, Xiaoli Wu, Gang Chen, and Chuan-Chao Wang. "Improved phylogenetic resolution for Y-chromosome Haplogroup O2a1c-002611." Scientific Reports 7 (2017): article number 1146.
          They studied the branches of Y-DNA haplogroup O in depth including their frequencies among Chinese peoples and elsewhere in East Asia and Southeast Asia.

    Excerpts from the Abstract:

    "[...] In this study, we genotyped 89 new highly informative single nucleotide polymorphisms (SNPs) in 305 individuals with Haplogroup O2a1c-002611 identified from 2139 Han Chinese males. Two major branches were identified, O2a1c1-F18 and O2a1c2-L133.2 and the first was further divided into two main subclades, O2a1c1a-F11 and O2a1c1b-F449, accounting for 11.13% and 2.20% of Han Chinese, respectively. In Haplogroup O2a1c1a-F11, we also determined seven sublineages with quite different frequency distributions in Han Chinese ranging from 0.187% to 3.553%, implying they might have different demographic history. [...]"

    Excerpts from the Introduction:

    "[...] The O2-M122 is the most common lineage in China and is also prevalent throughout surrounding regions, comprising roughly 50 to 60% of the Han Chinese. There are three main subclades of O2-M122, called O2a1c-002611, O2a2b1-M134 and O2a2b1a1-M117, with each accounting for 12 to 17% of the Han Chinese. [...]"

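The counts in the Abstract of Yao et al. can be checked against the 12 to 17% range quoted from their Introduction. The sketch below is an illustration only. It assumes that the percentages given for O2a1c1a-F11 and O2a1c1b-F449 use the same base of 2139 screened males, which the excerpt does not state explicitly.

```python
# Figures quoted in the Abstract of Yao et al. (2017).
carriers = 305    # individuals with haplogroup O2a1c-002611
screened = 2139   # Han Chinese males screened

# Share of O2a1c-002611 among the screened males, in percent.
share = round(100 * carriers / screened, 2)

# Percentages of Han Chinese quoted for the two main subclades of O2a1c1-F18.
# Assumption: both use the same base of 2139 males.
f11, f449 = 11.13, 2.20
named_subclades = round(f11 + f449, 2)

# What is left over for the other O2a1c-002611 lineages under that assumption.
remainder = round(share - named_subclades, 2)

print(f"O2a1c-002611 share: {share}%")
print(f"F11 + F449:         {named_subclades}%")
print(f"Other lineages:     {remainder}%")
```

The computed share falls inside the 12 to 17% range that the Introduction gives for each main subclade of O2-M122.
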
    Min Lang, Hai Liu, Feng Song, Xianhua Qiao, Yi Ye, He Ren, Jienan Li, Jian Huang, Mingkun Xie, Shengjie Chen, Mengyuan Song, Youfang Zhang, Xiaoqin Qian, Taoxiu Yuan, Zheng Wang, Yuming Liu, Mengge Wang, Yacheng Liu, Jing Liu, and Yiping Hou. "Forensic characteristics and genetic analysis of both 27 Y-STRs and 143 Y-SNPs in Eastern Han Chinese population." Forensic Science International: Genetics 42 (September 2019): pages e13-e20. First published online on July 23, 2019.
          Detailed Y-DNA evaluations were performed on 1,269 Han Chinese males from 11 populations.

    Excerpts from the Abstract:

    "[...] Haplogroup O-M175 was the most predominant haplogroup in our Han Chinese data, ranging from 67.34% (Henan Han) to 93.16% (Guangdong Han). The highest haplogroup diversity (0.967056) was observed in Heilongjiang Han, with a discrimination capacity (DC) value of 0.3723. The number of alleles at single-copy loci varied from 2 for DYS391 (Guangdong Han) to 16 for DYS518 (Henan Han). For the majority of the populations (8/11), both the haplotype diversity and DC values are 1.0000. Furthermore, genetic differentiations were observed between Northern and Southern Han Chinese. These genetic differences were mainly reflected in haplogroup distribution and frequency, and they were confirmed by statistical analysis."

    Excerpt from the body of the study:

    "[...] For all populations, O2-M122 was observed at a higher frequency than was O1 (subdivided into O1a-M119 and O1b-P31 in our study), except for Guangxi Han (48.67% and 35.40% for haplogroups O1 and O2, respectively) [...]"

    Mengyuan Song, Zheng Wang, Yaqing Zhang, Chenxi Zhao, Min Lang, Mingkun Xie, Xiaoqin Qian, Mengge Wang, and Yiping Hou. "Forensic characteristics and phylogenetic analysis of both Y-STR and Y-SNP in the Li and Han ethnic groups from Hainan Island of China." Forensic Science International: Genetics 39 (March 2019): pages e14-e20. First released online on November 29, 2018.
          Hainan Island is China's southernmost province and its indigenous people are the Li. Han people also live there. This Y-chromosomal study examined 302 males belonging to the island's Li and Han ethnic groups. They found that there has been "little gene flow between the Li and Han" ethnicities. The Han men of Hainan Island carry the Y-DNA haplogroup O2-M122 at a frequency of 49.5%, making it their most common haplogroup, and this is consistent with Hans elsewhere in China. By contrast, the most common Y-DNA haplogroup among the Li people is O1b1a1a1a1a1b-CTS5854, representing 44.12% of the Li men in this study.

    Y. G. Yao, Q. P. Kong, H. J. Bandelt, Toomas Kivisild, and Y. P. Zhang. "Phylogeographic differentiation of mitochondrial DNA in Han Chinese." American Journal of Human Genetics 70 (March 2002): pages 635-651. First published electronically on February 8, 2002.
          This study focusing on Han Chinese mtDNA includes samples from several provinces in China. They found some differences in the frequencies of particular mtDNA haplogroups across different provinces. Excerpts from the Abstract:

    "[...] The southernmost provinces show more pronounced contrasts in their regional Han mtDNA pools than the central and northern provinces. These and other features of the geographical distribution of the mtDNA haplogroups observed in the Han Chinese make an initial Paleolithic colonization from south to north plausible but would suggest subsequent migration events in China that mainly proceeded from north to south and east to west. [...]"

    Y. B. Zhao, H. J. Li, S. N. Li, C. C. Yu, S. Z. Gao, Z. Xu, L. Jin, H. Zhu, and H. Zhou. "Ancient DNA evidence supports the contribution of Di-Qiang people to the Han Chinese gene pool." American Journal of Physical Anthropology 144:2 (February 2011): pages 258-268. First published electronically on September 24, 2010. Abstract:

    "Han Chinese is the largest ethnic group in the world. During its development, it gradually integrated with many neighboring populations. To uncover the origin of the Han Chinese, ancient DNA analysis was performed on the remains of 46 humans (1700 to 1900 years ago) excavated from the Taojiazhai site in Qinghai province, northwest of China, where the Di-Qiang populations had previously lived. In this study, eight mtDNA haplogroups (A, B, D, F, M*, M10, N9a, and Z) and one Y-chromosome haplogroup (O3) were identified. All analyses show that the Taojiazhai population presents close genetic affinity to Tibeto-Burman populations (descendants of Di-Qiang populations) and Han Chinese, suggesting that the Di-Qiang populations may have contributed to the Han Chinese genetic pool."

    H. Zhong, H. Shi, X. B. Qi, Z. Y. Duan, P. P. Tan, L. Jin, B. Su, and R. Z. Ma. "Extended Y chromosome investigation suggests postglacial migrations of modern humans into East Asia via the northern route." Molecular Biology and Evolution 28:1 (January 2011): pages 717-727. First published electronically on September 13, 2010.
          They sampled 3,826 males from 116 populations from China and one population from North Korea to try to identify "the time period and geographic source" of any Y-DNA lineages found to originate from Central-South Asia or West Eurasia. First, they acknowledged that most East Asian Y-DNA lines have deep roots in East Asia and before then had come from the south, and their 4 predominant East Asian haplogroups are O-M175, D-M174, C-M130, and N-M231, collectively representing the lineages of 92.87% of all East Asian males. Second, they confirmed that some Northern Han Chinese Y-DNA lineages come from outside of East Asia even within historical times. In a collection of 45 samples from the western Henan province, 8.9% of the Northern Han males in this study carry R1a1, 6.7% carry R2a, and 2.2% carry G2a*. In a separate collection of 21 samples from the western Henan province, specifically from residents of the city of Nanyang, 4.8% of those males carry G2a1 and 4.8% carry R1a1. Another data set came from Northern Han from the city of Harbin in Heilongjiang province in northeastern China, and among these males J1 was found in 1.8%, J2a in 1.8%, and L3 in 1.8%. As for C-M217, this study found its frequency to be 23.5% in Hans from the eastern Chinese city of Shanghai and 29.6% in Hans from Jilin in northeastern China but that it wasn't present in their sample of 27 Hans from Guangxi in southern China. In their sample of 56 Hans from Shanxi, one (1.8%) was found to carry Q-M120. Excerpts from the Abstract:

    "[...] However, there are other haplogroups (6.79% in total) (E-SRY4064, C5-M356, G-M201, H-M69, I-M170, J-P209, L-M20, Q-M242, R-M207, and T-M70) detected primarily in northern East Asian populations and were identified as Central-South Asian and/or West Eurasian origin based on the phylogeographic analysis. In particular, evidence of geographic distribution and Y chromosome short tandem repeat (Y-STR) diversity indicates that haplogroup Q-M242 (the ancestral haplogroup of the native American-specific haplogroup Q1a3a-M3) and R-M207 probably migrated into East Asia via the northern route. The age estimation of Y-STR variation within haplogroups suggests the existence of postglacial (~18 Ka) migrations via the northern route as well as recent (~3 Ka) population admixture. We propose that although the Paleolithic migrations via the southern route played a major role in modern human settlement in East Asia, there are ancient contributions, though limited, from WE [West Eurasia], which partly explain the genetic divergence between current southern and northern East Asian populations."

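Because the samples in Zhong et al. are small, the percentages in the summary above correspond to very few men. The sketch below is an illustration, not the authors' method. It converts each quoted percentage into the nearest whole number of carriers and checks that this count reproduces the percentage to one decimal place. The Harbin sample is left out because its size is not given above, and the function names are illustrative.

```python
# (sample label, sample size, haplogroup, quoted percentage), as given in
# the summary of Zhong et al. (2011) above.
quoted = [
    ("western Henan", 45, "R1a1", 8.9),
    ("western Henan", 45, "R2a", 6.7),
    ("western Henan", 45, "G2a*", 2.2),
    ("Nanyang", 21, "G2a1", 4.8),
    ("Nanyang", 21, "R1a1", 4.8),
    ("Shanxi", 56, "Q-M120", 1.8),
]


def implied_count(sample_size, percent):
    """Nearest whole number of carriers matching a quoted percentage."""
    return round(sample_size * percent / 100)


def is_consistent(sample_size, percent, tolerance=0.05):
    """True if the implied count reproduces the percentage to one decimal place."""
    count = implied_count(sample_size, percent)
    return abs(100 * count / sample_size - percent) <= tolerance


implied = {(place, hg): implied_count(n, pct) for place, n, hg, pct in quoted}
for (place, hg), count in implied.items():
    print(f"{place}: {hg} in about {count} men")
```

Under this reading, the West Eurasian and Central-South Asian lineages reported for these samples rest on between one and four individuals each, which is worth keeping in mind when comparing frequencies across studies.
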
    Yali Xue, Tatiana Zerjal, Weidong Bao, Suling Zhu, Qunfang Shu, Jiujin Xu, Ruofu Du, Songbin Fu, Pu Li, Matthew E. Hurles, Huanming Yang, and Chris Tyler-Smith. "Male demography in East Asia: a north-south contrast in human population expansion times." Genetics 172:4 (April 2006): pages 2431-2439.
          This Y-DNA study found haplogroup J in 10% of Northern Han males from the city of Lanzhou, haplogroup R1a1 in 6.7% of them, and haplogroup N1-LLY22g* in 6.7% of them. N1-LLY22g* was found in only 2.9% each among Hans from Chengdu, Harbin, and Meixian District, and was absent from their Han sample from Yining City.

    Soon-Hee Kim, Ki-Cheol Kim, Dong-Jik Shin, Han-Jun Jin, Kyoung-Don Kwak, Myun-Soo Han, Joon-Myong Song, Won Kim, and Wook Kim. "High frequencies of Y-chromosome haplogroup O2b-SRY465 lineages in Korea: a genetic perspective on the peopling of Korea." Investigative Genetics 2:1 (April 4, 2011): 10.
          The scientists found Y-DNA haplogroup C-M217, referred to in this study by its older name "C3", in 23.5% of Han males from the city of Xian (Xi'an) in Shaanxi province in northwestern China and in 5.88% of Han males from Beijing.

    Bing Su, Chunjie Xiao, Ranjan Deka, Mark T. Seielstad, Daoroong Kangwanpong, Junhua Xiao, Daru Lu, Peter Underhill, Luigi Luca Cavalli-Sforza, Ranajit Chakraborty, and Li Jin. "Y chromosome haplotypes reveal prehistorical migrations to the Himalayas." Human Genetics 107:6 (2000): pages 582-590.
          Includes 28 Han samples from Henan, 22 Han samples from Anhui, 30 Han samples from Shanghai, 32 Han samples from Shandong, 55 Han samples from Jiangsu, and 22 "Northern Han" samples. Y-DNA haplogroup Q-M120 was found in 7.1% of the Henan Hans, 4.6% of the Anhui Hans, 4.5% of the Northern Hans, 3.3% of the Shanghai Hans, 3.1% of the Shandong Hans, and 1.8% of the Jiangsu Hans.

    Tatiana M. Karafet, Liping Xu, Ruofu Du, William Wang, Shi Feng, R. S. Wells, Alan J. Redd, Stephen L. Zegura, and Michael F. Hammer. "Paternal Population History of East Asia: Sources, Patterns, and Microevolutionary Processes." The American Journal of Human Genetics 69:3 (2001): pages 615-628.
          40 Han males from Guangzhou and 44 Han males from Xi'an participated in this Y-DNA study. Haplogroup N1-LLY22g* was found in 15% of the Hans from Guangzhou and 6.8% of the Hans from Xi'an.

    Michael F. Hammer, Tatiana M. Karafet, Hwayong Park, Keiichi Omoto, Shinji Harihara, Mark Stoneking, and Satoshi Horai. "Dual origins of the Japanese: Common ground for hunter-gatherer and farmer Y chromosomes." Journal of Human Genetics 51:1 (2005): pages 47-58.
          84 Han males from Taiwan participated in this Y-DNA study. Haplogroup N1-LLY22g* was found in 3.6% of the Taiwanese Han.

    Qing Zhao, Shangling Pan, Zhendong Qin, Xiaoyun Cai, Yan Lu, Sara E. Farina, Chengwu Liu, Junhua Peng, Jieshun Xu, Ruixing Yin, Shilin Li, Jin Li, and Hui Li. "Gene flow between Zhuang and Han populations in the China-Vietnam borderland." Journal of Human Genetics 55 (2010): pages 774-776.
          There is some genetic connection between Han people from Napo County (in southern China near Vietnam's border) and the Hei-Yi Zhuang (Minz) people. Excerpts from the body text:

    "[...] The Han sample from Napo only formed eight haplogroups (Table 1), and all of these haplogroups are shared with the Minz. The haplogroup specific to Southeast Asia, O2a*, reaches its highest frequency among the Napo Han. O3a3c1 is the second highest occurrence in this population. A previous study proved that the Han Chinese originated in North China, and the Y chromosomes of Han contain mostly haplogroup O3.

    On the maternal side, mtDNA haplogroups for the [...] Napo Han are mainly M7b*, M*, M7* and R9b. These haplogroups are predominantly derived from southern China and are not Han dominant [...]"

    Jiao-Yang Tian, Hua-Wei Wang, Yu-Chun Li, Wen Zhang, Yong-Gang Yao, Jits van Straten, Martin B. Richards, and Qing-Peng Kong. "A genetic contribution from the Far East into Ashkenazi Jews via the ancient Silk Road." Scientific Reports 5 (February 11, 2015): article number 8377.
          Many Han Chinese people carry the mtDNA haplogroup M33c. In this study, it was identified in 4 samples from Guangdong province, 6 samples from the Guangxi region, 1 sample from Sichuan province, 2 samples from Shaanxi province, 2 samples from Jilin province, 1 sample from Jiangsu province, 5 samples from Hunan province, and 1 Hunan/Fujian sample. This reveals that it is present among Han from southern, central, and northern areas of China, but somewhat more common in southern areas compared to northern areas. It is also found among non-Han ethnicities of southern China and southeast Asia.

    Yu-Chun Li, Wei-Jian Ye, Chuan-Gui Jiang, Zhen Zeng, Jiao-Yang Tian, Li-Qin Yang, Kai-Jun Liu, and Qing-Peng Kong. "River Valleys Shaped the Maternal Genetic Landscape of Han Chinese." Molecular Biology and Evolution 36:8 (August 2019): pages 1643-1652. First published online on May 21, 2019.

    Excerpts from the Abstract:

    "[...] In this study, we dissected the matrilineal landscape of Han Chinese by studying 4,004 mtDNA haplogroup-defining variants in 21,668 Han samples from virtually all provinces in China. Our results confirmed the genetic divergence between southern and northern Han populations. However, we found a significant genetic divergence among populations from the three main river systems, that is, the Yangtze, the Yellow, and the Zhujiang (Pearl) rivers, which largely attributed to the prevalent distribution of haplogroups D4, B4, and M7 in these river valleys. Further analyses based on 4,986 mitogenomes, including 218 newly generated sequences, indicated that this divergence was already established during the early Holocene and may have resulted from population expansion facilitated by ancient agricultures along these rivers. These results imply that the maternal gene pools of the contemporary Han populations have retained the genetic imprint of early Neolithic farmers from different river basins, or that river valleys represented relative migration barriers that facilitated genetic differentiation, thus highlighting the importance of the three ancient agricultures in shaping the genetic landscape of the Han Chinese."

    Figure 2 on page 1645 shows the regional distributions of mtDNA haplogroups B4, D4, F1, M7, and A in China.

    Excerpts from page 1645:

    "Haplogroup A has the highest frequency in northern and northwestern China (fig. 2f) including Tianjin (11.99%), Ningxia (10.26%), and Shaanxi (9.96%), as well as some southern regions such as Anhui (9.35%) and Jiangsu (8.01%)"

    Excerpts from page 1646:

    "We also investigated but failed to detect any correlation between language dialects and mtDNA haplogroups, with majority of haplogroups shared by different dialect groups [...] For example, haplogroup D4 is ubiquitously distributed in Mandarin, Jin, and Wu speakers; M7 shows high distribution frequencies in either Cantonese, Min or Mandarin groups in southern China; B4 distributed in Cantonese, Wu, Xiang, and Mandarin dialects; A is prevalent in Mandarins in northern China and also found in Hui speakers in the south. [...] Fisher's exact test revealed that haplogroups M7, D4, R9, A, and B4 [...] displayed the most significant differences in distribution between northern and southern China [...] Haplogroups D4 and A contributed most to the north cluster, whereas M7, F1, and B4 to the south cluster [...]"

    Koh-ichiro Yoshiura, Akira Kinoshita, Takafumi Ishida, Aya Ninokata, Toshihisa Ishikawa, Tadashi Kaname, Makoto Bannai, Katsushi Tokunaga, Shunro Sonoda, Ryoichi Komaki, Makoto Ihara, Vladimir A. Saenko, Gabit K. Alipov, Ichiro Sekine, Kazuki Komatsu, Haruo Takahashi, Mitsuko Nakashima, Nadiya Sosonkina, Christophe K. Mapendano, Mohsen Ghadami, Masayo Nomura, De-Sheng Liang, Nobutomo Miwa, Dae-Kwang Kim, Ariuntuul Garidkhuu, Nagato Natsume, Tohru Ohta, Hiroaki Tomita, Akira Kaneko, Mihoko Kikuchi, Graciela Russomando, Kenji Hirayama, Minaka Ishibashi, Aya Takahashi, Naruya Saitou, Jeffrey C. Murray, Susumu Saito, Yusuke Nakamura, and Norio Niikawa. "A SNP in the ABCC11 gene is the determinant of human earwax type." Nature Genetics 38 (2006): pages 324-330. First published online on January 29, 2006. Excerpts from the Abstract:

    "[...] Dry earwax is frequent in East Asians, [...] it is highest in Chinese and Koreans, and a common dry-type haplotype is retained among various ethnic populations. These suggest that the allele A arose in northeast Asia and thereafter spread [...]"

    Yuchen Wang, Dongsheng Lu, Yeun-Jun Chung, and Shuhua Xu. "Genetic structure, divergence and admixture of Han Chinese, Japanese and Korean populations." Hereditas 155 (April 6, 2018): article number 19.
          This study examined existing database records for 182 Han Chinese individuals and compared them with Koreans, Japanese, and members of 8 other populations. According to the Results section, some Han Chinese samples show signals of admixture with non-Han populations including the Chinese Dai from Xishuangbanna in southwestern China and the Kinh Vietnamese from Ho Chi Minh City (Saigon), Vietnam. Northern Han Chinese also had ancient admixture events with the Koreans, and the data led the scientists to conclude that the most recent common ancestors of the northern Han and the Koreans lived about 1,200 years ago. The most recent common ancestors of Han Chinese and Japanese lived much earlier, about 3,000-3,600 years ago. Despite these links, the three populations are far from identical.

    Excerpts from the Abstract:

    "[...] Our analyses revealed that Han Chinese, Japanese and Korean populations have distinct genetic makeup and can be well distinguished based on either the genome wide data or a panel of ancestry informative markers (AIMs). Their genetic structure corresponds well to their geographical distributions, indicating geographical isolation played a critical role in driving population differentiation in East Asia. [...]"

    Excerpts from the Results section:

    "[...] Using only 6 populations (two Han Chinese populations, Japanese, Korean and two Mongolian populations) to reconstruct an individual tree, we found the phylogeny of the populations became clearer (Additional file 4: Figure S2B). Japanese individuals have their own cluster and Korean individuals are almost distinct from Han Chinese. North and South Han Chinese mixed together, but still have some substructure. [...]"

    Ruofu Du, Chunjie Xiao, and Luigi Luca Cavalli-Sforza. "Genetic distances between Chinese populations calculated on gene frequencies of 38 loci." Science in China Series C: Life Sciences 40:6 (December 1997): pages 613-621.
          This older study, which relied on early techniques, examined the gene frequencies of 130 alleles at 38 loci among different populations. Excerpts from the Abstract:

    "[...] The results showed that, among both Han and ethnic minorities, there were two types, i.e. southern and northern Mongoloids, with Yangtze River as boundary. [...] This paper also conclusively proved genetically that the Han subpopulations in different regions are genetically close to the local ethnic minorities, which indicates that much blood of ethnic minorities has mixed into Han, at the same time, some blood of Han also has mixed into the local ethnic minorities."

    Ming Liao, Yuanliang Xie, Yan Mao, Zheng Lu, Aihua Tan, Chunlei Wu, Zhifu Zhang, Yang Chen, Tianyu Li, Yu Ye, Ziting Yao, Yonghua Jiang, Hongzhe Li, Xiaoming Li, Xiaobo Yang, Qiuyan Wang, and Zengnan Mo. "Comparative analyses of fecal microbiota in Chinese isolated Yao population, minority Zhuang and rural Han by 16sRNA sequencing." Scientific Reports 8 (2018): article number 1142.
          47 Han individuals participated in this study. Excerpts from the Discussion section:

    "The Megamonas genus was distinctive in the Han group compared to the Zhuang and Yao groups, with a significantly low abundance in the Yao population. [...] A higher abundance of the genera Megamonas was observed in Chinese populations compared to African populations, with a lower abundance in Chinese centenarians than in younger elderlies. Megamonas appeared to be related to some diseases, as its abundance was higher in obese Taiwanese individuals [...] The relative abundance of the Megamonas genus was positively associated with the frequency of bean consumption [...]"

    Major studies of the Cantonese people

    Cantonese people live in the province of Guangdong (Canton) and the region of Guangxi and in nearby Hong Kong. They share many paternal lineages with other Han Chinese populations but their maternal lineages mostly come from the Nanyue peoples who are indigenous to the area. Differences in mtDNA haplogroups between Southern Hans (including Cantonese) and Northern Hans are discussed in some of the scientific papers cited further above. One of those is "A spatial analysis of genetic structure of human populations in China reveals distinct difference between maternal and paternal lineages" (2008), and according to its file "Supplementary material 1: Description of the sources of mtDNA data", 60 Hans from Guangdong, 22 Hans from Guangxi, and 20 Hans from Hong Kong contributed their mtDNA samples to this study, while "Supplementary material 2: Description on the sources of Y-chromosome data" lists multiple batches of Y-DNA samples of Hans from Guangdong and Guangxi.

    Joseph Tien Seng Wee, Tam C. Ha, Susan Loong, and Chao-Nan Qian. "Is nasopharyngeal cancer really a 'Cantonese cancer'?" Chinese Journal of Cancer 29:5 (2010): pages 517-526. Excerpts from the Abstract:

    "Nasopharyngeal cancer (NPC) is endemic in Southern China, with Guandong province and Hong Kong reporting some of the highest incidences in the world. The journal Science has called it a 'Cantonese cancer'. We propose that in fact NPC is a cancer that originated in the Bai Yue ('proto Tai Kadai' or 'proto Austronesian' or 'proto Zhuang') peoples and was transmitted to the Han Chinese in southern China through intermarriage. [...] Genetic and anthropological evidence suggest there are a lot of similarities between the Bai Yue and the aboriginal peoples of Borneo and Northeast India; between Inuit of Greenland, Austronesian Mayalo Polynesians of Southeast Asia and Polynesians of Oceania, suggesting some common ancestry. Genetic studies also suggest the present Cantonese, Minnans and Hakkas are probably an admixture of northern Han and southern Bai Yue. All these populations have a high incidence of NPC. [...]"

    Major studies of the Tanka people

    Many of the Tankas live in Southern China and have assimilated into Han culture and, frequently, the Cantonese language, but unlike other Cantonese-speaking people they don't have Han paternal ancestry. Tankas also lack Han maternal ancestry.

    A. J. S. McFadzean and D. Todd. "Cooley's anaemia among the tanka of South China." Transactions of The Royal Society of Tropical Medicine and Hygiene 65:1 (1971): pages 59-62. Excerpts from the Abstract:

    "The history of the Tanka is briefly reviewed and it is concluded that they are the descendants of a pre-Han-Chinese autochthonous tribe who have lived in isolation. [...]"

    Major studies of the Pinghua-speaking population

    Pinghua is a Sinitic language spoken in the Guangxi Zhuang Autonomous Region. Most of its speakers are classified as Han Chinese, but the people who speak it don't have ancient Han ancestry.

    Yan Lu, Shang-Ling Pan, Shu-Ming Qin, Zhen-Dong Qin, Chuan-Chao Wang, Rui-Jing Gan, Hui Li, and the Genographic Consortium. "Genetic evidence for the multiple origins of Pinghua Chinese." Journal of Systematics and Evolution 51:3 (2013): pages 271-279.

    Excerpts from the Abstract:

    "[...] A previous study found that populations speaking Han Chinese dialects have closer genetic relationships to each other than to neighboring ethnic groups. However, the Pinghua Chinese population from Guangxi is an exception. We have reported that northern Pinghua people are genetically related to populations speaking Daic languages. In this study, we further studied the southern Pinghua population. The Y chromosome and mitochondrial DNA haplogroup components and network analysis indicated that northern and southern Pinghua populations were genetically different. Therefore, we concluded that the Pinghua speakers may have various origins, even though Pinghua dialects are similar. Pinghua dialects might have originated when the Daic or Hmongic speakers from different regions learnt to speak the same Chinese dialect hundreds of years ago. [...]"

    Excerpt from page 271:

    "Most Pinghua speakers are Han Chinese; however, small sections of Guangxi indigenous ethnic groups such as Yao in Longsheng and Fuchuan counties are also taking Pinghua as their native language now [...]"

    Rui-Jing Gan, Shang-Ling Pan, Laura F. Mustavich, Zhen-Dong Qin, Xiaoyun Cai, Ji Qian, Cheng-Wu Liu, Jun-Hua Peng, Shi-Lin Li, Jie-Shun Xu, Li Jin, and Hui Li. "Pinghua population as an exception of Han Chinese's coherent genetic structure." Journal of Human Genetics 53:4 (2008): pages 303-313.

    Excerpts from the Abstract:

    "The Han Chinese is the largest single ethnic group in the world, consisting of ten Chinese branches. With the exception of the Pinghua branch, the genetic structure of this group has been studied extensively, and Y chromosome and mitochondrial (mt)DNA data have demonstrated a coherent genetic structure of all Han Chinese. [...] We have studied 470 individual samples (including 195 males) from Pinghua populations and other ethnic groups (Zhuang, Kam, Mulam, Laka, and Mien) from six areas (Hezhou, Fuchuan, Luocheng, Jinxiu, Sanjiang, and Wuxuan) in the north of the Guangxi Zhuang Autonomous Region of China. Both mtDNA and the Y chromosomes were typed in these samples. High frequencies of the Y chromosome haplogroups O2a* and O*, which always present at a high frequency among the populations of the southern minorities, were found in Pinghua populations. Only Pinghua populations in Luocheng and Jinxiu maintain the Han frequent haplogroup O3a5a. mtDNA lineages B4a, B5a, M*, F1a, M7b1, and N* were found in Pinghua populations, exhibiting a pattern similar to the neighboring indigenous populations, especially the Daic populations. Cluster analyses (dendrograms, principal component analyses, and networks) of Pinghua populations, the other Han branches, and other ethnic groups in East Asia indicated that Pinghua populations are much closer to the southern minorities than to the other Han branches. Admixture analyses confirmed this result. In conclusion, we argue that Pinghua populations did not descend from Han Chinese, but from southern minorities. The ancestral populations of Pinghua people were assimilated by the Han Chinese in terms of language, culture, and self-identification and, consequently, the Pinghua people became an exceptional branch of Han Chinese's coherent genetic structure."

    Excerpt from page 273 of "Genetic evidence for the multiple origins of Pinghua Chinese" (2013):

    "Haplogroups O3-M122 and O3a2c1a-M117 are dominant in Han Chinese but less frequent in Pinghua populations. O2-P31 is rare in Han Chinese but prevalent in Pinghua populations."

    Excerpt from page 274 of "Genetic evidence for the multiple origins of Pinghua Chinese" (2013):

    "In the networks of O1a2-M110 and O2a1a-M88, the southern Pinghua samples share many haplotypes with the Hmong-Mien samples. In the network of O2a1*-M95, southern Pinghua and northern Pinghua samples seldom share haplotypes; however, both of them share many more haplotypes with the southern ethnic groups than with other Han Chinese. The southern and northern Pinghua samples are in different subclades, suggesting different origins for the southern and northern Pinghua populations."

  • Genetics of Zhuang People
  • Genetics of Tibetan People
  • Genetics of Uygur People
  • Genetics of Kazakh People
  • Genetics of Kyrgyz People
  • Genetics of Mongolian People
  • Genetics of Korean People
  • Genetics of Russian People
  • Chinese DNA in Ashkenazi Jews