Researchers have analyzed DNA samples from 141,431 pregnant Chinese women—roughly one ten-thousandth of the country’s population—to garner insights on issues ranging from migration patterns to genetic diversity and susceptibility to disease. The team’s findings, newly published in Cell, represent the largest-scale genetic study of Chinese individuals to date.
Scientists from Shenzhen-based genome sequencing company BGI, the University of California, Berkeley, and the University of Copenhagen drew on blood samples taken during non-invasive prenatal testing (NIPT) for fetal trisomy, a chromosomal disorder linked with Down syndrome. As Alice Shen reports for The South China Morning Post, such tests, which analyze traces of fetal DNA found floating in the mother’s blood, have been conducted on six to seven million Chinese women, providing an ideal data pool for researchers.
Still, the vast scale of the study comes with a catch: NIPT provides only enough information to sequence six to ten percent of a mother’s genome, Chinese news outlet Xinhua explains. By comparison, most rigorous studies sequence 80 percent or more of a genome.
To compensate for the relatively small fraction of the genome captured by NIPT, the team designed software that used heavy computation and statistics to predict missing DNA. This software, combined with the sheer scale and diversity of the study’s participants (Shen notes that the sample included women from 31 out of 34 Chinese provinces and 36 out of 55 ethnic minority groups), may actually enable the researchers to learn more than they would by thoroughly sequencing data from a smaller group.
DNA analysis revealed an array of information on the Chinese population, Robert Sanders writes for Berkeley News. A variation of the gene known as NRG1 was linked to a greater or lesser incidence of twins, while a variant of a different gene, EMB, was associated with older first-time mothers. Another genetic twist influenced the severity of herpesvirus 6, which can cause both the relatively harmless infant rash roseola and a series of more severe symptoms.
Other preliminary findings revolve around the migration and genetic diversity of various ethnic groups. Lead author Siyang Liu, a geneticist at the University of Copenhagen, tells Steph Yin of The New York Times that the Han, who make up 92 percent of China’s population, are genetically homogeneous, with differences largely based on geographic residence in northern versus southern regions. Such patterns speak to post-1949 governmental policies and job opportunities that have shifted migration eastward or westward, Liu explains.
Genetic variations found among northern and southern Han populations reveal differences in immune response, bipolar disorder, earwax type and diet. In the frigid north, a mutation of the fatty acid metabolizing gene FADS2 enables consumption of rich foods, such as hearty lamb stew. In the milder south, fresh crops are a more common staple.
Minority ethnic groups, including the Xinjiang-based Uyghurs and Kazakhs and Inner Mongolian Mongols, exhibited higher levels of genetic diversity than the Han. University of Southern California geneticist Charleston Chiang, who was not involved in the study, tells Yin that analysis of minority groups, who are often ignored in sequencing studies, can have significant medical implications.
For now, NIPT-based sequencing is still in its early stages. To prove that such data is capable of revealing connections between genes and certain traits, the team also studied height and body mass index (BMI) across the sample. Ultimately, Yin writes, they identified 48 gene variations linked with height and 13 linked with BMI.
The new study mainly serves as a proof of concept, co-author Xin Jin explains. Moving forward, he and his team will follow up on this initial work by evaluating prenatal data taken from more than 3.5 million Chinese participants.
“For me, this is a very exciting new model for biology research,” co-author Xun Xu says in a statement. “It provides powerful tools and a platform for future study. Here, we show proof of concept that these data, and the structure and the methods, could be used to study a lot of things. It’s just the beginning.”