A Family Tree of You And Your 13 Million Closest Relatives

A big data project to connect all the people

Image via Tsuji

We’re in the era of Big Data, where some scientists are digging through absolutely staggering amounts of information to unlock the world’s secrets. Take, for example, computational biologist Yaniv Erlich. Using data from a genealogy website, says Nature, Erlich and his colleagues have been building huge family trees. One tree, they say, connects the dots between 13 million different people, a legacy that stretches back more than 500 years.

In total, says Erlich on his website, the genetic tree project, called FamiLinx, has compiled the information of 43 million people. Following the connections between people, Erlich and Geni.com were able to follow a slice of the history of the age of exploration.

Video depicts human migration across generations

The starting point of FamiLinx was the public information on Geni.com, a genealogy-driven social network that is operated by MyHeritage. Geni.com allows genealogists to enter their family trees into the website and to create profiles of family members with basic demographic information such as sex, birth date, marital status, and location. The genealogists decide whether they want the profiles in their trees to be public or private. New or modified family tree profiles are constantly compared to all existing profiles, and if there is high similarity to existing ones, the website offers the users the option to merge the profiles and connect the trees.
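The profile-matching step described above can be pictured with a small sketch. Geni.com's actual matching method is not described here, so everything in this example is an assumption made for illustration: the `Profile` fields, the scoring weights, the five-year birth window and the 0.85 threshold are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Profile:
    # Hypothetical record; field names are not taken from Geni.com.
    profile_id: str
    name: str
    sex: str
    birth_year: int
    location: str


def similarity(a: Profile, b: Profile) -> float:
    """Return a score in [0, 1]; the weights are illustrative assumptions."""
    if a.sex != b.sex:
        return 0.0
    # Fuzzy name comparison, tolerant of small spelling differences.
    name_score = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    # Birth years more than five years apart contribute nothing.
    year_score = max(0.0, 1.0 - abs(a.birth_year - b.birth_year) / 5)
    place_score = 1.0 if a.location.lower() == b.location.lower() else 0.0
    return 0.6 * name_score + 0.3 * year_score + 0.1 * place_score


def suggest_merges(profiles, threshold=0.85):
    """Return pairs of profile ids whose similarity meets the threshold."""
    return [
        (a.profile_id, b.profile_id)
        for a, b in combinations(profiles, 2)
        if similarity(a, b) >= threshold
    ]


# Made-up profiles from three separate trees.
profiles = [
    Profile("t1-01", "Johann Schmidt", "M", 1802, "Hamburg"),
    Profile("t2-07", "Johan Schmidt", "M", 1802, "Hamburg"),
    Profile("t3-12", "Maria Keller", "F", 1840, "Zurich"),
]
print(suggest_merges(profiles))
```

In this sketch the two Schmidt profiles are flagged as a likely duplicate, and accepting the suggestion would join their two trees into one. Comparing every pair of profiles only works for small examples; a site with millions of profiles would need a way to narrow the candidates first.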

By scraping the data, says Nature, Erlich is potentially opening the door to the future of human genetics research.

The structures of the trees themselves could provide interesting information about human demographics and population expansions, says Nancy Cox, a human geneticist at the University of Chicago, Illinois, who was not involved in the study. But more interesting, she says, is the possibility that such data may one day be linked to medical information or to DNA sequence data as more people have their genomes sequenced and deposit that information in public databases.

