A Genealogical Tree of Mankind – 27 Million of Our Ancestors in it

The largest human genealogical tree built to date traces the history of our species. It is based on several thousand human genome sequences.

To date, hundreds of thousands of modern human genomes and thousands of ancient human genomes have been sequenced. However, differences in methods and data quality make them difficult to compare. Moreover, each human genome contains ancestral segments of different ages. Scientists at the University of Oxford’s Big Data Institute have applied a tree-recording method to the genomes of ancient and modern humans to create a unified genealogical tree of humanity. The method tolerates missing and erroneous data and uses ancient genomes to calibrate coalescence times, that is, the points in the past at which lineages merge into a common ancestor. This makes it possible to determine how genomes have changed over time and between populations, and provides a detailed picture of the evolution of our species.

Genomic datasets tend to be highly heterogeneous. Samples from different times, geographic locations, and populations are processed, sequenced, and analyzed using different methods. The resulting datasets contain true variation, but also complex patterns of missing data and errors. This makes it difficult to combine the data and hinders efforts to build the most complete picture of human genomic variation.

To address these problems, the authors of the study, which appeared in Science, started from the fundamental idea that the ancestral relationships of all humans who have ever lived can be described by a single genealogy.

In their work, they presented statistical and computational methods for inferring a unified genealogy of modern and ancient samples. The scientists tested these methods using computer simulations and empirical data analysis, highlighting where the results differed and where they agreed. On this basis, the researchers drew inferred lines of descent between the genomes and estimated which gene variants (alleles) the common ancestors of these people probably carried.
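The idea of lines of descent meeting in common ancestors can be illustrated with a toy example. The sketch below is not the authors' software: the node names, the `parent` map, and the `mrca` helper are all hypothetical, and a single small tree stands in for what is in reality a very large genealogy that varies along the genome.

```python
# Toy genealogy: each node points to its parent (hypothetical names).
# "A", "B", "C" are sampled genomes; "x" and "root" are inferred ancestors.
parent = {"A": "x", "B": "x", "x": "root", "C": "root"}


def lineage(node, parent):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def mrca(a, b, parent):
    """Return the most recent common ancestor of samples a and b."""
    seen = set(lineage(a, parent))
    # Walk upward from b; the first node shared with a's lineage is the MRCA.
    for node in lineage(b, parent):
        if node in seen:
            return node
    return None


print(mrca("A", "B", parent))
print(mrca("A", "C", parent))
```

In this toy tree, samples A and B meet in the ancestor `x`, while A and C only meet at the root, which is the kind of relationship the inferred genealogy records for every pair of genomes.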

In addition to mapping these genealogical relationships, the scientists also tried to work out where in the world the common ancestors of the sequenced people lived. Each location was estimated from the ages of the sampled genomes and the places where they were sampled, so the estimates may be very approximate.
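To give a feel for how an ancestor's location might be derived from the sampling locations of its descendants, here is a minimal sketch. It assumes a simple rule, namely that each ancestor sits at the geographic midpoint of its children, and it ignores sample ages entirely. That rule, the function names, and the example coordinates are assumptions made for illustration, not a description of the study's actual procedure.

```python
import math


def to_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def midpoint(points):
    """Geographic midpoint of (lat, lon) pairs via averaged unit vectors."""
    vecs = [to_xyz(lat, lon) for lat, lon in points]
    n = len(vecs)
    x = sum(v[0] for v in vecs) / n
    y = sum(v[1] for v in vecs) / n
    z = sum(v[2] for v in vecs) / n
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return (lat, lon)


def locate_ancestors(children, sample_locations):
    """Assign each ancestor the midpoint of its children's locations.

    children: dict mapping ancestor id -> list of child ids
    sample_locations: dict mapping sampled genome id -> (lat, lon)
    """
    locations = dict(sample_locations)

    def locate(node):
        if node not in locations:
            locations[node] = midpoint([locate(c) for c in children[node]])
        return locations[node]

    for node in children:
        locate(node)
    return locations


# Hypothetical toy data: three sampled genomes on the equator.
children = {"anc1": ["s1", "s2"], "root": ["anc1", "s3"]}
samples = {"s1": (0.0, 0.0), "s2": (0.0, 90.0), "s3": (0.0, -45.0)}
locations = locate_ancestors(children, samples)
print(locations["anc1"], locations["root"])
```

Because every estimate is built only from where present-day and ancient samples were collected, a rule like this can only place ancestors somewhere between known sampling sites, which is one reason such estimates are rough.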

To build a single genealogical tree, the researchers first combined genomic data from several large datasets produced by different projects. These included 3,601 high-quality genome sequences from modern humans and eight from ancient humans, among them Neanderthal and Denisovan genomes.

The resulting genealogical structure comprises 27 million ancestral haplotype fragments and 231 million lineages linking the genomes from these datasets. The scientists also used an additional 3,589 lower-quality ancient samples to constrain and date the relationships.

The tree created in the study reveals a great deal about the genealogy of all mankind. Overall, the authors of the paper reconstructed human history as accurately as the available data allow. With more genome samples and more sophisticated software, the genealogical tree could become even more accurate.

Just as important are the methods developed along the way, whose main advantage is that they can potentially scale to millions of samples. The more data there are, the more accurate the results.

Team members are now working on new machine learning algorithms to obtain more accurate estimates of where and when our ancestors lived. In principle, the same tree-building method could help to better understand the genetic basis of human diseases: one could determine where alleles associated with a disease originated, and then reconstruct how and when these gene variants spread through different populations. Finally, the method can be used to trace the evolutionary history of other organisms, such as bees or cattle, and even viruses.