Article

A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach

Department of Life Sciences and Biotechnology, University of Ferrara, 44121 Ferrara, Italy
* Author to whom correspondence should be addressed.
Submission received: 5 November 2020 / Revised: 11 December 2020 / Accepted: 14 December 2020 / Published: 16 December 2020
(This article belongs to the Special Issue Ancient and Archaic Genomes)

Abstract

There is a wide consensus in considering Africa as the birthplace of anatomically modern humans (AMH), but the dispersal pattern and the main routes followed by our ancestors to colonize the world are still matters of debate. It is still an open question whether AMH left Africa through a single process, dispersing almost simultaneously over Asia and Europe, or in two main waves, first through the Arabian Peninsula into southern Asia and Australo-Melanesia, and later through a northern route crossing the Levant. The development of new methodologies for inferring population history and the availability of worldwide high-coverage whole-genome sequences have not resolved this debate. In this work, we test the two main out-of-Africa hypotheses through an Approximate Bayesian Computation approach based on the Random Forest algorithm. Using simulated data, we evaluated the ability of the method to discriminate between the alternative models of AMH dispersal out of Africa. Having established that the models are distinguishable, we compared simulated data with real genomic variation from modern and archaic populations. This analysis showed that a model of multiple dispersals is about four times as likely as the alternative single-dispersal model. According to our estimates, the two dispersal processes may be placed around 74,000 and around 46,000 years ago, respectively.

1. Introduction

Levels and patterns of genome diversity reflect past demographic processes, and a crucial turning point in our demographic history is the expansion of anatomically modern humans (AMH) from Africa. Some aspects of this process seem rather well established. First, what is often called the ancestral African population should not be regarded as a single, biologically homogeneous unit, but as a structured population hosting regional diversity [1]. Second, the AMH expansion was accompanied by the disappearance of preexisting archaic human forms [2,3]. Third, a variable component of the genomes of most present populations (always small, seldom zero) comes from anatomically archaic ancestors [4].
Conversely, there is disagreement over other aspects of the AMH expansion out of Africa, such as the number of major dispersal events, their timing, and the geographical routes followed by migrating people. Groups of AMH may have left Africa more than 100,000 years ago [5], but genetic evidence suggests that such early phenomena were not successful and did not lead to the establishment of permanent non-African populations. One expansion left traces in modern genomes; it took place between 60,000 and 50,000 years ago, along a Northern route in the Nile valley and across the Near East (see e.g., [6,7,8]). However, based on cranial morphology, Lahr and Foley [9] proposed an additional, earlier migration through a Southern route, from the Horn of Africa into the Arabian Peninsula, Southern Asia, and Australo-Melanesia. We shall refer to these alternative models as the Single Dispersal (SD) and Multiple Dispersal (MD) hypotheses. The MD hypothesis found support in several studies, notably in a comparison of cranial and DNA diversity data [10], but broader genomic analyses gave contradictory results. Tassi and colleagues [11] and, to a lesser extent, Pagani et al. [12] described patterns consistent with two dispersal processes, the first one overlapping in time with the proposed early Southern exit from Africa [11]. On the other hand, two studies of different genomic datasets concluded that there is little [4] or no evidence [13] for such an early dispersal process, and hence that AMH either left Africa in a single major migrational wave, or perhaps in several waves, of which only one contributed to the ancestry of modern populations.
The conclusion of Malaspinas et al. [13] in favor of SD was not based on an explicit comparison between models. In their paper, they considered an MD model in which East Asians and Europeans have a more recent common ancestor than Aboriginal Australians and East Asians, and they estimated the model’s parameters. The evidence supporting the SD model came from the overlapping estimates of the divergence times of the ancestors of Aboriginal Australians and Eurasians.
This indirect procedure was due to an implicit limitation of the composite likelihood method they applied, in which model selection may be performed through likelihood ratio tests (LRT) or by the Akaike Information Criterion (AIC; [14,15]). LRT and AIC can only be used to understand which modifications significantly improve the model; they do not provide explicit model testing or a direct attribution of probabilities to each tested scenario.
To understand which model, SD or MD, better accounts for the current levels of genome diversity, in this study we formally compare them using a recently developed Approximate Bayesian Computation (ABC) framework, based on the observed Frequency Distributions of four categories of Segregating Sites for pairs of populations (FDSS) [16]. ABC is a powerful and flexible framework, based on computer simulations, to perform model selection and estimate model parameters. In its original formulation [17,18] the ABC algorithm suffered from two main issues, related to the simulation effort and to the number of summary statistics used to summarize the data. These issues limited the use of ABC for the analysis of complex demographic histories and/or large datasets. In 2015, a paradigm shift in the ABC model selection procedure, based on a Machine Learning approach called Random Forest (ABC-RF, [19]), made it possible to overcome these limitations and paved the way for the application of ABC to the study of complex models through the analysis of complete genomes. Under ABC-RF, the model selection procedure is rephrased as a classification problem. First, the classifier is constructed from simulations from the prior distribution via a machine learning RF algorithm. Once the classifier is constructed and applied to the observed data, the posterior probability of the selected model can be approximated through another RF that regresses the selection error over the statistics used to summarize the data. The number of simulations necessary to obtain reliable estimates dropped from a few million to a few thousand, and the informative statistics are systematically extracted from the pool used to summarize the data. In 2018, a similar approach, based on a regression RF, was developed for parameter estimation [20].
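The two-step logic of ABC-RF model selection described above can be written compactly. The notation here is our own shorthand and is an assumption, not taken verbatim from [19]: s denotes the vector of summary statistics, s_obs the observed one, M the true model index of a simulation, and \hat{M}(s) the model chosen by the classification forest.

```latex
% Step 1: classification forest trained on the reference table {(M_j, s_j)}
\hat{M}(s_{\mathrm{obs}}) = \text{majority vote of the classification trees at } s_{\mathrm{obs}}

% Step 2: regression forest trained on the out-of-bag misclassification
% indicators \mathbb{1}\{\hat{M}(s_j) \neq M_j\}, giving an estimate
% \hat{\varrho}(s) of the local selection error
\Pr\bigl(M = \hat{M}(s_{\mathrm{obs}}) \mid s_{\mathrm{obs}}\bigr)
  \approx 1 - \hat{\varrho}(s_{\mathrm{obs}}),
\qquad
\hat{\varrho}(s) \approx \mathbb{E}\bigl[\mathbb{1}\{\hat{M}(s) \neq M\} \mid s\bigr]
```

In words, the reported posterior probability of the selected model is one minus the selection error predicted at the observed statistics.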
In [16] we showed that the ABC-RF algorithm, combined with the inferential power provided by the FDSS, can be satisfactorily exploited to estimate past population dynamics even in the case of complex demographic histories, thus making the approach particularly suitable for the analysis of the SD and MD models.
Under both SD and MD models, the structure of the past populations is the same, but the tree topologies differ in that they assume, respectively, one ancestral population for the SD model, and two ancestral populations leaving Africa at different times for the MD model. Australo-Melanesians are the population that might carry the signal of the first wave of migration out of the African continent. To make sure that the different results obtained by [12,13] were not due to differences in the Australo-Melanesian samples available, we repeated our analyses considering genomes from both studies, obtaining results that seem consistent and informative.

2. Materials and Methods

2.1. The FDSS

We summarized the data through the FDSS, i.e., the frequency distributions of the four mutually exclusive categories of segregating sites for pairs of populations (i.e., private polymorphisms in either population, shared polymorphisms, and fixed differences [21]). This statistic proved to be powerful for reconstructing even a complex series of demographic processes [16]. To calculate the FDSS, each genome analyzed is considered as subdivided into a certain number of independent fragments of a certain length, and for each fragment the number of sites belonging to each of the four above-mentioned categories is counted. The final vector of summary statistics is thus composed of the truncated frequency distribution of fragments having from 0 to n segregating sites in each category, for each pair of populations considered. We fixed the maximum number of segregating sites in a locus of a certain length to 100, and hence the last category contains all the observations higher than 100.
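As an illustration of the definition above, the following minimal sketch classifies sites for one pair of populations and builds the truncated distributions. It is not the script from the ABC-FDSS repository. The function names, the input layout (derived-allele counts per population at each site) and the choice to lump counts at or above the cap into the last bin are all assumptions made for the example.

```python
from collections import Counter

CATEGORIES = ("private_1", "private_2", "shared", "fixed")


def classify_site(derived_1, n_1, derived_2, n_2):
    """Return the category of a site, or None if it is not segregating.

    derived_k is the number of derived alleles among the n_k chromosomes
    sampled from population k (hypothetical input layout).
    """
    poly_1 = 0 < derived_1 < n_1
    poly_2 = 0 < derived_2 < n_2
    if poly_1 and poly_2:
        return "shared"
    if poly_1:
        return "private_1"
    if poly_2:
        return "private_2"
    # Both populations are monomorphic: this is a fixed difference only
    # if they are fixed for different alleles.
    if (derived_1 == 0) != (derived_2 == 0):
        return "fixed"
    return None


def fdss(fragments, n_1, n_2, cap=100):
    """Truncated frequency distributions over fragments, one per category.

    fragments is a list of fragments; each fragment is a list of
    (derived_1, derived_2) pairs, one pair per site. Entry k of each
    distribution counts the fragments with k sites in that category;
    the last entry also absorbs every count above the cap (assumption).
    """
    dist = {c: [0] * (cap + 1) for c in CATEGORIES}
    for sites in fragments:
        counts = Counter(classify_site(d1, n_1, d2, n_2) for d1, d2 in sites)
        for c in CATEGORIES:
            dist[c][min(counts[c], cap)] += 1
    return dist
```

Concatenating the four lists for every pair of populations would give one vector of summary statistics per simulated or observed dataset.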
We calculated the FDSS using a Python script (available on GitHub, https://github.com/anbena/ABC-FDSS) [16]. The ABC-RF model selection estimates were obtained using the function abcrf from the package abcrf, employing a forest of 500 classification trees, a number suggested to provide the best trade-off between computational efficiency and statistical precision [19]. Before proceeding with the model selection procedure, we computed the confusion matrices and evaluated the out-of-bag classification error (CE) and the proportion of True Positives (1-CE), which are representative of the power of the whole inferential procedure. The ABC-RF parameter estimation for the most supported model was performed through the function regAbcrf from the package abcrf, employing a forest of 500 regression trees. An outline of our entire workflow is reported in Figure S1.
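To make the power check concrete, the sketch below shows how a confusion matrix, the classification error (CE) and the per-model proportion of True Positives relate to each other. In the actual analysis the predicted labels would be the out-of-bag predictions returned by abcrf; here they are simply passed in as lists, and all function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, models=("SD", "MD")):
    """Rows are the true model, columns the model chosen by the classifier."""
    matrix = {t: {p: 0 for p in models} for t in models}
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def true_positive_rates(matrix):
    """Per-model proportion of simulations assigned to the correct model."""
    return {m: row[m] / sum(row.values()) for m, row in matrix.items()}


def classification_error(matrix):
    """Overall proportion of misclassified simulations (CE)."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[m][m] for m in matrix)
    return 1 - correct / total
```

For example, `true_positive_rates(confusion_matrix(true, predicted))` gives one value per model, which is the quantity compared across reference-table sizes in the power check.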

2.2. Simulated Models of Anatomically Modern Humans Expansion Out of Africa

We tested two alternative models of expansion of anatomically modern humans out of the African continent (Figure 1), both sharing the same structure for the archaic groups, but differing in the relationships among modern populations. To design the models, we followed the parametrization proposed by [13], with some modifications detailed below. The first model (SD) accounts for a single dispersal from Africa giving rise to both modern Eurasians and Australo-Melanesians; the second model (MD) accounts for two different waves of migration, from two different African source populations, giving rise first to the modern Australo-Melanesians and later to the modern Eurasians. The archaic groups consist of three Denisovan populations, two Neandertal populations, and an unknown archaic population ancestral to both Neandertals and Denisovans. We explicitly considered admixture pulses from archaic to modern populations: a pulse from the archaic unknown population to Australo-Melanesians (as reported in [22]), two pulses from two different Denisovan populations to Asians and Australo-Melanesians [23,24], and two pulses from the same Neandertal population, one to modern humans just after the separation between African and non-African populations and one to the ancestor of all Eurasians [25,26,27]. Both models account for the presence of a Basal European population, as described in [28,29,30]. This (so far, unknown) population contributed genes to modern Europeans, possibly diluting the contribution of archaic Neandertal variants in European genomes. The SD and MD models have 45 and 50 free parameters (i.e., parameters whose values are defined by prior distributions), respectively. The prior distributions associated with these parameters were set following the recent literature [13,23,30], and are reported in Tables S1 and S2.
We considered a generation time of 29 years, and we fixed the mutation rate at 1.25 × 10⁻⁸ per bp per generation [31] and the intra-locus recombination rate at 1.12 × 10⁻⁸, all values as in [13].
We performed 20,000, 50,000, and 100,000 simulations for each model with ms [32], to evaluate the Prior Error Rate and identify the optimum number of simulations to use. At each iteration, we sampled six diploid genomes, one Neandertal, one Denisova, one African, one European, one Asian, and one Papuan. The FDSS was calculated from 10,000 independent genomic fragments of 500 bp length.
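Because ms works with scaled parameters, the fixed values above have to be converted before simulating. The sketch below shows the standard ms conventions (locus-wide θ = 4·N0·μ·L, locus-wide ρ = 4·N0·r·(L − 1), times in units of 4·N0 generations). The reference size N0 is a free choice, and the value of 10,000 used in the usage note is purely illustrative. We also assume that the recombination rate is expressed per bp per generation, which the text does not state explicitly.

```python
GENERATION_TIME_YEARS = 29      # from the text
MUTATION_RATE = 1.25e-8         # per bp per generation, from the text
RECOMBINATION_RATE = 1.12e-8    # assumed to be per bp per generation
FRAGMENT_LENGTH_BP = 500        # from the text


def years_to_generations(years):
    """Convert calendar years to generations."""
    return years / GENERATION_TIME_YEARS


def ms_theta(n0):
    """Locus-wide theta = 4 * N0 * mu * L (argument of ms -t)."""
    return 4 * n0 * MUTATION_RATE * FRAGMENT_LENGTH_BP


def ms_rho(n0):
    """Locus-wide rho = 4 * N0 * r * (L - 1) (first argument of ms -r)."""
    return 4 * n0 * RECOMBINATION_RATE * (FRAGMENT_LENGTH_BP - 1)


def ms_time(years, n0):
    """Time in units of 4 * N0 generations (as used by ms -ej, -en, -es)."""
    return years_to_generations(years) / (4 * n0)
```

With the illustrative N0 = 10,000, `ms_theta(10_000)` is 0.25 per 500 bp fragment, and a split 74,000 years ago corresponds to about 2552 generations.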

2.3. Observed Genomic Data

We analyzed the high-coverage genomes of Denisova [33] and Neandertal [26], together with worldwide modern human samples from [12]. All the individuals were mapped against the human reference genome hg19 build 37. To calculate the observed FDSS we only considered autosomal regions outside known and predicted genes ± 10,000 bp and outside CpG islands and repeated regions (as defined on the UCSC platform, [34]). We extracted 10,000 independent fragments of 500 bp length, separated by at least 10,000 bp, in genomic regions that passed a set of minimal quality filters used for the analysis of the ancient genomes (map35_50%; [26,33]). We also included in the analysis the 25 Papuan individuals published by [13]. For these individuals, we downloaded the alignments in CRAM format from https://www.ebi.ac.uk/ega/datasets/EGAD00001001634. The mpileup and call commands from samtools-1.6 [35] were used to call all variants within the 10,000 neutral genomic fragments, using the --consensus-caller flag, without considering indels. We then filtered the initial call set according to the filters reported in [13] using vcflib and bcftools [35]. The complete set of samples used for the comparison between SD and MD is reported in Table S3.
In each model comparison, we evaluated the genomic variation of one Denisova, one Neandertal, one African (Congo-pygmies), one European (Estonians), one Asian (Vietnamese), and one Australo-Melanesian (Papuans). We decided to restrict the analysis to one high-coverage diploid genome per population, since previous extensive analyses showed that a single individual sampled per population has a discrimination power comparable to that of twenty chromosomes [16]. However, to ensure the consistency of the results, we performed several model selection procedures (a) taking into account at each run one out of six Papuans from [12] or one of 25 Papuans from [13]; (b) considering alternative individuals as representative of African, European, and Asian populations (Table S4).

2.4. Assessment of the Quality of the Parameters Estimated

One of the most interesting features of ABC is its high flexibility for model checking, i.e., for assessing the quality of the estimates inferred from real data. This is mainly achieved through the analysis of pseudo-observed data (pods), i.e., simulated datasets generated under known conditions. To determine whether the observed data would contain enough information to estimate parameters of the multi-dimensional model tested, we exploited 1000 pods, each generated from the most supported model (i.e., the MD model) and through a known combination of demographic parameters. Using these pods, for each parameter we calculated the following indices:
  • The coefficient of determination (R2). R2 is the fraction of variance of the parameters explained by the summary statistics used to build the regression model. In the absence of an established threshold value, there is a general agreement that when R2 < 0.10, the summary statistics do not convey enough information about the parameter estimates [36].
  • The relative bias. To calculate the relative bias, we estimated the parameters for each pod with the same approach used for the observed data. The bias depends on the sum of differences between the 1000 estimates of each parameter thus obtained and the known (true) value, and it is calculated as
    $$\frac{1}{n}\sum_{i=1}^{n}\frac{\theta_i - \theta}{\theta}$$
    where θi is the estimator of the parameter θ (true value), and n is the number of pods used (1000 in our case). Because bias is relative, a value of 1 corresponds to a bias equal to 100% of the true value.
  • The root mean square error (RMSE). To calculate the RMSE we re-estimated parameters using pods. The RMSE depends on the sum of squared differences between the 1000 estimates of each parameter thus obtained and the true value, and it is calculated as:
    $$\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\theta_i - \theta\right)^{2}}$$
  • The factor 2, representing the proportion of the 1000 estimated median values lying between 50% and 200% of the true value.
  • The 50% and 90% coverage, defined as the proportion of times that the known value lies within the 50% and the 90% credible interval of the 1000 estimates.
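The indices listed above are simple to compute once the pods have been re-estimated. The following sketch is not the authors' code and its function names are hypothetical. As in the formulas, it assumes a single known true value per parameter; if each pod carries its own true value, each estimate would be paired with its own θ.

```python
import math


def relative_bias(estimates, true_value):
    """Mean of (estimate - true) / true over all pods."""
    return sum((e - true_value) / true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Square root of the mean squared difference from the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


def factor2(estimates, true_value):
    """Proportion of estimates between 50% and 200% of the true value."""
    inside = sum(0.5 * true_value <= e <= 2 * true_value for e in estimates)
    return inside / len(estimates)


def coverage(intervals, true_value):
    """Proportion of (lower, upper) credible intervals containing the true value."""
    inside = sum(lower <= true_value <= upper for lower, upper in intervals)
    return inside / len(intervals)
```

For the 50% and 90% coverage, `coverage` would be called twice, once with each set of credible intervals.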

3. Results

3.1. Model Selection

Table 1 and Table S5 show the results of the power check of the comparison between SD and MD. Predictably, the Prior Error rate, which indicates the global quality of the ML classifier, decreases for increasing numbers of simulations in the reference table (from 20,000 to 100,000); for this reason, we decided to use 100,000 simulations for the subsequent analyses. The proportion of True Positives, that is the proportion of times the SD or the MD model is correctly recognized by the model selection procedure, is above 70% for both SD and MD, with a mean posterior probability associated with the true demography of about 75%.
Table 2 and Table S4 show the results of the model selection. Regardless of the Papuan individual considered and of the combination of non-Australo-Melanesian individuals tested, the model selection analyses supported the MD model as the scenario best explaining the recent evolution of anatomically modern humans out of Africa, with probabilities ranging from 78% to 84%.
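Since only two models are compared, a posterior probability for MD translates directly into odds of MD against SD, which is how a probability of about 80% corresponds to MD being about four times as likely as SD. The short sketch below makes the arithmetic explicit; the function name is hypothetical.

```python
def posterior_odds(p_md):
    """Odds of MD against SD when these are the only two models compared."""
    return p_md / (1 - p_md)


# Lower bound, round figure and upper bound of the reported range.
odds_by_probability = {p: posterior_odds(p) for p in (0.78, 0.80, 0.84)}
```

Across the reported range, the odds run from roughly 3.5 to 1 (at 78%) to about 5 to 1 (at 84%).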

3.2. Parameters Estimation

Having identified MD as the most probable model, we moved on to estimate the parameter values maximizing the fit between observed and simulated genomic data. To do this, we exploited the recently developed ML method based on a regression RF approach [20]. As detailed in [20], a faithful estimation of the parameters’ posterior distributions may now be achieved with a reduced number of simulations (i.e., a few thousand; we used 100,000 simulations), making it feasible to also perform an accurate assessment of the quality of the estimated parameters using pods.
Parameters were estimated from two observed datasets (one with a Papuan individual from [13] and one with a Papuan individual from [12]), namely those that produced the highest posterior probability for the MD model in the model selection (Table 3 and Table 4). The posterior plots and the definitions of the parameter acronyms are reported in the Supplementary Materials (Figures S2–S10, Table S6). The R2, the bias, the RMSE, the Factor 2, and the 50% and 90% Coverage associated with each of these parameters are shown in Table 5. As expected for a complex demography, many parameters are not well estimated, as indicated by low R2, high bias, and high RMSE. The parameters showing better estimation quality are the effective population sizes, in particular those associated with the ancestral population of African and non-African modern humans (nYG, R2 = 91%), and the ancestral population of modern and archaic groups (nAM, R2 = 99%). The divergence times appear to have been estimated reasonably well, with most R2 values above 10%. This is true in particular for the times of the two Out of Africa events, which also show a low bias and a high Factor 2 and Coverage. On the other hand, it is evident that the data tell us very little about admixture events (their timing and admixture proportions) and migration rates. Although disappointing, this is not unexpected, and high levels of uncertainty associated with these parameters were already reported [13].
The estimate of the current African effective population size (nY) is about 15,000 (median value), in agreement with previous studies [37,38]. Lower values are estimated for the Eurasians, with an effective population size of about 7000 individuals for the Europeans (nE) and of about 11,000 individuals for the Asians (nA). The estimate for the Australo-Melanesian population is somewhat higher: the median value of the effective population size is about 25,000 individuals (nP).
The first divergence within Africa (tdYG1), which generated the source population giving rise to the first wave of migrants, has been estimated at about 104,000 years ago, with a 95% confidence interval between 55,000 and 141,000 years ago (and a 50% CI between 78,000 and 125,000 years ago). The first wave of migrants left Africa (tdOA1) about 74,000 years ago (95% CI: 47,000–120,000 years ago; 50% CI: 55,000–96,000 years ago), whereas the second wave of migration (tdOA2), which originated from a structure generated (tdYG2) about 100,000 years ago, left Africa about 46,000 years ago (95% CI: 40,000–59,000 years ago; 50% CI: 42,000–51,000 years ago). Europeans and Asians diverged (tdEA) about 37,000 years ago. These estimates are in agreement with a previous work that considered a less realistic model and a smaller amount of genetic data [11].

4. Discussion

In this paper, we explicitly compared two models of AMH evolution through an ABC-RF approach based on the analysis of modern and ancient complete genomes. The two tested demographic models consider details of our evolutionary history that have been proposed in the recent literature, such as the presence of a (so far, unsampled) Basal European population contributing to the genome of recent Europeans [30], or the two distinct pulses of admixture from two different Denisovan populations to Asians and Papuans [23]. The main difference between the two scenarios regards the dynamics of expansion from Africa of AMH. According to the SD model, all non-African populations derive from a single major migration wave; on the contrary, the MD model assumes two migration waves, distinct in time and place, the first one giving rise to modern Australo-Melanesians and the other giving rise to Eurasians. Needless to say, successive processes of gene flow and admixture have certainly complicated the apparently simple patterns generated by the initial African dispersal(s). Yet, even these admittedly simplified models are complex (defined by up to 50 parameters), and the differences between them are relatively small; therefore, one could expect that it might be difficult to tell them apart. On the contrary, the ABC-RF procedure we chose provided a good discriminatory power, with a proportion of True Positives of about 70% for both SD and MD models. This TP proportion is comparable to, or higher than, that reported in previous works where simpler (and hence less realistic) models were analyzed (see e.g., [39,40]). When the two alternative models were compared, the MD model was consistently about four times as probable as the SD model, no matter which Papuan (Table 2), African, European or Asian individuals were considered (Table S4), with a posterior probability estimated around 80%.
The support for the MD model is marginally higher than in [16], where a comparison between two alternative, less up-to-date, evolutionary histories of AMH favored the MD model with a probability of about 75%. These results are robust to slight changes in the MD parametrization. We also tested a version of MD in which Papuans derived part of their genomes from Eurasians, modeled as a single pulse of admixture occurring after the second exit (rather than through a process of continuous gene flow); the results are reported in Table S7. Even in this version, the MD model was better supported by the data than the SD model, although it appeared slightly less likely than the previous MD model when included in the general comparison.
In this work, for the first time, we also attempted to estimate the parameters of the supported model by ABC-RF. The MD model was defined by 50 free parameters, estimated through the regression random forest algorithm [20]. We also assessed the quality of these estimates by calculating statistics that inform on the inferential power of the parameter estimation procedure. Such an assessment has been prohibitive so far, due to the computational limits of other inferential methods, e.g., those based on composite likelihood [41]. With ABC-RF, instead, the same reference table (made up of just a few thousand simulations) allows one both to estimate parameters and to assess their quality, using a subset of the simulations as “pods”. To perform the same analysis by composite-likelihood methods, one would require about 100 thousand new simulations for each pod analyzed, which means, even considering only 100 pods, billions of simulations. This large amount of simulated data often exceeds computational constraints, in particular when complex demographies are analyzed. As a consequence, in studies of complex models, no information was provided about the reliability of parameter estimates [13,42]. The procedure we applied made it possible to compensate for this drawback, as shown in Table 5.
It would have been unrealistic to expect that all 50 parameters could be reliably estimated. The migration rates among modern populations, or the proportion and timing of admixture events, for instance, proved elusive, showing a low R2 and high bias and RMSE values. This is expected, because an almost infinite set of parameter combinations leads to the same patterns of genome diversity; for instance, old small-scale admixture events and recent larger-scale admixture events produce, in principle, the same consequences at the genomic level. Other parameters show better estimates. This is the case for the effective population sizes and, to a lesser extent, for the divergence times. The African, European and Asian estimates of the effective population sizes are consistent with what is reported in the literature [38,43]; the higher value estimated for the Australo-Melanesian group, here represented by the Papuans, may be surprising, but it is in agreement with the harmonic mean of the effective population sizes estimated over time by [12].
The most interesting parameters are those associated with the divergence/departure from Africa. These parameters show R2 above 10%, good coverage, and a factor 2 of about 100%; however, their confidence intervals are very wide and their posterior distributions often seem to reflect the prior range. This means that these estimates should still be taken with caution and that the ABC inferential procedure, albeit powerful, leaves room for improvement. The key advantage of ABC estimation is that the “quality assessment” procedure makes one aware of how reliable the estimates are; with this in mind, we can still discuss the estimates obtained. We dated the structure of African groups that gave rise to the source populations of the migration waves from Africa at about 100,000 years ago. The bottleneck of the first exit from Africa, associated with the origin of Australo-Melanesian groups, has been estimated at about 74,000 years ago, in line with the timing inferred from paleoanthropological data (70,000 years ago, [44]). The second exit, giving rise to Eurasian populations, was placed at about 46,000 years ago. This is in agreement with previous estimates from genomic data [4,38,45] and receives further support from the relatively recent arrival of modern humans in Europe suggested by much of the archaeological evidence (40–45 thousand years ago, [46,47]). Some authors proposed an even earlier presence of AMH in Europe [48]. Be that as it may, it is also plausible that large-scale gene flow processes, documented at least twice in Europe (in the Neolithic period and Bronze Age; see [49]), may have slightly reduced diversity and hence the apparent depth of the DNA genealogies, thus producing a bias towards more recent values in the estimation of divergence times.
The two migration waves from Africa considered in the MD model appear to be separated in time, with no temporal overlap considering their 50% confidence interval (55,000–96,000 for the first exit and 42,000–51,000 for the second exit), and a limited overlap considering their 95% confidence interval (47,000–120,000 for the first exit and 40,000–59,000 for the second exit).
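The overlap statement can be checked directly from the reported intervals. The sketch below uses only the interval bounds given in the text; the helper name is hypothetical.

```python
def overlap(interval_a, interval_b):
    """Length of the overlap between two closed intervals (0 if disjoint)."""
    lower = max(interval_a[0], interval_b[0])
    upper = min(interval_a[1], interval_b[1])
    return max(0, upper - lower)


# Intervals in years ago, as reported for the two exits from Africa.
FIRST_EXIT = {"50%": (55_000, 96_000), "95%": (47_000, 120_000)}
SECOND_EXIT = {"50%": (42_000, 51_000), "95%": (40_000, 59_000)}

overlaps = {
    level: overlap(FIRST_EXIT[level], SECOND_EXIT[level])
    for level in FIRST_EXIT
}
```

The 50% intervals are disjoint, whereas the 95% intervals share the range between 47,000 and 59,000 years ago.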

5. Conclusions

In this paper we extensively tested two up-to-date models of modern human expansion Out of Africa through a machine learning ABC approach. The simulated variation has been compared with that observed in ancient and modern genomes, and our results consistently supported a Multiple Dispersal model, in which modern Australo-Melanesians derive from an earlier migration from Africa than that giving rise to Eurasians. We also estimated the parameters of the most supported model, and we concentrated our effort on assessing the quality of the estimates produced. This procedure, albeit fundamental to ensure the reliability of the estimates, is rarely performed, due to the limitations of available inferential methods. These limitations are currently overcome by the ABC-RF procedure coupled with the FDSS statistic, which allowed us to highlight the weaknesses and strengths of the estimated parameters. Our results indeed indicate that the hypothesis of two main dispersal events from Africa, separated in time and place [10,11,12], cannot be dismissed [4,13], but the quality assessment of the parameters we estimated shows that it needs to be further explored.

Supplementary Materials

The following are available online at https://0-www-mdpi-com.brum.beds.ac.uk/2073-4425/11/12/1510/s1. Table S1: Demographic parameters and prior distributions of Single Dispersal model; Table S2: Demographic parameters and prior distributions of Multiple Dispersal model; Table S3: Complete list of genomes used for the comparison of Single Dispersal model and Multiple Dispersal model using real data; Table S4: Results of model selection performed using alternative individuals from African, European and Asian populations; Table S5: Power test of model comparison for increasing number of simulations considered in the reference table; Table S6: Complete list of acronyms of the MD model’s demographic parameters; Table S7: Model Selection results including the MD-Pulse admixture model; Figure S1: Outline of the entire workflow; Figure S2: Posterior density of the effective population sizes estimated using the Papuan sample from Malaspinas et al. (2016); Figure S3: Posterior density of the divergence times and the admixture times estimated using the Papuan sample from Malaspinas et al. (2016); Figure S4: Posterior density of the admixture rates estimated using the Papuan sample from Malaspinas et al. (2016); Figure S5: Posterior density of the migration rates estimated using the Papuan sample from Malaspinas et al. (2016); Figure S6: Posterior density of the effective population sizes estimated using the Papuan sample from Pagani et al. (2016); Figure S7: Posterior density of the divergence times and the admixture times estimated using the Papuan sample from Pagani et al. (2016); Figure S8: Posterior density of the admixture rates estimated using the Papuan sample from Pagani et al. (2016); Figure S9: Posterior density of the migration rates estimated using the Papuan sample from Pagani et al. (2016); Figure S10: Simplified version of the most supported model (MD) showing the main demographic parameters.

Author Contributions

Conceptualization, G.B. and S.G.; formal analysis, M.T.V.; methodology, A.B. and S.G.; software, M.T.V. and A.B.; supervision, G.B. and S.G.; writing—original draft, G.B. and S.G.; writing—review and editing, M.T.V., A.B., G.B., and S.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

We are indebted to Francesca Tassi and Alberto Seno for technical help.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Scerri, E.M.L.; Thomas, M.G.; Manica, A.; Gunz, P.; Stock, J.T.; Stringer, C.; Grove, M.; Groucutt, H.S.; Timmermann, A.; Rightmire, G.P.; et al. Did Our Species Evolve in Subdivided Populations across Africa, and Why Does It Matter? Trends Ecol. Evol. 2018, 33, 582–594.
  2. Mellars, P. Neanderthals and the Modern Human Colonization of Europe. Nature 2004, 432, 461–465.
  3. Higham, T.; Douka, K.; Wood, R.; Ramsey, C.B.; Brock, F.; Basell, L.; Camps, M.; Arrizabalaga, A.; Baena, J.; Barroso-Ruíz, C.; et al. The Timing and Spatiotemporal Patterning of Neanderthal Disappearance. Nature 2014, 512, 306–309.
  4. Mallick, S.; Li, H.; Lipson, M.; Mathieson, I.; Gymrek, M.; Racimo, F.; Zhao, M.; Chennagiri, N.; Nordenfelt, S.; Tandon, A.; et al. The Simons Genome Diversity Project: 300 Genomes from 142 Diverse Populations. Nature 2016, 538, 201–206.
  5. Hershkovitz, I.; Weber, G.W.; Quam, R.; Duval, M.; Grün, R.; Kinsley, L.; Ayalon, A.; Bar-Matthews, M.; Valladas, H.; Mercier, N.; et al. The Earliest Modern Humans Outside Africa. Science 2018, 359, 456–459.
  6. Liu, H.; Prugnolle, F.; Manica, A.; Balloux, F. A Geographically Explicit Genetic Model of Worldwide Human-Settlement History. Am. J. Hum. Genet. 2006, 79, 230–237.
  7. Mellars, P.; Gori, K.C.; Carr, M.; Soares, P.A.; Richards, M.B. Genetic and Archaeological Perspectives on the Initial Modern Human Colonization of Southern Asia. Proc. Natl. Acad. Sci. USA 2013, 110, 10699–10704.
  8. López, S.; Van Dorp, L.; Hellenthal, G. Human Dispersal out of Africa: A Lasting Debate. Evol. Bioinform. 2015.
  9. Lahr, M.M.; Foley, R. Multiple Dispersals and Modern Human Origins. Evol. Anthropol. Issues News Rev. 1994, 3, 48–60.
  10. Reyes-Centeno, H.; Ghirotto, S.; Detroit, F.; Grimaud-Herve, D.; Barbujani, G.; Harvati, K. Genomic and Cranial Phenotype Data Support Multiple Modern Human Dispersals from Africa and a Southern Route into Asia. Proc. Natl. Acad. Sci. USA 2014, 111, 7248–7253.
  11. Tassi, F.; Ghirotto, S.; Mezzavilla, M.; Vilaça, S.T.; De Santi, L.; Barbujani, G. Early Modern Human Dispersal from Africa: Genomic Evidence for Multiple Waves of Migration. Investig. Genet. 2015, 6, 6–13.
  12. Pagani, L.; Lawson, D.J.; Jagoda, E.; Mörseburg, A.; Eriksson, A.; Mitt, M.; Clemente, F.; Hudjashov, G.; DeGiorgio, M.; Saag, L.; et al. Genomic Analyses Inform on Migration Events during the Peopling of Eurasia. Nature 2016, 538, 238–242.
  13. Malaspinas, A.S.; Westaway, M.C.; Muller, C.; Sousa, V.C.; Lao, O.; Alves, I.; Bergström, A.; Georgios, A.; Cheng, J.Y.; Crawford, G.E. A Genomic History of Aboriginal Australia. Nature 2016, 538, 207–214.
  14. Varin, C. On Composite Marginal Likelihoods. Asta Adv. Stat. Anal. 2008, 92, 1–28.
  15. Varin, C.; Reid, N.; Firth, D. An Overview of Composite Likelihood Methods. Stat. Sin. 2011, 21, 5–42.
  16. Ghirotto, S.; Vizzari, M.T.; Tassi, F.; Barbujani, G.; Benazzo, A. Distinguishing among Complex Evolutionary Models Using Unphased Whole-genome Data through Random-Forest Approximate Bayesian Computation. Mol. Ecol. Resour. 2020, 1–15.
  17. Beaumont, M.A.; Zhang, W.; Balding, D.J. Approximate Bayesian Computation in Population Genetics. Genetics 2002, 162, 2025–2035.
  18. Beaumont, M.A. Joint Determination of Topology, Divergence Time, and Immigration in Population Trees. In Simulations, Genetics and Human Prehistory; McDonald Institute for Archaeological Research: Cambridge, UK, 2008; pp. 135–154.
  19. Pudlo, P.; Marin, J.M.; Estoup, A.; Cornuet, J.M.; Gautier, M.; Robert, C.P. Reliable ABC Model Choice via Random Forests. Bioinformatics 2015, 32, 859–866.
  20. Raynal, L.; Marin, J.M.; Pudlo, P.; Ribatet, M.; Robert, C.P.; Estoup, A. ABC Random Forests for Bayesian Parameter Inference. Bioinformatics 2019, 35, 1720–1728.
  21. Wakeley, J.; Hey, J. Estimating Ancestral Population Parameters. Genetics 1997, 145, 847–855.
  22. Mondal, M.; Casals, F.; Xu, T.; Dall’Olio, G.M.; Pybus, M.; Netea, M.G.; Comas, D.; Laayouni, H.; Li, Q.; Majumder, P.P.; et al. Genomic Analysis of Andamanese Provides Insights into Ancient Human Migration into Asia and Adaptation. Nat. Genet. 2016, 48, 1066–1070.
  23. Browning, S.R.; Browning, B.L.; Zhou, Y.; Tucci, S.; Akey, J.M. Analysis of Human Sequence Data Reveals Two Pulses of Archaic Denisovan Admixture. Cell 2018, 173, 53–61.e9.
  24. Jacobs, G.S.; Hudjashov, G.; Saag, L.; Kusuma, P.; Darusallam, C.C.; Lawson, D.J.; Mondal, M.; Pagani, L.; Ricaut, F.-X.; Stoneking, M.; et al. Multiple Deeply Divergent Denisovan Ancestries in Papuans. Cell 2019, 177, 1010–1021.
  25. Wall, J.D.; Yang, M.A.; Jay, F.; Kim, S.K.; Durand, E.Y.; Stevison, L.S.; Gignoux, C.; Woerner, A.; Hammer, M.F.; Slatkin, M. Higher Levels of Neanderthal Ancestry in East Asians than in Europeans. Genetics 2013, 194, 199–209.
  26. Prüfer, K.; Racimo, F.; Patterson, N.; Jay, F.; Sankararaman, S.; Sawyer, S.; Heinze, A.; Renaud, G.; Sudmant, P.H.; De Filippo, C.; et al. The Complete Genome Sequence of a Neanderthal from the Altai Mountains. Nature 2014, 505, 43–49.
  27. Vernot, B.; Akey, J.M. Resurrecting Surviving Neandertal Lineages from Modern Human Genomes. Science 2014, 343, 1017–1021.
  28. Lazaridis, I.; Patterson, N.; Mittnik, A.; Renaud, G.; Mallick, S.; Kirsanow, K.; Sudmant, P.H.; Schraiber, J.G.; Castellano, S.; Lipson, M.; et al. Ancient Human Genomes Suggest Three Ancestral Populations for Present-Day Europeans. Nature 2014, 513, 409–413.
  29. Lazaridis, I.; Nadel, D.; Rollefson, G.; Merrett, D.C.; Rohland, N.; Mallick, S.; Fernandes, D.; Novak, M.; Gamarra, B.; Sirak, K.; et al. Genomic Insights into the Origin of Farming in the Ancient Near East. Nature 2016, 536, 419–424.
  30. Villanea, F.A.; Schraiber, J.G. Multiple Episodes of Interbreeding between Neanderthal and Modern Humans. Nat. Ecol. Evol. 2019, 3, 39–44.
  31. Scally, A.; Durbin, R. Revising the Human Mutation Rate: Implications for Understanding Human Evolution. Nat. Rev. Genet. 2012, 13, 745–753.
  32. Hudson, R.R. Generating Samples under a Wright-Fisher Neutral Model of Genetic Variation. Bioinformatics 2002, 18, 337–338.
  33. Meyer, M.; Kircher, M.; Gansauge, M.T.; Li, H.; Racimo, F.; Mallick, S.; Schraiber, J.G.; Jay, F.; Prüfer, K.; De Filippo, C.; et al. A High-Coverage Genome Sequence from an Archaic Denisovan Individual. Science 2012, 338, 222–226.
  34. Hinrichs, A.S.; Raney, B.J.; Speir, M.L.; Rhead, B.; Casper, J.; Karolchik, D.; Kuhn, R.M.; Rosenbloom, K.R.; Zweig, A.S.; Haussler, D.; et al. UCSC Data Integrator and Variant Annotation Integrator. Bioinformatics 2016, 32, 1430–1432.
  35. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The Sequence Alignment/Map Format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
  36. Neuenschwander, S.; Largiadèr, C.R.; Ray, N.; Currat, M.; Vonlanthen, P.; Excoffier, L. Colonization History of the Swiss Rhine Basin by the Bullhead (Cottus Gobio): Inference under a Bayesian Spatially Explicit Framework. Mol. Ecol. 2008, 17, 757–772.
  37. Fan, S.; Kelly, D.E.; Beltrame, M.H.; Hansen, M.E.B.; Mallick, S.; Ranciaro, A.; Hirbo, J.; Thompson, S.; Beggs, W.; Nyambo, T.; et al. African Evolutionary History Inferred from Whole Genome Sequence Data of 44 Indigenous African Populations. Genome Biol. 2019, 20, 1–14.
  38. McEvoy, B.P.; Powell, J.E.; Goddard, M.E.; Visscher, P.M. Human Population Dispersal “Out of Africa” Estimated from Linkage Disequilibrium and Allele Frequencies of SNPs. Genome Res. 2011, 21, 821–829.
  39. Fagundes, N.J.R.; Ray, N.; Beaumont, M.; Neuenschwander, S.; Salzano, F.M.; Bonatto, S.L.; Excoffier, L. Statistical Evaluation of Alternative Models of Human Evolution. Proc. Natl. Acad. Sci. USA 2007, 104, 17614–17619.
  40. Veeramah, K.R.; Wegmann, D.; Woerner, A.; Mendez, F.L.; Watkins, J.C.; Destro-Bisol, G.; Soodyall, H.; Louie, L.; Hammer, M.F. An Early Divergence of KhoeSan Ancestors from Those of Other Modern Humans Is Supported by an ABC-Based Analysis of Autosomal Resequencing Data. Mol. Biol. Evol. 2012, 29, 617–630.
  41. Excoffier, L.; Dupanloup, I.; Huerta-Sánchez, E.; Sousa, V.C.; Foll, M. Robust Demographic Inference from Genomic and SNP Data. PLoS Genet. 2013, 9, e1003905.
  42. Nater, A.; Mattle-Greminger, M.P.; Nurcahyo, A.; Nowak, M.G.; De Manuel, M.; Desai, T.; Groves, C.; Pybus, M.; Sonay, T.B.; Roos, C.; et al. Morphometric, Behavioral, and Genomic Evidence for a New Orangutan Species. Curr. Biol. 2017, 27, 3576–3577.
  43. Schiffels, S.; Durbin, R. Inferring Human Population Size and Separation History from Multiple Genome Sequences. Nat. Genet. 2014, 46, 919–925.
  44. Mirazón Lahr, M.; Foley, R.A. Towards a Theory of Modern Human Origins: Geography, Demography, and Diversity in Recent Human Evolution. Am. J. Phys. Anthropol. 1999, 107, 137–176.
  45. Gravel, S.; Henn, B.M.; Gutenkunst, R.N.; Indap, A.R.; Marth, G.T.; Clark, A.G.; Yu, F.; Gibbs, R.A.; Bustamante, C.D.; The 1000 Genomes Project; et al. Demographic History and Rare Allele Sharing among Human Populations. Proc. Natl. Acad. Sci. USA 2011, 108, 11983–11988.
  46. Mellars, P. Why Did Modern Human Populations Disperse from Africa ca. 60,000 Years Ago? A New Model. Proc. Natl. Acad. Sci. USA 2006, 103, 9381–9386.
  47. Reyes-Centeno, H.; Hubbe, M.; Hanihara, T.; Stringer, C.; Harvati, K. Testing Modern Human Out-of-Africa Dispersal Models and Implications for Modern Human Origins. J. Hum. Evol. 2015, 87, 95–106.
  48. Hublin, J.J.; Sirakov, N.; Aldeias, V.; Bailey, S.; Bard, E.; Delvigne, V.; Endarova, E.; Fagault, Y.; Fewlass, H.; Hajdinjak, M.; et al. Initial Upper Palaeolithic Homo Sapiens from Bacho Kiro Cave, Bulgaria. Nature 2020, 581, 299–302.
  49. Haak, W.; Lazaridis, I.; Patterson, N.; Rohland, N.; Mallick, S.; Llamas, B.; Brandt, G.; Nordenfelt, S.; Harney, E.; Stewardson, K.; et al. Massive Migration from the Steppe Was a Source for Indo-European Languages in Europe. Nature 2015, 522, 207–211.
Figure 1. Demographic models compared: Single Dispersal (A) and Multiple Dispersals (B). AR: unknown archaic population; D-D1-D2: Denisovan groups; N-NR: Neandertal and Neandertal related groups; Y: African population; G1-G2: ghost populations; BE: Basal Europe population; E: European population; A: Asian population; P: Australo-Melanesian population.
Table 1. Power test for model comparison using a reference table with 100,000 simulations per model.
| Prior Err. Rate | True Positive SD | True Positive MD | Post. Prob. SD | Post. Prob. MD |
| --- | --- | --- | --- | --- |
| 0.26 | 0.73 | 0.75 | 0.75 | 0.73 |
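As a quick consistency check on Table 1, the prior error rate can be related to the two true-positive rates. The sketch below assumes that both models contribute the same number of simulations to the reference table (100,000 each, per the caption) and that the prior error rate is the overall misclassification rate; the function name is illustrative and not taken from the paper.

```python
# Values copied from Table 1.
tp_sd = 0.73  # true-positive rate, Single Dispersal
tp_md = 0.75  # true-positive rate, Multiple Dispersals


def balanced_error_rate(tp_rates):
    """Overall misclassification rate when every model carries equal weight."""
    return 1.0 - sum(tp_rates) / len(tp_rates)


prior_error = balanced_error_rate([tp_sd, tp_md])
print(round(prior_error, 2))
```

Under these assumptions the result agrees with the 0.26 reported in the first column.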
Table 2. Model Selection results using Papuan individuals from [12,13]. In the first column are reported the ID of the Papuan samples used for the model choice. The second column shows the model selected by the ABC procedure. In the third and the fourth columns are reported the votes assigned to the SD and MD models by the Random-Forest algorithm. The last column shows the posterior probabilities associated with the most supported model. The samples with the highest posterior probabilities (in bold) were selected to perform the parameter estimation of the MD model.
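| ID_Individual | Selected Model | Votes SD | Votes MD | Post. Prob. |
| --- | --- | --- | --- | --- |
| EGAN00001279031 | MD | 94 | 406 | 0.822 |
| EGAN00001279039 | MD | 86 | 414 | 0.806 |
| EGAN00001279047 | MD | 111 | 389 | 0.798 |
| EGAN00001279054 | MD | 128 | 372 | 0.809 |
| EGAN00001279032 | MD | 90 | 410 | 0.825 |
| EGAN00001279040 | MD | 113 | 387 | 0.784 |
| EGAN00001279048 | MD | 99 | 401 | 0.805 |
| EGAN00001279033 | MD | 108 | 392 | 0.791 |
| EGAN00001279041 | MD | 111 | 389 | 0.797 |
| EGAN00001279049 | MD | 126 | 374 | 0.789 |
| EGAN00001279034 | MD | 150 | 350 | 0.797 |
| EGAN00001279042 | MD | 109 | 391 | 0.791 |
| EGAN00001279050 | MD | 111 | 389 | 0.797 |
| EGAN00001279035 | MD | 108 | 392 | 0.799 |
| EGAN00001279043 | MD | 97 | 403 | 0.802 |
| EGAN00001279051 | MD | 117 | 383 | 0.786 |
| EGAN00001279036 | MD | 136 | 364 | 0.778 |
| EGAN00001279044 | MD | 109 | 391 | 0.784 |
| EGAN00001279052 | MD | 100 | 400 | 0.815 |
| EGAN00001279037 | MD | 96 | 404 | 0.800 |
| EGAN00001279045 | MD | 148 | 352 | 0.787 |
| EGAN00001279053 | MD | 100 | 400 | 0.796 |
| EGAN00001279038 | MD | 91 | 409 | 0.811 |
| EGAN00001279046 | MD | 104 | 396 | 0.781 |
| EGAN00001279055 | MD | 138 | 362 | 0.787 |
| Koinb1 | MD | 165 | 335 | 0.810 |
| Koinb2 | MD | 129 | 371 | 0.811 |
| Koinb3 | MD | 175 | 325 | 0.820 |
| Kosip1 | MD | 152 | 348 | 0.818 |
| Kosip2 | MD | 136 | 364 | 0.788 |
| Kosip3 | MD | 123 | 377 | 0.830 |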
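The vote counts and the posterior probabilities in Table 2 are separate columns, and the share of votes need not equal the posterior probability. The following sketch, with hypothetical helper names, computes the vote share for the first row and converts the lowest and highest posterior probabilities in the table into posterior odds in favour of MD, assuming only two competing models so that the SD probability is one minus the MD probability.

```python
def vote_fraction(votes_sd, votes_md):
    """Share of Random-Forest trees voting for the MD model."""
    return votes_md / (votes_sd + votes_md)


def posterior_odds(post_prob):
    """Odds of the selected model against the single alternative model."""
    return post_prob / (1.0 - post_prob)


# First row of Table 2 (EGAN00001279031).
frac = vote_fraction(94, 406)

# Lowest and highest posterior probabilities reported in Table 2.
odds_low = posterior_odds(0.778)
odds_high = posterior_odds(0.830)

print(f"vote share: {frac:.3f}")
print(f"odds range: {odds_low:.2f} to {odds_high:.2f}")
```

Posterior probabilities close to 0.8 correspond to odds close to 4, which matches the "four-fold as likely" wording used in the abstract.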
Table 3. Estimated parameters for the MD model using the Papuan samples from [13]. The mean and the median estimated values are listed, as well as the 90% and the 50% credible intervals. The parameters cited in the text are reported in bold.
| Parameter | Mean | Median | Variance | Q (0.05) | Q (0.95) | Q (0.25) | Q (0.75) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| nAR | 2822 | 2793 | 5.77 × 10^4 | 2540 | 3410 | 2666 | 2914 |
| nY | 19,077 | 14,347 | 1.72 × 10^8 | 4204 | 44,993 | 7976 | 29,117 |
| nG1 | 26,191 | 26,995 | 2.08 × 10^8 | 3253 | 47,385 | 13,670 | 39,819 |
| nG2 | 23,473 | 22,275 | 1.96 × 10^8 | 1903 | 46,649 | 11,151 | 34,663 |
| nBE | 25,612 | 26,269 | 2.08 × 10^8 | 2731 | 47,604 | 13,394 | 38,160 |
| nE | 13,498 | 6616 | 2.07 × 10^8 | 627 | 42,565 | 1616 | 23,761 |
| nA | 16,360 | 11,553 | 2.25 × 10^8 | 773 | 44,620 | 2599 | 28,065 |
| nP | 24,268 | 24,839 | 2.34 × 10^8 | 1535 | 47,534 | 10,756 | 37,349 |
| nYG | 23,317 | 22,292 | 3.19 × 10^7 | 17,112 | 35,456 | 19,789 | 25,425 |
| nNNR | 2424 | 2343 | 1.22 × 10^5 | 2057 | 3001 | 2219 | 2504 |
| nDDR | 21,360 | 19,680 | 2.00 × 10^8 | 1570 | 46,512 | 9482 | 32,332 |
| nDN | 17,025 | 12,576 | 1.77 × 10^8 | 2789 | 43,117 | 5312 | 27,001 |
| nADN | 19,733 | 16,531 | 2.28 × 10^8 | 2108 | 47,465 | 5770 | 31,455 |
| nAM | 18,846 | 18,745 | 1.73 × 10^6 | 16,780 | 21,023 | 17,911 | 19,745 |
| rP | 0.0214 | 0.0146 | 8.36 × 10^−4 | 0.0105 | 0.0532 | 0.0119 | 0.0192 |
| rEA | 0.0313 | 0.0179 | 1.91 × 10^−3 | 0.0109 | 0.0869 | 0.0142 | 0.0303 |
| tdYG1 | 101,162 | 103,842 | 7.61 × 10^8 | 54,830 | 140,536 | 78,262 | 125,226 |
| tdYG2 | 99,000 | 98,925 | 7.13 × 10^8 | 55,038 | 137,970 | 76,482 | 124,250 |
| tdOA1 | 77,106 | 73,566 | 5.86 × 10^8 | 47,019 | 120,206 | 55,392 | 96,881 |
| tOAbot1 | 73,389 | 66,248 | 6.14 × 10^8 | 44,341 | 118,942 | 52,082 | 93,165 |
| tdOA2 | 47,524 | 45,937 | 3.99 × 10^7 | 40,394 | 59,245 | 42,597 | 51,019 |
| tOAbot2 | 45,223 | 43,282 | 5.30 × 10^7 | 37,718 | 58,387 | 40,110 | 48,153 |
| tdG2BE | 68,415 | 61,497 | 3.78 × 10^8 | 50,281 | 113,560 | 53,713 | 75,889 |
| tdEA | 38,187 | 37,017 | 4.33 × 10^7 | 30,483 | 50,076 | 33,374 | 41,444 |
| taNG2 | 52,032 | 49,731 | 8.13 × 10^7 | 42,680 | 69,758 | 45,402 | 55,444 |
| taNEA | 41,663 | 40,005 | 4.51 × 10^7 | 33,965 | 55,743 | 36,653 | 45,055 |
| taARP | 61,567 | 55,048 | 4.53 × 10^8 | 37,831 | 106,642 | 43,945 | 75,654 |
| taD1P | 51,047 | 44,460 | 3.89 × 10^8 | 31,094 | 95,155 | 36,207 | 58,088 |
| taD2A | 28,645 | 27,059 | 4.24 × 10^7 | 20,958 | 39,746 | 23,730 | 32,456 |
| taBEE | 25,269 | 24,844 | 1.00 × 10^8 | 11,194 | 45,254 | 16,827 | 31,380 |
| paNG2 | 5.19 × 10^−2 | 4.99 × 10^−2 | 7.71 × 10^−4 | 9.44 × 10^−3 | 9.52 × 10^−3 | 2.91 × 10^−2 | 7.73 × 10^−2 |
| paNEA | 4.73 × 10^−2 | 4.73 × 10^−2 | 7.95 × 10^−4 | 5.36 × 10^−3 | 9.57 × 10^−2 | 2.30 × 10^−2 | 7.01 × 10^−2 |
| paARP | 4.82 × 10^−2 | 4.83 × 10^−2 | 9.00 × 10^−4 | 4.97 × 10^−3 | 9.45 × 10^−2 | 2.09 × 10^−2 | 7.71 × 10^−2 |
| paD1P | 5.21 × 10^−2 | 5.27 × 10^−2 | 8.43 × 10^−4 | 4.58 × 10^−3 | 9.53 × 10^−2 | 2.84 × 10^−2 | 7.85 × 10^−2 |
| paD2A | 4.74 × 10^−2 | 4.72 × 10^−2 | 8.46 × 10^−4 | 3.95 × 10^−3 | 9.32 × 10^−2 | 2.17 × 10^−2 | 7.24 × 10^−2 |
| paBEE | 2.78 × 10^−1 | 2.85 × 10^−1 | 1.61 × 10^−2 | 6.83 × 10^−2 | 4.79 × 10^−1 | 1.71 × 10^−1 | 3.83 × 10^−1 |
| mYG1 | 4.75 × 10^−4 | 4.62 × 10^−4 | 9.64 × 10^−8 | 2.61 × 10^−5 | 9.48 × 10^−4 | 1.92 × 10^−4 | 7.54 × 10^−4 |
| mG1Y | 4.74 × 10^−4 | 4.64 × 10^−4 | 7.95 × 10^−8 | 4.65 × 10^−5 | 9.30 × 10^−4 | 2.25 × 10^−4 | 6.98 × 10^−4 |
| mG1G2 | 4.93 × 10^−4 | 4.80 × 10^−4 | 8.50 × 10^−8 | 4.54 × 10^−5 | 9.41 × 10^−4 | 2.49 × 10^−4 | 7.63 × 10^−4 |
| mG2G1 | 5.34 × 10^−4 | 5.61 × 10^−4 | 8.83 × 10^−8 | 4.77 × 10^−5 | 9.68 × 10^−4 | 2.69 × 10^−4 | 7.94 × 10^−4 |
| mG2E | 5.23 × 10^−4 | 5.29 × 10^−4 | 8.13 × 10^−8 | 5.19 × 10^−5 | 9.57 × 10^−4 | 2.84 × 10^−4 | 7.81 × 10^−4 |
| mEG2 | 4.21 × 10^−4 | 3.69 × 10^−4 | 7.78 × 10^−8 | 3.73 × 10^−5 | 9.07 × 10^−4 | 1.85 × 10^−4 | 6.48 × 10^−4 |
| mEA | 4.19 × 10^−4 | 3.60 × 10^−4 | 8.63 × 10^−8 | 3.73 × 10^−5 | 9.66 × 10^−4 | 1.81 × 10^−4 | 6.45 × 10^−4 |
| mAE | 5.33 × 10^−4 | 5.69 × 10^−4 | 7.63 × 10^−8 | 5.82 × 10^−5 | 9.33 × 10^−4 | 2.90 × 10^−4 | 7.57 × 10^−4 |
| mAP | 1.70 × 10^−4 | 1.27 × 10^−4 | 2.26 × 10^−8 | 1.42 × 10^−5 | 5.16 × 10^−4 | 7.40 × 10^−5 | 2.10 × 10^−4 |
| mPA | 1.28 × 10^−4 | 1.02 × 10^−4 | 1.18 × 10^−8 | 8.01 × 10^−6 | 3.37 × 10^−4 | 4.52 × 10^−5 | 1.72 × 10^−4 |
| m1G2EA | 4.96 × 10^−4 | 5.01 × 10^−4 | 8.24 × 10^−8 | 5.60 × 10^−6 | 9.47 × 10^−4 | 2.45 × 10^−4 | 7.53 × 10^−4 |
| m1EAG2 | 4.46 × 10^−4 | 4.00 × 10^−4 | 8.23 × 10^−8 | 5.18 × 10^−5 | 9.49 × 10^−4 | 1.99 × 10^−4 | 6.95 × 10^−4 |
| m1EAP | 4.25 × 10^−4 | 3.97 × 10^−4 | 7.57 × 10^−8 | 2.77 × 10^−5 | 9.07 × 10^−4 | 1.95 × 10^−4 | 6.39 × 10^−4 |
| m1PEA | 4.40 × 10^−4 | 4.02 × 10^−4 | 8.39 × 10^−8 | 4.04 × 10^−5 | 9.31 × 10^−4 | 1.77 × 10^−4 | 6.93 × 10^−4 |
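The columns of Table 3 are standard posterior summaries: Q (0.05) and Q (0.95) bound the 90% credible interval, and Q (0.25) and Q (0.75) bound the 50% one. The sketch below shows how such a row could be computed from a posterior sample using only the Python standard library. It is a simplification: it treats all draws as equally weighted, which is not necessarily how the ABC-RF procedure derives its estimates, and the toy sample is hypothetical and unrelated to the paper's data.

```python
import statistics


def summarise_posterior(sample):
    """Return the summaries listed per parameter in Tables 3 and 4."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(sample, n=20, method="inclusive")
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "variance": statistics.variance(sample),  # sample variance
        "q05": cuts[0],
        "q95": cuts[18],
        "q25": cuts[4],
        "q75": cuts[14],
    }


# Hypothetical, evenly spaced "posterior sample" used only to exercise the code.
toy_sample = list(range(0, 101))
summary = summarise_posterior(toy_sample)

ci90 = (summary["q05"], summary["q95"])  # 90% credible interval
ci50 = (summary["q25"], summary["q75"])  # 50% credible interval
print(summary["median"], ci90, ci50)
```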
Table 4. Estimated parameters for the MD model using the Papuan samples from [12]. The mean and the median estimated values are listed, as well as the 90% and the 50% credible intervals. The parameters cited in the text are reported in bold.
| Parameter | Mean | Median | Variance | Q (0.05) | Q (0.95) | Q (0.25) | Q (0.75) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| nAR | 2803 | 2783 | 4.57 × 10^4 | 2532 | 3302 | 2668 | 2900 |
| nY | 19,182 | 14,771 | 1.62 × 10^8 | 4379 | 44,930 | 8223 | 29,102 |
| nG1 | 26,722 | 28,003 | 2.18 × 10^8 | 2702 | 47,514 | 14,075 | 40,579 |
| nG2 | 25,325 | 27,394 | 1.97 × 10^8 | 2218 | 47,188 | 13,362 | 36,308 |
| nBE | 25,684 | 26,296 | 2.17 × 10^8 | 2194 | 47,896 | 13,706 | 38,919 |
| nE | 12,485 | 5373 | 1.94 × 10^8 | 699 | 42,194 | 1616 | 21,836 |
| nA | 14,543 | 8978 | 2.10 × 10^8 | 916 | 43,930 | 2214 | 26,207 |
| nP | 19,089 | 16,639 | 2.16 × 10^8 | 1048 | 46,319 | 4980 | 30,429 |
| nYG | 22,857 | 21,922 | 2.62 × 10^7 | 17,112 | 31,789 | 19,579 | 25,130 |
| nNNR | 2422 | 2336 | 1.24 × 10^5 | 2057 | 3023 | 2219 | 2531 |
| nDDR | 21,778 | 20,572 | 1.94 × 10^8 | 1640 | 46,291 | 9606 | 32,332 |
| nDN | 16,239 | 11,846 | 1.59 × 10^8 | 2879 | 41,321 | 5311 | 25,523 |
| nADN | 19,279 | 16,531 | 2.21 × 10^8 | 2108 | 47,070 | 4884 | 31,082 |
| nAM | 18,629 | 18,574 | 1.57 × 10^6 | 16,671 | 20,691 | 17,779 | 19,476 |
| rP | 0.0215 | 0.0143 | 6.10 × 10^−4 | 0.0104 | 0.0576 | 0.0118 | 0.0204 |
| rEA | 0.0314 | 0.0179 | 1.94 × 10^−3 | 0.0109 | 0.0869 | 0.0144 | 0.0310 |
| tdYG1 | 98,829 | 99,987 | 7.31 × 10^8 | 54,220 | 140,009 | 76,337 | 122,428 |
| tdYG2 | 97,430 | 96,686 | 6.87 × 10^8 | 54,693 | 138,490 | 76,482 | 120,370 |
| tdOA1 | 74,244 | 68,987 | 5.32 × 10^8 | 46,663 | 119,539 | 54,334 | 89,685 |
| tOAbot1 | 70,341 | 64,285 | 5.47 × 10^8 | 43,471 | 116,608 | 50,992 | 85,938 |
| tdOA2 | 48,554 | 46,257 | 7.36 × 10^7 | 40,559 | 64,865 | 42,739 | 51,453 |
| tOAbot2 | 46,366 | 43,475 | 8.49 × 10^7 | 37,922 | 63,074 | 40,247 | 50,084 |
| tdG2BE | 68,122 | 62,035 | 3.36 × 10^8 | 50,281 | 105,774 | 53,533 | 76,526 |
| tdEA | 37,747 | 35,936 | 5.05 × 10^7 | 30,381 | 50,399 | 32,690 | 40,845 |
| taNG2 | 53,606 | 50,116 | 1.08 × 10^8 | 43,274 | 73,012 | 46,917 | 57,484 |
| taNEA | 42,255 | 40,175 | 7.98 × 10^7 | 33,449 | 56,376 | 37,030 | 45,231 |
| taARP | 61,203 | 54,697 | 4.60 × 10^8 | 37,428 | 106,643 | 43,994 | 73,444 |
| taD1P | 48,493 | 43,651 | 2.90 × 10^8 | 31,343 | 86,579 | 36,450 | 55,023 |
| taD2A | 29,298 | 27,601 | 5.05 × 10^7 | 21,090 | 41,451 | 24,133 | 32,700 |
| taBEE | 23,871 | 23,356 | 9.64 × 10^7 | 10,508 | 40,711 | 15,268 | 30,666 |
| paNG2 | 5.29 × 10^−2 | 5.35 × 10^−2 | 7.32 × 10^−4 | 8.94 × 10^−3 | 9.52 × 10^−2 | 3.18 × 10^−2 | 7.51 × 10^−2 |
| paNEA | 5.12 × 10^−2 | 5.22 × 10^−2 | 7.83 × 10^−4 | 5.58 × 10^−3 | 9.60 × 10^−2 | 2.69 × 10^−2 | 7.44 × 10^−2 |
| paARP | 5.02 × 10^−2 | 5.06 × 10^−2 | 8.74 × 10^−4 | 5.45 × 10^−3 | 9.49 × 10^−2 | 2.36 × 10^−2 | 7.81 × 10^−2 |
| paD1P | 5.23 × 10^−2 | 5.50 × 10^−2 | 8.00 × 10^−4 | 6.13 × 10^−3 | 9.41 × 10^−2 | 2.78 × 10^−2 | 7.66 × 10^−2 |
| paD2A | 4.82 × 10^−2 | 4.52 × 10^−2 | 8.87 × 10^−4 | 4.93 × 10^−3 | 9.58 × 10^−2 | 2.27 × 10^−2 | 7.39 × 10^−2 |
| paBEE | 2.79 × 10^−1 | 2.91 × 10^−1 | 1.65 × 10^−2 | 6.58 × 10^−2 | 4.78 × 10^−1 | 1.68 × 10^−1 | 3.88 × 10^−1 |
| mYG1 | 4.47 × 10^−4 | 4.08 × 10^−4 | 8.52 × 10^−8 | 3.74 × 10^−5 | 9.32 × 10^−4 | 1.89 × 10^−4 | 6.97 × 10^−4 |
| mG1Y | 4.92 × 10^−4 | 4.91 × 10^−4 | 7.55 × 10^−8 | 5.11 × 10^−5 | 9.27 × 10^−4 | 2.79 × 10^−4 | 7.28 × 10^−4 |
| mG1G2 | 4.74 × 10^−4 | 4.59 × 10^−4 | 8.40 × 10^−8 | 4.41 × 10^−5 | 9.35 × 10^−4 | 2.31 × 10^−4 | 7.32 × 10^−4 |
| mG2G1 | 5.20 × 10^−4 | 5.23 × 10^−4 | 9.07 × 10^−8 | 4.77 × 10^−5 | 9.67 × 10^−4 | 2.34 × 10^−4 | 7.93 × 10^−4 |
| mG2E | 5.16 × 10^−4 | 5.29 × 10^−4 | 7.87 × 10^−8 | 5.67 × 10^−5 | 9.55 × 10^−4 | 2.85 × 10^−4 | 7.60 × 10^−4 |
| mEG2 | 3.77 × 10^−4 | 3.04 × 10^−4 | 8.13 × 10^−8 | 2.70 × 10^−5 | 9.11 × 10^−4 | 1.30 × 10^−4 | 5.80 × 10^−4 |
| mEA | 5.07 × 10^−4 | 5.15 × 10^−4 | 8.78 × 10^−8 | 4.74 × 10^−5 | 9.57 × 10^−4 | 2.52 × 10^−4 | 7.68 × 10^−4 |
| mAE | 4.67 × 10^−4 | 4.68 × 10^−4 | 7.94 × 10^−8 | 4.78 × 10^−5 | 9.17 × 10^−4 | 2.29 × 10^−4 | 7.07 × 10^−4 |
| mAP | 5.17 × 10^−4 | 5.12 × 10^−4 | 7.28 × 10^−8 | 1.04 × 10^−4 | 9.35 × 10^−4 | 2.78 × 10^−4 | 7.50 × 10^−4 |
| mPA | 4.05 × 10^−4 | 3.79 × 10^−4 | 5.71 × 10^−8 | 5.15 × 10^−5 | 8.70 × 10^−4 | 2.27 × 10^−4 | 5.41 × 10^−4 |
| m1G2EA | 5.20 × 10^−4 | 5.21 × 10^−4 | 8.85 × 10^−8 | 4.88 × 10^−5 | 9.74 × 10^−4 | 2.74 × 10^−4 | 7.90 × 10^−4 |
| m1EAG2 | 4.56 × 10^−4 | 4.30 × 10^−4 | 7.91 × 10^−8 | 5.77 × 10^−5 | 9.24 × 10^−4 | 2.09 × 10^−4 | 7.16 × 10^−4 |
| m1EAP | 4.92 × 10^−4 | 5.12 × 10^−4 | 7.88 × 10^−8 | 6.32 × 10^−5 | 9.42 × 10^−4 | 2.47 × 10^−4 | 7.11 × 10^−4 |
| m1PEA | 4.78 × 10^−4 | 4.59 × 10^−4 | 7.42 × 10^−8 | 6.17 × 10^−5 | 9.24 × 10^−4 | 2.44 × 10^−4 | 7.02 × 10^−4 |
Table 5. Accuracy of the estimated parameters of the MD model assessed by 1000 pods. The parameters cited in the text are reported in bold.
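| Parameters | R² | Bias | RMSE | Factor 2 | Coverage 90% | Coverage 50% |
| --- | --- | --- | --- | --- | --- | --- |
| nAR | 0.84 | −0.0020 | 5.90 × 10^3 | 0.990 | 0.935 | 0.553 |
| nY | 0.54 | 0.1900 | 1.04 × 10^4 | 0.867 | 0.919 | 0.522 |
| nG1 | 0.08 | 2.0020 | 1.46 × 10^4 | 0.702 | 0.880 | 0.466 |
| nG2 | 0.17 | 0.9175 | 1.36 × 10^4 | 0.698 | 0.915 | 0.497 |
| nBE | 0.02 | 2.2194 | 1.47 × 10^4 | 0.722 | 0.895 | 0.479 |
| nE | 0.33 | 0.4278 | 1.25 × 10^4 | 0.767 | 0.908 | 0.523 |
| nA | 0.28 | 0.4159 | 1.20 × 10^4 | 0.795 | 0.922 | 0.532 |
| nP | 0.39 | 0.3425 | 1.21 × 10^4 | 0.791 | 0.908 | 0.501 |
| nYG | 0.91 | 0.0020 | 3.54 × 10^3 | 0.998 | 0.957 | 0.650 |
| nNNR | 0.92 | 0.0086 | 3.64 × 10^3 | 0.998 | 0.966 | 0.622 |
| nDDR | 0.36 | 0.3529 | 1.18 × 10^4 | 0.800 | 0.923 | 0.522 |
| nDN | 0.54 | 0.1979 | 1.09 × 10^4 | 0.842 | 0.941 | 0.534 |
| nADN | 0.33 | 0.7749 | 1.29 × 10^4 | 0.705 | 0.930 | 0.476 |
| nAM | 0.99 | 0.0067 | 5.40 × 10^2 | 0.997 | 0.995 | 0.870 |
| rP | 0.10 | 0.1110 | 6.79 × 10^−2 | 0.721 | 0.879 | 0.521 |
| rEA | 0.10 | 0.0983 | 5.65 × 10^−2 | 0.748 | 0.915 | 0.547 |
| tdYG1 | 0.25 | 0.0629 | 2.23 × 10^4 | 0.998 | 0.928 | 0.576 |
| tdYG2 | 0.25 | 0.0630 | 2.25 × 10^4 | 0.996 | 0.934 | 0.573 |
| tdOA1 | 0.19 | 0.0025 | 1.99 × 10^4 | 0.998 | 0.911 | 0.540 |
| tOAbot1 | 0.19 | 0.0052 | 1.99 × 10^4 | 0.996 | 0.918 | 0.544 |
| tdOA2 | 0.13 | −0.0257 | 1.24 × 10^4 | 0.998 | 0.883 | 0.511 |
| tOAbot2 | 0.13 | −0.0261 | 1.24 × 10^4 | 0.995 | 0.881 | 0.512 |
| tdG2BE | 0.16 | −0.0016 | 1.98 × 10^4 | 0.999 | 0.913 | 0.523 |
| tdEA | 0.08 | −0.0167 | 9.09 × 10^3 | 0.989 | 0.898 | 0.495 |
| taD2A | 0.04 | 0.0116 | 7.35 × 10^3 | 0.993 | 0.905 | 0.526 |
| paD2A | 0.02 | 0.0010 | 2.88 × 10^−2 | 1.000 | 0.900 | 0.500 |
| taBEE | 0.03 | 0.1286 | 1.04 × 10^4 | 0.914 | 0.904 | 0.486 |
| paBEE | 0.02 | 0.0439 | 1.31 × 10^−1 | 1.000 | 0.893 | 0.497 |
| taD1P | 0.11 | −0.0070 | 1.72 × 10^4 | 0.973 | 0.897 | 0.499 |
| paD1P | 0.02 | −0.0002 | 2.85 × 10^−2 | 1.000 | 0.897 | 0.508 |
| taARP | 0.15 | −0.0002 | 1.85 × 10^4 | 0.988 | 0.916 | 0.517 |
| paARP | 0.03 | −0.0014 | 2.85 × 10^−2 | 1.000 | 0.906 | 0.509 |
| taNEA | 0.10 | −0.0204 | 1.06 × 10^4 | 0.992 | 0.893 | 0.516 |
| paNEA | 0.02 | 0.0000 | 2.81 × 10^−2 | 1.000 | 0.924 | 0.516 |
| taNG2 | 0.15 | −0.0223 | 1.36 × 10^4 | 0.998 | 0.909 | 0.528 |
| paNG2 | 0.02 | −0.0003 | 2.89 × 10^−2 | 1.000 | 0.909 | 0.477 |
| mYG1 | 0.15 | 1.2696 | 2.69 × 10^−4 | 0.709 | 0.927 | 0.521 |
| mG1Y | 0.03 | 1.8171 | 2.86 × 10^−4 | 0.742 | 0.907 | 0.516 |
| mG1G2 | 0.05 | 2.0667 | 2.85 × 10^−4 | 0.737 | 0.895 | 0.519 |
| mG2G1 | 0.05 | 2.9954 | 2.89 × 10^−4 | 0.745 | 0.885 | 0.509 |
| mG2E | 0.03 | 3.0547 | 3.01 × 10^−4 | 0.692 | 0.886 | 0.460 |
| mEG2 | 0.19 | 1.5013 | 2.67 × 10^−4 | 0.722 | 0.908 | 0.503 |
| mEA | 0.12 | 1.4834 | 2.68 × 10^−4 | 0.744 | 0.902 | 0.543 |
| mAE | 0.11 | 1.9813 | 2.74 × 10^−4 | 0.731 | 0.908 | 0.523 |
| mAP | 0.27 | 1.4789 | 2.40 × 10^−4 | 0.766 | 0.910 | 0.548 |
| mPA | 0.37 | 2.2687 | 2.35 × 10^−4 | 0.773 | 0.908 | 0.546 |
| m1G2EA | 0.02 | 2.1201 | 2.90 × 10^−4 | 0.701 | 0.911 | 0.489 |
| m1EAG2 | 0.04 | 2.7879 | 2.92 × 10^−4 | 0.708 | 0.888 | 0.496 |
| m1EAP | 0.06 | 2.5111 | 2.82 × 10^−4 | 0.728 | 0.901 | 0.528 |
| m1PEA | 0.05 | 3.2113 | 2.91 × 10^−4 | 0.694 | 0.911 | 0.477 |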
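The accuracy measures in Table 5 can be illustrated with a short sketch. The definitions used here are assumptions based on common usage in the ABC literature, not formulas confirmed by this section: R² as the coefficient of determination between true and estimated values, bias as the mean relative error, RMSE as the root mean squared error, Factor 2 as the share of estimates lying between half and twice the true value, and coverage as the share of pods whose true value falls inside the credible interval. The function name and the toy pods are hypothetical.

```python
from math import sqrt


def accuracy_metrics(true, est, ci90, ci50):
    """Accuracy of point estimates and intervals over pseudo-observed datasets."""
    n = len(true)
    mean_true = sum(true) / n
    ss_res = sum((t - e) ** 2 for t, e in zip(true, est))
    ss_tot = sum((t - mean_true) ** 2 for t in true)

    def coverage(intervals):
        # Share of pods whose true value lies inside the interval.
        return sum(lo <= t <= hi for t, (lo, hi) in zip(true, intervals)) / n

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "bias": sum((e - t) / t for t, e in zip(true, est)) / n,
        "rmse": sqrt(ss_res / n),
        "factor2": sum(t / 2 <= e <= 2 * t for t, e in zip(true, est)) / n,
        "coverage90": coverage(ci90),
        "coverage50": coverage(ci50),
    }


# Four hypothetical pods: true values, point estimates, and interval bounds.
true_values = [100.0, 200.0, 400.0, 800.0]
estimates = [110.0, 180.0, 900.0, 800.0]
ci90 = [(0.5 * e, 1.5 * e) for e in estimates]
ci50 = [(0.9 * e, 1.1 * e) for e in estimates]

metrics = accuracy_metrics(true_values, estimates, ci90, ci50)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, well-calibrated intervals give coverage values near 0.90 and 0.50, which is the pattern shown for most parameters in Table 5 even where R² is low.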
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vizzari, M.T.; Benazzo, A.; Barbujani, G.; Ghirotto, S. A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach. Genes 2020, 11, 1510. https://0-doi-org.brum.beds.ac.uk/10.3390/genes11121510

AMA Style

Vizzari MT, Benazzo A, Barbujani G, Ghirotto S. A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach. Genes. 2020; 11(12):1510. https://0-doi-org.brum.beds.ac.uk/10.3390/genes11121510

Chicago/Turabian Style

Vizzari, Maria Teresa, Andrea Benazzo, Guido Barbujani, and Silvia Ghirotto. 2020. "A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach" Genes 11, no. 12: 1510. https://0-doi-org.brum.beds.ac.uk/10.3390/genes11121510

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop