
Variance of gametic diversity and its application in selection programs

Open Access. Published: April 10, 2019. DOI: https://doi.org/10.3168/jds.2018-15971

      ABSTRACT

      The variance of gametic diversity ($\sigma^2_{gamete}$) can be used to find individuals that are more likely to produce progeny with extreme breeding values. The aim of this study was to obtain this variance for individuals from routine genomic evaluations, and to apply gametic variance in a selection criterion in conjunction with breeding values to improve genetic progress. An analytical approach was developed to estimate $\sigma^2_{gamete}$ by the sum of binomial variances of all individual quantitative trait loci across the genome. Simulation was used to verify the predictability of this variance in a range of scenarios. The accuracy of prediction ranged from 0.49 to 0.85, depending on the scenario and model used. Compared with sequence data, SNP data are sufficient for estimating $\sigma^2_{gamete}$. Results also suggested that markers with low minor allele frequency and the covariance between markers should be included in the estimation. To incorporate $\sigma^2_{gamete}$ into selective breeding programs, we proposed a new index, relative predicted transmitting ability, which better utilizes the genetic potential of individuals than traditional predicted transmitting ability. Simulation with a small genome showed an additional genetic gain of up to 16% in 10 generations, depending on the number of quantitative trait loci and selection intensity. Finally, we applied $\sigma^2_{gamete}$ to the US genomic evaluations for Holstein and Jersey cattle. As expected, the DGAT1 gene had a strong effect on the estimation of $\sigma^2_{gamete}$ for several production traits. However, inbreeding had a small impact on gametic variability, with greater effect for more polygenic traits. In conclusion, gametic variance, a potentially important parameter for selection programs, can be easily computed and is useful for improving genetic progress and controlling genetic diversity.


      INTRODUCTION

      Since the introduction of marker-assisted selection and genomic selection, technological improvements have resulted in widespread incorporation of molecular information into genetic evaluations (
      • Nejati-Javaremi A.
      • Smith C.
      • Gibson J.P.
      Effect of total allelic relationship on accuracy of evaluation and response to selection.
      ;
      • Meuwissen T.H.E.
      • Hayes B.J.
      • Goddard M.E.
      Prediction of total genetic value using genome-wide dense marker maps.
      ;
      • Schaeffer L.R.
      Strategy for applying genome-wide selection in dairy cattle.
      ). Increased prediction accuracy, along with reduced generation intervals, has made genomic selection an important tool for achieving fast progress in dairy selection programs (
      • García-Ruiz A.
      • Cole J.B.
      • VanRaden P.M.
      • Wiggans G.R.
      • Ruiz-López F.J.
      • Van Tassell C.P.
      Changes in genetic selection differentials and generation intervals in US Holstein dairy cattle as a result of genomic selection.
      ). Despite concerns about inbreeding in selection and mating designs, most selection programs only consider breeding values when making selection decisions. Even with genomic selection models, genomic breeding value or PTA and evaluation of future progeny are mostly based on expected breeding values without consideration of the variability of those values due to Mendelian sampling.
      In addition to breeding value or PTA, other selection strategies have been proposed to increase the rate of genetic progress. One idea was to select animals that will provide greater genetic gains in the future rather than choosing the best animals in the current population.
      • Goiffon M.
      • Kusmec A.
      • Wang L.
      • Hu G.
      • Schnable P.S.
      Improving response in genomic selection with a population-based selection strategy: Optimal population value selection.
      showed improved genetic gains when selecting for the best gametes from a subset of individuals in a population.
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      discussed the potential use of the variation within groups of offspring, which allows the assignment of probabilities to obtain progeny with a breeding value over a given threshold, as well as the number of matings required. In a follow-up study,
      • Bonk S.
      • Reichelt M.
      • Teuscher F.
      • Segelke D.
      • Reinsch N.
      Mendelian sampling covariability of marker effects and genetic values.
      showed how exact within-family genetic variation can be calculated using data from phased genotypes. Recently,
      • Müller D.
      • Schopp P.
      • Melchinger A.E.
      Selection on expected maximum haploid breeding values can increase genetic gain in recurrent genomic selection.
      proposed a new selection criterion based on the expected maximum haploid breeding value. Collectively, these studies suggest that the incorporation of variation of future gametic values into mating decisions can improve genetic progress on top of the selection on breeding values.
      However, a few questions need to be answered before the application of gametic variance to breeding programs, such as how to assess the variation of future gametic values of an individual, how large is the gametic variance, how to use this information for selection, and how to estimate the variance of gametic diversity and use it in existing genomic evaluations. In this study, we aimed to address these questions from a statistical point of view, demonstrating the equivalence between gametic variance and Mendelian sampling variance in the classical BLUP (pedigree) model. We also sought to explore how this variance can be used as a selection criterion in conjunction with breeding values, with the goal of maximizing future genetic gains. We propose an approach for estimating this variance from routine genomic evaluations, verifying the adequacy of the estimates for individuals with and without progeny, and estimating the variance of breeding values of future progeny for a given mating. Finally, we evaluate the application of gametic variance to improve the selection of dairy traits in the US Holstein and Jersey populations.

      MATERIALS AND METHODS

      Estimation of the Variance of Gametic Diversity

      We refer to the variance of gametic diversity as $\sigma^2_{gamete}$, which is equivalent to half of the Mendelian sampling variance (Appendix A1). $\sigma^2_{gamete}$ measures the deviation of progeny breeding values from parent average and can be calculated using the probabilities of transmission of alleles at all QTL from an individual to its gametes. Gametic variance represents the variability of all possible gametic values generated by the permutation and recombination of each parental chromosome. In fact, only the heterozygous loci of an individual contribute to $\sigma^2_{gamete}$, so we only consider heterozygous loci in the following text.
      We first consider a single locus. For a biallelic locus j of an individual i with allele substitution effect $\alpha_j$, $\sigma^2_{gamete}$ can be calculated from the binomial variance $\sigma^2_{[j]} = np(1-p)\alpha_j^2$, where the probability of transmission of a reference allele to a gamete is p = 0.5 and the number of alleles transmitted to a gamete is n = 1, so that $\sigma^2_{[j]} = 0.25\alpha_j^2$. When 2 loci, j and k, are considered for an individual i, the resulting variance can be obtained as
      \[
      \sigma^2_{[j+k]} = \sigma^2_{[j]} + \sigma^2_{[k]} + 2\sigma_{jk}
      \quad \text{and} \quad
      \sigma_{jk} = (p_{jk} - p_j p_k)\,\alpha_j \alpha_k,
      \tag{1}
      \]


      where $p_j = p_k = 0.5$, and $p_{jk}$ is the probability that the 2 reference alleles of the 2 loci are transmitted together; $p_{jk}$ can be obtained from the linkage phase and recombination rate between the 2 loci. For example, $p_{jk} = 0.25$ and $\sigma_{jk} = 0$ when the loci are in linkage equilibrium; $p_{jk} = 0.5$ and $\sigma_{jk} = 0.25\alpha_j\alpha_k$ when the 2 reference alleles are on the same chromosome and the loci are in complete linkage.
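      Equation [1] can be checked numerically by enumerating the four gamete types of a double heterozygote. The sketch below is illustrative only: the function name, the allele effects, and the assumption that $p_{jk} = (1 - r)/2$ in coupling phase and $r/2$ in repulsion phase (with r the recombination rate) are ours, not taken from the original text.

```python
def two_locus_gametic_variance(alpha_j, alpha_k, recomb_rate, coupling=True):
    """Return (variance by enumeration, variance from Equation [1])
    for an individual heterozygous at loci j and k."""
    # Assumed probability that both reference alleles are transmitted together.
    p_jk = (1 - recomb_rate) / 2 if coupling else recomb_rate / 2
    # Joint probabilities of (x_j, x_k); x = 1 if the reference allele is transmitted.
    probs = {
        (1, 1): p_jk,
        (0, 0): p_jk,
        (1, 0): 0.5 - p_jk,
        (0, 1): 0.5 - p_jk,
    }
    values = {g: alpha_j * g[0] + alpha_k * g[1] for g in probs}
    mean = sum(probs[g] * values[g] for g in probs)
    var_enum = sum(probs[g] * (values[g] - mean) ** 2 for g in probs)
    # Equation [1] with p_j = p_k = 0.5.
    sigma_jk = (p_jk - 0.25) * alpha_j * alpha_k
    var_formula = 0.25 * alpha_j ** 2 + 0.25 * alpha_k ** 2 + 2 * sigma_jk
    return var_enum, var_formula


# Hypothetical allele substitution effects.
for r, coupling in [(0.5, True), (0.0, True), (0.0, False), (0.1, True)]:
    enum, formula = two_locus_gametic_variance(1.0, 0.5, r, coupling)
    print(f"r={r}, coupling={coupling}: enumeration={enum:.4f}, formula={formula:.4f}")
```

      With complete linkage, the coupling phase gives the largest variance and the repulsion phase the smallest, because the two effects either reinforce or cancel each other within a gamete.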
      Extending this calculation from 2 loci to all QTL on the genome, the $\sigma^2_{gamete}$ of individual i can be obtained as the sum across all N heterozygous QTL:
      \[
      \sigma^2_{gamete} = \sum_{j=1}^{N} \sigma^2_{[j]} + 2 \sum_{j=1}^{N} \sum_{k=j+1}^{N} \sigma_{jk}.
      \]


      This can be represented in matrix format as follows:
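      \[
      \sigma^2_{gamete} =
      \begin{bmatrix} \alpha_1 & \cdots & \alpha_N \end{bmatrix}
      \mathbf{M}
      \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix},
      \tag{2}
      \]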


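      where $\alpha_j$ (j = 1,…,N) are the allele substitution effects, and M is the (co)variance matrix of the Mendelian transmission probabilities for the N heterozygous loci:
      \[
      \mathbf{M} =
      \begin{bmatrix}
      0.25 & \cdots & al_{1,N}\left(-\dfrac{cM_{1,N}}{200} + 0.25\right) \\
      \vdots & \ddots & \vdots \\
      al_{N,1}\left(-\dfrac{cM_{N,1}}{200} + 0.25\right) & \cdots & 0.25
      \end{bmatrix},
      \]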


      where $al_{jk}$ is a phase indicator for loci j and k, with value 1 when both loci have the reference allele on the same chromosome and −1 otherwise; $cM_{jk}$ is the genetic distance between the 2 loci (in centimorgans). Any 2 loci with genetic distance >50 cM on the same chromosome, or on different chromosomes, are assumed to be independent and thus have zero values for the corresponding elements of M. When all the loci are independent,
      \[
      \mathbf{M} =
      \begin{bmatrix}
      0.25 & 0 & \cdots & 0 \\
      0 & 0.25 & \cdots & 0 \\
      \vdots & \vdots & \ddots & \vdots \\
      0 & 0 & \cdots & 0.25
      \end{bmatrix}.
      \]


      Instead of using genetic distances, M can be set up when direct recombination rates are available.
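      The matrix form in Equation [2] can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the encoding of phase as +1 or −1 per locus, and the example effects and map positions are all hypothetical.

```python
def transmission_matrix(positions_cm, chromosomes, phase):
    """Build M for N heterozygous loci.

    positions_cm: map position of each locus (cM)
    chromosomes:  chromosome label of each locus
    phase:        +1 if the reference allele is on the first haplotype, -1 otherwise
    """
    n = len(positions_cm)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        m[j][j] = 0.25  # binomial variance of a single heterozygous locus
        for k in range(j + 1, n):
            if chromosomes[j] != chromosomes[k]:
                continue  # different chromosomes: independent
            dist = abs(positions_cm[j] - positions_cm[k])
            if dist > 50:
                continue  # more than 50 cM apart: treated as independent
            al = 1 if phase[j] == phase[k] else -1
            m[j][k] = m[k][j] = al * (0.25 - dist / 200)
    return m


def gametic_variance(alpha, m):
    """Quadratic form alpha' M alpha (Equation [2])."""
    n = len(alpha)
    return sum(alpha[j] * m[j][k] * alpha[k] for j in range(n) for k in range(n))


# Three hypothetical heterozygous loci: two linked on chromosome 1, one on chromosome 2.
alpha = [1.0, 0.5, -0.8]
m_linked = transmission_matrix([10.0, 20.0, 5.0], [1, 1, 2], [1, 1, 1])
m_indep = transmission_matrix([10.0, 20.0, 5.0], [1, 2, 3], [1, 1, 1])
print(round(gametic_variance(alpha, m_linked), 4))
print(round(gametic_variance(alpha, m_indep), 4))
```

      Comparing the two calls shows how much of the variance comes from the covariance term between the two linked loci rather than from the diagonal of M.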
      To estimate gametic variance in real data where genomic evaluation is available, we propose using the estimated SNP effects in place of the true QTL effects in Equation [2]. This approximation of QTL with SNP marker effects is similar to that described by
      • Bonk S.
      • Reichelt M.
      • Teuscher F.
      • Segelke D.
      • Reinsch N.
      Mendelian sampling covariability of marker effects and genetic values.
      . Note that using estimated SNP effects in Equation [2] may bias the estimation because of the covariance between estimated effects of SNP in linkage disequilibrium (LD) and potential biases from shrunken estimators of SNP effects, which warrants further investigation.

      Application of Gametic Variance in Selection Programs

      A new selection strategy using σgamete2 can be proposed, focusing on the future genetic progress (
      • Bijma P.
      • Wientjes Y.C.J.
      • Calus M.P.L.
      Increasing genetic gain by selecting for higher Mendelian sampling variance.
      ). When a small proportion of animals are selected for breeding, $\sigma^2_{gamete}$ can help identify those that are most likely to produce progeny with extreme breeding values. Assuming selection intensity is maintained across generations, the average genetic value of the animals selected in the future will be related to the variance of gametes of the selected animals in the current generation. The average breeding value transmitted to future progeny can be calculated by summing the expected value and i times the standard deviation of gametic diversity ($i\sigma_{gamete}$). The selection intensity (i) represents the number of standard deviations between the population average and the average of selected individuals. The same intensity can be applied when using PTA as the expected value and $\sigma_{gamete}$ as the standard deviation, to obtain the mean breeding value transmitted to the selected individuals in the next generation. Similar approaches have been proposed by
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      via a usefulness criterion (UC) with genomic EBV (GEBV) and the standard deviation of a given mating.
      Here, we propose a new selection criterion relative to the intensity of selection applied in the next generation ($i_f$) for an individual i (with mates unknown),
      \[
      RPTA_i = PTA_i + \sigma_{gamete\_i} \times i_f,
      \tag{3}
      \]


      where RPTAi (relative PTA) is the average of the genetic values relative to the group of progeny that will be selected in the future (see Appendix A2). In addition, we introduce a new concept of coefficient of relative variation (CRV) as a measure of the variability of the additive genetic values (u) transmitted from an individual to its progeny (Appendix A3). The CRV of an individual i is defined as follows (where E indicates expected value):
      \[
      CRV_i = \frac{\sigma_{gamete}}{0.5\,E(u_i^2)}.
      \tag{4}
      \]
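      As a small worked example of Equation [3], the sketch below ranks two candidates by PTA and by RPTA. The candidate names, PTA values, gametic variances, and the choice of $i_f$ = 0.8 are hypothetical and serve only to show that the two criteria can rank animals differently.

```python
import math


def rpta(pta, gametic_variance, future_intensity):
    """Relative PTA (Equation [3]): PTA plus i_f times the gametic standard deviation."""
    return pta + math.sqrt(gametic_variance) * future_intensity


# Hypothetical candidates: name -> (PTA, variance of gametic diversity).
candidates = {"A": (1.00, 0.04), "B": (0.95, 0.25)}
i_f = 0.8

ranking_by_pta = sorted(candidates, key=lambda name: candidates[name][0], reverse=True)
ranking_by_rpta = sorted(
    candidates, key=lambda name: rpta(*candidates[name], i_f), reverse=True
)

for name, (pta, var) in candidates.items():
    print(name, "PTA =", pta, "RPTA =", round(rpta(pta, var, i_f), 3))
print("Ranking by PTA: ", ranking_by_pta)
print("Ranking by RPTA:", ranking_by_rpta)
```

      Candidate B has the lower PTA but the larger gametic standard deviation, so it moves ahead of candidate A once the expected superiority of its best gametes is taken into account.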


      Simulation

      To verify the estimation of σgamete2 by genomic models and the use of this new parameter to aid selection, we simulated different scenarios with various QTL, genotype, and phenotype data using the QMSim version 1.10 software (
      • Sargolzaei M.
      • Schenkel F.
      QMSim: A large-scale genome simulator for livestock.
      ). In brief, we simulated a historical population, a 10-generation recent population, and a 10-generation future population (Table 1).
      Table 1. Summary of simulation parameters

| Parameter | Value |
|---|---|
| Genome parameters | |
| Genome size | 200 cM |
| Number of chromosomes | 4 |
| Number of QTL | 20 and 200 |
| Number of markers | 10,000 (high-density panel) and 20,000 + QTL (sequence data) |
| Mutation rate, QTL | 2.5 × 10⁻⁵ |
| Mutation rate, marker | 2.5 × 10⁻³ |
| Marker positions in genome | Evenly spaced |
| QTL position in genome | Random (uniform distribution) |
| QTL allele effect | Gamma distribution (β = 0.4) |
| Trait parameters | |
| Number of traits | 6 |
| Heritability | 0.10, 0.30, 0.50 |
| Phenotypic variance | 1 |
| Sex-limited trait | No |
| Population structure parameters | |
| Historical generation, phase 1 | |
| Number of generations | 500 |
| Number of animals | Constant (500 males and 500 females) |
| Mating | Random |
| Historical generation, phase 2 | |
| Number of generations | 500 |
| Initial number of animals | 1,000 |
| Final number of animals | 200 (100 males and 100 females) |
| Mating | Random |
| Historical generation, phase 3 | |
| Number of generations | 10 |
| Initial number | 200 (100 males and 100 females) |
| Final number | 3,000 (1,500 males and 1,500 females) |
| Mating | Random |
| Recent generation | |
| Number of generations | 10 |
| Reference population | 9th |
| Validation population | 9th and 10th |
| Number of offspring per dam | 5 |
| Founders | 1,000 (200 males, 800 females) |
| Mating | Random |
| Selection | BLUP |
| Culling | BLUP |
| Replacement | 20% females and 60% males |
| Overlapping generation | Yes |
| Generation 9–10 (predictability) | Correlation($\sigma^2_{gamete}$, $\sigma^2_{gamete\_estimated}$) |
| Future generation | |
| Number of generations | 10 |
| Criterion of selection¹ | T_PTA = TRUE/2 or T_RPTA = (TRUE/2) + $\sigma^2_{gamete}$ |
| Number of offspring per dam | 5 or 10 |
| Replacement | 100% females and 100% males |
| Better criterion | Genetic gain per generation |

      ¹ T_PTA = true PTA; T_RPTA = true relative PTA; $\sigma^2_{gamete}$ = variance of gametic diversity.
      To mimic real populations, a historical population was simulated with the same proportion of males and females that were mated randomly. This population was generated in 3 phases: the first phase consisted of 500 generations with a constant population size of 1,000 individuals; the second phase had 500 generations with a constant reduction of population size from 1,000 to 200 to generate LD and establish drift-mutation balance; and the third phase included 10 generations of expansion, where the population size increased from 200 to 3,000. From the last generation of this historical population, 200 males and 800 females were randomly selected as founders to generate the study population, which consisted of 10 generations with 5 progeny per dam and a ratio of 50% males in the offspring. We simulated a selection for breeding values estimated by the classical BLUP (
      • Henderson C.R.
      Best linear unbiased estimation and prediction under a selection model.
      ). The replacement ratio was set at 20% for dams and 60% for sires (
      • Brito F.V.
      • Braccini Neto J.
      • Sargolzaei M.
      • Cobuci J.A.
      • Schenkel F.S.
      Accuracy of genomic selection in simulated populations mimicking the extent of linkage disequilibrium in beef cattle.
      ), and mating was random among selected individuals. The replacement ratio is the proportion of animals to be culled and replaced in each generation.
      From the study population (last 10 generations of the simulation), genotype and QTL data were obtained for the 9th generation (treated as a reference population) and the 10th generation (the validation population). The marker effects were first estimated in the reference generation. The σgamete2 values for all individuals were estimated for both the reference and validation populations using the marker effects estimated in the reference generation. For comparison, true gametic variance was also calculated using the QTL effects and their genotype data from the simulation.
      To reduce computational load, a small genome, with 4 autosomal chromosomes of 50 cM each, was simulated. The mutation rate was fixed at 2.5 × 10−5 in the historical population. The number of crossovers was sampled from a Poisson distribution. A total of 200,000 markers and different sets of QTL were simulated to be randomly distributed along the genome. After the genome was simulated, a panel with 10% of the polymorphic markers was sampled every 0.5 cM and another panel with 20% of the markers was sampled every 0.5 cM. The first panel was chosen to mimic a high-density SNP panel and the second for sequence data. A detailed description of the parameters is reported in Table 1.
      Six traits were simulated with heritabilities of 0.1, 0.3, and 0.5 and 20 QTL (i.e., 0.1 QTL per cM) or 200 QTL (i.e., 1 QTL per cM), respectively. We used 2 QTL densities similar to those used by
      • Meuwissen T.H.E.
      • Hayes B.J.
      • Goddard M.E.
      Prediction of total genetic value using genome-wide dense marker maps.
      . The QTL effects were generated based on a gamma distribution with parameter β = 0.4 (
      • Hayes B.
      • Goddard M.E.
      The distribution of effects of genes affecting quantitative traits in livestock.
      ). The phenotypic variance was assumed to be 1 for all traits. Four replicates were used for each trait. In addition, 10 future generations were simulated where the individuals were selected either by the true breeding value (T_PTA) or by true RPTA (T_RPTA) to verify and compare the genetic gains obtained using these criteria. To assess the effect of these indices on selection in the future generations, the replacement ratio was maintained at 100% and the number of offspring per dam was 5 (corresponding to selection intensities of 0.966 for females and 1.76 for males) or 10 (corresponding to selection intensities of 1.40 for females and 2.06 for males). Because the predicted $\sigma^2_{gamete}$ is a latent variance, its realized value depends on the number of progeny of an individual. Any inference using this variance should therefore be regarded as a bet (the probability of an event given the number of attempts). Accordingly, the selection intensity applied to RPTA ($i_f$) may need to be adjusted, and 3 values of $i_f$ (0.5, 0.8, and 1) were tested in this study.
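      The selection intensities quoted above can be approximated from the selected proportions under truncation selection on a normally distributed criterion. The sketch below is an assumption-laden illustration: it supposes that 800 dams and 200 sires are retained each generation (as for the founders), an equal sex ratio among offspring, and an infinite-population approximation; the function name is ours.

```python
from statistics import NormalDist


def selection_intensity(selected_fraction):
    """Mean of the selected group, in standard deviation units, when the top
    `selected_fraction` of a standard normal population is retained."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - selected_fraction)
    return std_normal.pdf(threshold) / selected_fraction


dams, sires = 800, 200  # assumed numbers of parents retained per generation
for offspring_per_dam in (5, 10):
    per_sex = dams * offspring_per_dam / 2  # assumed 50% males among offspring
    i_females = selection_intensity(dams / per_sex)
    i_males = selection_intensity(sires / per_sex)
    print(offspring_per_dam, round(i_females, 3), round(i_males, 3))
```

      Under these assumptions the selected proportions are 40% of females and 10% of males with 5 offspring per dam, and 20% and 5% with 10 offspring per dam.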

      Genomic Analysis

      Because σgamete2 depends on the marker effects in genomic models, we used a model that assumed homogeneity of variance of marker effects, GBLUP (SNP-BLUP), and another model that allowed heterogeneity of marker effects with differential shrinkage through the improved Bayesian LASSO (BLASSO;
      • Legarra A.
      • Robert-Granié C.
      • Croiseau P.
      • Guillaume F.
      • Fritz S.
      Improved Lasso for genomic selection.
      ). The analyses were performed using the GS3 v.3 software (
      • Legarra A.
      • Ricard A.
      • Filangio O.
      GS3 Genomic Selection — Gibbs Sampling — Gauss Seidel (and BayesCπ).
      ). The model included the population mean, marker effects, and residual. Only markers with minor allele frequency (MAF) >0.05 were considered. For estimation of additive and residual variances, the simulated true values were used as initial values to reduce the computational load, followed by 20,000 iterations with a burn-in of 2,000 iterations.

      Application of Gametic Variance to Real Data

      The data used were part of the 2017 US genomic evaluations from the Council on Dairy Cattle Breeding (CDCB, Bowie, MD), consisting of 1,364,278 Holstein and 164,278 Jersey cattle from the national dairy cattle database. Five dairy traits based on up to 5 lactations were analyzed: milk (MY), fat (FY), and protein (PY) yields, and fat (F%) and protein (P%) percentages. The genotype data were generated from different SNP arrays with the number of SNP ranging from 7K to 50K. All individuals were imputed to a common panel of 60,671 SNP, and their linkage phase was determined by FindHap version 3 (
      • VanRaden P.M.
      • O'Connell J.R.
      • Wiggans G.R.
      • Weigel K.A.
      Genomic evaluations with many more genotypes.
      ). The $\sigma^2_{gamete}$ was calculated using Equation [2] with estimated SNP effects ($\hat{\alpha}$). The marker effects were derived from the PTA obtained from the genomic evaluation. Sex-specific recombination rates between SNP markers in Holstein and Jersey cattle were directly used in this study (
      • Ma L.
      • O'Connell J.R.
      • VanRaden P.M.
      • Shen B.
      • Padhi A.
      • Sun C.
      • Bickhart D.M.
      • Cole J.B.
      • Null D.J.
      • Liu G.E.
      • Da Y.
      • Wiggans G.R.
      Cattle sex-specific recombination and genetic control from a large pedigree analysis.
      ;
      • Shen B.
      • Jiang J.
      • Seroussi E.
      • Liu G.E.
      • Ma L.
      Characterization of recombination features and the genetic basis in multiple cattle breeds.
      ). Thus, the off-diagonal elements of the M matrix in Equation [2] were modified to incorporate the recombination rate:
      \[
      M_{jk} = al_{jk}\left(-\frac{rate_{jk}}{2} + 0.25\right),
      \]
      when the recombination rate is <0.5; and $M_{jk} = 0$ when the rate is ≥0.5.
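      The effect of sex-specific recombination rates on a single pair of loci can be illustrated as follows. This is a sketch only: the function names, the allele effects, and the male and female rates for the same marker interval are hypothetical values chosen for illustration.

```python
def m_offdiag_from_rate(al, rate):
    """Off-diagonal element of M for a locus pair, given the phase indicator
    al (+1 or -1) and the recombination rate between the two loci."""
    if rate >= 0.5:
        return 0.0  # unlinked loci are treated as independent
    return al * (0.25 - rate / 2)


def pair_variance(alpha_j, alpha_k, al, rate):
    """Gametic variance contributed by two heterozygous loci."""
    cov = m_offdiag_from_rate(al, rate)
    return 0.25 * alpha_j ** 2 + 0.25 * alpha_k ** 2 + 2 * cov * alpha_j * alpha_k


# Hypothetical sex-specific rates for the same interval, loci in coupling phase.
male_rate, female_rate = 0.10, 0.12
var_sire = pair_variance(1.0, 0.5, 1, male_rate)
var_dam = pair_variance(1.0, 0.5, 1, female_rate)
print(round(var_sire, 4), round(var_dam, 4))
```

      When the rate is taken as the map distance in centimorgans divided by 100, this element equals the distance-based form $al_{jk}(-cM_{jk}/200 + 0.25)$.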

      RESULTS AND DISCUSSION

      Estimation of Gametic Variance with Genomic Models

      The variance of progeny breeding values has been investigated in previous studies (
      • Cole J.B.
      • VanRaden P.M.
      Use of haplotypes to estimate Mendelian sampling effects and selection limits.
      ;
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      ;
      • Bonk S.
      • Reichelt M.
      • Teuscher F.
      • Segelke D.
      • Reinsch N.
      Mendelian sampling covariability of marker effects and genetic values.
      ). Here, we sought to use simulation to evaluate the predictability of gametic variance as a parameter for selection. Predictability was evaluated in the same way as in classical simulation studies of genomic prediction. The variance of gametic diversity ($\sigma^2_{gamete}$) was calculated considering both dependence and independence between loci, using all QTL and QTL with MAF ≥5%, and utilizing high-density SNP and sequence data with marker effects obtained from genomic models. The Pearson correlation between the true and estimated $\sigma^2_{gamete}$ ranged from medium to high (Table 2), similar to the correlations reported in studies on breeding values (
      • Meuwissen T.H.E.
      • Hayes B.J.
      • Goddard M.E.
      Prediction of total genetic value using genome-wide dense marker maps.
      ;
      • Daetwyler H.D.
      • Pong-Wong R.
      • Villanueva B.
      • Woolliams J.A.
      The impact of genetic architecture on genome-wide evaluation methods.
      ;
      • Clark S.A.
      • Hickey J.M.
      • Van der Werf J.H.J.
      Different models of genetic variation and their effect on genomic evaluation.
      ). In general, the correlation increased when the heritability ($h^2$) of traits increased, whereas the same relation was not apparent when the number of QTL was large. In contrast, for GEBV prediction, an increase in accuracy has been reported with increased $h^2$ and for scenarios with a small number of QTL, particularly when these were estimated by differential shrinkage models (
      • Daetwyler H.D.
      • Pong-Wong R.
      • Villanueva B.
      • Woolliams J.A.
      The impact of genetic architecture on genome-wide evaluation methods.
      ;
      • Clark S.A.
      • Hickey J.M.
      • Van der Werf J.H.J.
      Different models of genetic variation and their effect on genomic evaluation.
      ).
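      Table 2. Pearson correlations between the variance of gametic diversity for all QTL ($\sigma^2_{g}$), for QTL with minor allele frequency (MAF) ≥0.05 ($\sigma^2_{gm}$), disregarding the covariances for all QTL ($\sigma^2_{d}$), and disregarding the covariances for QTL with MAF ≥0.05 ($\sigma^2_{dm}$), and their estimates using a high-density marker panel (HD) and sequence data (Seq) by genomic BLUP (bp) and Bayesian LASSO (ls), considering ($\sigma^2_{gbp}$ and $\sigma^2_{gls}$) and disregarding ($\sigma^2_{dbp}$ and $\sigma^2_{dls}$) the dependency between the markers¹

| $h^2$ | QTL (no.) | Gametic variance | HD $\sigma^2_{gbp}$ | HD $\sigma^2_{gls}$ | HD $\sigma^2_{dbp}$ | HD $\sigma^2_{dls}$ | Seq $\sigma^2_{gbp}$ | Seq $\sigma^2_{gls}$ | Seq $\sigma^2_{dbp}$ | Seq $\sigma^2_{dls}$ | QTL $\sigma^2_{g}$ | QTL $\sigma^2_{gm}$ | QTL $\sigma^2_{d}$ | QTL $\sigma^2_{dm}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 20 | $\sigma^2_{g}$ | 0.49 | 0.56 | 0.17 | 0.39 | 0.46 | 0.57 | 0.20 | 0.40 | | 0.75 | 0.96 | 0.69 |
| | | $\sigma^2_{gm}$ | 0.53 | 0.74 | 0.21 | 0.54 | 0.48 | 0.75 | 0.25 | 0.55 | 0.75 | | 0.66 | 0.93 |
| | | $\sigma^2_{d}$ | 0.45 | 0.53 | 0.15 | 0.43 | 0.43 | 0.53 | 0.19 | 0.43 | 0.96 | 0.66 | | 0.71 |
| | | $\sigma^2_{dm}$ | 0.50 | 0.74 | 0.18 | 0.61 | 0.45 | 0.73 | 0.24 | 0.61 | 0.69 | 0.93 | 0.71 | |
| | 200 | $\sigma^2_{g}$ | 0.50 | 0.60 | 0.29 | 0.37 | 0.46 | 0.61 | 0.29 | 0.40 | | 0.96 | 0.50 | 0.48 |
| | | $\sigma^2_{gm}$ | 0.48 | 0.61 | 0.29 | 0.39 | 0.45 | 0.63 | 0.30 | 0.41 | 0.96 | | 0.46 | 0.49 |
| | | $\sigma^2_{d}$ | 0.29 | 0.28 | 0.51 | 0.30 | 0.28 | 0.27 | 0.48 | 0.31 | 0.50 | 0.46 | | 0.97 |
| | | $\sigma^2_{dm}$ | 0.27 | 0.29 | 0.52 | 0.32 | 0.26 | 0.29 | 0.49 | 0.33 | 0.48 | 0.49 | 0.97 | |
| 0.3 | 20 | $\sigma^2_{g}$ | 0.64 | 0.83 | 0.28 | 0.66 | 0.59 | 0.83 | 0.07 | 0.65 | | 0.94 | 0.95 | 0.90 |
| | | $\sigma^2_{gm}$ | 0.65 | 0.87 | 0.28 | 0.68 | 0.59 | 0.87 | 0.07 | 0.68 | 0.94 | | 0.90 | 0.95 |
| | | $\sigma^2_{d}$ | 0.60 | 0.81 | 0.30 | 0.69 | 0.54 | 0.81 | 0.07 | 0.68 | 0.95 | 0.90 | | 0.95 |
| | | $\sigma^2_{dm}$ | 0.60 | 0.85 | 0.30 | 0.71 | 0.55 | 0.85 | 0.07 | 0.70 | 0.90 | 0.95 | 0.95 | |
| | 200 | $\sigma^2_{g}$ | 0.63 | 0.77 | 0.25 | 0.49 | 0.59 | 0.77 | 0.29 | 0.48 | | 0.95 | 0.55 | 0.52 |
| | | $\sigma^2_{gm}$ | 0.62 | 0.78 | 0.25 | 0.51 | 0.57 | 0.78 | 0.29 | 0.49 | 0.95 | | 0.53 | 0.53 |
| | | $\sigma^2_{d}$ | 0.42 | 0.48 | 0.52 | 0.63 | 0.40 | 0.49 | 0.54 | 0.62 | 0.55 | 0.53 | | 0.99 |
| | | $\sigma^2_{dm}$ | 0.41 | 0.48 | 0.52 | 0.63 | 0.39 | 0.48 | 0.54 | 0.63 | 0.52 | 0.53 | 0.99 | |
| 0.5 | 20 | $\sigma^2_{g}$ | 0.54 | 0.67 | 0.28 | 0.50 | 0.48 | 0.66 | 0.18 | 0.49 | | 0.86 | 0.94 | 0.81 |
| | | $\sigma^2_{gm}$ | 0.51 | 0.67 | 0.26 | 0.47 | 0.44 | 0.65 | 0.16 | 0.46 | 0.86 | | 0.79 | 0.93 |
| | | $\sigma^2_{d}$ | 0.52 | 0.64 | 0.30 | 0.53 | 0.46 | 0.63 | 0.19 | 0.51 | 0.94 | 0.79 | | 0.85 |
| | | $\sigma^2_{dm}$ | 0.49 | 0.64 | 0.28 | 0.51 | 0.43 | 0.63 | 0.18 | 0.49 | 0.81 | 0.93 | 0.85 | |
| | 200 | $\sigma^2_{g}$ | 0.79 | 0.85 | 0.37 | 0.51 | 0.76 | 0.84 | 0.29 | 0.51 | | 0.95 | 0.65 | 0.62 |
| | | $\sigma^2_{gm}$ | 0.77 | 0.86 | 0.37 | 0.55 | 0.74 | 0.86 | 0.30 | 0.55 | 0.95 | | 0.64 | 0.65 |
| | | $\sigma^2_{d}$ | 0.53 | 0.61 | 0.49 | 0.83 | 0.52 | 0.61 | 0.38 | 0.83 | 0.65 | 0.64 | | 0.98 |
| | | $\sigma^2_{dm}$ | 0.51 | 0.61 | 0.49 | 0.85 | 0.50 | 0.61 | 0.37 | 0.85 | 0.62 | 0.65 | 0.98 | |

      ¹ Values in bold represent the best estimates.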
      We observed higher correlations between the true and predicted σgamete2 using BLASSO compared with GBLUP in all scenarios (Table 2). These results were partly due to the small genome and large QTL effects simulated. Although GBLUP can have a similar or slightly better performance for prediction of GEBV than differential shrinkage models for scenarios with a large number of QTL (
      • Daetwyler H.D.
      • Pong-Wong R.
      • Villanueva B.
      • Woolliams J.A.
      The impact of genetic architecture on genome-wide evaluation methods.
      ), the accuracy of the estimated marker effects, mainly for QTL regions, is greater from differential shrinkage models (
      • Meuwissen T.H.E.
      • Hayes B.J.
      • Goddard M.E.
      Prediction of total genetic value using genome-wide dense marker maps.
      ;
      • Shepherd R.K.
      • Meuwissen T.H.
      • Woolliams J.A.
      Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers.
      ;
      • Legarra A.
      • Robert-Granié C.
      • Croiseau P.
      • Guillaume F.
      • Fritz S.
      Improved Lasso for genomic selection.
      ). For estimating $\sigma^2_{gamete}$, the marker effects have a greater impact than for GEBV prediction because $\sigma^2_{gamete}$ uses the squared marker effects as well as the dependency of the chromosome segments. Therefore, this observation can also be attributed to the greater accuracy of the marker effects estimated by BLASSO and to the high dependency of the chromosome segments simulated.
      The effect on prediction was inferred by a linear regression between true and estimated $\sigma^2_{gamete}$. For the intercept of regression (a), GBLUP had a lower scale effect (close to zero) than BLASSO, but the difference was not large (Table 3). A low scale effect is important for $\sigma^2_{gamete}$ prediction because it affects the precision of the limit values of the confidence interval for future progeny PTA. The scale effect may be affected by the prediction models and by factors inherent to the trait. However, GBLUP had a larger prediction bias, worse values of mean squared error, and regression coefficients (b) further from 1 (Table 3). For genomic prediction, lower bias has also been reported for differential shrinkage models (
      • Meuwissen T.H.E.
      • Hayes B.J.
      • Goddard M.E.
      Prediction of total genetic value using genome-wide dense marker maps.
      ). Our result can be attributed to the accuracy of the estimated marker effects and to the small number of independent chromosome segments simulated.
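      The regression summary used here (mean squared error, intercept a, and slope b) can be sketched as follows. The direction of the regression (true values regressed on estimates) and the definition of MSE as the mean squared difference between true and estimated values are assumptions on our part, and the data are made up for illustration.

```python
def regression_summary(true_values, estimates):
    """Return (MSE, intercept a, slope b) for the regression of true values
    on estimates, using ordinary least squares."""
    n = len(true_values)
    mean_x = sum(estimates) / n
    mean_y = sum(true_values) / n
    sxx = sum((x - mean_x) ** 2 for x in estimates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimates, true_values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    mse = sum((y - x) ** 2 for x, y in zip(estimates, true_values)) / n
    return mse, a, b


# Hypothetical estimates that are shrunk relative to the true variances.
estimates = [0.10, 0.20, 0.30, 0.40]
true_values = [0.01 + 1.2 * x for x in estimates]
mse, a, b = regression_summary(true_values, estimates)
print(round(mse, 4), round(a, 4), round(b, 4))
```

      A slope above 1 in this setup means the estimates are less dispersed than the true values, and a slope below 1 means they are more dispersed.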
      Table 3. Mean squared error of prediction (MSE), intercept (a), and coefficient (b) of the linear regression between the variance of gametic diversity for QTL and its estimates using a high-density SNP panel (HD) and sequence data (Seq) by genomic models¹

| $h^2$ | QTL (no.) | Model² | HD MSE | HD a | HD b | Seq MSE | Seq a | Seq b |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 20 | GBLUP | 0.0014 | −0.0010 | 0.27 | 0.0022 | −0.00033 | 0.20 |
| | | BLASSO | 8 × 10⁻⁵ | 0.0027 | 1.20 | 8 × 10⁻⁵ | 0.00185 | 1.26 |
| | 200 | GBLUP | 0.0010 | 0.0058 | 0.23 | 0.0016 | 0.00637 | 0.18 |
| | | BLASSO | 0.0001 | 0.0074 | 1.01 | 0.0001 | 0.00681 | 1.03 |
| 0.3 | 20 | GBLUP | 0.0017 | −0.00697 | 0.43 | 0.0028 | −0.00625 | 0.35 |
| | | BLASSO | 0.0002 | 0.00282 | 1.46 | 0.0002 | 0.00247 | 1.41 |
| | 200 | GBLUP | 0.0021 | 0.00979 | 0.40 | 0.0035 | 0.01123 | 0.33 |
| | | BLASSO | 0.0004 | 0.00945 | 1.14 | 0.0004 | 0.00950 | 1.13 |
| 0.5 | 20 | GBLUP | 0.0019 | −0.00294 | 0.26 | 0.0030 | −0.002039 | 0.19 |
| | | BLASSO | 0.0001 | 0.00188 | 1.41 | 0.0001 | 0.001866 | 1.37 |
| | 200 | GBLUP | 0.0022 | 0.00560 | 0.62 | 0.0033 | 0.006547 | 0.56 |
| | | BLASSO | 0.0008 | 0.00851 | 1.10 | 0.0007 | 0.008799 | 1.09 |

      ¹ Values in bold represent the least-biased estimates.
      ² GBLUP = genomic BLUP; BLASSO = Bayesian LASSO.
      For a trait with $h^2$ = 0.10 and 20 QTL (Table 2), the correlations between $\sigma^2_{gamete}$ obtained with all QTL and with QTL of MAF ≥5% were moderate to high, lower than those for the other traits (high), resulting in lower correlations with the $\sigma^2_{gamete}$ estimated by genomic models. Although this result may be due to allele frequency fluctuations in the historical population, it also implies that QTL with low MAF are important for obtaining accurate estimates of $\sigma^2_{gamete}$. This variance does not depend directly on population allele frequencies but on the individual's heterozygous status. Although MAF filtering (≥5%) can be used to improve the prediction of GEBV (
      • Uemoto Y.
      • Sasaki S.
      • Kojima T.
      • Sugimoto Y.
      • Watanabe T.
      Impact of QTL minor allele frequency on genomic evaluation using real genotype data and simulated phenotypes in Japanese Black cattle.
      ), markers with low MAF may have greater linkage disequilibrium with QTL with low MAF, providing better predictions of gametic variance.
      To facilitate rapid calculation of $\sigma^2_{gamete}$, we tested some scenarios without considering the covariance (dependence) between markers. However, the correlation between true and estimated $\sigma^2_{gamete}$ was always lower compared with the full model, with the difference ranging from moderate to high when the estimates were obtained from QTL, and from low to high when obtained from the marker effects (Table 2). The high correlation observed for one of the scenarios ($h^2$ = 0.30 and QTL = 20) can be attributed to the random distribution of QTL in the genome. Therefore, covariance between markers should always be included when calculating $\sigma^2_{gamete}$, which should thus be preferred over the traditional Mendelian sampling variance (Appendix A1). This result is consistent with
      • Bonk S.
      • Reichelt M.
      • Teuscher F.
      • Segelke D.
      • Reinsch N.
      Mendelian sampling covariability of marker effects and genetic values.
      , who recommended the use of haplotype and direct recombination data (
      • Cole J.B.
      • VanRaden P.M.
      Use of haplotypes to estimate Mendelian sampling effects and selection limits.
      ).
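The role of the covariance term can be illustrated with a minimal sketch. It assumes a single chromosome, phased parental haplotypes coded 0/1, known additive marker effects, and Haldane's mapping function to convert map distance into recombination rate; the function name `gametic_variance` and all inputs are hypothetical and are not taken from the software used in this study. Under these assumptions, each heterozygous locus contributes a binomial variance of 0.25 × effect², and each pair of heterozygous loci contributes a covariance of 0.25 × (1 − 2c) times the product of their signed effects, where c is the recombination rate between them.

```python
import numpy as np


def gametic_variance(hap1, hap2, effects, pos_morgan, use_cov=True):
    """Variance of gamete genetic values for one parent on one chromosome.

    hap1, hap2 : 0/1 alleles on the two phased parental haplotypes
    effects    : additive effect of the allele coded 1 at each marker
    pos_morgan : map positions in Morgans (assumed known)
    """
    hap1 = np.asarray(hap1, dtype=float)
    hap2 = np.asarray(hap2, dtype=float)
    effects = np.asarray(effects, dtype=float)
    pos = np.asarray(pos_morgan, dtype=float)

    # Signed effect per locus: zero when homozygous, +/- effect when
    # heterozygous, with the sign recording the linkage phase.
    w = (hap1 - hap2) * effects

    if not use_cov:
        # Loci treated as independent: sum of binomial variances only.
        return float(0.25 * np.sum(w ** 2))

    # Haldane's map function gives 1 - 2c = exp(-2d) for map distance d.
    dist = np.abs(pos[:, None] - pos[None, :])
    cov = 0.25 * np.exp(-2.0 * dist)
    return float(w @ cov @ w)


# Two completely linked heterozygous loci with equal effects (toy values).
coupling = gametic_variance([1, 1], [0, 0], [1.0, 1.0], [0.0, 0.0])
repulsion = gametic_variance([1, 0], [0, 1], [1.0, 1.0], [0.0, 0.0])
no_cov = gametic_variance([1, 1], [0, 0], [1.0, 1.0], [0.0, 0.0], use_cov=False)
print(coupling, repulsion, no_cov)
```

In this toy case, ignoring the covariance returns 0.5 regardless of phase, whereas the full calculation separates coupling (1.0) from repulsion (0.0). For a whole genome, the same function would be applied per chromosome and the results summed, because chromosomes segregate independently.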
      No difference in correlation between true and estimated σgamete2 from BLASSO was observed between the high-density SNP and sequence data scenarios (Table 2), indicating that SNP panels with moderate densities are sufficient for estimating σgamete2. However, a decrease in correlation was observed for estimates obtained with GBLUP when the sequence data panel was used, regardless of the number of simulated QTL. For GEBV prediction,
      • Clark S.A.
      • Hickey J.M.
      • Van der Werf J.H.J.
      Different models of genetic variation and their effect on genomic evaluation.
      observed a small difference in performance using differential shrinkage with sequence data compared with medium-density SNP panels.
      • Pérez-Enciso M.
      • Forneris N.
      • de Los Campos G.
      • Legarra A.
      Evaluating sequence-based genomic prediction with an efficient new simulator.
      also reported a modest increase in accuracy using a differential shrinkage model on sequence data. Therefore, sequence data are unlikely to offset SNP panels for predicting GEBV when the number of loci is large and the prior given to each SNP is uniform. Although no improvement in accuracy for σgamete2 was observed with an increased number of markers, the difference in performance between the 2 types of methods was in line with the literature on GEBV studies. This fact, together with the increase in overestimation due to an increased number of markers (Table 3), confirms the preference for shrinkage models when estimating σgamete2 in our simulation of a small genome and relatively large QTL effects.
      The correlation between true and predicted CRV was lower than that of σgamete2 (Supplemental Table S1; https://doi.org/10.3168/jds.2018-15971). There was no unanimous model, but GBLUP showed better prediction performance for many scenarios, whereas BLASSO had better results when ignoring the covariance between markers in scenarios with moderate heritability and a small number of QTL. Generally, the prediction with high-density markers showed a higher accuracy than that with sequence data. The CRV is a relative parameter that indicates how variable the GEBV of an individual is when transmitted to its gametes. The magnitude of the correlation showed that this parameter can be predicted, although the decreased accuracy with an increased number of markers indicated some difficulty for prediction in these cases.
      These results may also be explained by a partition of CRV (Appendix A3). Similar results were observed for σgamete2 and CRV in the 10th generation using the marker effects estimated from the 9th generation. This means that predictions for these parameters can follow the same design used in genomic selection programs to calculate GEBV, and σgamete2 can be estimated using the training data from previous generations (
      • Habier D.
      • Fernando R.L.
      • Dekkers J.C.M.
      The impact of genetic relationship information on genome-assisted breeding values.
      ).

      Application of Gametic Variance in Selection Programs

      The percentage of additional genetic gain (ΔG) per generation in selection by using RPTA compared with PTA (ΔGRPTA-PTA/ΔGPTA), as well as the accumulated gain for a period of 10 generations, was used to assess the suitability of the new selection index (Figure 1 and Supplemental Figure S1; https://doi.org/10.3168/jds.2018-15971). The accumulated genetic gains obtained with RPTA were higher than those obtained with PTA when the number of QTL increased. No significant increase was observed for a small number of QTL (20). However, in scenarios with more QTL, the genetic gain was close to expected (Appendix A2), with ΔG ranging from 5 to 16% in 10 generations, indicating an advantage of RPTA for traits with large numbers of QTL. These results were in agreement with those reported by
      • Daetwyler H.D.
      • Hayden M.
      • Spangenberg G.
      • Hayes B.
      Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection.
      using a genomic optimal haploid value (OHV) for selection. In addition, we noted that RPTA tended to increase the frequency of the best alleles in the population more quickly than PTA, which resulted in a permanent effect over generations (Supplemental Figure S1; https://doi.org/10.3168/jds.2018-15971). This trend was also observed by
      • Daetwyler H.D.
      • Hayden M.
      • Spangenberg G.
      • Hayes B.
      Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection.
      using OHV, which verified the increase of genetic gain and a smaller reduction of genetic diversity compared with genomic selection. According to those authors, the selection of the individuals with the highest breeding values can lead to the loss of rare favorable alleles in the population, but individuals that carry these alleles may be more favorable in the long term, even though their GEBV can be below the truncation point. The importance of the selection for favorable minor alleles was also reported by
      • Sun C.
      • VanRaden P.M.
      Increasing long-term response by selecting for favorable minor alleles.
      using a weighted genomic selection (WGS) by the favorable MAF.
      Figure 1. Difference in percentage of genetic gain per generation between true PTA (ΔGPTA) and relative PTA (ΔGRPTA) with different adjusted future selection intensity values (if = 0.5, 0.8, and 1) and heritabilities (h2 = 0.1, 0.3, and 0.5) for 10 generations and 20 QTL, with real selection intensity around 0.996 for females and 1.755 for males (5 offspring per dam).
      The percentages of additional genetic gain per generation using RPTA were generally positive across scenarios (Figure 1 and Supplemental Figure S1; https://doi.org/10.3168/jds.2018-15971). Some were negative in scenarios with a few QTL but they were all positive and relatively large for scenarios with more QTL, between 3 and 8%. These results highlight the advantage of using RPTA compared with conventional PTA. Although for all scenarios we observed that the first generation under the RPTA selection obtained less genetic gain than PTA, a large increase was obtained in subsequent generations.
      • Daetwyler H.D.
      • Hayden M.
      • Spangenberg G.
      • Hayes B.
      Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection.
      reported similar trends in the first few generations of selection using OHV. Thus, the selection by RPTA initially provided an increase of variability, with a subsequent reduction by selection of the best alleles in the following generations.
      The effect of h2 was not evident across the scenarios, and in general, the best RPTA performance was observed for scenarios with moderate heritability.
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      compared UC and OHV in 2 scenarios with low and high h2 and observed greater gain in the latter scenario. However, as the true values (PTA and RPTA) were used for selection in this study, heritability would not affect the estimation but the magnitude of genetic variances.
      When the number of progeny per dam increased, the genetic gain also increased for both PTA and RPTA, but the RPTA achieved a faster, and therefore greater, genetic gain (Figure 1 and Supplemental Figure S1; https://doi.org/10.3168/jds.2018-15971). These results agreed with those reported by
      • Daetwyler H.D.
      • Hayden M.
      • Spangenberg G.
      • Hayes B.
      Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection.
      , who observed a rapid increase in genetic gain using OHV when the number of progeny increased from 10 to 1,000.
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      also observed greater gains for higher selection intensities for UC and OHV. For the scenarios with a few QTL, there was no large gain by using RPTA when increasing selection intensity (Figure 1 and Supplemental Figure S1). For scenarios with many QTL, we observed a greater gain using RPTA in early generations than in scenarios with lower selection intensity, as predicted by the expected gain (Appendix A2). For traits with many QTL, higher values of if performed better than lower values, although this trend was not observed for scenarios with a few QTL. This suggests that for scenarios with many QTL, the increase in selection intensity allows the use of higher values for if.
      One of the advantages of using RPTA is the choice of if for weighting σgamete2 in the index (Figure 1 and Supplemental Figure S1). Initially, integral values (without adjustment) for if were tested but gains were much smaller than those with traditional selection. The component if×σgamete2 from RPTA is stochastic, and high values for if can be risky when the number of progeny per individual is small.
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      used an integral value of if to obtain UC; however, they simulated plant crosses with large numbers of progeny (100) to realize the predicted variance in the offspring. Increasing if proved to be unsuitable for scenarios with few QTL, and the lowest values should be prioritized for this type of trait. However, for scenarios with many QTL, large if values are desired, especially when selection intensity is high. Thus, the risk related to the number of progeny should be considered and standardized equally for all individuals, preferably by increasing the minimum number of offspring.
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      obtained the selection intensity value in UC from the proportion of individuals selected within homogeneous plant crosses, which is equivalent to using dam selection intensity in animal breeding. Empirically, to find the optimal value for $i_f$, we can adjust the real value of the future selection intensity of dams (which have the fewest offspring) by $1 - \overline{PV}$, so the adjusted intensity value is $i_f^{*} = (1 - \overline{PV}) \times i_f$, where the individual percentage of variation (PV) is obtained from equation [A3.1] in Appendix A3. Given the number of progeny per dam ($n$) and the average CRV of the population, we have $\overline{PV} = Z_{1-\alpha/2} \times \overline{CRV} / \sqrt{n}$, where $Z$ is the corresponding percentile value from a standard normal distribution.
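The adjustment described above can be sketched in a few lines. The sketch assumes that the average percentage of variation follows from rearranging equation [A3.1] (PV = Z × CRV / √n); the function name `adjusted_future_intensity` and the example inputs are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def adjusted_future_intensity(i_f, mean_crv, n_progeny, alpha=0.05):
    """Shrink the future selection intensity by the expected sampling error.

    i_f       : unadjusted future selection intensity of dams
    mean_crv  : average coefficient of relative variation in the population
    n_progeny : number of progeny per dam
    alpha     : two-sided error level used for the normal quantile
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    mean_pv = z * mean_crv / math.sqrt(n_progeny)  # rearranged [A3.1]
    return (1.0 - mean_pv) * i_f


# Hypothetical inputs: i_f = 1, average CRV = 0.5, 100 progeny per dam.
print(round(adjusted_future_intensity(1.0, 0.5, 100), 3))
```

With few progeny per dam the average PV grows and the adjusted intensity shrinks accordingly, which mirrors the recommendation to use lower weights when the predicted variance is unlikely to be realized in the offspring.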
      In this section, we verified the feasibility of using σgamete2 in selection programs to explore the whole additive genetic potential of individuals. The proposed index (RPTA) is easy to obtain and apply. As the variance of GEBV of future progeny is $\sigma^2_{gameteSire} + \sigma^2_{gameteDam}$, the UC for a given mating can be easily obtained as $RPTA_{sire} + RPTA_{dam}$. With greater genetic gains and better preservation of genetic diversity (Supplemental Figures S2 and S3; https://doi.org/10.3168/jds.2018-15971), the RPTA can accelerate genetic selection compared with other indices that also preserve diversity, such as the OHV, WGS, optimal population value, and genotype building (
      • Goiffon M.
      • Kusmec A.
      • Wang L.
      • Hu G.
      • Schnable P.S.
      Improving response in genomic selection with a population-based selection strategy: Optimal population value selection.
      ). This statement is supported by the literature, because first
      • Goiffon M.
      • Kusmec A.
      • Wang L.
      • Hu G.
      • Schnable P.S.
      Improving response in genomic selection with a population-based selection strategy: Optimal population value selection.
      showed a better performance of OHV than cited indices, and then
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      introduced the UC (similar to RPTA, but used for mating) and demonstrated its superiority to OHV within the crosses when different selection intensities are applied. Thus, RPTA can be a better option to maximize genetic gain per generation and the profitability of a breeding program. In contrast, although RPTA represents an optimal equilibrium between the expected value (PTA) and the variability, the weighted value for σgamete2 ($i_f^{*}$) can be modified to preserve diversity and accelerate genetic progress. Another point to consider is that although RPTA and WGS have a similar purpose, the expected component of RPTA (PTA) can still be adjusted for a greater emphasis in the selection of favorable minor alleles if desired, as suggested by
      • Sun C.
      • VanRaden P.M.
      Increasing long-term response by selecting for favorable minor alleles.
      . Besides, although the genetic gain obtained using RPTA with random mating has been impressive for traits with many QTL, greater gains can be obtained with this criterion using a sophisticated mating design (
      • Allaire F.R.
      Mate selection by selection index theory.
      ;
      • Sonesson A.K.
      • Meuwissen T.H.E.
      Mating schemes for optimum contribution selection with constrained rate of inbreeding.
      ;
      • Lehermeier C.
      • Teyssèdre S.
      • Schön C.C.
      Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
      ).
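The index discussed in this section can be sketched as follows. The sketch assumes the form implied by Appendix A2, in which the relative value is the PTA plus the future selection intensity times the gametic standard deviation; the function names `rpta` and `mating_criterion` and all numeric values are hypothetical and serve only to show how the ranking can change relative to PTA.

```python
import numpy as np


def rpta(pta, sigma2_gamete, i_f):
    """Relative PTA: PTA plus i_f times the gametic standard deviation."""
    pta = np.asarray(pta, dtype=float)
    sd_gamete = np.sqrt(np.asarray(sigma2_gamete, dtype=float))
    return pta + i_f * sd_gamete


def mating_criterion(rpta_sire, rpta_dam):
    """Criterion for a mating, taken here as RPTA(sire) + RPTA(dam)."""
    return rpta_sire + rpta_dam


# Three hypothetical candidates: similar PTA, very different gametic variance.
pta = np.array([1.00, 0.95, 0.80])
sigma2 = np.array([0.01, 0.09, 0.36])

index = rpta(pta, sigma2, i_f=0.8)
ranking_pta = np.argsort(-pta)     # order of candidates under PTA
ranking_rpta = np.argsort(-index)  # order of candidates under RPTA
print(index, ranking_pta, ranking_rpta)
```

In this toy example the candidate with the lowest PTA ranks first under the relative index because its gametes are far more variable, which is the behavior the weight on the gametic term is meant to control.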

      Estimation and Application of Gametic Variance in Real Data

      The suitability of applying σgamete2 in livestock breeding programs was evaluated using 2 dairy populations, Holstein and Jersey, for 5 milk production traits. Because chromosomes are independent genome segments, σgamete2 was calculated separately for each chromosome (Figure 2 and Supplemental Figure S4; https://doi.org/10.3168/jds.2018-15971). The average, standard deviation, and amplitude of the estimated σgamete2 were largest on BTA14 for all production traits. This was expected because BTA14 contains the largest QTL for milk production, DGAT1 (
      • Grisart B.
      • Coppitiers W.
      • Farnir F.
      • Karim L.
      • Ford C.
      • Berzi P.
      • Cambisano N.
      • Mni M.
      • Reid S.
      • Simon P.
      • Spelman R.
      • Georges M.
      • Snell R.
      Positional candidate cloning of a QTL in dairy cattle: Identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition.
      ;
      • Bouwman A.C.
      • Bovenhuis H.
      • Visker M.H.P.W.
      • Van Arendonk J.A.M.
      Genome-wide association of milk fatty acids in Dutch dairy cattle.
      ,
      • Bouwman A.C.
      • Daetwyler H.D.
      • Chamberlain A.J.
      • Ponce C.H.
      • Sargolzaei M.
      • Schenkel F.S.
      • Sahana G.
      • Govignon-Gion A.
      • Boitard S.
      • Dolezal M.
      • Pausch H.
      • Brøndum R.F.
      • Bowman P.J.
      • Thomsen B.
      • Guldbrandtsen B.
      • Lund M.S.
      • Servin B.
      • Garrick D.J.
      • Reecy J.
      • Vilkki J.
      • Bagnato A.
      • Wang M.
      • Hoff J.L.
      • Schnabel R.D.
      • Taylor J.F.
      • Vinkhuyzen A.A.E.
      • Panitz F.
      • Bendixen C.
      • Holm L.E.
      • Gredler B.
      • Hozé C.
      • Boussaha M.
      • Sanchez M.P.
      • Rocha D.
      • Capitan A.
      • Tribout T.
      • Barbat A.
      • Croiseau P.
      • Drögemüller C.
      • Jagannathan V.
      • Jagt C.V.
      • Crowley J.J.
      • Bieber A.
      • Purfield D.C.
      • Berry D.P.
      • Emmerling R.
      • Götz K.U.
      • Frischknecht M.
      • Russ I.
      • Sölkner J.
      • Van Tassell C.P.
      • Fries R.
      • Stothard P.
      • Veerkamp R.F.
      • Boichard D.
      • Goddard M.E.
      • Hayes B.J.
      Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals.
      ). However, the 5 traits had different distributions of σgamete2 among chromosomes. The σgamete2 for PY was more evenly distributed, but for other traits, especially for F%, σgamete2 showed a skewed distribution with major mass concentrated on BTA14. This is due to the greater effect of DGAT1 on milk fat than on milk protein (
      • Thaller G.
      • Krämer W.
      • Winter A.
      • Kaupe B.
      • Erhardt G.
      • Fries R.
      Effects of DGAT1 variants on milk production traits in German cattle breeds.
      ). Similar results were observed in both Holstein and Jersey cattle. Although the recombination rate is different between males and females in cattle (
      • Shen B.
      • Jiang J.
      • Seroussi E.
      • Liu G.E.
      • Ma L.
      Characterization of recombination features and the genetic basis in multiple cattle breeds.
      ), little difference was observed for σgamete2 between the 2 sexes (Figure 2 and Supplemental Figure S4).
      Figure 2. Distribution of variance of gametic diversity (σgamete2) for milk, fat, and protein yields and fat and protein percentages by chromosome and sex in US Holstein cattle. Bars indicate averages and whiskers represent standard deviations.
      The distribution of σgamete2 varied in the 2 cattle populations (Figure 3 and Supplemental Figure S5; https://doi.org/10.3168/jds.2018-15971). The results showed a distribution close to the typical Gaussian curve for PY, but non-Gaussian curves for other traits. For F%, the distribution had 2 peaks. Similar results were observed for Holstein and Jersey, but the effect of BTA14 was more pronounced in Jersey, possibly because this breed has a higher milk fat percentage, as well as different composition of fatty acid content in milk (
      • White S.L.
      • Bertrand J.A.
      • Wade M.R.
      • Washburn S.P.
      • Green Jr., J.T.
      • Jenkins T.C.
      Comparison of fatty acid content of milk from Jersey and Holstein cows consuming pasture or a total mixed ration.
      ).
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      , studying genetic variation in Holstein, observed a similar distribution for PY and FY.
      Figure 3. Normal quantile-quantile (Q-Q) and kernel density plots of variance of gametic diversity distribution for milk, fat, and protein yields as well as fat and protein percentages in US Holstein cattle.
      The predictability of the offspring variance of dairy traits for bulls was assessed using a Pearson correlation between the variance of progeny breeding values and the σgamete2 of bulls estimated by genomic models (Table 4). This correlation increased with an increased number of offspring. The moderate to high correlation indicated the feasibility of predictions. These results were consistent with
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      , who reported the same trend with slightly higher values for the correlation with standard deviations of gamete breeding value (SDGBV) in Holstein. Although the variance of progeny GEBV also contains a dam effect, our results using only sires validated the σgamete2 estimated by genomic models as a predictor of the variance of GEBV for future offspring. In general, traits with larger coefficient of variation, such as PY, MY, and FY, had a lower correlation, whereas traits with lower coefficient of variation, F%, and P%, exhibited larger correlations. The difference in variability also explains the second trend, where traits with a biased distribution of σgamete2 per chromosome had a better prediction than those showing even distributions.
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      also reported a larger correlation for FY than PY using SDGBV.
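The validation described in this paragraph can be sketched as a small routine. It assumes one breeding value per progeny, a predicted σgamete2 per sire, and a minimum-offspring filter; the function name `progeny_variance_correlation` and the toy data are hypothetical, and the sketch is not the pipeline used for the US evaluations.

```python
import numpy as np


def progeny_variance_correlation(sire_ids, progeny_bv, sigma2_gamete, min_offspring):
    """Correlate predicted gametic variance with realized progeny variance.

    sire_ids      : sire identifier for each progeny record
    progeny_bv    : breeding value of each progeny
    sigma2_gamete : dict mapping sire identifier to predicted variance
    min_offspring : sires with fewer progeny are dropped
    """
    sire_ids = np.asarray(sire_ids)
    progeny_bv = np.asarray(progeny_bv, dtype=float)
    threshold = max(min_offspring, 2)  # a variance needs at least 2 records

    predicted, realized = [], []
    for sire in np.unique(sire_ids):
        key = sire.item()
        values = progeny_bv[sire_ids == sire]
        if key in sigma2_gamete and len(values) >= threshold:
            predicted.append(sigma2_gamete[key])
            realized.append(np.var(values, ddof=1))

    r = float(np.corrcoef(predicted, realized)[0, 1])
    return len(predicted), r


# Toy data: sire D has a single progeny and is filtered out.
sires = ["A", "A", "B", "B", "C", "C", "D"]
bvs = [0.0, 2.0, 0.0, 4.0, 0.0, 6.0, 1.0]
pred = {"A": 1.0, "B": 4.0, "C": 9.0, "D": 2.0}
n_sires, r = progeny_variance_correlation(sires, bvs, pred, min_offspring=2)
print(n_sires, round(r, 3))
```

Raising `min_offspring` reduces the number of sires retained but makes each realized variance less noisy, which is consistent with the correlations increasing with the minimum number of offspring.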
      Table 4. Pearson correlations (r) of σgamete2 for milk, protein, and fat yields (MY, PY, and FY, respectively) and protein and fat percentages (P% and F%, respectively) with variances of progeny breeding values for different minimum numbers of offspring per sire
      Breed and minimum no. of offspring   Sires (no.)   rMY    rPY    rFY    rP%    rF%
      Jersey
       10                                  1,109         0.24   0.16   0.20   0.30   0.58
       50                                  451           0.40   0.33   0.46   0.50   0.75
       100                                 311           0.53   0.34   0.47   0.60   0.85
       200                                 183           0.64   0.31   0.49   0.77   0.95
       300                                 128           0.68   0.40   0.55   0.86   0.96
       400                                 97            0.66   0.43   0.61   0.90   0.97
       500                                 77            0.66   0.51   0.62   0.90   0.97
       600                                 66            0.69   0.54   0.66   0.92   0.97
      Holstein
       10                                  6,797         0.29   0.16   0.32   0.39   0.57
       50                                  2,753         0.55   0.24   0.60   0.69   0.85
       100                                 1,887         0.66   0.27   0.67   0.78   0.91
       200                                 1,241         0.71   0.23   0.70   0.83   0.93
       300                                 903           0.75   0.26   0.74   0.85   0.94
       400                                 706           0.77   0.29   0.77   0.87   0.95
       500                                 569           0.78   0.30   0.78   0.89   0.96
       600                                 478           0.78   0.30   0.77   0.89   0.95
      1 σgamete2 = variance of gametic diversity.
      The correlations of σgamete2 between traits were all positive, ranging from moderate to high magnitude, although most of the estimated correlations between production and content percentage traits were negative (Table 5). In general, many large correlations of σgamete2 were observed with MY, indicating that selection using gametic variation of MY can result in the preservation of variability in other production traits as well. The magnitude of correlation for FY and PY was similar to those studies using other measures of progeny variation by
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      and by
      • Bonk S.
      • Reichelt M.
      • Teuscher F.
      • Segelke D.
      • Reinsch N.
      Mendelian sampling covariability of marker effects and genetic values.
      .
      Table 5. Pearson correlations between σgamete2 for different traits (above diagonal) and breeding values (below diagonal) and inbreeding coefficients (right side columns)
      Breed and trait   MY      FY      PY      F%      P%      FG      FP
      Jersey
       MY               -       0.77    0.61    0.92    0.84    −0.07   −0.03
       FY               0.47    -       0.55    0.83    0.73    −0.08   −0.02
       PY               0.86    0.74    -       0.41    0.37    −0.15   −0.08
       F%               −0.67   0.34    −0.30   -       0.87    −0.03   −0.00
       P%               −0.59   0.24    −0.10   0.83    -       −0.07   −0.01
       FG               0.04    −0.03   0.03    −0.07   −0.03   -       0.48
       FP               0.19    0.13    0.20    −0.10   −0.06   0.48    -
      Holstein
       MY               -       0.73    0.62    0.86    0.67    −0.12   −0.03
       FY               0.44    -       0.43    0.90    0.46    −0.11   −0.04
       PY               0.83    0.68    -       0.42    0.28    −0.19   −0.07
       F%               −0.50   0.56    −0.12   -       0.59    −0.06   −0.01
       P%               −0.40   0.33    0.18    0.69    -       −0.10   −0.03
       FG               0.24    0.30    0.31    0.07    0.10    -       0.51
       FP               0.24    0.31    0.33    0.08    0.13    0.51    -
      1 MY = milk yield; PY = protein yield; FY = fat yield; F% = fat percentage; P% = protein percentage; FG and FP = genomic and pedigree inbreeding coefficients, respectively; σgamete2 = variance of gametic diversity.
      Correlations of σgamete2 with inbreeding coefficients, given as the diagonal of the
      • Wright S.
      Evolution in Mendelian populations.
      and genomic (
      • VanRaden P.M.
      Efficient methods to compute genomic predictions.
      ) relationship matrices, were negative but close to zero across dairy traits, with the highest correlation observed for PY and the lowest for F% (Table 5). The results observed for PY and FY were similar to those reported by
      • Segelke D.
      • Reinhardt F.
      • Liu Z.
      • Thaller G.
      Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
      using SDGBV. The negative correlation was expected because the greater the inbreeding, the lower the heterozygosity of the loci and consequently the lower the σgamete2. However, because σgamete2 depends on the heterozygosity of those genomic regions with QTL effects, the low correlations were not unexpected, because inbreeding coefficients are global indicators of whole-genome heterozygosity.

      CONCLUSIONS

      This study verified the feasibility of estimating and applying variance of gametic diversity in livestock breeding programs. The σgamete2 can be accurately obtained from genomic models. To improve the estimation of σgamete2, covariance between markers needs to be considered. The σgamete2 can be especially useful for traits with many QTL when used in a newly developed selection index, RPTA. Additionally, the confidence level of this index can be adjusted by the number of future progeny, making it suitable for dairy cattle breeding. For Holstein and Jersey cattle, DGAT1 had a large effect on the prediction of σgamete2 across all production traits. Inbreeding coefficients had a small impact on gametic variability of dairy traits, with a greater effect on traits less affected by DGAT1. Collectively, σgamete2 can be easily obtained and applied to existing genomic selection programs to improve genetic progress and control genetic diversity.

      ACKNOWLEDGMENTS

      This research was supported in part by the Agriculture and Food Research Initiative grant no. 2016-67015-24886 from the USDA National Institute of Food and Agriculture (Washington, DC) and grant no. US-4997-17 from the US-Israel Binational Agricultural Research and Development Fund. Santos was also supported by São Paulo Research Foundation (FAPESP; research scholarship no. 2017/00462-5 and no. 2015/12396-1). Cole and VanRaden were supported by appropriated project 8042-31000-002-00-D, “Improving Dairy Animals by Increasing Accuracy of Genomic Prediction, Evaluating New Traits, and Redefining Selection Goals,” of the Agricultural Research Service of the United States Department of Agriculture. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. The USDA is an equal opportunity provider and employer. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

      APPENDIX A1

      Variance of Gametic Diversity in Traditional BLUP Models

      The additive genetic value of an individual (u) can be expressed as the sum of half the value of sire (S), half of the dam (D), and a Mendelian error (m). The components transmitted by the sire (TS) and dam (TD) can be represented as
      $$T_S = \frac{u_S}{2} + m_S \quad \text{and} \quad T_D = \frac{u_D}{2} + m_D,$$
      [A1.1]


      where the variance of m, the Mendelian sampling variance, var(m), is obtained as in
      • Mrode R.A.
      Linear Models for the Prediction of Animal Breeding Values.
      :
      $$\mathrm{var}(m) = \frac{(1 - F)\,\sigma_u^2}{4}.$$
      [A1.2]


      Here, the var(m) of an EBV is obtained from the parental inbreeding coefficients (F) (
      • Dempfle L.
      Problems in the use of the relationship matrix in animal breeding.
      ). From the genomics point of view, var(m) can be obtained considering locus by locus of the parent. In the traditional BLUP model, the diagonal elements of the relationship matrix are obtained by doubling the expected variance if this diploid individual mated with itself. This variance can be divided into 2 parts for the progeny having the same alleles (αi = αj) for a given locus or different alleles (αi ≠ αj) Considering that a homozygous locus has probability Pr(αi = αj) = 1 and a heterozygous locus Pr(αi = αj) = 0.5 and Pr(αi ≠ αj) = 0.5, the total variance of a hypothetical mating will be a function of the number of homozygous loci (NHom) and the number of heterozygous loci (NHet):
      2×var(Ts+TD)=2×(NHom+NHet)iN2piqiσu2.
      [A1.3]


      Separating by components related to equality (αi = αj) and to the difference (αi ≠ αj) among the loci of the future gametes, we have
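      $$2 \times \mathrm{var}(T_S + T_D) = 2 \times \frac{N_{Hom} + N_{Het}\Pr(\alpha_i = \alpha_j)}{\sum_{i}^{N} 2 p_i q_i}\,\sigma_u^2 + 2 \times \frac{N_{Het}\Pr(\alpha_i \neq \alpha_j)}{\sum_{i}^{N} 2 p_i q_i}\,\sigma_u^2$$
      $$2 \times \mathrm{var}(T_S + T_D) = (1 + F)\,\sigma_u^2 + (1 - F)\,\sigma_u^2.$$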


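      Knowing that $\sigma_u^2 = \sum_{i}^{N} 2 p_i q_i \sigma_a^2$, where $\sigma_a^2$ is the variance of the effect of allelic substitution considered homogeneous for all loci in BLUP, var(m) is
      $$\mathrm{var}(m) = \frac{1 - F}{4}\,\sigma_u^2 = 0.25 \times \frac{2 \times N_{Het}\Pr(\alpha_i \neq \alpha_j)}{\sum_{i}^{N} 2 p_i q_i}\,\sigma_u^2$$
      $$\mathrm{var}(m) = 0.25 \times N_{Het}\,\sigma_a^2.$$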


      When homogeneous variance is assumed for the allelic substitution effects across all loci that are inherited independently:
      $$\mathrm{var}(m) = \sigma^2_{gamete}.$$
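The final identity can be checked numerically with a small Monte Carlo sketch. It assumes independently inherited loci, a single fixed substitution effect `a` at every locus (so that a² stands in for the homogeneous variance of allelic substitution effects), and hypothetical locus counts; none of these values come from the data analyzed here.

```python
import numpy as np

rng = np.random.default_rng(1)

n_het, n_hom = 10, 30   # hypothetical numbers of heterozygous/homozygous loci
a = 1.0                 # same substitution effect at every locus (assumption)
n_gametes = 200_000

# At a heterozygous locus the gamete receives the allele with effect `a`
# with probability 0.5; homozygous loci always transmit the same allele,
# so they add a constant and contribute no variance.
het_part = rng.integers(0, 2, size=(n_gametes, n_het)) * a
gamete_values = het_part.sum(axis=1) + n_hom * a

simulated = gamete_values.var()
expected = 0.25 * n_het * a ** 2
print(simulated, expected)
```

The simulated variance should fall close to 0.25 × N_Het × a², and changing `n_hom` leaves it unchanged, which is the sense in which only heterozygous loci generate gametic variance under these assumptions.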


      APPENDIX A2

      Expected Genetic Gain Using the Relative PTA

      The RPTAi (relative PTA) refers to the average of the genetic value relative to the group of gametes that will be selected in the future. Thus, from the key equation of genetic change, the future genetic gain considering the selection of animals can be estimated. We will differentiate between the 2 selection intensities, $i_r$ (recent) and $i_f$ (future), although they are identical in most cases. The relative selection differential ($S_R$) for the next generation is given by $S_R = E[RBV_i > \text{selected}]$, where $RBV_i = 2 \times (PTA_i + \sigma_i \times i_f)$. The variance of $RBV = 2 \times (PTA_i + \sigma_{gamete} \times i_f)$ can be obtained as
      $$E\{[RBV - E(RBV)]^2\} = \sigma_u^2 + 4 \times i_f^2 \times [E(\sigma^2_{gamete}) - E(\sigma_{gamete})^2].$$
      [A2.1]


      The relative genetic gain (ΔGR) can then be obtained as
      $$\Delta G_R = r \times i_r \times \sqrt{\sigma_u^2 + 4 \times \mathrm{var}(\sigma^2_{gamete}) \times i_f^2 \times r_g^2},$$
      [A2.2]


      where $\sigma_u^2$ is the additive genetic variance, $\mathrm{var}(\sigma^2_{gamete})$ is the variance of gametic diversity, and $r$ and $r_g^2$ are, respectively, accuracies of the genetic evaluation and the prediction of the $\sigma^2_{gamete}$.
      The increase in rate of genetic gain from using the relative criterion in place of the traditional criterion, disregarding the accuracy of σgamete2, will then be
      $$\frac{\Delta G_R}{\Delta G} = \sqrt{\frac{\sigma_u^2 + 4\, i_f^2\, \mathrm{var}(\sigma^2_{gamete})}{\sigma_u^2}}.$$
      [A2.3]
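A short numeric sketch may help in reading the ratio in equation [A2.3]. It assumes that the ratio is taken on the standard-deviation scale (the square root of the variance ratio), which matches gain being proportional to a standard deviation in the key equation of genetic change; the function name `relative_gain_ratio` and the input values are hypothetical.

```python
import math


def relative_gain_ratio(sigma2_u, i_f, var_gametic):
    """Expected gain with the relative criterion over the traditional one.

    sigma2_u    : additive genetic variance
    i_f         : future selection intensity
    var_gametic : variance term for gametic diversity in [A2.3]
    """
    return math.sqrt((sigma2_u + 4.0 * i_f ** 2 * var_gametic) / sigma2_u)


# Hypothetical values: unit additive variance, i_f = 1, gametic term of 0.05.
extra_gain = relative_gain_ratio(1.0, 1.0, 0.05) - 1.0
print(f"{100 * extra_gain:.1f}% additional gain")
```

The additional gain vanishes when the gametic term is zero and grows with the future selection intensity, which agrees with the qualitative pattern reported for scenarios with many QTL.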


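      The expected genetic gain can be stratified using a different $i_f$ for each sex, where the term $4\, i_f^2\, \mathrm{var}(\sigma^2_{gamete})$ is expanded to explicitly express male (S) and female (D) contributions:
      $$i_{fS}^2\, \mathrm{var}(\sigma^2_{gameteS}) + i_{fD}^2\, \mathrm{var}(\sigma^2_{gameteD}) + 2 \times i_{fS}\, i_{fD}\, \mathrm{cov}(\sigma_{gameteS}, \sigma_{gameteD}).$$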
      [A2.4]


      APPENDIX A3

      Sample Size and Coefficient of Relative Variation

      Given σgamete2, it is useful to find the optimal number of progeny for an individual to realize the expected variability in its offspring. This is similar to the optimal number of daughters needed for a reliable conventional progeny test. It is difficult to obtain such an estimate using a general model, but estimating a percentage variability in real data is possible.
      • Van Belle G.
      • Martin D.C.
      Sample size as a function of coefficient of variation and ratio of means.
      proposed an approach to obtain the number of samples from the coefficient of variation considering the margin of error as a percentage of variation. Because the gametic variation is a random component proportional to the additive genetic variance, a modification of
      • Van Belle G.
      • Martin D.C.
      Sample size as a function of coefficient of variation and ratio of means.
      by substituting the coefficient of variation for the relative variation (CRV) of the value transmitted to the progeny can be used to estimate the required sample size ($n$); CRV (equation [4]) considers the value related to the average transmission of additive variance of an individual $i$ to its progeny ($E[(0.5u_i)^2]$) as $0.5\sum_{i}^{N_{Hom}} 2\alpha_i^2 + \sigma^2_{gamete}$.
      Thus, we can use the CRV to obtain the sample size according to the level of percentage variation admitted between a value estimated from the sample and the expected value (Van Belle and Martin, 1993). The percentage of variation in a sample relative to the expected value (PV) in $0.5E[u_i^2]$ can be represented as $PV_{0.5u} = \frac{0.5T_1 - 0.5T}{0.5E[u_i^2]}$, where $T_1$ represents the change (alternative) in the mean of PTA that may differ from the expected transmitted value ($T$). The smaller the change desired in PTA, the greater the number of progeny required and, consequently, the greater the variability observed among these progeny. The number of progeny ($n$) to be used, taking the percentage of variation ($PV_{0.5u_i}$) as a margin of error, can be represented as
      $n = \frac{(Z_{1-\alpha/2})^2 \times (CRV_i)^2}{(PV_{0.5u_i})^2},$
      [A3.1]


      where $Z_{1-\alpha/2}$ is the critical value associated with the degree of confidence. For example, at a confidence level of 95%, if we accept a percentage change in the mean of only 10% of the PTA ($PV_{0.5u_i} = 0.1$), we have $n \approx 400\,(CRV_i)^2$ (rounding $Z_{1-\alpha/2} = 1.96$ up to 2); the homozygous animals will have a lower CRV than the heterozygous animals and will therefore require fewer offspring for progeny testing.
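
      The calculation in Equation [A3.1] can be sketched in a few lines of Python. The function name and the example CRV values are hypothetical; only the formula itself comes from the text. With the exact two-sided normal quantile for 95% confidence (about 1.96), the multiplier of (CRV)² at a 10% margin is roughly 384, which is close to the 400 obtained when the critical value is rounded to 2.

```python
from statistics import NormalDist


def progeny_sample_size(crv, pv=0.10, confidence=0.95):
    """Number of progeny from Equation [A3.1]: n = Z^2 * CRV^2 / PV^2.

    crv        -- coefficient of relative variation of the individual
    pv         -- accepted percentage variation (margin of error), as a fraction
    confidence -- two-sided confidence level used for the critical value Z
    """
    alpha = 1.0 - confidence
    # Two-sided critical value Z_{1 - alpha/2} of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return (z ** 2) * (crv ** 2) / (pv ** 2)


# Illustrative CRV values only (not estimates from the paper).
for crv in (0.5, 1.0, 1.5):
    print(crv, round(progeny_sample_size(crv), 1))
```

      Because n grows with the square of CRV, halving the CRV cuts the required number of progeny to a quarter, which is the pattern the text describes for animals with lower CRV.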

      REFERENCES

        • Allaire F.R.
        Mate selection by selection index theory.
        Theor. Appl. Genet. 1980; 57 (24301147): 267-272
        • Bijma P.
        • Wientjes Y.C.J.
        • Calus M.P.L.
        Increasing genetic gain by selecting for higher Mendelian sampling variance.
        in: Proc. World Congr. Genet. Appl. Livest. Prod., Auckland, New Zealand. 2018: 11.47
        • Bonk S.
        • Reichelt M.
        • Teuscher F.
        • Segelke D.
        • Reinsch N.
        Mendelian sampling covariability of marker effects and genetic values.
        Genet. Sel. Evol. 2016; 48 (27107720): 36
        • Bouwman A.C.
        • Bovenhuis H.
        • Visker M.H.P.W.
        • Van Arendonk J.A.M.
        Genome-wide association of milk fatty acids in Dutch dairy cattle.
        BMC Genet. 2011; 12 (21569316): 43
        • Bouwman A.C.
        • Daetwyler H.D.
        • Chamberlain A.J.
        • Ponce C.H.
        • Sargolzaei M.
        • Schenkel F.S.
        • Sahana G.
        • Govignon-Gion A.
        • Boitard S.
        • Dolezal M.
        • Pausch H.
        • Brøndum R.F.
        • Bowman P.J.
        • Thomsen B.
        • Guldbrandtsen B.
        • Lund M.S.
        • Servin B.
        • Garrick D.J.
        • Reecy J.
        • Vilkki J.
        • Bagnato A.
        • Wang M.
        • Hoff J.L.
        • Schnabel R.D.
        • Taylor J.F.
        • Vinkhuyzen A.A.E.
        • Panitz F.
        • Bendixen C.
        • Holm L.E.
        • Gredler B.
        • Hozé C.
        • Boussaha M.
        • Sanchez M.P.
        • Rocha D.
        • Capitan A.
        • Tribout T.
        • Barbat A.
        • Croiseau P.
        • Drögemüller C.
        • Jagannathan V.
        • Jagt C.V.
        • Crowley J.J.
        • Bieber A.
        • Purfield D.C.
        • Berry D.P.
        • Emmerling R.
        • Götz K.U.
        • Frischknecht M.
        • Russ I.
        • Sölkner J.
        • Van Tassell C.P.
        • Fries R.
        • Stothard P.
        • Veerkamp R.F.
        • Boichard D.
        • Goddard M.E.
        • Hayes B.J.
        Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals.
        Nat. Genet. 2018; 50 (29459679): 362-367
        • Brito F.V.
        • Braccini Neto J.
        • Sargolzaei M.
        • Cobuci J.A.
        • Schenkel F.S.
        Accuracy of genomic selection in simulated populations mimicking the extent of linkage disequilibrium in beef cattle.
        BMC Genet. 2011; 12 (21933416): 80
        • Clark S.A.
        • Hickey J.M.
        • Van der Werf J.H.J.
        Different models of genetic variation and their effect on genomic evaluation.
        Genet. Sel. Evol. 2011; 43 (21575265): 18
        • Cole J.B.
        • VanRaden P.M.
        Use of haplotypes to estimate Mendelian sampling effects and selection limits.
        J. Anim. Breed. Genet. 2011; 128 (22059578): 446-455
        • Daetwyler H.D.
        • Hayden M.
        • Spangenberg G.
        • Hayes B.
        Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection.
        Genetics. 2015; 200 (26092719): 1341-1348
        • Daetwyler H.D.
        • Pong-Wong R.
        • Villanueva B.
        • Woolliams J.A.
        The impact of genetic architecture on genome-wide evaluation methods.
        Genetics. 2010; 185 (20407128): 1021-1031
        • Dempfle L.
        Problems in the use of the relationship matrix in animal breeding.
        in: Gianola D., Hammond K. Advances in Statistical Methods for Genetic Improvement of Livestock. Vol 1. Springer, New York, NY. 1990: 454-473
        • García-Ruiz A.
        • Cole J.B.
        • VanRaden P.M.
        • Wiggans G.R.
        • Ruiz-López F.J.
        • Van Tassell C.P.
        Changes in genetic selection differentials and generation intervals in US Holstein dairy cattle as a result of genomic selection.
        Proc. Natl. Acad. Sci. USA. 2016; 113 (27354521): E3995-E4004
        • Goiffon M.
        • Kusmec A.
        • Wang L.
        • Hu G.
        • Schnable P.S.
        Improving response in genomic selection with a population-based selection strategy: Optimal population value selection.
        Genetics. 2017; 206 (28526698): 1675-1682
        • Grisart B.
        • Coppieters W.
        • Farnir F.
        • Karim L.
        • Ford C.
        • Berzi P.
        • Cambisano N.
        • Mni M.
        • Reid S.
        • Simon P.
        • Spelman R.
        • Georges M.
        • Snell R.
        Positional candidate cloning of a QTL in dairy cattle: Identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition.
        Genome Res. 2002; 12 (11827942): 222-231
        • Habier D.
        • Fernando R.L.
        • Dekkers J.C.M.
        The impact of genetic relationship information on genome-assisted breeding values.
        Genetics. 2007; 177 (18073436): 2389-2397
        • Hayes B.
        • Goddard M.E.
        The distribution of effects of genes affecting quantitative traits in livestock.
        Genet. Sel. Evol. 2001; 33 (11403745): 209-229
        • Henderson C.R.
        Best linear unbiased estimation and prediction under a selection model.
        Biometrics. 1975; 31 (1174616): 423-447
        • Legarra A.
        • Ricard A.
        • Filangi O.
        GS3 Genomic Selection — Gibbs Sampling — Gauss Seidel (and BayesCπ).
        http://snp.toulouse.inra.fr/~alegarra/manualgs3_last.pdf
        Date: 2015
        Date accessed: January 20, 2015
        • Legarra A.
        • Robert-Granié C.
        • Croiseau P.
        • Guillaume F.
        • Fritz S.
        Improved Lasso for genomic selection.
        Genet. Res. (Camb.). 2011; 93 (21144129): 77-87
        • Lehermeier C.
        • Teyssèdre S.
        • Schön C.C.
        Genetic gain increases by applying the usefulness criterion with improved variance prediction in selection of crosses.
        Genetics. 2017; 207 (29038144): 1651-1661
        • Ma L.
        • O'Connell J.R.
        • VanRaden P.M.
        • Shen B.
        • Padhi A.
        • Sun C.
        • Bickhart D.M.
        • Cole J.B.
        • Null D.J.
        • Liu G.E.
        • Da Y.
        • Wiggans G.R.
        Cattle sex-specific recombination and genetic control from a large pedigree analysis.
        PLoS Genet. 2015; 11 (26540184): e1005387
        • Meuwissen T.H.E.
        • Hayes B.J.
        • Goddard M.E.
        Prediction of total genetic value using genome-wide dense marker maps.
        Genetics. 2001; 157 (11290733): 1819-1829
        • Mrode R.A.
        Linear Models for the Prediction of Animal Breeding Values.
        2nd ed. CAB International, Wallingford, UK. 2005
        • Müller D.
        • Schopp P.
        • Melchinger A.E.
        Selection on expected maximum haploid breeding values can increase genetic gain in recurrent genomic selection.
        G3 (Bethesda). 2018; 8 (29434032): 1173-1181
        • Nejati-Javaremi A.
        • Smith C.
        • Gibson J.P.
        Effect of total allelic relationship on accuracy of evaluation and response to selection.
        J. Anim. Sci. 1997; 75 (9222829): 1738-1745
        • Pérez-Enciso M.
        • Forneris N.
        • de Los Campos G.
        • Legarra A.
        Evaluating sequence-based genomic prediction with an efficient new simulator.
        Genetics. 2017; 205 (27913617): 939-953
        • Sargolzaei M.
        • Schenkel F.
        QMSim: A large-scale genome simulator for livestock.
        Bioinformatics. 2009; 25 (19176551): 680-681
        • Schaeffer L.R.
        Strategy for applying genome-wide selection in dairy cattle.
        J. Anim. Breed. Genet. 2006; 123 (16882088): 218-223
        • Segelke D.
        • Reinhardt F.
        • Liu Z.
        • Thaller G.
        Prediction of expected genetic variation within groups of offspring for innovative mating schemes.
        Genet. Sel. Evol. 2014; 46 (24990472): 42
        • Shen B.
        • Jiang J.
        • Seroussi E.
        • Liu G.E.
        • Ma L.
        Characterization of recombination features and the genetic basis in multiple cattle breeds.
        BMC Genomics. 2018; 19 (29703147): 304
        • Shepherd R.K.
        • Meuwissen T.H.
        • Woolliams J.A.
        Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers.
        BMC Bioinformatics. 2010; 11 (20969788): 529
        • Sonesson A.K.
        • Meuwissen T.H.E.
        Mating schemes for optimum contribution selection with constrained rate of inbreeding.
        Genet. Sel. Evol. 2000; 32 (14736390): 231-248
        • Sun C.
        • VanRaden P.M.
        Increasing long-term response by selecting for favorable minor alleles.
        PLoS One. 2014; 9 (24505495): e88510
        • Thaller G.
        • Krämer W.
        • Winter A.
        • Kaupe B.
        • Erhardt G.
        • Fries R.
        Effects of DGAT1 variants on milk production traits in German cattle breeds.
        J. Anim. Sci. 2003; 81 (12926772): 1911-1918
        • Uemoto Y.
        • Sasaki S.
        • Kojima T.
        • Sugimoto Y.
        • Watanabe T.
        Impact of QTL minor allele frequency on genomic evaluation using real genotype data and simulated phenotypes in Japanese Black cattle.
        BMC Genet. 2015; 16 (26586567): 134
        • Van Belle G.
        • Martin D.C.
        Sample size as a function of coefficient of variation and ratio of means.
        Biometrics. 1993; 58: 612-620
        • VanRaden P.M.
        Efficient methods to compute genomic predictions.
        J. Dairy Sci. 2008; 91 (18946147): 4414-4423
        • VanRaden P.M.
        • O'Connell J.R.
        • Wiggans G.R.
        • Weigel K.A.
        Genomic evaluations with many more genotypes.
        Genet. Sel. Evol. 2011; 43 (21366914): 10
        • White S.L.
        • Bertrand J.A.
        • Wade M.R.
        • Washburn S.P.
        • Green Jr., J.T.
        • Jenkins T.C.
        Comparison of fatty acid content of milk from Jersey and Holstein cows consuming pasture or a total mixed ration.
        J. Dairy Sci. 2001; 84 (11699461): 2295-2301
        • Wright S.
        Evolution in Mendelian populations.
        Genetics. 1931; 16 (17246615): 97-159