Research Article | Volume 98, Issue 12, P9026-9034, December 2015

# Single-step genomic model improved reliability and reduced the bias of genomic predictions in Danish Jersey

Open Access. Published: September 30, 2015

## Abstract

A bias in the trend of genomic estimated breeding values (GEBV) was observed in the Danish Jersey population, where the trend of GEBV was smaller than that of the deregressed proofs for individuals in the validation population. This study attempted to improve the prediction reliability and reduce the bias of predicted genetic trend in Danish Jersey. The data consisted of 1,238 Danish Jersey bulls and 611,695 cows. All bulls were genotyped with the 54K chip, and 1,744 cows were genotyped with either 7K chips (1,157 individuals) or 54K chips (587 individuals). The trait used in the analysis was protein yield. All cows with EBV were used in a single-step approach. Deregressed proofs were used as the response variable. Four alternative approaches were compared with a genomic best linear unbiased prediction (GBLUP) model with bulls in the reference data (GBLUPBull): (1) GBLUP with both bulls and genotyped cows in the reference data; (2) GBLUP including a year of birth effect; (3) GEBV from a GBLUP model that accounted for the difference of EBV between dams and maternal grandsires; and (4) a single-step approach. The results indicated that all 4 alternatives could reduce the bias of predicted genetic trend and that the single-step approach performed best. However, not all of these approaches improved reliability or reduced inflation of GEBV. The reliability was 0.30 and the regression coefficient of deregressed proofs on GEBV was 0.69 in the scenario GBLUPBull. When genotyped cows were included in the reference population, the regression coefficient decreased to 0.60 but the reliability increased to 0.35. If a year effect was included in the model, the prediction reliability changed little (0.31) and the regression coefficient improved to 0.75. The method in which GEBV were adjusted for the difference between dam EBV and maternal grandsire EBV led to a much lower regression coefficient, though the reliability increased to 0.40. The single-step approach improved both the reliability (up to 0.38) and the regression coefficient (up to 0.78). Therefore, the bias in genetic trend was reduced. The results suggest that implementing the single-step approach is an effective way to improve genomic prediction in Danish Jersey cattle.

## Introduction

Genomic prediction has been widely used in dairy cattle since genome-wide dense marker chips became available. To obtain accurate prediction, a large reference population is needed (Goddard and Hayes, "Mapping genes for complex traits in domestic animals and their use in breeding programmes"; Hayes et al., "Invited review: Genomic selection in dairy cattle: progress and challenges"). In dairy cattle, progeny-tested bulls are usually used to form the reference population. In some large populations, such as Holsteins, accurate prediction using genomic information has been obtained (Van Tassell et al., "Invited review: Reliability of genomic predictions for North American Holstein bulls"; Lund et al., "A common reference population from four European Holstein populations increases reliability of genomic predictions"). For Danish Jerseys it is quite challenging to obtain a large reference population because a limited number of progeny-tested bulls are available (Thomasen et al., "Reliabilities of genomic estimated breeding values in Danish Jersey"). One way to overcome this limitation is to add genotyped cows to the reference population. However, previous studies have reported an inflation of the genomic estimated breeding values (GEBV) when cows were included in the training set (Wiggans et al., "Technical note: Adjustment of traditional cow evaluations to improve accuracy of genomic predictions"; Calus et al., "Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies"), because the genotyped cows are usually elite and possibly receive preferential treatment. Another strategy is to make use of the phenotypic information from nongenotyped animals. A popular approach is to apply a single-step model, which estimates genomic breeding values using the information of genotyped and nongenotyped individuals simultaneously by integrating marker- and pedigree-based relationship matrices into a joint relationship matrix (Misztal et al., "Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information"; Christensen and Lund, "Genomic prediction when some animals are not genotyped"; Aguilar et al., "Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score").
Nordic routine genomic evaluation has shown a bias of predicted genetic trends in Danish Jerseys. Bias of predicted genetic trends was defined as the annual deviation of GEBV from the deregressed proofs (DRP) of the animals in the test population. Bias of predicted genetic trends may lead to an unfair comparison of animals across birth years. The bias could be caused by a discrepancy between the assumptions of the genomic prediction models and the selection histories of the practical populations (Vitezica et al., "Bias in genomic predictions for populations under selection"). The genomic prediction models assume that there is no selection in the population used for implementing genomic prediction (Hayes et al., "Increased accuracy of artificial selection by using the realized relationship matrix"). However, in practice, the genotyped populations usually consist of selected animals such as progeny-tested bulls and elite cows. The single-step approach accounts for the selection by including all records in the model. Therefore, this approach is expected to minimize the bias. Another possible solution to reduce the bias is to add a year of birth effect to the model, which may lead to a robust estimation of genetic trend (Ducrocq, V. 2010. Sustainable dairy cattle breeding: Illusion or reality. In Proc. 9th World Congr. Genet. Appl. Livest. Prod.). In this way, the genetic progress on the maternal side could be taken into account by the year trend. Similarly, adjusting GEBV for the difference between EBV of dam and maternal grandsire (MGS) may reduce bias of predicted genetic trend.
The first objective of our study was to investigate the prediction reliability and bias of predicted genetic trend in Danish Jersey. The second objective was to increase prediction reliability and reduce bias of predicted genetic trend using various strategies: adding genotyped cows to the reference population, including a year effect in the prediction model, accounting for the difference of EBV between dam and MGS, and applying a single-step approach.

## Materials and Methods

### Data

Danish Jersey data were used in our study. There were 2,982 genotyped individuals comprising 1,238 bulls born between 1981 and 2009 and 1,744 cows born between 2000 and 2011, with most of them (1,733) born after 2004. Most cows (1,157) were randomly selected from a few herds, whereas the others (587) were selected as potential bull dams by individual farms according to their own breeding schemes. The DRP of protein used in different scenarios were calculated from EBV of the genetic evaluation in November 2013. When using the single-step approach, all cows with EBV for protein were used in the analysis. After tracing the pedigree back as many generations as possible for the cows with EBV and bulls with genotypes, the pedigree used for single-step prediction included 819,988 individuals. The DRP for all cows were calculated using Mix99 (Lidauer and Strandén, "Fast and flexible program for genetic evaluation in dairy cattle"; Strandén and Mäntysaari, "A recipe for multiple trait deregression"), which required that the cows had an effective record contribution (ERC) larger than 0.1. This reduced the number of cows with DRP to 611,695. Cows that were daughters of the test bulls (described later) were excluded. After filtering, the number of cows with DRP used in the single-step approach was 577,405.
The bulls were genotyped with the Illumina BovineSNP50 BeadChip (54K; Illumina, San Diego, CA), which includes 54,001 SNP. Bull dams (587) were genotyped with 54K chips. Randomly selected cows (1,157) were genotyped with the Illumina BovineLD BeadChip (LD), which includes 6,909 SNP. The LD data were imputed to 54K with Beagle (Browning and Browning, "A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals") using the 54K genotyped animals as the imputation reference population. The markers used for prediction were from the 29 autosomes. The genotypes for genomic prediction were edited by deleting the markers with minor allele frequency less than 0.01 and the markers in complete linkage disequilibrium ($r^2 = 1$) with the previous marker. After editing, 38,967 markers were used for genomic prediction.
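The marker editing described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: genotypes are assumed to be coded as 0/1/2 allele counts with markers in map order, "previous marker" is interpreted as the most recently retained marker, and the function name `edit_markers` is hypothetical.

```python
import numpy as np

def edit_markers(genotypes, maf_min=0.01):
    """Return indices of markers kept after MAF and complete-LD editing.

    genotypes: (n_animals, n_markers) array of allele counts coded 0/1/2,
    with columns in map order (an assumption of this sketch).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    freq = genotypes.mean(axis=0) / 2.0      # allele frequency per marker
    maf = np.minimum(freq, 1.0 - freq)       # minor allele frequency
    kept = []
    prev = None                              # most recently retained marker
    for j in np.flatnonzero(maf >= maf_min):
        if prev is not None:
            r = np.corrcoef(genotypes[:, prev], genotypes[:, j])[0, 1]
            if np.isclose(r * r, 1.0):       # complete LD with previous marker
                continue
        kept.append(int(j))
        prev = j
    return kept
```

Comparing each marker only with the last retained one keeps the pass linear in the number of markers. Whether the original editing compared against the previous retained marker or the previous marker on the map is not stated in the text.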

### Methods

To validate the prediction accuracy and unbiasedness, the Jersey bulls were divided into reference and test sets using a cut-off date of birth of January 1, 2005. The bulls born after this date were used as validation animals (208 bulls). Thus, in the scenario using only bull reference data, 1,030 bulls were used as reference population.
Besides the genomic BLUP model (GBLUP) with bulls in the reference data (GBLUPBull), 5 alternative approaches were used in our study. Approach 1 used pedigree relationships to weight the genomic relationship matrix (GBLUPWBull). Approach 2 was the GBLUP model with both bulls and genotyped cows in the reference set (GBLUPCow), in which 25 cows were dams of test bulls. Approach 3 included a year of birth effect in the GBLUP model (GBLUPYear) to account for the part of genetic trend that is not accounted for by SNP markers. Approach 4 was to adjust GEBV using the difference of EBV between dams and maternal grandsires (GBLUPDam_mgs). Approach 5 was a single-step method to integrate the information of genotyped and nongenotyped animals for genomic prediction. Two scenarios of this approach were investigated: prediction using cow genotypes (SSPG) and prediction without cow genotypes (SSP). The numbers of individuals used in the reference population and test population in different scenarios are shown in Table 1.
Table 1. The number of individuals in each scenario¹

| Item | GBLUPBull / GBLUPWBull / GBLUPYear / GBLUPDam_mgs | GBLUPCow | SSP: no. of genotyped animals | SSP: no. of phenotyped animals | SSPG: no. of genotyped animals | SSPG: no. of phenotyped animals |
|---|---|---|---|---|---|---|
| Reference set | 1,030 | 2,774 | 1,030 | 577,405 | 2,774 | 577,405 |
| Test set | 208 | 208 | 208 | 208 | 208 | 208 |

¹ GBLUPBull = genomic BLUP model with bulls as reference population; GBLUPWBull = same as GBLUPBull but with a genomic relationship matrix $G_\omega = 0.8G + 0.2A$, where G is a genomic relationship matrix and A is the pedigree relationship matrix; GBLUPCow = GBLUP model with both genotyped bulls and cows as reference population; GBLUPYear = year effects were included in the model as genetic trend; GBLUPDam_mgs = genomic EBV from the GBLUP model using bull reference data were adjusted for the difference between dam EBV and maternal grandsire (mgs) EBV; SSP = the single-step approach using phenotypes of all cows and genotypes of genotyped bulls; SSPG = the single-step approach using phenotypes of all cows and genotypes of genotyped bulls and cows.

### Statistical Models

The statistical models used in different scenarios are described below.

#### GBLUP

The GBLUP model was
$y = \mathbf{1}\mu + Zg + e,$

where y is a vector of DRP of animals in the reference population; μ is the overall mean; g is the vector of direct genomic values; Z is the design matrix linking g to y; and e is a vector of random residuals. Random effects were assumed to be distributed as $g \sim N(0, G_\omega\sigma_g^2)$ and $e \sim N(0, D\sigma_e^2),$ where $\sigma_g^2$ is the additive genetic variance, $G_\omega$ is the genomic relationship matrix, $\sigma_e^2$ is the residual variance, and D is a diagonal matrix with elements $d_{ii} = (1 - r_{DRP}^2)/r_{DRP}^2,$ in which $r_{DRP}^2$ is the reliability of DRP. The GEBV was calculated as $GEBV = \hat{\mu} + \hat{g}.$ The genomic relationship matrix, $G_\omega$, is defined as
$G_\omega = \omega A + (1 - \omega)G,$

where G is the genomic relationship matrix described in "Efficient methods to compute genomic predictions," and A is the pedigree relationship matrix. In scenarios GBLUPBull and GBLUPCow, ω = 0. In scenario GBLUPWBull, ω = 0.2.
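As a minimal sketch of how the pieces of this model fit together, the following builds the blended matrix, the residual weights, and the BLUP solutions. It is not the DMU implementation used in the study: the variance components are assumed to be known rather than estimated, every animal is assumed to have exactly one DRP (so Z is an identity matrix), and the function names are hypothetical.

```python
import numpy as np

def blend_relationship(G, A, omega):
    """Blended matrix G_omega = omega * A + (1 - omega) * G."""
    return omega * np.asarray(A, float) + (1.0 - omega) * np.asarray(G, float)

def residual_weights(rel_drp):
    """Diagonal of D: d_ii = (1 - r2_DRP) / r2_DRP."""
    rel_drp = np.asarray(rel_drp, float)
    return (1.0 - rel_drp) / rel_drp

def gblup(y, rel_drp, G_omega, var_g, var_e):
    """Solve y = 1*mu + g + e with g ~ N(0, G_omega*var_g), e ~ N(0, D*var_e).

    Assumes Z = I and known variance components. Returns (mu_hat, g_hat).
    """
    y = np.asarray(y, float)
    G_omega = np.asarray(G_omega, float)
    ones = np.ones(y.size)
    # Phenotypic covariance of the DRP
    V = G_omega * var_g + np.diag(residual_weights(rel_drp)) * var_e
    # Generalized least squares estimate of the overall mean
    mu_hat = ones @ np.linalg.solve(V, y) / (ones @ np.linalg.solve(V, ones))
    # BLUP of the direct genomic values
    g_hat = var_g * G_omega @ np.linalg.solve(V, y - mu_hat)
    return mu_hat, g_hat
```

A DRP with high reliability gets a small `d_ii` and therefore a small residual variance, so it pulls the solution more strongly than a low-reliability DRP.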

#### GBLUP with Year Effect

When the year effect was included in the GBLUP model, the model was
$y = \mathbf{1}\mu + bX + Zg + e,$

where b is the regression coefficient of y on birth year, and X is a vector of birth years, treated as a continuous covariate in this model. The GEBV from GBLUPYear was calculated as $GEBV = \hat{\mu} + \hat{b} \times \text{year} + \hat{g}.$

#### Adjusting GEBV for the Difference of EBV Between Dams and MGS

In traditional genetic evaluations for an individual without own or offspring records, when both sire EBV and dam EBV are available, the EBV for the individual is
$EBV_o = \frac{1}{2}EBV_{sire} + \frac{1}{2}EBV_{dam}.$

When only the bulls’ EBV (sire EBV and maternal grandsire EBV) are available, the EBV for the individual is
$EBV_o = \frac{1}{2}EBV_{sire} + \frac{1}{4}EBV_{mgs},$

which implicitly assumes that the dam EBV equals the average EBV of all daughters of the maternal grandsire. This is not the case for bull dams, which have high EBV because of selection. The difference between $EBV_{mgs}$ and $EBV_{dam}$ may cause an underestimation of the GEBV of candidates when dams are absent from the reference population. To reduce the influence of this difference, the GEBV for the validation animals were corrected by adding a value of $\frac{1}{2}EBV_{dam} - \frac{1}{2}EBV_{mgs}.$
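The two pedigree indices and the correction term can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and the numbers used with them are made-up EBV expressed as deviations from a base, not values from the study.

```python
def pedigree_index_sire_dam(ebv_sire, ebv_dam):
    """EBV of an animal without records when sire and dam EBV are known."""
    return 0.5 * ebv_sire + 0.5 * ebv_dam

def pedigree_index_sire_mgs(ebv_sire, ebv_mgs):
    """EBV of an animal without records when only sire and MGS EBV are known."""
    return 0.5 * ebv_sire + 0.25 * ebv_mgs

def adjust_gebv_for_dam(gebv, ebv_dam, ebv_mgs):
    """Add the correction 1/2*EBV_dam - 1/2*EBV_mgs to a candidate's GEBV."""
    return gebv + 0.5 * ebv_dam - 0.5 * ebv_mgs
```

For a hypothetical candidate with sire EBV 10, dam EBV 12, and MGS EBV 8, the sire-dam index is 11 and the sire-MGS index is 7; adding the correction to a GEBV of 7 gives 9. The adjustment is largest for candidates whose dams are far above their own sires, which is the typical case for selected bull dams.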

#### Single-Step Model

The single-step model was as follows:
$y = \mathbf{1}\mu + Za + e,$

where y is the vector of DRP of all the cows with EBV in the whole population, a is a vector of additive genetic effects, and Z is the design matrix for additive genetic effects. Random effects were assumed to be normally distributed, $a \sim N(0, H\sigma_a^2)$ and $e \sim N(0, D\sigma_e^2),$ where $\sigma_a^2$ is the additive genetic variance and H is the relationship matrix of all the individuals as defined below. Here the reliability of DRP, $r_{DRP}^2$, was ERC/(ERC + λ), where $\lambda = (1 - h^2)/h^2.$
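If D is assumed to be defined as in the GBLUP model above, that is, with diagonal elements $(1 - r_{DRP}^2)/r_{DRP}^2$, then substituting the ERC-based reliability gives a simple form for the residual weights. The derivation below is a worked consequence of the stated formulas, not an additional result from the study.

```latex
d_{ii} = \frac{1 - r_{DRP}^2}{r_{DRP}^2}
       = \frac{\lambda/(ERC_i + \lambda)}{ERC_i/(ERC_i + \lambda)}
       = \frac{\lambda}{ERC_i},
\qquad \lambda = \frac{1 - h^2}{h^2}.
```

As a purely hypothetical example, $h^2 = 0.25$ gives $\lambda = 3$, so a cow with $ERC = 3$ has $r_{DRP}^2 = 0.5$ and $d_{ii} = 1$. The heritability used for protein is not reported in this section.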
Following Legarra et al. ("A relationship matrix including full pedigree and genomic information"), Aguilar et al. ("Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score"), and Christensen and Lund ("Genomic prediction when some animals are not genotyped"),
$H = \begin{bmatrix} A_{12}A_{22}^{-1}G_\omega A_{22}^{-1}A_{21} + A_{11} - A_{12}A_{22}^{-1}A_{21} & A_{12}A_{22}^{-1}G_\omega \\ G_\omega A_{22}^{-1}A_{21} & G_\omega \end{bmatrix},$

where A is the pedigree relationship matrix and can be partitioned as $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ with subscript 1 for nongenotyped individuals and 2 for genotyped individuals, and $G_\omega = (1 - \omega)G + \omega A_{22}$. In our study the G matrix was adjusted for the differences in location and scale relative to the pedigree-based relationship matrix ($A_{22}$) using the method proposed by Christensen et al. ("Single-step methods for genomic evaluation in pigs"). Furthermore, ω was set to 0.2 according to the study by Gao et al. ("Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population").
The inverse of H (Aguilar et al., "Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score"; Christensen and Lund, "Genomic prediction when some animals are not genotyped") was
$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G_\omega^{-1} - A_{22}^{-1} \end{bmatrix}.$

GEBV was calculated as $GEBV = \hat{\mu} + \hat{a}.$ In all the models, the DMU package (Su et al., "DMU - A package for analyzing multivariate mixed models") was used to estimate variance components and predict breeding values.
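The relationship between H and its inverse can be checked numerically on a toy pedigree. The sketch below uses dense NumPy matrices purely for illustration: a real evaluation of this size would build the sparse inverse of A directly from the pedigree rather than inverting A, and the function names and the ordering convention (nongenotyped animals first, then genotyped animals in the order given) are assumptions of this example.

```python
import numpy as np

def _split(A, genotyped):
    """Return indices of nongenotyped (1) and genotyped (2) animals."""
    g = np.asarray(genotyped)
    n = np.setdiff1d(np.arange(A.shape[0]), g)
    return n, g

def h_matrix(A, G_omega, genotyped):
    """Joint relationship matrix H, ordered nongenotyped then genotyped."""
    A = np.asarray(A, float)
    G_omega = np.asarray(G_omega, float)
    n, g = _split(A, genotyped)
    A11, A12 = A[np.ix_(n, n)], A[np.ix_(n, g)]
    A21, A22 = A[np.ix_(g, n)], A[np.ix_(g, g)]
    B = A12 @ np.linalg.inv(A22)               # A12 * A22^-1
    H11 = A11 - B @ A21 + B @ G_omega @ B.T    # B.T equals A22^-1 * A21
    H12 = B @ G_omega
    return np.block([[H11, H12], [H12.T, G_omega]])

def h_inverse(A, G_omega, genotyped):
    """H^-1 = A^-1 + [[0, 0], [0, G_omega^-1 - A22^-1]], same ordering."""
    A = np.asarray(A, float)
    n, g = _split(A, genotyped)
    order = np.concatenate([n, g])
    A_ord = A[np.ix_(order, order)]
    Hinv = np.linalg.inv(A_ord)
    k = n.size
    # Only the genotyped block differs from the pedigree-based inverse
    Hinv[k:, k:] += np.linalg.inv(G_omega) - np.linalg.inv(A_ord[k:, k:])
    return Hinv
```

The practical appeal of the inverse form is that only the genotyped block needs dense matrix inversions; the rest of $H^{-1}$ is the ordinary pedigree-based inverse.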

### Validation of Predictions

The reliability of predictions was calculated as the squared correlation between GEBV and DRP divided by the average reliability of the DRP in the test set. The bias was investigated by the regression coefficient and intercept of DRP corrected with model mean on estimated genetic effects (the year trend was added to the direct genomic values in scenario GBLUPYear) and predicted genetic trend. Bias of predicted genetic trends was assessed by comparing year mean of GEBV with year mean of DRP for test individuals.
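The validation statistics described above can be sketched as follows. This is an illustration under simplifying assumptions rather than the authors' code: DRP are regressed directly on GEBV (the correction of DRP for the model mean is omitted), and the function names are hypothetical.

```python
import numpy as np

def validate(gebv, drp, rel_drp):
    """Return (reliability, regression coefficient, intercept) for test animals.

    Reliability is the squared correlation between GEBV and DRP divided by
    the average reliability of the DRP; slope and intercept come from the
    regression of DRP on GEBV.
    """
    gebv = np.asarray(gebv, float)
    drp = np.asarray(drp, float)
    r = np.corrcoef(gebv, drp)[0, 1]
    reliability = r * r / np.mean(rel_drp)
    slope, intercept = np.polyfit(gebv, drp, 1)
    return float(reliability), float(slope), float(intercept)

def yearly_bias(gebv, drp, birth_year):
    """Mean GEBV minus mean DRP of the test animals, per birth year."""
    gebv = np.asarray(gebv, float)
    drp = np.asarray(drp, float)
    birth_year = np.asarray(birth_year)
    return {
        int(y): float(gebv[birth_year == y].mean() - drp[birth_year == y].mean())
        for y in np.unique(birth_year)
    }
```

A regression coefficient below 1 indicates inflated GEBV, whereas a negative yearly bias indicates that GEBV lie below the DRP for that birth year.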

## Results

Descriptive statistics of DRP in different data sets are shown in Table 2. The mean DRP differ because individuals in different data sets were born in different periods. The average DRP of genotyped bulls was lower than the average DRP of the genotyped cows, whereas it was higher than the average of all the cows used in the single-step approach. This reflects genetic progress over years due to selection and the fact that genotyped cows were born in recent years.
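Table 2. Mean and SD of deregressed proofs (DRP) and reliability ($R^2_{DRP}$) of DRP for protein in different data sets

| Trait | Genotyped bulls: mean | Genotyped bulls: SD | Genotyped cows: mean | Genotyped cows: SD | Cows used in single-step approach: mean | Cows used in single-step approach: SD |
|---|---|---|---|---|---|---|
| Protein | 89.29 | 13.95 | 106.62 | 18.70 | 78.89 | 29.09 |
| $R^2_{DRP}$ | 0.92 | 0.04 | 0.44 | 0.07 | 0.36 | 0.06 |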
The number of test individuals in each year varied from 43 to 55, except for year 2009, in which there were only 16 test individuals (Table 3). The mean of the DRP in each year varied from 103.06 to 109.08, and the standard deviation varied from 6.56 to 8.88.
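Table 3. The number of individuals and mean and SD of deregressed proofs in each year in the test set

| Year | 2005 | 2006 | 2007 | 2008 | 2009 |
|---|---|---|---|---|---|
| No. | 46 | 48 | 55 | 43 | 16 |
| Mean | 103.06 | 103.06 | 104.15 | 105.73 | 109.08 |
| SD | 7.09 | 7.28 | 8.42 | 8.88 | 6.56 |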
The reliabilities of GEBV, as well as the regression coefficients and intercepts of DRP on GEBV for different scenarios, are shown in Table 4. The reliabilities ranged from 0.29 to 0.40 in different scenarios. The reliability of GEBV from the basic model (GBLUPBull) was 0.30. In scenario GBLUPCow, the reliability of GEBV increased to 0.35. The scenarios of the single-step approach gained a large increase in reliability regardless of whether cow genotypes were included. The reliability was 0.38 for scenario SSPG, whereas it was 0.36 for scenario SSP. The highest reliability (0.40) was achieved in the scenario GBLUPDam_mgs. The reliability increased by 1 percentage point in the scenario GBLUPYear. However, in scenario GBLUPWBull, the reliability decreased by 1 percentage point. The regression coefficients varied from 0.58 to 0.78 in different scenarios. The regression coefficient from the basic model (GBLUPBull) was 0.69. The regression coefficients increased to 0.72, 0.75, 0.74, and 0.78 in scenarios GBLUPWBull, GBLUPYear, SSPG, and SSP, respectively. The regression coefficient in scenario GBLUPCow was 0.09 lower than in the scenario GBLUPBull. The regression coefficient was the lowest in the scenario GBLUPDam_mgs. The intercepts for the different scenarios were much larger than 0, which indicated that the mean of GEBV was lower than the mean of DRP.
Table 4. Reliabilities ($R^2_{GEBV}$; Rel.) of genomic EBV (GEBV), regression coefficient (Reg. coef.), and intercept (Int.) of deregressed proofs on GEBV of test individuals in different scenarios¹

| Item | GBLUPBull | GBLUPWBull | GBLUPCow | GBLUPYear | GBLUPDam_mgs | SSP | SSPG |
|---|---|---|---|---|---|---|---|
| Rel. | 0.30 | 0.29 | 0.35 | 0.31 | 0.40 | 0.36 | 0.38 |
| Reg. coef. | 0.69 | 0.72 | 0.60 | 0.75 | 0.58 | 0.78 | 0.74 |
| Int. | 6.00 | 6.55 | 7.97 | 5.90 | 3.79 | 5.88 | 6.92 |

¹ GBLUPBull = genomic BLUP model with bulls as reference population; GBLUPWBull = same as GBLUPBull but with a genomic relationship matrix $G_\omega = 0.8G + 0.2A$, where G is the genomic relationship matrix and A is the pedigree relationship matrix; GBLUPCow = GBLUP model with both genotyped bulls and cows as reference population; GBLUPYear = year effects were included in the model as genetic trend; GBLUPDam_mgs = GEBV from the GBLUP model using bull reference data were adjusted for the difference between dam EBV and maternal grandsire EBV; SSP = the single-step approach using phenotypes of all cows and genotypes of genotyped bulls; SSPG = the single-step approach using phenotypes of all cows and genotypes of genotyped bulls and cows.
The trends of GEBV and DRP for genotyped bulls are shown in Figure 1. Bias of predicted genetic trend was observed in the scenario GBLUPBull. The difference between DRP and GEBV from GBLUPBull was around 5, which was statistically significant. Compared with scenario GBLUPBull, all the alternative approaches except GBLUPWBull reduced bias of predicted genetic trend to some extent. Bias of predicted genetic trend was partly corrected in the scenario GBLUPCow. Scenarios SSP and SSPG greatly reduced bias of predicted genetic trend. Scenario GBLUPDam_mgs also reduced bias of predicted genetic trend. Bias of predicted genetic trend was reduced slightly in scenario GBLUPYear. Figure 2 shows boxplots of GEBV − DRP for each scenario in each birth year.

## Discussion

Our study investigated strategies to improve the prediction reliability and reduce bias of predicted genetic trend observed in the Danish Jersey population. Several strategies were tested; that is, including cows in the reference, including a year of birth effect in the prediction model, adjusting GEBV with the difference between dam EBV and MGS EBV, and using a single-step approach. The results showed that these strategies could reduce bias of predicted genetic trend to some extent. However, the prediction reliability and regression coefficients did not consistently improve in parallel with the reduction in the bias of predicted genetic trend.
The regression coefficients in different scenarios were smaller than 1 in our study. One possible reason could be that markers were not in complete linkage disequilibrium with causal genes, and thus could not fully account for the total genetic variance. Another reason could be that the data used in the analysis were not a random sample, but selected data.
The reliability of prediction for protein was improved when the genotyped cows were included in the reference, as it clearly enlarged the size of the reference population, which is the most important factor affecting prediction reliability (Goddard and Hayes, "Mapping genes for complex traits in domestic animals and their use in breeding programmes"). However, we observed that GEBV were more inflated when both genotyped cows and genotyped bulls were used as the reference population. Inflation may be caused by preferential treatment of cows included in the reference population (Wiggans et al., "Technical note: Adjustment of traditional cow evaluations to improve accuracy of genomic predictions"; Kuhn et al., "Potential biases in predicted transmitting abilities of females from preferential treatment"). The results from our study were consistent with the results reported by Wiggans et al. ("Technical note: Adjustment of traditional cow evaluations to improve accuracy of genomic predictions"); in their study, the regression coefficients of DRP on GEBV of protein decreased from 0.86 to 0.83 when the cows were added to the reference population. On the other hand, bias of predicted genetic trend was reduced when genotyped cows were included in the reference population. The reason could be that the cows that were bull dams and sibs of test bulls may account for the contribution of the test bulls’ dams to the bulls.
Previous studies reported that including a polygenic effect in a SNP-BLUP model or Bayesian model led to less inflation of GEBV (Solberg et al., "Persistence of accuracy of genome-wide breeding values over generations when including a polygenic effect"; Liu et al., "Impacts of both reference population size and inclusion of a residual polygenic effect on the accuracy of genomic prediction"; Su et al., "Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances"). The regression coefficient was improved from 0.69 to 0.72 in the current study when the polygenic effect was included in the model (GBLUPWBull). However, the bias of predicted genetic trend was not reduced compared with the model without polygenic effect.
The single-step model used cows’ deregressed EBV as a response variable rather than the raw phenotypic data. However, the effect of genomic preselection is minor in the current Jersey data and the deregressed cow EBV should not be biased at all. Therefore, the results could be considered equivalent to single-step prediction using raw data. However, if genomic preselection is used in breeding schemes, the EBV estimated using pedigree will be biased. In this case, it is better to use raw data as a response variable in the single-step approach. The single-step prediction, which used all the females’ DRP and pedigree as well as genotypes from genotyped bulls and cows, increased the reliability and reduced inflation of GEBV and bias of predicted genetic trend. These results were consistent with previous reports (Vitezica et al., "Bias in genomic predictions for populations under selection"; Koivula et al., "Single step genomic evaluations for the Nordic Red Dairy cattle test day data"; Su et al., "Genomic prediction for Nordic Red Cattle using one-step and selection index blending"). As DRP of nongenotyped animals also contribute to the prediction through a combined matrix, the prediction reliability was improved. Moreover, single-step models could reduce bias of predicted genetic trend by including all the records to trace selection (Vitezica et al., "Bias in genomic predictions for populations under selection"). Similar to a GBLUP model including genotyped cows in the reference data, the regression coefficient decreased when the cow genotypes were included in the single-step approach. The selection index blending (Van Tassell et al., "Invited review: Reliability of genomic predictions for North American Holstein bulls"; Su et al., "Genomic prediction for Nordic Red Cattle using one-step and selection index blending") with the same information used in the single-step approach without cow genotype data was compared with the single-step approach in our study (data not shown). The prediction reliability was 0.32, which was higher than the reliability of GEBV directly from GBLUPBull but lower than in scenario SSP, even though the information used in these 2 methods was the same. The bias of predicted genetic trend was corrected for individuals born in 2005 and 2006, but not for the individuals born after 2006 when the blending index was used. The genomic relationship matrix was blended with the pedigree relationship matrix in the single-step approach. Because the pedigree relationship influences the regression coefficient, the scenario GBLUPWBull was investigated for consistency. The results from GBLUPWBull showed that a GBLUP model with 20% of the pedigree relationship matrix neither increased the prediction reliability nor reduced bias of predicted genetic trend. However, the regression coefficients were improved by the weighted G matrix (from 0.69 in GBLUPBull to 0.72 in GBLUPWBull) and by the single-step approach (from 0.72 in GBLUPWBull to 0.78 in SSP). These results suggest that using a single-step method is an effective approach to increase the prediction reliability and reduce the bias of predicted genetic trend.
Including the year of birth effect reduced the bias of predicted genetic trend and improved the regression coefficients. The reason could be that the year effect partly accounted for the trend of selection among the dams. The GEBV together with the year effect captured the genetic progress across years, which led to a robust estimation of genetic trend (Ducrocq, V. 2010. Sustainable dairy cattle breeding: Illusion or reality. In Proc. 9th World Congr. Genet. Appl. Livest. Prod.).
The mean of GEBV adjusted for the difference between dam EBV and MGS EBV was much closer to the mean of DRP in the test population compared with the GEBV without adjustment. The reliability was improved greatly, which may have been caused by a possible autocorrelation between dam EBV and the progeny DRP. However, the regression coefficient deviated more from unity, which may have been caused by the preferential treatment of selected cows. Bias of predicted genetic trend was corrected at the cost of a large inflation of the GEBV. Therefore, it is not a good approach to correct for bias of predicted genetic trend.
The results from the current study indicate that the regression coefficient, which has mainly been used in previous studies (Verbyla et al., "Accuracy of genomic selection using stochastic search variable selection in Australian Holstein Friesian dairy cattle"; Su et al., "Comparison of genomic predictions using medium-density (~54,000) and high-density (~777,000) single nucleotide polymorphism marker panels in Nordic Holstein and Red Dairy Cattle populations"), should not be the only criterion to measure the unbiasedness of predictions. The regression coefficient is not always consistent with the bias of predicted genetic trends. As the prediction trend is important when individuals across generations are compared, it should also be included in the evaluation criteria. The year mean of DRP could be expressed as the year mean of GEBV times the regression coefficient plus the intercept. Therefore, the bias of predicted genetic trend could be predicted using the regression coefficient and intercept, and the intercept together with the regression coefficient should be given attention in genomic prediction.
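The link between the regression statistics and the trend bias can be written out explicitly. In the sketch below, $a$ and $b$ denote the intercept and regression coefficient of DRP on GEBV, and $t$ indexes birth year; these symbols are introduced here for illustration. The relation holds exactly for the overall means under least squares and only approximately for individual year means.

```latex
\begin{aligned}
\overline{DRP}_t &\approx a + b\,\overline{GEBV}_t, \\
\overline{GEBV}_t - \overline{DRP}_t &\approx (1 - b)\,\overline{GEBV}_t - a .
\end{aligned}
```

Under this approximation the yearly bias vanishes for every birth year only when $b = 1$ and $a = 0$, which is why a regression coefficient near 1 alone does not guarantee an unbiased trend.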

## Conclusions

The main reason for the bias of predicted genetic trend could be that the reference animals did not have all the information required to trace selection, especially the information of dams. Consequently, methods using more information related to selection can reduce the bias. The most efficient way is to implement a single-step approach for genomic prediction, as the single-step approach increased the prediction reliability, improved the regression coefficients, and led to an unbiased prediction trend. As bias of predicted genetic trends can be measured by the intercept and regression coefficient of observations on GEBV, both intercept and regression coefficients should be taken into consideration in validation of genomic predictions.

## Acknowledgments

This work was performed within the project “Genomics in herds,” funded by VikingGenetics (Randers, Denmark) and Nordic Cattle Genetic Evaluation (Aarhus, Denmark).

## References

• Aguilar I.
• Misztal I.
• Johnson D.L.
• Legarra A.
• Tsuruta S.
• Lawlor T.J.
Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score.
J. Dairy Sci. 2010; 93 (http://dx.doi.org/10.3168/jds.2009-2730): 743-752
• Browning B.L.
• Browning S.R.
A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals.
Am. J. Hum. Genet. 2009; 84 (http://dx.doi.org/10.1016/j.ajhg.2009.01.005): 210-223
• Calus M.P.L.
• de Haas Y.
• Veerkamp R.F.
Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies.
J. Dairy Sci. 2013; 96 (http://dx.doi.org/10.3168/jds.2012-6013): 6703-6715
• Christensen O.F.
• Lund M.S.
Genomic prediction when some animals are not genotyped.
Genet. Sel. Evol. 2010; 42 (http://dx.doi.org/10.1186/1297-9686-42-2): 2
• Christensen O.F.
• Nielsen B.
• Ostersen T.
• Su G.
Single-step methods for genomic evaluation in pigs.
Animal. 2012; 6 (http://dx.doi.org/10.1017/S1751731112000742): 1565-1571
• Ducrocq V.
Sustainable dairy cattle breeding: Illusion or reality.
in: Proc. 9th World Congr. Genet. Appl. Livest. Prod. 2010
• Gao H.
• Christensen O.F.
• Nielsen U.S.
• Zhang Y.
• Lund M.S.
• Su G.
Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population.
Genet. Sel. Evol. 2012; 44 (http://dx.doi.org/10.1186/1297-9686-44-8): 8
• Goddard M.E.
• Hayes B.J.
Mapping genes for complex traits in domestic animals and their use in breeding programmes.
Nat. Rev. Genet. 2009; 10 (http://dx.doi.org/10.1038/nrg2575): 381-391
• Hayes B.J.
• Bowman P.J.
• Chamberlain A.J.
• Goddard M.E.
Invited review: Genomic selection in dairy cattle: progress and challenges.
J. Dairy Sci. 2009a; 92 (http://dx.doi.org/10.3168/jds.2008-1646): 433-443
• Hayes B.J.
• Visscher P.M.
• Goddard M.E.
Increased accuracy of artificial selection by using the realized relationship matrix.
Genet. Res. (Camb.). 2009b; 91 (http://dx.doi.org/10.1017/S0016672308009981): 47-60
• Koivula M.
• Strandén I.
• Pösö J.
• Aamand G.P.
• Mäntysaari E.A.
Single step genomic evaluations for the Nordic Red Dairy cattle test day data.
Interbull Bull. 2012; 46: 115-120
• Kuhn M.T.
• Boettcher P.J.
• Freeman E.
Potential biases in predicted transmitting abilities of females from preferential treatment.
J. Dairy Sci. 1994; 77 (http://dx.doi.org/10.3168/jds.S0022-0302(94)77185-X): 2428-2437
• Legarra A.
• Aguilar I.
• Misztal I.
A relationship matrix including full pedigree and genomic information.
J. Dairy Sci. 2009; 92 (http://dx.doi.org/10.3168/jds.2009-2061): 4656-4663
• Lidauer M.
• Strandén I.
Fast and flexible program for genetic evaluation in dairy cattle.
Interbull Bull. 1999; 20: 19-24
• Liu Z.
• Seefried F.R.
• Reinhardt F.
• Rensing S.
• Thaller G.
• Reents R.
Impacts of both reference population size and inclusion of a residual polygenic effect on the accuracy of genomic prediction.
Genet. Sel. Evol. 2011; 43 (http://dx.doi.org/10.1186/1297-9686-43-19): 19
• Lund M.S.
• De Roos A.P.W.
• De Vries A.G.
• Druet T.
• Ducrocq V.
• Fritz S.
• Guillaume F.
• Guldbrandtsen B.
• Liu Z.
• Reents R.
• Schrooten C.
• Seefried F.
• Su G.
A common reference population from four European Holstein populations increases reliability of genomic predictions.
Genet. Sel. Evol. 2011; 43 (http://dx.doi.org/10.1186/1297-9686-43-43): 43
• Madsen P.
• Su G.
• Labouriau R.
• Christensen O.F.
DMU-A package for analyzing multivariate mixed models.
in: Proc. 9th World Congr. Genet. Appl. Livest. Prod., Leipzig, Germany. Gesellschaft für Tierzuchtwissenschaft e.V., Bonn, Germany. 2010: 732
• Misztal I.
• Legarra A.
• Aguilar I.
Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information.
J. Dairy Sci. 2009; 92 (http://dx.doi.org/10.3168/jds.2009-2064): 4648-4655
• Solberg T.R.
• Sonesson A.K.
• Woolliams J.A.
• Odegard J.
• Meuwissen T.H.E.
Persistence of accuracy of genome-wide breeding values over generations when including a polygenic effect.
Genet. Sel. Evol. 2009; 41 (http://dx.doi.org/10.1186/1297-9686-41-53): 53
• Strandén I.
• Mäntysaari E.A.
A recipe for multiple trait deregression.
Interbull Bull. 2010; 42: 21-24
• Su G.
• Brøndum R.F.
• Ma P.
• Guldbrandtsen B.
• Aamand G.P.
• Lund M.S.
Comparison of genomic predictions using medium-density (~54,000) and high-density (~777,000) single nucleotide polymorphism marker panels in Nordic Holstein and Red Dairy Cattle populations.
J. Dairy Sci. 2012a; 95 (http://dx.doi.org/10.3168/jds.2012-5379): 4657-4665
• Su G.
• Christensen O.F.
• Janss L.
• Lund M.S.
Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances.
J. Dairy Sci. 2014; 97 (http://dx.doi.org/10.3168/jds.2014-8210): 6547-6559
• Su G.
• Nielsen U.S.
• Mäntysaari E.A.
• Aamand G.P.
• Christensen O.F.
• Lund M.S.
Genomic prediction for Nordic Red Cattle using one-step and selection index blending.
J. Dairy Sci. 2012b; 95 (http://dx.doi.org/10.3168/jds.2011-4804): 909-917
• Thomasen J.R.
• Guldbrandtsen B.
• Su G.
• Brøndum R.F.
• Lund M.S.
Reliabilities of genomic estimated breeding values in Danish Jersey.
Animal. 2012; 6 (http://dx.doi.org/10.1017/S1751731111002035): 789-796
• VanRaden P.M.
Efficient methods to compute genomic predictions.
J. Dairy Sci. 2008; 91 (http://dx.doi.org/10.3168/jds.2007-0980): 4414-4423
• VanRaden P.M.
• Van Tassell C.P.
• Wiggans G.R.
• Sonstegard T.S.
• Schnabel R.D.
• Taylor J.F.
• Schenkel F.S.
Invited review: Reliability of genomic predictions for North American Holstein bulls.
J. Dairy Sci. 2009; 92 (http://dx.doi.org/10.3168/jds.2008-1514): 16-24
• Verbyla K.L.
• Hayes B.J.
• Bowman P.J.
• Goddard M.E.
Accuracy of genomic selection using stochastic search variable selection in Australian Holstein Friesian dairy cattle.
Genet. Res. (Camb.). 2009; 91 (http://dx.doi.org/10.1017/S0016672309990243): 307-311
• Vitezica Z.G.
• Aguilar I.
• Misztal I.
• Legarra A.
Bias in genomic predictions for populations under selection.
Genet. Res. (Camb.). 2011; 93 (http://dx.doi.org/10.1017/S001667231100022X): 357-366
• Wiggans G.R.
• Cooper T.A.