Advertisement
Research Article| Volume 98, ISSUE 2, P1296-1309, February 2015

Across-country test-day model evaluations for Holstein, Nordic Red Cattle, and Jersey

Open AccessPublished:November 27, 2014DOI:https://doi.org/10.3168/jds.2014-8307

      Abstract

      Three random regression models were developed for routine genetic evaluation of Danish, Finnish, and Swedish dairy cattle. Data included over 169 million test-day records with milk, protein, and fat yield observations from over 8.7 million dairy cows of all breeds. Variance component analyses showed significant differences in estimates between Holstein, Nordic Red Cattle, and Jersey, but only small to moderate differences within a breed across countries. The obtained variance component estimates were used to build, for each breed, their own set of covariance functions. The covariance functions describe the animal effects on milk, protein, and fat yields of the first 3 lactations as 9 different traits, assuming the same heritabilities and a genetic correlation of unity across countries. Only 15, 27, and 7 eigenfunctions with the largest eigenvalues were used to describe additive genetic animal effects and nonhereditary animal effects across lactations and within later lactations, respectively. These reduced-rank covariance functions explained 99.0 to 99.9% of the original variances but reduced the number of animal equations to be solved by 44%. Moderate rank reduction for nonhereditary animal effects and use of one-third-smaller measurement error correlations than obtained from variance component estimation made the models more robust against extreme observations. Estimation of the genetic levels of the countries’ subpopulations within a breed was found sensitive to the way the breed effects were modeled, especially for the genetically heterogeneous Nordic Red Cattle. Means to ensure that only additive genetic effects entered the estimated breeding values were to describe the crossbreeding effects by fixed and random cofactors and the calving age effect by an age × breed proportion interaction, and to model phantom parent groups as random effects. To ensure that genetic variances were the same across the 3 countries in breeding value estimation, as suggested by the variance component estimates, the applied multiplicative heterogeneous variance adjustment method had to be tailored using country-specific reference measurement error variances. Results showed the feasibility of across-country genetic evaluation of cows and sires based on original test-day phenotypes. Nevertheless, applying a thorough model validation procedure is essential throughout the model building process to obtain reliable breeding values.

      Key words

      Introduction

      The first random regression test-day model (RRM) for national genetic evaluation of dairy cattle was adopted at the onset of the millennium (
      • Schaeffer L.R.
      • Jamrozik J.
      • Kistemaker G.J.
      • Van Doormaal B.J.
      Experience with a test-day model.
      ). Since then, RRM have become the models of choice for genetic evaluation of production traits in dairy cattle. The main arguments for RRM implementation are that it gives more statistical power for modeling the data and generates information on the change over time for the trajectories of the traits (
      • Jamrozik J.
      • Schaeffer L.R.
      • Dekkers J.C.M.
      Genetic evaluation of dairy cattle using test day yields and random regression model.
      ;
      • Swalve H.H.
      Theoretical basis and computational methods for different test-day genetic evaluation methods.
      ;
      • Lidauer M.
      • Mäntysaari E.A.
      • Strandén I.
      Comparison of test-day models for genetic evaluation of production traits in dairy cattle.
      ). As RRM is increasingly being adopted for genetic evaluation of dairy breeds, interest is growing in joint evaluations across countries (
      • Canavesi F.
      • Boichard D.
      • Ducrocq V.
      • Gengler N.
      • De Jong G.
      • Liu Z.
      Production traits European joint evaluation (PROTEJE). Proc. 2001 Interbull Mtg., Budapest, Hungary.
      ;
      • Emmerling R.
      • Lidauer M.
      • Mäntysaari E.A.
      Multiple lactation random regression test-day model for Simmental and Brown Swiss in Germany and Austria. Proc. 2002 Interbull Mtg., Interlaaken, Switzerland.
      ;
      • de Roos A.P.W.
      • Harbers A.G.F.
      • De Jong G.
      Random herd curves in a test-day model for milk, fat, and protein production of dairy cattle in the Netherlands.
      ), which adds further complexity to the applied RRM. In the Nordic countries, genetic material has been exchanged across borders for many decades, mainly in terms of semen but also embryos, leading to the initiative to develop a common genetic evaluation for Nordic dairy cattle.
      The first official joint Nordic evaluation for production traits was placed into use in 2006. The applied statistical model was based on a meta-model approach (

      Mäntysaari, E. A. 2006a. Meta-model combines traits with different genetic models and data structures. Abstract c24–01 in Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. Brazilian Society of Animal Breeding, Belo Horizonte, MG, Brazil.

      ) that incorporated the original national models and variance parameters but imposed a genetic correlation of unity across the countries. Thus, country-specific heritabilities and environmental effects were included for building the Nordic model (
      • Mäntysaari E.A.
      • Lidauer M.
      • Pösö J.
      • Madsen P.
      • Strandén I.
      • Pedersen J.
      • Nielsen U.S.
      • Johansson K.
      • Eriksson J.-Å.
      • Aamand G.P.
      Joint Nordic test day model: Variance components. Proc. 2006 Interbull Mtg., Kuopio, Finland.
      ;
      • Lidauer M.
      • Pedersen J.
      • Pösö J.
      • Mäntysaari E.A.
      • Strandén I.
      • Madsen P.
      • Nielsen U.S.
      • Johansson K.
      • Eriksson J.-Å.
      • Aamand G.P.
      Joint Nordic test day model: Evaluation model. Proc. 2006 Interbull Mtg., Kuopio, Finland.
      ). In principle, each country had its own national model for production traits: Danish test-day (TD) observations were modeled based on Danish RRM variance component estimates (
      • Jakobsen J.H.
      • Madsen P.
      • Jensen J.
      • Pedersen J.
      • Christensen L.G.
      • Sorensen D.A.
      Genetic parameters for milk production and persistency for Danish Holstein estimated in random regression models using REML.
      ), and Finnish TD observations were modeled as in previous Finnish RRM routine evaluation (
      • Lidauer M.
      • Mäntysaari E.A.
      • Strandén I.
      • Pösö J.
      Multiple-trait random regression test-day model for all lactations. Proc. 2000 Interbull Mtg., Bled, Slovenia.
      ). Swedish 305-d lactation yield observations were modeled using own environmental effects but with the additive genetic animal effect blended into the random regression (RR) covariance function (CF) for the additive genetic animal effect of the meta-model. The motivation for developing a meta-model was to achieve a model as good as or better than the one already implemented in each respective country. Since the first joint model, additional efforts have been made to harmonize the data from the participating countries as well as the models for these data. In 2008, Swedish 305-d yield observations were replaced by TD observations.
      Whereas joint Nordic evaluation enhanced across-country dairy cattle breeding activities, the complexity of the models was difficult to explain to breeders. This initiated research to identify heterogeneity in the data, which the meta-model needs to account for, and to harmonize modeling across countries wherever possible. A first research step was to conduct comprehensive variance component analyses for the different breeds in each country. New CF were built for the RRM based on the results from these analyses, and as an outcome, 3 revised evaluation models for the main breeds Holstein (HOL), Nordic Red Cattle (RDC), and Jersey (JER) were officially adopted in February 2012.
      The aim of this paper is to describe the Nordic across-country genetic evaluation model for yield traits. We give special attention to modeling aspects that were found crucial in across-country genetic evaluation based on original observations. Particular focus is on building of the CF, modeling of breed effects, and adjustment for heterogeneous variance.

      Materials and Methods

      Data

      Breeds

      The Nordic dairy cattle population comprises 4 dairy breeds: HOL, RDC, JER, and Finncattle (FIC); HOL and RDC are the main dairy breeds, of which HOL cows are predominant in Denmark and Sweden, and RDC cows in Finland (Table 1). Herds with JER cows are only found in Denmark and in the south of Sweden, and indigenous FIC cows only in Finland. Crossbreeding is used in all the main breeds, both between different breed strains and between breeds. The latter is especially the case for the Danish RDC population, which is a synthetic breed of old Red Danish Cattle, Swedish Red Breed, Brown Swiss, and Red Holstein. The Finnish and Swedish Red Cattle populations also have their own separate histories, making RDC a genetically very heterogeneous breed. Connectedness of the populations across the 3 countries was originally established mainly by the use of common Nordic or international sires.
      Table 1Average breed proportions (%) in dairy cows born in 2010 by main breed and by country of birth (D = Denmark, F = Finland, and S = Sweden)
      BreedHolsteinNordic Red CattleJerseyFinncattle
      DFSDFSDSF
      Holstein Friesian96.284.291.720.60.51.30.11.03.5
      European Black and White3.710.86.60.40.30.80.41.7
      Finnish Ayrshire3.50.411.059.328.50.38.1
      Swedish Red Breed0.50.922.322.243.20.91.7
      Red Danish Cattle0.122.51.56.60.1
      Norwegian Red Cattle0.70.22.15.77.40.10.6
      Canadian Ayrshire0.20.15.69.78.00.10.4
      American Brown Swiss13.60.74.1
      Danish Jersey0.157.958.5
      American Jersey40.236.7
      New Zealand Jersey1.71.5
      Finncattle0.183.9
      Other breeds0.11.80.10.10.10.5

      Test-Day Records

      The TD data were collected from farms located between 55°N in southern Denmark to almost 67°N in Finnish Lapland, from an area with wide environmental and seasonal variation. The number of cows per herd varied considerably, ranging from large herds in Denmark (n¯ = 152) to medium and small herds in Finland (n¯ = 30). The data comprised all the available 169.2 million TD records of 8.7 million cows from all dairy breeds, collected since 1990, 1988, and 1995 in Denmark, Finland, and Sweden, respectively (Table 2). A TD record consists of an observation triplet (milk, protein, and fat yields) recorded monthly for the same cow at the same TD, except for Finland where protein and fat contents are measured bimonthly. Additionally, to increase contemporary group sizes for Finnish cows, all TD records beyond third lactation were included in the data. All of the applied TD records had to fulfill several editing rules, and observations outside an acceptable range (milk yield 3.0 to 99.9 kg, protein content 1 to 9%, fat content 1 to 12%) were excluded. Data from cows with no observations at the beginning of the first lactation were omitted, as were data from first-generation crossbred cows between the 3 main breeds (HOL, RDC, and JER).
      Table 2Number of test-day (TD) records and cows with observations included in the May 2013 routine evaluation
      BreedDenmarkFinlandSweden
      TD recordsCowsTD recordsCowsTD recordsCows
      Holstein59,081,4353,382,40114,366,861595,52418,921,5941,048,669
      Nordic Red Cattle8,131,043478,68839,938,6821,570,25217,888,622975,227
      Jersey10,164,986582,922204,97711,480
      Finncattle544,33023,504

      Variance Components

      Variance component analyses were performed to determine potential differences across breeds and countries. Based on experience from the development of the first Nordic TD model, we conducted an extensive study to design the most suitable model for variance component analysis involving testing with data samples of different sizes from different breeds. A Gibbs sampler (

      Madsen, P., and J. Jensen. 2008a. A Users’s Guide to DMU. A package for analysing multivariate mixed models. Version 6, release 4.7. Faculty of Agricultural Sciences, University of Aarhus, Denmark.

      ) and a Monte-Carlo expectation-maximization REML method (
      • Matilainen K.
      • Mäntysaari E.A.
      • Lidauer M.H.
      • Strandén I.
      • Thompson R.
      Employing a Monte Carlo algorithm in expectation maximization restricted maximum likelihood estimation of the linear mixed model.
      ) were applied to check the capability of these methods to perform correctly the desired analyses (
      • Madsen P.
      • Lidauer M.H.
      • Mäntysaari E.A.
      Strategy for estimation of variance components for the joint Nordic yield evaluation. Proc. 2008 Interbull Mtg., Niagara Falls (USA).
      ;
      • Lidauer M.H.
      • Madsen P.
      • Matilainen K.
      • Mäntysaari E.A.
      • Strandén I.
      • Thompson R.
      • Pösö J.
      • Pedersen J.
      • Nielsen U.S.
      • Eriksson J.-Å.
      • Johansson K.
      • Aamand G.P.
      Estimation of variance components for Nordic Red Cattle test-day model: Bayesian Gibbs sampler vs. Monte Carlo EM REML. Proc. 2009 Interbull Mtg., Barcelona, Spain.
      ). Final analyses were carried out on 6 data samples: 2 HOL samples, 1 from Danish and 1 from Swedish data; 3 RDC samples, 1 from each country; and 1 Danish JER sample. Each sample included observations from about 20,000 cows. The structure of FIC data would have required different sampling rules to obtain a sufficiently large data set and, therefore, was not considered in the variance component estimation. The applied multiple-trait RRM described TD milk, protein, and fat yields of the first 3 lactations of a cow by 9 model equations, where the RR coefficients of specified RR functions were assumed to be correlated across traits. In matrix notation, the model is
      y=Xb+Zh+Qcc+Qp+Qa+e,
      [1]


      where y is a vector of observations; b is a vector of fixed effects; h, c, p, and a are vectors of random effects; and e is a vector of random residuals. Vector b includes the fixed effects herd × 2-yr calving period, calving age, days carried calf, regression function on DIM d nested within 2-yr calving periods, and regression on breed heterozygosity. Vector h contains the random herd × test-day (HTD) effects. Vectors c, p, and a are vectors of random regression coefficients for the herd lactation curve nested within herd × 2-yr calving period, nonhereditary animal effects, and additive genetic animal effects, respectively. The random regression functions for nonhereditary and additive genetic animal effects include 4 terms comprising a second-order Legendre polynomial and an exponential term exp(−0.04d). When comparing the −2logL values obtained for the various functions with different orders of Legendre polynomials and exponential terms, we found that including an exponential term improved the fit for fat yield, which is consistent with
      • Jakobsen J.H.
      • Madsen P.
      • Jensen J.
      • Pedersen J.
      • Christensen L.G.
      • Sorensen D.A.
      Genetic parameters for milk production and persistency for Danish Holstein estimated in random regression models using REML.
      . The same function but without the intercept term was used for the herd lactation curve. The random residuals in e were nested within 12 consecutive lactation periods from d = 8 to d = 365 with intervals of 3 × 2 wk, 3 × 3 wk, 3 × 7 wk, and 3 × 5 wk. The matrices X, Z, Qc, and Q are incidence and covariable matrices that relate the appropriate effects to each observation. Furthermore, it was assumed that
      varhcpae=IH00000IC00000IP00000AG00000IE,


      where I is an identity matrix of size equal to number of effect levels, H is a 9 × 9 covariance matrix for the HTD effect, C is a 27 × 27 covariance matrix of the herd lactation curve regression coefficients, P and G are 36 × 36 covariance matrices of the nonhereditary and additive genetic regression coefficients, and A is the numerator relationship matrix. The matrix for the residuals is in block diagonal form +El,ω, where each diagonal block is of size 3 × 3 and contains the residual covariances for an observation triplet of lactation l and stage of lactation period ω. Both methods for variance component estimation analysis reached the same conclusion (
      • Lidauer M.H.
      • Madsen P.
      • Matilainen K.
      • Mäntysaari E.A.
      • Strandén I.
      • Thompson R.
      • Pösö J.
      • Pedersen J.
      • Nielsen U.S.
      • Eriksson J.-Å.
      • Johansson K.
      • Aamand G.P.
      Estimation of variance components for Nordic Red Cattle test-day model: Bayesian Gibbs sampler vs. Monte Carlo EM REML. Proc. 2009 Interbull Mtg., Barcelona, Spain.
      ). Notable differences were found in variance component estimates across breeds but only moderate to small differences across countries within a breed. The estimated heritabilities, presented as the sum over 10 standard TD (d = 15, d = 45, …, d = 285), are given in Table 3.
      Table 3Heritabilities for milk, protein, and fat yields calculated as sum over 10 standard test-days (DIM = {15, 45, …, 285}) given by breed, country (D = Denmark, F = Finland, S = Sweden), and parity
      TraitParityHolsteinNordic Red CattleJersey
      DSDFSD
      Milk10.400.390.420.390.440.45
      20.290.290.350.340.330.30
      30.290.250.340.310.340.25
      Protein10.350.350.380.340.430.39
      20.250.280.350.340.350.28
      30.270.260.350.320.360.22
      Fat10.380.390.390.360.430.38
      20.300.330.350.350.340.26
      30.290.290.350.340.360.23

      Building of Covariance Functions

      Breed-specific CF were fitted for each of the 3 main breeds HOL, RDC, and JER to account for differences found in the variance component analyses (Table 3). However, the difference in across-country variance component estimates within the breeds HOL and RDC was deemed small enough to justify using the same variance components. Only first-parity protein and fat yield heritabilities were notably lower for Finnish RDC than for Danish and Swedish RDC. Nevertheless, this was considered to reflect the historical structure of the Finnish data rather than the future trend characterized by larger herd size and moving toward TMR feeding management. Therefore, in constructing the CF, we found it reasonable to follow the industry’s wish for simplicity by applying the same genetic variances and a genetic correlation of unity across the 3 countries.
      The model [1] used for variance component estimation was designed to fit the data as well as possible, particularly at both ends of the lactation trajectories. This led to the estimation of a great number of covariance parameters. However, fitting a CF to the estimated variance-covariance matrices (
      • Kirkpatrick M.
      • Hill W.G.
      • Thompson R.
      Estimating the covariance structure of traits during growth and ageing, illustrated with lactation in dairy cattle.
      ) made it possible to develop an applicable breeding value estimation model that required fewer random effects to be estimated, thus making the model more parsimonious. After fitting of the CF, the minimum number of RR coefficients needed to model the animal effects was determined by investigating the size of the eigenvalues of the CF. All small eigenvalues whose sum explained at maximum 1% of total variance were removed, following the findings of
      • Tyrisevä A.-M.
      • Meyer K.
      • Fikse F.W.
      • Ducrocq V.
      • Jakobsen J.
      • Lidauer M.H.
      • Mäntysaari E.A.
      Principal component approach in variance component estimation for international sire evaluation.
      . The derived CF were validated by ensuring that genetic correlations, repeatabilities, and heritabilities were consistent with the original estimates, and by investigating the predictability and stability of EBV. Before fitting the CF, all estimated variance component parameters were scaled to yield the same magnitude of residual variances across traits and lactations. This was necessary because, otherwise, the variances for milk yield would have dominated the CF, and the variances for protein yield might have been impaired by reducing the rank of the CF; that is, ignoring the eigenfunctions with very small eigenvalues. Consequently, all TD observations were pre-multiplied by 1/ϑ before entering the evaluation and all EBV were post-multiplied by ϑ where ϑ is a trait-specific scaling factor as given in Table 4.
      Table 4Trait-specific scaling factors for observations
      LactationTrait
      MilkProteinFat
      14.00.130.18
      25.50.180.26
      3+6.00.200.28

      Additive Genetic Animal Effects

      The estimated (co)variance component matrix G for random regression coefficients describing the additive genetic animal effects could have been applied to the evaluation model without any modification. Nevertheless, the rank of G was reduced to make the model more parsimonious and to decrease the number of equations per animal. This rank reduction was done using eigenvalue decomposition and only retaining the significant eigenvalues: G=VgDgVgTVg*Dg*Vg*T, where Vg and Dg are eigenfunction and eigenvalue matrices, Vg* and Dg* are corresponding matrices that include only the significant eigenfunctions, and superscript T indicates the transpose of the matrix. Thus, the additive genetic (co)variance matrix was approximated by varaAUgUgT, where Ug=QVg*Dg*12. The diagonal matrix Dg* included the 15 largest eigenvalues of Dg and explained 99.5, 99.0, and 99.9% of the variation in G for HOL, RDC, and JER, respectively.

      Nonhereditary Animal and Residual Effects

      To derive the CF for the nonhereditary animal effects, the following 3 steps were performed. First, the estimated variance component matrices C, P, and E were used to construct an overall 63 × 63 R matrix by summing
      R=Rc+Rp+Re=ΦcC#ΦcT+ΦPΦT+Re,


      where Φc=I9×9φc, Φ=I9×9φ, and C# is a block diagonal matrix of C that ignores the correlations between lactations. Matrix φ consisted of 7 rows with covariables for the second-order Legendre function and the exponential term exp(−0.04d) for 7 different DIM d within the lactation period d = 8 to d = 365. Matrix φc was the same as φ but without the intercept column. Matrix Re was constructed from the estimated El,ω3×3 submatrices that correspond to the 7 chosen DIM; namely, d = {20, 50, 80, 150, 220, 280, 330}. In a first attempt, covariables for 36 different DIM that were evenly distributed within lactation were used to construct the overall R. However, fitting the CF only to the 7 presented DIM resulted in better predictability of EBV for cows having an extreme observation at the beginning of lactation.
      Second, R was decomposed into a CF for nonhereditary animal effects ΦKpΦT and into a measurement error covariance matrix (M) by applying a maximum likelihood algorithm (

      Mäntysaari, E. A. 1999. Derivation of multiple trait reduced rank random regression (RR) model for the first-lactation test-day records of milk, protein, and fat. Page 26 in Proc. 50th Annu. Mtg. EAAP, Zurich, Switzerland. Wageningen Academic Publishers, Wageningen, the Netherlands.

      ) that simultaneously estimated the elements in Kp and M:
      ER=ΦKpΦT+M=I9×9φKp11Kp12Kp13Kp21KpKp23Kp31Kp32Kp33I9×9φT+I7M1I7M2I7M3.


      Note that the size of Kp is 36 × 36 and measurement errors are correlated only within lactation; that is, M is a block diagonal matrix with block size 3 × 3. Fitting the CF to R resulted in the same measurement error variances for all DIM within lactation, which simplifies the adjustment for heterogeneous variances, as explained later.
      Third, the rank of the obtained coefficient matrix Kp was reduced. Therefore, an eigenvalue decomposition was performed on each diagonal block (Kp11, Kp22, Kp33), and the 9 largest eigenvalues were retained within each block. For instance, for the first-lactation diagonal block, this resulted in Kp11=Vp11Dp11Vp11TVp11*Dp11*Vp11*T and a scaled eigenfunction matrix W11*=Vp11*Dp11*12. The off-diagonal blocks of the rank-reduced matrix Dp* were fitted by applying generalized right and left inverses, as proposed by
      • Tijani A.
      • Wiggans C.R.
      • Van Tassel J.C.
      • Philpot J.C.
      • Gengler N.
      Use of (co)variance functions to describe (co)variances of test day yield.
      ); for example, Dp21*=W22*\Kp21/W11*. Thus, the CF for nonhereditary animal effects is approximated as
      ΦKpΦTUp1000Up2000Up3IDp12*Dp13*Dp21*IDp23*Dp31*Dp32*IUp1000Up2000Up3T,


      where the covariable matrix for lactation l is Upl=I3×3φWll*. The rank-reduced CF UpDp*UpT has rank 27 and explained 99.8% or more of the variation described by the original CF ΦKpΦT for HOL, RDC, and JER.

      Covariance Functions for Finnish Later-Lactation Observations

      Finnish TD observations from fourth lactation onward (i.e., later lactations) were modeled by considering them as repeated observations from the third lactation. Repeatability among later lactations was accounted for by modeling 1 additional nonhereditary animal effect for each lactation from the third onward. The required variance components were constructed by assuming the same repeatabilities between later lactations as those estimated between second and third lactations (Table 5). Thus, later-lactation repeatability was approximated by splitting Kp33 into 2 matrices: Kp33=Kp33*+Kw, and by building for the third lactation submatrix of Kp a symmetric matrix resembling the covariances between second and third lactations. Hence,
      Kp33*=diagKp33120.5Cp32+Cp32TdiagKp3312,


      where Cp32 is the second to third lactation submatrix of the correlation matrix of Kp. Then, in matrix Kp, Kp33 was replaced by Kp33*, and the CF for nonhereditary animal effects for Finnish later-lactation TD observations were determined as explained above. To construct the CF for additional nonhereditary animal effects, the 7 largest eigenfunctions of Kw were used: I3×3φKwI3×3φTUwUwT. The obtained CF explained 99.8% of the variation described by Kw.
      Table 5Estimated repeatability between second and third lactations for Nordic Red Cattle calculated for 7 different DIM
      TraitDIM
      205080150220280330
      Milk0.320.420.460.530.580.600.58
      Protein0.310.400.450.520.590.610.60
      Fat0.260.330.370.460.530.580.59

      Modeling of Breed Effects

      Heterosis and Recombination Loss

      Modeling of crossbreeding effects was found important to ensure accurate estimation of the genetic level for each country’s subpopulations. This was especially the case for the RDC evaluation model for 2 reasons. First, the foreign breeds (e.g., American Brown Swiss) introduced into the Danish Red Cattle synthetic population have raised the heterozygosity levels of crossbred animals, but these heterozygosity levels are partly confounded with the genetic groups of the foreign breed, because there are no purebred cows of foreign breeds in the data. This makes it difficult to separate genetic group effects from crossbreeding effects. Second, the data include sires whose daughters were crossbred in a country other than the sires’ own country.
      Modeling of crossbreeding effects was based on the same approach as proposed by

      Lidauer, M., E. A. Mäntysaari, I. Strandén, J. Pösö, J. Pedersen, U. S. Nielsen, K. Johansson, J.-Å. Eriksson, P. Madsen, and G. P. Aamand. 2006a. Random heterosis and recombination loss effects in a multibreed evaluation for Nordic red dairy cattle. Abstract c24–02 in Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. Brazilian Society of Animal Breeding, Belo Horizonte, MG, Brazil.

      , where regressions on total heterozygosity were complemented with regressions on heterozygosity of particular breed crosses. Heterosis estimates specific to the breed-crosses were regressed toward the estimates of common total heterosis by modeling the specific heterosis effect as a random regression. The applied variances were equal to 1% of the mean phenotype of the trait, which was found most suitable in an empirical validation of different sets of variances. Recombination loss was modeled in the same manner as heterosis. In total, 12 different breeds were considered for the RDC model for calculation of total heterozygosity and recombination loss. The specific heterosis effects were modeled for the 5 most relevant breed-crosses within a country. There were 11 such breed-crosses: Finnish Ayrshire × Swedish Red Breed, Finnish Ayrshire × Canadian Ayrshire, Finnish Ayrshire × HOL, Finnish Ayrshire × Norwegian Red Cattle, Red Danish Cattle × Swedish Red Breed, Red Danish Cattle × American Brown Swiss, Red Danish Cattle × HOL, American Brown Swiss × Swedish Red Breed, American Brown Swiss × HOL, Swedish Red Breed × Norwegian Red Cattle, and Swedish Red Breed × Canadian Ayrshire. The breed-cross-specific effects were modeled within countries, and a correlation of 0.95 was imposed between effects of the same breed-crosses in different countries.
      For the HOL and JER evaluation models, it was sufficient to consider fixed regression on heterozygosity only. The breed-crosses for HOL evaluation were Holstein Friesian × Red and White, Holstein Friesian × European Black and White, and HOL × Finnish Ayrshire, and those for JER evaluation were Danish JER × American JER, Danish JER × New Zealand JER, and American JER × New Zealand JER.

      Calving Age by Breed

      For RDC evaluation, it was important to model the age effect by an age × breed interaction to avoid biased estimation of genetic trends. Therefore, calving age was modeled by linear and quadratic regression coefficients on age. The covariates α and α2 were centered to zero according to mean calving age to have the calving age effect free from genetic-level differences between breeds; that is, α = (calving age in days − mean calving age)/365. Breed-specific deviations from the common mean calving age curves were modeled by including additional linear and quadratic regression coefficients, which were multiplied by the breed proportion of animal. Breed-specific calving age curves were modeled for Finnish Ayrshire, Swedish Red Breed, American Brown Swiss, HOL, and FIC.

      Genetic Groups

      Pedigree information for each of the 3 evaluations was extracted from a common Nordic dairy cattle pedigree comprising over 35 million animals. For cows with observations, the dam pedigree was traced back 2 generations to include the informative animals, whereas the sire pedigree was traced back for all informative animals. Phantom parent groups were assigned for missing parents. These were defined by selection path, breed, and time, and small year groups were merged to obtain reasonable group sizes. The phantom parent group effects were modeled as random effects to avoid confounding with fixed effects. Following
      • Schaeffer L.R.
      Multiple-country comparison of dairy sires.
      recommendation, a value of 1.0 was added to the diagonals of the phantom parent group equations in the inverse of the coefficient matrices. Note that the CF for additive genetic animal effects yielded an identity matrix for the variance-covariance matrix applied in the evaluation model.

      The Nordic Test-Day Model

      The multiple-trait RRM considered the biological traits milk, protein, and fat yields of first, second, and third lactations, plus observations from all later lactations in case of a Finnish cow, as 9 separate traits. Heterogeneous residual variation in different environments was accounted for by an environment-specific adjustment factor. The TD observations within a biological trait were stratified by herd × production year × production month × parity class, and observations of the same stratum were scaled by the same multiplicative adjustment factor. The RDC evaluation model, involving the most complexity, can be described as follows:
      ytld:cfhijmnopqrsuvxzλtlc:hjmp=HYt:hji+hst:hfiφd2+YMt:cjmp+k=15bt:cpsukφdk+w=13k=12gt:cpfαopnkπow+CCt:cqpf+DDt:crpf+hetlξT,o+retlρT,o+htdt:hjmi+k=15hetl:ckξok+k=15retl:ckρok+k=19pl:okUpdtl:ck+k=17wxokUwdt:ck+k=115aokUgdtl:k+etld:cfhijmnopqrsuvxz,
      [2]


      where ytld:cfhijmnopqrsuvz = observation z for trait t (milk yield, protein yield, fat yield) in lactation l (1, 2, 3+) of DIM d (8, …, 365) in parity p (1, 2, 3, 4, 5+), for cow o that calved at age n, in country c (DNK, FIN, SWE), herd h, and belongs to contemporary group i (primiparous, multiparous cows), 5-yr production period f, production year j, production month m, calving year-season class s, calving age class u, days carried calf class q, and dry period class r; λtlc:hjmp = multiplicative heterogeneous variance adjustment factor for stratum hjmp; HYt:hji = fixed effect of herd × year × contemporary group; hst:hfiφd2 = fixed linear regression on DIM d nested within herd × 5-yr period × contemporary group, where φd2 is a linear Legendre polynomial covariable; YMt:cjmp = fixed effect of production year × production month × parity class nested within country; k=15bt:cpsukφdk = fixed regression function on DIM d nested within country × parity class × calving year-season class (Jan–Mar, Apr–June, July–Sep, Oct–Dec) × calving age class (25% youngest, 25% second youngest, 50% oldest), where φd is a vector containing the covariates of a third-order Legendre polynomial (without intercept) plus exponential terms e0.04d and e0.15d; w=13k=12gt:cpfαopnkπow = fixed regression function on calving age × breed proportion nested within country × parity class × 5-yr period, where αopn is a vector containing the covariates of a quadratic polynomial (without intercept) for calving age n of cow o in parity p, and πo is a vector of breed proportion for cow o; CCt:cqpf = fixed effect of days carried calf classes (10-d classes) nested within country × parity class × 5-yr period; DDt:crpf = fixed effect of days dry classes (week classes) nested within country × parity class × 5-yr period for observations from multiparous cows; hetlξT,o = fixed linear regression on total (T) heterosis of cow o across countries; retlρT,o = fixed linear regression on total (T) recombination loss of cow o across countries; htdt:hjmi = random effect of herd × test-day × contemporary group; k=15hetl:ckξok = random regressions for heterosis nested within country, where ξo is a vector of heterozygosity covariates for specific breed-crosses; k=15retl:ckρok = random regressions for recombination loss nested within country, where ρo is a vector of recombination loss covariates for specific breed-crosses; k=19pl:okUpdtl:ck = random regressions for nonhereditary animal effects for milk, protein, and fat yields among stage of lactation nested within lactation, where Updtl:c is a vector of trait- and lactation-specific CF covariates for DIM d; k=17wxokUwdt:ck = random regressions for nonhereditary animal effects for milk, protein, and fat yields among stage of lactation nested within later lactation x (3, …, 10) of Finnish cows, where Uwdt:c is a vector of trait-specific CF covariates for DIM d; k=115aokUgdtl:k = random regressions for additive genetic animal effects for all 9 traits and among stage of lactation, where Ugdtl is a vector of trait- and lactation-specific CF covariates for DIM d; and etld:cfhijmnopqrsuvxz = random residual.
      The variance components applied for the RDC evaluation model were: var(htdh) = H, var(hetl:c) = 1.0, var(retl:c) = 1.0, var(po) = Dp*, var(wzo) = I7×7, var(ao) = I15×15, and var(e) = IM*. The residual (co)variance matrix M* was built from M by reducing the estimated residual correlations between traits within lactation by 33%. The reduction in residual correlations was found important to increase the predictability of EBV for cows with extreme observations.
      The evaluation model for HOL and JER had the same design as explained for RDC except for 2 simplifications. Crossbreeding effects were modeled by fixed regressions only, as described above, and the calving age effect was modeled by calving age classes without breed interactions. Additionally, all TD observations used for JER evaluation were from the first 3 lactations only, which simplified the modeling of nonhereditary animal effects.
      The population size of the FIC breed was <5,000 cows, so we did not develop a specific evaluation model for this particular breed. However, given that FIC has genetic ties with the Finnish Ayrshire population (Table 1) and FIC cows mainly milk in mixed herds, the data on FIC were entered into the RDC evaluation. Moreover, because mixed herds with RDC and HOL cows are common in Finland, the Finnish TD observations from RDC and HOL cows were included in both HOL and RDC evaluation to increase the contemporary group size.

      Adjustment for Heterogeneous Variance

      The multiplicative mixed model approach (
      • Meuwissen T.H.E.
      • De Jong G.
      • Engel B.
      Joint estimation of breeding values and heterogeneous variance for large data files.
      ) was regarded as a method of choice to account for heterogeneous variance (HV) in across-country evaluation. The approach involves alternate solving of a variance model and a mean model; that is, the evaluation model. Solutions from the variance model are used to adjust each TD observation based on its stratum variance, and the mean model is then solved using the adjusted TD observations. This allows us to retain the heterogeneity of variance explained by the evaluation model, which is important because the same contemporary comparison group may include cows of different breeds. The HV adjustment approach uses the residuals of the evaluation model. A log-linear model, the variance model, is fitted to the observations of heterogeneity to obtain the multiplicative adjustment factors for TD observations.

      Variance Model

      The TD data were stratified by country, herd, production year, test month, and lactation (1, 2, 3, 4, 5+) to describe the most important differences in HV. For heterogeneity observations belonging to the same trait t, contemporary group i, and country c combination, the following single-trait log-linear model was applied:
      stlc:hjmp=β1tc:jmp+β2tic:hj+ϵtlc:hjmp,


      where stlc:hjmp = heterogeneity observation for stratum hjmp; β1tc:jmp = fixed production year × month × parity class effect; β2tic:hj = random effect of herd × production year; and ϵtic:hjmp = random residual.
      The random β2 effect was modeled based on the results of a simulation study by
      • Márkus S.
      • Mäntysaari E.A.
      • Strandén I.
      • Eriksson J.-Å.
      • Lidauer M.H.
      Comparison of multiplicative heterogeneous variance adjustment models for genetic evaluations.
      ), who found that a first-order autoregressive correlation structure between herd × year effects was more appropriate than between herd × test-month effects. The variance components for the variance model were based on estimates obtained in the same study. The applied autocorrelation parameters were 0.70 and 0.80, and the variance ratios between σˆβ22σˆe2 were 0.20 and 0.15 for first- and later-lactation traits, respectively. The adjustment factor for a single TD observation was obtained from the solutions of the variance model:
      tlc:hjmp=exp0.5β^1tc:jmp+β^2tic:hj.


      Homogeneous Genetic Variance Across Countries

      The multiplicative mixed model approach will scale the phenotypes multiplicatively until the residual variances in the strata are the same as the residual variances M* used for the evaluation model. The method yields homogeneous genetic variances, given that heterogeneity of variance influences all effects in the model in proportionality and that the estimated variance components are close to the true variance components. To ensure the same genetic variance across countries, we modified the multiplicative mixed model approach by applying the evaluation model residual variances in M* only for the strata of an arbitrarily chosen base country (SWE), whereas own sets of residual variances were used for the other 2 countries (DNK, FIN). Suitable residual variances for DNK and FIN were obtained by a calibration procedure during model development. Thus, the multiplicative model was solved and genetic variances were re-estimated from EBV by a full model sampling approach (
      • Lidauer M.H.
      • Emmerling R.
      • Mäntysaari E.A.
      Multiplicative random regression model for heterogeneous variance adjustment in genetic evaluation for milk yield in Simmental.
      ), to update the residual variances for DNK and FIN according to detected differences in genetic variance. The procedure was repeated until differences in across-country genetic variances were within ±1%. Note that this procedure did not modify the residual variances in M* for the evaluation model.

      Ebv

      The EBV for yield traits were calculated as the sum of 305 daily breeding values from DIM d = 8 to DIM d = 312. Thus, the yield EBV of animal o for trait t in lactation l is
      EBVtl:o,yield=d=8312k=115Ugdtl:kaˆokϑtl,


      where aˆo comprises the 15 estimated random regression coefficients of the CF that describes animal o’s additive genetic effects. A breeding value for persistency of production was calculated as the sum of losses or gains in daily EBV from DIM d = 101 to DIM d = 300 compared with the EBV for DIM d = 100:
      EBVtl:o,persistency=d=101300k=115Ugdtl:kUg100tl:kaˆokϑtl.


      For purposes of practical breeding work, a combined index for milk yield, protein yield, and fat yield is published for each animal:
      Ito=EBVt1oEBV¯t10.1st1+1000.5+EBVt2oEBV¯t20.1st2+1000.3+EBVt3oEBV¯t30.1st3+1000.2,


      where EBV¯tl is the average EBV of cows with observations born in 2008 to 2010 and stl is the standard deviation of EBV of bulls born in 1997 to 1998 having an EBV reliability >0.6.

      Robustness Against Extreme Observations

      A small number of the records in the data used in developing the RRM is expected to include observations that fit very poorly into the parameter space estimated by the variance component analyses. Such observations may impose a significant effect on a cow’s EBV. Numerous models with different CF were therefore tested for their ability to model extreme observations. Yield deviations (YD) were calculated for Danish HOL cows from the solutions of model [2], which were then used as a dependent variable in breeding value estimation by the models under study:
      YDtld:coz=k=19pˆl:okUpdtl:ck+k=115aˆokUgdtl:k+eˆtld:cfhijmnopqrsuvxz,


      where YDtld:coz is a YD z for trait t (milk yield, protein yield, fat yield) in lactation l (1, 2, 3) of DIM d (8, …, 365) for cow o in country c (DNK). All of the studied models included a mean effect but differed in the way the random animal and residual effects were modeled.

      Model A

      Animal and residual effects were modeled as in model [1]. Thus, the applied variance components were those obtained from the variance component estimation.

      Model B

      Additive genetic animal effects were modeled as in model [2], but a different CF was used for nonhereditary animal effects. To build this CF, the R matrix was constructed by considering 36 DIM points within lactation; d = {10, 20, ..., 360}. The rank of the obtained CF ΦKpΦT was reduced across all lactations, and eigenfunctions with the 17 largest eigenvalues were applied for modeling of non-hereditary animal effects. The obtained estimates for M were used as variance components for residual effects. This model is consistent with an earlier model that has passed the official validation test run by the Interbull Centre but has never been placed into use.

      Model C

      This was the same as model B except that correlations between residuals were reduced by one-third.

      Model D

      Animal and residual effects were modeled as in model [2].
      Two sets of solutions were estimated with each model, either including or excluding the YD of TD records with extreme observations. The solutions were used to obtain yield indices for individual cows o, where a yield index was calculated as Io=0.8Iprotein,o+0.4Ifat,o0.2Imilk,o. The robustness of the models was then assessed using the obtained yield indices.

      The Solving Algorithm

      The algorithm applied for solving the RRM evaluation model for EBV and the variance model for HV estimates was the same as explained in
      • Lidauer M.H.
      • Emmerling R.
      • Mäntysaari E.A.
      Multiplicative random regression model for heterogeneous variance adjustment in genetic evaluation for milk yield in Simmental.
      . Both models were solved by iterative methods. The HV adjustment factors were regarded sufficiently converged when the relative change in multiplicative adjustment factors between consecutive mean model + variance model cycles was less than 10−7. The solutions for the RRM (i.e., the mean model) were considered converged when the relative change between solutions of consecutive iterations was less than 10−9. Both the mean model and variance model were solved by the preconditioned conjugate gradient method using parallel computing, as described in
      • Strandén I.
      • Lidauer M.
      Parallel computing applied to breeding value estimation in dairy cattle.
      .

      Results and Discussion

      Covariance Functions

      The developed CF resulted in 42 equations to be solved for each cow with observations. However, using the originally estimated variance components, the model would have required 72 equations per cow. Still, the reduction in the number of equations was less than in the first Nordic RRM presented by
      • Mäntysaari E.A.
      • Lidauer M.
      • Pösö J.
      • Madsen P.
      • Strandén I.
      • Pedersen J.
      • Nielsen U.S.
      • Johansson K.
      • Eriksson J.-Å.
      • Aamand G.P.
      Joint Nordic test day model: Variance components. Proc. 2006 Interbull Mtg., Kuopio, Finland.
      , where the number of equations was reduced to 32. In our study, 15 out of 36 eigenvalues explained 99.0 to 99.9% of the variation in additive genetic animal effects described by the CF. Use of these 15 eigenvalues was sufficient to estimate EBV without loss in accuracy. In the case of nonhereditary animal effects, the 7 largest eigenvalues within lactation explained over 99.5% of the variation described by the CF, which was found sufficient to leave the EBV unaffected. Nevertheless, we decided to include a larger number of eigenvalues, 9 out of 12 within lactation, because despite explaining very little additional variance, they yielded more robust EBV for cows with extreme observations at the beginning of lactation. The EBV for such cows were even more impaired by the high measurement error correlations obtained from variance component estimation. Therefore, we reduced the rank of the CF for nonhereditary animal effects to 3 × 9, or 27 equations per cow, and residual correlations by 33%.

      Modeling of Breed Effects

      Genetic Groups

      Modeling of breed effects was crucial to avoid confounding between environmental effects and additive genetic effects. This was especially important because the environmental effects in the evaluation models included country interactions, and the country-wise subpopulations in both HOL and RDC evaluations were almost entirely genetically linked due to common sires. Phantom parent group effects were therefore modeled as random effects to guarantee that none of the country-specific genetic group effects were confounded with fixed effects. This also resulted in phantom parent group solutions that were regressed toward zero when the group size was smaller than 50. However, the estimates were little affected by randomness when the phantom parent group size was larger than 400. In Red Danish Cattle, the historical use of American Brown Swiss sires led to a collinearity between the American Brown Swiss breed proportion and the level of Red Danish Cattle × American Brown Swiss heterozygosity in crossbred cows. Modeling of the phantom parent group averages as random effects eliminated this collinearity.

      Heterosis and Recombination Loss

      In HOL evaluation, heterosis estimates (presented as % of the mean phenotype) for different traits and lactations were, on average, 2.4 for Holstein Friesian × European Black and White, 4.4 for Holstein Friesian × Red and White, and 3.1 for HOL × Finnish Ayrshire. In JER evaluation, heterosis estimates were, on average, 2.5 for Danish JER × American JER, 1.0 for Danish JER × New Zealand JER, and 3.8 for American JER × New Zealand JER. In RDC evaluation, where heterosis was modeled by fixed and random effects, heterosis estimates over traits and lactations averaged 3.7. Estimates for specific crosses ranged from 0.7 to 7.5. Table 6 shows the estimates for protein yield, which were highest for Red Danish Cattle × Swedish Red Breed and lowest for Swedish Red Breed × Norwegian Red Cattle. Estimates for Finnish Ayrshire × HOL were in good agreement with those from HOL evaluation. Modeling of a fixed and a random model component clearly helped the estimates for heterosis to stay within an expectable range. Modeling of heterosis proved especially difficult for Red Danish Cattle × American Brown Swiss and for Red Danish Cattle × HOL, for which earlier studies have reported estimates as high as up to 12% (
      • Madsen P.
      • Jensen J.
      • Nielsen U.S.
      Effect of heterosis and imported germ plasm on production traits estimated in the Danish multi-breed animal model. Proc. 1997 Interbull Mtg., Vienna, Austria.
      ;

      Lidauer, M., E. A. Mäntysaari, I. Strandén, J. Pösö, J. Pedersen, U. S. Nielsen, K. Johansson, J.-Å. Eriksson, P. Madsen, and G. P. Aamand. 2006a. Random heterosis and recombination loss effects in a multibreed evaluation for Nordic red dairy cattle. Abstract c24–02 in Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. Brazilian Society of Animal Breeding, Belo Horizonte, MG, Brazil.

      ).
      Table 6Heterosis and recombination loss estimates
      Estimates are given in percentages of the mean phenotype.
      for protein yield of first, second, and third lactation from the Nordic Red Cattle model calculated for the 3 most common crosses in the Finnish, Danish, and Swedish populations
      CrossHeterosisRecombination loss
      FirstSecondThirdFirstSecondThird
      Finnish Ayrshire
      × Swedish Red Breed5.675.724.64−3.06−2.19−1.81
      × Canadian Ayrshire2.803.442.040.52−0.63−0.13
      × Norwegian Red Cattle2.303.613.07−3.98−3.04−1.54
      × Holstein3.113.022.30−1.25−1.27−0.87
      Red Danish Cattle
      × Swedish Red Breed6.495.544.69−2.01−2.81−2.61
      × Holstein3.243.813.37−3.24−3.17−3.00
      × American Brown Swiss4.845.465.45−1.79−2.34−2.59
      Swedish Red Breed
      × Finnish Ayrshire4.875.124.15−3.39−2.25−1.34
      × Canadian Ayrshire2.844.902.13−1.65−0.85−0.14
      × Norwegian Red Cattle1.973.261.24−4.57−3.49−2.99
      Overall mean3.984.313.63−2.95−2.46−1.38
      1 Estimates are given in percentages of the mean phenotype.
      Recombination loss was modeled only in RDC evaluation. Average recombination loss (presented as % of the mean phenotype) over traits and lactations was estimated at −2.0. Estimates for specific crosses ranged from −7.5 to 1.1. Overall, the effects of recombination loss were smaller than heterosis effects. Highest recombination loss estimates were obtained for Swedish Red Breed × Norwegian Red Cattle, for which heterosis estimates, in contrast, were lowest (Table 6). The expectation is that recombination loss effect should not be larger than the heterosis effect. However, that was not the case for Norwegian Red Cattle, suggesting that the evaluation model was unable to model the genetic level for this breed optimally. Genetic group effects for Norwegian Red Cattle were difficult to estimate partly because major contributions from the Norwegian population to Swedish and Finnish Red Cattle were made as far back as the 1960s and 1970s, although these have largely stabilized across the cow population having TD data.

      Calving Age × Breed Proportion

      Because of the heterogeneity of the RDC breeds, it was deemed important to include breed interaction for calving age in the model for RDC evaluation. Proper estimation of the genetic trend required optimal modeling of the calving age effect. A clear difference was found in the calving age effect among RDC, American Brown Swiss, HOL, and FIC. Figure 1 illustrates the differences for milk yield in second lactation. Especially in American Brown Swiss, the effect of calving age was distinctly smaller than in the other breeds. The coefficient of the calving age effect in the model was dependent on the breed composition of a cow, and this allowed a flexible modeling as demonstrated by a synthetic Danish Red cow in Figure 1.
      Figure thumbnail gr1
      Figure 1Calving age effect by breed on second lactation test-day milk yield in synthetic Danish Red Cattle. Synthetic Danish Red = a cow with a breed composition of 65% Nordic Red Cattle, 14% Brown Swiss, and 21% Holstein.

      Heterogeneous Variance Adjustment

      Calibration of final residual variances for the HOL evaluation model required 5 cycles and for the RDC model 6 cycles of evaluation runs with updated, calibrated residual variances to yield homogeneous genetic variances across countries. The obtained calibrated residual variances applied to the Danish and Finnish strata of data differed from the original residual variances used for the Swedish data strata by −8 to 31% in HOL evaluation and by −2 to 51% in RDC evaluation. However, the multiplicative adjustment factors for the phenotypes showed a mean difference of only −2 to 7% in HOL evaluation and −7 to 14% in RDC evaluation when comparing Danish and Finnish traits with Swedish traits. These results indicate that the true measurement error variances differed more across countries than the phenotypic variances and heritabilities. The monthly means of the multiplicative adjustment factors ranged from 0.9 to 1.9 within different traits, countries, and production years.
      The HV adjustment ensured fulfilling the model assumptions of homogeneous genetic variance within a trait across time, across herds, and across countries. The first 2 assumptions were well justified, as HV in production traits is known to be mainly caused by a scaling effect due to differences in production levels (
      • Robert-Granié C.
      • Bonaiti B.
      • Boichard D.
      • Barbat A.
      Accounting for variance heterogeneity in French dairy cattle genetic evaluation.
      ). The third assumption of homogeneous variance across countries is valid only in case of genetically very similar subpopulations in the countries. This held reasonably well for HOL evaluation, where the HOL breed proportion in cows was >84% in all 3 countries. For RDC evaluation, only Finnish and Swedish RDC were similar (Table 1). Variance component analyses for different traits gave genetic variance estimates on average 11% lower for Finnish than for Swedish RDC. This is a relatively small difference, which can be explained, in part, by the lower heritability found for Finnish RDC. Estimated genetic variances for Danish RDC were, on average, 24% lower compared with Swedish RDC. However, the fact that phenotypic means and variances were also lower for Danish RDC suggests that the differences were to some extent caused by a scaling effect.

      EBV

      A comparison of the estimated genetic trend for cows across the 3 breed evaluations shows greatest genetic progress in HOL cows (Figure 2). The average protein yield index grew by 42 points between 1990 and 2010, corresponding to an average yearly increase of 4.1kg. For RDC cows, the increase was 36 index points, equal to an average yearly increase of 3.5 kg (Figure 3), and for JER cows, the increase was 35 index points, or an average yearly increase of 2.3kg (Figure 4). Similar genetic progress was found across countries within each evaluation. As expected, the highest genetic level for Danish cows was found in HOL, and for Finnish cows in RDC, although differences between populations have diminished in recent years.
      Figure thumbnail gr2
      Figure 2Average protein yield index for Holstein cows by year of birth. One index point equals 7.9% of the genetic standard deviation.
      Figure thumbnail gr3
      Figure 3Average protein yield index for Nordic Red Cattle cows by year of birth. One index point equals 7.9% of the genetic standard deviation.
      Figure thumbnail gr4
      Figure 4Average protein yield index for Jersey cows by year of birth. One index point equals 6.8% of the genetic standard deviation.
      During model development, we noted that the estimated genetic trends for cows were less robust to changes in the model than the genetic trends for bulls. Differences between countries in the genetic levels of cows were particularly sensitive to modeling of crossbreeding effects, calving age effects, and adjustment for HV. Achieved upward scaling of variance for a particular country decreased the genetic level for that country and vice versa. The same was found for heterosis, where an increase in estimates effected a decrease in genetic level and vice versa. Changes in genetic levels between cow populations across countries were, at most, 2 index points between the different model alternatives. Consequently, we carried out an intensive model validation procedure to develop the across-country evaluation models. This validation procedure included Interbull validation methods I, II, III, and IV (
      • Boichard D.
      • Bonaiti B.
      • Barbat A.
      • Mattalia S.
      Three methods to validate the estimation of genetic trend for dairy cattle.
      ;
      • Tyrisevä A.-M.
      • Mäntysaari E.A.
      • Jakobsen J.
      • Aamand G.P.
      • Dürr J.
      • Fikse W.F.
      • Lidauer M.H.
      Validation of consistency of Mendelian sampling variance in national evaluation models. In: Proceedings of the 2012 Interbull meeting, Cork, Ireland.
      ), studies of EBV predictability and stability with accumulation of additional data, analysis of residuals, analysis of variance-heterogeneity, and thorough inspection of fixed and random effect solutions as well as of EBV.

      Robustness Against Extreme Observations

      • Madsen P.
      • Pösö J.
      • Pedersen J.
      • Lidauer M.H.
      • Jensen J.
      Screening for outliers in multiple trait genetic evaluation. Proc. 2012 Interbull Mtg., Cork, Ireland.
      reported that 2.5% of Danish HOL cows had at least one observation that could be considered as extreme or as an outlier. The EBV for these cows may be significantly impaired if the applied model is not robust against such observations. For instance, one particular cow in the data with 25 TD records had one record from DIM 13 of the second lactation, which showed a very high protein yield (protein content 7.4%). Models A, B, and C each gave a significantly inflated yield index for this cow when the extreme observation was included (Table 7). In contrast, use of reduced residual correlations resulted in a less inflated yield index (models C and D). A too-rigorous CF for nongenetic animal effects also made the models less robust (models B and C), whereas model D, which had the same animal and residual effects as model [2], was sufficiently flexible to model the extreme observation. It is noteworthy that when the original variance components were applied, the index was also inflated (model A). This is most likely because the estimated residual correlations (between 0.6 and 0.9) were too high for an observation triplet that included an extreme observation. A multivariate outlier detection method (
      • Madsen P.
      • Pösö J.
      • Pedersen J.
      • Lidauer M.H.
      • Jensen J.
      Screening for outliers in multiple trait genetic evaluation. Proc. 2012 Interbull Mtg., Cork, Ireland.
      ) could be used to discard extreme observations, most likely allowing stronger rank reduction for the CF of nonhereditary animal effects as well as application of original residual correlations. This alternative merits further investigation.
      Table 7Effect of a single extreme observation on a cow’s yield index by different models
      Model A=same variance components as estimated by REML; model B=covariance functions for animal effects; model C=same model as B but measurement error correlations were reduced by one-third; model D=same model as C but covariance function for the nonhereditary animal effects was built differently.
      Extreme

      observation
      Model
      ABCD
      Included115122114110
      Excluded110110110109
      1 Model A = same variance components as estimated by REML; model B = covariance functions for animal effects; model C = same model as B but measurement error correlations were reduced by one-third; model D = same model as C but covariance function for the nonhereditary animal effects was built differently.

      Considerations on Solving the Models

      Solving of the mixed model equations was found to be a demanding task because of either the model’s size (HOL) or its complexity (RDC). The HOL evaluation, with its 135 million TD records, meant solving 375 million unknowns for the mean model and 12 million unknowns for the variance model. The RDC evaluation had 52 to 60 model factors per observation, depending on the trait, and required 95 HV solving cycles between the mean model and the variance model to reach the convergence criteria for HV adjustment factors. This was followed by an additional 5,228 iterations for the mean model to attain convergence. For routine evaluation, however, a fixed number of 94 HV solving cycles followed by 2,500 mean model iterations was found sufficient for all 3 evaluation models. The EBV from routine evaluation differed by less than half an index point from EBV obtained from the fully converged evaluation model. Sufficient convergence of HV adjustment factors within 94 solving cycles was achieved by using staggered Aitken acceleration with a half-Chebyshev procedure, as proposed by

      Hesterberg, T. 2005. Staggered Aitken acceleration for EM. [2101–2110]. Proc. Am. Stat. Assoc., Statistical Computing Section, American Statistical Association. Alexandria, VA.

      . Here, the acceleration steps are merely a function of the number of solving cycles, which secures the stability of estimates over consecutive routine evaluations. The models were solved on a Linux cluster with shared memory architecture utilizing 6 processors (Intel Xeon X5687 @ 3.60 GHz) in parallel. Computing times for the routine evaluation model were 102.5, 54.7, and 12.5 h of calculations for HOL, RDC, and JER evaluations, respectively.

      Conclusions

      The developed statistical models for test-day yield observations for Danish, Finnish, and Swedish dairy cows allow across-country genetic evaluation, not only of sires but of all animals in the population. Modeling of TD data was crucial to ensure unbiased ranking of sires and cows across the 3 countries. In particular, derivation of covariance functions for random effects, modeling of breed effects, and accounting for heterogeneity of variance proved most important to ensure reliable breeding value estimation. Our study can be used as a basis for future research where the 3 evaluation models are combined into one multi-breed model that enables inclusion of first-generation HOL × RDC, HOL × JER, and RDC × JER crossbred animals.

      References

        • Boichard D.
        • Bonaiti B.
        • Barbat A.
        • Mattalia S.
        Three methods to validate the estimation of genetic trend for dairy cattle.
        J. Dairy Sci. 1995; 78: 431-437
        • Canavesi F.
        • Boichard D.
        • Ducrocq V.
        • Gengler N.
        • De Jong G.
        • Liu Z.
        Production traits European joint evaluation (PROTEJE). Proc. 2001 Interbull Mtg., Budapest, Hungary.
        Interbull Bull. 2001; 27: 32-34
        • de Roos A.P.W.
        • Harbers A.G.F.
        • De Jong G.
        Random herd curves in a test-day model for milk, fat, and protein production of dairy cattle in the Netherlands.
        J. Dairy Sci. 2004; 87: 2693-2701
        • Emmerling R.
        • Lidauer M.
        • Mäntysaari E.A.
        Multiple lactation random regression test-day model for Simmental and Brown Swiss in Germany and Austria. Proc. 2002 Interbull Mtg., Interlaaken, Switzerland.
        Interbull Bull. 2002; 29 (): 111-112
      1. Hesterberg, T. 2005. Staggered Aitken acceleration for EM. [2101–2110]. Proc. Am. Stat. Assoc., Statistical Computing Section, American Statistical Association. Alexandria, VA.

        • Jakobsen J.H.
        • Madsen P.
        • Jensen J.
        • Pedersen J.
        • Christensen L.G.
        • Sorensen D.A.
        Genetic parameters for milk production and persistency for Danish Holstein estimated in random regression models using REML.
        J. Dairy Sci. 2002; 85: 1607-1616
        • Jamrozik J.
        • Schaeffer L.R.
        • Dekkers J.C.M.
        Genetic evaluation of dairy cattle using test day yields and random regression model.
        J. Dairy Sci. 1997; 80: 1217-1226
        • Kirkpatrick M.
        • Hill W.G.
        • Thompson R.
        Estimating the covariance structure of traits during growth and ageing, illustrated with lactation in dairy cattle.
        Genet. Res. 1994; 64: 57-69
        • Lidauer M.
        • Mäntysaari E.A.
        • Strandén I.
        Comparison of test-day models for genetic evaluation of production traits in dairy cattle.
        Livest. Prod. Sci. 2003; 79: 73-86
        • Lidauer M.
        • Mäntysaari E.A.
        • Strandén I.
        • Pösö J.
        Multiple-trait random regression test-day model for all lactations. Proc. 2000 Interbull Mtg., Bled, Slovenia.
        Interbull Bull. 2000; 25: 81-86
      2. Lidauer, M., E. A. Mäntysaari, I. Strandén, J. Pösö, J. Pedersen, U. S. Nielsen, K. Johansson, J.-Å. Eriksson, P. Madsen, and G. P. Aamand. 2006a. Random heterosis and recombination loss effects in a multibreed evaluation for Nordic red dairy cattle. Abstract c24–02 in Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. Brazilian Society of Animal Breeding, Belo Horizonte, MG, Brazil.

        • Lidauer M.
        • Pedersen J.
        • Pösö J.
        • Mäntysaari E.A.
        • Strandén I.
        • Madsen P.
        • Nielsen U.S.
        • Johansson K.
        • Eriksson J.-Å.
        • Aamand G.P.
        Joint Nordic test day model: Evaluation model. Proc. 2006 Interbull Mtg., Kuopio, Finland.
        Interbull Bull. 2006; 35 (b): 103-110
        • Lidauer M.H.
        • Emmerling R.
        • Mäntysaari E.A.
        Multiplicative random regression model for heterogeneous variance adjustment in genetic evaluation for milk yield in Simmental.
        J. Anim. Breed. Genet. 2008; 125: 147-159
        • Lidauer M.H.
        • Madsen P.
        • Matilainen K.
        • Mäntysaari E.A.
        • Strandén I.
        • Thompson R.
        • Pösö J.
        • Pedersen J.
        • Nielsen U.S.
        • Eriksson J.-Å.
        • Johansson K.
        • Aamand G.P.
        Estimation of variance components for Nordic Red Cattle test-day model: Bayesian Gibbs sampler vs. Monte Carlo EM REML. Proc. 2009 Interbull Mtg., Barcelona, Spain.
        Interbull Bull. 2009; 40: 37-41
      3. Madsen, P., and J. Jensen. 2008a. A Users’s Guide to DMU. A package for analysing multivariate mixed models. Version 6, release 4.7. Faculty of Agricultural Sciences, University of Aarhus, Denmark.

        • Madsen P.
        • Jensen J.
        • Nielsen U.S.
        Effect of heterosis and imported germ plasm on production traits estimated in the Danish multi-breed animal model. Proc. 1997 Interbull Mtg., Vienna, Austria.
        Interbull Bull. 1997; 16: 85-88
        • Madsen P.
        • Lidauer M.H.
        • Mäntysaari E.A.
        Strategy for estimation of variance components for the joint Nordic yield evaluation. Proc. 2008 Interbull Mtg., Niagara Falls (USA).
        Interbull Bull. 2008; 38 (b): 36-39
        • Madsen P.
        • Pösö J.
        • Pedersen J.
        • Lidauer M.H.
        • Jensen J.
        Screening for outliers in multiple trait genetic evaluation. Proc. 2012 Interbull Mtg., Cork, Ireland.
        Interbull Bull. 2012; 46: 85-91
      4. Mäntysaari, E. A. 1999. Derivation of multiple trait reduced rank random regression (RR) model for the first-lactation test-day records of milk, protein, and fat. Page 26 in Proc. 50th Annu. Mtg. EAAP, Zurich, Switzerland. Wageningen Academic Publishers, Wageningen, the Netherlands.

      5. Mäntysaari, E. A. 2006a. Meta-model combines traits with different genetic models and data structures. Abstract c24–01 in Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. Brazilian Society of Animal Breeding, Belo Horizonte, MG, Brazil.

        • Mäntysaari E.A.
        • Lidauer M.
        • Pösö J.
        • Madsen P.
        • Strandén I.
        • Pedersen J.
        • Nielsen U.S.
        • Johansson K.
        • Eriksson J.-Å.
        • Aamand G.P.
        Joint Nordic test day model: Variance components. Proc. 2006 Interbull Mtg., Kuopio, Finland.
        Interbull Bull. 2006; 35 (b): 97-102
        • Márkus S.
        • Mäntysaari E.A.
        • Strandén I.
        • Eriksson J.-Å.
        • Lidauer M.H.
        Comparison of multiplicative heterogeneous variance adjustment models for genetic evaluations.
        J. Anim. Breed. Genet. 2014; 131: 237-246
        • Matilainen K.
        • Mäntysaari E.A.
        • Lidauer M.H.
        • Strandén I.
        • Thompson R.
        Employing a Monte Carlo algorithm in expectation maximization restricted maximum likelihood estimation of the linear mixed model.
        J. Anim. Breed. Genet. 2012; 129: 457-468
        • Meuwissen T.H.E.
        • De Jong G.
        • Engel B.
        Joint estimation of breeding values and heterogeneous variance for large data files.
        J. Dairy Sci. 1996; 79: 310-316
        • Robert-Granié C.
        • Bonaiti B.
        • Boichard D.
        • Barbat A.
        Accounting for variance heterogeneity in French dairy cattle genetic evaluation.
        Livest. Prod. Sci. 1999; 60: 343-357
        • Schaeffer L.R.
        Multiple-country comparison of dairy sires.
        J. Dairy Sci. 1994; 77: 2671-2678
        • Schaeffer L.R.
        • Jamrozik J.
        • Kistemaker G.J.
        • Van Doormaal B.J.
        Experience with a test-day model.
        J. Dairy Sci. 2000; 83: 1135-1144
        • Strandén I.
        • Lidauer M.
        Parallel computing applied to breeding value estimation in dairy cattle.
        J. Dairy Sci. 2001; 84: 276-285
        • Swalve H.H.
        Theoretical basis and computational methods for different test-day genetic evaluation methods.
        J. Dairy Sci. 2000; 83: 1115-1124
        • Tijani A.
        • Wiggans C.R.
        • Van Tassel J.C.
        • Philpot J.C.
        • Gengler N.
        Use of (co)variance functions to describe (co)variances of test day yield.
        J. Dairy Sci. 1999; 82 (10.3168/jds.S0022-0302(99)75228-8): 226.e1-226.e14
        • Tyrisevä A.-M.
        • Mäntysaari E.A.
        • Jakobsen J.
        • Aamand G.P.
        • Dürr J.
        • Fikse W.F.
        • Lidauer M.H.
        Validation of consistency of Mendelian sampling variance in national evaluation models. In: Proceedings of the 2012 Interbull meeting, Cork, Ireland.
        Interbull Bull. 2012; 46: 97-102
        • Tyrisevä A.-M.
        • Meyer K.
        • Fikse F.W.
        • Ducrocq V.
        • Jakobsen J.
        • Lidauer M.H.
        • Mäntysaari E.A.
        Principal component approach in variance component estimation for international sire evaluation.
        Genet. Sel. Evol. 2011; 43: 21