Comparison of the genetic characteristics of directly measured and Fourier-transform mid-infrared-predicted bovine milk fatty acids and proteins

Open Access. Published: October 25, 2022. DOI: https://doi.org/10.3168/jds.2022-22089

      ABSTRACT

      Fourier-transform mid-infrared (FT-MIR) spectroscopy is a high-throughput and inexpensive methodology used to evaluate concentrations of fat and protein in dairy cattle milk samples. The objective of this study was to compare the genetic characteristics of FT-MIR predicted fatty acids and individual milk proteins with those that had been measured directly using gas and liquid chromatography methods. The data used in this study were based on 2,005 milk samples collected from 706 Holstein-Friesian × Jersey animals that were managed in a seasonal, pasture-based dairy system, with milk samples collected across 2 consecutive seasons. Concentrations of fatty acids and protein fractions in milk samples were directly determined by gas chromatography and high-performance liquid chromatography, respectively. Models to predict each directly measured trait based on FT-MIR spectra were developed using partial least squares regression, with spectra from a random selection of half the cows used to train the models, and predictions for the remaining cows used as validation. Variance parameters for each trait and genetic correlations for each pair of measured/predicted traits were estimated from pedigree-based bivariate models using REML procedures. A genome-wide association study was undertaken using imputed whole-genome sequence, and quantitative trait loci (QTL) from directly measured traits were compared with QTL from the corresponding FT-MIR predicted traits. Cross-validation prediction accuracies based on partial least squares for individual and grouped fatty acids ranged from 0.18 to 0.65. Trait prediction accuracies in cross-validation for protein fractions were 0.53, 0.19, and 0.48 for α-casein, β-casein, and κ-casein, 0.31 for α-lactalbumin, 0.68 for β-lactoglobulin, and 0.36 for lactoferrin. Heritability estimates for directly measured traits ranged from 0.07 to 0.55 for fatty acids, and from 0.14 to 0.63 for individual milk proteins. For FT-MIR predicted traits, heritability estimates were mostly higher than for the corresponding measured traits, ranging from 0.14 to 0.46 for fatty acids, and from 0.30 to 0.70 for individual proteins. Genetic correlations between directly measured and FT-MIR predicted fatty acids and protein fractions were consistently above 0.75, with the exceptions of C18:0 and C18:3 cis-3, which had genetic correlations of 0.72 and 0.74, respectively. The GWAS identified trait QTL for fatty acids with likely candidates in the DGAT1, CCDC57, SCD, and GPAT4 genes. Notably, QTL for SCD were largely absent in the FT-MIR predicted traits, and QTL for GPAT4 were absent in directly measured traits. Similarly, for directly measured individual proteins, we identified QTL with likely candidates in the CSN1S1, CSN3, PAEP, and LTF genes, but the QTL for CSN3 and LTF were absent in the FT-MIR predicted traits. Our study indicates that genetic correlations between directly measured and FT-MIR predicted fatty acid and protein fractions are typically high, but that phenotypic variation in these traits may be underpinned by differing genetic architecture.

      INTRODUCTION

      Bovine milk is a rich source of dietary nutrients that are important to human health, including proteins, fats, carbohydrates, vitamins, and minerals. The concentrations of these components are determined by genetic factors such as breed and sire, as well as nongenetic factors related to the environment, stage of lactation, feed, and the nutritional status of the animal. Fats are important to human health due to the role they play in growth, development, hormone regulation, and inflammation management. In bovine milk, a typical fatty acid profile comprises about 70% saturated, 25% monounsaturated, and 5% polyunsaturated fatty acids.
      Bovine milk is also a common source of protein, an important nutrient in the human diet because of its role in body maintenance and in the growth and repair of cells. However, the relative concentrations of casein and whey proteins in bovine milk differ from those in human milk: bovine milk protein comprises approximately 80% casein and 20% whey proteins, whereas most of the protein in human milk is whey protein. These differences in protein composition are important because casein and whey proteins have different digestibilities and amino acid profiles. Moreover, the protein profiles have implications for cheese processing and the manufacture of casein supplements.
      Fourier-transform mid-infrared (FT-MIR) spectroscopy is a method to determine the presence of specific chemical bonds in a composite substance such as milk, and is widely used in the dairy industry to characterize milk composition. The approach involves directing infrared light through a milk sample, leading to interactions between the infrared light and molecules in the milk that cause vibrations and rotational changes in molecular bonds, resulting in the differential absorption of the various infrared light wavelengths. From this process, a spectrum of absorbance values for light wavelengths across the mid-infrared range is generated, which can be used to predict a variety of traits. This is a high-throughput and inexpensive method for predicting milk composition from milk samples and is widely used to reliably quantify concentrations of fat and protein for dairy cattle. This methodology is also of interest for characterizing fat composition, casein, and whey proteins in milk because of the implications these milk components may have for human health and milk processability, and because the FT-MIR spectra are already available from routine milk testing.
      Applications using FT-MIR spectral data to predict milk composition traits typically involve using a set of samples with directly measured trait values to develop a calibration equation based on the spectrum of absorbance values, using methods such as partial least squares (PLS) regression. The resulting calibration equation can then be applied to future samples to predict trait values as a linear combination of individual wavenumber absorbances from any milk sample with FT-MIR spectral data. The success of using FT-MIR data as a phenotyping tool relies on the strength of the phenotypic correlation between the directly measured trait and the FT-MIR predicted trait. However, the success of using an FT-MIR predicted trait in breeding programs is further dependent on the heritability of the predicted trait, and the genetic correlation between the directly measured and predicted trait.
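      As an illustration of how a calibration equation of this kind is applied, the following minimal Python sketch computes a predicted trait value as an intercept plus a weighted sum of wavenumber absorbances. The function name, the intercept argument, and the four-wavenumber spectrum and coefficients are hypothetical and chosen only for illustration; they are not values or code from this study.

```python
def predict_trait(absorbances, coefficients, intercept=0.0):
    """Predict a trait as intercept + sum(b_k * x_k) over wavenumbers.

    absorbances  -- absorbance value x_k at each retained wavenumber
    coefficients -- calibration coefficient b_k for each wavenumber
    """
    if len(absorbances) != len(coefficients):
        raise ValueError("one coefficient is needed per wavenumber")
    return intercept + sum(b * x for b, x in zip(coefficients, absorbances))


# Hypothetical toy spectrum with 4 wavenumbers (real spectra have hundreds).
spectrum = [0.10, 0.20, 0.05, 0.40]
coefs = [2.0, -1.0, 4.0, 0.5]
predicted = predict_trait(spectrum, coefs, intercept=1.0)
```

      Because the prediction is linear in the absorbances, any sample with a spectrum recorded on the same wavenumber grid can be scored without a further direct measurement.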
      Previous studies have indicated that FT-MIR spectra can be used to predict fatty acids (Soyeurt et al., "Estimating fatty acid content in cow milk using mid-infrared spectrometry"; Rutten et al., "Predicting bovine milk fat composition using infrared spectroscopy based on milk samples collected in winter and summer"; Lopez-Villalobos et al., "Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle"; Bonfatti et al., "Short communication: Mid-infrared spectroscopy prediction of fine milk composition and technological properties in Italian Simmental") and protein fractions in milk (De Marchi et al., "Prediction of protein composition of individual cow milk using mid-infrared spectroscopy"; Bonfatti et al., "Effectiveness of mid-infrared spectroscopy for the prediction of detailed protein composition and contents of protein genetic variants of individual milk of Simmental cows"; Bonfatti et al., "Short communication: Mid-infrared spectroscopy prediction of fine milk composition and technological properties in Italian Simmental"; Rutten et al., "Predicting bovine milk protein composition based on Fourier transform infrared spectra"; Soyeurt et al., "Mid-infrared prediction of lactoferrin content in bovine milk: Potential indicator of mastitis"; McDermott et al., "Prediction of individual milk proteins including free amino acids in bovine milk using mid-infrared spectroscopy and their correlations with milk processing characteristics"). Moreover, moderate to high heritability estimates have been reported for a range of FT-MIR predicted fatty acids (Rutten et al., "The effect of the number of observations used for Fourier transform infrared model calibration for bovine milk fat composition on the estimated genetic parameters of the predicted data"; Lopez-Villalobos et al., "Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle"; Bonfatti et al., "Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"; Narayana et al., "Genetic analysis of groups of mid-infrared predicted fatty acids in milk"; Fleming et al., "Genetic correlations of mid-infrared-predicted milk fatty acid groups with milk production traits") and protein fractions (Soyeurt et al., "Genetic variability of lactoferrin content estimated by mid-infrared spectrometry in bovine milk"; Arnould et al., "Genetic analysis of lactoferrin content in bovine milk"; Bonfatti et al., "Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"; Sanchez et al., "Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle"). Few studies report the genetic correlations between directly measured and FT-MIR predicted fatty acids, or protein fractions, or both, but in those studies the genetic correlations are typically high (Rutten et al., "The effect of the number of observations used for Fourier transform infrared model calibration for bovine milk fat composition on the estimated genetic parameters of the predicted data"; Bonfatti et al., "Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle").
      Several GWAS have been conducted on fatty acids and protein fractions in bovine milk, across a range of genotype densities. This includes studies of directly measured fatty acids using 50k (Bouwman et al., "Genome-wide association of milk fatty acids in Dutch dairy cattle") or high-density (HD) genotypes (Buitenhuis et al., "Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle"; Palombo et al., "Genome-wide association study of milk fatty acid composition in Italian Simmental and Italian Holstein cows using single nucleotide polymorphism arrays"), and FT-MIR predicted fatty acids using 50k (Cruz et al., "Genome-wide association study for milk fatty acids in Holstein cattle accounting for the DGAT1 gene effect"; Iung et al., "Genome-wide association study for milk production traits in a Brazilian Holstein population"; Freitas et al., "Genomic analyses for predicted milk fatty acid composition throughout lactation in North American Holstein cattle"), HD (Olsen et al., "Genome-wide association mapping for milk fat composition and fine mapping of a QTL for de novo synthesis of milk fatty acids on bovine chromosome 13"), or imputed whole-genome sequence (Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows") genotypes. Studies of directly measured protein fractions include those using 50k (Schopen et al., "Whole-genome association study for milk protein composition in dairy cattle"; Pegolo et al., "Integration of GWAS, pathway and network analyses reveals novel mechanistic insights into the synthesis of milk proteins in dairy cows") or HD (Buitenhuis et al., "Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle"; Zhou et al., "Genome-wide association study for milk protein composition traits in a Chinese Holstein population using a single-step approach") genotypes, and studies of FT-MIR predicted protein fractions include those using imputed sequence genotypes (Sanchez et al., "Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle"; Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows"). Aside from differences in genotype density, the breed composition of animals in these studies also varies. In particular, studies of directly measured fatty acids include Dutch Holstein-Friesians (Bouwman et al., "Genome-wide association of milk fatty acids in Dutch dairy cattle"), Danish Holsteins and Jerseys (Buitenhuis et al., "Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle"), and Italian Simmental and Holsteins (Palombo et al., "Genome-wide association study of milk fatty acid composition in Italian Simmental and Italian Holstein cows using single nucleotide polymorphism arrays"), whereas studies of FT-MIR predicted fatty acids include Holstein (Cruz et al., "Genome-wide association study for milk fatty acids in Holstein cattle accounting for the DGAT1 gene effect"; Iung et al., "Genome-wide association study for milk production traits in a Brazilian Holstein population"; Freitas et al., "Genomic analyses for predicted milk fatty acid composition throughout lactation in North American Holstein cattle"), Norwegian Red (Olsen et al., "Genome-wide association mapping for milk fat composition and fine mapping of a QTL for de novo synthesis of milk fatty acids on bovine chromosome 13"), and Montbéliarde (Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows") cows. Studies of directly measured protein fractions in milk include Dutch Holstein-Friesians (Schopen et al., "Whole-genome association study for milk protein composition in dairy cattle"), Italian Brown Swiss cows (Pegolo et al., "Integration of GWAS, pathway and network analyses reveals novel mechanistic insights into the synthesis of milk proteins in dairy cows"), and Danish Holsteins and Jerseys (Buitenhuis et al., "Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle"), whereas studies of FT-MIR predicted protein fractions include Montbéliarde, Normande, and Holstein cows (Sanchez et al., "Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle"; Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows"). Differences in genotype density and breed composition for GWAS conducted on directly measured and FT-MIR predicted fatty acid and protein traits make it difficult to compare QTL between studies. To date, as far as we are aware, there have been no GWAS that compare QTL for directly measured fatty acids and protein traits to QTL for the corresponding FT-MIR predicted traits within the same study population.
      The objective of this study was to compare the genetic characteristics of directly measured fatty acids and protein fractions to the same traits predicted from FT-MIR spectra. Calibration equations were developed using milk samples from New Zealand crossbred dairy cattle, and pedigree-based models were used to evaluate the (co)variance parameters of each directly measured trait and its corresponding FT-MIR predicted trait. To understand the underlying differences in the genetic architecture of directly measured and FT-MIR predicted traits, we conducted GWAS using imputed whole-genome sequence, and compared QTL from directly measured traits to QTL from the corresponding FT-MIR predicted traits. It was expected that the use of imputed whole-genome sequence genotypes from an F2 study population would enhance our ability to identify trait QTL and candidate causative mutations, and that using the same data set to conduct GWAS across directly measured and FT-MIR predicted traits would be valuable for determining differences between QTL.

      MATERIALS AND METHODS

      Ethics Statement

      Animal ethics approval for the collection of data used in this study was granted by the Ruakura Animal Ethics Committee (Hamilton, New Zealand; approval numbers 4,232, 4,621, and 10,174), according to the rules and guidelines outlined in the New Zealand Animal Welfare Act 1999.

      Study Population/Animals and Milk Samples

      Animals included in this study were from an F2 design crossbreeding experiment with a half-sibling family structure, as previously described (Spelman et al., "Experimental design for QTL trial involving New Zealand Friesian and Jersey breeds"; Berry et al., "Mapping a quantitative trait locus for the concentration of β-lactoglobulin in milk, and the effect of β-lactoglobulin genetic variants on the composition of milk from Holstein-Friesian x Jersey crossbred cows"). Briefly, 6 F1 bulls were generated from reciprocal crosses of Holstein-Friesian and Jersey animals, and these bulls were then mated to high genetic merit F1 cows. This resulted in a herd of 850 F2 female progeny, consisting of 2 cohorts produced over consecutive seasons, which were managed in a seasonal, pasture-based dairy system. Because of the phenotypic differences in milk composition between Friesian and Jersey animals, it was expected that the genetic variation exhibited in F2 animals would typically be higher than in a study of purebred animals, and that this could assist in the identification of trait QTL.
      FT-MIR spectra and fatty acid and protein composition were measured in second-lactation milk samples collected at peak, mid, and late lactation in the 2003 to 2004 season for cohort 1, and the 2004 to 2005 season for cohort 2. Calving for each cohort took place over ∼3 mo between July and October. Peak-lactation samples for each cohort were collected on a daily basis, with each cow sampled at 35 d postcalving, whereas mid- and late-lactation samples were collected at a fixed date across the herd within the season. A frequency distribution of the number of samples classified by DIM at the time of sampling is provided in Appendix A1.
      Concentrations of fatty acids were directly determined in milk fat samples by fatty acid methyl ester analysis using GC (MacGibbon and Reynolds, "Milk lipids"), within 1 of up to 5 batches on a given sample collection day, and were expressed as grams per 100 g of total fat content. In this study, we report an analysis for 17 individual fatty acids and 6 fatty acid groups that were classified based on the degree of saturation and the length of the carbon chain, as follows: (1) SFA (no double bonds); (2) UFA (1 or more double bonds); (3) PUFA (2 or more double bonds); (4) short-chain fatty acids (SCFA; 4, 6, or 8 carbons); (5) medium-chain fatty acids (MCFA; 10, 12, or 14 carbons); and (6) long-chain fatty acids (LCFA; 18 carbons). Milk proteins were determined using HPLC, as described by Palmano and Elgar ("Detection and quantitation of lactoferrin in bovine whey samples by reversed-phase high-performance liquid chromatography on polystyrene–divinylbenzene"), were analyzed within 1 of up to 6 batches on a given sample collection day, and were expressed as grams per liter of total milk volume. Traits were assessed for deviation from normality by visual inspection of normal quantile plots and by evaluating asymmetry according to skewness. With the exception of lactoferrin, all directly measured traits were approximately normally distributed with absolute skewness values less than 1. For lactoferrin, log, square-root, and cube-root transformations were applied to determine which transformation minimized skewness. The cube-root transformation was the most effective of those investigated and was applied to lactoferrin trait values for all downstream analyses. Frequency distributions of lactoferrin concentrations before and after the cube-root transformation are provided in Appendix A2. Outliers for each fatty acid and protein trait were identified and removed if the trait value was more than 3 standard deviations from the mean for the corresponding season and stage of lactation (peak, mid, late). After removal of outliers, each trait was adjusted to remove batch effects, which were estimated from a random effects model with batch nested within season and stage of lactation, using Nelder-Mead optimization as implemented in the lme4 package in R (Bates et al., "Fitting linear mixed-effects models using lme4").
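      The skewness check, choice of transformation, and 3-standard-deviation outlier rule described above can be sketched in Python as follows. This is an illustrative sketch only and not the authors' code: the moment-based skewness estimator, the use of the sample standard deviation, the function names, and the record layout are all assumptions, because the text does not state which estimators or data structures were used.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (assumed estimator; values must vary)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Candidate transformations named in the text; all require positive values.
TRANSFORMS = {
    "log": math.log,
    "square_root": math.sqrt,
    "cube_root": lambda v: v ** (1.0 / 3.0),
}


def least_skewed_transform(values):
    """Return the transformation with the smallest absolute skewness, plus all scores."""
    scores = {
        name: abs(skewness([func(v) for v in values]))
        for name, func in TRANSFORMS.items()
    }
    return min(scores, key=scores.get), scores


def remove_outliers(records, n_sd=3.0):
    """Drop records more than n_sd SDs from their season-by-stage mean.

    records -- list of (season, stage, value) tuples (hypothetical layout);
               each season-by-stage group needs at least 2 records.
    """
    groups = defaultdict(list)
    for season, stage, value in records:
        groups[(season, stage)].append(value)
    limits = {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}
    kept = []
    for season, stage, value in records:
        centre, sd = limits[(season, stage)]
        if abs(value - centre) <= n_sd * sd:
            kept.append((season, stage, value))
    return kept
```

      Computing the mean and standard deviation within each season and stage of lactation, rather than across all records, keeps normal changes in milk composition over the lactation from being flagged as outliers.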
      The same milk samples assessed for fatty acid and protein composition were also analyzed on a Foss MilkoScan FT6000 (Foss) instrument, to generate spectral records consisting of 1,060 wavenumbers across the range from 925.66 to 5,010.15 cm⁻¹. Spectral data from regions associated with low signal-to-noise ratios and poor sample measurement repeatability due to the water content in milk were excluded, according to the definitions by Tiplady et al. ("Strategies for noise reduction and standardization of milk mid-infrared spectra from dairy cattle"). Specifically, the excluded low signal-to-noise regions were 649 to 970 cm⁻¹, 1,608 to 1,682 cm⁻¹, and ≥3,021 cm⁻¹. This resulted in 542 wavenumbers for use in the development of prediction equations. Outliers in the spectral data were identified using the methodology described in Tiplady et al. ("Strategies for noise reduction and standardization of milk mid-infrared spectra from dairy cattle"). Briefly, the squared Mahalanobis distance between each spectral record and the average spectrum was evaluated using the 542 wavenumbers identified as being outside low signal-to-noise regions. The distributions of Mahalanobis distance values for each season were compared and found to be similar, indicating that although the spectra were collected in 2 different seasons, the effect of instrument drift across time was likely to be small. Based on the lowest Akaike information criterion, a logistic distribution with location and scale parameters of 541.7 and 27.3, respectively, had the best fit to the overall Mahalanobis distance values, and based on a P-value of 0.001, 18 outliers were identified and removed. In total, after outlier removal, 2,005 samples from 706 animals remained with FT-MIR spectra and either a fatty acid or protein composition result. Traits varied in the final number of records available for analysis, ranging from 1,686 to 1,977 records, and representing from 699 to 704 animals. The overall mean fat and protein concentrations as predicted from the Foss instrument calibration equation were 5.40 (SD = 0.70) and 3.98 (SD = 0.36), respectively.
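      Two of the steps above lend themselves to a short worked sketch in Python: dropping wavenumbers that fall in the excluded regions, and converting the fitted logistic distribution into a distance cut-off. The sketch rests on assumptions that the text does not confirm: region boundaries are treated as inclusive, and the P-value of 0.001 is treated as a one-sided upper-tail probability. The function names are hypothetical.

```python
import math

# Excluded low signal-to-noise regions (cm^-1) as listed in the text;
# treating the boundaries as inclusive is an assumption.
EXCLUDED_REGIONS = [(649.0, 970.0), (1608.0, 1682.0), (3021.0, math.inf)]


def is_retained(wavenumber):
    """True if the wavenumber lies outside every excluded region."""
    return not any(low <= wavenumber <= high for low, high in EXCLUDED_REGIONS)


def logistic_upper_threshold(location, scale, p):
    """Value exceeded with probability p under a logistic distribution.

    Inverts the logistic CDF: x = location + scale * ln((1 - p) / p).
    """
    return location + scale * math.log((1.0 - p) / p)


# Location and scale reported in the text; an upper-tail P of 0.001 is assumed.
threshold = logistic_upper_threshold(541.7, 27.3, 0.001)  # about 730.25
```

      Under these assumptions, a spectral record would be flagged when its squared Mahalanobis distance exceeds roughly 730. The fitted location of 541.7 is close to the 542 wavenumbers used, which is what would be expected for a squared Mahalanobis distance in 542 dimensions.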

      Development and Validation of Calibration Equations

      Phenotypic calibration equations for each fatty acid and protein fraction were evaluated within a cross-validation framework, whereby records for a random selection of half the animals were assigned to a training dataset, and the remaining records were assigned to a validation dataset. This ensured that validation was cow-independent in that none of the records for animals included in the training dataset were included in the validation dataset. Partial least squares models for each trait were developed using 542 spectral wavenumbers with the caret package in R (

      Kuhn, M., J. Wing, S. Weston, A. Williams, C. Keefer, A. Engelhardt, T. Cooper, Z. Mayer, B. Kenkel, R Core Team, M. Benesty, R. Lescarbeau, A. Ziem, L. Scrucca, Y. Tang, C. Candan, and T. Hunt. 2022. Caret: Classification and Regression Training.

      ), based on training data with 10 repeats of 10-fold cross-validation. In addition to the untreated spectra, several mathematical treatments of spectra were assessed using the mdatools package in R (
      • Kucheryavskiy S.
      mdatools—R package for chemometrics.
      ), including standard normal variate (SNV) transformation, multiplicative scatter correction, and first-order Savitzky-Golay derivative (
      • Savitzky A.
      • Golay M.J.E.
      Smoothing and differentiation of data by simplified least squares procedures.
      ) treatments. First-derivative treatments were applied to untreated spectra and to spectra after SNV or multiplicative scatter correction treatments using a range of window sizes, with between 1 and 10 points on either side. For each trait, the performance of the PLS model was assessed according to the coefficient of determination between actual and predicted phenotypic trait values in the validation dataset ($R^2_{cv}$), and the relative prediction error (RPE) between actual and predicted trait values in the validation dataset ($\mathrm{RPE}_{cv}$), as described by
      • Lopez-Villalobos N.
      • Spelman R.J.
      • Melis J.
      • Davis S.R.
      • Berry S.D.
      • Lehnert K.
      • Holroyd S.E.
      • MacGibbon A.K.H.
      • Snell R.G.
      Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle.
      .
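      A minimal sketch of the cow-independent split and the two accuracy metrics follows. It rests on two assumptions that the text does not state explicitly: the coefficient of determination is taken as the squared Pearson correlation between actual and predicted values, and RPE is taken as the root mean squared prediction error divided by the mean of the actual values. All function names are hypothetical.

```python
import math
import random


def split_by_cow(cow_ids, seed=0):
    """Assign half the cows (not half the records) to training.

    Returns (train_idx, valid_idx): record indices for each dataset.
    """
    cows = sorted(set(cow_ids))
    random.Random(seed).shuffle(cows)
    train_cows = set(cows[: len(cows) // 2])
    train_idx = [i for i, cow in enumerate(cow_ids) if cow in train_cows]
    valid_idx = [i for i, cow in enumerate(cow_ids) if cow not in train_cows]
    return train_idx, valid_idx


def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (ss_a * ss_p)


def relative_prediction_error(actual, predicted):
    """Root mean squared prediction error divided by the mean actual value."""
    n = len(actual)
    mspe = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return math.sqrt(mspe) / (sum(actual) / n)
```

      Splitting on cow identifiers rather than on records is what keeps repeated samples from the same animal out of both datasets at once.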

      Genetic Parameters of Traits

      Genetic (co)variances of each directly measured trait and its corresponding FT-MIR predicted trait were estimated using a pairwise bivariate repeated measures animal model in ASReml-R (
      • Butler D.G.
      • Cullis B.R.
      • Gilmour A.R.
      • Gogel B.J.
      ASReml-R Reference Manual: Analysis of Mixed Models for S Language Environments.
      ) based on a pedigree comprising 5,943 animals. The model was defined as follows:
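      \[
      \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}
      =
      \begin{bmatrix} \mathbf{X}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{X}_2 \end{bmatrix}
      \begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix}
      +
      \begin{bmatrix} \mathbf{Z}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{Z}_2 \end{bmatrix}
      \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix}
      +
      \begin{bmatrix} \mathbf{W}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_2 \end{bmatrix}
      \begin{bmatrix} \mathbf{p}_1 \\ \mathbf{p}_2 \end{bmatrix}
      +
      \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \end{bmatrix},
      \tag{1}
      \]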


      where y1 is a vector of the directly measured fatty acid or protein fraction, y2 is a vector of the corresponding FT-MIR predicted trait; X1, Z1, W1, X2, Z2, and W2 are design matrices for the fixed, additive genetic and permanent environment effects, respectively, for y1 and y2; b1 and b2 are vectors of the fixed effect of DIM (represented as 35-day windows from the start of lactation) within season (2003, 2004) for the directly measured and the FT-MIR predicted trait, respectively; u1 and u2 are vectors of random additive genetic effects for each trait; p1 and p2 are vectors of permanent environment effects for each trait; and e1 and e2 are vectors of residuals. The following (co)variance structure for each directly measured (y1) and FT-MIR predicted (y2) trait pair is assumed:
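      \[
      \mathrm{var}\begin{bmatrix} \mathbf{u} \\ \mathbf{p} \\ \mathbf{e} \end{bmatrix}
      =
      \begin{bmatrix}
      \mathbf{G} \otimes \mathbf{A} & \mathbf{0} & \mathbf{0} \\
      \mathbf{0} & \mathbf{C} \otimes \mathbf{I}_p & \mathbf{0} \\
      \mathbf{0} & \mathbf{0} & \mathbf{R} \otimes \mathbf{I}_e
      \end{bmatrix},
      \]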


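      where $\mathbf{u} = \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix}$, $\mathbf{p} = \begin{bmatrix} \mathbf{p}_1 \\ \mathbf{p}_2 \end{bmatrix}$, and $\mathbf{e} = \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \end{bmatrix}$; $\mathbf{A}$ is the numerator relationship matrix, $\mathbf{I}_p$ is an identity matrix of order corresponding to the length of the vector p, $\mathbf{I}_e$ is an identity matrix of order corresponding to the length of the vector e, and $\otimes$ is the Kronecker product. Additionally, $\mathbf{G}$, $\mathbf{C}$, and $\mathbf{R}$ are genetic, permanent environment, and residual (co)variance matrices, respectively, and are defined as follows: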
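      \[
      \mathbf{G} = \begin{bmatrix} \sigma^2_{u_1} & \sigma_{u_1 u_2} \\ \sigma_{u_1 u_2} & \sigma^2_{u_2} \end{bmatrix},
      \]

      \[
      \mathbf{C} = \begin{bmatrix} \sigma^2_{p_1} & \sigma_{p_1 p_2} \\ \sigma_{p_1 p_2} & \sigma^2_{p_2} \end{bmatrix},
      \]

      and
      \[
      \mathbf{R} = \begin{bmatrix} \sigma^2_{e_1} & \sigma_{e_1 e_2} \\ \sigma_{e_1 e_2} & \sigma^2_{e_2} \end{bmatrix}.
      \]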


      The heritability and repeatability for each trait were calculated as functions of the estimated (co)variance components based on their parametric definitions of $h_i^2 = \frac{\sigma^2_{u_i}}{\sigma^2_{u_i} + \sigma^2_{p_i} + \sigma^2_{e_i}}$ and $t_i = \frac{\sigma^2_{u_i} + \sigma^2_{p_i}}{\sigma^2_{u_i} + \sigma^2_{p_i} + \sigma^2_{e_i}}$, where i = 1 or 2 for traits y1 and y2, respectively, and the genetic correlation for each pair of measured/predicted traits was calculated as $r_a = \frac{\sigma_{u_1 u_2}}{\sigma_{u_1} \sigma_{u_2}}$. For each bivariate analysis, starting values for additive genetic and residual (co)variances were estimated from single trait models. A range of covariance starting values were iteratively assessed for model convergence, with starting values of $\frac{a(\sigma^2_{u_1} + \sigma^2_{u_2})}{2}$ and $\frac{b(\sigma^2_{e_1} + \sigma^2_{e_2})}{2}$ for additive genetic and residual covariances, respectively, where a and b ranged from 0.1 to 0.9 in increments of 0.1. Among models that converged for each pair of traits, genetic parameter estimates were highly consistent. For traits that had different solutions from different models, the model that minimized the squared sum of the difference between single- and multi-trait model heritability estimates was selected.
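      As a small illustration of the parametric definitions above, the sketch below computes heritability, repeatability, and the genetic correlation from (co)variance components. The function names and the numeric inputs are hypothetical and are not estimates from this study.

```python
import math


def heritability(var_u, var_p, var_e):
    """Additive genetic variance as a proportion of total variance."""
    return var_u / (var_u + var_p + var_e)


def repeatability(var_u, var_p, var_e):
    """Additive genetic plus permanent environment variance over total variance."""
    return (var_u + var_p) / (var_u + var_p + var_e)


def genetic_correlation(cov_u12, var_u1, var_u2):
    """Genetic covariance scaled by the two genetic standard deviations."""
    return cov_u12 / math.sqrt(var_u1 * var_u2)


# Hypothetical variance components, for illustration only.
h2 = heritability(0.3, 0.2, 0.5)
t = repeatability(0.3, 0.2, 0.5)
ra = genetic_correlation(0.18, 0.3, 0.12)
```

      Because repeatability adds the permanent environment variance to the numerator, it can never be smaller than the heritability of the same trait.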

      Genotypes and Imputation

      Of the 706 animals with phenotypic data, 685 were genotyped on Illumina BovineHD (HD; n = 12; ∼777k SNP) or Illumina BovineSNP50k (50k; n = 685; ∼53k SNP) panels, or were genotyped on both. The resultant genotypes were imputed to sequence density as part of a wider set of 153,357 animals, as described previously (
      • Jivanji S.
      • Worth G.
      • Lopdell T.J.
      • Yeates A.
      • Couldrey C.
      • Reynolds E.
      • Tiplady K.
      • McNaughton L.
      • Johnson T.J.J.
      • Davis S.R.
      • Harris B.
      • Spelman R.
      • Snell R.G.
      • Garrick D.
      • Littlejohn M.D.
      Genome-wide association analysis reveals QTL and candidate mutations involved in white spotting in cattle.
      ;
      • Tiplady K.M.
      • Lopdell T.J.
      • Reynolds E.
      • Sherlock R.G.
      • Keehan M.
      • Johnson T.J.J.
      • Pryce J.E.
      • Davis S.R.
      • Spelman R.J.
      • Harris B.L.
      • Garrick D.J.
      • Littlejohn M.D.
      Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle.
      ). Briefly, the imputation process consisted of stepwise imputation of animals to whole-genome sequence genotypes via references of GeneSeek Genomic Profiler, 50k, and HD genotypes. The whole-genome sequence reference consisted of 565 animals, comprising 138 Holstein-Friesians, 99 Jerseys, 316 Holstein-Friesian × Jersey crossbreeds, and 12 from other breeds or crosses. Notably, the 6 F1 sires included in our study were included in this whole-genome sequence reference and were sequenced with a target of 60× read-depth coverage. Phasing was undertaken using Beagle 4.0 (
      • Browning S.R.
      • Browning B.L.
      Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.
      ), based on genotype probabilities, and variants were filtered to remove those where the allelic $R^2$ for missing genotypes was less than 0.95. Only variants located on Bos taurus autosomes were considered, resulting in a sequence reference comprising 19,659,361 segregating variants spanning all 29 autosomes. Imputation was carried out using Beagle 4.0 (
      • Browning S.R.
      • Browning B.L.
      Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.
      ), ignoring pedigree information, and SNP with allelic $R^2$ < 0.7 were removed after each imputation step. The overall median imputation allelic $R^2$ for the wider set of 153,357 animals was 0.986, but was 0.992 for the 685 genotyped animals included in this study.

      Genome-Wide Association Studies

      Before conducting GWAS, adjusted fatty acid and protein phenotypes were generated for directly measured and FT-MIR predicted traits. The generation of the adjusted phenotypes was based on 1 or more samples measured on the same cow, which were fitted to a univariate pedigree-based repeated measures model in ASReml-R (
      • Butler D.G.
      • Cullis B.R.
      • Gilmour A.R.
      • Gogel B.J.
      ASReml-R Reference Manual: Analysis of Mixed Models for S Language Environments.
      ), as follows:
      \[
      \mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{u} + \mathbf{W}\mathbf{p} + \mathbf{e},
      \tag{2}
      \]


      where y is a vector of the measured or predicted trait; X, Z, and W are design matrices for the fixed, additive genetic, and permanent environment effects; b is the fixed effect of DIM (represented as 35-d windows from the start of lactation) within season (2003, 2004) for the trait; u is a vector of random additive genetic effects with $\mathbf{u} \sim N(\mathbf{0}, \mathbf{A}\sigma^2_u)$; p is a vector of random permanent environment effects with $\mathbf{p} \sim N(\mathbf{0}, \mathbf{I}_p\sigma^2_p)$; and e is a vector of random residuals with $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}_e\sigma^2_e)$, where $\mathbf{A}$ is the numerator relationship matrix, $\mathbf{I}_p$ is an identity matrix of order corresponding to the length of the vector p, $\mathbf{I}_e$ is an identity matrix of order corresponding to the length of the vector e, $\sigma^2_u$ is the additive genetic variance, $\sigma^2_p$ is the permanent environment variance, and $\sigma^2_e$ is the residual variance. Adjusted phenotypes used in the GWAS were the average of y over all observations for a cow minus the relevant fixed effects.
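      The construction of adjusted phenotypes can be sketched as follows. This is an illustration only: the fixed-effect estimates would come from the fitted mixed model, the record layout and function name are hypothetical, and DIM is assumed to be counted from 0 so that integer division by 35 gives the window index.

```python
from collections import defaultdict


def adjusted_phenotypes(records, fixed_effects, window_days=35):
    """Average each cow's records after subtracting the relevant fixed effect.

    records: iterable of (cow_id, season, dim, y) tuples.
    fixed_effects: dict mapping (season, dim_window) to an estimated effect.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cow, season, dim, y in records:
        window = dim // window_days  # assumption: DIM counted from 0
        sums[cow] += y - fixed_effects[(season, window)]
        counts[cow] += 1
    return {cow: sums[cow] / counts[cow] for cow in sums}


# Hypothetical records and fixed-effect estimates, for illustration only.
records = [("c1", 2003, 10, 4.0), ("c1", 2003, 50, 3.8), ("c2", 2004, 20, 4.2)]
fixed = {(2003, 0): 3.9, (2003, 1): 3.7, (2004, 0): 4.0}
adjusted = adjusted_phenotypes(records, fixed)
```

      Each cow therefore contributes a single value to the GWAS regardless of how many samples she had.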
      For each directly measured fatty acid or protein trait and its corresponding FT-MIR predicted trait, a GWAS was conducted using Bolt-LMM software (
      • Loh P.-R.
      • Tucker G.
      • Bulik-Sullivan B.K.
      • Vilhjálmsson B.J.
      • Finucane H.K.
      • Salem R.M.
      • Chasman D.I.
      • Ridker P.M.
      • Neale B.M.
      • Berger B.
      • Patterson N.
      • Price A.L.
      Efficient Bayesian mixed-model analysis increases association power in large cohorts.
      ). Before conducting GWAS, a minor allele frequency threshold of 1% based on allele frequencies in the 685-animal study population was applied, resulting in 14,990,779 imputed sequence variants included in each GWAS. To assess the additive effect of each SNP, mixed model association statistics were evaluated under an infinitesimal model. To account for population structure, a genomic relationship matrix based on a subset of 42,374 SNPs was simultaneously fitted. That subset of SNP was derived by applying a minor allele frequency threshold of 1% to the 50k SNP-chip imputation reference (previously described). A leave-one-segment-out approach was used to avoid proximal contamination in the GWAS, whereby a 5-Mbp region flanking the sequence variant of interest was excluded from the set of SNPs used to estimate the genomic relationship matrix.
      An adjusted Bonferroni threshold was adopted to determine variants with significant associations for each trait. Because a Bonferroni correction threshold based on all 14,990,779 variants is highly conservative, a modified threshold was evaluated based on the effective number of independent variants, as proposed by
      • Duggal P.
      • Gillanders E.M.
      • Holmes T.N.
      • Bailey-Wilson J.E.
      Establishing an adjusted p-value threshold to control the family-wide type 1 error in genome wide association studies.
      and implemented in other studies (
      • Zhu B.
      • Niu H.
      • Zhang W.
      • Wang Z.
      • Liang Y.
      • Guan L.
      • Guo P.
      • Chen Y.
      • Zhang L.
      • Guo Y.
      • Ni H.
      • Gao X.
      • Gao H.
      • Xu L.
      • Li J.
      Genome wide association study and genomic prediction for fatty acid composition in Chinese Simmental beef cattle using high density SNP array.
      ;
      • Wang Z.
      • Zhu B.
      • Niu H.
      • Zhang W.
      • Xu L.
      • Xu L.
      • Chen Y.
      • Zhang L.
      • Gao X.
      • Gao H.
      • Zhang S.
      • Xu L.
      • Li J.
      Genome wide association study identifies SNPs associated with fatty acid composition in Chinese Wagyu cattle.
      ). The effective number of independent variants was identified using a sliding window approach in Plink software (
      • Purcell S.
      • Neale B.
      • Todd-Brown K.
      • Thomas L.
      • Ferreira M.A.
      • Bender D.
      • Maller J.
      • Sklar P.
      • De Bakker P.I.
      • Daly M.J.
      • Sham P.C.
      PLINK: A tool set for whole-genome association and population-based linkage analyses.
      ), with an $R^2$ threshold of 0.9, a window size of 100 kb, and a step size of 5 variants. These criteria resulted in a set of 2,303,435 variants and enabled the calculation of an adjusted Bonferroni threshold which considered all tests across 2,303,435 variants as independent. Based on α = 0.01, this resulted in a nominal P-value of 4.3e-09 and a corresponding Bonferroni threshold of −log10(4.3e-09) = 8.36. Whole-genome sequence resolution genotypes within a 1-Mbp window were annotated using SnpEff (version 4.3t; build 11-24-2017;
      • Cingolani P.
      • Platts A.
      • Wang L.L.
      • Coon M.
      • Nguyen T.
      • Wang L.
      • Land S.J.
      • Lu X.
      • Ruden D.M.
      A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.
      ) and the Ensembl UMD3.1.86 gene annotations to assess the candidacy of QTL identified from the GWAS for each trait. We used a linkage disequilibrium (LD)-based approach to prioritize variants, similar to that described by
      • Lopdell T.J.
      • Tiplady K.
      • Struchalin M.
      • Johnson T.J.J.
      • Keehan M.
      • Sherlock R.
      • Couldrey C.
      • Davis S.R.
      • Snell R.G.
      • Spelman R.J.
      • Littlejohn M.D.
      DNA and RNA-sequence based GWAS highlights membrane-transport genes as key modulators of milk lactose content.
      because the association rankings of candidate variants are expected to be affected by phenotyping, genotyping, and imputation errors. Specifically, we identified QTL regions where the most highly associated variant was in high LD ($R^2$ > 0.7) with either a splice region variant, or a moderate or high impact coding variant, according to SnpEff classification.
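      The adjusted Bonferroni threshold described earlier in this section is a one-line calculation, sketched below. The function name is hypothetical and α is passed as a parameter; the call uses α = 0.01, the value that reproduces the reported nominal P-value of 4.3e-09 for 2,303,435 effectively independent variants.

```python
import math


def bonferroni_threshold(alpha, n_independent_tests):
    """Return the per-test P-value threshold and its -log10 value."""
    p_threshold = alpha / n_independent_tests
    return p_threshold, -math.log10(p_threshold)


p_threshold, neg_log10_p = bonferroni_threshold(0.01, 2_303_435)
print(f"{p_threshold:.1e}", round(neg_log10_p, 2))
```

      Applying the same correction across all 14,990,779 tested variants would give a stricter per-test threshold, which is why the effective number of independent variants was used instead.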

      RESULTS AND DISCUSSION

      Trait Prediction Models

      Cross-validation prediction model accuracies ($R^2_{cv}$) were assessed for untreated spectra, as well as for spectra treated using SNV transformation, multiplicative scatter correction, or first-derivative treatments (Appendix Table A1). Window sizes of 15 data points (7 points either side) had consistently higher $R^2_{cv}$ values, compared to other window sizes, so only these have been presented. Applying treatments to spectral data resulted in marginally higher $R^2_{cv}$ values on average, compared to not treating spectra, and treating spectra with an SNV and first-derivative transformation prior to fitting PLS models resulted in the highest average $R^2_{cv}$ value and was thus used in all further analysis. Descriptive statistics of fatty acid and protein traits, and goodness of fit measures of PLS calibration models (applied to SNV + first-derivative transformed spectra) for training and validation datasets are presented in Table 1.
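      The selected spectral treatment (SNV followed by a first-derivative filter with 7 points either side) can be sketched in plain Python. This is not the mdatools implementation used in the study: the polynomial order of the Savitzky-Golay filter is not stated in the text, so a local quadratic fit is assumed, the derivative is expressed per data point rather than per wavenumber unit, and points without a full window are simply dropped. Function names are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def savgol_first_derivative(values, half_window=7):
    """First derivative from a local least-squares quadratic fit.

    For a symmetric window and unit spacing, the derivative at the centre
    point is sum(k * y[i + k]) / sum(k * k) over k = -m, ..., m.
    """
    offsets = range(-half_window, half_window + 1)
    norm = sum(k * k for k in offsets)
    derivative = []
    for i in range(half_window, len(values) - half_window):
        derivative.append(sum(k * values[i + k] for k in offsets) / norm)
    return derivative


def snv_first_derivative(spectrum, half_window=7):
    """Apply SNV, then the first-derivative filter, to one spectrum."""
    return savgol_first_derivative(snv(spectrum), half_window)
```

      With `half_window=7` the filter spans 15 data points, matching the window size reported above.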
      Table 1. Descriptive statistics of fatty acid and protein traits, and goodness of fit measures of partial least squares calibration models for training and validation data sets

      | Trait | Description and units | n¹ | Mean | SD | Training² $R^2_t$ | Training² $\mathrm{RPE}_t$ | Validation³ $R^2_{cv}$ | Validation³ $\mathrm{RPE}_{cv}$ |
      |---|---|---|---|---|---|---|---|---|
      | Individual fatty acid | | | | | | | | |
      | C4:0 | Butyric acid, g/100 g of total fat | 1,963 | 3.90 | 0.32 | 0.706 | 0.043 | 0.602 | 0.053 |
      | C6:0 | Caproic acid, g/100 g of total fat | 1,969 | 2.52 | 0.19 | 0.591 | 0.049 | 0.542 | 0.052 |
      | C8:0 | Caprylic acid, g/100 g of total fat | 1,968 | 1.54 | 0.18 | 0.697 | 0.064 | 0.622 | 0.073 |
      | C10:0 | Capric acid, g/100 g of total fat | 1,975 | 3.51 | 0.61 | 0.701 | 0.094 | 0.627 | 0.108 |
      | C10:1 | Caproleic acid, g/100 g of total fat | 1,969 | 0.31 | 0.06 | 0.469 | 0.151 | 0.300 | 0.162 |
      | C12:0 | Lauric acid, g/100 g of total fat | 1,972 | 3.92 | 0.74 | 0.685 | 0.106 | 0.590 | 0.121 |
      | C12:1 | Lauroleic acid, g/100 g of total fat | 1,925 | 0.13 | 0.03 | 0.470 | 0.169 | 0.353 | 0.181 |
      | C14:0 | Myristic acid, g/100 g of total fat | 1,967 | 11.46 | 1.17 | 0.599 | 0.065 | 0.491 | 0.073 |
      | C14:1 | Myristoleic acid, g/100 g of total fat | 1,970 | 0.75 | 0.23 | 0.517 | 0.211 | 0.414 | 0.233 |
      | C16:0 | Palmitic acid, g/100 g of total fat | 1,977 | 27.64 | 3.27 | 0.633 | 0.073 | 0.574 | 0.076 |
      | C16:1 | Palmitoleic acid, g/100 g of total fat | 1,958 | 1.54 | 0.22 | 0.301 | 0.123 | 0.184 | 0.132 |
      | C18:0 | Stearic acid, g/100 g of total fat | 1,968 | 11.95 | 2.00 | 0.544 | 0.115 | 0.445 | 0.124 |
      | C18:1 cis-7 | cis-Vaccenic acid, g/100 g of total fat | 1,936 | 4.53 | 0.70 | 0.531 | 0.107 | 0.411 | 0.118 |
      | C18:1 cis-9 | Oleic acid, g/100 g of total fat | 1,963 | 17.31 | 2.55 | 0.653 | 0.088 | 0.569 | 0.096 |
      | C18:2 cis-9,trans-11 | Conjugated linoleic acid, g/100 g of total fat | 1,929 | 0.87 | 0.25 | 0.587 | 0.185 | 0.498 | 0.210 |
      | C18:2 cis-6 | Linoleic acid, g/100 g of total fat | 1,963 | 1.20 | 0.14 | 0.561 | 0.078 | 0.480 | 0.085 |
      | C18:3 cis-3 | α-Linolenic acid, g/100 g of total fat | 1,954 | 0.80 | 0.11 | 0.387 | 0.112 | 0.360 | 0.105 |
      | Grouped fatty acid⁴ | | | | | | | | |
      | SFA | Saturated fatty acids, g/100 g of total fat | 1,965 | 70.59 | 3.08 | 0.703 | 0.024 | 0.591 | 0.028 |
      | PUFA | Polyunsaturated fatty acids, g/100 g of total fat | 1,972 | 4.16 | 0.46 | 0.641 | 0.065 | 0.490 | 0.081 |
      | UFA | Unsaturated fatty acids, g/100 g of total fat | 1,964 | 29.42 | 3.08 | 0.711 | 0.057 | 0.597 | 0.066 |
      | SCFA | Short-chain fatty acids, g/100 g of total fat | 1,970 | 7.96 | 0.59 | 0.695 | 0.041 | 0.648 | 0.043 |
      | MCFA | Medium-chain fatty acids, g/100 g of total fat | 1,969 | 20.09 | 2.43 | 0.659 | 0.071 | 0.567 | 0.080 |
      | LCFA | Long-chain fatty acids, g/100 g of total fat | 1,974 | 36.82 | 4.45 | 0.609 | 0.076 | 0.568 | 0.079 |
      | Individual milk protein | | | | | | | | |
      | α-CN | α-Casein, g/L of total volume | 1,695 | 15.79 | 1.76 | 0.585 | 0.072 | 0.532 | 0.076 |
      | β-CN | β-Casein, g/L of total volume | 1,686 | 14.78 | 1.84 | 0.128 | 0.116 | 0.190 | 0.113 |
      | κ-CN | κ-Casein, g/L of total volume | 1,687 | 4.24 | 0.59 | 0.575 | 0.087 | 0.476 | 0.105 |
      | α-LA | α-Lactalbumin, g/L of total volume | 1,942 | 1.21 | 0.15 | 0.379 | 0.099 | 0.306 | 0.104 |
      | β-LG | β-Lactoglobulin, g/L of total volume | 1,959 | 3.84 | 0.70 | 0.773 | 0.087 | 0.678 | 0.104 |
      | Lf⁵ | Lactoferrin, g/L of total volume | 1,936 | 0.51 | 0.12 | 0.411 | 0.188 | 0.356 | 0.194 |

      1 n = number of samples.
      2 $R^2_t$ = coefficient of determination between actual and predicted trait values in the training dataset; $\mathrm{RPE}_t$ = relative prediction error between actual and predicted trait values in the training dataset.
      3 $R^2_{cv}$ = coefficient of determination between actual and predicted trait values in the validation dataset; $\mathrm{RPE}_{cv}$ = relative prediction error between actual and predicted trait values in the validation dataset.
      4 SCFA = short-chain fatty acids, sum of C4:0, C6:0, and C8:0; MCFA = medium-chain fatty acids, sum of 10:0, 10:1, 12:0, 12:1, 14:0, and 14:1; LCFA = long-chain fatty acids, sum of C18 fatty acids.
      5 Cube-root transformation of lactoferrin (Lf).
      For individual fatty acids, coefficient of determination values for the validation dataset ($R^2_{cv}$) were generally higher for short-chain fatty acids (C4 to C8), ranging from 0.54 to 0.62, compared with medium-chain fatty acids (C10 to C14), which ranged from 0.30 to 0.63, and long-chain fatty acids (C16 to C18), which ranged from 0.18 to 0.57. Concentrations of individual saturated fatty acids were typically higher and had higher average $R^2_{cv}$ values, compared with individual unsaturated fatty acids. For grouped fatty acids, $R^2_{cv}$ values were higher for UFA and SFA groups, compared to PUFA; additionally, for fatty acids grouped by carbon chain length, the highest $R^2_{cv}$ value of 0.65 was observed for SCFA. It is notable that although we found an overall trend of higher $R^2_{cv}$ values coinciding with lower $\mathrm{RPE}_{cv}$ values, there were exceptions to this. For example, among individual fatty acids, C16:1 had a particularly low $R^2_{cv}$ of 0.18, but an $\mathrm{RPE}_{cv}$ of 0.13, which was comparable to other traits such as C10:0 and C12:0, which had $R^2_{cv}$ values of ∼0.60. This highlights the difference between $R^2_{cv}$ and $\mathrm{RPE}_{cv}$ as accuracy metrics, the former indicating how well the prediction model explains the variation in the directly measured trait, whereas the latter provides a comparison of how similar the predicted values are to the directly measured trait values. In the present study, most comparisons of accuracy with other studies will be based on $R^2_{cv}$ values because that is the accuracy metric that is most commonly reported; however, the example above shows that other metrics can be valuable for assessing FT-MIR prediction model accuracy.
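      One way to see why a low R² can coincide with an ordinary RPE is the rough approximation RPE ≈ CV × sqrt(1 − R²), where CV is the trait's coefficient of variation. This relationship is introduced here for illustration and is not a formula from the study; it assumes approximately unbiased predictions and uses the overall means and SDs from Table 1 in place of validation-set values. The function name is hypothetical.

```python
import math


def approx_rpe(mean, sd, r2):
    """Approximate RPE as CV * sqrt(1 - R^2), assuming unbiased predictions."""
    return (sd / mean) * math.sqrt(1.0 - r2)


# Mean, SD, and validation R^2 taken from Table 1.
c16_1 = approx_rpe(1.54, 0.22, 0.184)  # reported validation RPE: 0.132
c10_0 = approx_rpe(3.51, 0.61, 0.627)  # reported validation RPE: 0.108
print(round(c16_1, 3), round(c10_0, 3))
```

      Under this approximation, C16:1 ends up with an RPE close to that of C10:0 because its smaller coefficient of variation largely offsets its lower R².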
      The $R^2_{cv}$ values we report are consistent with those from previous studies where fatty acids were expressed as a proportion of total fat content, with our values being similar to those reported by
      • Soyeurt H.
      • Dardenne P.
      • Dehareng F.
      • Lognay G.
      • Veselko D.
      • Marlier M.
      • Bertozzi C.
      • Mayeres P.
      • Gengler N.
      Estimating fatty acid content in cow milk using mid-infrared spectrometry.
      , but lower than those reported in other studies (
      • Rutten M.J.M.
      • Bovenhuis H.
      • Hettinga K.A.
      • van Valenberg H.J.F.
      • van Arendonk J.A.M.
      Predicting bovine milk fat composition using infrared spectroscopy based on milk samples collected in winter and summer.
      ;
      • Lopez-Villalobos N.
      • Spelman R.J.
      • Melis J.
      • Davis S.R.
      • Berry S.D.
      • Lehnert K.
      • Holroyd S.E.
      • MacGibbon A.K.H.
      • Snell R.G.
      Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle.
      ;
      • Bonfatti V.
      • Degano L.
      • Menegoz A.
      • Carnier P.
      Short communication: Mid-infrared spectroscopy prediction of fine milk composition and technological properties in Italian Simmental.
      ). In the present study, for grouped SCFA, MCFA, and LCFA, $R^2_{cv}$ values were lower than in other studies (
      • Rutten M.J.M.
      • Bovenhuis H.
      • Hettinga K.A.
      • van Valenberg H.J.F.
      • van Arendonk J.A.M.
      Predicting bovine milk fat composition using infrared spectroscopy based on milk samples collected in winter and summer.
      ;
      • Lopez-Villalobos N.
      • Spelman R.J.
      • Melis J.
      • Davis S.R.
      • Berry S.D.
      • Lehnert K.
      • Holroyd S.E.
      • MacGibbon A.K.H.
      • Snell R.G.
      Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle.
      ;
      • Bonfatti V.
      • Degano L.
      • Menegoz A.
      • Carnier P.
      Short communication: Mid-infrared spectroscopy prediction of fine milk composition and technological properties in Italian Simmental.
      ). Accuracies for fatty acids predicted using FT-MIR spectra were variable in previous studies and were affected by factors such as the production system and the breed composition diversity present in calibration samples, the number of samples used to develop calibration equations, and the variability of fatty acid composition present in the calibration samples.
      • Rutten M.J.M.
      • Bovenhuis H.
      • Hettinga K.A.
      • van Valenberg H.J.F.
      • van Arendonk J.A.M.
      Predicting bovine milk fat composition using infrared spectroscopy based on milk samples collected in winter and summer.
      demonstrated that increasing the number of observations used in the calibration equations resulted in better predictions for fat composition.
      • Soyeurt H.
      • Dardenne P.
      • Dehareng F.
      • Lognay G.
      • Veselko D.
      • Marlier M.
      • Bertozzi C.
      • Mayeres P.
      • Gengler N.
      Estimating fatty acid content in cow milk using mid-infrared spectrometry.
      ,
      • Soyeurt H.
      • Dehareng F.
      • Gengler N.
      • McParland S.
      • Wall E.
      • Berry D.P.
      • Coffey M.
      • Dardenne P.
      Mid-infrared prediction of bovine milk fatty acids across multiple breeds, production systems, and countries.
      demonstrated that prediction accuracy could be improved by increasing the sample size of their study, and by increasing the range of variation present in the fatty acids. Importantly, studies with the highest accuracies were those where the range of fatty acid values present in the validation samples was encompassed within the range of fatty acid values represented in calibration samples.
      For individual milk proteins, $R^2_{cv}$ values were generally lower than for fatty acids, ranging from 0.19 for β-CN to 0.68 for β-LG. Notably, although the $R^2_{cv}$ values for β-CN and β-LG were very different, the $\mathrm{RPE}_{cv}$ values for these 2 traits were similar (0.11 and 0.10, respectively). The $R^2_{cv}$ values we report for individual milk proteins were typically higher than those reported in previous studies of individual milk proteins, with the exceptions of β-CN and lactoferrin (Lf), which were consistently lower here than in other studies (
      • De Marchi M.
      • Bonfatti V.
      • Cecchinato A.
      • Di Martino G.
      • Carnier P.
      Prediction of protein composition of individual cow milk using mid-infrared spectroscopy.
      ;
      • Lopez-Villalobos N.
      • Davis S.R.
      • Beattie E.M.
      • Melis J.
      • Berry S.
      • Holroyd S.E.
      • Spelman R.J.
      • Snell R.G.
      Breed effects for lactoferrin concentration determined by Fourier transform infrared spectroscopy.
      ;
      • Rutten M.J.M.
      • Bovenhuis H.
      • Heck J.M.L.
      • van Arendonk J.A.M.
      Predicting bovine milk protein composition based on Fourier transform infrared spectra.
      ;
      • Soyeurt H.
      • Bastin C.
      • Colinet F.G.
      • Arnould V. M.-R.
      • Berry D.P.
      • Wall E.
      • Dehareng F.
      • Nguyen H.N.
      • Dardenne P.
      • Schefers J.
      • Vandenplas J.
      • Weigel K.
      • Coffey M.
      • Théron L.
      • Detilleux J.
      • Reding E.
      • Gengler N.
      • McParland S.
      Mid-infrared prediction of lactoferrin content in bovine milk: Potential indicator of mastitis.
      ;
      • Bonfatti V.
      • Degano L.
      • Menegoz A.
      • Carnier P.
      Short communication: Mid-infrared spectroscopy prediction of fine milk composition and technological properties in Italian Simmental.
      ;
      • McDermott A.
      • Visentin G.
      • De Marchi M.
      • Berry D.P.
      • Fenelon M.A.
      • O'Connor P.M.
      • Kenny O.A.
      • McParland S.
      Prediction of individual milk proteins including free amino acids in bovine milk using mid-infrared spectroscopy and their correlations with milk processing characteristics.
      ).
      • Fuentes-Pila J.
      • DeLorenzo M.A.
      • Beede D.K.
      • Staples C.R.
      • Holter J.B.
      Evaluation of equations based on animal factors to predict intake of lactating Holstein cows.
      suggest that an RPE lower than 0.1 is an indicator of satisfactory prediction, an RPE between 0.1 and 0.2 is an indicator of relatively good or acceptable predictions, and an RPE greater than 0.2 is an indicator of unsatisfactory prediction. Based on these criteria, 21 of 23 individual and grouped fatty acids and all 6 protein fractions had good or satisfactory predictions in the validation datasets. Although the guidelines proposed by
      • Fuentes-Pila J.
      • DeLorenzo M.A.
      • Beede D.K.
      • Staples C.R.
      • Holter J.B.
      Evaluation of equations based on animal factors to predict intake of lactating Holstein cows.
      are useful as an indicator of prediction acceptability, they are potentially less meaningful when we are considering the value of incorporating FT-MIR predicted traits into animal breeding programs. This is because FT-MIR predictions can provide indicator traits across large numbers of animals at little or no cost, whereas it may be infeasible to directly measure these traits across even a small number of animals. Moreover, when we are considering the potential for incorporating an FT-MIR predicted trait into a breeding program, we are not only interested in the phenotypic correlation between the directly measured and FT-MIR predicted trait, but also the heritability of the FT-MIR predicted trait, and the genetic correlation between the directly measured and FT-MIR predicted trait.
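      The RPE guideline discussed above can be applied to the validation RPE values in Table 1 with a short sketch. The function name and category labels are hypothetical; the thresholds of 0.1 and 0.2 are those quoted in the text, and the handling of values exactly equal to a threshold is an assumption.

```python
def classify_rpe(rpe):
    """Classify a relative prediction error using the 0.1 and 0.2 thresholds."""
    if rpe < 0.1:
        return "satisfactory"
    if rpe <= 0.2:
        return "acceptable"
    return "unsatisfactory"


# Validation RPE values transcribed from Table 1.
fatty_acid_rpe = {
    "C4:0": 0.053, "C6:0": 0.052, "C8:0": 0.073, "C10:0": 0.108,
    "C10:1": 0.162, "C12:0": 0.121, "C12:1": 0.181, "C14:0": 0.073,
    "C14:1": 0.233, "C16:0": 0.076, "C16:1": 0.132, "C18:0": 0.124,
    "C18:1 cis-7": 0.118, "C18:1 cis-9": 0.096,
    "C18:2 cis-9,trans-11": 0.210, "C18:2 cis-6": 0.085,
    "C18:3 cis-3": 0.105, "SFA": 0.028, "PUFA": 0.081, "UFA": 0.066,
    "SCFA": 0.043, "MCFA": 0.080, "LCFA": 0.079,
}
protein_rpe = {
    "alpha-CN": 0.076, "beta-CN": 0.113, "kappa-CN": 0.105,
    "alpha-LA": 0.104, "beta-LG": 0.104, "Lf": 0.194,
}

unsatisfactory = sorted(
    trait for trait, rpe in fatty_acid_rpe.items()
    if classify_rpe(rpe) == "unsatisfactory"
)
print(unsatisfactory)
```

      Only C14:1 and C18:2 cis-9,trans-11 exceed 0.2, which matches the count of 21 of 23 fatty acid traits with good or satisfactory predictions.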

      Genetic Parameters of Directly Measured and FT-MIR Predicted Traits

      Estimates of variance components for directly measured and FT-MIR predicted fatty acid and protein traits are shown in Table 2 and Appendix Table A2. Heritability estimates ($h^2$) for the majority of traits were moderate to high, with 17 of the directly measured traits and 20 of the FT-MIR predicted traits having an estimated heritability greater than 0.3. Because this is an F2 study, genetic variances will include a segregation variance component that would typically inflate these values compared to what would be seen in a study of purebred animals. In general, lower heritability and repeatability estimates were observed for directly measured traits, compared to FT-MIR predicted traits. This was driven by higher total variation ($\sigma^2_T$) in the directly measured traits, coupled with a lower magnitude increase in the additive genetic variance component ($\sigma^2_u$), compared to the FT-MIR predicted traits. Despite this, the genetic correlations between measured and predicted traits remained high and were mostly greater than 0.75.
      Table 2. Variance component estimates for directly measured and Fourier-transform mid-infrared (FT-MIR) predicted fatty acid and protein traits

      | Trait¹ | Measured² σu² | Measured σT² | Measured h² | Measured t | FT-MIR σu² | FT-MIR σT² | FT-MIR h² | FT-MIR t | ra³ |
      |---|---|---|---|---|---|---|---|---|---|
      | Individual fatty acid (g/100 g of total fat) | | | | | | | | | |
      | C4:0 | 0.022 | 0.069 | 0.31 (0.12) | 0.52 (0.03) | 0.014 | 0.042 | 0.34 (0.13) | 0.57 (0.03) | 0.988 (0.014) |
      | C6:0 | 0.005 | 0.025 | 0.20 (0.10) | 0.35 (0.03) | 0.003 | 0.011 | 0.24 (0.11) | 0.45 (0.03) | 0.925 (0.099) |
      | C8:0 | 0.005 | 0.019 | 0.29 (0.11) | 0.44 (0.03) | 0.003 | 0.009 | 0.33 (0.12) | 0.45 (0.03) | 0.983 (0.020) |
      | C10:0 | 0.098 | 0.241 | 0.41 (0.14) | 0.54 (0.03) | 0.057 | 0.125 | 0.46 (0.14) | 0.52 (0.03) | 0.986 (0.027) |
      | C10:1 | 0.001 | 0.003 | 0.33 (0.13) | 0.54 (0.03) | 3e-4 | 0.001 | 0.27 (0.10) | 0.42 (0.03) | 0.811 (0.124) |
      | C12:0 | 0.132 | 0.378 | 0.35 (0.13) | 0.52 (0.03) | 0.083 | 0.197 | 0.42 (0.14) | 0.53 (0.03) | 0.996 (0.017) |
      | C12:1 | 2e-4 | 0.001 | 0.24 (0.10) | 0.33 (0.03) | 1e-4 | 0.0003 | 0.25 (0.10) | 0.41 (0.03) | 0.849 (0.125) |
      | C14:0 | 0.342 | 0.997 | 0.34 (0.14) | 0.47 (0.03) | 0.161 | 0.449 | 0.36 (0.13) | 0.41 (0.04) | 0.947 (0.043) |
      | C14:1 | 0.021 | 0.037 | 0.55 (0.17) | 0.71 (0.02) | 0.003 | 0.012 | 0.26 (0.11) | 0.42 (0.03) | 0.866 (0.100) |
      | C16:0 | 2.187 | 5.782 | 0.38 (0.12) | 0.58 (0.03) | 1.214 | 3.123 | 0.39 (0.12) | 0.46 (0.03) | 0.954 (0.058) |
      | C16:1 | 0.008 | 0.043 | 0.20 (0.10) | 0.48 (0.03) | 0.002 | 0.011 | 0.16 (0.08) | 0.38 (0.03) | 0.773 (0.173) |
      | C18:0 | 0.176 | 2.714 | 0.07 (0.05) | 0.48 (0.02) | 0.149 | 1.034 | 0.14 (0.08) | 0.37 (0.03) | 0.718 (0.259) |
      | C18:1 cis-7 | 0.125 | 0.412 | 0.30 (0.12) | 0.51 (0.03) | 0.063 | 0.193 | 0.33 (0.12) | 0.51 (0.03) | 0.947 (0.040) |
      | C18:1 cis-9 | 0.881 | 3.986 | 0.22 (0.09) | 0.41 (0.03) | 0.551 | 1.955 | 0.28 (0.11) | 0.42 (0.03) | 0.986 (0.024) |
      | C18:2 cis-9,trans-11 | 0.017 | 0.048 | 0.35 (0.13) | 0.60 (0.03) | 0.010 | 0.023 | 0.46 (0.16) | 0.62 (0.03) | 0.939 (0.047) |
      | C18:2 cis-6 | 0.004 | 0.013 | 0.33 (0.12) | 0.45 (0.03) | 0.002 | 0.006 | 0.32 (0.12) | 0.44 (0.03) | 0.904 (0.077) |
      | C18:3 cis-3 | 0.004 | 0.009 | 0.40 (0.13) | 0.46 (0.03) | 0.001 | 0.002 | 0.45 (0.12) | 0.51 (0.03) | 0.743 (0.144) |
      | Grouped fatty acid (g/100 g of total fat) | | | | | | | | | |
      | SFA | 1.472 | 6.175 | 0.24 (0.09) | 0.49 (0.03) | 1.293 | 3.469 | 0.37 (0.14) | 0.56 (0.03) | 0.977 (0.035) |
      | PUFA | 0.078 | 0.181 | 0.43 (0.14) | 0.57 (0.03) | 0.049 | 0.105 | 0.46 (0.15) | 0.63 (0.03) | 0.980 (0.019) |
      | UFA | 1.468 | 6.167 | 0.24 (0.09) | 0.49 (0.03) | 1.299 | 3.474 | 0.37 (0.14) | 0.56 (0.03) | 0.975 (0.037) |
      | SCFA | 0.037 | 0.196 | 0.19 (0.09) | 0.40 (0.03) | 0.026 | 0.101 | 0.26 (0.12) | 0.51 (0.03) | 0.961 (0.040) |
      | MCFA | 1.293 | 4.206 | 0.31 (0.12) | 0.45 (0.03) | 0.797 | 2.158 | 0.37 (0.13) | 0.46 (0.03) | 0.974 (0.040) |
      | LCFA | 0.852 | 11.70 | 0.07 (0.05) | 0.40 (0.03) | 0.813 | 5.301 | 0.15 (0.08) | 0.36 (0.03) | 0.925 (0.099) |
      | Individual milk protein (g/L of total volume) | | | | | | | | | |
      | α-CN | 0.579 | 2.029 | 0.29 (0.12) | 0.45 (0.03) | 0.559 | 1.109 | 0.50 (0.18) | 0.61 (0.03) | 0.941 (0.067) |
      | β-CN | 0.421 | 3.105 | 0.14 (0.07) | 0.17 (0.03) | 0.204 | 0.537 | 0.38 (0.15) | 0.65 (0.03) | 0.802 (0.222) |
      | κ-CN | 0.172 | 0.315 | 0.55 (0.18) | 0.57 (0.04) | 0.083 | 0.162 | 0.51 (0.16) | 0.68 (0.03) | 0.956 (0.050) |
      | α-LA | 0.008 | 0.019 | 0.42 (0.14) | 0.51 (0.03) | 0.002 | 0.005 | 0.39 (0.14) | 0.56 (0.03) | 0.755 (0.130) |
      | β-LG | 0.282 | 0.448 | 0.63 (0.18) | 0.80 (0.02) | 0.240 | 0.343 | 0.70 (0.19) | 0.80 (0.02) | 0.995 (0.006) |
      | Lf⁴ | 0.007 | 0.012 | 0.59 (0.17) | 0.61 (0.03) | 0.001 | 0.003 | 0.30 (0.12) | 0.45 (0.03) | 0.771 (0.148) |

      1 Trait definitions and units as described in Table 1. Standard errors shown in parentheses.
      2 σu² = additive genetic variance; σT² = total variance (σu² + σp² + σe²); h² = heritability estimate; t = repeatability estimate.
      3 ra = genetic correlation between directly measured and FT-MIR predicted trait.
      4 Cube-root transformation of lactoferrin (Lf).
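As a quick arithmetic check on Table 2, heritability can be recomputed as the ratio of additive genetic variance to total variance (h² = σu²/σT²). The sketch below uses four rows copied from the directly measured columns of the table. Because the published variance components are rounded, the recomputed ratio is only expected to agree with the reported h² to within roughly ±0.02. The function and variable names are illustrative and are not taken from the original analysis, which estimated these parameters with REML.

```python
# Recompute heritability (h2 = additive variance / total variance) from the
# rounded variance components published in Table 2 (directly measured traits).

def heritability(var_additive: float, var_total: float) -> float:
    """Return h2 = sigma_u^2 / sigma_T^2."""
    if var_total <= 0:
        raise ValueError("total variance must be positive")
    return var_additive / var_total

# (trait, sigma_u^2, sigma_T^2, reported h2), copied from Table 2
TABLE2_MEASURED = [
    ("C4:0", 0.022, 0.069, 0.31),
    ("C16:0", 2.187, 5.782, 0.38),
    ("C18:0", 0.176, 2.714, 0.07),
    ("beta-LG", 0.282, 0.448, 0.63),
]

for trait, var_u, var_t, reported in TABLE2_MEASURED:
    h2 = heritability(var_u, var_t)
    print(f"{trait}: recomputed h2 = {h2:.3f}, reported h2 = {reported:.2f}")
```

Small differences between the recomputed and reported values reflect rounding of the published components, not a disagreement in the underlying estimates.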

      Fatty Acid Traits

      Among fatty acid traits, the lowest heritability estimates were observed for C18:0 and LCFA: 0.07 for both directly measured traits, and 0.14 and 0.15, respectively, for the FT-MIR predicted traits. Although heritability estimates were typically higher in the FT-MIR predicted traits, there were exceptions. In particular, C14:1 had an estimated heritability for the measured trait that was substantially higher than that of the FT-MIR predicted trait (0.55 vs. 0.26). Genetic correlations between directly measured and FT-MIR predicted traits (ra) were greater than 0.85 for 18 of 23 individual and grouped fatty acids, and for 11 of these traits, the genetic correlation was greater than 0.95. The lowest genetic correlations were observed for C18:0 (ra = 0.72) and C18:3 cis-3 (ra = 0.74). In general, we found a consistent trend for individual and grouped fatty acids, where lower genetic correlations coincided with low R²cv values.
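The counts quoted above can be reproduced directly from the ra column of Table 2. The following sketch simply tallies the 23 fatty acid traits against the 0.85 and 0.95 cut-offs. The dictionary is a transcription of the published point estimates (standard errors omitted), and the variable names are illustrative.

```python
# Genetic correlations (ra) between directly measured and FT-MIR predicted
# fatty acid traits, transcribed from Table 2 (point estimates only).
RA_FATTY_ACIDS = {
    "C4:0": 0.988, "C6:0": 0.925, "C8:0": 0.983, "C10:0": 0.986,
    "C10:1": 0.811, "C12:0": 0.996, "C12:1": 0.849, "C14:0": 0.947,
    "C14:1": 0.866, "C16:0": 0.954, "C16:1": 0.773, "C18:0": 0.718,
    "C18:1 cis-7": 0.947, "C18:1 cis-9": 0.986,
    "C18:2 cis-9,trans-11": 0.939, "C18:2 cis-6": 0.904,
    "C18:3 cis-3": 0.743,
    "SFA": 0.977, "PUFA": 0.980, "UFA": 0.975,
    "SCFA": 0.961, "MCFA": 0.974, "LCFA": 0.925,
}

above_085 = [t for t, r in RA_FATTY_ACIDS.items() if r > 0.85]
above_095 = [t for t, r in RA_FATTY_ACIDS.items() if r > 0.95]
# Two traits with the lowest genetic correlation, lowest first
lowest_two = sorted(RA_FATTY_ACIDS, key=RA_FATTY_ACIDS.get)[:2]

print(f"{len(above_085)} of {len(RA_FATTY_ACIDS)} traits have ra > 0.85")
print(f"{len(above_095)} of {len(RA_FATTY_ACIDS)} traits have ra > 0.95")
print("Lowest ra:", lowest_two)
```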
      Although several studies have reported genetic parameter estimates for directly measured or FT-MIR predicted fatty acid traits, or both, these studies vary in the specific individual fatty acids (if any) presented, and whether or not they present parameter estimates for grouped fatty acids. Many studies report genetic parameter estimates for FT-MIR predicted traits only (Soyeurt et al., "Estimation of heritability and genetic correlations for the major fatty acids in bovine milk"; Lopez-Villalobos et al., "Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle"; Narayana et al., "Genetic analysis of groups of mid-infrared predicted fatty acids in milk"; Fleming et al., "Genetic correlations of mid-infrared-predicted milk fatty acid groups with milk production traits"), with only 2 studies reporting genetic parameters for both directly measured and FT-MIR predicted traits (Rutten et al., "The effect of the number of observations used for Fourier transform infrared model calibration for bovine milk fat composition on the estimated genetic parameters of the predicted data"; Bonfatti et al., "Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"). These latter 2 studies also report genetic correlations between directly measured and FT-MIR predicted fatty acids, with Bonfatti et al. presenting these for individual and grouped fatty acids, whereas Rutten et al. presented these for individual fatty acids only.
      The heritability, repeatability, and genetic correlation estimates we report in the present study were broadly consistent with those from previous studies. For directly measured fatty acids, the heritability estimates we report were typically higher than those reported by Bonfatti et al. ("Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"), but lower than those reported by Rutten et al. ("The effect of the number of observations used for Fourier transform infrared model calibration for bovine milk fat composition on the estimated genetic parameters of the predicted data"). For FT-MIR predicted fatty acids, the heritability and repeatability estimates we report for individual and grouped fatty acids were similar to those presented by Lopez-Villalobos et al. ("Estimation of genetic and crossbreeding parameters of fatty acid concentrations in milk fat predicted by mid-infrared spectroscopy in New Zealand dairy cattle"), but lower than those presented by Narayana et al. ("Genetic analysis of groups of mid-infrared predicted fatty acids in milk") and higher than those presented in other studies (Soyeurt et al., "Estimation of heritability and genetic correlations for the major fatty acids in bovine milk"; Bonfatti et al.). Compared with other studies that report genetic correlations between directly measured and FT-MIR predicted fatty acids, the genetic correlations we report were similar, with standard errors of a similar magnitude (Rutten et al.; Bonfatti et al.). The moderate to high heritability estimates we report, alongside high genetic correlations between directly measured and FT-MIR predicted fatty acid traits, indicate that there is genetic variation in the FT-MIR predicted traits that could potentially be exploited in animal breeding programs, and, in most cases, that selection for an FT-MIR predicted fatty acid trait would be expected to provide genetic gain in the actual fatty acid trait of interest.
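To make the indirect-selection argument concrete, the sketch below applies the standard quantitative-genetics expression for the correlated response in a target trait relative to direct selection on it: relative efficiency = ra × (i_x / i_y) × sqrt(h²_x / h²_y), where x is the FT-MIR predicted (indicator) trait and y is the directly measured (target) trait. This formula is a textbook result and is not stated in the present study. Equal selection intensities (i_x = i_y) are assumed here, and the inputs are the point estimates from Table 2, whose standard errors are large. The function name and the choice of example traits are illustrative.

```python
import math

def relative_efficiency(r_a: float, h2_indicator: float, h2_target: float,
                        intensity_ratio: float = 1.0) -> float:
    """Correlated response in the target trait from selecting on the indicator,
    relative to direct selection on the target:
    r_a * (i_x / i_y) * sqrt(h2_x / h2_y)."""
    if not (0 < h2_indicator <= 1 and 0 < h2_target <= 1):
        raise ValueError("heritabilities must lie in (0, 1]")
    return r_a * intensity_ratio * math.sqrt(h2_indicator / h2_target)

# trait: (ra, h2 of FT-MIR predicted trait, h2 of directly measured trait),
# point estimates copied from Table 2
EXAMPLES = {
    "C4:0": (0.988, 0.34, 0.31),
    "C14:1": (0.866, 0.26, 0.55),
    "C18:0": (0.718, 0.14, 0.07),
}

for trait, (r_a, h2_pred, h2_meas) in EXAMPLES.items():
    eff = relative_efficiency(r_a, h2_pred, h2_meas)
    print(f"{trait}: relative efficiency of indirect selection = {eff:.2f}")
```

Under these assumptions, a value near or above 1 suggests that selecting on the FT-MIR prediction would be about as effective as selecting on the measured trait, whereas a trait such as C14:1, with a much lower heritability for the predicted trait, would be expected to respond more slowly. In practice the far lower cost of FT-MIR phenotypes would also allow more animals to be recorded, which this simple ratio does not capture.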

      Individual Milk Protein Traits

      Heritability estimates were moderate to high for nearly all directly measured and FT-MIR predicted individual milk proteins (Table 2). The exception to this was directly measured β-CN, which had a heritability of 0.14. The highest heritability estimates were for β-LG, with h² = 0.63 and h² = 0.70 for directly measured and FT-MIR predicted β-LG, respectively. In general, heritability estimates for measured and FT-MIR predicted proteins were similar. An exception to this was β-CN, which had heritability estimates for the directly measured and FT-MIR predicted trait of 0.14 and 0.38, respectively. Another exception was Lf, which had an estimated heritability for the measured trait that was substantially higher than that of the FT-MIR predicted trait (0.59 vs. 0.30). With the exceptions of α-LA and Lf, genetic correlations between directly measured and FT-MIR predicted individual milk proteins were greater than 0.8. In general, as we observed for fatty acid traits, low R²cv values coincided with low genetic correlations between directly measured and FT-MIR predicted traits.
      Few studies report genetic parameters for directly measured or FT-MIR predicted milk proteins, or both, and those studies vary in the breed composition of the cows. Specifically, study populations include Dutch Holstein-Friesians (Schopen et al., "Genetic parameters for major milk proteins in Dutch Holstein-Friesians"), Danish Holsteins and Jerseys (Buitenhuis et al., "Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle"), Italian Simmentals (Bonfatti et al., "Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"), or French Montbéliarde, Normande, and Holstein cows (Sanchez et al., "Short communication: Genetic parameters for milk protein composition predicted using mid-infrared spectroscopy in the French Montbéliarde, Normande, and Holstein dairy cattle breeds"). Studies also vary in that some report on individual proteins as a proportion of total protein or whey protein (Schopen et al.; Buitenhuis et al.), whereas other studies report on individual proteins as a proportion of total protein or as a proportion of total milk volume (Bonfatti et al.; Sanchez et al.). The heritability estimates we report for directly measured α-CN, β-CN, and κ-CN were lower than those previously reported by Bonfatti et al., but the heritability estimates we report for directly measured α-LA and β-LG were substantially higher. In contrast, for FT-MIR predicted α-CN, β-CN, and κ-CN, the heritability estimates we report were consistently higher than those reported by Bonfatti et al., but were similar to those reported by Sanchez et al.
      The only study that we are aware of to report genetic correlations between directly measured and FT-MIR predicted milk proteins is that of Bonfatti et al. ("Genetic parameters of measures and population-wide infrared predictions of 92 traits describing the fine composition and technological properties of milk in Italian Simmental cattle"). The genetic correlations that we report were higher than in that study. Specifically, for the protein fractions we studied, genetic correlations ranged from 0.76 for α-LA to 0.995 for β-LG, whereas in Bonfatti et al., genetic correlations for these traits ranged from 0.24 for α-LA to 0.48 for β-LG. Interestingly, Bonfatti et al. reported moderate heritability estimates for directly measured milk proteins (0.12 to 0.59), but much lower heritability estimates for FT-MIR predicted milk proteins (0.07 to 0.21). In contrast, the heritability estimates we observed for directly measured proteins (0.14 to 0.63) were similar to (and often lower than) the heritability estimates we observed for FT-MIR predicted proteins (0.30 to 0.70). These differences in heritability were likely due to differences in breed composition and population structure between the 2 studies (i.e., Italian Simmental cows from herds enrolled in the Italian national milk recording program versus Holstein-Friesian × Jersey F2 cows from a single research herd).
      Moderate to high heritability estimates and high genetic correlations between directly measured and FT-MIR predicted milk proteins in our study indicate that indirect selection on FT-MIR predicted milk proteins could be used in animal breeding programs to achieve desired changes to milk protein composition. Moreover, high genetic correlations from pedigree-based models imply that directly measured and FT-MIR predicted traits may have a similar underlying genetic architecture and that genes contributing to the traits are likely to be co-inherited (Lynch and Walsh, "Genetics and Analysis of Quantitative Traits"). To assess this directly, we conducted GWAS on directly measured traits and their corresponding FT-MIR predictions, and compared the QTL for each trait.

      Sequence-Based Genome-Wide Association Analyses

      Several previous GWAS, using a range of genotype densities, have examined fatty acids in bovine milk samples determined by GC (Bouwman et al., "Genome-wide association of milk fatty acids in Dutch dairy cattle"; Buitenhuis et al., "Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle"; Palombo et al., "Genome-wide association study of milk fatty acid composition in Italian Simmental and Italian Holstein cows using single nucleotide polymorphism arrays") or fatty acids predicted using FT-MIR spectra (Olsen et al., "Genome-wide association mapping for milk fat composition and fine mapping of a QTL for de novo synthesis of milk fatty acids on bovine chromosome 13"; Cruz et al., "Genome-wide association study for milk fatty acids in Holstein cattle accounting for the DGAT1 gene effect"; Iung et al., "Genome-wide association study for milk production traits in a Brazilian Holstein population"; Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows"; Freitas et al., "Genomic analyses for predicted milk fatty acid composition throughout lactation in North American Holstein cattle"). Similarly, multiple GWAS have been conducted on protein fractions in milk samples determined by HPLC (Schopen et al., "Whole-genome association study for milk protein composition in dairy cattle"; Buitenhuis et al., "Estimation of genetic parameters and detection of chromosomal regions affecting the major milk proteins and their post translational modifications in Danish Holstein and Danish Jersey cattle"; Pegolo et al., "Integration of GWAS, pathway and network analyses reveals novel mechanistic insights into the synthesis of milk proteins in dairy cows"; Zhou et al., "Genome-wide association study for milk protein composition traits in a Chinese Holstein population using a single-step approach") or FT-MIR predicted protein fractions (Sanchez et al., "Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle"; Sanchez et al., "Sequence-based GWAS, network and pathway analyses reveal genes co-associated with milk cheese-making properties and milk composition in Montbéliarde cows"). Each of those studies was conducted using either the directly measured trait (GC-based for fatty acids; HPLC-based for protein fractions) or the FT-MIR predicted trait, and none presented comparisons between the GWAS for directly measured and FT-MIR predicted traits. In the present study, we have sought to make these comparisons using imputed whole-genome sequence genotypes from an F2 study population to enhance our ability to identify trait QTL and candidate causative mutations.
      For each of 17 individual fatty acids, 6 grouped fatty acids, and 6 protein traits, GWAS were conducted using 14,990,779 imputed sequence variants. These analyses identified 40,946 variants with significant effects for directly measured traits, and 18,843 variants with significant effects for the FT-MIR predicted traits. We found more than twice as many variants with significant effects for directly measured traits, compared with FT-MIR predicted traits, which was largely due to 20,949 variants with significant effects on BTA26 for directly measured traits compared with only 110 variants with significant effects on BTA26 for FT-MIR predicted traits. It was also notable that we detected 3,579 variants with significant effects on BTA22 for directly measured Lf, but no variants with significant effects on BTA22 for FT-MIR predicted traits. Manhattan plots showing the strength of association signals are presented in Figures 1, 2, 3, and 4 for individual fatty acids, Figure 5 for grouped fatty acids, and Figure 6 for individual protein traits. To assess the candidacy of QTL, relevant protein coding variants that were in high LD (R² > 0.7) with the most highly associated variant from each peak were identified. The most highly associated variant from each trait QTL and any relevant protein coding variants are shown in Table 3 for directly measured fatty acid and protein traits, and Table 4 for FT-MIR predicted fatty acid and protein traits. Effect sizes and minor allele frequency details for relevant variants and effects are provided in Appendix Table A3 for fatty acids and Appendix Table A4 for protein traits.
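The variant counts in the paragraph above can be broken down with simple arithmetic, and the Bonferroni threshold quoted in the figure captions (P = 4.3e-09) can be converted to the −log10 scale used on the Manhattan plots. The sketch below does both. It uses only numbers quoted in the text and is not an additional analysis. The variable names are illustrative, and it is an assumption here that a variant is called significant when its P-value is strictly below the threshold.

```python
import math

# Bonferroni significance threshold quoted in the figure and table captions
BONFERRONI_P = 4.3e-09
threshold_log10 = -math.log10(BONFERRONI_P)  # y-axis position of the red line

def is_significant(p_value: float, threshold: float = BONFERRONI_P) -> bool:
    """Assumed decision rule: significant if P is strictly below the threshold."""
    return p_value < threshold

# Counts of significant variants quoted in the text
measured_total, predicted_total = 40_946, 18_843
measured_bta26, predicted_bta26 = 20_949, 110
measured_bta22_lf = 3_579  # directly measured Lf only; none for FT-MIR traits

ratio_all = measured_total / predicted_total
# Counts remaining once the BTA26 signals (and the Lf signal on BTA22) are set aside
measured_rest = measured_total - measured_bta26 - measured_bta22_lf
predicted_rest = predicted_total - predicted_bta26

print(f"-log10 threshold: {threshold_log10:.2f}")
print(f"measured / predicted (all variants): {ratio_all:.2f}")
print(f"excluding BTA26 and Lf on BTA22: {measured_rest} measured vs {predicted_rest} predicted")
```

Setting aside the BTA26 and BTA22 signals leaves broadly similar numbers of significant variants for the two trait types, which is consistent with the statement that the overall difference was largely due to BTA26. Whether the same variant is counted more than once across traits is not stated in the text, so these totals should be read as counts of significant associations as reported.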
      Figure 1. Manhattan plots showing association effects for directly measured (GC-based) and Fourier-transform mid-infrared (FT-MIR) predicted individual short-chain fatty acid traits. Dark and light blue data points represent association signals for GC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Figure 2. Manhattan plots showing association effects for directly measured (GC-based) and Fourier-transform mid-infrared (FT-MIR) predicted individual medium-chain fatty acid traits. Dark and light blue data points represent association signals for GC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Figure 3. Manhattan plots showing association effects for directly measured (GC-based) and Fourier-transform mid-infrared (FT-MIR) predicted C16 fatty acid traits. Dark and light blue data points represent association signals for GC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Figure 4. Manhattan plots showing association effects for directly measured (GC-based) and Fourier-transform mid-infrared (FT-MIR) predicted C18 fatty acid traits. Dark and light blue data points represent association signals for GC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Figure 5. Manhattan plots showing association effects for directly measured (GC-based) and Fourier-transform mid-infrared (FT-MIR) predicted fatty acids classified based on the degree of saturation and the length of the carbon chain. Dark and light blue data points represent association signals for GC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Figure 6. Manhattan plot showing association effects for directly measured (HPLC-based) and Fourier-transform mid-infrared (FT-MIR) predicted proteins. Dark and light blue data points represent association signals for HPLC-based traits and red data points represent association signals for FT-MIR predicted traits. Chromosomes and genomic position based on the UMD3.1 Bos taurus reference genome are represented on the x-axis. The strength of association signals is represented as the −log10(P-value) on the y-axis. The horizontal red line shows the Bonferroni significance threshold of −log10(4.3e-09).
      Table 3. Peak variants for directly measured fatty acid and protein traits with significant association effects¹

      | Trait² | Chr | Position | Tag variant ID | P-value | Protein coding variant ID | LD | Gene | Class | Description |
      |---|---|---|---|---|---|---|---|---|---|
      | Individual fatty acid (g/100 g of total fat) | | | | | | | | | |
      | C18:1 cis-9 | 14 | 1756075 | rs208417762 | 1.3e-10 | rs134364612 | 0.915 | SLC52A2 | Missense | c.724A > G |
      | C18:1 cis-9 | 14 | 1756075 | rs208417762 | 1.3e-10 | rs109234250 | 0.915 | DGAT1 | Missense | c.694G > A |
      | C16:0 | 14 | 1799066 | rs385135066 | 1.2e-12 | rs134364612 | 0.737 | SLC52A2 | Missense | c.724A > G |
      | C16:0 | 14 | 1799066 | rs385135066 | 1.2e-12 | rs109234250 | 0.737 | DGAT1 | Missense | c.694G > A |
      | C6:0 | 17 | 52971731 | rs207997694 | 9.6e-10 | | | | | |
      | C4:0 | 17 | 53034516 | rs461037541 | 7.2e-18 | | | | | |
      | C10:0 | 19 | 51319673 | rs137270097 | 1.2e-10 | rs41921160 | 0.974 | CCDC57 | Missense | c.1907T > C |
      | C12:0 | 19 | 51319673 | rs137270097 | 8.3e-13 | rs41921160 | 0.974 | CCDC57 | Missense | c.1907T > C |
      | C14:0 | 19 | 51326050 | rs136424304 | 1.4e-11 | rs41921160 | 0.996 | CCDC57 | Missense | c.1907T > C |
      | C10:0 | 26 | 21141279 | rs41255696 | 2.2e-10 | rs41255693 | 0.799 | SCD | Splice region | c.569C > T |
      | C14:0 | 26 | 21141279 | rs41255696 | 2.7e-10 | rs41255693 | 0.799 | SCD | Splice region | c.569C > T |
      | C10:1 | 26 | 21148111 | rs41255688 | 1.8e-48 | rs41255693 | 0.915 | SCD | Splice region | c.569C > T |
      | C14:1 | 26 | 21149680 | rs385285356 | 6.1e-61 | rs41255693 | 0.915 | SCD | Splice region | c.569C > T |
      | C10:1 | 26 | 26458006 | rs445758306 | 2.6e-10 | rs379463458 | 0.761 | ITPRIP | Missense | c.1301G > A |
      | C12:1 | 26 | 26458006 | rs445758306 | 2.4e-10 | rs379463458 | 0.761 | ITPRIP | Missense | c.1301G > A |
      | Grouped fatty acid (g/100 g of total fat) | | | | | | | | | |
      | SCFA | 17 | 53034516 | rs461037541 | 1.2e-14 | | | | | |
      | SFA | 19 | 36187954 | rs110980742 | 5.0e-10 | rs210064667 | 0.816 | UTP18 | Missense | c.85G > A |
      | SFA | 19 | 36187954 | rs110980742 | 5.0e-10 | rs382000222 | 0.848 | UTP18 | Missense | c.79T > A |
      | UFA | 19 | 36187954 | rs110980742 | 4.8e-10 | rs210064667 | 0.816 | UTP18 | Missense | c.85G > A |
      | UFA | 19 | 36187954 | rs110980742 | 4.8e-10 | rs382000222 | 0.848 | UTP18 | Missense | c.79T > A |
      | MCFA | 19 | 51319673 | rs137270097 | 1.4e-13 | rs41921160 | 0.974 | CCDC57 | Missense | c.1907T > C |
      | SFA | 26 | 21149680 | rs385285356 | 2.1e-10 | rs41255693 | 0.915 | SCD | Splice region | c.569C > T |
      | UFA | 26 | 21149680 | rs385285356 | 1.1e-10 | rs41255693 | 0.915 | SCD | Splice region | c.569C > T |
      | Individual milk protein (g/L of total volume) | | | | | | | | | |
      | α-CN | 6 | 87133508 | rs109500363 | 4.3e-12 | rs382793163 | 0.856 | ENSBTAG00000039991 | Missense | c.1406G > A |
      | α-CN | 6 | 87133508 | rs109500363 | 4.3e-12 | rs385603965 | 0.839 | ENSBTAG00000003523 | Missense | c.1378C > T |
      | α-CN | 6 | 87133508 | rs109500363 | 4.3e-12 | rs43703010 | 0.923 | CSN1S1 | Missense | c.620A > G |
      | κ-CN | 6 | 87405588 | rs110794953 | 6.4e-28 | rs109739692 | 0.805 | ODAM | Missense | c.520G > A |
      | κ-CN | 6 | 87405588 | rs110794953 | 6.4e-28 | rs43703015 | 0.988 | CSN3 | Missense | c.470T > C |
      | κ-CN | 6 | 87405588 | rs110794953 | 6.4e-28 | rs43703016 | 0.985 | CSN3 | Missense | c.506C > A |
      | β-LG | 11 | 103291134 | rs110270048 | 8.7e-117 | rs110066229 | 1 | PAEP | Missense | c.239G > A |
      | β-LG | 11 | 103291134 | rs110270048 | 8.7e-117 | rs109990218 | 0.997 | PAEP | Splice region | c.305–5A > T |
      | β-LG | 11 | 103291134 | rs110270048 | 8.7e-117 | rs109625649 | 0.985 | PAEP | Missense | c.401T > C |
      | α-CN | 11 | 103292575 | rs381050299 | 5.6e-10 | rs110066229 | 0.965 | PAEP | Missense | c.239G > A |
      | α-CN | 11 | 103292575 | rs381050299 | 5.6e-10 | rs109990218 | 0.962 | PAEP | Splice region | c.305–5A > T |
      | α-CN | 11 | 103292575 | rs381050299 | 5.6e-10 | rs109625649 | 0.950 | PAEP | Missense | c.401T > C |
      | Lf³ | 22 | 53538882 | rs43765460 | 1.8e-41 | | | | | |

      1 Peak variants for directly measured fatty acid traits with significant association effects; Bonferroni threshold: −log10(4.3e-09).
      2 Trait definitions and units as described in Table 1.
      3 Cube-root transformation of lactoferrin (Lf).
      Table 4. Peak variants for Fourier-transform mid-infrared (FT-MIR) predicted fatty acid and protein traits with significant association effects (see footnote 1)
      Trait (see footnote 2) | Chr | Position (bp) | Tag variant ID | P-value | Protein coding variant ID | LD (R2) | Gene | Class | Description
      Individual fatty acid (g/100 g of total fat)
       C12:1 | 11 | 103301736 | rs41255687 | 6.3e-11 | rs110066229 | 0.988 | PAEP | Missense | c.239G > A
       C12:1 | 11 | 103301736 | rs41255687 | 6.3e-11 | rs109625649 | 0.991 | PAEP | Missense | c.401T > C
       C18:3 cis-3 | 14 | 2502770 | rs137422574 | 1.0e-12 | rs109403601 | 0.988 | ENSBTAG00000003606 | Missense | c.154C > G
       C18:1 cis-9 | 14 | 2528807 | rs110275497 | 1.3e-10 | rs109403601 | 1 | ENSBTAG00000003606 | Missense | c.154C > G
       C6:0 | 17 | 52971731 | rs207997694 | 9.9e-16
       C4:0 | 17 | 53034516 | rs461037541 | 1.5e-17
       C10:0 | 19 | 51314476 | rs41922143 | 7.0e-13 | rs41921160 | 0.989 | CCDC57 | Missense | c.1907T > C
       C12:0 | 19 | 51314476 | rs41922143 | 3.8e-12 | rs41921160 | 0.989 | CCDC57 | Missense | c.1907T > C
       C14:0 | 19 | 51314476 | rs41922143 | 7.0e-12 | rs41921160 | 0.989 | CCDC57 | Missense | c.1907T > C
       C8:0 | 19 | 51326050 | rs136424304 | 8.9e-10 | rs41921160 | 0.996 | CCDC57 | Missense | c.1907T > C
       C14:1 | 26 | 21174891 | rs209445650 | 1.9e-09
       C10:1 | 26 | 25584818 | rs210921941 | 5.8e-10
       C18:3 cis-3 | 27 | 36200888 | rs110950972 | 9.9e-15
       C16:0 | 27 | 36204679 | | 1.6e-09
      Grouped fatty acid (g/100 g of total fat)
       UFA | 14 | 2319003 | rs110182536 | 8.1e-10 | rs109403601 | 0.947 | ENSBTAG00000003606 | Missense | c.154C > G
       SCFA | 17 | 53034516 | rs461037541 | 7.1e-22
       UFA | 19 | 50919823 | rs380534925 | 8.8e-10
       MCFA | 19 | 51314476 | rs41922143 | 9.2e-13 | rs41921160 | 0.989 | CCDC57 | Missense | c.1907T > C
       UFA | 26 | 21138011 | rs381655271 | 2.6e-10 | rs41255693 | 0.914 | SCD | Splice region | c.569C > T
      Individual milk protein (g/L of total volume)
       κ-CN | 6 | 87085918 | | 8.2e-21 | rs209798512 | 0.761 | ENSBTAG00000038520 | Missense | c.1623G > C
       κ-CN | 6 | 87085918 | | 8.2e-21 | rs211555767 | 0.761 | ENSBTAG00000038520 | Missense | c.1301C > T
       κ-CN | 6 | 87085918 | | 8.2e-21 | rs382793163 | 0.725 | ENSBTAG00000039991 | Missense | c.1406G > A
       κ-CN | 6 | 87085918 | | 8.2e-21 | rs385603965 | 0.711 | ENSBTAG00000003523 | Missense | c.1378C > T
       κ-CN | 6 | 87085918 | | 8.2e-21 | rs43703010 | 0.787 | CSN1S1 | Missense | c.620A > G
       α-CN | 6 | 87133508 | rs109500363 | 7.0e-11 | rs382793163 | 0.856 | ENSBTAG00000039991 | Missense | c.1406G > A
       α-CN | 6 | 87133508 | rs109500363 | 7.0e-11 | rs385603965 | 0.839 | ENSBTAG00000003523 | Missense | c.1378C > T
       α-CN | 6 | 87133508 | rs109500363 | 7.0e-11 | rs43703010 | 0.923 | CSN1S1 | Missense | c.620A > G
       β-CN | 11 | 103299272 | rs110563549 | 8.3e-19 | rs110066229 | 0.997 | PAEP | Missense | c.239G > A
       β-CN | 11 | 103299272 | rs110563549 | 8.3e-19 | rs109625649 | 0.988 | PAEP | Missense | c.401T > C
       β-LG | 11 | 103299272 | rs110563549 | 5.4e-116 | rs110066229 | 0.997 | PAEP | Missense | c.239G > A
       β-LG | 11 | 103299272 | rs110563549 | 5.4e-116 | rs109625649 | 0.988 | PAEP | Missense | c.401T > C
       α-CN | 14 | 1799066 | rs385135066 | 4.8e-12 | rs134364612 | 0.737 | SLC52A2 | Missense | c.724A > G
       α-CN | 14 | 1799066 | rs385135066 | 4.8e-12 | rs109234250 | 0.737 | DGAT1 | Missense | c.694G > A
      1 Peak variants for FT-MIR predicted fatty acid traits with significant association effects; Bonferroni threshold: −log10(4.3e-09).
      2 Trait definitions and units as described in Table 1.
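The significance threshold quoted in the table footnotes can be reproduced with a few lines of arithmetic. The sketch below assumes a family-wise error rate of 0.05, which is not stated in this section, and the function names are illustrative only. It converts the threshold to the −log10 scale, back-calculates the number of tests the threshold would imply under that assumption, and checks two P-values reported in the text.

```python
import math

# Threshold quoted in the footnotes of Tables 3 and 4.
THRESHOLD = 4.3e-09

# Assumption: a conventional family-wise error rate of 0.05 (not stated in the text).
ALPHA = 0.05


def neg_log10(p_value: float) -> float:
    """Return a P-value on the -log10 scale used for Manhattan plots."""
    return -math.log10(p_value)


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of tests for which alpha / n equals the given Bonferroni threshold."""
    return alpha / threshold


def is_significant(p_value: float, threshold: float = THRESHOLD) -> bool:
    """True when a P-value falls below the Bonferroni threshold."""
    return p_value < threshold


threshold_log = neg_log10(THRESHOLD)                   # about 8.37
n_tests = implied_number_of_tests(ALPHA, THRESHOLD)    # about 11.6 million under the 0.05 assumption

# FT-MIR predicted C14:1 (Table 4) and the non-significant effect near FASN.
c14_1_pass = is_significant(1.9e-09)
fasn_pass = is_significant(2.4e-07)
```

Under the 0.05 assumption the threshold corresponds to roughly 11.6 million tested variants; a different error rate would imply a different count.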

      Short-Chain Fatty Acids

      Prominent peaks were observed on BTA17 for the short-chain fatty acids C4:0 and C6:0 (Table 3, Table 4; Figure 1). For directly measured and FT-MIR predicted C4:0, these peaks were underpinned by the same QTL at chromosome (Chr) 17:53.03 Mbp (rs461037541). A peak of similar magnitude was also observed for FT-MIR predicted C6:0 at a nearby locus (rs207997694), with a less significant peak for directly measured C6:0 at that same locus. Significant effects were also observed at the rs461037541 locus for directly measured SCFA (P-value = 1.2e-14) and FT-MIR predicted SCFA (P-value = 7.1e-22). The 2 implicated loci for the peaks on BTA17 were situated between the AACS and BRI3BP genes, and visual examination revealed several significant variants across both genes. The AACS gene codes for the enzyme acetoacetyl-CoA synthetase, which forms an important metabolic link between the ketone body acetoacetate on one hand, and the tricarboxylic acid cycle and fat synthesis on the other (Bergman, "Hyperketonemia-ketogenesis and ketone body metabolism"). As this gene is highly expressed in both adipose and mammary tissue (NCBI Bioprojects PRJEB4337 and PRJEB2445), AACS makes a good candidate for the causal gene underlying fatty acid QTL in this region. Knutsen et al. ("Unravelling genetic variation underlying de novo-synthesis of bovine milk fatty acids") also reported an effect for C4:0 fatty acids in this region and suggested that the QTL may be the result of a regulatory effect.

      Medium-Chain Fatty Acids

      Significant effects were observed on BTA11, BTA19, and BTA26 for medium-chain fatty acids (Table 3, Table 4; Figure 2). The peak on BTA11 was underpinned by a Chr11:103.30 Mbp locus (rs41255687) and was observed for FT-MIR predicted C12:1, but was absent for directly measured C12:1. This locus was in high LD (R2 > 0.98) with 2 missense mutations in the PAEP gene, which encodes the major whey protein, β-LG. One of the missense mutations reported (rs109625649; V134A) is a variant that distinguishes the 'A' and 'B' haplotypes of β-LG (Caroli et al., "Invited review: Milk protein polymorphisms in cattle: Effect on animal breeding and human nutrition"), where the 'A' haplotype is known to be associated with higher levels of β-LG expression. The PAEP locus has also been linked to FT-MIR wavenumbers characterized by carboxylic C=O bond stretching (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"). This type of bond is found in both fats and proteins, strongly suggesting that the peak observed for the FT-MIR predicted phenotype is a false positive due to contamination of the signal by varying levels of β-LG expression.
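The LD values quoted throughout this section are squared correlations between pairs of variants. As a minimal sketch, assuming genotypes coded as allele dosages (0, 1, or 2 copies of the alternate allele), the R2 between a tag variant and a candidate coding variant can be computed as the squared Pearson correlation of the two dosage vectors. The genotype vectors and function names below are hypothetical and serve only to illustrate the calculation; they are not data from this study, and the LD estimation procedure actually used is described elsewhere in the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ld_r2(dosage_a, dosage_b):
    """Squared correlation between two variants' allele dosages."""
    return pearson_r(dosage_a, dosage_b) ** 2


# Hypothetical dosages for 10 animals (illustration only).
tag_variant = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
coding_variant = [0, 1, 2, 1, 0, 2, 1, 0, 2, 0]  # differs from the tag in one animal

r2_same = ld_r2(tag_variant, tag_variant)       # a variant is in perfect LD with itself
r2_pair = ld_r2(tag_variant, coding_variant)    # high, but below 1
```

A single discordant genotype among 10 animals is enough to pull R2 noticeably below 1, which is why values such as 0.98 in a sample of several hundred animals indicate near-complete concordance.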
      Several QTL were identified for directly measured and FT-MIR predicted medium-chain fatty acids (C10:0, C12:0, C14:0) on BTA19 that were in high LD (R2 > 0.97) with a missense mutation (rs41921160) in the CCDC57 gene (Table 3, Table 4; Figure 2). Significant effects were also observed in this region for FT-MIR predicted C8:0 (P-value = 8.9e-10; Figure 1), and for directly measured (P-value = 1.4e-13) and FT-MIR predicted MCFA (P-value = 9.2e-13; Figure 5). The CCDC57 gene encodes a coiled-coil domain-containing protein that is expressed in the bovine mammary gland (Medrano, J., G. Rincon, and A. Islas-Trejo. 2010. Comparative analysis of bovine milk and mammary gland transcriptome using RNA-Seq. 9th World Congr. Genet. Appl. Livest. Prod. Leipz. Ger. 852). Previous studies have implicated the same or a nearby locus as having a significant association with fatty acids (Bouwman et al., "Fine mapping of a quantitative trait locus for bovine milk fat composition on Bos taurus autosome 19"; Knutsen et al., "Unravelling genetic variation underlying de novo-synthesis of bovine milk fatty acids"; Palombo et al., "Genome-wide association study of milk fatty acid composition in Italian Simmental and Italian Holstein cows using single nucleotide polymorphism arrays") and fat composition (Tribout et al., "Confirmed effects of candidate variants for milk production, udder health, and udder morphology in dairy cattle") in bovine milk. Significant effects have also been reported at a nearby locus for several FT-MIR wavenumbers characterized by carboxylic C=O bond stretching (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"). Bouwman et al. ("Fine mapping of a quantitative trait locus for bovine milk fat composition on Bos taurus autosome 19") examined this region in depth using HD genotypes and identified 2 possible genes underlying an effect for C14:0, CCDC57 and FASN. The missense mutation we have highlighted (rs41921160) is located within the same region as the most highly associated variants in that study, and was in perfect LD with the set of 8 intronic HD variants with the most highly associated effects. On closer examination of the association effects for C10:0, C12:0, and C14:0 in our study, we determined that alongside the most highly associated variants in the QTL peaks, there were 47 other imputed whole-genome sequence variants between 51,306,219 and 51,330,072 bp that were in perfect LD with one another (including the missense variant rs41921160), with only marginally less significant P-values. A small cluster of association effects near to or in the FASN gene was also observed, with the most significant of these for directly measured C14:0 being at 51,380,689 bp, but the P-value for that effect was not significant (P-value = 2.4e-07). To assess whether multiple QTL were present in this region, we repeated the GWAS, correcting for the rs136424304 locus by including it as a covariate in the BOLT-LMM model. After this correction, no significant effects remained in a 1 Mbp region around the Chr19:51.32 Mbp locus, and the association effect near the FASN gene at 51,380,689 bp dropped in significance to a P-value of 3.9e-02. Although our analysis provides evidence that the effect in this region may be underpinned by a missense mutation in the CCDC57 gene, the functional candidacy of FASN remains and such an effect would need to be confirmed by functional analysis.
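The logic of the conditional analysis can be illustrated with a toy example. The sketch below is not the mixed-model procedure used in the study; it assumes ordinary least squares and uses small, hand-constructed genotype and phenotype vectors (all hypothetical). The phenotype and a second variant are each residualized on the lead variant, and the residuals are then correlated. If the second variant's signal arises only through LD with the lead variant, its association disappears after conditioning.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def corr(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residualize(y, x):
    """Residuals of y after simple linear regression on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def conditional_corr(phenotype, variant, covariate):
    """Correlation of phenotype and variant after adjusting both for the covariate."""
    return corr(residualize(phenotype, covariate), residualize(variant, covariate))


# Hypothetical data for 8 animals (illustration only).
lead = [0, 0, 1, 1, 1, 1, 2, 2]        # lead variant dosages
nearby = [0, 0, 1, 1, 1, 1, 2, 1]      # second variant in strong LD with the lead
noise = [1, -1, 1, -1, -1, 1, 0, 0]    # residual variation, unrelated to either variant
phenotype = [2.0 * g + 0.5 * e for g, e in zip(lead, noise)]

marginal = corr(phenotype, nearby)                       # strong before conditioning
conditional = conditional_corr(phenotype, nearby, lead)  # vanishes after conditioning
```

In this constructed example the nearby variant has no effect of its own, so the conditional correlation is zero; a second, independent QTL would instead leave a residual association after the lead variant is fitted.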
      Multiple QTL were identified for directly measured medium-chain fatty acids on BTA26 (Table 3; Figure 2). The most significant effects were observed at Chr26:21.15 Mbp for directly measured C10:1 (rs41255688; P-value = 1.8e-48) and C14:1 (rs385285356; P-value = 6.1e-61). These loci were in high LD (R2 = 0.92) with a splice region variant (rs41255693) in the SCD gene. The SCD gene was also identified as encompassing other effects with less significant P-values for directly measured C10:0, C14:0, SFA, and UFA (Table 3), and FT-MIR predicted UFA (Table 4). Stearoyl-CoA desaturase is an enzyme that plays an important role in biosynthesis of monounsaturated fatty acids (Bernard et al., "Characterisation and nutritional regulation of the main lipogenic genes in the ruminant lactating mammary gland"; Paton and Ntambi, "Biochemical and physiological function of stearoyl-CoA desaturase"), and has previously been reported in other studies of fatty acids in bovine milk (Mele et al., "Stearoyl-Coenzyme A desaturase gene polymorphism and milk fatty acid composition in Italian Holsteins"; Moioli et al., "Short Communication: Effect of stearoyl-Coenzyme A desaturase polymorphism on fatty acid composition of milk"; Schennink et al., "Milk fatty acid unsaturation: Genetic parameters and effects of stearoyl-CoA desaturase (SCD1) and acyl CoA: Diacylglycerol acyltransferase 1 (DGAT1)"; Kgwatalala et al., "Stearoyl-CoA desaturase 1 3′UTR SNPs and their influence on milk fatty acid composition of Canadian Holstein cows"; Conte et al., "Diacylglycerol acyltransferase 1, stearoyl-CoA desaturase 1, and sterol regulatory element binding protein 1 gene polymorphisms and milk fatty acid composition in Italian Brown cattle"; Bouwman et al., "Genome-wide association of milk fatty acids in Dutch dairy cattle"). The strong effect we see for directly measured C14:1 in the SCD gene is unsurprising, given that C14:0 in milk fat is predominantly derived from de novo synthesis in the mammary gland, meaning that almost all C14:1 cis-9 is likely to have been synthesized by stearoyl-CoA desaturase (Bernard et al., "Characterisation and nutritional regulation of the main lipogenic genes in the ruminant lactating mammary gland"). Interestingly, although we found a significant effect for FT-MIR predicted UFA at a nearby locus that was also in high LD with the rs41255693 splice region variant (R2 = 0.91), no other effects were identified within the SCD gene for individual FT-MIR predicted fatty acids. A peak for FT-MIR predicted C14:1 was tagged by a nearby Chr26:21.17 Mbp locus (rs209445650; Table 4). However, the LD between the rs209445650 locus and the splice region variant identified for directly measured fatty acids (rs41255693) was moderately low (R2 = 0.32). Moreover, in a recent GWAS of individual FT-MIR wavenumbers, there was no evidence of an association effect linked to the SCD gene (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"), indicating that changes in milk composition due to this gene may be difficult to detect using FT-MIR spectral data. However, we may also view the absence of FT-MIR predicted trait QTL in the SCD gene within the context of trait prediction accuracy. The largest QTL underpinned by SCD in directly measured fatty acids were for C10:1 (P-value = 1.8e-48) and C14:1 (P-value = 6.1e-61). The prediction accuracies for these traits were relatively poor: C10:1 (R2cv = 0.30; RPEcv = 0.16) and C14:1 (R2cv = 0.41; RPEcv = 0.23; Table 1). It is also notable that for C10:1 and C14:1, the heritability estimates of the FT-MIR predictions were lower than those for direct measurements of these traits. This contrasts with the typical pattern for nearly all other fatty acids, where the heritability for the FT-MIR prediction was greater than the heritability for the directly measured trait. In particular, the heritability estimate for directly measured C14:1 was 0.55, whereas the heritability estimate for FT-MIR predicted C14:1 was 0.26 (Table 2). Low prediction accuracy and a substantially lower heritability estimate for FT-MIR predicted C14:1 may in part be explained by C14:1 being at relatively low concentrations in milk samples, particularly compared with saturated fatty acids. Specifically, C14:1 had a mean concentration of 0.75 g/100 g of total fat, compared with mean concentrations of 1.54 to 27.64 g/100 g of total fat for the individual saturated fatty acids included in this study (Table 1). Potentially, it may be possible to improve trait prediction accuracies, heritability estimates, and QTL identification for C14:1 by basing FT-MIR predictions on the ratio of C14:1 to C14:0, as in the study by Arnould et al. ("Genetic variability of test-day stearoyl coenzyme-A desaturase 9 activity"). That study highlights that genetic variation and heritability estimates for the ratio of C14:1 to C14:0 change throughout lactation, so it may also be valuable to examine other methods of accounting for stage of lactation, such as using Legendre polynomials within random regression models.
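To make the two suggestions above concrete, the following sketch computes a C14:1 to C14:0 ratio and the normalized Legendre polynomial covariates commonly used as regressors for stage of lactation in random regression models. It is an illustration under stated assumptions rather than a description of any analysis in this study: the days-in-milk range (5 to 305), the C14:0 concentration, and all function names are hypothetical.

```python
from math import sqrt


def c14_ratio(c14_1: float, c14_0: float) -> float:
    """Ratio of C14:1 to C14:0, both in g/100 g of total fat."""
    return c14_1 / c14_0


def standardize_dim(dim: float, dim_min: float = 5.0, dim_max: float = 305.0) -> float:
    """Map days in milk onto [-1, 1]; the 5 to 305 d range is an assumption."""
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


def legendre_covariates(x: float, order: int) -> list:
    """Normalized Legendre polynomials of degree 0..order evaluated at x in [-1, 1].

    Uses the recurrence (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
    and the normalization sqrt((2k + 1) / 2).
    """
    p = [1.0, x]
    for k in range(1, order):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    return [sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]


# Mean C14:1 from the text (0.75 g/100 g fat); the C14:0 value is hypothetical.
ratio = c14_ratio(0.75, 10.0)

# Covariates for a record at the end and at the midpoint of the assumed lactation range.
late = legendre_covariates(standardize_dim(305.0), order=2)
mid = legendre_covariates(standardize_dim(155.0), order=2)
```

Each test-day record would contribute one such covariate vector, and the random regression coefficients fitted on these covariates allow the genetic variance of the ratio to change across lactation.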
      One further QTL was observed for directly measured C10:1 and C12:1 on BTA26 at a Chr26:26.46 Mbp locus (rs445758306; Table 3; Figure 2). This locus was in high LD (R2 = 0.76) with a missense mutation in the ITPRIP gene (rs379463458). The ITPRIP gene modulates intracellular messaging by binding the inositol 1,4,5-triphosphate receptor ITPR. This gene has not previously been reported in GWAS of bovine milk composition, and the potential role it may play in the regulation of bovine milk fatty acids is unclear. An alternative potential candidate gene that the Chr26:26.46 Mbp locus maps close to is SORCS3, which encodes the sortilin-related receptor SorCS3. Sortilins are involved in regulating glucose transport into cells in response to insulin (Huang et al., "Insulin responsiveness of glucose transporter 4 in 3T3–L1 cells depends on the presence of sortilin"). A potential mechanism by which this gene could influence milk fatty acid concentrations is via changing the supply of glucose available for the pentose phosphate pathway, which in turn provides the nicotinamide adenine dinucleotide phosphate necessary for fatty acid synthesis.

      Long-Chain Fatty Acids

      Two QTL were identified on BTA14 for directly measured individual long-chain fatty acids (Table 3; Figure 3, Figure 4). One of these was at a Chr14:1.80 Mbp locus (rs385135066) that had a significant effect for directly measured C16:0 (P-value = 1.2e-12). This locus was in high LD (R2 = 0.74) with missense mutations in the SLC52A2 and DGAT1 genes. The other QTL was for directly measured C18:1 cis-9 at a Chr14:1.76 Mbp locus (rs208417762), which was also in high LD (R2 = 0.92) with missense mutations in the SLC52A2 and DGAT1 genes. Closer examination of association effects for FT-MIR predicted C16:0 revealed evidence of a peak at this locus, but the peak was marginally below the significance threshold. Notably, in the present study, the identified protein coding mutation in the SLC52A2 gene (rs134364612) was in perfect LD with the DGAT1 K232A polymorphism (rs109234250), which has been linked to changes in bovine milk fat composition (Grisart et al., "Positional candidate cloning of a QTL in dairy cattle: Identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition"; Fink et al., "A new mechanism for a familiar mutation - Bovine DGAT1 K232A modulates gene expression through multi-junction exon splice enhancement") and fatty acids (Bouwman et al., "Genome-wide association of milk fatty acids in Dutch dairy cattle"; Buitenhuis et al., "Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle"; Li et al., "Genome wide association study identifies 20 novel promising genes associated with milk fatty acid traits in Chinese Holstein"). The DGAT1 gene encodes diacylglycerol O-acyltransferase 1, an enzyme that catalyzes the final step in triglyceride production, thus making this a compelling candidate for the observed effects.
      Two further QTL were identified for FT-MIR predicted C16:0 and C18:3 cis-3 at Chr27:36.20 Mbp loci that were not in high LD with a splice region variant, or a moderate or high impact coding variant (Table 4; Figure 3, Figure 4). However, the locus for C18:3 cis-3 (rs110950972) was in perfect LD with a 5′ untranslated region variant (rs208675276) in GPAT4, and the locus for C16:0 was also in high LD (R2 = 0.997) with that same variant. Interestingly, we found no evidence of QTL for the corresponding directly measured traits (Figure 3, Figure 4). The Chr27:36.20 Mbp loci are situated in the GPAT4 gene, which encodes the triglyceride synthesis enzyme glycerol-3-phosphate acyltransferase 4. As milk fat percentage and other QTL at this locus have previously been shown to operate via a mechanism linked to gene expression (Littlejohn et al., "Expression variants of the lipogenic AGPAT6 gene affect diverse milk composition phenotypes in Bos taurus"), it is not surprising that no significant coding mutations were identified in GPAT4.

      Other Grouped Fatty Acid Effects

      Further significant effects were observed for directly measured SFA and UFA at a Chr19:36.19 Mbp locus (rs110980742), which was in high LD (R2 > 0.81) with 2 missense mutations in the UTP18 gene (Table 3; Figure 5). This effect was not observed in any other individual or grouped fatty acid traits. The UTP18 gene is involved in the nucleolar processing of pre-18S ribosomal RNA, and has not previously been reported in GWAS of bovine milk composition. The signal at Chr19:36.19 Mbp is close to the KCNJ12 gene, which has a similar function to the KCNJ2 gene that has previously been shown to affect milk phenotypes (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"), although a mechanism by which this gene could affect fatty acids is unclear.

      Individual Milk Proteins

      Significant effects were observed on BTA6, BTA11, BTA14, and BTA22 for individual milk proteins (Table 3, Table 4; Figure 6). Four QTL were identified on BTA6, 2 of which were for directly measured and FT-MIR predicted α-CN, and the other 2 for directly measured and FT-MIR predicted κ-CN, respectively. The effects for α-CN were observed at a Chr6:87.13 Mbp locus (rs109500363) that was in high LD (R2 = 0.92) with a missense mutation in the CSN1S1 gene (rs43703010). As the CSN1S1 gene codes for the α-CN protein (along with CSN1S2), it is not surprising that genetic signals affecting α-CN were enriched at this locus. Interestingly, FT-MIR predicted κ-CN also had a significant effect in the same region that was also in high LD (R2 = 0.79) with rs43703010. The effect for directly measured κ-CN was observed at a Chr6:87.41 Mbp locus (rs110794953), which was in high LD (R2 > 0.98) with 2 missense mutations in the CSN3 gene (rs43703015 and rs43703016). The CSN3 gene encodes κ-CN, an abundantly expressed milk protein. One of the missense mutations reported here (rs43703015) has previously been associated with milk composition traits and differential expression in mammary tissue (MacLeod et al., "Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits"). Significant effects have also been reported at this locus for a number of FT-MIR wavenumbers characterized by amide III and phosphate bands, C–H stretching vibrations of –CH2 and –CH3, and N–H bending and C–N stretching in the amide II band (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle").
      Several QTL were identified for individual milk proteins on BTA11 that were in high LD (R2 > 0.95) with missense mutations in the PAEP gene (rs110066229; rs109625649; Table 3, Table 4; Figure 6). Of these, the QTL with the most significant effects were observed for directly measured β-LG (P-value = 8.7e-117) and FT-MIR predicted β-LG (P-value = 5.4e-116). Smaller association effects were also observed for directly measured α-CN (P-value = 5.6e-10) and FT-MIR predicted β-CN (P-value = 8.3e-19). One of the implicated missense mutations in the PAEP gene was the V134A mutation (rs109625649) that distinguishes the 'A' and 'B' haplotypes of β-LG (previously described). This locus is likely driven by a regulatory effect: a promoter variant in LD with the V134A mutation has previously been reported (Lum et al., "Polymorphisms of bovine β-lactoglobulin promoter and differences in the binding affinity of activator protein-2 transcription factor") to affect the binding of the activator protein-2 transcription factor. An expression QTL (eQTL) for PAEP has also been reported in lactating bovine mammary tissue (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"; Davis et al., "Screening for phenotypic outliers identifies an unusually low concentration of a β-lactoglobulin B protein isoform in bovine milk caused by a synonymous SNP").
      One further QTL of interest was for directly measured Lf at a Chr22:53.54 Mbp locus (rs43765460; Table 3; Figure 6). The association effect at this locus had a P-value of 1.8e-41, but we found no relevant splice region variant, or moderate or high impact coding variant, ascribed to this effect. However, the rs43765460 locus is a synonymous variant in the LTF gene. Using our previously published mammary RNA sequence dataset and eQTL mapping methodology (Lopdell et al., "DNA and RNA-sequence based GWAS highlights membrane-transport genes as key modulators of milk lactose content"; Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"), we identified the presence of a co-localized expression-based effect for LTF in this region. The rs43765460 locus we identified was in high LD with the top associated eQTL variant for Lf (R2 = 0.88), and the Pearson correlation between the −log10(P-values) of the directly measured Lf QTL and the −log10(P-values) of the Lf eQTL within a 1 Mbp region flanking the rs43765460 variant was 0.92. The LTF gene encodes lactoferrin, a major iron-binding protein in milk that is linked to iron homeostasis and plays a key role in immune system response and cell growth. Previous studies have shown that the LTF gene is linked to changes in Lf concentrations in bovine milk (Bahar et al., "Bovine lactoferrin (LTF) gene promoter haplotypes have different basal transcriptional activities"; Pawlik et al., "Lactoferrin gene variants, their expression in the udder and mastitis susceptibility in dairy cattle"). Notably, there was no evidence of an association effect at or near this locus for FT-MIR predicted Lf (Table 4). Further, in a recent GWAS of individual FT-MIR wavenumbers, there was also no evidence of an association effect linked to the LTF gene (Tiplady et al., "Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle"), indicating that changes in milk composition due to this gene may not be easily detectable using FT-MIR spectral data. However, it is also important to note that prediction accuracies for Lf in the present study were relatively poor (R2cv = 0.36; RPEcv = 0.19; Table 1), and the heritability estimate for FT-MIR predicted Lf was only 0.30, compared with 0.59 for directly measured Lf (Table 2). This pattern is similar to that which we observed for C14:1 and the SCD gene. That is, the component was at relatively low concentrations in the milk sample, model prediction accuracy was relatively poor, the heritability for the measured trait was substantially higher than for the FT-MIR predicted trait, and a compelling peak was observed for the directly measured trait; however, no corresponding peak was observed for the FT-MIR predicted trait.
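The co-localization check described above for the Lf QTL and the LTF eQTL amounts to correlating two association profiles over the same set of variants. A minimal sketch follows, assuming both analyses report a P-value for each variant in the window and that the two lists are aligned by variant. The P-values and function names are hypothetical and chosen only to show the calculation.

```python
import math


def neg_log10(p_values):
    """Transform P-values to the -log10 scale."""
    return [-math.log10(p) for p in p_values]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical P-values for the same 5 variants in a window (illustration only).
trait_qtl_p = [1e-3, 1e-10, 1e-41, 1e-12, 1e-2]   # trait GWAS
expression_p = [1e-2, 1e-8, 1e-30, 1e-9, 1e-1]    # eQTL analysis for the candidate gene

profile_correlation = pearson_r(neg_log10(trait_qtl_p), neg_log10(expression_p))
```

A correlation close to 1 indicates that the same variants rank highest in both analyses, which is consistent with, although not proof of, a shared causal variant acting through gene expression.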

      Perspectives on the Use of FT-MIR Trait Predictions in Dairy Cattle Selection

      Utilizing FT-MIR predictions for fatty acids and proteins in milk can provide indicator traits across large numbers of animals at little or no marginal cost, because FT-MIR spectral data are already generated as part of routine milk testing to predict total fat and protein concentrations. The alternative to using FT-MIR trait predictions is to directly measure traits, which may be infeasible across even relatively small numbers of animals. Phenotypic correlations between directly measured and FT-MIR predicted traits provide a useful indication of the utility of FT-MIR trait predictions, particularly for herd management and milk processability traits. However, for breeding programs, we are also interested in the heritability of the FT-MIR predicted trait and the genetic correlation between the directly measured and FT-MIR predicted trait. This is because the heritability of the FT-MIR predicted trait defines the level of genetic variation present, whereas the genetic correlation between the directly measured and FT-MIR predicted trait defines the breeding progress we could expect in the directly measured trait if we were to select animals based on the FT-MIR predicted trait. Specifically, within the context of dairy cattle progeny test schemes, the genetic correlation will limit the relative amount of selection response that will result from using FT-MIR predictions instead of directly measured traits (
      • Rutten M.J.M.
      • Bovenhuis H.
      • van Arendonk J.A.M.
      The effect of the number of observations used for Fourier transform infrared model calibration for bovine milk fat composition on the estimated genetic parameters of the predicted data.
      ). Based on this assumption, the genetic gain from selection using FT-MIR predictions for all traits we have studied would be greater than 70% of the gains achievable by direct selection on these traits; additionally, for 21 of the 29 traits, the genetic gains achievable would be greater than 85% of the gains achievable by direct selection. However, it is important to note that this assumes that there is no true difference in heritability between the directly measured and FT-MIR predicted trait. For traits such as Lf and C14:1, where the estimated heritability of the FT-MIR prediction was lower than that of the direct measurement, the genetic gain achievable would also be lower.
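      The reasoning above can be illustrated with the textbook expression for indirect selection: with equal selection intensity and selection on a single own record, the response in the target trait relative to direct selection is r_g × sqrt(h²_predicted / h²_measured), which reduces to r_g when the two heritabilities are equal. The sketch below is an illustration under those assumptions, not the calculation used in this study; the genetic correlation of 0.9 is a hypothetical value, whereas the heritabilities of 0.30 and 0.59 are the Lf estimates quoted in the text.

```python
import math


def relative_response(r_g, h2_predicted, h2_measured):
    """Response from selecting on the predicted trait, as a fraction of the
    response from direct selection on the measured trait.

    Assumes equal selection intensity and selection on one own record.
    """
    if not (0 < h2_predicted <= 1 and 0 < h2_measured <= 1):
        raise ValueError("heritabilities must lie in (0, 1]")
    return r_g * math.sqrt(h2_predicted / h2_measured)


# Equal heritabilities: relative response equals the genetic correlation.
equal_h2 = relative_response(r_g=0.85, h2_predicted=0.40, h2_measured=0.40)

# Lf-like case: heritabilities from the text, hypothetical r_g of 0.9.
lf_like = relative_response(r_g=0.9, h2_predicted=0.30, h2_measured=0.59)

print(f"Equal heritabilities: {equal_h2:.2f}")
print(f"Lower heritability for the predicted trait: {lf_like:.2f}")
```

      In the second case the relative response falls below the genetic correlation itself, which is the point made above for traits such as Lf and C14:1.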
      Although we observed high genetic correlations between directly measured and FT-MIR predicted traits in this study, the QTL underlying each trait were not always the same. For example, we observed a large association effect within the GPAT4 gene on BTA27 for FT-MIR predicted C18:3 cis-3, but no corresponding association effect for directly measured C18:3 cis-3 (Figure 4). Similarly, a large association effect was observed for FT-MIR predicted β-CN within the PAEP gene on BTA11, but no corresponding association effect was observed in directly measured β-CN (Figure 6). The presence of QTL with significant effects in an FT-MIR predicted trait only is not entirely surprising, given that FT-MIR predicted traits are a weighted linear function of absorbance values for individual wavenumbers, each of which may be underpinned by multiple genetic signals and QTL (
      • Wang Q.
      • Bovenhuis H.
      Genome-wide association study for milk infrared wavenumbers.
      ;
      • Benedet A.
      • Ho P.N.
      • Xiang R.
      • Bolormaa S.
      • De Marchi M.
      • Goddard M.E.
      • Pryce J.E.
      The use of mid-infrared spectra to map genes affecting milk composition.
      ;
      • Zaalberg R.M.
      • Janss L.
      • Buitenhuis A.J.
      Genome-wide association study on Fourier transform infrared milk spectra for two Danish dairy cattle breeds.
      ;
      • Tiplady K.M.
      • Lopdell T.J.
      • Reynolds E.
      • Sherlock R.G.
      • Keehan M.
      • Johnson T.J.J.
      • Pryce J.E.
      • Davis S.R.
      • Spelman R.J.
      • Harris B.L.
      • Garrick D.J.
      • Littlejohn M.D.
      Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle.
      ). This means that when a spectral wavenumber is included in a trait prediction equation, multiple genetic signals will also be present, some of which are specifically related to the trait of interest and some that are not. It is important that when FT-MIR predicted traits are used as proxies for other traits, we are mindful of this, particularly when using SNP-based approaches in our estimation of breeding values, whereby the impact will be determined by the relative proportion of genetic variation captured by each SNP and the interaction of additive effects between SNPs.
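      Because an FT-MIR predicted trait is a weighted linear function of wavenumber absorbances, the effect of a SNP on the predicted trait is, to a first approximation, the same weighted sum of that SNP's effects on the individual wavenumbers. The sketch below illustrates this with hypothetical weights and effects (none of these numbers come from the study): a SNP that acts only on a wavenumber unrelated to the target component still produces a signal in the predicted trait if that wavenumber carries weight in the prediction equation.

```python
def predicted_trait_effect(weights, wavenumber_effects):
    """SNP effect on a linear prediction: the weighted sum of the SNP's
    effects on each wavenumber in the prediction equation."""
    if len(weights) != len(wavenumber_effects):
        raise ValueError("one effect is required per wavenumber weight")
    return sum(w * b for w, b in zip(weights, wavenumber_effects))


# Hypothetical prediction-equation weights for three wavenumbers.
weights = [0.8, -0.3, 0.5]

# Hypothetical SNP A acts on wavenumber 1 (assumed to track the target component).
snp_a_effects = [0.10, 0.00, 0.00]
# Hypothetical SNP B acts only on wavenumber 3 (assumed unrelated to the target).
snp_b_effects = [0.00, 0.00, 0.20]

effect_a = predicted_trait_effect(weights, snp_a_effects)
effect_b = predicted_trait_effect(weights, snp_b_effects)
print(f"SNP A effect on predicted trait: {effect_a:.3f}")
print(f"SNP B effect on predicted trait: {effect_b:.3f}")
```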
      Instances also arose where a QTL was observed for a directly measured trait, but no corresponding QTL was observed for the FT-MIR predicted trait. Examples of this include large association effects within the SCD gene for directly measured C10:1 and C14:1, but no corresponding association effects for individual FT-MIR predicted fatty acids (Figure 2). Similarly, a large association effect was observed within the LTF gene for directly measured Lf, but a corresponding association effect for FT-MIR predicted Lf was absent (Figure 6). In these examples, there was a consistent pattern: the component was present at relatively low concentrations in the milk sample, model prediction accuracies were relatively poor, and heritability estimates were lower for the FT-MIR predicted trait than for the directly measured trait (Table 1, Table 2). Although it might be argued that the failure to detect QTL in the SCD and LTF genes was because the calibration equations were inadequate for the task of quantifying the milk component targets (C10:1, C14:1, and Lf), it is also notable that in a previous GWAS we conducted on individual FT-MIR wavenumbers (
      • Tiplady K.M.
      • Lopdell T.J.
      • Reynolds E.
      • Sherlock R.G.
      • Keehan M.
      • Johnson T.J.J.
      • Pryce J.E.
      • Davis S.R.
      • Spelman R.J.
      • Harris B.L.
      • Garrick D.J.
      • Littlejohn M.D.
      Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle.
      ), no significant associations were identified between FT-MIR wavenumbers and variants within the SCD and LTF genes. Potentially, this means that changes in milk composition attributable to these 2 genes may be difficult to quantify directly using FT-MIR wavenumber spectra. For Lf to be detected using FT-MIR spectral data, it needs to provide a unique signal that distinguishes it from other whey proteins in solution that are at much higher concentrations. However, when the mean concentration of Lf is around 0.1 g/L and the major whey protein β-LG is at a 20- to 40-fold higher concentration, it is not surprising that a QTL is seen within the PAEP gene and not within the LTF gene.
      With the growing interest in using FT-MIR spectral data to predict molecules at low concentrations in milk, it is important to understand that the predictive performance of these models may be limited, compared with models for predicting major milk components such as total fat and protein concentrations (
      • Grelet C.
      • Dardenne P.
      • Soyeurt H.
      • Fernandez J.A.
      • Vanlierde A.
      • Stevens F.
      • Gengler N.
      • Dehareng F.
      Large-scale phenotyping in dairy sector using milk MIR spectra: Key factors affecting the quality of predictions.
      ). In the context of the present study, we have shown that for many fatty acids and protein traits, model prediction accuracies are moderate, but that genetic correlations between directly measured and FT-MIR predicted fatty acid and protein fractions are typically high. However, it is also clear that phenotypic variation between directly measured and FT-MIR predicted traits may be underpinned by differing genetic architecture. This may be due to several related factors including the trait of interest being at low concentrations in the milk sample, low prediction model accuracy, or that the trait is not easily detectable using FT-MIR spectroscopy. Improving calibration equations is central to optimizing our use of FT-MIR spectra to generate proxies for traits of interest to the industry such as fatty acids and protein fractions. Collaboration between research groups to generate data sets that include data from a range of herds that capture differences in climate, management systems, diet, and breed composition might improve calibration equations (
      • Grelet C.
      • Dardenne P.
      • Soyeurt H.
      • Fernandez J.A.
      • Vanlierde A.
      • Stevens F.
      • Gengler N.
      • Dehareng F.
      Large-scale phenotyping in dairy sector using milk MIR spectra: Key factors affecting the quality of predictions.
      ). However, a key barrier to consolidating FT-MIR spectral data sets from different research groups is variation in spectral measurements between instruments and within instruments across time. Standardization of individual FT-MIR spectra wavenumbers using reference samples can effectively address these sources of variation (
      • Grelet C.
      • Fernández Pierna J.A.
      • Dardenne P.
      • Baeten V.
      • Dehareng F.
      Standardization of milk mid-infrared spectra from a European dairy network.
      ;
      • Grelet C.
      • Pierna J.A.F.
      • Dardenne P.
      • Soyeurt H.
      • Vanlierde A.
      • Colinet F.
      • Bastin C.
      • Gengler N.
      • Baeten V.
      • Dehareng F.
      Standardization of milk mid-infrared spectrometers for the transfer and use of multiple models.
      ;
      • Tiplady K.M.
      • Sherlock R.G.
      • Littlejohn M.D.
      • Pryce J.E.
      • Davis S.R.
      • Garrick D.J.
      • Spelman R.J.
      • Harris B.L.
      Strategies for noise reduction and standardization of milk mid-infrared spectra from dairy cattle.
      ); however, outside the European OptiMIR network, reference sample sharing and standardization are not common practice. Other approaches, such as those offered by Foss or Bentley, are appealing in that they are not reliant on perishable milk samples. However, as far as we are aware, the effectiveness of these procedures has not been independently evaluated. Validation of these within-instrument standardization procedures is important because, if the procedures work well, they could facilitate the consolidation of spectral data from different networks and countries, assist with the development of better prediction equations, and improve trait prediction accuracies.

      Study Limitations

      In this study, we developed PLS prediction equations and compared the genetic characteristics of directly measured fatty acids and protein fractions to the same traits predicted from FT-MIR spectra. There are several areas of refinement that might improve prediction equations and the identification of QTL. First, before the development of prediction equations, we assessed several mathematical treatments of spectra, but we only assessed the prediction accuracies of those treatments using PLS models. Although PLS is a widely used method for developing calibration models from FT-MIR spectra, it may be possible to develop better prediction models for some traits by employing Bayesian or other machine learning approaches, as demonstrated in other studies of milk composition (
      • Bonfatti V.
      • Tiezzi F.
      • Miglior F.
      • Carnier P.
      Comparison of Bayesian regression models and partial least squares regression for the development of infrared prediction equations.
      ;
      • El Jabri M.
      • Sanchez M.-P.
      • Trossat P.
      • Laithier C.
      • Wolf V.
      • Grosperrin P.
      • Beuvier E.
      • Rolet-Répécaud O.
      • Gavoye S.
      • Gaüzère Y.
      • Belysheva O.
      • Notz E.
      • Boichard D.
      • Delacroix-Buchet A.
      Comparison of Bayesian and partial least squares regression methods for mid-infrared prediction of cheese-making properties in Montbéliarde cows.
      ;
      • Frizzarin M.
      • Gormley I.C.
      • Berry D.P.
      • Murphy T.B.
      • Casa A.
      • Lynch A.
      • McParland S.
      Predicting cow milk quality traits from routinely available milk spectra using statistical machine learning methods.
      ), or animal health and feed intake traits (
      • Dórea J.R.R.
      • Rosa G.J.M.
      • Weld K.A.
      • Armentano L.E.
      Mining data from milk infrared spectroscopy to improve feed intake predictions in lactating dairy cows.
      ;
      • Brand W.
      • Wells A.T.
      • Smith S.L.
      • Denholm S.J.
      • Wall E.
      • Coffey M.P.
      Predicting pregnancy status from mid-infrared spectroscopy in dairy cow milk using deep learning.
      ;
      • Contla Hernández B.
      • Lopez-Villalobos N.
      • Vignes M.
      Identifying health status in grazing dairy cows from milk mid-infrared spectroscopy by using machine learning methods.
      ). Second, it is expected that increasing the number of samples in the study and including data from different herds would also improve trait prediction accuracies, particularly for fatty acids and protein fractions at low concentrations in milk samples. Extending the study to include data from different herds would also facilitate a more robust validation strategy. Although the cow-independent validation approach we have used is commonly practiced in studies of FT-MIR spectra trait prediction, it has been shown that record- or cow-independent validation can inflate prediction accuracies, compared with herd-independent validation (
      • Dórea J.R.R.
      • Rosa G.J.M.
      • Weld K.A.
      • Armentano L.E.
      Mining data from milk infrared spectroscopy to improve feed intake predictions in lactating dairy cows.
      ;
      • Lahart B.
      • McParland S.
      • Kennedy E.
      • Boland T.M.
      • Condon T.
      • Williams M.
      • Galvin N.
      • McCarthy B.
      • Buckley F.
      Predicting the dry matter intake of grazing dairy cows using infrared reflectance spectroscopy analysis.
      ;
      • Luke T.D.W.
      • Rochfort S.
      • Wales W.J.
      • Bonfatti V.
      • Marett L.
      • Pryce J.E.
      Metabolic profiling of early-lactation dairy cows using milk mid-infrared spectra.
      ;
      • Wang Q.
      • Bovenhuis H.
      Validation strategy can result in an overoptimistic view of the ability of milk infrared spectra to predict methane emission of dairy cattle.
      ). Improving and validating the prediction equations we have developed in this study are important steps for future research to confirm their utility for prediction and use in future breeding programs.
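      The difference between cow-independent and herd-independent validation comes down to which grouping variable is kept intact when records are split into training and validation sets. The following is a minimal sketch of a group-wise holdout split; the record layout, the field names ("cow", "herd"), and the function name are hypothetical, and it is not the code used in this study.

```python
import random


def group_holdout_split(records, group_key, test_fraction=0.5, seed=0):
    """Split records so that every group (e.g., cow or herd) falls entirely
    in either the training set or the validation set."""
    groups = sorted({rec[group_key] for rec in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [rec for rec in records if rec[group_key] not in test_groups]
    test = [rec for rec in records if rec[group_key] in test_groups]
    return train, test


# Hypothetical records: two samples per cow, two cows per herd.
records = [
    {"cow": cow, "herd": herd, "sample": s}
    for cow, herd in [("c1", "H1"), ("c2", "H1"), ("c3", "H2"), ("c4", "H2")]
    for s in (1, 2)
]

# Cow-independent: no cow contributes records to both sets.
train_cow, test_cow = group_holdout_split(records, "cow")
# Herd-independent: no herd contributes records to both sets.
train_herd, test_herd = group_holdout_split(records, "herd")
```

      With a cow-independent split, cows from the same herd can still appear on both sides, so herd-level effects on the spectra (diet, management) are shared between training and validation data; a herd-independent split removes that overlap.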
      Other potential refinements for the present study relate to genomic information and the strategy for identifying QTL. Specifically, we have used data sets mapped to the UMD3.1 genome; however, it is expected that the improved sequence continuity and per-base accuracy of the ARS-UCD1.2 reference genome (
      • Rosen B.D.
      • Bickhart D.M.
      • Schnabel R.D.
      • Koren S.
      • Elsik C.G.
      • Tseng E.
      • Rowan T.N.
      • Low W.Y.
      • Zimin A.
      • Couldrey C.
      • Hall R.
      • Li W.
      • Rhie A.
      • Ghurye J.
      • McKay S.D.
      • Thibaud-Nissen F.
      • Hoffman J.
      • Murdoch B.M.
      • Snelling W.M.
      • McDaneld T.G.
      • Hammond J.A.
      • Schwartz J.C.
      • Nandolo W.
      • Hagen D.E.
      • Dreischer C.
      • Schultheiss S.J.
      • Schroeder S.G.
      • Phillippy A.M.
      • Cole J.B.
      • Van Tassell C.P.
      • Liu G.
      • Smith T.P.L.
      • Medrano J.F.
      De novo assembly of the cattle reference genome with single-molecule sequencing.
      ) may yield a few additional QTL and reveal additional candidate mutations given improvements in accompanying transcript annotations. Also, the approach we used to identify QTL could be extended to account for nonadditive QTL in a similar manner to that outlined in
      • Reynolds E.G.M.
      • Neeley C.
      • Lopdell T.J.
      • Keehan M.
      • Dittmer K.
      • Harland C.S.
      • Couldrey C.
      • Johnson T.J.J.
      • Tiplady K.
      • Worth G.
      • Walker M.
      • Davis S.R.
      • Sherlock R.G.
      • Carnie K.
      • Harris B.L.
      • Charlier C.
      • Georges M.
      • Spelman R.J.
      • Garrick D.J.
      • Littlejohn M.D.
      Non-additive association analysis using proxy phenotypes identifies novel cattle syndromes.
      . Finally, the approach we used to identify causative genes and variants only considered protein-altering variants as candidates. We acknowledge that this approach is relatively simple and crude, and that many of the identified signals could be underpinned by regulatory effects (e.g., gene expression-based mechanisms). It is expected that integration of other functional data sets such as mammary eQTL and ChIP-seq data sets could map additional molecular QTL and enhance fine mapping and candidate variant identification (
      • Tiplady K.M.
      • Lopdell T.J.
      • Littlejohn M.D.
      • Garrick D.J.
      The evolving role of Fourier-transform mid-infrared spectroscopy in genetic improvement of dairy cattle.
      ).

      CONCLUSIONS

      We developed PLS calibration equations to predict bovine fatty acids and protein fractions in milk samples, and compared the genetic architecture underlying directly measured traits to that of corresponding FT-MIR predicted traits. Low to moderate prediction accuracies were observed, indicating that the potential application of FT-MIR prediction equations for some traits may be limited. However, for most traits, heritability estimates were moderate to high, indicating that genetic variation exists that could potentially be exploited for the purposes of animal selection. Moreover, high genetic correlations between directly measured and FT-MIR predicted fatty acids and individual milk proteins indicated that selection based on FT-MIR predicted traits could provide high rates of genetic gain in the corresponding trait of interest. Trait QTL for fatty acids were identified with likely candidates in the DGAT1, CCDC57, SCD, and GPAT4 genes, but QTL underpinned by SCD were largely absent in FT-MIR predicted fatty acids. Similarly, likely candidates were identified for directly measured proteins in the CSN1S1, CSN3, PAEP, and LTF genes, but the QTL for CSN3 and LTF were absent in corresponding FT-MIR predicted traits. This highlighted that, in some instances, phenotypic variation for directly measured and FT-MIR predicted traits was underpinned by differing genetic architecture and segregation of alleles at QTL.

      ACKNOWLEDGMENTS

      The authors thank Livestock Improvement Corporation (LIC; Hamilton, New Zealand) farm and technical staff for collecting milk samples, and herd-testing staff for the processing and analysis of milk samples, as well as the staff at Fonterra Research and Development Centre (Palmerston North, New Zealand) for milk analyses. Kathryn also thanks the wider LIC R&D team and fellow students for underlying technical support and thoughtful discussion, and Tracey Monehan (R&D Programme Manager, LIC) for overseeing the funding for this work. We also gratefully acknowledge the use of New Zealand eScience Infrastructure (NeSI) high-performance computing for this research. This research was funded through BoviQuest, a joint venture between LIC and ViaLactia Biosciences Ltd., a subsidiary (now closed) of Fonterra Cooperative Ltd. (Auckland, New Zealand), LIC (Hamilton, New Zealand), and the New Zealand Ministry for Primary Industries, within the Resilient Dairy Programme through Sustainable Food & Fibre Futures (funding no: PGP06-17006). The authors have not stated any conflicts of interest.

      APPENDIX

      Figure A1. Frequency distribution of samples across DIM (n = 2,005).
      Figure A2. Frequency distributions of (a) untransformed lactoferrin concentrations and (b) lactoferrin concentrations after cube-root transformation (n = 1,936).
      Appendix Table A1. Goodness of fit (R²cv) of partial least squares calibration models for untreated and pretreated spectra based on cow-independent validation

      Columns other than Trait are spectral pretreatments².

      | Trait¹ | Untreated | First derivative | MSC | MSC + first | SNV | SNV + first |
      |---|---|---|---|---|---|---|
      | Individual fatty acid (g/100 g of total fat) | | | | | | |
      | C4:0 | 0.627 | 0.617 | 0.625 | 0.602 | 0.623 | 0.602 |
      | C6:0 | 0.534 | 0.544 | 0.533 | 0.548 | 0.534 | 0.542 |
      | C8:0 | 0.622 | 0.610 | 0.625 | 0.622 | 0.628 | 0.622 |
      | C10:0 | 0.627 | 0.622 | 0.642 | 0.627 | 0.641 | 0.627 |
      | C10:1 | 0.344 | 0.360 | 0.353 | 0.365 | 0.348 | 0.360 |
      | C12:0 | 0.590 | 0.587 | 0.594 | 0.590 | 0.596 | 0.590 |
      | C12:1 | 0.321 | 0.352 | 0.323 | 0.353 | 0.326 | 0.353 |
      | C14:0 | 0.494 | 0.492 | 0.498 | 0.499 | 0.501 | 0.491 |
      | C14:1 | 0.408 | 0.413 | 0.418 | 0.416 | 0.412 | 0.414 |
      | C16:0 | 0.600 | 0.573 | 0.603 | 0.578 | 0.612 | 0.574 |
      | C16:1 | 0.205 | 0.226 | 0.209 | 0.182 | 0.212 | 0.184 |
      | C18:0 | 0.466 | 0.452 | 0.446 | 0.447 | 0.475 | 0.445 |
      | C18:1 cis-7 | 0.403 | 0.408 | 0.431 | 0.409 | 0.444 | 0.411 |
      | C18:1 cis-9 | 0.562 | 0.553 | 0.554 | 0.565 | 0.554 | 0.569 |
      | C18:2 cis-9,trans-11 | 0.475 | 0.497 | 0.508 | 0.497 | 0.508 | 0.498 |
      | C18:2 cis-6 | 0.431 | 0.465 | 0.451 | 0.480 | 0.455 | 0.480 |
      | C18:3 cis-3 | 0.356 | 0.364 | 0.356 | 0.351 | 0.356 | 0.360 |
      | Grouped fatty acid (g/100 g of total fat) | | | | | | |
      | SFA | 0.587 | 0.590 | 0.601 | 0.595 | 0.598 | 0.591 |
      | PUFA | 0.449 | 0.471 | 0.482 | 0.470 | 0.477 | 0.490 |
      | UFA | 0.588 | 0.587 | 0.601 | 0.593 | 0.595 | 0.597 |
      | SCFA | 0.655 | 0.653 | 0.652 | 0.647 | 0.651 | 0.648 |
      | MCFA | 0.539 | 0.566 | 0.553 | 0.564 | 0.557 | 0.567 |
      | LCFA | 0.561 | 0.567 | 0.563 | 0.569 | 0.568 | 0.568 |
      | Individual milk protein (g/L of total volume) | | | | | | |
      | α-CN | 0.476 | 0.528 | 0.458 | 0.534 | 0.460 | 0.532 |
      | β-CN | 0.193 | 0.185 | 0.184 | 0.188 | 0.184 | 0.190 |
      | κ-CN | 0.467 | 0.486 | 0.449 | 0.471 | 0.452 | 0.476 |
      | α-LA | 0.324 | 0.307 | 0.324 | 0.306 | 0.322 | 0.306 |
      | β-LG | 0.660 | 0.686 | 0.667 | 0.675 | 0.661 | 0.678 |
      | Lf³ | 0.347 | 0.344 | 0.356 | 0.355 | 0.356 | 0.356 |
      | Mean R²cv | 0.472 | 0.479 | 0.477 | 0.479 | 0.479 | 0.480 |

      ¹ Trait definitions and units as described in Table 1.
      ² Untreated = untreated spectral data; first derivative = spectra pretreated with a first-order Savitzky-Golay derivative with a window of 7 data points either side; MSC = spectra pretreated with multiplicative scatter correction; MSC + first = spectra pretreated with MSC, followed by first-derivative transformation; SNV = spectra pretreated with a standard normal variate transformation; SNV + first = spectra pretreated with SNV followed by first-derivative transformation.
      ³ Cube-root transformation of lactoferrin (Lf).
      Appendix Table A2. Variance component estimates for directly measured and Fourier-transform mid-infrared (FT-MIR) predicted fatty acid and protein traits

      | Trait¹ | Directly measured σ²u² | Directly measured σ²p | Directly measured σ²e | Directly measured σ²T | FT-MIR predicted σ²u | FT-MIR predicted σ²p | FT-MIR predicted σ²e | FT-MIR predicted σ²T |
      |---|---|---|---|---|---|---|---|---|
      | Individual fatty acid (g/100 g of total fat) | | | | | | | | |
      | C4:0 | 0.022 (0.009) | 0.014 (0.007) | 0.033 (0.001) | 0.069 (0.004) | 0.014 (0.006) | 0.009 (0.005) | 0.018 (0.001) | 0.042 (0.002) |
      | C6:0 | 0.005 (0.003) | 0.004 (0.002) | 0.016 (0.001) | 0.025 (0.001) | 0.003 (0.001) | 0.002 (0.001) | 0.006 (2e-4) | 0.011 (0.001) |
      | C8:0 | 0.005 (0.002) | 0.003 (0.002) | 0.011 (4e-4) | 0.019 (0.001) | 0.003 (0.001) | 0.001 (0.001) | 0.005 (2e-4) | 0.009 (5e-4) |
      | C10:0 | 0.098 (0.039) | 0.032 (0.028) | 0.110 (0.004) | 0.241 (0.015) | 0.057 (0.020) | 0.008 (0.014) | 0.060 (0.002) | 0.125 (0.008) |
      | C10:1 | 0.001 (4e-4) | 0.001 (3e-4) | 0.001 (1e-4) | 0.003 (2e-4) | 0.0003 (1e-4) | 0.0002 (1e-4) | 0.001 (0.00003) | 0.001 (1e-4) |
      | C12:0 | 0.132 (0.055) | 0.064 (0.04) | 0.182 (0.007) | 0.378 (0.021) | 0.083 (0.031) | 0.020 (0.022) | 0.094 (0.004) | 0.197 (0.012) |
      | C12:1 | 2e-4 (1e-4) | 7e-5 (1e-4) | 5e-4 (2e-5) | 0.001 (3e-5) | 1e-4 (3e-5) | 4e-5 (2e-5) | 2e-4 (1e-5) | 3e-4 (1e-5) |
      | C14:0 | 0.342 (0.154) | 0.122 (0.111) | 0.532 (0.021) | 0.997 (0.058) | 0.161 (0.065) | 0.022 (0.046) | 0.266 (0.011) | 0.449 (0.025) |
      | C14:1 | 0.021 (0.007) | 0.006 (0.005) | 0.011 (4e-4) | 0.037 (0.003) | 0.003 (0.001) | 0.002 (0.001) | 0.007 (3e-4) | 0.012 (0.001) |
      | C16:0 | 2.187 (0.804) | 1.145 (0.579) | 2.451 (0.098) | 5.782 (0.327) | 1.214 (0.424) | 0.209 (0.302) | 1.700 (0.068) | 3.123 (0.169) |
      | C16:1 | 0.008 (0.004) | 0.012 (0.003) | 0.022 (0.001) | 0.043 (0.002) | 0.002 (0.001) | 0.003 (0.001) | 0.007 (3e-4) | 0.011 (5e-4) |
      | C18:0 | 0.176 (0.130) | 1.137 (0.135) | 1.400 (0.056) | 2.714 (0.108) | 0.149 (0.087) | 0.232 (0.070) | 0.653 (0.026) | 1.034 (0.044) |
      | C18:1 cis-7 | 0.125 (0.053) | 0.084 (0.039) | 0.202 (0.008) | 0.412 (0.022) | 0.063 (0.026) | 0.036 (0.019) | 0.095 (0.004) | 0.193 (0.011) |
      | C18:1 cis-9 | 0.881 (0.379) | 0.770 (0.288) | 2.335 (0.093) | 3.986 (0.180) | 0.551 (0.234) | 0.264 (0.172) | 1.140 (0.046) | 1.955 (0.097) |
      | C18:2 cis-9,trans-11 | 0.017 (0.007) | 0.012 (0.005) | 0.019 (0.001) | 0.048 (0.003) | 0.010 (0.004) | 0.004 (0.003) | 0.009 (4e-4) | 0.023 (0.002) |
      | C18:2 cis-6 | 0.004 (0.002) | 0.001 (0.001) | 0.007 (3e-4) | 0.013 (0.001) | 0.002 (0.001) | 0.001 (5e-4) | 0.003 (1e-4) | 0.006 (3e-4) |
      | C18:3 cis-3 | 0.004 (0.001) | 5e-4 (0.001) | 0.005 (2e-4) | 0.009 (5e-4) | 0.001 (3e-4) | 1e-4 (2e-4) | 0.001 (4e-5) | 0.002 (1e-4) |
      | Grouped fatty acid (g/100 g of total fat) | | | | | | | | |
      | SFA | 1.472 (0.626) | 1.541 (0.478) | 3.162 (0.126) | 6.175 (0.291) | 1.293 (0.530) | 0.646 (0.381) | 1.530 (0.062) | 3.469 (0.206) |
      | PUFA | 0.078 (0.029) | 0.026 (0.021) | 0.077 (0.003) | 0.181 (0.011) | 0.049 (0.018) | 0.017 (0.013) | 0.039 (0.002) | 0.105 (0.007) |
      | UFA | 1.468 (0.626) | 1.544 (0.478) | 3.156 (0.126) | 6.167 (0.291) | 1.299 (0.531) | 0.640 (0.382) | 1.535 (0.062) | 3.474 (0.206) |
      | SCFA | 0.037 (0.020) | 0.041 (0.015) | 0.117 (0.005) | 0.196 (0.009) | 0.026 (0.013) | 0.025 (0.010) | 0.050 (0.002) | 0.101 (0.005) |
      | MCFA | 1.293 (0.564) | 0.610 (0.410) | 2.302 (0.092) | 4.206 (0.223) | 0.797 (0.311) | 0.203 (0.222) | 1.158 (0.046) | 2.158 (0.121) |
      | LCFA | 0.852 (0.546) | 3.787 (0.549) | 7.060 (0.282) | 11.699 (0.445) | 0.813 (0.445) | 1.102 (0.355) | 3.386 (0.137) | 5.301 (0.223) |
      | Individual milk protein (g/L of total volume) | | | | | | | | |
      | α-CN | 0.579 (0.260) | 0.337 (0.191) | 1.112 (0.049) | 2.029 (0.108) | 0.559 (0.230) | 0.122 (0.162) | 0.428 (0.019) | 1.109 (0.082) |
      | β-CN | 0.421 (0.241) | 0.097 (0.193) | 2.586 (0.115) | 3.105 (0.126) | 0.204 (0.091) | 0.146 (0.066) | 0.187 (0.008) | 0.537 (0.035) |
      | κ-CN | 0.172 (0.067) | 0.007 (0.047) | 0.136 (0.006) | 0.315 (0.024) | 0.083 (0.031) | 0.027 (0.022) | 0.052 (0.002) | 0.162 (0.012) |
      | α-LA | 0.008 (0.003) | 0.002 (0.002) | 0.009 (4e-4) | 0.019 (0.001) | 0.002 (0.001) | 0.001 (5e-4) | 0.002 (9e-5) | 0.005 (3e-4) |
      | β-LG | 0.282 (0.103) | 0.076 (0.072) | 0.09 (0.004) | 0.448 (0.037) | 0.240 (0.084) | 0.034 (0.058) | 0.07 (0.003) | 0.343 (0.03) |
      | Lf³ | 0.007 (0.003) | 2e-4 (0.002) | 0.005 (2e-4) | 0.0122 (0.001) | 0.001 (4e-4) | 0.001 (0.0003) | 0.0018 (7e-5) | 0.003 (2e-4) |

      ¹ Trait definitions and units as described in Table 1. Standard errors shown in parentheses.
      ² σ²u = additive genetic variance; σ²p = permanent environment variance; σ²e = residual variance; σ²T = total variance (σ²u + σ²p + σ²e).
      ³ Cube-root transformation of lactoferrin (Lf).
      Appendix Table A3. Effect sizes and minor allele frequency details for fatty acid traits with a significant association effect

      | Chr¹ | Position | Tag variant ID | Minor allele frequency | Trait² | Trait type | Beta | SE | P-value |
      |---|---|---|---|---|---|---|---|---|
      | Individual fatty acid (g/100 g of total fat) | | | | | | | | |
      | 14 | 1756075 | rs208417762 | 0.311 | C18:1 cis-9 | Measured | 0.682 | 0.106 | 1.3e-10 |
      | 14 | 1799066 | rs385135066 | 0.238 | C16:0 | Measured | −1.039 | 0.146 | 1.2e-12 |
      | 17 | 52971731 | rs207997694 | 0.085 | C6:0 | Measured | 0.081 | 0.013 | 9.6e-10 |
      | 17 | 53034516 | rs461037541 | 0.083 | C4:0 | Measured | 0.208 | 0.024 | 7.2e-18 |
      | 19 | 51319673 | rs137270097 | 0.265 | C10:0 | Measured | 0.162 | 0.025 | 1.2e-10 |
      | 19 | 51319673 | rs137270097 | 0.263 | C12:0 | Measured | 0.239 | 0.033 | 8.3e-13 |
      | 19 | 51326050 | rs136424304 | 0.262 | C14:0 | Measured | 0.338 | 0.050 | 1.4e-11 |
      | 26 | 21141279 | rs41255696 | 0.476 | C10:0 | Measured | −0.145 | 0.023 | 2.2e-10 |
      | 26 | 21141279 | rs41255696 | 0.475 | C14:0 | Measured | −0.288 | 0.046 | 2.7e-10 |
      | 26 | 21148111 | rs41255688 | 0.493 | C10:1 | Measured | −0.037 | 0.003 | 1.8e-48 |
      | 26 | 21149680 | rs385285356 | 0.496 | C14:1 | Measured | −0.136 | 0.008 | 6.1e-61 |
      | 26 | 26458006 | rs445758306 | 0.318 | C10:1 | Measured | −0.017 | 0.003 | 2.6e-10 |
      | 26 | 26458006 | rs445758306 | 0.308 | C12:1 | Measured | −0.008 | 0.001 | 2.4e-10 |
      | 11 | 103301736 | rs41255687 | 0.420 | C12:1 | Predicted | −0.005 | 0.001 | 6.3e-11 |
      | 14 | 2502770 | rs137422574 | 0.414 | C18:3 cis-3 | Predicted | 0.016 | 0.002 | 1.0e-12 |
      | 14 | 2528807 | rs110275497 | 0.415 | C18:1 cis-9 | Predicted | 0.429 | 0.067 | 1.3e-10 |
      | 17 | 52971731 | rs207997694 | 0.085 | C6:0 | Predicted | 0.068 | 0.009 | 9.9e-16 |
      | 17 | 53034516 | rs461037541 | 0.083 | C4:0 | Predicted | 0.150 | 0.018 | 1.5e-17 |
      | 19 | 51314476 | rs41922143 | 0.262 | C10:0 | Predicted | 0.134 | 0.019 | 7.0e-13 |
      | 19 | 51314476 | rs41922143 | 0.260 | C12:0 | Predicted | 0.158 | 0.023 | 3.8e-12 |
      | 19 | 51314476 | rs41922143 | 0.264 | C14:0 | Predicted | 0.230 | 0.033 | 7.0e-12 |
      | 19 | 51326050 | rs136424304 | 0.261 | C8:0 | Predicted | 0.032 | 0.005 | 8.9e-10 |
      | 26 | 21174891 | rs209445650 | 0.452 | C14:1 | Predicted | 0.029 | 0.005 | 1.9e-09 |
      | 26 | 25584818 | rs210921941 | 0.485 | C10:1 | Predicted | −0.010 | 0.002 | 5.8e-10 |
      | 27 | 36200888 | rs110950972 | 0.455 | C18:3 cis-3 | Predicted | 0.017 | 0.002 | 9.9e-15 |
      | 27 | 36204679 | | 0.464 | C16:0 | Predicted | −0.485 | 0.080 | 1.6e-09 |
      | Grouped fatty acid (g/100 g of total fat) | | | | | | | | |
      | 17 | 53034516 | rs461037541 | 0.081 | SCFA | Measured | 0.304 | 0.039 | 1.2e-14 |
      | 19 | 36187954 | rs110980742 | 0.260 | SFA | Measured | −0.927 | 0.149 | 5.0e-10 |
      | 19 | 36187954 | rs110980742 | 0.259 | UFA | Measured | 0.933 | 0.150 | 4.8e-10 |
      | 19 | 51319673 | rs137270097 | 0.265 | MCFA | Measured | 0.791 | 0.107 | 1.4e-13 |
      | 26 | 21149680 | rs385285356 | 0.495 | SFA | Measured | 0.818 | 0.129 | 2.1e-10 |
      | 26 | 21149680 | rs385285356 | 0.495 | UFA | Measured | −0.832 | 0.129 | 1.1e-10 |
      | 14 | 2319003 | rs110182536 | 0.408 | UFA | Predicted | 0.593 | 0.097 | 8.1e-10 |
      | 17 | 53034516 | rs461037541 | 0.081 | SCFA | Predicted | 0.275 | 0.029 | 7.1e-22 |
      | 19 | 50919823 | rs380534925 | 0.171 | UFA | Predicted | −0.825 | 0.135 | 8.8e-10 |
      | 19 | 51314476 | rs41922143 | 0.262 | MCFA | Predicted | 0.544 | 0.076 | 9.2e-13 |
      | 26 | 21138011 | rs381655271 | 0.493 | UFA | Predicted | −0.628 | 0.099 | 2.6e-10 |

      ¹ Chr = chromosome.
      ² Trait definitions and units as described in Table 1.
      Appendix Table A4. Effect sizes and minor allele frequency details for protein traits with a significant association effect

      | Chr¹ | Position | Tag variant ID | Minor allele frequency | Trait² | Trait type | Beta | SE | P-value |
      |---|---|---|---|---|---|---|---|---|
      | 6 | 87133508 | rs109500363 | 0.329 | α-CN | Measured | 0.659 | 0.095 | 4.3e-12 |
      | 6 | 87405588 | rs110794953 | 0.450 | κ-CN | Measured | −0.412 | 0.038 | 6.4e-28 |
      | 11 | 103291134 | rs110270048 | 0.421 | β-LG | Measured | 0.838 | 0.036 | 8.7e-117 |
      | 11 | 103292575 | rs381050299 | 0.455 | α-CN | Measured | −0.540 | 0.087 | 5.6e-10 |
      | 22 | 53538882 | rs43765460 | 0.457 | Lf³ | Measured | −0.072 | 0.005 | 1.8e-41 |
      | 6 | 87085918 | | 0.361 | κ-CN | Predicted | 0.256 | 0.027 | 8.2e-21 |
      | 6 | 87133508 | rs109500363 | 0.329 | α-CN | Predicted | 0.461 | 0.071 | 7.0e-11 |
      | 11 | 103299272 | rs110563549 | 0.440 | β-CN | Predicted | −0.429 | 0.048 | 8.3e-19 |
      | 11 | 103299272 | rs110563549 | 0.420 | β-LG | Predicted | 0.728 | 0.032 | 5.4e-116 |
      | 14 | 1799066 | rs385135066 | 0.237 | α-CN | Predicted | −0.527 | 0.076 | 4.8e-12 |

      ¹ Chr = chromosome.
      ² Trait definitions and units as described in Table 1.
      ³ Cube-root transformation of lactoferrin (Lf).
