Using milk mid-infrared spectroscopy to estimate cow-level nitrogen efficiency metrics

Minimizing pollution from the dairy sector is paramount; one potential cause of such pollution is excess nitrogen. Nitrogen pollution contributes to a deterioration in water quality as well as an increase in both eutrophication and greenhouse gases. It is therefore essential to minimize the loss of nitrogen from the sector, including excretion from the cow. Breeding programs are one potential strategy to improve the efficiency with which nitrogen is used by dairy cows, but they rely on routine access to individual cow information on how efficiently each cow uses the nitrogen it ingests. A total of 3,497 test-day records for individual-cow nitrogen efficiency metrics along with milk yield and the associated milk spectra were used to investigate the ability of milk infrared spectral data to predict these nitrogen traits; both traditional partial least squares regression and neural networks were used in the prediction process. The data originated from 4 farms across 11 yr. The nitrogen traits investigated were nitrogen intake, nitrogen use efficiency, and nitrogen balance. Both nitrogen use efficiency and nitrogen balance were calculated considering nitrogen intake, nitrogen in milk, nitrogen in the conceptus, nitrogen used for growth, nitrogen stored in body reserves, and nitrogen mobilized from body reserves. Irrespective of the nitrogen-related trait being investigated, the best predictions from 4-fold cross validation were achieved using neural networks that considered both the morning and evening milk spectra along with milk yield, parity, and DIM in the prediction process. The coefficient of determination in the cross validation was 0.61, 0.74, and 0.58 for nitrogen intake, nitrogen use efficiency, and nitrogen balance, respectively. In a separate series of validation approaches, the calibration and validation were stratified by herd (n = 4) and separately by year. For these scenarios, partial least squares regression generated more accurate predictions compared with neural networks; the coefficient of determination was always lower than 0.29 and 0.60 when validation was stratified by herd and year, respectively. Therefore, if the variability of the data being predicted in the validation datasets is similar to that in the data used to develop the predictions, then nitrogen-related traits can be predicted with reasonable accuracy. In contrast, where the variability of the data that exists in the validation dataset is poorly represented in the calibration dataset, then poor predictions will ensue.


INTRODUCTION
Public interest in, and demand for, improved sustainability metrics of dairy cow production systems are intensifying globally. Excessive nitrogen excretion from dairy cows can contribute to pollution, as well as having environmental and human health repercussions (Galloway et al., 2003; WHORO, 2003); such losses also represent a monetary loss to dairy producers (de Freitas et al., 2019). Less than 25% of the ingested nitrogen in grazing dairy cows is used for their biological needs (Powell et al., 2010; Tavernier et al., 2023); the remainder (i.e., at least 75% of the ingested nitrogen) is excreted into the environment. The respective utilization rate for cows fed in confinement is often higher (i.e., 30%; Powell et al., 2010), but still relatively poor. Interanimal variability is known to exist in the efficiency with which dairy cows use ingested nitrogen (Zamani et al., 2011; Lopez-Villalobos et al., 2018; Tavernier et al., 2024), with 6% to 11% of this variability attributed to interanimal genetic variability (Tavernier et al., 2024). Hence, improvement in nitrogen utilization is possible. It has been demonstrated in grazing dairy cows that animals stratified on genetic merit for nitrogen efficiency metrics had phenotypically different efficiency metrics in line with expectation (Tavernier et al., 2024). Nonetheless, the relatively low heritability for nitrogen efficiency metrics in dairy cows implies that achieving high accuracy of selection for such traits requires routine access to large quantities of individual-cow phenotypic data; this barrier is exacerbated in grazing dairy systems where measuring individual-cow nitrogen intake is more challenging.
Two metrics generally used to reflect nitrogen utilization in dairy cows are (1) nitrogen use efficiency (NUE; Powell et al., 2010), represented as the nitrogen partitioned into milk divided by the ingested nitrogen, and (2) nitrogen balance (Nbal; Gourley et al., 2012), a proxy for the quantity of nitrogen excreted, represented by the ingested nitrogen minus the nitrogen partitioned into the milk produced by the cow. Tavernier et al. (2023) expanded this definition to also include other nitrogen sinks and sources such as animal growth, body tissue mobilization, and fetal growth. Quantifying the nitrogen in milk is possible based on the widely reported milk CP content along with the recorded milk yield for that test day. Measuring individual-animal nitrogen intake or nitrogen excreted is, however, labor intensive, especially in grazing systems, and is therefore often limited to cows in experimental farms. Consequently, there is an interest in developing proxies for nitrogen utilization.
Predicting NUE and Nbal in dairy cows fed indoors from the infrared analysis of milk samples has already been proposed as a possible option (Grelet et al., 2020; Shi et al., 2023). If successful, then predictions of nitrogen efficiency could be routinely generated from all milk-tested cows and possibly even bulk milk samples. However, the potential to predict nitrogen efficiency in grazing dairy cows from mid-infrared (MIR) spectral analysis of milk samples has not yet been explored. Furthermore, previous attempts to predict nitrogen efficiency in dairy cows from milk MIR spectra focused on predicting nitrogen efficiency directly (Grelet et al., 2020; Shi et al., 2023); however, because the nitrogen in milk is already available (with high accuracy; Luke et al., 2019), better predictions could possibly be achieved by predicting nitrogen intake and using the reported nitrogen in the milk to estimate nitrogen efficiency. An additional exploratory strategy, not previously investigated, could be to consider alternative prediction approaches, such as neural networks (NN), in predicting nitrogen efficiency from milk spectral analysis; one of the benefits of such an approach over the traditionally used partial least squares regression (PLSR) is the ability to capture complex nonlinear relationships between the dependent and independent variables. The objective of the present study was therefore to investigate whether nitrogen intake, NUE, and Nbal could be predicted from the MIR spectra of individual-cow milk samples where the cows were grazing outdoors. Having nitrogen utilization measures of individual cows on a large scale has multiple use cases, including (1) routinely providing efficiency metrics for individual cows and, by extension, herds; (2) tailoring feeding and management advice based on herd-level metrics; and (3) providing a source of individual-cow phenotypes to enable the generation of accurate estimates of genetic merit for individual cows for consideration in a breeding program. The cumulative effect could contribute to more economically, socially, and environmentally sustainable dairy production systems.

Data
All data used in this study were collected between the years 2008 and 2018 on 4 Teagasc experimental dairy farms in Ireland. Therefore, because no human or animal subjects were used, this study did not require approval by an Institutional Animal Care and Use Committee or Institutional Review Board. The characteristics of the different farms, such as geographical location, number of milking cows, breeds, research trials, and years when data were collected, are summarized in Table 1. Moreover, the data were already described in detail by Tavernier et al. (2023). The data originated from 2,241 lactations from 1,291 dairy cows of multiple breeds and crossbreds, namely Holstein-Friesian (896 cows), Jersey (61 cows), their cross (181 cows), and "others" (153 cows). Daily milk yield, weekly milk composition, bimonthly BW and BCS records, sporadic DMI measures (i.e., an average of 2 records per lactation), and their respective experimental conditions and grass quality measures (e.g., CP content) were available. Only test days from cows that had calved more than 5 d and less than 305 d previously were retained; only parities 1 to 10 were considered further.
Individual-cow milk samples were collected weekly during consecutive evening and morning milkings. All milk samples were analyzed using the same MIR spectrometer (Foss MilkoScan FT6000; Foss Electric A/S, Hillerød, Denmark), generating 1,060 transmittance spectral values. Each wavenumber value was transformed from transmittance to absorbance by taking the log10 of the reciprocal of the transmittance value. The high-noise-level regions of each spectrum were removed, resulting in data from just 531 wavenumbers as per Visentin et al. (2015). The Mahalanobis distance of each edited spectrum from the centroid was calculated using the first 5 principal components, which explained 99% of the spectral variability. Outlying spectra were identified as those with a Mahalanobis distance from the centroid greater than the 99th quantile of all the computed distances and were consequently removed (Frizzarin et al., 2021).
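The two spectral editing steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R), and the function names are hypothetical. The sketch assumes that the Mahalanobis distances (computed on the first 5 principal components in the study) are already available, and it uses one common interpolated definition of an empirical quantile, which may differ slightly from the one applied in the original analysis.

```python
import math


def transmittance_to_absorbance(spectrum):
    """Convert each transmittance value T to absorbance as log10(1 / T)."""
    return [math.log10(1.0 / t) for t in spectrum]


def flag_outliers(distances, quantile=0.99):
    """Flag spectra whose distance exceeds the empirical quantile of all distances.

    `distances` is assumed to hold one precomputed Mahalanobis distance
    per spectrum; the quantile is linearly interpolated between order
    statistics (an assumption, other definitions exist).
    """
    ordered = sorted(distances)
    pos = quantile * (len(ordered) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    threshold = ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
    return [d > threshold for d in distances]


# Hypothetical transmittance values and distances, for illustration only.
absorbance = transmittance_to_absorbance([1.0, 0.1, 0.01])
flags = flag_outliers([1.0] * 99 + [50.0])
```

Spectra flagged `True` would be the ones discarded before model development.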

Data Editing
Sources of nitrogen available to the cow considered in the present study were nitrogen intake (Nintake) and the nitrogen mobilized from reserves (Nmobilized). The sum of the Nintake and Nmobilized is referred to as the nitrogen available (Navail; Tavernier et al., 2023). All cows were fed predominantly grazed grass, with supplements mainly provided during periods of high energy demand or when the quality and quantity of grass was suboptimal (i.e., average supplements fed of 0.90 kg DM). Dry matter intake for grass and concentrates, along with the CP content of each, was quantified, on average, twice per lactation. Nitrogen intake was quantified as the sum of grass and concentrate DMI multiplied by the respective CP. Grass DMI was measured using the n-alkane technique (Mayes et al., 1986), as modified by Dillon and Stakelum (1989) for dairy cows. Briefly, twice daily across 12 d an artificial even-chain alkane (i.e., C32) was administered orally to the cows; between d 7 and d 12 after the start of the alkane administration, individual-cow feces were collected twice daily. The collected fecal samples were bulked per animal into a single sample. The n-alkane content of the feces was then quantified using GC. The sampling covered the entire lactation period, but most of the records were between 30 and 150 DIM and between 180 and 270 DIM.
Nitrogen mobilized corresponded to the nitrogen released when reserves were used by the dairy cows in negative energy balance; Nmobilized was quantified as energy balance multiplied by 6.25/33, where 33 is the grams of protein mobilized or fixed per unité fourragère lait of energy balance, as described by Tavernier et al. (2023). Daily energy balance was calculated as the energy intake adjusted for the effect of the amount of concentrate in the diet, less the energy used for lactation, maintenance, gestation, and growth (Tavernier et al., 2023).
The nitrogen sinks considered in the present study were the nitrogen output in milk (Nmilk), the nitrogen used for the conceptus (Nconceptus), the nitrogen used for growth (Ngrowth), and the nitrogen stored as body reserves (Nreserve); all calculations are described in detail by Tavernier et al. (2023) using the data used in the present study. The sum of all the sinks is referred to in the present study as the nitrogen output (Nout). Nitrogen output in milk was quantified as the sum of the true protein yield (true protein content times milk yield) converted to nitrogen equivalents (i.e., divided by 6.38; Jones, 1931) and the MUN. The Nconceptus (i.e., sum of the calf and the maternal tissues) was computed using the equation to predict calf birth weight as proposed by Agabriel and de la Torre (2018) divided by 0.58 (Martin and Sauvant, 2010) to take into account the total weight of the conceptus (i.e., including membranes and liquids) and multiplied by 6.25 (Jones, 1931) to convert to nitrogen. Cow growth was estimated from a model with a population-wide quadratic effect (i.e., fixed effect), along with an animal-specific intercept and linear coefficient (i.e., random effects), fitted to individual-cow BW records corrected for the conceptus weight. Therefore, the fitted model was:
$$y_{ij} = a + b\,\mathrm{day}_j + c\,\mathrm{day}_j^2 + a_i + b_i\,\mathrm{day}_j + e_{ij},$$
where $y_{ij}$ is the conceptus-free BW of cow $i$ on day $j$; $a$, $b$, and $c$ are the coefficients of the quadratic polynomial fitted to the entire population; $a_i$ and $b_i$ are the cow-specific intercept and linear coefficients; $\mathrm{day}_j$ is the $j$th day since first calving; and $e_{ij}$ is the error (Tavernier et al., 2023). The Ngrowth value was subsequently calculated per day as the weight gained that day from growth multiplied by 0.024 (Satter and Roffler, 1975; Jones, 1931), and Nreserve was only nonzero for periods of positive energy balance, with Nreserve being the energy balance at the respective test day multiplied by 6.25/33.
Two nitrogen utilization metrics were evaluated in the present study: total NUE and total Nbal. Both metrics were described by Tavernier et al. (2023) and both metrics considered all the nitrogen sources and sinks. Total NUE was quantified as:
$$\mathrm{Total\ NUE} = \frac{\mathrm{Nmilk} + \mathrm{Nconceptus} + \mathrm{Ngrowth} + \mathrm{Nreserve}}{\mathrm{Nintake} + \mathrm{Nmobilized}} = \frac{\mathrm{Nout}}{\mathrm{Navail}},$$
and total Nbal was quantified as:
$$\mathrm{Total\ Nbal} = (\mathrm{Nintake} + \mathrm{Nmobilized}) - (\mathrm{Nmilk} + \mathrm{Nconceptus} + \mathrm{Ngrowth} + \mathrm{Nreserve}) = \mathrm{Navail} - \mathrm{Nout}.$$
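As a worked illustration of how the sources and sinks combine, the following Python sketch computes Nmilk, Ngrowth, total NUE, and total Nbal from the definitions given in this section. The function names and the numeric inputs are hypothetical and chosen only for illustration; all quantities are assumed to be expressed in the same unit (e.g., g/d), and NUE is returned as a proportion (multiply by 100 for a percentage).

```python
def nitrogen_in_milk(true_protein_yield, mun_yield):
    """Nmilk: true protein yield divided by 6.38, plus milk urea nitrogen."""
    return true_protein_yield / 6.38 + mun_yield


def nitrogen_for_growth(daily_weight_gain):
    """Ngrowth: weight gained that day from growth multiplied by 0.024."""
    return daily_weight_gain * 0.024


def total_nue(n_intake, n_mobilized, n_milk, n_conceptus, n_growth, n_reserve):
    """Total NUE = Nout / Navail (sum of sinks over sum of sources)."""
    n_avail = n_intake + n_mobilized
    n_out = n_milk + n_conceptus + n_growth + n_reserve
    return n_out / n_avail


def total_nbal(n_intake, n_mobilized, n_milk, n_conceptus, n_growth, n_reserve):
    """Total Nbal = Navail - Nout (sum of sources minus sum of sinks)."""
    n_avail = n_intake + n_mobilized
    n_out = n_milk + n_conceptus + n_growth + n_reserve
    return n_avail - n_out


# Hypothetical test-day values (g N/d), not taken from the study data.
record = dict(n_intake=550.0, n_mobilized=10.0, n_milk=130.0,
              n_conceptus=2.0, n_growth=3.0, n_reserve=0.0)
nue = total_nue(**record)    # 135 / 560
nbal = total_nbal(**record)  # 560 - 135
```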

Development of the Prediction Equations and Validation
Although our interest was in predicting NUE and Nbal directly, an alternative strategy was also explored, which was to predict Nintake and use this in the (indirect) calculation of NUE and Nbal. For the indirect approach, the Nmobilized, Nconceptus, Ngrowth, and Nreserve per animal were assumed to be known. The NUE indirect was quantified as the sum of measured Nmilk, Nconceptus, Ngrowth, and Nreserve divided by the sum of MIR-predicted Nintake and measured Nmobilized. Similarly, Nbal indirect was quantified as the sum of MIR-predicted Nintake and measured Nmobilized, less the sum of measured Nmilk, Nconceptus, Ngrowth, and Nreserve. For the purpose of the present study, NUE direct and NUE indirect represent prediction of NUE directly or indirectly, respectively; the same approach was used for Nbal direct and Nbal indirect.
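The indirect calculation can be sketched in Python as follows; the function name and the numbers are hypothetical. Because Nbal is linear in Nintake, any error in the predicted Nintake carries over one-to-one into Nbal indirect, whereas it enters NUE indirect through the denominator.

```python
def indirect_metrics(predicted_n_intake, n_mobilized,
                     n_milk, n_conceptus, n_growth, n_reserve):
    """Return (NUE indirect, Nbal indirect) from a predicted Nintake.

    All other sources and sinks are treated as measured (known) values.
    """
    n_avail = predicted_n_intake + n_mobilized
    n_out = n_milk + n_conceptus + n_growth + n_reserve
    return n_out / n_avail, n_avail - n_out


# Hypothetical measured components (g N/d).
measured = dict(n_mobilized=10.0, n_milk=130.0,
                n_conceptus=2.0, n_growth=3.0, n_reserve=0.0)

# Metrics with the actual Nintake versus an over-predicted Nintake.
nue_actual, nbal_actual = indirect_metrics(550.0, **measured)
nue_pred, nbal_pred = indirect_metrics(600.0, **measured)

# The Nbal error equals the Nintake prediction error (here 50 g N/d);
# over-predicting Nintake lowers the resulting NUE.
nbal_error = nbal_pred - nbal_actual
```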
Two different prediction methods were evaluated for their ability to estimate the direct nitrogen-related traits: PLSR and NN. Different cow-level model features were considered in the prediction model to identify the best combination of features to predict each trait: (1) spectra only; (2) spectra and milk yield; (3) spectra and DIM; (4) spectra, milk yield, and DIM; (5) spectra, milk yield, and cow parity; and (6) spectra, milk yield, DIM, and parity. The spectra used for the analyses were either spectra collected during the morning milking, spectra collected during the evening milking, or both morning and evening spectra (i.e., 1,062 wavenumbers). All the analyses were conducted using the statistical software R version 3.6.1 (R Core Team, 2022).
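To make the feature combinations concrete, the short Python sketch below assembles one predictor vector per test day by concatenating the morning and evening spectra (531 wavenumbers each, giving 1,062 values) and appending whichever cow-level features are supplied. The function name and the example values are hypothetical; how the features were scaled or encoded in the original R analysis is not stated here, so no scaling is applied.

```python
def build_features(morning, evening, milk_yield=None, dim=None, parity=None):
    """Concatenate morning and evening spectra and append optional covariates."""
    features = list(morning) + list(evening)
    for covariate in (milk_yield, dim, parity):
        if covariate is not None:
            features.append(float(covariate))
    return features


# Hypothetical spectra with 531 absorbance values each.
morning_spectrum = [0.1] * 531
evening_spectrum = [0.2] * 531

spectra_only = build_features(morning_spectrum, evening_spectrum)
full_model = build_features(morning_spectrum, evening_spectrum,
                            milk_yield=25.0, dim=100, parity=2)
```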
For the development of the PLSR model, the R package pls (Mevik et al., 2019) was used. To identify the optimum number of latent factors, 10-fold cross validation was performed. Nonetheless, to prevent overfitting, different maximum numbers of latent factors were tested; the maximum number of latent factors was decided based on the root mean square error (RMSE) of prediction and varied according to the validation scenario. The NN equation was developed using the R package brnn (Perez Rodriguez and Gianola, 2020) and the default tuning parameters were used. These parameters correspond to 2 hidden layers and a Bayesian regularization applied to the input layer.
A 4-fold cross validation was undertaken to test the prediction accuracy of the different tested methods. The records were randomly assigned to different subsets; however, if records from a given cow were in the calibration dataset, then data from that cow could not appear in the validation dataset. In the 4-fold cross validation, the calibration dataset comprised 3 of the 4 subsets, with the fourth subset representing the validation dataset; this was iterated 4 times with a different subset each time. In a separate series of validation approaches, the calibration and validation were stratified by herd (n = 4) and separately by year (i.e., 2 datasets were created; the calibration dataset included data from 2008 to 2016 and the validation dataset spanned the years 2017 and 2018). When validation was performed by herd, the calibration dataset comprised data from 3 herds, with the fourth herd constituting the validation dataset. The process was repeated until all the herds were used for validation.
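One way to satisfy the constraint that a cow never appears in both the calibration and validation datasets is to assign cows, rather than individual records, to folds. The Python sketch below illustrates this under that assumption; the exact randomization used in the study is not described in more detail, and the function names and cow identifiers are hypothetical.

```python
import random


def grouped_folds(cow_ids, n_folds=4, seed=0):
    """Return a fold index per record so that all records of a cow share a fold."""
    cows = sorted(set(cow_ids))
    rng = random.Random(seed)
    rng.shuffle(cows)
    fold_of_cow = {cow: i % n_folds for i, cow in enumerate(cows)}
    return [fold_of_cow[cow] for cow in cow_ids]


def split_indices(folds, validation_fold):
    """Return (calibration, validation) record indices for one iteration."""
    calibration = [i for i, f in enumerate(folds) if f != validation_fold]
    validation = [i for i, f in enumerate(folds) if f == validation_fold]
    return calibration, validation


# Hypothetical cow identifiers, one per test-day record.
cow_ids = ["a", "a", "b", "c", "c", "d", "e", "e"]
folds = grouped_folds(cow_ids, n_folds=4, seed=1)
for k in range(4):
    calibration, validation = split_indices(folds, k)
    # A model would be fitted on `calibration` and evaluated on `validation`.
```

The herd-stratified and year-stratified validations follow the same pattern, with the farm or the calendar year taking the place of the fold index.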

Measures of Prediction Performance
The mean bias, the RMSE of prediction, the coefficient of determination (R²), the linear regression coefficient of actual values for each N trait on the prediction of the respective N trait (slope), and the ratio of performance to interquartile distance (RPIQ) were the accuracy metrics used to assess prediction performance. The bias is the average of the residuals; the RMSE corresponds to the standard deviation of the residuals; the RPIQ is the ratio of the interquartile range of a variable relative to the root mean square error of prediction. When 4-fold cross validation was performed, multiple test datasets were used to validate the prediction equations. Therefore, the SD for all these performance metrics across test datasets was also calculated; this SD was used to assess the robustness of the prediction methods. To test for differences in the variance of the residuals between prediction methods, a paired F-test was used.
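The performance metrics can be computed as in the following Python sketch. Several details are assumptions rather than statements from the source: residuals are taken as predicted minus actual, the RMSE is computed as the root of the mean squared residual, R² is taken as the squared Pearson correlation between actual and predicted values, and the interquartile range uses the "inclusive" quantile method of the Python standard library.

```python
import math
import statistics


def prediction_metrics(actual, predicted):
    """Return bias, RMSE, R2, slope (actual on predicted), and RPIQ."""
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_aa = sum((a - mean_a) ** 2 for a in actual)

    slope = s_ap / s_pp                # regression of actual on predicted
    r2 = (s_ap * s_ap) / (s_pp * s_aa)  # squared Pearson correlation

    q1, _, q3 = statistics.quantiles(actual, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse
    return {"bias": bias, "rmse": rmse, "r2": r2, "slope": slope, "rpiq": rpiq}


# Hypothetical values: every prediction is exactly 1 unit too high.
metrics = prediction_metrics([1.0, 2.0, 3.0, 4.0, 5.0],
                             [2.0, 3.0, 4.0, 5.0, 6.0])
```

In this toy example the predictions rank the records perfectly (R² and slope of 1) yet carry a constant bias, which is why bias and RMSE are reported alongside R².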

RESULTS
The mean Nintake, NUE, and Nbal in the edited dataset was 557.

Direct Prediction of Traits
Under 4-fold cross validation, predictions of all 3 nitrogen-related traits developed using PLSR had a greater (P < 0.05) RMSE (i.e., were worse) compared with those developed using NN; the decrease in RMSE when using NN instead of PLSR ranged from 4.7% to 14.4%, from 6.4% to 8.9%, and from 6.2% to 9.5% when Nintake, NUE direct, and Nbal direct were predicted, respectively. The results from PLSR are therefore presented in Supplemental Table S1 (see Notes). Prediction performance using NN when the different combinations of spectra and model features were used is in Table 2. Irrespective of the trait predicted, the combination of morning and evening spectra often produced the lowest RMSE (P < 0.05) compared with using either just morning or just evening spectra; the exceptions were when Nintake was predicted using evening spectra combined with milk yield, DIM, and parity, and when Nbal direct was predicted using the evening spectra combined with (1) milk yield and parity, or (2) milk yield, DIM, and parity. Models using the evening spectra always produced a lower RMSE (P < 0.05) when predicting Nintake or Nbal direct compared with models using the morning spectra. In contrast, no difference was evident between predictions of NUE direct using either the morning or evening spectra. Irrespective of the spectra used (i.e., morning, evening, or morning and evening), relative to a model that considered just the spectral data, including milk yield as a feature in the prediction model reduced (P < 0.05) the RMSE (i.e., better prediction) by 18.3% to 23.5% when Nintake was being predicted, by 17.9% to 21.3% when NUE direct was being predicted, and by 8.8% to 11.1% when Nbal direct was being predicted. Including just DIM along with the spectra did not reduce the RMSE compared with using just the spectra. Furthermore, compared with a model using just the spectra and milk yield, also including DIM, or parity, or both DIM and parity did not reduce the RMSE (P > 0.05).
Irrespective of the spectra used and the model features in the prediction model, the R² for the 3 nitrogen-related traits in 4-fold cross validation was always greater than 0.33, 0.57, and 0.39 for Nintake, NUE direct, and Nbal direct, respectively. The strongest R² for Nintake, NUE direct, and Nbal direct was 0.61, 0.74, and 0.58, respectively, and was achieved for all traits when using both the morning and evening spectra along with milk yield, parity, and DIM. Irrespective of the predicted trait, the best model always produced an RPIQ greater than 2.
The mean lactation profile for actual and predicted Nintake, NUE, and Nbal is in Figure 2; the predicted values were generated using both morning and evening spectra along with milk yield, parity, and DIM. Nitrogen intake and Nbal had similar profiles across lactation with different peaks and troughs, and NUE decreased consistently as the lactation progressed. The lactation profiles of the predicted traits closely followed the lactation profiles of the respective trait using the actual measured data. When lactation was divided into 5 stages, each 60 d in length, the R² ranged from 0.52 to 0.67, from 0.40 to 0.72, and from 0.40 to 0.64, for Nintake, NUE direct, and Nbal direct, respectively.

Indirect Prediction of Traits
Nitrogen use efficiency and Nbal were indirectly predicted by using the Nintake predicted using both morning and evening spectra along with milk yield, parity, and DIM, which was summed with actual Nmobilized to generate Navail. The RMSE of NUE indirect was 14.6% greater (i.e., worse) compared with NUE direct; similarly, the RMSE of Nbal indirect was 8.5% greater when compared with Nbal direct. The correlation between actual NUE and NUE indirect was equal to 0.82 (i.e., R² of 0.67), and the correlation between actual Nbal and Nbal indirect was 0.71 (i.e., R² of 0.50). Moreover, the correlation between NUE direct and NUE indirect was 0.92, and the correlation between Nbal direct and Nbal indirect was 0.93. The inclusion of Nout as a prediction variable along with the spectral data did not improve the prediction performance for any of the investigated traits, whether quantified directly or indirectly.

Validation by Farm and Year
Prediction performance across farms and years using PLSR and NN with both morning and evening spectra along with milk yield in the prediction model is in Table 3 and Table 4, respectively. Different numbers of maximum allowable latent factors were tested when developing the PLSR models; the lowest RMSE of validation was achieved using a maximum of 15 latent factors when across-farm validation was performed, and using a maximum of 20 latent factors when across-year validation was performed. The model using both morning and evening spectra along with milk yield as model features produced the lowest RMSE of validation (which was not different from the model also including DIM and parity) and is therefore the model focused on.
When the predictions were validated across farms, irrespective of the prediction model used and of the predicted trait, the RPIQ was always lowest for farm 4. Predictions of Nintake, NUE direct, and Nbal direct using PLSR were more accurate (P < 0.05) compared with NN (i.e., average RMSE of validation across farms of 126.07 g N/d and 136.10 g N/d for PLSR and NN when predicting Nintake, of 4.92 and 5.44 for PLSR and NN when predicting NUE direct, and of 123.25 g N/d and 133.19 g N/d for PLSR and NN when predicting Nbal direct). The RMSE of validation ranged from 105.04 g N/d to 164.53 g N/d (i.e., farm 2 and farm 1, respectively) when predicting Nintake, from 4.17 to 6.43 (i.e., farm 2 and farm 1, respectively) when predicting NUE direct, and from 100.57 g N/d to 161.69 g N/d (i.e., farm 2 and farm 1, respectively) when predicting Nbal direct. The slope of the linear regression of actual values on predicted values was always lower than 0.67, 0.80, and 0.55 for Nintake, NUE direct, and Nbal direct, respectively; furthermore, the R² was always lower than 0.28, 0.29, and 0.28 for Nintake, NUE direct, and Nbal direct, respectively. Irrespective of the prediction method and of the trait predicted, the achieved RPIQ was always lower than 2.
When validation was based on year, irrespective of the prediction method (i.e., PLSR or NN), the greatest achieved R² was 0.38, 0.59, and 0.27 for Nintake, NUE direct, and Nbal direct, respectively. For all the investigated traits, PLSR always outperformed (P < 0.05) NN. Moreover, when using PLSR, the RPIQ of all the investigated traits was >1.70.

DISCUSSION
The main consequences of nitrogen pollution are the deterioration in water quality (i.e., nondrinkable water with possible carcinogenic effects), eutrophication, and an increase in greenhouse gases (i.e., nitrous oxide; Sutton et al., 2021). For example, 31% of Irish rivers between the years 2019 and 2021 had a less than satisfactory nitrate content, mainly due to agricultural nutrient loss (EPA, 2021). Therefore, reducing the nitrogen excreted by dairy cows or improving their nitrogen utilization could help reduce nitrogen pollution. Strategies proposed to reduce the nitrogen load from dairy production systems include more optimized feeding and manure management; genetic improvement in NUE is also known to be possible (Zamani et al., 2011; Lopez-Villalobos et al., 2018; Tavernier et al., 2024). Successful breeding programs, however, are predicated on having nitrogen utilization records on individual animals or at least some proxy thereof. Generating individual-cow measures for nitrogen efficiency metrics is particularly hindered by the challenges associated with measuring (nitrogen) intake on an individual-animal basis; this is exacerbated in grazing dairy cows. The efficacy of using the MIR spectrum from individual-cow milk samples to predict phenotypes not routinely available has already been documented in dairy production systems for cow methane emissions (Dehareng et al., 2012; Shadpour et al., 2022; McParland et al., 2024), cow energy status (McParland et al., 2014), and cow DMI (Shetty et al., 2017; Lahart et al., 2019). The objective of the present study was to evaluate the potential to expand the suite of traits that could be predicted from this MIR spectrum to also include nitrogen efficiency metrics.

Comparison with Other Studies
The ability of milk MIR to predict nitrogen traits and, more generally, any trait is largely affected by the actual accuracy of the gold-standard values. Moreover, when the gold-standard values are themselves predicted values and are therefore subject to uncertainty, perfect predictions from the MIR equations are not expected. For example, in the present study, the actual values for Nintake, NUE, and Nbal are likely to contain errors because their accuracy was conditional on the precision of the estimation of DMI. Dry matter intake in the grazing cows used in the present study was quantified using the n-alkane technique; the accuracy of this technique has been reported by Wright et al. (2019) in the Irish pasture-based system. Wright et al. (2019) reported that the n-alkane technique can be considered an accurate technique to quantify DMI, with a reported measurement error SD of 1.0 to 1.3 kg of DM/cow per day.
Previous studies in dairy cows (Grelet et al., 2020; Shi et al., 2023) investigated the ability of milk MIR to predict NUE, defined as just Nmilk divided by Nintake, as well as N losses (similar to Nbal in the present study) defined as Nintake minus Nmilk. Grelet et al. (2020) and Shi et al. (2023) developed prediction equations for indoor-fed cows in early lactation and late lactation, respectively. Grelet et al. (2020) also attempted to predict Nintake in early lactation dairy cows from milk MIR. To develop their prediction equations, Grelet et al. (2020) used a total of 1,119 records from 129 Holstein-Friesian cows collected in the UK, Denmark, and Ireland, and Shi et al. (2023) used a total of 706 records from 86 lactating Chinese dairy cows. Cows producing in an indoor-confinement production system are known to be more nitrogen efficient (i.e., NUE between 25% and 30%; Powell et al., 2010) compared with cows producing in a grass-based system (i.e., NUE between 20% and 25%; Powell et al., 2010; Tavernier et al., 2023). Indeed, nitrogen intake tends to be more aligned with energy intake in confinement production systems than it is in outdoor grazing production systems. Despite the difference in production systems and even the difference in the definition of NUE in the present study relative to other studies, prediction accuracy for NUE in the present study based on cross validation was similar to the accuracy reported by both Grelet et al. (2020) and Shi et al. (2023), who also predicted NUE in dairy cows. The R² for NUE was 0.74 and 0.66 for Grelet et al. (2020) and Shi et al. (2023), respectively, compared with a value of 0.74 in the present study; both Grelet et al. (2020) and Shi et al. (2023) used cross validation. Shi et al. (2023) performed an external validation of predicting NUE from cows in different years and with different diets compared with the records in the calibration dataset. The R² for NUE was still good, ranging from 0.58 to 0.63, which is similar to the R² of 0.59 achieved in the present study when validated by year. Grelet et al. (2020) also performed external validation where data from cows fed a specific diet were removed from the calibration dataset and then used in validation; they reported an R² for NUE ranging from 0.14 to 0.68, with the worst prediction achieved when validated in the diet group with spectra differing the most from the spectra used to calibrate the model. In the present study, similar to what was reported by Grelet et al. (2020), when NN was used for the prediction, farm 4 and farm 1 were the 2 farms where all 3 investigated nitrogen traits were the most poorly predicted (i.e., lowest RPIQ) but were also the farms where the cows were, on average, the most efficient (i.e., lowest Nintake and Nbal and the greatest NUE) and the least efficient (i.e., lowest NUE and the greatest Nbal), respectively. When PLSR was used, farm 4 was still the farm with the lowest achieved RPIQ, but farm 1 performed similarly to farm 2 and farm 3 when Nintake and Nbal direct were predicted.
Therefore, a further series of analyses was performed to evaluate the effect on prediction performance in the leave-one-farm-out analysis of incrementally including data in the calibration dataset that capture (some of) the variability in the validation dataset. For the first iteration, 143 records were randomly selected from a given farm and these were considered to be the validation data for the subsequent analyses. All remaining data of the cows contributing to these records were discarded so as not to bias the calibration. The remaining data from the other 3 farms were then initially considered as the calibration data. The prediction model developed using this calibration dataset was then applied to the validation dataset of 143 records. While retaining the same 143 records as validation, a random 20% of the remaining records from that herd were then included in the calibration dataset and the predictions rerun and validated on the same 143 records. The process was repeated with 20% extra records per herd progressively included in the calibration dataset while the 143 records in the validation dataset remained the same throughout. This whole process was repeated 4 times, where the validation data of 143 records were from a different herd each time. The correlation between the actual and the predicted NUE direct values is shown in Figure 3 separately for each validation farm. Irrespective of validation farm, the greatest improvement in the performance was achieved once 20% of the data from the farm used for validation were included in the calibration dataset. Moreover, the improvement in prediction performance for all the farms followed a logarithmic pattern.
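The sampling scheme of this incremental analysis can be sketched in Python as below. This is an illustrative reconstruction rather than the authors' code: the function name and the record layout (record identifier, cow identifier, farm) are hypothetical, and the sketch assumes that each 20% increment is cumulative, that is, records added at one step remain in the calibration dataset at the next step.

```python
import random


def incremental_calibration_sets(records, validation_farm,
                                 n_validation=143, n_steps=5, seed=0):
    """Build a fixed validation set and progressively enlarged calibration sets.

    `records` is a list of (record_id, cow_id, farm) tuples. Returns the
    validation records and a list of calibration sets containing 0%, 20%,
    ..., 100% of the eligible records from the validation farm.
    """
    rng = random.Random(seed)
    farm_records = [r for r in records if r[2] == validation_farm]
    validation = rng.sample(farm_records, n_validation)

    # Discard all other records of cows that contribute to the validation set.
    validation_cows = {r[1] for r in validation}
    pool = [r for r in farm_records if r[1] not in validation_cows]
    rng.shuffle(pool)

    other_farms = [r for r in records if r[2] != validation_farm]
    calibration_sets = []
    for step in range(n_steps + 1):
        n_take = len(pool) * step // n_steps
        calibration_sets.append(other_farms + pool[:n_take])
    return validation, calibration_sets
```

A model would then be refitted on each calibration set and evaluated on the same validation records, which isolates the effect of adding within-farm information.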
Results from the present study indicate that concatenating morning and evening spectra, instead of using just morning or just evening spectra separately, produced the most accurate results with both PLSR and NN. The approach of averaging the morning and evening spectra was also investigated (results not shown); although the prediction accuracy achieved was better than that of predictions based on just morning or just evening spectra, it did not surpass the accuracy of predictions from the concatenated morning and evening spectra. Nonetheless, there is a transition away from the collection and analysis of separate morning and evening samples during herd testing (McParland et al., 2019) and hence it may not be possible to use both samples in the prediction process.
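The difference between the two ways of combining the milkings can be illustrated with a short sketch. It assumes that each spectrum is stored as one row of a records-by-wavenumbers array and that averaging is a plain unweighted mean; the function name `combine_spectra` is hypothetical.

```python
import numpy as np


def combine_spectra(morning, evening, how="concatenate"):
    """Combine morning and evening spectra (rows = test-day records).

    "concatenate" keeps both milkings side by side (twice as many features);
    "average" takes the unweighted mean per wavenumber (assumed here).
    """
    morning = np.asarray(morning, dtype=float)
    evening = np.asarray(evening, dtype=float)
    if morning.shape != evening.shape:
        raise ValueError("morning and evening spectra must have the same shape")
    if how == "concatenate":
        return np.hstack([morning, evening])
    if how == "average":
        return (morning + evening) / 2.0
    raise ValueError(f"unknown method: {how!r}")
```

Concatenation lets the model weight each wavenumber differently for the two milkings, whereas averaging forces a single weight per wavenumber.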

Predicting Nitrogen Intake to Compute Nitrogen Utilization Metrics
A hypothesis was tested in the present study that, because Nout in milk is already routinely available for many producers, it may be more sensible to predict nitrogen intake in the validation population and use the actual Nout values of these validation animals to transform this to nitrogen efficiency. In the present study it was also assumed that live-weight records were available to calculate both the underlying growth rate (of the cow and fetus) and body tissue mobilization rate. Despite this, the predictions of NUE direct and Nbal direct outperformed (P < 0.05) those obtained using the indirect approach. Nonetheless, when in a second set of analyses NUE was quantified as just Nmilk divided by Nintake, the prediction of NUE indirect and the prediction of NUE direct produced similar results (i.e., RMSE of validation of 22.05 and 21.82 for NUE direct and NUE indirect, respectively; P > 0.05). Therefore, in particular when NUE and Nbal were quantified including all the nitrogen sources and nitrogen sinks, results demonstrate the benefit of predicting NUE and Nbal directly from the spectra.
When validation by farm was performed, predicting NUE direct and Nbal direct also produced a lower RMSE of validation (i.e., better predictions) compared with the indirect approach. Also when Nbal was validated by year, the direct prediction from the spectra plus milk yield produced a lower RMSE of validation than indirect prediction. Nonetheless, when NUE was validated by year, the RMSE of validation decreased from 9.28 for NUE indirect to 4.49 for NUE direct.
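The simplified indirect route described in this section, in which NUE is quantified as just Nmilk divided by Nintake, can be sketched as below. The sketch assumes that NUE is expressed as a percentage and it deliberately omits the other nitrogen sources and sinks (conceptus, growth, body reserves) that the full definition includes; the function name `nue_indirect` is hypothetical.

```python
import numpy as np


def nue_indirect(n_milk_actual, n_intake_predicted):
    """Indirect NUE (%) from measured milk N and predicted N intake.

    Simplified definition only: NUE = 100 * Nmilk / Nintake. The percentage
    scale is an assumption; other N sinks and sources are ignored.
    """
    n_milk = np.asarray(n_milk_actual, dtype=float)
    n_intake = np.asarray(n_intake_predicted, dtype=float)
    if np.any(n_intake <= 0):
        raise ValueError("predicted nitrogen intake must be positive")
    return 100.0 * n_milk / n_intake


# Illustrative values only (g N/d), not data from the study.
example = nue_indirect([120.0, 130.0], [600.0, 520.0])
```

In the direct approach, by contrast, the model is trained on NUE itself, so errors in predicted intake are not propagated through a ratio.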

CONCLUSIONS
Nitrogen intake, NUE, and Nbal are traits that can be used in decision making to help reduce the environmental impact of dairy farms, as well as to improve farm efficiency. Their quantification, in particular in pasture-based production systems, can be expensive; therefore, estimating them using less-expensive solutions could be advantageous. In the present study, Nintake, NUE direct, and Nbal direct could be accurately predicted, achieving an RPIQ always greater than 2 in cross validation. When predictions were validated by year, all 3 investigated traits were predicted with reasonable accuracy. Nonetheless, when the prediction model was validated by farm, prediction accuracy decreased, particularly for the farms that were extreme for nitrogen efficiency.

NOTES
This research was funded by a Science Foundation Ireland (Dublin, Ireland) Starting Investigator Research Grant, "Infrared spectroscopy analysis of milk as a low cost solution to identify efficient and profitable dairy cows" (18/SIRG/5562), and the grant 16/RC/3835 (VistaMilk, Fermoy, Co. Cork, Ireland). Supplemental material for this article is available at http://hdl.handle.net/11019/3397. All data used in this study were from a pre-existing database. Therefore, because no human or animal subjects were used, this study did not require approval by an Institutional Animal Care and Use Committee or Institutional Review Board. The authors have not stated any conflicts of interest.

RPIQ = ratio of performance to interquartile distance.

Frizzarin et al.: MIR ESTIMATION OF NITROGEN EFFICIENCY

29 g N/d (SD = 113.32 g N/d), 22.29 (SD = 4.82), and 440.45 g N/d (SD = 99.91 g N/d), respectively. A boxplot of the Nintake, NUE, and Nbal values per farm is in Figure 1. The mean Nintake per farm ranged from 497.36 g N/d to 601.21 g N/d; the mean NUE ranged between 19.03 and 26.97, and the mean Nbal ranged between 369.47 g N/d and 481.32 g N/d. The Nintake, NUE, and Nbal for the years 2008 to 2016 ranged from 265.31 to 990.72 g N/d, from 10.01 to 44.63, and from 184.27 to 832.64 g N/d, respectively; for the data collected in 2017 and 2018, Nintake, NUE, and Nbal ranged from 302.57 to 919.23 g N/d, from 8.69 to 34.97, and from 220.85 to 741.73 g N/d, respectively.

Figure 1. Boxplot of Nintake (g/d), NUE, and Nbal (g/d). The box represents the spread of the middle 50% of the data, with the horizontal line in the box representing the median; the whiskers extending from the edges of the box represent the minimum and maximum data points that are within 1.5 times the interquartile range of the respective quartile, with outliers represented as dots.

Figure 3. Correlation between actual and predicted NUE across farms when a random subset of 143 records for each farm was separately used as the validation dataset. A progressively greater number of records from the farm used in the validation were added to the calibration dataset, which included all the records from the other farms; this increased from 20% of the remaining data from the farm to 100%.

Table 1. Description of the research farms, all located in southwest Ireland, used in the study

Table 2. Prediction performance in cross validation for predicting Nintake (g of N/d), NUE, and Nbal (g of N/d) with NN using either morning, evening, or a combination of morning and evening spectra jointly with other model features such as milk yield (MY), DIM, and parity (par)
a-g Different letters indicate differences (P < 0.05) across rows within variable for each trait separately.
¹RMSEcv = root mean square error in the cross-validation dataset.

Table 3. Validation across different farms: prediction performance for predicting Nintake (g of N/d), NUE, and Nbal (g of N/d) with PLSR and NN using morning and evening spectra plus milk yield¹
¹n = number of records; RMSEV = root mean square error in validation dataset.

Table 4. Validation across years: prediction performance for predicting Nintake (g of N/d), NUE, and Nbal (g of N/d) with PLSR and NN using morning and evening spectra and milk yield when 2,852 data points spanning the years 2008 to 2016 were used in the calibration and 645 data points from the years 2017 and 2018 were used for validation¹