Ability of three dairy feed programs to predict post-rumen outflows of nitrogenous compounds in dairy cows: A meta-analysis

Adequate prediction of post-rumen outflow of protein fractions is the starting point for the determination of metabolizable protein supply in dairy cows. The objective of this meta-analysis was to compare the performance of 3 dairy feed programs [National

to predict outflows (g/d) of NAN, microbial N (MiN), and nonammonia nonmicrobial N (NANMN). Predictions of rumen degradabilities (% of nutrient) of protein (RDP), NDF, and starch were also evaluated. The data set included 1,294 treatment means from 312 digesta flow studies. The 3 feed programs were compared using the concordance correlation coefficient (CCC), the ratio of root mean square prediction error (RMSPE) to standard deviation of observed values (RSR), and the slope between observed and predicted values. Mean and linear biases were deemed biologically relevant and are discussed if higher than a threshold of 5% of the mean of observed values. The comparisons were done on observed values adjusted or not for the study effect; the adjustment had a small effect on the mean bias, but after adjustment the linear bias reflected a response to a dietary change rather than absolute predictions.
For the absolute predictions of NAN and MiN, CNCPS had the best fit statistics (8% greater CCC; 6% lower RMSPE) without any bias; NRC and NASEM under-predicted NAN and MiN, and NASEM had an additional linear bias indicating that the under-prediction of MiN increased at increased predictions. For NANMN, fit statistics were similar among the 3 feed programs with no mean bias; however, the linear bias with NRC and CNCPS indicated under-prediction at low predictions and over-prediction at elevated predictions. On average, the CCC were smaller and RSR ratios were greater for MiN vs. NAN, indicating increased prediction errors for MiN. For NAN responses to a dietary change, CNCPS also had the best predictions, although the mean bias with NASEM was not biologically relevant and the 3 feed programs did not present a linear bias. However, CNCPS, but not the 2 other feed programs, presented a linear bias for MiN, with responses being over-predicted at increased predictions. For NANMN, responses were over-predicted at increased predictions for the 3 feed programs, but to a lesser extent with NASEM. The site of sampling had an effect on the mean bias of MiN and NANMN in the 3 feed programs. The mean bias of MiN was higher in omasal than duodenal studies in the 3 feed programs (from 55 to 61 g/d) and this mean bias was twice as large when 15N labeling was used as a microbial marker compared with purines. Such a difference was not observed for duodenal studies. The reasons underlying these systematic differences are not clear because the type of measurements used in the current meta-analysis does not make it possible to determine whether one site or one microbial marker yields the "true" post-rumen N outflows.
Rumen degradability of protein (RDP) was under-predicted with CNCPS, and RDP responses to a dietary change were under-predicted by the 3 feed programs with increased RDP predictions. Rumen degradability of NDF was under-predicted and had poor fit statistics for NASEM compared with CNCPS. Fit statistics were similar between CNCPS and NASEM for rumen degradability of starch, but with an under-prediction of the response with NASEM and absolute values being over-predicted with CNCPS.
Multivariate regression analyses showed that diet characteristics were correlated with prediction errors of N outflows in each feed program. Globally, compared with NAN and NANMN, residuals of MiN were correlated with several moderators in the 3 feed programs

INTRODUCTION
Numerous dairy cow feed programs have been developed and are regularly being upgraded to incorporate recent knowledge. The 8th revision of the Nutrient Requirements of Dairy Cattle from the National Academies of Sciences, Engineering, and Medicine (NASEM) was published in December 2021, replacing its previous version from the National Research Council (NRC, 2001). The revised NASEM edition provides updates on energy, protein, AA and mineral bioavailability, as well as an improved prediction of DMI. In addition to NRC and NASEM, the Cornell Net Carbohydrate and Protein System (CNCPS) is another American feed program, initially described in a series of papers in 1992 and 1993 (Fox et al., 1992; Russell et al., 1992; Sniffen et al., 1992; O'Connor et al., 1993), with several updated versions released over the last 20 years, the latest being from Van Amburgh et al. (2015a).
Predictions of MP supply, and ultimately the digestible flow of EAA, are the core of the protein formulation of rations and are based on the estimation of the different post-rumen outflows (hereafter termed outflows) of digesta N compounds [ammonia, microbial (MiN), undegradable intake, and endogenous; the latter 2 referred to as nonammonia nonmicrobial N (NANMN)]. Ammonia is of little benefit to the animal and is present in small proportion; therefore, it is subtracted from total N, and NAN is what is mostly reported in the publications (Titgemeyer, 1997). A large body of literature has been published on N outflows in dairy cows since 1970, and reviewed in meta-analyses (e.g., Broderick et al., 2010; Pacheco et al., 2012; Ipharraguerre and Clark, 2014; Van Amburgh et al., 2015b; White et al., 2017b; Hanigan et al., 2021). Collectively, these meta-analyses identified mean and linear biases for some N outflows that could be related to the technical challenges associated with the markers used to determine the flow of digesta (Firkins et al., 1998; Hristov et al., 2019) and its microbial component (Dehority, 1995; Titgemeyer, 1997).
Predictions of N outflows with NASEM have not yet been evaluated against exhaustive data from the literature or compared with those of NRC and CNCPS. Therefore, the main objective of the current study was to compare the precision and accuracy of NRC, NASEM, and CNCPS to predict N outflows. Our hypothesis was that the 2 most recent feed programs would perform better than the older one. Another objective was to examine the influence of categorical (e.g., sampling site and microbial marker) and continuous moderators (e.g., animal and dietary characteristics) on the residual errors of predictions.

Data Collection
A comprehensive search of the literature was conducted to identify digesta flow studies reporting DM, OM, NAN, MiN, NANMN, NDF, ADF, starch, ether extract (EE), and fatty acids in dairy cows. The literature search included 2 search engines (Scopus database and Google Scholar), and was conducted between December 2019 and November 2022. The key words used in the search query are reported in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart (Supplemental Figure S1). References listed in every article recovered, including reviews and meta-analyses, were screened for additional articles on the subject. If data from the same study were reported in multiple or companion papers, the study received a single entry in the data set (no duplicate allowed). The data were collated in an Excel workbook (Microsoft Corp.) which included 1,842 treatment means from 421 studies. For the current study, the preliminary data set included 313 studies (1,337 treatment means) which reported DMI, CP intake, feed ingredients and their inclusion rates, and outflows of NAN, MiN, or NANMN. All studies included in the preliminary data set are referenced in Supplemental Table S1.

Observed Values
Outflows of Nitrogen Fractions. Observed values were as reported in the publications or, if missing and when possible, they were calculated from other variables reported in the publications. Outflows of NAN and MiN are measurements calculated from laboratory analyses conducted on digesta samples, whereas NANMN is not a direct measurement but is calculated by difference between NAN and MiN. Results from N assays conducted on wet or freeze-dried digesta samples may contain ammonia. Therefore, in studies reporting total N and not NAN (e.g., Bernard et al., 2004), NAN was calculated as total N × 0.959 based on the relationship between duodenal N and NAN (Marini, 2008); this adjustment was needed for less than 9% of the data.
In studies using other methods to measure MiN (e.g., near-infrared reflectance spectroscopy, Lebzien and Paul, 1997), MiN and NANMN outflows were discarded from the data set. If MiN outflow was expressed as microbial net synthesis of AA N (g/d; e.g., Jensen et al., 2006), then the amount was divided by 0.85, assuming that 15% of microbial CP is in nucleic acid form (Fox et al., 2004). The MiN outflows could also be calculated indirectly from the efficiency of microbial synthesis as reported in the publication; for example: MiN (g/d) = OM truly digested in the rumen (kg/d) × (g MiN/kg OM truly digested in the rumen). The indirect calculation of MiN outflows was used in less than 8% of the data. Finally, if endogenous N was subtracted from an N outflow (e.g., Hvelplund and Madsen, 1985), it was added back into the outflow; therefore, all observed NAN and NANMN outflows included endogenous N.
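The conversions described for observed outflows can be sketched as follows. This is a minimal Python illustration (the study itself worked in R and Excel); the function and constant names are hypothetical, and only the factors 0.959 and 0.85 come from the text.

```python
# Factors quoted in the text; all names below are illustrative assumptions.
NAN_FRACTION_OF_TOTAL_N = 0.959       # NAN = total N x 0.959 (Marini, 2008)
AA_N_FRACTION_OF_MICROBIAL_N = 0.85   # 15% of microbial CP as nucleic acids


def nan_from_total_n(total_n_g_d: float) -> float:
    """Estimate NAN outflow (g/d) when only total N outflow is reported."""
    return total_n_g_d * NAN_FRACTION_OF_TOTAL_N


def min_from_microbial_aa_n(aa_n_g_d: float) -> float:
    """Convert microbial AA N (g/d) to total microbial N (g/d)."""
    return aa_n_g_d / AA_N_FRACTION_OF_MICROBIAL_N


def min_from_efficiency(omtdr_kg_d: float, g_min_per_kg_omtdr: float) -> float:
    """Indirect MiN (g/d) from OM truly digested in the rumen and efficiency."""
    return omtdr_kg_d * g_min_per_kg_omtdr


def nanmn_by_difference(nan_g_d: float, min_g_d: float) -> float:
    """NANMN (g/d) is not measured directly; it is NAN minus MiN."""
    return nan_g_d - min_g_d
```

For example, a reported total N outflow of 500 g/d would be entered as 479.5 g/d of NAN under this adjustment.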
Rumen Degradabilities. Rumen degradability of protein (RDP, % of CP) was calculated de novo in all studies that reported outflows of NAN and MiN as follows [adapted from Lebzien and Voigt (1999) and Pappritz et al. (2011)]: $\mathrm{RDP} = \frac{\text{N intake} - (\mathrm{NAN} - \mathrm{MiN} - \text{endogenous N})}{\text{N intake}} \times 100$, where N intake, NAN, MiN, and endogenous N are expressed in g/d. In studies sampling digesta from an abomasal or duodenal cannula, endogenous N flow (g/d) was assumed to be: 15.4 + 1.21 × DMI kg/d (Lapierre et al., 2016; NASEM, 2021).
Although not yet well quantified, Broderick et al. (2010) acknowledged that the endogenous protein flow would be smaller in omasal vs. duodenal digesta samples, citing Ørskov et al. (1986) and Hart and Leibholz (1990). In the current study, endogenous N flow (g/d) was assumed to be 0.63 × (15.4 + 1.21 × DMI kg/d) in studies sampling digesta from the reticulum or the omasum, based on Egan et al. (1986) and Reynolds et al. (2004) (refer to Supplemental File I for more details). If not reported in the publication, rumen degradabilities of NDF (Rum_dcNDF, % of NDF) and starch (Rum_dcSt, % of starch) were calculated as: (intake − outflow)/intake × 100, all values expressed in kg/d.
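The degradability calculations above can be illustrated with a short Python sketch. The endogenous N expressions and the (intake − outflow)/intake formula are taken from the text; the RDP expression is an assumption that treats NAN − MiN − endogenous N as undegraded feed N, and all function names are hypothetical.

```python
def endogenous_n(dmi_kg_d: float, site: str) -> float:
    """Assumed endogenous N flow (g/d) at the digesta sampling site."""
    duodenal_flow = 15.4 + 1.21 * dmi_kg_d
    if site == "duodenal":   # duodenum or abomasum
        return duodenal_flow
    if site == "omasal":     # omasum or reticulum
        return 0.63 * duodenal_flow
    raise ValueError(f"unknown sampling site: {site}")


def rdp_percent(n_intake: float, nan: float, min_n: float, endo_n: float) -> float:
    """RDP (% of CP); assumes undegraded feed N = NAN - MiN - endogenous N."""
    undegraded_feed_n = nan - min_n - endo_n
    return (n_intake - undegraded_feed_n) / n_intake * 100.0


def rumen_degradability(intake_kg_d: float, outflow_kg_d: float) -> float:
    """Rumen degradability (% of nutrient) of NDF or starch."""
    return (intake_kg_d - outflow_kg_d) / intake_kg_d * 100.0
```

With a DMI of 20 kg/d, the assumed endogenous N flow is 39.6 g/d for a duodenal study and about 24.9 g/d for an omasal study.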

Diet Characterization
The nutrient composition of each diet was calculated using NRC (2001), NASEM (2021), and CNCPS (v6.5.5). The NRC was rebuilt in R (version 4.1.3) based on the equations detailed in the book (NRC, 2001), but updated to remove coding errors in the passage rate equation of dry forages present in the original program according to Seo et al. (2006). The NASEM R codes of the software were used to compute predictions; however, equation 20-53 in NASEM (2021) was updated to replace Dt_St by Dt_StIn, referring to starch intake and not dietary starch concentration. A spreadsheet version of the CNCPS, as published in v6.5.5, was used to compute predictions at Cornell University (Van Amburgh; personal access).
If reported in the publication, the nutrient composition of feed ingredients was used to create the diet, otherwise data were populated from the feed table of each feed program to closely match their description.Therefore, unless the chemical composition of all feed ingredients was reported, diet characteristics differed between feed programs due to small discrepancies between feed ingredient composition; for example, default CP concentrations of corn gluten meal are, respectively, 65.0, 68.5, and 65.5% of DM in NRC, NASEM, and CNCPS.
Data Filtering
Dietary CP intake (kg/d) was the amount reported during digesta sampling or, if missing, computed from DMI and dietary CP concentration. When these CP intakes were compared with CP intakes computed by each feed program based on diet and feed ingredient composition, differences associated with ingredient misspecification were observed, especially when the composition of feed ingredients was missing in the publications. The ratio of reported to program-computed CP intake (RatioCPIn) averaged 1.01 ± 0.034 (range = 0.89 to 1.12) in 163 diets in which the composition of all feed ingredients was reported. In the other diets, the RatioCPIn was on average similar but more variable: 1.00 ± 0.080 (range = 0.68 to 1.52; n = 1,174 diets). Errors in predictions of N outflow fractions are expected to be greater as RatioCPIn deviates from unity. To minimize errors related to RatioCPIn, one option was to make adjustments to the composition of all feed ingredients in the ration to ensure that dietary nutrient concentrations reported in the publication match those computed by the feed program (Hanigan et al., 2013). This method was used in the development of NASEM (White et al., 2017a,b), and is well described in Li et al. (2018). One major drawback of this approach is that it would affect the CP concentration of feed ingredients for which CP is either reported in the publication or known to be fixed (e.g., urea).
In the current study, we opted to remove diets with outlying RatioCPIn using an approach adapted from Yoder et al. (2014) because the core of this study was to assess predictions by each feed program, and not to develop new prediction equations. The search for RatioCPIn outliers was done on an individual feed program basis, starting with all diets with incomplete composition of feed ingredients. Diets with a RatioCPIn deviating from the mean by more than 3.0 × standard deviation (SD) in absolute value were considered outliers and removed; the procedure was repeated until all diets fell within the selected range of RatioCPIn. Overall, 43 out of 1,337 diets (3.2%) had outlying RatioCPIn and were removed from the preliminary data set. As a result, the final data set contained 1,294 diets (312 studies) with a RatioCPIn averaging 1.00 ± 0.060 (range = 0.81 to 1.19).
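The iterative screening of RatioCPIn can be sketched in Python as below. This is a simplified illustration, not the authors' R procedure: it handles a single list of ratios (the study screened each feed program separately), uses the sample SD, and the function name is hypothetical.

```python
from statistics import mean, stdev


def remove_ratio_outliers(ratios, k=3.0):
    """Repeatedly drop ratios more than k SD from the mean until none remain."""
    kept = list(ratios)
    removed = []
    while len(kept) > 2:
        center, spread = mean(kept), stdev(kept)  # recomputed at every pass
        outliers = [r for r in kept if abs(r - center) > k * spread]
        if not outliers:
            break
        removed.extend(outliers)
        kept = [r for r in kept if abs(r - center) <= k * spread]
    return kept, removed


# Illustrative values only: 20 well-specified diets and 1 misspecified diet.
kept, removed = remove_ratio_outliers([0.98, 1.02] * 10 + [1.52])
```

Because the mean and SD are recomputed after each pass, a diet that looks acceptable in the first pass can still be removed later once more extreme values no longer inflate the SD.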

Predictions of Outflows by Feed Programs
The prediction of the different N fractions was based on the diet characterization described above for each feed program. In NRC and NASEM, the NAN outflows are predicted as the summation of MiN and NANMN, where NANMN is N from RUP plus endogenous N; the latter being 1.9 × DMI kg/d and 15.4 + 1.21 × DMI kg/d in NRC and NASEM, respectively. In NRC, MiN is related to intakes of discounted TDN and capped at 85% of RDP intakes; in NASEM, it is associated with intakes of rumen-degraded carbohydrates (starch plus NDF) and proteins (RDP being capped at 12% of DM). An asymptotic integrated form of the Michaelis-Menten equation and a Bayesian approach are used to predict MiN in NASEM (2021; equations 20-74 and 20-75). Ammonia outflows are not predicted in NRC and NASEM; however, predicted NANMN can be compared with observed values (NASEM, 2021). In CNCPS, NAN outflows are also predicted as the summation of MiN and NANMN, but NANMN is N from RUP only (no ammonia or RUP fraction A1; Van Amburgh et al., 2015a). Indeed, in CNCPS v6.5.5, it is considered that the NAN outflow is exclusively constituted of RUP and MiN whereas endogenous N flow is considered nonexistent. However, that will change in the next CNCPS version. Predictions of MiN are the yield of fiber and non-fiber carbohydrate fermenting bacteria, calculated from the respective rumen-degraded carbohydrate fractions, using the Michaelis-Menten equation, applying bacterial maintenance and growth yield coefficients, and multiplying bacteria yield by 0.625 (constant for CP content of bacteria) / 6.25 (Fox et al., 2004; Table 15).
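The way each program assembles predicted NAN from its components can be summarized in a few lines of Python. This sketch covers only the summation and the endogenous N terms quoted above; the MiN and RUP sub-models are not reproduced, and the function names are hypothetical.

```python
def predicted_endogenous_n(dmi_kg_d: float, program: str) -> float:
    """Endogenous N (g/d) included in predicted NANMN by each feed program."""
    if program == "NRC":
        return 1.9 * dmi_kg_d
    if program == "NASEM":
        return 15.4 + 1.21 * dmi_kg_d
    if program == "CNCPS":   # v6.5.5 assumes no endogenous N flow
        return 0.0
    raise ValueError(f"unknown feed program: {program}")


def predicted_nan(min_n_g_d: float, rup_n_g_d: float,
                  dmi_kg_d: float, program: str) -> float:
    """Predicted NAN (g/d) = MiN + NANMN, with NANMN = RUP N + endogenous N."""
    nanmn = rup_n_g_d + predicted_endogenous_n(dmi_kg_d, program)
    return min_n_g_d + nanmn
```

For a cow eating 20 kg/d of DM, the endogenous term alone is 38.0 g/d in NRC, 39.6 g/d in NASEM, and 0 g/d in CNCPS, so identical MiN and RUP N predictions would still give different NAN predictions.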

Descriptive and Fit Statistics
The overall agreement between the observed and predicted values was computed using the Hackman (2019) model evaluation function available from the National Animal Nutrition Program website. The R codes used in the current study are available in Supplemental Material File II. The descriptive statistics included the number of treatment means; the average, SD, minimum and maximum of predicted and observed values.
The fit statistics included several parameters to evaluate the quality of predictions. The concordance correlation coefficient (CCC; Lin, 1989) was extracted from the epi.ccc function of the epiR package (version 2.0.52) in R. Residual errors were defined as: $e_i = O_i - P_i$, where $O_i$ = ith observed value and $P_i$ = ith corresponding predicted value. The mean square prediction error (MSPE) was calculated as: $\mathrm{MSPE} = \frac{1}{n}\sum_{i}(O_i - P_i)^2$, where n = number of experimental units per treatment. We used the term MSPE to reflect that the evaluation of predictions was done against independent data, although part of the studies from the data set were used to develop prediction equations by each feed program. The square root of MSPE (RMSPE) and the relative prediction error (RPE; RMSPE as a % of the average of observed values) were used to assess the model fit of each feed program. Because RMSPE and RPE give no indication of consistency throughout the range of predictions, MSPE was decomposed into error in central tendency (ECT), error due to deviation of the regression slope from unity (ER), and error due to disturbances (ED; not reported) (Theil, 1966; Bibby and Toutenburg, 1978). The ECT was calculated as: $\mathrm{ECT} = (\bar{P} - \bar{O})^2$, where $\bar{P}$ and $\bar{O}$ are the averages of predicted and observed values. ED was the average of the sum of squares error extracted from the regression of residual errors on predicted values. Because the 3 components sum to MSPE, ER corresponds to MSPE − ECT − ED. The ECT and ER are expressed as a percentage of MSPE in the Tables. In addition, the ratio of RMSPE to SD of observed values (RSR) was computed to take into account the variability of data as it incorporates the benefits of error index statistics and includes a scaling/normalization factor (the smaller, the better the model performance; Moriasi et al., 2007). The mean bias was the average of residual errors and its P-value was obtained by conducting a Student's t-test on residual errors using the t.test function in R. The coefficient and standard error (SE) of the slope were determined by regressing the observed values on the predictions using the lm function in R.
The probability of the slope being different from unity was assessed by regressing residual errors on predicted values. The mean biases and the slopes were compared among feed programs using a pairwise comparison approach (Tukey's test) performed using the emmeans and emtrends functions from the package emmeans v.1.8.2. Supplement II and III describe, respectively, the R codes and the pairwise comparison approach to evaluate models.
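The fit statistics described in this section can be reproduced on a small scale with the Python sketch below. It is an unweighted illustration rather than the authors' R code: residuals are taken as observed minus predicted, all variances use divide-by-n moments (an assumption; packaged functions may use n − 1), and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_statistics(observed, predicted):
    """Unweighted fit statistics for paired observed (O) and predicted (P)."""
    n = len(observed)
    o_bar, p_bar = mean(observed), mean(predicted)
    residuals = [o - p for o, p in zip(observed, predicted)]  # O - P
    mean_bias = mean(residuals)

    # Divide-by-n moments (assumption stated in the lead-in).
    var_o = sum((o - o_bar) ** 2 for o in observed) / n
    var_p = sum((p - p_bar) ** 2 for p in predicted) / n
    cov_op = sum((o - o_bar) * (p - p_bar)
                 for o, p in zip(observed, predicted)) / n

    mspe = sum(r ** 2 for r in residuals) / n
    rmspe = sqrt(mspe)

    # Regression of residuals on mean-centered predictions.
    slope_resid = (cov_op - var_p) / var_p
    errors = [r - mean_bias - slope_resid * (p - p_bar)
              for r, p in zip(residuals, predicted)]

    return {
        "ccc": 2 * cov_op / (var_o + var_p + (o_bar - p_bar) ** 2),
        "mspe": mspe,
        "rmspe": rmspe,
        "rpe_pct": 100 * rmspe / o_bar,       # RMSPE as % of observed mean
        "rsr": rmspe / sqrt(var_o),           # RMSPE / SD of observed values
        "mean_bias": mean_bias,
        "slope_obs_on_pred": cov_op / var_p,  # equals 1 + slope_resid
        "ect": mean_bias ** 2,                # error in central tendency
        "er": slope_resid ** 2 * var_p,       # error due to regression slope
        "ed": sum(e ** 2 for e in errors) / n,  # error due to disturbances
    }


stats = fit_statistics([10, 20, 30, 40], [12, 18, 33, 37])
```

In this formulation ECT, ER, and ED add up to MSPE, which is a convenient check when expressing ECT and ER as percentages of MSPE.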

Linear Biases
Residual errors were regressed on predicted values centered around their mean; as a result, the slope and intercept estimates in the regression were orthogonal and independent of each other (St-Pierre, 2003). The intercept estimate corresponds to the mean bias computed above and the slope is used to compute the linear bias. St-Pierre (2003) calculated the linear biases at the minimum and maximum predicted values, and judged the maximum absolute linear bias relative to the size of the model SE or in comparison to the 95% CI of data in the literature. This approach has merits but also some drawbacks: (1) the slope can be close to zero but the linear bias can be elevated if the intercept is large; (2) some extreme predicted values unlikely to be observed in the field can create large linear biases; and (3) the magnitude of the linear bias can be smaller than SE (or acceptable) but associated with important differences in milk yield. In St-Pierre (2003), NAN was judged acceptable below 83.7 g/d, which translated into approximately 8 kg of milk yield per day [83.7 × 6.25 × 0.85 (AA N/N) × 0.85 (intestinal digestibility) × 0.67 (MP efficiency) / (32 g true protein/kg milk)]. Therefore, in the current study, (1) biases were computed at Q1 and Q3 of predicted values as follows: intercept + [slope × (quantile − average of predicted values)] to exclude extreme values; (2) the absolute difference was calculated between Q1 and Q3 biases to remove the mean bias from the calculation; and (3) this difference is termed the "linear bias" hereafter, and is judged relative to the average of observed values to determine its biological relevance.

Biological Relevance of the Mean and Linear Biases
Through an exhaustive literature search, the data set included 1,294 treatment means, twice as many as reported in Roman-Garcia et al. (2016) and Hanigan et al. (2021). With such a large number of data, caution should be taken when statistical significance is achieved to ensure that the biases are meaningful and biologically relevant. For the presentation of results and discussion purposes, and to discriminate between feed programs, the mean and linear biases will be considered not biologically relevant if they remain within a threshold of 5% of the average of observed values (Pacheco et al., 2012).
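The mean bias, the Q1 to Q3 linear bias, and the 5% relevance threshold can be sketched together in Python. This is an unweighted illustration with hypothetical function names; the quartiles use linear interpolation between order statistics, which is assumed (not confirmed by the text) to match the quantile definition used in the study. The second function reproduces the milk yield arithmetic quoted from St-Pierre (2003).

```python
from statistics import mean, quantiles


def bias_summary(observed, predicted, threshold=0.05):
    """Mean bias, Q1-Q3 linear bias, and their relevance vs. a 5% threshold."""
    p_bar = mean(predicted)
    residuals = [o - p for o, p in zip(observed, predicted)]  # O - P
    centered = [p - p_bar for p in predicted]

    # Centered regression: the intercept is the mean bias.
    intercept = mean(residuals)
    slope = (sum(c * r for c, r in zip(centered, residuals))
             / sum(c * c for c in centered))

    q1, _median, q3 = quantiles(predicted, n=4, method="inclusive")
    bias_q1 = intercept + slope * (q1 - p_bar)
    bias_q3 = intercept + slope * (q3 - p_bar)
    linear_bias = abs(bias_q3 - bias_q1)  # the mean bias cancels out

    cutoff = threshold * mean(observed)
    return {
        "mean_bias": intercept,
        "linear_bias": linear_bias,
        "mean_bias_relevant": abs(intercept) > cutoff,
        "linear_bias_relevant": linear_bias > cutoff,
    }


def nan_to_milk_kg(nan_g_d: float) -> float:
    """Milk yield (kg/d) equivalent of a NAN amount, using the quoted factors."""
    return nan_g_d * 6.25 * 0.85 * 0.85 * 0.67 / 32.0
```

Using Q1 and Q3 rather than the minimum and maximum keeps a handful of extreme predictions from dominating the linear bias, and taking the difference between the two quartile biases removes the intercept from the result.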

Observed Values Adjusted for the Study Effect
The observed values come from a multidimensional space and are collapsed into a 2-dimensional plane to represent the data as a function of the predicted values. For this purpose, the value of each observation is adjusted for the lost dimensions (adjusted Y) as follows (adapted from St-Pierre, 2001): (1) observed values are regressed on predicted values in a model adjusted for the random study effects (generating a single intercept for each study), with observations weighted by the square root of the number of experimental units, using the lmer function in R; (2) fitted Y values are calculated for each prediction: intercept + (slope × predicted value); (3) the conditional residuals from the regression model are extracted using the residuals function in R; and (4) adjusted Y values are computed as the summation of the conditional residuals and their corresponding fitted Y values. The R codes of this procedure are reported in Supplemental File II. Daniel et al. (2020) evaluated the quality of predictions at the within-study level by computing the fit statistics between adjusted Y values and predictions; the same methodology is used in the current study. The adjusted Y values will be termed "observed adjusted values" hereafter.
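The 4 steps of the adjustment can be illustrated with the Python sketch below. It is deliberately simplified and is not the authors' procedure: the weighted random-intercept lmer fit is replaced by an unweighted fit with one fixed intercept per study and a common within-study slope, and the overall intercept is taken as the average of the study intercepts. All names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def adjust_for_study(observed, predicted, study):
    """Adjusted Y = overall fitted value + within-study (conditional) residual."""
    groups = defaultdict(list)
    for i, s in enumerate(study):
        groups[s].append(i)

    x_mean = {s: mean(predicted[i] for i in idx) for s, idx in groups.items()}
    y_mean = {s: mean(observed[i] for i in idx) for s, idx in groups.items()}

    # Step 1 (simplified): common slope estimated from within-study deviations.
    sxy = sum((predicted[i] - x_mean[s]) * (observed[i] - y_mean[s])
              for s, idx in groups.items() for i in idx)
    sxx = sum((predicted[i] - x_mean[s]) ** 2
              for s, idx in groups.items() for i in idx)
    slope = sxy / sxx
    study_intercept = {s: y_mean[s] - slope * x_mean[s] for s in groups}
    overall_intercept = mean(study_intercept.values())

    adjusted = []
    for o, p, s in zip(observed, predicted, study):
        fitted = overall_intercept + slope * p                     # step 2
        conditional_residual = o - (study_intercept[s] + slope * p)  # step 3
        adjusted.append(fitted + conditional_residual)             # step 4
    return adjusted, slope
```

In effect, each observation is shifted by the difference between its own study intercept and the overall intercept, so between-study offsets are removed while the within-study response to predictions is preserved.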

Influence of Moderators
We explored the relationships between residual errors and factors (categorical and continuous), herein named moderators, to identify elements for feed program improvement. The presence of a significant relationship suggests the feed program might not properly represent the effect or the impact of the moderator.

Categorical Moderators
The influence of a categorical moderator on residual errors was evaluated in a regression model that included the interaction between the predicted values centered around their mean and the moderator. The selected categorical moderators were the digesta sampling site and the microbial marker used to measure MiN. In the final data set, the digesta was sampled from 4 different sites: (1) reticulum in Naadland et al. (2016); (2) omasum in 61 studies (224 treatment means); (3) abomasum in Mabjeesh et al. (1997), Shabi et al. (1998, 1999) and Gorniak et al. (2014); and (4) duodenum in 246 studies (1,052 treatment means). Therefore, the digesta sampling site was categorized as omasal (omasum and reticulum) or duodenal (duodenum and abomasum). Initially, the digesta was mostly sampled from the duodenum but sampling at the omasum has gained popularity since the early 2000s. To the best of our knowledge, reticulo-omasal sampling was reported only in one study before 1997 (Nagel and Broderick, 1992); therefore, the influence of sampling site was evaluated on studies published from 1997 onwards.
The influence of microbial markers on MiN outflows was evaluated separately for duodenal and omasal studies published from 1997 onwards. Because DAPA was not used as a microbial marker in omasal studies, only the effect of 15N vs. purines was evaluated per digesta sampling site. Outflows of MiN were measured using 15N in 80 and 160 treatment means (21 duodenal and 44 omasal studies, respectively), and purines in 293 and 29 treatment means from, respectively, 64 duodenal and 9 omasal studies published from 1997 onwards.

Continuous Moderators
For each N outflow and each feed program, separate linear regression analyses were conducted between the residual errors and each moderator in univariate models adjusted for the random effect of study. The continuous moderators included DMI (kg/d), dietary concentrations (% of DM) of NDF, CP, starch, and EE, and RDP (% of CP), Rum_dcNDF (% of NDF), and Rum_dcSt (% of starch); starch, Rum_dcNDF, and Rum_dcSt were not available with NRC. Multivariate regression analyses were conducted on the full model which included all moderators with P ≤ 0.10 in univariate models, using a backward elimination procedure with a significance level of P ≤ 0.10. The continuous moderators were centered around their mean, with observations weighted by the square root of the number of experimental units per treatment. The correlation matrix among moderators was computed using the cor function from the stats package in R.

Multilevel Mixed-Effect Meta-Regression Model
All models were computed using the rma.mv and robust functions of the metafor package in R (version 3.8-1; REML method; Viechtbauer, 2010, 2018, 2022; see R codes in Supplement II). The square root of the estimated amount of residual heterogeneity (σe) was reported in the robust function output and the Akaike's information corrected criterion (AICc) was computed by the fitstats.rma function of metafor in R. A multilevel model was used to account for the hierarchical structure of data. For example, a digesta flow study (PubID) typically reports the data from 4 dietary treatments (TID) fed to 4 cows assigned to a 4 × 4 Latin square experimental design (Exp), i.e., 4 cow observations per TID, 4 TID in 1 Exp, and 1 Exp in 1 PubID as in Palmquist et al. (1993). Apart from this simple structure and coding (1 PubID/1 Exp/4 TID), several studies in the data set had a more complex structure: for example, Schwab et al. (1992) sampled digesta from 4 cows assigned to a 4 × 4 Latin square design at each of 4 stages of lactation, i.e., 1 PubID/4 Exp/16 TID. The hierarchical structure of data needs to be accounted for because the 4 Exp in Schwab et al. (1992) have more in common than the Exp in Palmquist et al. (1993); otherwise it would be in violation of the principle that studies should be statistically independent in a meta-analytic review (Wang and Bushman, 2007).
The variance component in the multilevel model was divided into 3 parts to indicate where most of the random variation lies, i.e., between (1) TID nested within Exp, itself nested within PubID, (2) Exp nested within PubID, and (3) PubID. The statistical model including an interaction term can be written as: $y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u_i + w_{ij} + e_{ij}$, where $y_{ij}$ is the dependent variable for the ith Exp (i = 1, …, $I_j$) clustered in the jth PubID (j = 1, …, $J_{jk}$); $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ being, respectively, the fixed effect parameters corresponding to the intercept, the independent variables $x_1$ and $x_2$, and the interaction between $x_1$ and $x_2$; $u_i$ and $w_{ij}$ are, respectively, random effects for the residual random heterogeneity at the Exp and PubID levels; and $e_{ij}$ is the sampling error. The Exp and PubID random effects are assumed to be independently, identically, and normally distributed with a mean of zero and variances of $\sigma^2_u$ and $\sigma^2_w$, respectively. The sampling errors are assumed to be independently and normally distributed with a mean of zero and a variance of $v_{ij}$. Therefore, the $v_{ij}$ are the (approximately) known and heteroscedastic sampling variances of the independent variables. Unbiased estimates of fixed effects and valid estimates of P-values are obtained using the robust function which does not change the weight matrix but only affects the way the variance-covariance matrix of fixed effects and downstream SE and P-values are computed.

Data Quality Assessment
Data quality assessment is strongly advised before the use of a data set for conducting meta-analyses to get accurate and unbiased results. Several methods were used to assess the quality of data in the final data set. First, fit statistics and RatioCPIn were compared between the best quality studies and the other ones; the former being a subset of 42 studies with CP reported for all feed ingredients (163 diets). In the other studies, the nutrient composition was missing for some or all feed ingredients, and default CP values from the feed library of each feed program had to be used to construct the diets. Second, prediction errors might arise from unreliable digesta flow measurements. Different approaches are used to measure digesta flow, which presents a great technical challenge (Ipharraguerre et al., 2007), especially if using a single-marker approach (Hristov et al., 2019). We used NASEM to predict N outflows and compared the fit statistics from studies using different approaches to measure digesta flow: reentrant cannula (no marker), single- or multiple-marker approach in duodenal studies; and omasal studies which all used a multiple-marker approach. Finally, Daniel et al. (2020) removed the effect of DMI from the outflows and compared the quality of predictions with NRC (2001) and the recently updated Institut National de la Recherche Agronomique (INRA, 2018); therefore, we converted the data to g N/kg DMI and compared our NRC predictions with those reported by Daniel et al. (2020).

Additional Analyses - Simpler Models and Outflow Units
The precision and accuracy to predict N outflows were compared between simpler models based on dietary factors (DMI, OM intake, N intake) expected to influence N outflows vs. the more complex models developed by NRC, NASEM, and CNCPS. In addition, for comparison purposes, the evaluation of the 3 feed programs, the effects of sampling site and microbial marker on residual errors, as well as the multivariate analyses were also conducted on N outflows expressed on a g/kg DMI basis. Results of N outflows in g/d are presented and discussed in the main body of the text whereas additional tables and figures for values expressed in g/kg DMI are reported as supplemental material and will be briefly discussed at the end of the paper.

Outliers
An inner StudentResid function in R was used to identify outliers (Anon, 2015; Stack Overflow, "How can I extract studentized residuals from mixed model (lmer)?"). Observed values were regressed on predicted values in models adjusted for the random study effects, with observations weighted by the square root of the number of experimental units per treatment. Observations with an absolute studentized residual value greater than 3 were deemed outliers and removed from all meta-regressions involving the outflow. The level of agreement between observed values and predicted values was assessed in plots created using the ggplot2 package (version 3.3.6; Wickham, 2009). New plots were also created to explore the relationships between residual errors and predicted values.

Features of the Data Set
Summary Statistics. The summary statistics on cow and diet characteristics are detailed in Tables 1 and 2, respectively. Studies included in the data set covered a large range of DMI (3.3 to 30.4 kg/d), N intake (66 to 947 g/d), milk production (4.5 to 48.2 kg/d), and milk true protein yield (MTPY; 235 to 1,420 g/d). On average, NAN outflow represented 96% of N intake, in line with the meta-analyses of Broderick et al. (2010) and Ipharraguerre and Clark (2014); and 59% of NAN was from MiN, in agreement with Clark et al. (1992) and Roman-Garcia et al. (2016). In line with previous meta-analyses on N utilization in dairy cows (e.g., Huhtanen and Hristov, 2009; Roman-Garcia et al., 2016), MTPY, reported for 850 treatment means, represented, on average, 27% of CP intake, and between 41 and 46% of the estimated MP supply, depending on the feed program. It is worth noting that the dietary nutrient composition was lacking in several publications (Table 2). In older studies, it is understandable that analyses of feed ingredients were more limited, but for recent studies, a more comprehensive feed chemistry analysis and dietary nutrient composition would help to reduce diet misspecifications and improve the quality of meta-analyses. The summary statistics of diet characteristics predicted with the 3 feed programs are markedly similar among feed programs except for Rum_dcNDF between NASEM and CNCPS (Table 3). A negative quadratic term for diet CP concentration is included in the equation developed to estimate Rum_dcNDF (NASEM, 2021; in equation 20-52) and negative estimates of Rum_dcNDF are set to 0.1% of NDF by default (see below for its impact on MiN predictions).
Data Quality Assessment. One challenge in constructing the diets was the selection, from the feed table of each program, of the feed ingredients that best corresponded to those used in the publication. We were able to verify that feeds were selected properly because the CCC between dietary CP intake calculated using reported vs. feed table values averaged 96% among the 3 feed programs in the subset of high quality studies (data not shown). In Supplemental Figure S2, RatioCPIn is shown as a function of CP intake for each feed program in (1) the high quality studies, (2) all other studies before the removal of outliers, and (3) the final data set. As reported previously, the SD of RatioCPIn was smallest for the subset of studies reporting the composition of all feed ingredients (SD in high quality studies = 0.034 vs. SD = 0.07 for other studies). However, SD of RatioCPIn was similar between feed programs in the 3 panels of Supplemental Figure S2 (data not shown), suggesting that the error related to dietary N intake did not differ among feed programs. In Supplemental Table S2, the fit statistics for the predictions of N outflows with NASEM are very similar between the data set of studies reporting the composition of all feed ingredients and the data set from the other studies where the information was missing for some or all feed ingredients. This indicates that, overall, the use of default feed table values for missing feed ingredient composition did not have a negative impact on the quality of the data.
Another issue was to evaluate the impact of including in the data set duodenal studies that used a single-marker approach to measure digesta flow. In Supplemental Table S3, the fit statistics for the predictions of N outflows with NASEM were comparable between duodenal studies using different approaches to measure digesta flow. Therefore, the method used to measure digesta flow in duodenal studies does not generate unrealistic variations in N outflow measurements (very similar RSR ratios). We were able to build a large data set with several nutritional scenarios useful to analyze the influence of categorical and continuous moderators on residual errors. However, the results in Supplemental Table S3 indicate a major difference between sampling sites that will be addressed in detail below. In Supplemental Table S4, despite the large difference in the number of treatment means, our results were markedly similar to those reported by Daniel et al. (2020).

Prediction of Outflows and Rumen Degradabilities
The statistical fits for the N outflows are reported in Tables 4 and 5, for the observed values as reported and adjusted for the effect of study, respectively. For each feed program, the corresponding relationships are depicted between observed values and predictions in Figure 1, and between observed adjusted values and predictions in Figure 2. In Table 4, the analysis was performed on raw observed values vs. predictions; thus each datum was treated as an independent observation, and prediction errors included differences between studies in animals, DMI, diet, methodology (measurement approaches), management conditions, and laboratories, among others. This type of analysis represents the situation in which, on a given farm with no previous background, there is no possibility to adjust the prediction of N outflows based on previous observations because the outflow is never estimated experimentally. Therefore, it is relevant to know how models perform in this context, basically predicting the absolute value of N outflow based on ration composition (ingredients and chemical composition) and intake. In Table 5, the analysis was performed on observed adjusted values vs. predictions to support a more equivalent comparison among studies.
The evaluation would indicate if models can accurately predict the N outflow response to a dietary change, a very useful tool for optimizing an existing ration, even on-farm. Because both approaches have their own significance, both are presented for the N outflows. For comparison purposes between feed programs, outflows will be discussed by type of N fraction.
NAN Outflows. The NAN was under-predicted (P ≤ 0.005) by the 3 feed programs (Table 4), with the mean biases of NRC and NASEM being biologically relevant at 6.0 and 5.3% of the observed mean, respectively, but not that of CNCPS, which had the lowest mean bias among feed programs (P ≤ 0.001). The CCC was highest with CNCPS (84%) and lowest with NASEM (79%), with RPE and RSR ranking in the opposite direction. The ECT and ER estimates of the 3 feed programs represented less than 10% of MSPE. The slopes with NASEM and CNCPS differed from unity (P ≤ 0.01), being, respectively, greater and smaller than 1.0; the NASEM slope was the highest among feed programs (1.18 vs. 0.97 g/g; P ≤ 0.001) and induced a biologically relevant linear bias of 27 g/d (6.0% of observed mean; Table 4). Together, the mean bias and the slope indicate that, on average, NASEM underpredicted NAN for predicted values above 300 g/d (Figure S3).
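For readers wishing to reproduce this type of evaluation, the fit statistics discussed here can be sketched in a few lines of Python, following the definitions cited in the table footnotes (Lin, 1989; Theil, 1966; Bibby and Toutenburg, 1978; Moriasi et al., 2007). This is a minimal sketch, not the code used in the study. It assumes divide-by-n variances for the CCC and the MSPE decomposition, the sample SD for RSR, and a mean bias expressed as observed minus predicted; the function name and the `threshold` argument are hypothetical.

```python
import math

def fit_statistics(observed, predicted, threshold=0.05):
    """Agreement statistics between observed and predicted outflows (g/d)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Divide-by-n moments, so that ECT + ER + ED adds up to MSPE.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov_op = sum((o - mean_o) * (p - mean_p)
                 for o, p in zip(observed, predicted)) / n
    r = cov_op / math.sqrt(var_o * var_p)

    mspe = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    rmspe = math.sqrt(mspe)
    mean_bias = mean_o - mean_p  # positive value = under-prediction
    return {
        "ccc": 2 * cov_op / (var_o + var_p + (mean_o - mean_p) ** 2),
        "mspe": mspe,
        "rmspe": rmspe,
        "rpe_pct": 100 * rmspe / mean_o,             # RMSPE as % of observed mean
        "rsr": rmspe / math.sqrt(var_o * n / (n - 1)),  # sample SD assumed
        "ect": (mean_p - mean_o) ** 2,               # error in central tendency
        "er": (math.sqrt(var_p) - r * math.sqrt(var_o)) ** 2,  # regression error
        "ed": (1 - r ** 2) * var_o,                  # remaining random error
        "slope": cov_op / var_p,                     # observed regressed on predicted
        "mean_bias": mean_bias,
        # Relevance rule: bias larger than 5% of the observed mean.
        "mean_bias_relevant": abs(mean_bias) > threshold * mean_o,
    }
```

Expressing ECT and ER as fractions of MSPE gives the proportions discussed in the text; the remainder is the random (disturbance) error.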
There is scarce scientific literature comparing N outflows predicted by the feed programs analyzed in the current study, and no studies have yet reported on NASEM predictions. Bateman et al. (2001)  small biases relative to the error of measurements: the mean bias tended to be significant and the slope of the residuals was significant at −0.161 g/g (corresponding to a slope of 0.839 between observed values and predictions) but the magnitude of linear biases over the range of predictions was less than σe (84 g/d). For comparison purposes with the approach used in St-Pierre (2003), the slope was 0.98 g/g in the current data set (Table 4) and generated linear biases smaller than the SE (or σ̂e) of the regression model when recalculated at minimum (72 g/d) and maximum (749 g/d) predictions (i.e., 36 and 19 g/d, respectively; σe = 86 g/d; data not shown).
With the second objective of evaluating the ability of feed programs to predict responses of N outflows to dietary changes, analyses were also conducted on data adjusted for the effect of study (Figure 1). Overall, the adjustment for the effect of study reduced the SD of observed adjusted values, RMSPE, RPE, and RSR ratio, and allocated a greater proportion of MSPE to ECT and ER (Tables 4 and 5). The slopes were reduced without changing the pattern observed in Table 4, and the linear biases were now all below the 5% threshold for biological relevance (Table 5). Note that the analysis of the data adjusted for the effect of study describes the differences in NAN induced by a diet change within a study; therefore, the main focus is on the slope (and linear bias), not the mean bias.
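The principle of adjusting observed values for the effect of study can be illustrated with a simplified Python sketch. It assumes a common slope with fixed study-specific intercepts, which is a simplification of the mixed model with random study effects (St-Pierre, 2001) and not the procedure used in the study; the function name is hypothetical. Each adjusted value is the overall regression line plus the within-study residual, so between-study shifts are removed while within-study responses are kept.

```python
from collections import defaultdict

def adjust_for_study(study, predicted, observed):
    """Return (slope, overall_intercept, adjusted_observed).

    Assumption: fixed study intercepts and one common slope.
    """
    groups = defaultdict(list)
    for i, s in enumerate(study):
        groups[s].append(i)

    # Pooled within-study slope of observed on predicted.
    num = den = 0.0
    means = {}
    for s, idx in groups.items():
        mp = sum(predicted[i] for i in idx) / len(idx)
        mo = sum(observed[i] for i in idx) / len(idx)
        means[s] = (mp, mo)
        num += sum((predicted[i] - mp) * (observed[i] - mo) for i in idx)
        den += sum((predicted[i] - mp) ** 2 for i in idx)
    slope = num / den

    # Study-specific intercepts and their observation-weighted average.
    intercepts = {s: mo - slope * mp for s, (mp, mo) in means.items()}
    overall = sum(intercepts[s] * len(idx) for s, idx in groups.items()) / len(study)

    adjusted = []
    for i, s in enumerate(study):
        resid = observed[i] - (intercepts[s] + slope * predicted[i])
        adjusted.append(overall + slope * predicted[i] + resid)
    return slope, overall, adjusted
```

Because the study shifts are removed, the SD of the adjusted values is smaller than that of the raw values whenever studies differ in their mean level, which is consistent with the reduction in RMSPE, RPE, and RSR described above.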
We found one study comparing 2 of the feed programs used in the current study and in which the effect of study was removed (Pacheco et al., 2012). The authors used a data set of 40 duodenal studies (154 diets) and compared predictions from NRC and the Agricultural Modeling and Training Systems feed program (AMTS; a commercial version of CNCPS v6.1 updated to an improved version 6.5). For NAN, there was no mean bias for NRC and a small one (12 g/d) with AMTS, and the slopes were 0.82 and 0.88 g/g with NRC and AMTS, respectively. The authors found that AMTS yielded a slightly lower RMSPE (25.5 vs. 28.9 g/d) and RPE (5.3 vs. 6.0%) than NRC, in line with the current study (Table 5).
Overall, results from Tables 4 and 5, and Figures 1 and 2, suggest that if a diet is formulated on a farm for which we do not have the history (e.g., ration formulation and ingredient composition, estimated DMI, production), CNCPS will likely give the best predictions of the absolute NAN outflows (5% smaller RMSPE on average), whereas NRC would yield a small but constant under-prediction and NASEM would under-predict outflows in most current on-farm situations, with the under-prediction increasing at elevated NAN predictions. If we are changing an existing diet and want to estimate the potential change in NAN outflows, the 3 feed programs appeared equally good because they did not present any meaningful linear bias.
As mentioned above, NAN is predicted as the summation of MiN and NANMN in the 3 feed programs; however, because MiN and NANMN are estimated using different concepts and variables within each feed program, the predictions of MiN and NANMN have been analyzed and are presented separately.
2 n = number of treatment means; minimum (min) and maximum (max); CCC = concordance correlation coefficient (Lin, 1989); RMSPE = root mean squared prediction error (Theil, 1966); RPE = relative prediction error (RMSPE as a % of mean observed); ECT and ER = error in central tendency and error due to the regression (Bibby and Toutenburg, 1978); RSR = ratio of RMSPE and SD of observed values (the smaller, the better; Moriasi et al., 2007).
2 n = number of treatment means; minimum (min) and maximum (max). Observed values were adjusted for the effect of experiment plus the residuals (St-Pierre, 2001; refer to the text). CCC = concordance correlation coefficient (Lin, 1989); RMSPE = root mean squared prediction error (Theil, 1966); RPE = relative prediction error (RMSPE as a % of mean observed); ECT and ER = error in central tendency and error due to the regression (Bibby and Toutenburg, 1978); RSR = ratio of RMSPE and SD of observed values (the smaller, the better; Moriasi et al., 2007).
MiN Outflows. Of the 3 N fractions, MiN had the lowest CCC for the 3 feed programs (Tables 4 and 5), suggesting a challenge not only in its prediction but, more likely, in its measurement. In the analysis of raw data (Table 4), CNCPS yielded the highest CCC, with the lowest RPE and RSR ratio. The mean bias was smallest with CNCPS (P ≤ 0.001) and below the 5% threshold; in addition, the slope was equal to 1.0. In the CNCPS feed program, a complex rumen sub-model has been developed and refined over time (Fox et al., 2004; Van Amburgh et al., 2015a) in which bacterial growth is directly related to the digestion rate and size of the digestible pool for the respective carbohydrate fractions, as long as rumen ammonia levels are adequate to meet the N demands of the bacteria. Both NRC and NASEM under-predicted (P < 0.001) MiN outflows by a large margin, the mean bias representing 12 and 8% of mean observed values, respectively, and being larger (P ≤ 0.001) with NRC than NASEM. In addition, NRC and NASEM had a slope greater than 1.0 and different from unity (P < 0.001). The slope was greater (P ≤ 0.001) with NASEM than NRC, and the linear bias was meaningful only with NASEM at 9% of mean observed values (4.5% for that of NRC; Table 4).
In Bateman et al. (2001), RPE with NRC (1989) and CNCPS v.3 were, respectively, 30.0 and 39.5% of observed MiN values, and similar to RPE in the current study.In St-Pierre (2003), there was no mean bias but the slope was significant at −0.197 g/g (or 0.803 g/g between observed values and predictions).Again the magnitude of linear biases over the range of predictions was less than σe (66 g/d).For comparison purposes, the slope was 1.15 g/g in the current data set (Table 4) and generated linear biases smaller than σe at minimum (44   2021) reported CCC, RMSPE, and RPE ratio similar to those in the current study, a significant mean bias of 9 g/d, and a slope of 1.22 g/g similar to the NRC slope in the current study (Table 4).
The analysis of the data adjusted for the effect of study showed a discrepancy in the pattern of data, especially with CNCPS (Figures 1 and 2). In Table 5, the NRC slope was equal to 1.0 whereas the NASEM and CNCPS slopes were, respectively, 1.08 and 0.77 g/g, leading to a meaningful linear bias only with CNCPS at 8% of mean observed values. This indicates that CNCPS over-predicts MiN outflow responses with increased predictions. Compared with slopes on raw data (Table 4), MiN slopes dropped by 0.21 g/g on average with observed adjusted values (Table 5), resulting in improved predictions with NRC and NASEM, but worsened MiN predictions with CNCPS for reasons that are not clear. One reason could be that the CNCPS update (based on omasal studies) might not predict MiN as well in duodenal studies, which constitute 83% of the data set (more discussion below). In Pacheco et al. (2012), the mean biases for MiN were, respectively, 6 and −11 g/d, and the slopes 0.67 and 0.70 g/g with NRC and AMTS. In this latter study, the RPE was nearly twice as large for MiN vs. NAN with NRC and AMTS, similar to the current study.
As mentioned above, NRC estimates MiN mainly from TDN intake. With the aim of better representing the biology, NASEM opted for a model based on diet characteristics occurring in the rumen and having a direct impact on microbial growth, selecting the intakes of RDP, rumen-degraded starch and rumen-degraded NDF as having the largest impact based on previous work by Roman-Garcia et al. (2016) and White et al. (2016; 2017b). Although NASEM intended to improve MiN predictions using complex equations, RMSPE, RPE and RSR ratio were almost identical between NRC and NASEM in Table 4, and slightly better in Table 5, indicating that predictions of MiN were moderately improved with NASEM. Indeed, the estimations of RDP, Rum_dcNDF, and Rum_dcSt still remain a challenge, as evidenced by the statistical fits between observed adjusted values and predictions, especially for Rum_dcNDF with NASEM (Table 6, Figure 2). This weakness is partially overcome because the intake of the degradable fractions is used to predict MiN; therefore, the main driver of predictions becomes the DMI.
An issue with NRC and NASEM appears to be their limited ability to predict MiN outflows above 375 g/d (i.e., 49% of maximum value, Table 1), whereas 14% of the observed data are above that point (Figure 1). As mentioned above, another issue with NASEM is its inadequacy to predict MiN outflows for diets with a high CP content, due to the inclusion of a negative quadratic term for diet CP concentration in the equation used to estimate Rum_dcNDF. For example, in Palmquist et al. (1993), one diet had 29.6% CP and Rum_dcNDF was negative but set at 0.1% of NDF (refer to Table 3). Predictions of MiN outflow for that diet were, respectively, 197, 14, and 198 g/d with NRC, NASEM, and CNCPS (data not shown). This issue needs to be corrected in NASEM because it occurs for diets with CP above 27% of DM. However, the discussion and conclusion on the predictions of MiN outflows will be pursued below, taking into account the site of sampling and the microbial marker.
NANMN Outflows. As mentioned above, NANMN is obtained by difference between measured NAN and MiN, and programs differ in their assumption of which N compounds contribute to NANMN: RUP plus endogenous protein with NRC and NASEM, and only the former with CNCPS. Because of these different assumptions on the composition of NANMN between feed programs, the comparison between feed programs could only be performed for NANMN and not for RUP. In Table 4, NASEM presented no mean bias, whereas NRC over-predicted and CNCPS under-predicted NANMN outflows (P < 0.001), but their mean biases were just below the 5% threshold for biological relevance. All slopes were lower than 1.0 and different from unity (P ≤ 0.002), although the NASEM slope was closer to 1.0 and different from the 2 others (P ≤ 0.001). Accordingly, the linear bias was meaningful only with NRC and CNCPS at, respectively, 15 and 13% of the mean observed values (Table 4, Figure 1).
In Bateman et al. (2001), RPE with NRC (1989) and CNCPS v.3 were, respectively, 50.5 and 43.8% of observed NANMN values (312 g/d), i.e., 33% higher than the RPE in the current study. In St-Pierre (2003), there was no mean bias but the slope bias was significant at −0.131 g/g (or 0.869 g/g between observed values and predictions). Again, the magnitude of linear biases over the range of predictions was less than σe (65 g/d). For comparison purposes, the slope was lower at 0.74 g/g in the current data set (Table 4), which generated a linear bias smaller than σe at minimum (22 g/d) but closer to it at maximum (402 g/d) predictions (i.e., 37 and −61 g/d, respectively; σe = 64 g/d; data not shown). Note that 61 g N/d translates into a difference of nearly 6 kg of milk per day (see calculations above), i.e., an acceptable linear bias vs. σ̂e; in comparison, the linear bias of 29 g/d (or 3 kg milk yield/d) was considered biologically relevant in the current study. In Hanigan et al. (2021), all fit statistics were similar to those reported in the current study, except for a lower CCC (55 vs. 68%). The authors detected a mean bias of −9 g N/d and the slope was 0.70 g/g, both in line with the current study.
As for NAN, the analysis of data adjusted for the effect of study had a small impact on results (Table 5; Figure 2). The NASEM had the best model fit statistics, but its linear bias reached the 5% threshold for biological relevance, as did those of the other 2 feed programs. In Pacheco et al. (2012), the mean biases for NANMN were 10 and 21 g/d, and the slopes 0.68 and 0.94 g/g, with NRC and AMTS, respectively. Interestingly, the slope was closer to 1.0 for the older vs. the recent version of CNCPS, which could be related to the sampling site of studies used to develop the model.
Overall, with the analysis on raw data, i.e., without accounting for systematic differences between studies (Table 4 and Figure 1), CNCPS offers an advantage over the other 2 feed programs to predict NAN and MiN outflows with no biologically relevant mean and linear biases, whereas NRC had mean biases and NASEM had mean biases plus linear biases. The NANMN outflows were predicted without bias with NASEM, whereas NRC and CNCPS had linear biases as they over-predict NANMN at higher observed values. Finally, the behavior of NANMN predictions somewhat counterbalanced the weaknesses of MiN predictions to finally yield moderately unbiased predictions of NAN outflows. The analysis of data adjusted for the effect of study resulted in the MiN slope being lower than 1.0, associated with a meaningful linear bias with CNCPS. In addition, a linear bias for NANMN was present in the 3 feed programs, indicating an over-prediction of the variations induced by dietary change (Table 5 and Figure 1). However, variations of NAN outflows induced by a change of diet would be similarly predicted by the 3 feed programs.
3 True rumen CP degradability (% of CP) = [N intake − (NAN − microbial N − endogenous N)]/N intake × 100 (all in g/d; adapted from Pappritz et al., 2011), where endogenous N (g/d) = 15.4 + 1.21 × DMI (kg/d) (Lapierre et al., 2016) for duodenal and abomasal sampling, and 0.63 × endogenous N for reticular and omasal sampling (refer to text).
Rumen Degradabilities. Observed RDP values were recalculated for all treatments that reported NAN and MiN; therefore, the 3 programs are compared on the same basis. In the analysis on data adjusted for the effect of study (Table 6), all feed programs underpredicted (P < 0.001) RDP, but the difference was only relevant with CNCPS (64 vs. 70% of CP) with ECT at 61% of MSPE. The CCC was 25% lower with CNCPS compared with the other 2 programs, with RPE and RSR ratio ranking in the opposite direction. All slopes were below 1.0 and different from unity (P < 0.001), but they were similar among feed programs, indicating that variations in predicted RDP in response to a dietary change were larger than observed (Figure 2).
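The recalculation of observed RDP can be illustrated with a short Python sketch based on the table footnote defining true rumen CP degradability (adapted from Pappritz et al., 2011) and the endogenous N equation of Lapierre et al. (2016). The function names and the input values of the usage example are hypothetical and serve only to show the arithmetic.

```python
def endogenous_n(dmi_kg_d, site):
    """Endogenous N outflow (g/d) at the sampling site."""
    base = 15.4 + 1.21 * dmi_kg_d
    if site in ("duodenal", "abomasal"):
        return base
    if site in ("omasal", "reticular"):
        # Reduced endogenous contribution for pre-abomasal sampling sites.
        return 0.63 * base
    raise ValueError(f"unknown sampling site: {site}")

def observed_rdp_pct(n_intake, nan, microbial_n, dmi_kg_d, site):
    """True rumen CP degradability (% of CP); N terms in g/d."""
    undegraded_n = nan - microbial_n - endogenous_n(dmi_kg_d, site)
    return (n_intake - undegraded_n) / n_intake * 100.0

# Hypothetical treatment mean: 600 g N intake/d, 580 g NAN/d,
# 340 g MiN/d, and 22 kg DMI/d.
rdp_duodenal = observed_rdp_pct(600, 580, 340, 22, "duodenal")
rdp_omasal = observed_rdp_pct(600, 580, 340, 22, "omasal")
```

For identical N intake, NAN, and MiN, the smaller endogenous N assumed at the omasum leaves more N attributed to undegraded feed protein, hence a lower calculated RDP than at the duodenum.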
In contrast with CNCPS, Rum_dcNDF was underpredicted with NASEM and had poor model fit statistics: CCC was low at 19% vs. 83%, with high RPE and RSR ratio, and ECT at 72% of MSPE (Table 6). The slope was small at 0.31 and different from unity (P < 0.001), in contrast with that of CNCPS, which was very close to 1.0 (Figure 2). The Rum_dcSt was under-predicted (P < 0.001) with NASEM but over-predicted with CNCPS (ECT at 63% of MSPE). Model fit statistics were similar between feed programs (Table 6), and the slopes were smaller than 1.0 and different from unity (P ≤ 0.005).
The NASEM equations to predict Rum_dcNDF and Rum_dcSt were developed by White et al. (2016) from a data set containing 156 and 193 treatment means, respectively. In that study, the CCC were 95 and 90% for Rum_dcNDF and Rum_dcSt, respectively, and the slopes were close to 1.0 (1.03 and 1.06%/%, respectively). The reasons for the discrepancy between our results and those for Rum_dcNDF in White et al. (2016) are not clear and should be investigated further. Albeit biologically attractive, the inclusion of RDP, Rum_dcNDF, and Rum_dcSt for predictions of MiN outflows definitely requires more work to have a positive impact.
The analyses of raw and adjusted data suggested that predictions of the N outflows might be improved, especially for MiN outflows with NASEM and CNCPS. Therefore, a first objective was to determine if moderators related to measurements of the outflows could be associated with these variations; a second objective was to identify moderators related to the diet that would need to be included or better estimated in the feed programs.

Moderators Related to Sampling Methodology.
Sampling Site. In the last 2 decades, there has been a shift in digesta sampling from the duodenum to the omasum in dairy cows, and the advantages of omasal sampling have been presented and discussed previously (Ahvenjärvi et al., 2000; Broderick et al., 2010). In Supplemental Tables S5 and S6, N intake in omasal studies averaged 105 to 107% of that in duodenal studies, but observed adjusted NAN, MiN and NANMN outflows in omasal studies displayed distorted proportions averaging, respectively, 115%, 138%, and 87% of those measured in duodenal studies. There are major differences in measured outflows relative to N intake between sampling sites, and the discrepancy is more important for MiN.
The influence of sampling site on residual errors was evaluated in regression models that included the interaction between the predicted values centered around their mean and the moderator; therefore, the intercept corresponds to the mean bias (Table 7 and Figure 4). Among the 3 feed programs, mean biases differed (P ≤ 0.02) between sampling sites, except for NAN with NASEM and CNCPS. In line with the distortion described above, mean biases of MiN were 55 to 61 g/d (P ≤ 0.01) greater in omasal vs. duodenal studies, indicating that observed MiN outflows were systematically higher in omasal vs. duodenal studies than those predicted by the feed programs. As expected, the distortion had the opposite effect on mean biases of NANMN outflows, which were 28 to 51 g/d smaller (P ≤ 0.02) in omasal vs. duodenal studies. It can be concluded that the positive mean biases of MiN outflows with NRC and NASEM in Tables 4 and 5 were partly driven by the strong positive mean biases from omasal studies (Table 7).
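The structure of this residual analysis can be sketched in Python as follows. The sketch fits, for each level of a categorical moderator (e.g., sampling site), an ordinary least squares regression of residuals (observed minus predicted) on predictions centered on their overall mean, so that each intercept is the mean bias at the average prediction and each slope is the slope of the residuals. It ignores the random study effects and weighting used in the study, and the function name is hypothetical.

```python
def bias_by_moderator(moderator, predicted, observed):
    """Return {level: (mean_bias, residual_slope)} for each moderator level.

    Assumption: separate OLS lines per level, which matches a fixed-effects
    model with a moderator x centered-prediction interaction.
    Each level needs at least 2 distinct predicted values.
    """
    n = len(predicted)
    center = sum(predicted) / n  # overall mean of predictions
    results = {}
    for level in set(moderator):
        idx = [i for i in range(n) if moderator[i] == level]
        x = [predicted[i] - center for i in idx]
        r = [observed[i] - predicted[i] for i in idx]
        mx = sum(x) / len(x)
        mr = sum(r) / len(r)
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r))
        slope = sxy / sxx
        intercept = mr - slope * mx  # residual at the average prediction
        results[level] = (intercept, slope)
    return results
```

A positive intercept for one site indicates under-prediction at the average prediction for that site, and a positive residual slope indicates that the under-prediction grows as predictions increase.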
The site of sampling had an effect on the MiN slope (P ≤ 0.05) with NRC and NASEM, and a tendency (P = 0.06) on that with CNCPS: the slopes were, respectively, 0.40, 0.58, and 0.44 g/g greater in omasal vs. duodenal studies. Together with the mean biases, these results indicate that MiN predictions with NRC and NASEM in omasal studies were lower than reported values, the under-prediction increasing with incremental outflows, whereas there was no significant slope on the residuals in duodenal studies. The opposite was observed with CNCPS, with an over-prediction in duodenal studies starting at 250 g/d on average and increasing with incremental outflows vs. no significant slope for omasal studies (Figure 3). On NAN and NANMN outflows, the site of sampling only tended (P = 0.06) to have an impact on the slope of NAN predicted with CNCPS, the predictions in duodenal studies being lower than observed values below 500 g/d on average and vice versa above (Figure 3).
The ability of each program to predict N outflows in different conditions is obviously related to the type of studies used to build the equations of the program, e.g., different sampling sites, as well as dairy cows, steers, and beef cattle. The NRC used 99 duodenal studies with mixed cattle, NASEM used 483 duodenal and 103 omasal studies with dairy cows, whereas the CNCPS v6.5 update was based on 20 omasal studies with dairy cows and steers (3 steer studies; Van Amburgh et al., 2015a). Clearly, from the results above and those in Supplemental Tables S5 and S6, CNCPS predicted NAN and MiN outflows from omasal studies better than did NRC and NASEM; this was expected, as no duodenal studies contributed to the last CNCPS update. In contrast, NRC better predicted NAN and MiN outflows from duodenal vs. omasal studies, as it only used duodenal studies to develop its equations. The NASEM, which used both types but omasal studies to a lesser extent, did not improve predictions of NAN and MiN in either duodenal or omasal studies compared with NRC. Van Amburgh et al. (2015a) reported statistical fits for omasal MiN outflows comparable to those reported in Supplemental Table S6: CCC of 87% and RPE estimated at 17% of observed mean (n = 74 treatment means).
In a meta-analysis conducted on omasal studies and reporting N outflows, Broderick et al. (2010) reported that the observed MiN outflows were 26% higher than MiN predicted with NRC, in agreement with the underprediction of MiN with NRC in the current study (287 vs. 379 g/d or 24% of observed mean; Table 7). This discrepancy can be explained using results from studies where MiN outflows were determined simultaneously using different sampling sites and microbial markers. Ipharraguerre et al. (2007) compared both the site of sampling (duodenal vs. omasal) and the microbial marker (15N vs. purines) using 3 mid-lactating dairy cows in an incomplete 4 × 4 Latin square design. The authors reported a difference of 97 g/d between MiN outflows measured at the omasal canal vs. the duodenum (381 vs. 284 g/d) using 15N labeling, but only a difference of 25 g/d (288 vs. 263 g/d) using purines as the microbial marker. They suggested that the MiN outflow at the omasal canal might have been overestimated, in relation to the high correlation between omasal flow of MiN and the proportional contribution of the small-particle phase (SP) and fluid phase (FP) to the reconstituted omasal digesta, but not with that of the large-particle phase (LP). Ipharraguerre et al. (2007) argued that rumen bacteria pass to the gut mainly attached to SP and as solutes in FP, and that potential losses of LP during the reconstitution of omasal samples could result in greater proportions of SP and FP (hence bacteria) than in true digesta.
Similarly, Ahvenjärvi et al. (2000) compared the site of sampling (duodenal vs. omasal) using 4 cows in a 4 × 4 Latin square design. The authors reported that MiN outflows tended to be 6% higher at the omasal canal vs. the duodenum, using purines as microbial marker. Furthermore, Ahvenjärvi et al. (2000) and Ipharraguerre et al. (2007) reported that the site used to harvest bacteria (rumen, omasum or duodenum), and the relative proportion of fluid- and particle-associated bacteria used to measure the ratio of purines:N or the 15N enrichment of the bacteria, could alter the MiN outflow by up to 50 g/d.
In the current study, 15N was the microbial marker of choice in omasal studies: 160 and 29 treatment means using 15N and purines, respectively. Therefore, for MiN outflows, it cannot be excluded that the large discrepancy in ECT between omasal and duodenal studies is a technical issue related to the microbial marker.
Microbial Markers. To further assess if differences observed between sampling sites were related to microbial markers (15N or purines), their effect on MiN outflow was evaluated per sampling site in studies published from 1997 onwards (Table 8 and Figure 4). In duodenal studies, the microbial marker had no effect (P ≥ 0.37) on the mean biases and the slopes. In omasal studies, the microbial marker had an effect (P ≤ 0.01) on the mean biases, which were on average 57 g/d higher (a relative increase of 17%) with 15N vs. purines in the 3 feed programs, with no effect on the slopes (Table 8; Figure 4).
Our results support the findings of Ipharraguerre et al. (2007) reporting that MiN outflows were higher using 15N labeling vs. purines when sampling digesta at the omasal canal, with no difference when sampled at the duodenum. Ipharraguerre et al. (2007) discussed potential problems associated with the use of 15N labeling in omasal studies while agreeing that using 15N labeling rather than purines could reduce the variation of MiN outflow measurements. Therefore, technical challenges with the use of 15N labeling as a microbial marker are associated with differences in its enrichment at different sites of sampling, plus the difficulty of reconstituting a "true" digesta sample.
Altogether, these data indicate that the mean biases of NAN and MiN (observed > predicted; Tables 4 and 5) with NRC and NASEM partly originated from the under-prediction of MiN outflows in omasal studies (Table 7), which is more associated with 15N being used as the microbial marker (Table 8 and Figure 4). However, few data were available for omasal sampling with purines as microbial marker; therefore, results need to be interpreted with caution and more research is warranted to compare microbial markers in omasal studies. Unfortunately, the current data set does not allow us to determine which sampling site or microbial marker is best, as the "true" MiN amount outflowing the rumen is unknown. It seems clear, however, that omasal MiN measurements are, on average, 50 g/d higher than their duodenal counterpart, with 15N potentially associated with higher measurements than purines in omasal studies. To resolve this conundrum, EAA outflow predictions from the 3 feed programs could be compared or related to EAA net portal absorption in a meta-analytical re- were too low or too high. However, one should keep in mind that gut metabolism has already taken its toll on AA when measuring net portal absorption of EAA (Pacheco et al., 2006).
Diet Characteristics. Finally, we determined if dietary characteristics were correlated with the residual errors in multivariate regression models adjusted for random study effects (Table 9). The correlation matrix among diet parameters is reported in Supplemental Table S7. For a factor already included in the prediction equation of an outflow by a feed program, a correlation with the residual errors would mean that its coefficient or the structure of the equation should be re-examined within the feed program to improve the prediction of the outflow. For a factor missing in the prediction equation, a correlation would indicate that its inclusion in the equation might reduce the prediction error. As an example, the results for MiN and NANMN are discussed below.
Compared with NANMN, several moderators were correlated with the residuals of MiN outflows, highlighting the complexity of predicting MiN by feed programs and/or difficulties associated with the measurements of MiN in digesta flow studies (Table 9). Overall, depending on the feed program, the inclusion of the following moderators in MiN prediction models (or their estimation) should be re-examined to reduce the prediction error: DMI, NDF, CP, EE, RDP, Rum_dcNDF, and Rum_dcSt. Dry matter intake tended (P = 0.06) to be correlated positively with residual errors of MiN predicted with NRC. At mean and constant EE and RDP, each 1 unit (kg/d) increment in DMI would increase the residual error of MiN by 1.6 g/d. As MiN is under-predicted with NRC (mean bias = 32 g/d; P < 0.001), the results indicate that, globally and in absolute terms, the error in the underprediction of MiN tends to increase with incremental values of DMI. Although the effect of DMI on residual errors of MiN is small, it might be associated with the calculation of MiN as a function of discounted TDN intake, which ignores the increased efficiency of microbial synthesis with incremental DMI (NRC, 2001).
Figure caption (fragment): ... 7). The vertical dotted line corresponds to the average of predictions.
In univariate analyses, residual errors of MiN outflow predicted with NASEM were correlated positively (P ≤ 0.008) with DMI and starch but not in the multivariate analysis (Table 9). Residual errors of MiN outflow predicted with NASEM were correlated negatively (P ≤ 0.003) with NDF, CP, EE, RDP, and Rum_dcSt; as MiN is under-predicted with NASEM (mean bias = 22 g/d; P < 0.001), the results indicate that, globally and in absolute terms, the error in the underprediction of MiN decreases with incremental values of NDF, CP, EE, RDP, or Rum_dcSt. In univariate analyses, residual errors of MiN outflow predicted with CNCPS tended (P = 0.06) to be correlated positively with DMI and were correlated negatively (P = 0.01) with RDP but not in the multivariate analysis (Table 9).
For NANMN, DMI was correlated negatively with residual errors predicted with NRC (P = 0.03) and CNCPS (P = 0.08), and RDP was correlated positively (P ≤ 0.004) with residual errors predicted with the 3 feed programs, as was Rum_dcNDF (P = 0.03) with residual errors predicted with CNCPS (Table 9). For example, the results indicate that, at mean and constant DMI, each 10% increment in RDP would increase the residual error of NANMN predicted by NRC by 18 g/d (Table 9). As NANMN is over-predicted with NRC (mean bias = −11 g/d; P = 0.005), the results indicate that, globally and in absolute terms, the error in the overprediction of NANMN decreases with incremental values of RDP (Figure 5). These results suggest that ruminal protein degradability was overestimated, in line with the RDP slopes being below 1.0 with the 3 feed programs in Table 6.
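The arithmetic behind this interpretation can be sketched as follows. This is a minimal illustration, not the fitted model: it assumes that residuals are observed minus predicted values, that the intercept of the mean-centered regression equals the reported mean bias (−11 g/d), that DMI is held at its mean, and that "10% increment in RDP" means 10 percentage units of CP. The function names are hypothetical.

```python
# Values taken from the text for NANMN predicted by NRC.
MEAN_BIAS_G_PER_D = -11.0          # observed minus predicted, at mean RDP and mean DMI
SLOPE_G_PER_RDP_UNIT = 18.0 / 10.0  # 18 g/d per 10 percentage units of RDP


def expected_residual(rdp_deviation):
    """Expected residual (g/d) at a given deviation of RDP from its mean.

    Assumes a linear, mean-centered model with DMI held at its mean.
    """
    return MEAN_BIAS_G_PER_D + SLOPE_G_PER_RDP_UNIT * rdp_deviation


def zero_bias_deviation():
    """RDP deviation (percentage units) at which the expected residual is zero."""
    return -MEAN_BIAS_G_PER_D / SLOPE_G_PER_RDP_UNIT


if __name__ == "__main__":
    for dev in (-10.0, 0.0, 10.0):
        print(f"RDP deviation {dev:+.0f}: residual {expected_residual(dev):+.1f} g/d")
    print(f"Zero expected bias at {zero_bias_deviation():+.1f} units from mean RDP")
```

Under these assumptions, the over-prediction shrinks as RDP rises toward roughly 6 percentage units above its mean, beyond which the expected residual becomes positive.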
Not surprisingly, Rum_dcNDF of NASEM was not correlated (P ≥ 0.29; data not shown) with the residual errors of N outflows in the univariate analyses and was not included in the multivariate analysis. This likely reflected the low CCC and high RSR ratio of that parameter in NASEM (19.2% and 2.74, respectively; Table 5).

Additional Analyses - Simpler Models and Outflow Units
Simpler models. Intakes of DM, OM, or N were major drivers of N outflows, especially for MiN. Overall, inclusion of OM intake and N intake in the prediction equations yielded adjusted RMSPE similar to those obtained using the complex equations developed by the 3 feed programs (Supplemental Table S8). Our results are in line with previous results showing that omasal MiN outflows were best related to OM intake (Broderick et al., 2010). However, despite the modest improvements in adjusted RMSPE with the more complex equations (7% and 12% for NAN and NANMN outflows on average), only the latter will allow progress from the prediction of N outflows to that of individual AA outflows from RUP.
Outflow units. Because of the impact of intakes on the predictions of N outflows, the ability of feed programs to predict N outflows was evaluated leaving out the effect of DMI, i.e., expressing the outflows as g/kg DMI. The change of unit from g/d to g/kg DMI decreased the CCC, to a larger extent when outflows were not adjusted for the effect of study. On average, CCC of NAN, MiN and NANMN decreased from 82, 62 and 68% for outflows in g/d to 20, 5 and 35% for outflows expressed in g/kg DMI with no adjustments for the effect of study; in the same order, when outflows were adjusted for the effect of study, CCC decreased from 97, 89 and 92% to 70, 33 and 81%, respectively (Supplemental Tables S9 and S10, respectively; Figure S3). Expressing outflows relative to DMI had the greatest impact on MiN outflows for values either unadjusted or adjusted for the effect of study. This strongly suggests that the weakness of feed programs to predict MiN outflows was partially masked when data were expressed on a g/d basis because DMI was driving the response.
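For readers who want to reproduce this kind of comparison, the two fit statistics used in this paper (CCC and RSR) can be sketched as below. This is a simplified illustration: it ignores the study-effect adjustment and the weighting used in the meta-analysis, it uses the usual definition of Lin's CCC with population (1/n) moments, and it assumes the sample standard deviation of observed values as the RSR denominator. The toy values are arbitrary and are not data from this study.

```python
import math
import statistics


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(observed)
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def rsr(observed, predicted):
    """RMSPE divided by the SD of observed values (sample SD assumed here)."""
    n = len(observed)
    rmspe = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmspe / statistics.stdev(observed)


# Arbitrary toy values: predictions are perfectly correlated with
# observations but shifted by a constant of one unit.
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [2.0, 3.0, 4.0, 5.0]

if __name__ == "__main__":
    print(f"CCC = {ccc(toy_observed, toy_predicted):.3f}")
    print(f"RSR = {rsr(toy_observed, toy_predicted):.3f}")
```

In the toy example the correlation is perfect, yet the constant shift pulls the CCC below 1, which illustrates why the CCC reflects both precision and accuracy. To compare units, the same functions would be applied once to outflows in g/d and once to outflows divided by DMI.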
The change of unit from g/d to g/kg DMI, however, had a smaller impact on the magnitude of the mean biases, but had an impact on the slopes of observed adjusted vs. predicted MiN outflows, which were all smaller than unity (P < 0.001) and, on average, 58% less when expressed in g/kg DMI vs. g/d with the 3 feed programs (Supplemental Table S10 and Figure S3). In line with Daniel et al. (2020) and in contrast with NASEM and CNCPS, our results clearly showed the limitation of NRC to predict MiN outflows above 16 g/kg DMI (Figure S3), which supports the fact that the increased efficiency of microbial synthesis with incremental DMI is ignored in NRC (2001). Again, this difference between NRC and the other 2 feed programs was not present for MiN outflows expressed on a g/d basis because DMI was driving the response (Tables 3 and 4; Figures 1 and 2).
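A slope different from unity can be translated into a linear bias, defined in the table footnotes as the absolute difference between the biases computed at the first (Q1) and third (Q3) quantiles of predicted values, and then compared with the 5% threshold on the mean of observed values used in this paper. The sketch below assumes a simple linear regression of observed on predicted values (no study effect, no weighting); the function names and the numbers in the example are hypothetical and are not results from this study.

```python
def bias_at(prediction, intercept, slope):
    """Bias (fitted observed minus predicted) at a given predicted value."""
    return intercept + slope * prediction - prediction


def linear_bias(intercept, slope, q1, q3):
    """Absolute difference between the biases at Q1 and Q3 of predictions."""
    return abs(bias_at(q3, intercept, slope) - bias_at(q1, intercept, slope))


def is_biologically_relevant(bias, mean_observed, threshold=0.05):
    """True if the bias exceeds the threshold fraction of the observed mean."""
    return abs(bias) > threshold * abs(mean_observed)


if __name__ == "__main__":
    # Hypothetical regression: observed = 10 + 0.8 * predicted (g/d),
    # with Q1 = 300 and Q3 = 400 g/d of predicted values.
    lb = linear_bias(intercept=10.0, slope=0.8, q1=300.0, q3=400.0)
    print(f"Linear bias = {lb:.1f} g/d")
    print("Relevant:", is_biologically_relevant(lb, mean_observed=350.0))
```

Because the intercept cancels out, the linear bias reduces to |slope − 1| × (Q3 − Q1), so a slope well below unity matters only when the interquartile range of predictions is wide enough relative to the observed mean.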
For comparison purposes with Tables 7 and 8, the effects of sampling site and microbial marker were tested on residual errors expressed in g/kg DMI (Supplemental Tables S11 and S12) and no major difference was observed in relation to the unit of N outflows. The multivariate regression analyses were also conducted on residual errors of N outflows expressed in g/kg of DMI (Supplemental Table S13) and no major difference was observed with results reported in Table 9.

CONCLUSION
Using the analysis of the raw observed values vs. predictions, i.e., the absolute value of N outflow predicted based on ration composition and ingestion, CNCPS offered the best predictions of NAN; this superiority was maintained for MiN predictions but not for NANMN. Using the analysis of observed values adjusted for the effect of study vs. predictions, i.e., prediction of N outflow response to a dietary change, the 3 feed programs predicted NAN within acceptable limits of biological relevance, except NRC, which had a slight mean bias. However, the 3 feed programs performed better on NAN predictions than on its 2 components, MiN and NANMN. The NRC and NASEM underpredicted MiN, in absolute terms, whereas CNCPS overpredicted the response to increased predicted MiN. The 3 feed programs overpredicted the response to increased predicted NANMN, whereas NRC overpredicted NANMN in absolute terms. Moving to more complex feed programs to predict N outflows remains a challenge because of the limited ability to correctly predict parameters needed by such complex feed programs, such as the rumen degradability coefficients of CP, NDF and starch.
In addition, a clear effect of categorical moderators emerged from the analysis. Higher mean biases of MiN were observed in omasal studies vs. duodenal studies and, furthermore, within omasal studies, the mean bias was higher when 15N labeling was used as the microbial marker compared with purines. It is not clear if these higher observed omasal MiN outflows are biologically relevant or an artifact related to 15N labeling. This technical issue is crucial to resolve.
Inclusion or re-evaluation of dietary concentrations of CP and EE is advocated in the prediction equation of MiN of each feed program. Further research is needed to improve the inclusion of rumen degradability coefficients of CP, NDF and starch into the prediction equation of MiN. Although progress is still to be made to improve the equations predicting N outflows, the current feed programs provide good accuracy to predict MP supply, especially in terms of response to a change in the diet.
Martineau et al.: Prediction of post-rumen nitrogen outflows
values; and ER was calculated by difference: MSPE − (ECT + ED).
evaluated the quality of predictions of N outflows in a data set of 164 individual cow observations from 6 duodenal studies: RPE for NRC (1989) and CNCPS version 3 (Russell et al., 1992) were, respectively, 20.0 and 26.0% of observed NAN values (601 g/d), thus similar to RPE in the current study. St-Pierre (2003) re-evaluated NAN data from NRC (2001; 275 treatment means) and found
for post-rumen outflows (g/d) of NAN, microbial N (MiN), and nonammonia nonmicrobial N (NANMN) predicted by National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS).
3 A, B, C: mean bias or slope within row and N outflow differ at P ≤ 0.001; and a, b: mean bias or slope within row and N outflow differ at P ≤ 0.05 (Tukey test; refer to Supplemental File II). 4 The coefficient and SE of the slope are from the simple linear regression of observed versus predicted values; and P-value of the slope being different from 1.0 is reported. 5 The linear bias is the absolute difference between biases computed at first (Q1) and third (Q3) quantiles of predicted values (adapted from St-Pierre, 2003), and is used to assess the biological relevance of the slope (refer to the text). 6 Outliers were identified based on an absolute studentized residual value >3 with inner StudentResid function in R (refer to the text).

3 A, B, C: mean bias or slope within row and N outflow differ at P ≤ 0.001; and a, b: mean bias or slope within row and N outflow differ at P ≤ 0.05 (Tukey test; refer to Supplemental File III). 4 The coefficient and SE of the slope are from the simple linear regression of observed versus predicted values; and P-value of the slope being different from 1.0 is reported. 5 The linear bias is the absolute difference between biases computed at first (Q1) and third (Q3) quantiles of predicted values (adapted from St-Pierre, 2003), and is used to assess the biological relevance of the slope (refer to the text). 6 Outliers were identified based on an absolute studentized residual value >3 with inner StudentResid function in R (refer to the text).

Figure 2. Relationships between observed values adjusted for the effect of study and post-rumen NAN, microbial N (MiN), and nonammonia nonmicrobial (NANMN) outflows (g/d) predicted by National Research Council (2001, NRC), National Academies of Sciences, Engineering and Medicine (2021, NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). Solid bold blue lines represent the best fit regression line (refer to Table 5).
g/d) and maximum (390 g/d) predictions (i.e., 1 and 53 g/d, respectively; σe = 77 g/d; data not shown). The discrepancy between the NRC slope in our study and that in St-Pierre (2003; 1.15 vs. 0.80 g/g, respectively) is likely related to the inclusion of omasal studies in our data set (more discussion below). Also comparing observed MiN outflows from 581 treatment means (mixed sampling sites) with NRC predictions, Hanigan et al. (
included studies published from 1997 onwards. The effects of sampling site (duodenal versus omasal) on residual errors (observed minus predicted) were determined in regression models which included the interaction between predicted N outflows (g/d) centered around their mean, and the sampling sites (duodenal = duodenum and abomasum; omasal = omasum and reticulum). Regression models were adjusted for random study effects, and observations were weighted by the square root of the number of experimental units per treatment. 2 MiN = microbial N; NANMN = nonammonia nonmicrobial N. National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). 3 n = number of treatment means. 4 Average of the observed (Obs) outflows; and average, first (Q1), and third (Q3) quantiles of predicted (Pred) outflows. 5 P-values are given for the difference in mean bias and slope between sampling sites. Coefficients are given with their SE in brackets with a symbol for their significance level: $ P ≤ 0.10; * P ≤ 0.05; ** P ≤ 0.01; *** P ≤ 0.001. σe = square root of the estimated amount of (residual) heterogeneity. AICc = Akaike's information corrected criterion. 6 Outliers were identified based on an absolute studentized residual value >3 with inner StudentResid function (refer to the text).

Figure 3. Relationships between observed values adjusted for the effect of study and predictions of rumen degradability of protein (RDP; % of CP), NDF (Rum_dcNDF; % of NDF), and starch (Rum_dcSt; % of starch) by National Research Council (2001, NRC), National Academies of Sciences, Engineering and Medicine (2021, NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). Rum_dcNDF and Rum_dcSt are not available for NRC. Solid bold blue lines represent the best fit regression line (refer to Table 6).

Figure 4. Effects of sampling site on residual values adjusted for the effect of study vs. predictions of NAN, microbial N (MiN) and nonammonia nonmicrobial (NANMN) outflows (g/d) by National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS) in studies published from 1997 onwards. Sampling site was categorized as duodenal (black) and omasal (red). Solid bold lines represent the best fit regression line (refer to Table 7). The vertical dotted line corresponds to the average of predictions.
included studies published from 1997 onwards. The effects of microbial markers (15N = labeled 15N; purines = purine bases and nucleic acids) were evaluated separately in duodenal and omasal studies (duodenal = duodenum and abomasum; omasal = omasum and reticulum). The effects of microbial markers on residual errors (observed minus predicted) were determined in regression models which included the interaction between predicted N outflows (g/d) centered around their mean, and the microbial marker. Regression models were adjusted for random study effects, and observations were weighted by the square root of the number of experimental units per treatment. 2 National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). 3 n = number of treatment means. 4 Average of the observed (Obs) MiN outflows; average, first (Q1) and third (Q3) quantiles of predicted (Pred) MiN outflows. 5 P-values are given for the difference in mean bias and slope between microbial markers. Coefficients are given with their SE in brackets with a symbol for their significance level: $ P ≤ 0.10; * P ≤ 0.05; ** P ≤ 0.01; *** P ≤ 0.001. σe = square root of the estimated amount of (residual) heterogeneity. AICc = Akaike's information corrected criterion. 6 Outliers were identified based on an absolute studentized residual value >3 with inner StudentResid function (refer to the text).

Table 9. Multivariate regression analyses of residual errors for NAN, microbial N (MiN) and nonammonia nonmicrobial (NANMN) outflows (g/d) predicted by NRC, NASEM and CNCPS conducted separately for each outflow predicted by National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). All regression models were adjusted for random study effects, variables were centered on their mean, and observations were weighted by the square root of the number of experimental units per treatment. Variables with P ≤ 0.10 in univariate analysis were included in the full model and a backward elimination procedure (P ≤ 0.10) was used to reduce the model. 2 n = number of treatment means; RDP = rumen degradability of protein; Rum_dcNDF and Rum_dcSt = rumen degradability of NDF and starch, respectively; σe = square root of the estimated amount of (residual) heterogeneity; RMSPE = root mean squared prediction error adjusted for the study effect; AICc = Akaike's information corrected criterion. 3 Starch, Rum_dcNDF, and Rum_dcSt are not available for NRC. 4 Slopes listed are P ≤ 0.10 and those in bold character are P < 0.001 (SE of slope in brackets). 5 Potential outliers were identified based on an absolute studentized residual value >3 with inner StudentResid function (refer to text).

Figure 5. Effect of microbial markers on residual values adjusted for the effect of study vs. predictions of microbial N (MiN) outflows (g/d) by National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS) in studies published from 1997 onwards, and using duodenal (top panels) or omasal (bottom panels) sampling. Microbial markers were categorized as labeled 15N (15N; red) and purines (black). Solid bold lines represent the best fit regression line (refer to Table 8). The vertical dotted line corresponds to the average of predictions.

Figure 6. Relationship between residual errors of NANMN (g/d) and predictions of rumen degradability of protein (RDP; % of CP) by National Research Council (2001; NRC). The solid bold black line represents the best fit regression line and the vertical dotted line corresponds to the average of RDP (refer to Table 9).

Table 1. Summary statistics of post-rumen N outflows and cow characteristics for the studies included in the data set 1
(DePeters and Cant, 1992) to be CP for publications earlier than 1990 and true protein thereafter; milk true protein = CP × 0.951 (DePeters and Cant, 1992).

Table 2. Summary statistics of diet characteristics for the studies included in the data set 1

Table 3. Summary statistics of diet characteristics predicted by 3 dairy feed programs 1
1 National Research Council (2001; NRC), National Academies of Sciences, Engineering and Medicine (2021; NASEM), and Cornell Net Carbohydrate and Protein System v. 6.5.5 (CNCPS). 2 n = number of treatment means. 3 Starch, Rum_dc = apparent rumen degradability coefficient, and Rum_dcNDF and Rum_dcSt are not available for NRC.

Table 4. Summary of feed program performance for predictions of post-rumen N outflows (data not adjusted for the effect of study) 1

Table 5. Summary of feed program performance for predictions of post-rumen N outflows (data adjusted for the effect of study) 1

Table 6. Summary of feed program performance for predictions of rumen degradabilities (data adjusted for the effect of study) 1

Table 7. Effect of sampling site on predictions of post-rumen N outflows 1

Table 8. Effect of microbial markers on predictions of post-rumen microbial N (MiN) outflows per sampling site 1