Predictive formulas for different measures of cheese yield using milk composition from individual goat samples

The objective of this study was to develop formulas based on milk composition of individual goat samples for predicting cheese yield (%CY) traits (fresh curd, milk solids, and water retained in the curd). The specific aims were to assess and quantify (1) the contribution of major milk components (fat, protein, and casein) and udder health indicators (lactose, somatic cell count, pH, and bacterial count) on %CY traits (fresh curd, milk solids, and water retained in the curd); (2) the cheese-making method; and (3) goat breed effects on prediction accuracy of the %CY formulas. The %CY traits were analyzed in duplicate from 600 goats, using an individual laboratory cheese-making procedure (9-MilCA method; 9 mL of milk per observation) for a total of 1,200 observations. Goats were reared in 36 herds and belonged to 6 breeds (Saanen, Murciano-Granadina, Camosciata delle Alpi, Maltese, Sarda, and Sarda Primitiva). Fresh %CY (%CY CURD), total solids (%CY SOLIDS), and water retained (%CY WATER) in the curd were used as response variables. Single and multiple linear regression models were tested via different combinations of standard milk components (fat, protein, casein) and indirect udder health indicators (UHI; lactose, somatic cell count, pH, and bacterial count). The 2 %CY observations within animal were averaged, and a cross-validation (CrV) scheme was adopted, in which 80% of observations were randomly assigned to the calibration (CAL) set and 20% to the validation (VAL) set. The procedure was repeated 10 times to account for sampling variability. Further, the model presenting the best prediction accuracy in CrV (i.e., comprehensive formula) was used in a secondary analysis to assess the accuracy of the %CY predictive formulas as part of the laboratory cheese-making procedure (within-animal validation, WAV), in which the first %CY observation within animal was assigned to CAL


INTRODUCTION
Approximately 39% of the world dairy goat population is located in high-income countries (FAOSTAT, 2018), mainly North America and Europe, where modern dairy systems have been developed while maintaining some traditional approaches (e.g., Clark and Mora García, 2017) and are characterized by rearing both local and cosmopolitan dairy breeds (Miller and Lu, 2019). A major part of goat milk is used to produce cheese; thus, cheese yield (%CY) is a key component for increasing farm profitability. Laboratory cheese-making procedures have been developed in recent years, mimicking cheese manufacture at the individual animal level in controlled and standardized conditions (Jacob et al., 2010; Cipolat-Gotet et al., 2016), offering the opportunity to observe animal variability and the recovery of nutrients in the curd. The information acquired with those procedures allows estimation of the heritability of measured %CY, which was found to be around 0.19 to 0.27 for bovine (Bittante et al., 2013; Dadousis et al., 2018) and about 0.15 to 0.30 for ovine species (Sánchez-Mayor et al., 2019; Pelayo et al., 2021).
Despite the aforementioned advantages, the collection and processing of milk samples at the individual level are still time-consuming and labor-intensive. Therefore, predictions of %CY need to be investigated, although actual measures from laboratory cheese-making remain fundamental for their study. At present, the use of predictive formulas for %CY traits based on major milk components is still limited to bulk milk at the dairy industry level. Use of other indirect methods, such as infrared spectroscopy, for measuring %CY traits in goat milk is still under investigation. Hence, that information cannot be used for breeding purposes, nor can it be applied at the population level, as the prediction accuracies are expected to be unsatisfactory, mainly because the calibration data set developed at the dairy industry level has lower variability than the external set (population level). Indeed, as evidenced by recent studies, the coagulation (Pazzola et al., 2018; Vacca et al., 2018) and cheese-making abilities of goat milk (Stocco et al., 2019b; Vacca et al., 2020) are characterized by large variability, due to factors related to the animal and the breed of goat, mainly related to variations of milk composition.
Providing %CY prediction formulas at the goat population level could be useful for implementing milk payment systems and for new selection indices focused on cheese-making ability. Those formulas would be extremely advantageous in indigenous breeds (Vacca et al., 2018; Paschino et al., 2020) and could benefit local economies. For those purposes, such models need to be derived from individual milk samples.
Our objectives were to (1) develop predictive formulas for %CY traits (fresh curd, milk solids, and water retained in the curd) based on milk major components (fat, protein, and casein) and udder health indicators (lactose, pH, and somatic cell and bacterial counts); (2) assess the cheese-making method; and (3) quantify the effect of goat breed on the prediction accuracy of the %CY formulas.

Analysis of Milk Composition
Immediately after collection, individual milk samples were stored at 4°C and then analyzed within 24 h. All samples were analyzed for fat, protein, casein, lactose, total solids, and pH with a MilkoScan FT6000 infrared analyzer (Foss Electric A/S) calibrated in accordance with the related reference methods [ISO 9622/IDF 141 (ISO-IDF, 2013) for fat, protein, casein, lactose, and pH; ISO 6731/IDF 21 (ISO-IDF, 2010) for TS]. Somatic cell count was determined by a Fossomatic 5000 somatic cell counter (Foss Electric A/S) and transformed into the logarithmic SCS [log₂(SCC × 10⁻⁵) + 3] (Ali and Shook, 1980). Total bacterial count was measured by a BactoScan FC150 analyzer (Foss Electric A/S) and transformed into the logarithmic bacterial count [LBC = log₁₀(total bacterial count/1,000)].
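The two logarithmic transformations above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the input units are an assumption: SCC is taken here in cells/mL and total bacterial count in the units delivered by the instrument, neither of which is stated explicitly in this section.

```python
import math


def somatic_cell_score(scc: float) -> float:
    """SCS = log2(SCC x 10^-5) + 3 (Ali and Shook, 1980).

    Assumption: scc is expressed in cells/mL.
    """
    if scc <= 0:
        raise ValueError("SCC must be positive")
    # Dividing by 100,000 is the same as multiplying by 10^-5.
    return math.log2(scc / 100_000) + 3


def log_bacterial_count(tbc: float) -> float:
    """LBC = log10(total bacterial count / 1,000)."""
    if tbc <= 0:
        raise ValueError("Total bacterial count must be positive")
    return math.log10(tbc / 1_000)
```

Under these assumptions, each doubling of SCC raises SCS by 1 unit, and each 10-fold increase in total bacterial count raises LBC by 1 unit.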

Individual Cheese-Making Procedure
The 9-mL laboratory cheese-making method (9-MilCA) was adopted to measure individual %CY traits for each milk sample, as described in Cipolat-Gotet et al. (2016), processing 2 replicates per animal (9 mL per replicate), for a total of 600 goats and 1,200 observations, respectively. In brief, each milk replicate was transferred into a glass tube (9 mL), inserted into the modified sample rack of the lactodynamograph instrument, heated to 35°C for 15 min, and mixed with 0.2 mL of a rennet solution [Hansen Standard 215, with 80 ± 5% chymosin and 20 ± 5% pepsin; 215 international milk clotting units per milliliter; Pacovis Amrein AG; diluted to 1.2% (wt/vol) in distilled water]. The sample rack was then transferred from the heater to the lactodynamograph (30-min test). Coagulation occurred at 35°C. At the end of this phase, coagulated milk samples were manually cut using a stainless-steel spatula, and the rack was moved to the heater for the 30-min curd-cooking phase (55°C). At 15 min after the beginning of the cooking phase, each sample was subjected to a second manual cutting. Then, each glass tube was removed from the sample rack, and the curd was separated from the whey. The curd was slightly pressed to aid expulsion of whey and was suspended above the whey at room temperature (15 min). The obtained curds and whey were weighed using a precision scale. Then, the whey of the 2 replicates of each milk sample was pooled and analyzed for chemical composition using an infrared spectrophotometer (MilkoScan FT2, Foss Electric). The measured %CY traits [the ratios between the weight of the milk processed and the weight of the curd (%CY CURD), the curd TS (%CY SOLIDS), and the water retained in the curd (%CY WATER)] were calculated as follows:
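The equations themselves are not legible in this version of the text. A plausible reconstruction from the verbal definition above is sketched here; the symbols W for weights (in grams) are hypothetical, and the exact notation of Cipolat-Gotet et al. (2016) may differ.

```latex
\begin{align}
\%CY_{CURD}   &= \frac{W_{\text{curd}}}{W_{\text{milk}}} \times 100 \\
\%CY_{SOLIDS} &= \frac{W_{\text{curd TS}}}{W_{\text{milk}}} \times 100 \\
\%CY_{WATER}  &= \frac{W_{\text{curd water}}}{W_{\text{milk}}} \times 100
\end{align}
```

Under this reading, %CY CURD is the sum of %CY SOLIDS and %CY WATER, which agrees with the mean values reported in the results (15.5% = 7.6% + 7.9%).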

Statistical Analysis
Editing. Before statistical analysis, values of all traits (milk composition and %CY measures) falling outside the interval of the mean ± 3 standard deviations (SD) were excluded as outliers.
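The editing rule can be sketched as follows in Python with NumPy. This is an illustration only: the function name is hypothetical, and two details are assumptions not stated in the text, namely that the sample SD (n − 1 denominator) is used and that only the out-of-range value, not the whole record, is set to missing.

```python
import numpy as np


def mask_outliers(values, n_sd: float = 3.0) -> np.ndarray:
    """Set values outside mean +/- n_sd * SD to NaN (per trait)."""
    x = np.asarray(values, dtype=float)
    mean = np.nanmean(x)
    sd = np.nanstd(x, ddof=1)  # assumption: sample SD
    keep = np.abs(x - mean) <= n_sd * sd
    # Values failing the check (and existing NaN) become NaN.
    return np.where(keep, x, np.nan)
```

The function would be applied to each trait separately, for example to the fat percentage column and then to each %CY measure.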
Regression Models. A series of linear regression models was applied for %CY CURD, %CY SOLIDS, and %CY WATER, separately. Milk fat, protein, and casein were used as predictors either one at a time or in combinations. The best predictive model derived included fat and casein and was further extended by including predictors related to udder health (udder health indicators, UHI: lactose, SCS, pH, and LBC), selected on the basis of their technological roles and effects on cheese production (Fox et al., 2017; Pazzola et al., 2019; Stocco et al., 2019a). Multicollinearity of all predictors was also checked by evaluation of tolerance, variance inflation factor, eigenvalues, and condition index (Supplemental Table S1, https://figshare.com/articles/dataset/Supplemental_Table_S1/19694800) before using them in the combination models. The results obtained from those tests indicated the absence of multicollinearity among predictors. Therefore, the 2 groups of predictive models were tested as follows: (1) basic composition, that is, fat, protein, or casein, tested individually and in combination (see Table 2); (2) basic composition combined with UHI, that is, fat + casein + combinations of lactose level, SCS, pH, and LBC (see Table 3).
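As an illustration of one of the multicollinearity diagnostics mentioned above, the variance inflation factor (VIF) of each predictor can be computed by regressing it on the remaining predictors. The sketch below uses NumPy only; the function name is hypothetical and this is not the software routine used in the study. Tolerance is the reciprocal of the VIF.

```python
import numpy as np


def variance_inflation_factors(X) -> np.ndarray:
    """VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing
    predictor j on all other predictors (with an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others, whereas large values signal that its coefficient would be poorly determined.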
For all the %CY measures, we tested regression models with or without intercept. Fitting statistics between the 2 models were comparable (data not shown). Thus, results from models with the intercept are not reported, as our goal was to quantify the actual contribution of each of the predictors to %CY.
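A regression without intercept expresses the predicted %CY as a sum of predictor values weighted by their β, so each β can be read as the contribution per unit of that predictor. The following is a minimal least-squares sketch with NumPy; the function names are hypothetical and the software actually used by the authors is not stated here.

```python
import numpy as np


def fit_no_intercept(X, y) -> np.ndarray:
    """Ordinary least squares without intercept: y ~ X @ beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def predict(X, beta) -> np.ndarray:
    """Predicted %CY: sum of predictor values times their beta."""
    return np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
```

For a fat + casein formula, X would hold one column for fat percentage and one for casein percentage, and no column of ones is added.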
Validation Procedures. The accuracy of the %CY predictive formulas was assessed by 3 procedures: (a) a random cross-validation (CrV) scheme with 10 replicates, adopted to address the first objective (i.e., to quantify the effects of the major milk components and of the UHI on %CY traits), in which data were split into a training set (80% of the total records), used to build the model, and a testing set (20% of the total records), used for validation; (b) a within-animal validation procedure, used to assess the effect of the laboratory cheese-making method on the accuracy of the %CY predictive formulas (second objective), in which the first measurement within animal formed the training set used to build the predictive models and the second measurement within animal formed the testing set; and (c) a stratified CrV (SCrV), used for the third objective (i.e., to quantify the effect of goat breed on the prediction accuracy of the %CY predictive formulas), in which each breed was evaluated separately (testing set) using the records from the other 5 breeds as the training set. To further investigate the within-breed relationships between each milk component and the %CY measures, regression models testing the predictors individually were also fitted (Supplemental Figure S1, https://figshare.com/articles/figure/Supplemental_Figure_S1/19694806).
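The random CrV and the stratified (leave-one-breed-out) CrV described above can be sketched as index generators in Python with NumPy. The function names, the random seed, and the rounding of the training-set size are assumptions made for illustration.

```python
import numpy as np


def random_cv_splits(n_records, n_replicates=10, train_fraction=0.8, seed=0):
    """Yield (train_idx, test_idx) for repeated random 80/20 splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_fraction * n_records))
    for _ in range(n_replicates):
        order = rng.permutation(n_records)
        yield order[:n_train], order[n_train:]


def leave_one_breed_out_splits(breeds):
    """Yield (breed, train_idx, test_idx): each breed in turn is the
    testing set and all other breeds form the training set."""
    breeds = np.asarray(breeds)
    for breed in np.unique(breeds):
        test_idx = np.flatnonzero(breeds == breed)
        train_idx = np.flatnonzero(breeds != breed)
        yield breed, train_idx, test_idx
```

The within-animal validation needs no random split: the first replicate of every animal forms the training set and the second replicate forms the testing set.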
Assessment of Prediction Accuracy. Model assessment was based on the coefficient of determination of validation (R² VAL), the root mean square error of validation (RMSE VAL), and the ratio performance deviation, calculated as the ratio between SD and RMSE VAL. In the case of CrV, R² VAL, RMSE VAL, and ratio performance deviation values were averaged over the 10 replicates.
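These three statistics can be computed from observed and predicted values of the testing set as in the sketch below. The function name is hypothetical, and two choices are assumptions: R² VAL is taken as the squared correlation between observed and predicted values, and SD is the sample SD of the observed values in the testing set.

```python
import numpy as np


def validation_metrics(observed, predicted) -> dict:
    """Return R2, RMSE, SD, and ratio performance deviation (SD / RMSE)."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
    r2 = float(np.corrcoef(y, y_hat)[0, 1] ** 2)  # assumption: squared r
    sd = float(np.std(y, ddof=1))                 # assumption: sample SD
    return {"r2_val": r2, "rmse_val": rmse, "sd_val": sd, "rpd_val": sd / rmse}
```

In the random CrV, the function would be called once per replicate and the 10 resulting values of each statistic averaged.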

Milk Composition of Individual Goat Milk Samples
The descriptive statistics of milk composition and %CY traits of individual goat milk samples are given in Table 1. The average contents of fat, protein, and lactose were 4.48%, 3.57%, and 4.68%, respectively, with fat showing the highest coefficient of variation (29%).
It is well known that variability of milk composition is a major factor affecting cheese-making efficiency (Vacca et al., 2019). In this study, the use of individual samples revealed high variability of milk composition, which in turn affected the variability of %CY traits. Our results showed that, following the 9-MilCA method, the average %CY CURD was 15.5%, with approximately equal contributions of %CY SOLIDS and %CY WATER (mean values of 7.6% and 7.9%, respectively).

Prediction of Cheese Yield Traits Based on Milk Fat, Protein, or Casein Percentage
Milk Fat Percentage. When milk fat percentage was used as the sole predictor in a simple linear regression model for %CY traits (Table 2), regression coefficients (β) varied from 3.31 (%CY CURD) to 1.67 (for both %CY SOLIDS and %CY WATER). The R² VAL of those models was high for %CY SOLIDS (0.89), intermediate for %CY CURD (0.60), and low for %CY WATER (0.17). As is known, the addition of rennet triggers the coagulation process and causes the casein micelles to aggregate, entrapping the majority of the fat globules in the network. Given that fat accounts for the major part of cheese solids in full-fat cheeses and that lipids are hydrophobic (Fox et al., 2017), it is not surprising that the validation accuracies of the fat-based model predicting %CY SOLIDS and %CY WATER were at opposite extremes. In a similar analysis on the use of individual predictive formulas in dairy cattle, the β of fat were higher for all 3 %CY traits, but the R² VAL were lower than in our study (R² VAL = 0.29, 0.57, and 0.06 for %CY CURD, %CY SOLIDS, and %CY WATER, respectively; Mariani et al., 2020). This could be attributed to the differences in the physicochemical structure and composition of fats between goat and cow milk. For example, goat milk contains smaller fat globules than bovine milk, which disperse better and form a more homogeneous mixture in milk, and hence provide a greater fat surface area for lipases to act on (Park and Haenlein, 2006). Small fat globules behave as pseudo-protein particles, with a greater ability to become part of the gel network (Fox et al., 2017). Moreover, in a pathway-based genome-wide association analysis of milk coagulation and cheese-making properties in dairy cattle, the phosphatidylinositol signaling pathways were shown to be closely associated with the technological properties of milk (Dadousis et al., 2017). Phosphatidylinositol represents a small fraction of the phospholipid components of milk, and phospholipids are mainly present on the surface of milk fat globules. The biological explanation of the connection between the phosphatidylinositol pathway and coagulation properties can be found in the close association between fat globule size and phospholipid content, with higher amounts of phospholipids in small globules than in large ones, likely affecting the technological properties of milk (Dadousis et al., 2017). These characteristics of milk fat globules might explain the high predictive ability of fat for %CY CURD, and especially %CY SOLIDS. Our results are in agreement with previous studies, which have clearly shown the overall positive and linear effect of goat milk fat on %CY CURD and %CY SOLIDS, and on the recovery of nutrients (fat and TS) in the curd (Pazzola et al., 2019).
Milk Protein and Casein Percentages. Milk protein percentage, as predictor of %CY traits (Table 2), provided consistently higher β in all cases compared with fat (4.32, 2.14, and 2.20) but a lower R² VAL (0.41, 0.57, and 0.13) for %CY CURD, %CY SOLIDS, and %CY WATER, respectively. Those results are probably related to the higher variability of fat with respect to the other milk compounds (Table 1). Compared with protein, casein concentration as the sole predictor of the 3 %CY traits showed higher β, which almost doubled for %CY SOLIDS (4.09 for casein vs. 2.19 for protein percentage). When casein was tested as an individual predictor, the R² VAL was higher for %CY CURD and similar for %CY WATER compared with protein (Table 2), with a more pronounced difference found for %CY SOLIDS (0.54 vs. 0.89, for casein and protein percentage, respectively). Although the quality criteria of goat milk used in most milk payment systems are still based on total protein concentration, caseins should be considered as well, because they are essential for the cheese-making process.
It is true that milk proteins play an active role during coagulation, but their functionality varies based on their size and the actual proportions of casein and whey protein fractions (Brule et al., 2000). For example, in goat milk, the lower casein concentration, different ratios among casein fractions, and larger casein micelle size can explain the weak curd firmness compared with milk of other ruminants (Park et al., 2007). Also, the number of hydrophobic sites on the protein surface is one of the most important factors affecting the functional properties of protein and caseins during coagulation of milk (Fox and McSweeney, 1998; Hiller and Lorenzen, 2008; Yildirim and Erdem, 2015). In goat milk, the hydrophobic sites on the protein surface are found in lower numbers than in cow milk, combined with lower protein surface binding affinity (Yildirim and Erdem, 2015). This could partly explain why the contribution of protein and casein (in terms of β) to %CY traits found here was high. However, R² VAL for %CY WATER was more than double in bovine (R² VAL = 0.31 and 0.33 for protein and casein, respectively; Mariani et al., 2020) compared with the caprine values found in this study. This could be attributed to the higher water-holding capacity of bovine proteins compared with caprine proteins (Yildirim and Erdem, 2015).

Prediction of Cheese Yield Traits Based on Fat and Protein, or Fat and Casein Percentage
In general, predictive formulas for %CY traits built upon the combination of milk components (fat and protein or fat and casein) were, on average, more accurate than the single-nutrient formulas (Table 2). Indeed, the R² VAL for the %CY traits were always higher and the RMSE VAL lower when fat was fitted together with either protein or casein. Overall, these results were expected, as Pazzola et al. (2019) reported that in caprine milk these 3 components together represent the major factors affecting the cheese-making process and the main contributors to %CY.
The β of fat in the %CY CURD formulas were slightly higher (1.01 with protein, 1.07 with casein) than those for %CY SOLIDS (0.93 and 0.92 with protein or casein, respectively), and the smallest regression coefficients were obtained for %CY WATER (0.27 and 0.34 with protein or casein, respectively). This indicates that, although fat on its own has little water-holding capacity, its presence in the paracasein network affects the degree of contraction of the matrix and hence moisture content and %CY. The occluded fat globules physically limit the contraction and hence the aggregation of the surrounding paracasein network; therefore, they also reduce the extent of syneresis (Fox et al., 2017). Although no significant effect of fat on syneresis has been reported in goats, the expulsion of whey is reduced in milk samples with high fat content (Stocco et al., 2018). The β of protein was always higher than that of fat (Table 2). This is not surprising, considering that the majority of other solids retained in the curd, especially hydrophilic solids (lactose, soluble salts, and others), are proportional to the quantity of whey retained, which in turn is much more proportional to protein (i.e., whey proteins) than to fat (Emmons et al., 1990). Moreover, the β of casein for %CY traits, when combined with fat, were consistently higher than those of protein combined with fat (Table 2), reflecting the direct role of casein during coagulation, as it forms the continuous paracasein network, acting like a sponge, which occludes the fat and moisture (Fox et al., 2017).

Prediction of Cheese Yield Traits Based on Fat and Casein and Udder Health Indicators
The inclusion of the UHI traits in the statistical model slightly increased the prediction accuracy of the %CY formulas (Table 3), especially if compared with the fat + protein or fat + casein formulas (Table 2). However, the β obtained for the other milk components are useful for increasing our knowledge about the relationships between these traits and the efficiency of the cheese-making process in goats. It is widely recognized that lactose, SCS, milk pH, and LBC are associated in different ways with the udder health status of dairy goats (Leitner et al., 2004; Pirisi et al., 2007; Bagnicka et al., 2011). Somatic and bacterial counts are of further importance, as they are fundamental parameters for establishing the hygienic quality of raw milk and are currently used in different milk payment systems (Pirisi et al., 2007).
Lactose. When lactose was used as predictor for %CY traits together with fat and casein, it provided β values of 1.43, 0.09, and 1.19 for %CY CURD, %CY SOLIDS, and %CY WATER, respectively (Table 3). These values are higher than those in bovine milk for %CY CURD and %CY WATER, and very similar for %CY SOLIDS (Mariani et al., 2020). It is known that about 98% of the lactose in milk is lost in the whey during cheese-making (Fox et al., 2017), and the remaining part is bound to the water in fresh curd. This explains why lactose played a minimal part in the formulas used to predict %CY SOLIDS but largely contributed to %CY WATER, even after the inclusion of all the other UHI (Table 3). These characteristics influenced the precision of the predictive formulas for %CY CURD. Although a direct effect of lactose on cheese-making is not evident, the fermentation of the small part remaining in the fresh curd has a significant effect on cheese quality (Fox et al., 2017).
SCS. When SCS was used as predictor of %CY traits, together with fat and casein, it contributed 0.11, 0.01, and 0.09 for %CY CURD, %CY SOLIDS, and %CY WATER, respectively (Table 3), providing R² VAL values lower than (in the case of %CY CURD and %CY WATER) or equal to (%CY SOLIDS) those observed when lactose was included as predictor. In a previous study, it was reported that high SCS was associated with a high amount of moisture retained in the curd, resulting in a nonlinear increase of %CY CURD, but with a lower recovery of milk protein in the curd (Stocco et al., 2019b). Hence, we further tested the linear and quadratic regressions for the effect of SCS on %CY traits, but no differences were observed in the fitting statistics with respect to models including SCS as linear predictor (data not shown). The low β found here confirmed that high values of somatic cells in goat milk should not necessarily be associated with mastitic milk (Contreras et al., 2007), and that the contribution of SCS to %CY traits is negligible. When combined with other UHI (i.e., + lactose, or + lactose + pH, or + lactose + pH + LBC), the β of SCS decreased to zero, and the fitting statistics marginally improved.
pH. When milk pH was included in the predictive formulas with fat and casein, this resulted in β <1 for all %CY traits and close to zero for %CY SOLIDS (Table 3). As for lactose, the model with pH slightly outperformed the models with SCS. The contribution of pH to water retention was higher than that of fat and casein. Indeed, pH has a strong influence on whey expulsion; in particular, a change in pH leads to different conformations of goat milk proteins and to a different distribution of hydrophobic groups inside and outside the molecule, resulting in changes in the surface hydrophobicity of the protein during heating (Lam and Nickerson, 2015). When pH was included with the other UHI, the regression coefficient of lactose decreased almost 3-fold compared with the model with fat, casein, and lactose (1.43 vs. 0.52) in predicting %CY CURD; the sign changed (0.09 vs. −0.08) in the case of %CY SOLIDS; and the regression coefficient almost halved (1.19 vs. 0.71) in the prediction of %CY WATER (Table 3). Although the β of pH decreased in the case of %CY CURD, moving from the model with fat, casein, and pH to the comprehensive model (0.97 vs. 0.66), it almost doubled for %CY SOLIDS (0.07 vs. 0.13) and more than halved (0.80 vs. 0.34) for %CY WATER. These changes in the β values of each milk component among different groups of predictive formulas describe the effective role they have during coagulation and cheese-making. When considered alone, each component was not fully able to describe its real contribution, as the variability of the β values within milk component and across predictive formulas was very high, especially for %CY CURD and %CY WATER. This could be because they carried the indirect effects of the other nonincluded components, even though the prediction accuracies were already high using only fat and casein, as well as in combinations with lactose level, pH, and SCS, especially in the case of %CY SOLIDS.
LBC. When LBC was used as predictor for %CY traits, together with fat and casein, it had β of 0.21, 0.00, and 0.10 for %CY CURD, %CY SOLIDS, and %CY WATER, respectively (Table 3), whereas R² VAL were comparable with those of the model with SCS. However, the technological meaning of LBC became clearer when combined in a model with all the other components. For instance, it showed negative β for all 3 %CY measures, which agrees with previous studies on the effect of LBC on goat milk coagulation properties and cheese-making traits (Stocco et al., 2019b). Moreover, in the formula considering all the components tested in the present study, lactose also displayed a negative coefficient, whereas the effect of SCS was negligible (0.00).

Accuracy of the Cheese-Making Method
Table 4 reports the β and the validation performance measures (R² VAL, RMSE VAL, ratio performance deviation, and SD) from the within-animal validation procedure performed on the predictive formulas for the 3 %CY traits based on fat, casein, and UHI of individual goat milk samples. The β values differed slightly from those of the same combination formula in CrV, and only for fat, casein, and lactose in predicting %CY CURD and %CY WATER (Table 4). Moreover, compared with the CrV procedure, the R² VAL values for %CY CURD (0.76) and %CY WATER (0.27) increased, although that for %CY WATER remained low. The %CY SOLIDS held the highest prediction accuracy (R² VAL = 0.96). This procedure allowed us to obtain an indirect estimation of the repeatability of the 9-MilCA method, as it uses the first %CY measure as calibration set and the second as validation set. Estimates of the repeatability of the %CY measures are limited in the literature, as the laboratory procedures at the individual animal level usually do not provide analyses of the cheese-making in duplicate, due to the quantity of milk needed and the workload required. Nevertheless, the efficiency of the 9-MilCA method has been previously demonstrated, with this method being a powerful research tool for a rapid and inexpensive analysis of a large number of milk samples in duplicate, yielding a complete picture of the cheese-making process (Cipolat-Gotet et al., 2016). In goats, previous studies have reported the repeatability of the %CY measures expressed as the ratio of the sum of the variances of the random effects included in the model to the total variance (Paschino et al., 2020). Those authors reported repeatability values of 93.3, 99.9, and 89.7%, respectively, for %CY CURD, %CY SOLIDS, and %CY WATER. Similarly, in bovines, Cipolat-Gotet et al. (2016) reported repeatability values of 83.8, 99.5, and 67.3%, respectively, for %CY CURD, %CY SOLIDS, and %CY WATER.

Prediction of Cheese Yield Traits Across Goat Breeds
Based on the results of the first procedure, a stratified CrV was applied using the best model identified in CrV. Table 5 summarizes the β and validation performance parameters from the models of the SCrV procedure. As regards the β provided by the milk components, some differences were noticed across breeds, in particular for lactose, pH, and LBC in predicting all 3 %CY measures. Larger differences were evident in R² VAL, with the ranking of the breeds depending on the %CY trait considered. For example, Murciano-Granadina (MG) had the highest R² VAL (0.68), followed by Sarda (Sr; 0.54), Sarda Primitiva (SP; 0.48), Camosciata delle Alpi (CA; 0.39), Saanen (Sa; 0.30), and Maltese (Ma; 0.19) in predicting %CY CURD (Table 5). However, the RMSE VAL did not follow the pattern of R² VAL. Fewer differences across breeds were found for the β values for %CY SOLIDS, with R² VAL varying from 0.87 (CA) to 0.96 (Sr). The largest differences were observed for %CY WATER, where Sa and Sr had R² VAL close to zero (0.02 and 0.06, respectively), and CA and SP had 0.10, followed by Ma (0.19) and MG (0.36). A previous study investigating the effect of 4 breeds of goat on the prediction accuracy of Fourier-transform infrared spectroscopy on milk coagulation traits clearly showed the importance of adopting an SCrV procedure, whose results were strongly influenced by breed, and the generally low prediction accuracies restricted practical application (Stocco et al., 2021). In our study, the promising results achieved with high prediction accuracies (≥0.87) for %CY SOLIDS in all breeds were not confirmed for %CY CURD and %CY WATER. Therefore, the existing differences among breeds have to be further investigated. For example, as depicted in Supplemental Figure S1 (https://figshare.com/articles/figure/Supplemental_Figure_S1/19694806), which reports the regression plots of each component considered individually for prediction of the %CY traits in each breed, the relationship of each component with the %CY measures differed by breed. Values of β are reported subsequently only for large differences among breeds.
Those differences were mostly related to the β values of the UHI predictors, probably because of their lower importance in predicting %CY traits with respect to the other milk components. Indeed, when lactose was used to predict %CY WATER, Ma and SP showed extreme values (β = 1.70 and −0.71, respectively, for Ma and SP). Moving to %CY SOLIDS, Sa and MG had almost opposite β values for lactose (1.01 and −1.84, respectively). These breeds again showed extreme values for SCS predicting %CY WATER (0.20 and −0.09, respectively, for MG and Sa), whereas for %CY SOLIDS CA showed the lowest β value (0.32 and −0.02, respectively, for MG and CA). Regarding milk pH, Sr and MG displayed opposite β values for %CY WATER (5.33 and 0.88, respectively), but in the case of %CY SOLIDS, Sa showed the highest and positive (2.69) and MG the lowest and negative (−3.52) β values.
Compared with the UHI predictors, casein and fat β values were more consistent among breeds for all the %CY traits. The only negative association, between casein and %CY WATER in Sr goats (β = −0.21; Supplemental Figure S1), suggests the greater ability of the casein network in this breed to contract during coagulation and to expel whey from the curd, thus reducing the overall moisture content. This also confirms the fundamental role of caseins in the final outcome of cheese-making, with single casein fractions differently linked to the water and TS components of the curd (Cipolat-Gotet et al., 2018).

CONCLUSIONS
In this study we directly quantified the effects of major milk components on %CY traits, in terms of fresh cheese, milk solids, and water retained in the curd, and of the most important indicators of udder health in milk. The large number and variability of individual samples, and direct measurements of %CY traits, allowed us to collect information on accuracies of prediction for application at the dairy goat population level. Knowledge about the relationships between UHI and efficiency of the cheese-making process could be used together with data provided by the standard composition. Overall, the results gave a much more detailed understanding of the mechanisms that determine cheese yield in goats. The different accuracies of the %CY predictive formulas within the 9-MilCA method lead us to speculate that fresh curd yield, and especially moisture retention in the curd, is under multifactorial control, which must be considered to increase the reliability of the measure of this trait. Findings arising from the differences among breeds confirmed that the SCrV approach is more appropriate than CrV, in particular when different breeds are sampled, and for creating within-breed formulas.
Stocco et al.: CHEESE YIELD PREDICTIVE FORMULAS IN GOATS

Table 5. Regression coefficients (β) and validation performance traits¹ from the models of the stratified cross-validation procedure for cheese yield traits² based on fat, casein, and udder health indicators³ of individual goat milk samples across breeds

Table 1. Descriptive statistics of milk composition and cheese yield traits of individual goat milk samples

Table 2. Regression coefficients (β; related SE in parentheses) and validation performance parameters¹ from the models of the cross-validation procedure for cheese yield traits² in fresh cheese based on single nutrients (fat, protein, or casein) and on their combinations
¹R² VAL = coefficient of determination in validation; RMSE VAL = root mean square error of validation; RPD VAL = ratio performance deviation; SD VAL = SD of the validation set.

Table 3. Regression coefficients (β; related SE in parentheses) and validation performance parameters¹ from the models of the cross-validation procedure for cheese yield traits² based on fat, casein, and combinations of udder health indicators³ of individual goat milk samples

Table 4. Regression coefficients β for %CY CURD, %CY SOLIDS, and %CY WATER based on fat, casein, and UHI of individual goat milk samples.
¹Validation performance traits: R² VAL = coefficient of determination in validation; RMSE VAL = root mean square error of validation; RPD VAL = ratio performance deviation; SD VAL = SD of the validation set.
²Cheese yield traits: the ratios between the weight of the milk processed and the weight of the curd (%CY CURD), the curd TS (%CY SOLIDS), and the water retained in the curd (%CY WATER).
³Udder health indicators include lactose level, SCS, pH, and logarithmic bacterial count (LBC).