Early detection of subclinical mastitis in lactating dairy cows using cow-level features

Subclinical mastitis in cows affects their health, well-being, longevity, and performance, leading to reduced productivity and profit. Early prediction of subclinical mastitis can enable dairy farmers to perform interventions to mitigate its effect. The present study investigated how well predictive models built using machine learning techniques can detect subclinical mastitis up to 7 d before its occurrence. The data set used consisted of 1,346,207 milk-day (i.e., a day when milk was collected on both morning and evening) records spanning 9 yr from 2,389 cows producing on 7 Irish research farms. Individual cow composite milk yield and maximum milk flow were available twice daily, whereas milk composition (i.e., fat, lactose, protein) and somatic cell count (SCC) were collected once per week. Other features describing parity, calving dates, predicted transmitting ability for SCC, body weight, and history of subclinical mastitis were also available. The results of the study showed that a gradient boosting machine model trained to predict the onset of subclinical mastitis 7 d before a subclinical case occurs achieved a sensitivity and specificity of 69.45 and 95.64%, respectively. Reduced data collection frequency, where milk composition and SCC were recorded only every 15, 30, 45, and 60 d, was simulated by masking data to reflect the frequency of recording of these data on commercial dairy farms in Ireland. The sensitivity and specificity scores reduced as recording frequency reduced, with respective scores of 66.93 and 80.43% when milk composition and SCC were recorded just every 60 d. Results demonstrate that models built on data that could be routinely recorded on commercial dairy farms can achieve useful predictive ability for subclinical mastitis, even with reduced frequency of milk composition and SCC recording.


INTRODUCTION
Subclinical mastitis is one of the most common infections on dairy farms globally, with approximately 20 to 30% of cows in any herd likely to become infected annually (Heringstad et al., 2000). Subclinical mastitis has many unfavorable repercussions such as reduced milk yield and quality (Halasa et al., 2007), greater veterinary costs, and increased risk of early culling (Cavero et al., 2007). Clinical mastitis infections create swelling or clotting of milk in the udder making it generally visible to producers; subclinical mastitis, however, often has no obvious visible symptoms and can only be detected through dedicated examination, for example by measuring SCC or using tests such as the California Mastitis Test (Viguier et al., 2009).
Pressure is mounting globally to reduce reliance on antimicrobials. Although antimicrobial use in dairy production is not excessive (Petrovski et al., 2006), most of its use is for the treatment of udder-related ailments (Krogh et al., 2020). Therefore, strategies to reduce reliance on antimicrobials in dairy production must consider mammary system health. One way to reduce reliance on antimicrobials in mammary system health programs is to predict the likelihood of a cow succumbing to subclinical mastitis sufficiently early to enable preventative treatment to be employed to halt progression to the clinical state.
As well as affecting animal welfare, subclinical mastitis can also erode farm profit. The estimated opportunity cost of lost milk yield due to an incidence of subclinical mastitis is, on average, €100 per cow (Yalcin et al., 1999; Petrovski et al., 2006). Moreover, Viguier et al. (2009) estimated that if subclinical mastitis progresses to clinical mastitis, then the cost ranges from €160 to €700 per clinical mastitis case. Therefore, early detection of subclinical mastitis could aid dairy producers in taking early action by isolating or treating cows before a subclinical case progresses to the clinical state or even before it spreads within the herd.
Several studies have used cross-sectional data analyses to identify the risk factors associated with the occurrence of subclinical mastitis (Mammadova and Keskin, 2013). The risk tends to be greatest in older parity cows (Berry et al., 2007;Rahularaj et al., 2019) and early in lactation (Busato et al., 2000;Fox, 2009). The strength of the associations between risk factors and subclinical mastitis can also differ by parity, as reflected by 2-way interactions between parity and risk factors for clinical mastitis previously reported in dairy cows.
Explanatory analyses (Ebrahimie et al., 2018b) are useful to retrospectively quantify the associations between discovered risk factors and subclinical mastitis. These, however, do not necessarily translate into prediction models capable of accurately detecting subclinical mastitis sufficiently early to enable remedial action. Developments in the application of machine learning techniques to predictive modeling support deeper analytical capabilities as well as potential improvements in the predictive ability for different outcomes. Some promising advances have recently been documented in predicting subclinical mastitis using machine learning algorithms (Ebrahimi et al., 2019; Bobbo et al., 2021), as more data have become available to leverage the benefits of these methods.
Using a large cross-sectional data set from several dairy herds, the objective of the present study was to use machine learning techniques to build a prediction model capable of accurately detecting the onset of subclinical mastitis up to 7 d in advance. Daily milk yield and maximum milk flow rate are available in most commercial dairy farms along with other variables that could be routinely recorded, such as milk composition data (i.e., fat, protein, lactose) and SCC. Other animal-level phenotypic data either already exist on some farms, or the technology exists to routinely record them (e.g., BW, BCS). These cow-level features were used to build predictive models that can, on any given day, predict if a cow is likely to develop subclinical mastitis in the following 7-d window. If used on a daily basis, these frequent predictions can help dairy producers identify cows at risk of developing subclinical mastitis, thereby enabling remedial measures.

MATERIALS AND METHODS
Because the data already existed within a pre-existing database, approval by an Institutional Animal Care and Use Committee or Institutional Review Board was not required. The data used in this study were sourced from 7 research farms in Ireland, limited to cows born after January 1, 2010. The data spanned 9 yr from 2012 to 2021. The main data sources used in the study were milk yield data (i.e., volume and maximum milk flow rate), milk composition data and SCC. All measures were a composite of all cow quarters. Cows were milked twice daily, with milk yield and maximum milk flow rate per milking available for each a.m. and p.m. milking separately. Milk composition data were available once per week, which included milk fat, protein, lactose and urea concentration on consecutive p.m. and a.m. milking. The SCC was available once per week but for the a.m. milking only. Predicted transmitting ability values (PTA; i.e., measure of genetic merit reflecting half the EBV) for SCC were available from the national genetic evaluation in 2014. The national genetic evaluation for SCC is a univariate repeatability mixed model of log e SCC using data up to parity 15. Fixed effects accounted for in the repeatability model include contemporary group of herd-year-season of calving, parity, and both heterosis and recombination; breed differences are accounted for through the use of genetic groups in the numerator relationship matrix. Cow BW was recorded weekly for all cows. The BCS records were available every 2 to 3 wk, with BCS assessed on a 1 (thin) to 5 (fat) scale in increments of 0.25 units (Edmonson et al., 1989).
The data set was divided into nonoverlapping development and test partitions on a per-lactation basis. The model development partition consisted of 2,131 cows that calved between January 2012 and April 2019. The test data partition consisted of 932 cows that calved between January and June 2020. Some cows appeared in both partitions (28.21% of cows overlap), but no cow lactations were present in both partitions. In the development partition, there were 2,030 first parity records, 1,493 second parity records, and 2,164 third or greater parity records. The milking dates in the development partition spanned from January 2012 to September 2019 and included 1,118,809 milk-day records. The milking dates in the test partition ranged from January 2020 to January 2021, and included 227,398 milk-day records. The total numbers of cows milked in each year from 2012 to 2019 in the development partition were 197, 562, 719, 876, 988, 1,129, 1,173, and 30, respectively.

Prediction Target Description
A primiparous cow was considered to have a positive subclinical mastitis case on a given day if she had an SCC of ≥150,000 cells/mL, whereas a multiparous cow was considered to have a positive subclinical mastitis case on a given day if she had an SCC of ≥250,000 cells/mL (Brightling et al., 2000). Based on this definition, the development set consisted of 1,120,470 milk-day records labeled as healthy, and 195,562 milk-day records labeled as instances of subclinical mastitis. The test partition contained 189,145 milk-day records labeled as healthy and 38,253 milk-day records labeled as instances of subclinical mastitis.
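The parity-specific case definition can be written as a small function. The following is a minimal sketch, not the authors' code; the function and constant names are hypothetical, and only the two thresholds come from the text.

```python
# Thresholds (cells/mL) taken from the case definition in the text.
PRIMIPAROUS_THRESHOLD = 150_000
MULTIPAROUS_THRESHOLD = 250_000


def is_subclinical_case(scc_cells_per_ml: float, parity: int) -> bool:
    """Return True if an SCC reading meets the parity-specific case definition."""
    if parity < 1:
        raise ValueError("parity must be 1 or greater for a lactating cow")
    # First-parity cows use the lower threshold, all later parities the higher one.
    threshold = PRIMIPAROUS_THRESHOLD if parity == 1 else MULTIPAROUS_THRESHOLD
    return scc_cells_per_ml >= threshold
```

For instance, a reading of 200,000 cells/mL would be labeled as a case for a first-parity cow but as healthy for a third-parity cow.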

Prediction Task
Models were trained to predict on a given day if a cow will develop subclinical mastitis within a time window of the next 7 d, based on the milking data recorded on that day, and the cow's history up to that day. For example, if a cow in the data set had a recorded SCC value above the subclinical mastitis threshold on d 76 of lactation, then a positive prediction target would be recorded for d 69 to 75 of lactation, as the cow developed subclinical mastitis within the subsequent 7 d. This also means that there were 7 opportunities to predict the onset of subclinical mastitis for a specific cow, once on each of the 7 d leading up to the positive event.
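The construction of the forward-looking prediction target can be illustrated with a short sketch. It assumes that days in milk are integers and that the target on the day of the case itself depends only on later cases; the text does not state how the case day was labeled, and the function name is hypothetical.

```python
def build_prediction_targets(case_days, last_day, horizon=7):
    """Map each day in milk (1..last_day) to 1 if a case occurs in the next `horizon` days."""
    case_days = set(case_days)
    targets = {}
    for day in range(1, last_day + 1):
        # The window covers the days strictly after `day`, up to `horizon` days ahead.
        window = range(day + 1, day + horizon + 1)
        targets[day] = int(any(d in case_days for d in window))
    return targets


# Worked example from the text: a case recorded on d 76 of lactation.
targets = build_prediction_targets(case_days=[76], last_day=80)
positive_days = [day for day, label in targets.items() if label == 1]
```

Here `positive_days` contains d 69 to 75, which are the 7 opportunities to predict the case described above.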

Identifying the 10% of Cows Most Likely to Have Subclinical Mastitis
The gradient boosting machine models trained in this study output a prediction score (in the range 0 to 1), which can be interpreted as the likelihood of a cow developing subclinical mastitis in the next 7 d. These scores are dichotomized to convert them to a binary label [using a threshold determined using Youden's j-statistic (Youden, 1960)]. Another way to use those scores is to identify the group of cows in a herd that are most likely to develop subclinical mastitis by ranking cows according to the prediction scores output from the model. The cows with the highest prediction scores are most likely to develop subclinical mastitis.
To illustrate how this approach would be used in practice, the 10% of cows with the highest prediction scores in one of the herds in the test partition (containing 850 cows) were identified on each day covered by the test partition. Within this group, the average number of true identifications of subclinical mastitis (i.e., true positives) and false alarms (i.e., false positives) present per day was calculated.
If the prediction model is working in a reliable and repeatable way, once a cow has been included in the set of cows predicted by the model as most likely to develop subclinical mastitis on a given day, then that cow should also be included in this set for subsequent days based on predictions made by the model on those days (i.e., the model generates a prediction for every cow on every day). To assess this, the ratio of cows included in the set of cows receiving the highest prediction scores from the model (e.g., top 10%) on a given day that were also included in this group on the following day was calculated. A ratio near 1.0 indicates that the models are behaving in a reliable and repeatable way.
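The ranking step and the day-to-day repeatability ratio can be sketched as follows. This is an illustration under stated assumptions: scores are held in a dictionary keyed by a hypothetical cow identifier, ties are broken by identifier, and the group size is rounded to the nearest whole cow; none of these details are given in the text.

```python
def top_fraction(scores, fraction=0.10):
    """Return the set of cow IDs with the highest prediction scores."""
    k = max(1, round(len(scores) * fraction))
    # Sort by decreasing score; ties are broken by cow ID (an assumption).
    ranked = sorted(scores, key=lambda cow: (-scores[cow], cow))
    return set(ranked[:k])


def persistence_ratio(scores_today, scores_tomorrow, fraction=0.10):
    """Share of today's highest-risk group that is also in tomorrow's group."""
    today = top_fraction(scores_today, fraction)
    tomorrow = top_fraction(scores_tomorrow, fraction)
    if not today:
        return float("nan")
    return len(today & tomorrow) / len(today)


day_1 = {"a": 0.90, "b": 0.80, "c": 0.10, "d": 0.20, "e": 0.30}
day_2 = {"a": 0.95, "b": 0.10, "c": 0.70, "d": 0.20, "e": 0.30}
ratio = persistence_ratio(day_1, day_2, fraction=0.4)
```

In this toy example the top group on day 1 is cows a and b, and on day 2 it is cows a and c, so half of the day 1 group persists.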

Data Editing
Only days for which both an a.m. and p.m. milk yield were available were retained in the data set. As other features were not recorded daily, the last recorded (most recent) values for milk composition and SCC were used for each milk-day. The number of previous subclinical events in each lactation, as well as the total number of subclinical events for a cow were calculated. Only data from spring-calving cows were retained as these are the predominant herd types in Ireland (Berry et al., 2013). Data from cows calved less than 10 d were removed from the data set. Before data editing there were a total of 1,298,160 rows in the development partition and 233,942 rows in the test partition. After the data editing there were 1,118,809 rows in the development partition and 227,398 rows in the test partition.
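Carrying the most recent recording forward to each milk-day can be sketched as below. The representation of a cow's record as a list with `None` on days without a recording is an assumption made for illustration.

```python
def carry_forward(daily_values):
    """Replace None with the most recent earlier recording (None until the first one)."""
    filled = []
    last_recorded = None
    for value in daily_values:
        if value is not None:
            last_recorded = value
        filled.append(last_recorded)
    return filled


# Weekly-style SCC recordings (in 1,000 cells/mL) with gaps on the days in between.
scc = [None, 120, None, None, 95, None]
filled_scc = carry_forward(scc)
```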
Summary statistics (i.e., maximum, minimum, mean, median, SD, skew, and difference) for milk yield, maximum milk flow rate, milk composition data (lactose, protein, fat, urea), SCC, BCS, and BW based on a rolling time window were computed. For each of these attributes, for a specific milk-day, summary statistics based on windows of the previous 15 and 30 d were computed and added as derived attributes to the data set. For the attributes for which separate a.m. and p.m. values were available (all except SCC, BCS, and BW), separate a.m. and p.m. derived attributes were generated. Where insufficient recordings were available to compute the summary statistics (e.g., attempting to generate SD using one value), they were not calculated and treated as missing values in the modeling process (which in this study were replaced with zeros). Summary statistics for the main features in the development partition are shown in Table 1. In total the model uses 231 variables: 15 main features (milk yield, maximum milk flow rate, milk composition data, SCC, BCS, and BW) from each of which 14 derived attributes are generated, plus 6 other main features (parity, days since calving, PTA, subclinical cases per parity, subclinical cases per cow, month in milk).
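The feature count follows from the description: 7 summary statistics over 2 window lengths gives 14 derived attributes per main feature, so 15 + 15 × 14 + 6 = 231 variables. The sketch below illustrates how such rolling attributes might be derived. Several details are assumptions because the text does not specify them: the window excludes the current day, "difference" is taken as last minus first value in the window, skew is the population moment coefficient, and statistics that cannot be computed are returned as `None` (the study replaced missing values with zeros at modeling time).

```python
import statistics

STAT_NAMES = ("max", "min", "mean", "median", "sd", "skew", "difference")


def window_summary(values):
    """Return the 7 summary statistics for one window (None where not computable)."""
    stats = dict.fromkeys(STAT_NAMES)  # every statistic starts as missing
    if not values:
        return stats
    mean = statistics.fmean(values)
    stats.update(max=max(values), min=min(values), mean=mean,
                 median=statistics.median(values),
                 difference=values[-1] - values[0])  # assumed definition
    if len(values) >= 2:
        stats["sd"] = statistics.stdev(values)  # sample SD needs 2+ values
        m2 = sum((v - mean) ** 2 for v in values) / len(values)
        m3 = sum((v - mean) ** 3 for v in values) / len(values)
        if m2 > 0:
            stats["skew"] = m3 / m2 ** 1.5  # assumed skew definition
    return stats


def rolling_features(series, day, windows=(15, 30)):
    """series maps day -> value; uses recordings from the `w` days before `day`."""
    features = {}
    for w in windows:
        values = [series[d] for d in range(day - w, day) if d in series]
        for name, stat in window_summary(values).items():
            features[f"{name}_{w}d"] = stat
    return features


features = rolling_features({1: 10.0, 2: 12.0, 3: 14.0}, day=4)
```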

Simulating Infrequent Recordings of Milk Composition Data
One of the objectives of the present study was to quantify how the performance of a predictive model changes as the frequency of milk composition recording reduced to simulate the data collection frequencies on commercial farms. The frequency of milk composition (including SCC) recording was artificially reduced to simulate recording every 15, 30, 45, and 60 d by masking recordings. For example, to simulate milk composition recording only every 15 d, the data point at every 15th day was copied forward for the next 15 d, thus overwriting any intermediate recordings. Following the overwriting of the milk composition and SCC data, the summary statistic features described previously were also updated appropriately.
The data set containing milk composition data at the original recording frequency (every 7 d) is denoted D 7 , and the data sets containing simulated reduced frequency milk composition recording are denoted D 15 , D 30 , D 45 , and D 60 , with the numbers in the subscripts indicating the simulated recording frequency. The prediction target (true value for subclinical mastitis) values for the simulated recording frequency data sets were based on the actual subclinical cases (as in D 7 ).
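The masking procedure can be sketched as follows. The sketch assumes a daily series in which every day already holds a value (for example after carrying the last recording forward) and that sampling starts on the first day of the series; the starting day used in the study is not stated, and the function name is hypothetical.

```python
def simulate_recording_interval(daily_values, interval):
    """Keep the value at every `interval`-th day and copy it forward over the days between."""
    if interval < 1:
        raise ValueError("interval must be a positive number of days")
    masked = []
    kept = None
    for index, value in enumerate(daily_values):
        if index % interval == 0:
            kept = value  # a simulated recording day
        masked.append(kept)  # intermediate recordings are overwritten
    return masked


# Ten days of values, masked to simulate a recording every 4 d.
masked = simulate_recording_interval(list(range(10)), interval=4)
```

Only the input features are masked in this way; as described above, the prediction targets remain those derived from the original recordings.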

Modeling
Because of their effectiveness for prediction problems based on tabular data sets containing semantically rich features (Chang et al., 2018;Punmiya and Choe, 2019), gradient boosting machines (GBM; Friedman, 2001) were used to build all predictive models in this study. A GBM is an additive ensemble model that iteratively creates decision tree predictors that correct the predictions made by the decision trees already in the ensemble.
The data set used in this study displayed a high degree of target class imbalance; only 17% of the instances in the model development partition represented a positive occurrence of subclinical mastitis. Machine learning models can struggle to successfully learn from data sets with this degree of imbalance (Fernández et al., 2018). Two methods were used to mitigate the effects of class imbalance in the present study. Class reweighting (Zhou and Liu, 2010) mitigates class imbalance within the cost function of a GBM by allocating a higher weight to losses arising from instances of the minority class (in this study cases with high SCC values indicating subclinical mastitis) than instances of the majority class. The cost for the minority class was fixed to the ratio of the number of instances of the majority class to the number of instances of the minority class in the data used to train the model.
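The reweighting rule can be illustrated with a small worked sketch. The weight is the ratio of majority to minority instances, as described above; the weighted log loss shown is a generic formulation used here for illustration and is not claimed to be the exact cost function used in the study. In the xgboost package, the `scale_pos_weight` parameter plays this role, although whether the authors set it in this way is an assumption.

```python
import math


def minority_class_weight(labels):
    """Ratio of majority (0) to minority (1) instances in the training labels."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    if positives == 0:
        raise ValueError("no positive instances to reweight")
    return negatives / positives


def weighted_log_loss(labels, scores, positive_weight):
    """Weighted mean log loss; scores must lie strictly between 0 and 1."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, scores):
        weight = positive_weight if y == 1 else 1.0
        total += -weight * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += weight
    return total / weight_sum


labels = [0, 0, 0, 0, 1]
weight = minority_class_weight(labels)
loss = weighted_log_loss(labels, [0.5] * 5, weight)
```

With this weight, the single positive instance contributes as much to the loss as the four negative instances combined.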
For binary classification problems, it is typical to use a default threshold value of 0.5 from the model output to determine which class is predicted. In data sets with significant class imbalance, however, this can lead to poor results (Kennedy et al., 2013). Prediction threshold tuning (He and Ma, 2013) adapts the threshold that is applied to the prediction scores [in the range (0,1)] to distinguish between the 2 classes in a binary classification problem. In the present study, the prediction threshold tuning using Youden's j-statistic (Youden, 1960) was used to calculate a tuned threshold. The tuned threshold was calculated using a validation data partition after a model had been trained. This threshold was saved as a part of the model, and its value was used when making predictions. Other commonly used sampling methods to balance class imbalance, for example the synthetic over and under sampling methods SMOTE (Chawla et al., 2002) and ADASYN (He et al., 2008), were explored during preliminary experiments but were not found to be effective (data and results are not shown). Finally, a variable importance analysis was performed.
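Youden's J statistic is J = sensitivity + specificity − 1, and the tuned threshold is the one that maximizes J on the validation partition. The sketch below is a simple exhaustive search over the observed scores; it assumes a score at or above the threshold is classified as positive, which the text does not specify.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are needed to compute Youden's J")
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):  # each observed score is a candidate threshold
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


val_labels = [0, 0, 0, 1, 0, 1, 1]
val_scores = [0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]
threshold, j_statistic = youden_threshold(val_labels, val_scores)
```

In this toy example the tuned threshold falls well below the default of 0.5, which is the typical outcome when the positive class is rare.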
All of the data editing tasks and modeling were performed in the Python language. The XGBoost algorithm implementation from the xgboost package (version 1.7.1) was used (Chen and Guestrin, 2016).

Hyperparameter Tuning
For any machine learning algorithm, choosing optimal values for the hyperparameters is very important (Pakrashi and Mac Namee, 2021). The GBM has several hyperparameters that should be tuned to achieve good predictive performance: the fraction of the features in the data set made available to each tree in the GBM (feature_fraction), the fraction of the total instances available for training used to train each tree in the GBM (instance_fraction), the learning rate (lambda), the regularization parameter (gamma), the maximum depth of each tree trained in the GBM (max_depth), and the number of trees included in the GBM (num_trees). Bayesian hyperparameter tuning (Snoek et al., 2012; Nogueira, 2014) was used to find the best set of hyperparameters for feature_fraction, instance_fraction, max_depth, lambda, and gamma (num_trees was set using early stopping, which is discussed below).
To facilitate hyperparameter tuning, the development data partition was further subdivided into training and validation partitions. The training partition contained cows that calved between January 2012 and December 2016 while the validation partition contained cows that calved between January 2017 and April 2019. During hyperparameter tuning, for each hyperparameter combination, the GBM was trained using the training partition and evaluated using the validation partition. Fifty hyperparameter combinations were explored using Bayesian hyperparameter tuning and the combination that gave the best performance, based on the area under the receiver operating characteristic curve (AUC; Kelleher et al., 2020) measured on the validation partition, was selected. All hyperparameter tuning was performed using D 7 , and the hyperparameter combinations found were used when building models based on all other simulated frequency data sets (D 7 , D 15 , D 30 , D 45 , and D 60 ). For the model trained using the best values of feature_fraction, instance_fraction, max_depth, lambda, and gamma (found as described above on D 7 ), the tuned prediction threshold was determined using Youden's j-statistic for each of D 7 , D 15 , D 30 , D 45 , and D 60 independently using the validation partition.
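The partitioning by calving date can be sketched as below, using the date ranges given in the text for the training, validation, and test partitions. Treating each range as inclusive of whole calendar months, and the function name, are assumptions.

```python
from datetime import date


def assign_partition(calving_date):
    """Assign a lactation to a partition based on its calving date."""
    if date(2012, 1, 1) <= calving_date <= date(2016, 12, 31):
        return "training"
    if date(2017, 1, 1) <= calving_date <= date(2019, 4, 30):
        return "validation"
    if date(2020, 1, 1) <= calving_date <= date(2020, 6, 30):
        return "test"
    return "excluded"  # calving dates outside every stated range
```

Because assignment is made per lactation, the same cow can contribute one lactation to the development data and a later one to the test data without any lactation being shared.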
To avoid model overfitting, early stopping was used when models were trained. Early stopping is a process used to avoid overfitting whereby the training process is stopped when performance measured against a validation data set starts to decline, even if the AUC score measured using training data continues to improve. In the case of GBM, this limits the number of trees (num_trees) added to the model and is how this hyperparameter was tuned.
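The early stopping rule can be sketched independently of any particular library. The sketch assumes the validation AUC is recorded after each tree is added and introduces a hypothetical `patience` parameter (the number of rounds without improvement that is tolerated); the value used in the study is not reported.

```python
def early_stopping_round(validation_auc_by_round, patience=10):
    """Return the number of trees to keep, given the validation AUC after each round."""
    best_round, best_auc = 0, float("-inf")
    for round_number, auc in enumerate(validation_auc_by_round, start=1):
        if auc > best_auc:
            best_round, best_auc = round_number, auc
        elif round_number - best_round >= patience:
            break  # no improvement for `patience` rounds: stop adding trees
    return best_round


history = [0.70, 0.80, 0.85, 0.84, 0.83, 0.86]
num_trees = early_stopping_round(history, patience=2)
```

With a patience of 2, training stops after round 5 and 3 trees are kept, so the later improvement at round 6 is never reached; a larger patience trades extra training time for a lower risk of stopping too soon.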
The best hyperparameter values found for the GBM model were feature_fraction = 0.5, instance_fraction = 1.0, max_depth = 6, lambda = 0.005, and gamma = 1.253. These were used to train the final model on the development partition.
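For readers who wish to reproduce a similar configuration, the reported values might be expressed as an xgboost-style parameter dictionary. The mapping of the names used in the text to xgboost parameter names is an assumption and is not confirmed by the source; in particular, the text labels the learning rate "lambda", whereas in xgboost `lambda` denotes an L2 regularization term and the learning rate is `learning_rate` (alias `eta`).

```python
# Values are those reported in the text; the parameter names are an assumed mapping.
assumed_xgboost_params = {
    "colsample_bytree": 0.5,   # feature_fraction
    "subsample": 1.0,          # instance_fraction
    "max_depth": 6,            # max_depth
    "learning_rate": 0.005,    # reported as the learning rate (lambda)
    "gamma": 1.253,            # reported as the regularization parameter (gamma)
    "objective": "binary:logistic",  # assumption: not stated in the text
    "eval_metric": "auc",            # assumption: AUC was the selection metric
}
```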

Final Model Training
Using the optimal set of hyperparameters, the final model was trained using the entire development data partition. The performance of the models trained was then assessed using the test partition. This approach was used to train and evaluate models for the different data recording frequency data sets; one model was trained for each of the data recording frequency data sets (D 7 , D 15 , D 30 , D 45 , and D 60 ). The performance of different models was evaluated using AUC scores in combination with sensitivity and specificity values to understand overall model performance.
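The three evaluation measures can be computed as in the following sketch. The AUC is calculated as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one (ties counted as one half), which is a standard equivalent of the area under the ROC curve; the pairwise loop is for illustration only and would be slow on data of the size used in the study.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


test_labels = [0, 0, 1, 1]
test_scores = [0.10, 0.40, 0.35, 0.80]
test_predictions = [int(s >= 0.5) for s in test_scores]
area = auc(test_labels, test_scores)
sens, spec = sensitivity_specificity(test_labels, test_predictions)
```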

RESULTS
Model Performance for Frequent Milk Composition Measurement
When evaluated using the test partition, the model trained using the 7-d recording frequency data set (i.e., D 7 ) achieved an AUC of 0.9287, with a sensitivity and specificity of 69.45 and 95.64%, respectively. Each day the model used the most recently available data to predict if a cow will develop subclinical mastitis within a 7-d time window. Based on the D 7 data set, 80.21% of the occurrences of subclinical mastitis were predicted one day before they occurred. This reduced to 71.12% of cases predicted 7 d before the high SCC value was recorded (Table 2).
Due to the way that the prediction problem is framed, for any instance of subclinical mastitis, there are multiple opportunities to detect it, from 1 d in advance up to 7 d in advance. Another way to assess the performance of the model is by aggregating performance across these multiple opportunities. The model trained using the D 7 data set predicted 81.89% of cases of subclinical mastitis at least once in the 7-d window before occurrence. This reduced to 79.58% of cases predicted at least once up to 2 d ahead of occurrence, and, eventually, to 71.12% of cases predicted 7 d before occurrence (Table 3).
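The aggregation across prediction opportunities can be sketched as follows. Each case is represented by the set of lead times (days before the case) on which the model gave a positive prediction; this representation and the function names are assumptions made for illustration. A case counts as detected at least w days ahead if any positive prediction was made between w and 7 d before it.

```python
def detected_at_least(alert_lead_times, w, horizon=7):
    """True if any positive prediction was made between w and `horizon` days ahead."""
    return any(w <= lead <= horizon for lead in alert_lead_times)


def detection_rate_by_lead(cases, horizon=7):
    """Share of cases detected at least w days ahead, for w = 1..horizon."""
    return {
        w: sum(detected_at_least(case, w, horizon) for case in cases) / len(cases)
        for w in range(1, horizon + 1)
    }


# Four hypothetical cases: alerts on d 1 to 3 before, only d 7 before, never, only d 1 before.
cases = [{1, 2, 3}, {7}, set(), {1}]
rates = detection_rate_by_lead(cases)
```

By construction the rate cannot increase as w grows, which matches the pattern of decreasing percentages reported above.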
The most impactful variables in the prediction model were SCC recording on the milk-day, days since calving, parity, subclinical mastitis cases per parity, PTA for SCC, and the derived attributes of the minimum and maximum values of SCC in the last 15 and 30 d. A plot of variable importance for the model trained using the D 7 data set is given in Figure 1.

Assessing Predictions for the Top 10% Predictions
The ability of the model to identify a group of cows likely to develop subclinical mastitis based on the prediction score output by the model is assessed by aggregating performance measures over each month in the test period (Figures 2 and 3). For example, aggregating the daily performance of the predictive model over August 2020 showed that, of the 85 predictions made on an average day for the 10% of cows with the highest prediction scores in the herd (containing 850 cows in total), 69.16 were correctly predicted as about to develop subclinical mastitis, whereas 15.61 were incorrectly flagged. The month with the highest daily average number of false positives was June 2020, with 19.83 false positives and 58.23 correctly predicted cases. Note that the counts are not whole numbers as they are averaged over the days in a month and that the total number of cows is not always 85 as not all cows are milked every day. For the simulated data recording frequency data sets, the average numbers of true and false positive predictions are shown in Figure 3. Figure 4 shows a histogram of the percentage of cows predicted by the model to belong to the group at highest risk of developing subclinical mastitis on a particular day (the 10% of cows with the highest prediction scores) that were also part of this group on the following day. Most of the cows that were predicted to have subclinical mastitis on one day were also predicted to have subclinical mastitis the next day.

Simulated Reduced Frequency Milk Composition Measurement
The predictive performance of the models reduced as the simulated frequency of the recording of the milk composition data and SCC reduced. This was most evident in the AUC metric where the model trained using the data set D 7 achieved an AUC of 0.9287 and the models trained using data based on reduced recording frequencies (D 15 , D 30 , D 45 , and D 60 ) achieved AUC values of 0.8840, 0.8597, 0.8429, and 0.8200, respectively. Table 3 summarizes the ability of the models trained using reduced frequency data to detect the onset of subclinical mastitis at least w days (w = 1 d, 2 d, …, 7 d) before the subclinical mastitis event. For D 15 , 81.40% of the subclinical cases were correctly identified at least 1 d before the occurrence of subclinical mastitis while 74.41% were correctly detected at least 6 d before the occurrence of subclinical mastitis. For D 30 , 83.08 and 73.71% of cases were correctly detected at least 1 d and at least 6 d before the occurrence of subclinical mastitis, respectively. In the case of D 45 , 78.81 and 70.77% of cases were correctly detected at least 1 d and at least 6 d before the occurrence of subclinical mastitis, respectively. Finally, in the case of D 60 , 74.48 and 65.38% of cases were correctly detected at least 1 d and at least 6 d before the occurrence of subclinical mastitis, respectively.

DISCUSSION
The objective of the present study was to investigate the effectiveness of machine learning models that only use data routinely available on commercial dairy farms for predicting the likelihood of cows to develop subclinical mastitis. Predictions can be made for every cow on every day they are milked. This can enable dairy producers to take early action to limit the spread of subclinical mastitis by, for example, separating at-risk cows, more thoroughly cleaning milking equipment used with at-risk cows, draining at-risk cows' glands, or simply closer monitoring of at-risk cows. The study also set out to quantify the loss in predictive ability that arises when the frequency of milk composition and SCC measurement is reduced from every 7 d to just once every 60 d. The results of the study suggest that it is indeed possible to predict the onset of subclinical mastitis with a high level of accuracy using routinely available data. Although performance reduced as milk composition and SCC data measurement frequency reduced, performance remained relatively high.

Data
The models described in the present study used data that could be routinely recorded at milking on commercial dairy farms (i.e., milk yield, milk composition data, SCC, and a cow's subclinical mastitis infection history), with the possibility to use less commonly available data from the farm such as PTA, BW, and BCS, if available. This study did not use the farm identification number as a feature in the model since the model should be applicable across farms even where no prior information exists (as well as the fact that herd effects can change over time); moreover, preliminary analyses (results not shown) indicated that farm used as a feature did not affect the model performance. The analysis of the importance of different features in the model (Figure 1) showed that SCC (and features derived from it), days since calving, parity, infection history, PTA, and features derived from milk yield and maximum milk flow rate were the most informative in the model. This feature importance confirms findings in other studies. For example, milk yield (Friggens et al., 2007; Panchal et al., 2016; Ebrahimi et al., 2019) and milk composition (Ebrahimi et al., 2019; Bobbo et al., 2021) have both been documented to provide good predictive value in the past; Ring et al. (2021) demonstrated an association between PTA for SCC and phenotypic SCC in Irish dairy cows. The effect of PTA for SCC on milk production was also previously studied by Ruelle et al. (2019) and how BCS contributes to mastitis in dairy cows was discussed by Roche et al. (2009).
Some other studies that have focused on predicting subclinical mastitis have relied on data not routinely available on commercial dairy farms, which were also not available in the present study. For example, electrical conductivity has been shown to be a good predictor of subclinical mastitis (Panchal et al., 2016; Ebrahimi et al., 2019) as has the pH of milk (Panchal et al., 2016; Bobbo et al., 2021). These attributes would likely improve the predictive accuracy of the model in the present study, although, as previous studies use different sets of variables and different data sets, it is difficult to estimate the scale of this improvement. The size of the data set used in the present study (1,346,207 milk-day records from 2,389 cows over 9 yr, including both development and test partitions) is much larger than the data sets used in most other studies in the literature. The study described by Friggens et al. (2007) used a data set covering 3 yr and contained data describing 332 cows from Danish farms. Bobbo et al. (2021) used a data set spanning 2 yr with 18,442 records from Italian farms. Panchal et al. (2016) used just 100 cows from Indian farms over one year to build models to predict subclinical and clinical mastitis. There have, nonetheless, been studies of similar scale to the present study; Ebrahimi et al. (2019) used data from 2,400 cows from Australian farms, albeit covering a period of just 2 yr. The study by Hyde et al. (2020) was of a much larger scale than the present study using records from 1,000 UK dairy farms across 5 yr.

Model Performance
The present study focused on models trained to predict the onset of subclinical mastitis within a time window of 7 d, given daily recorded milk yield and maximum milk flow data, routinely recorded data (e.g., milk composition data, SCC), and other cow-level variables available from the farm (e.g., BCS, BW, PTA). Other studies have predicted the onset of subclinical mastitis ahead of time based on test-day data records but did not use daily milk yield data (Anglart et al., 2020; Bobbo et al., 2021). Several studies have attempted to predict subclinical mastitis using data from the same day or from previous days (Sitkowska et al., 2017; Ebrahimi et al., 2019; Anglart et al., 2020), although without including SCC information. In contrast, instead of predicting based only on test-day records, the models developed in the present study provide predictions on every milk-day using milk yield and maximum milk flow rate. Also, because the model developed in the present study predicts, on every milk-day, whether a cow will have subclinical mastitis in the next 7-d window, the dairy farmer can receive a daily alert prompting early action rather than waiting for the next test-day. Bobbo et al. (2021) described a study that attempted to determine the best machine learning algorithm for predicting subclinical mastitis in dairy cows and concluded that neural networks and random forest models were the most effective. Although they used a stratified cross-validation technique to handle the class imbalance in the cross-validation partitions, they did not use techniques to address the significant class imbalance while training the models. GBM models (Ebrahimi et al., 2019) and random forests (an ensemble approach very similar to GBM; Ebrahimie et al., 2018a) have also been shown to be useful algorithms for training models to predict the onset of subclinical mastitis in dairy cows. This consideration, in part, motivated the selection of GBM models for the present study.
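The daily 7-d-ahead framing described above can be illustrated with a minimal sketch of how a target label might be built for each milk-day. This is an assumption made for illustration only, not the labeling procedure of the present study: the function name `label_next_window`, the day indexing, and the rule that a milk-day is positive when a case falls on any of the following 7 d are all hypothetical.

```python
def label_next_window(case_days, n_days, horizon=7):
    """Return one 0/1 label per milk-day.

    A milk-day is labeled 1 if a subclinical case is recorded on any of
    the following `horizon` days (hypothetical rule, for illustration).
    """
    cases = set(case_days)
    labels = []
    for day in range(n_days):
        # Days strictly after `day`, up to and including day + horizon.
        upcoming = range(day + 1, day + horizon + 1)
        labels.append(1 if any(d in cases for d in upcoming) else 0)
    return labels


# Illustrative cow with 14 milk-days and a single case recorded on day 10.
labels = label_next_window(case_days=[10], n_days=14)
print(labels)
```

Under this rule, every milk-day in the 7 d preceding a case becomes a positive example, which is what allows a prediction to be issued daily rather than only on test-days.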
It is difficult to compare the performance of prediction models across studies that use very different data sets and different framings of the prediction problem (e.g., predicting the onset of subclinical mastitis over different time horizons). It is, however, informative to compare the performance achieved in the present study with that reported elsewhere for dairy cows. Bobbo et al. (2021) described models trained to predict the onset of subclinical mastitis on a test-day based on data from the previous test-day using a random forest model (test-days were typically 1 mo apart). Their reported specificity of 89.7% and sensitivity of 55.3% are comparable to those of the present study. A specificity of 39.7% and sensitivity of 93.0% for predicting subclinical mastitis was reported by Ebrahimie et al. (2018a), although these models made predictions over a much shorter time horizon than the models in the present study (1 vs. 7 d). Those 2 studies also used other attributes that may not be available on commercial dairy farms. Bobbo et al. (2021) used monthly differential SCC (Damm et al., 2017), along with the other milk compositional data, as an input in their model (in fact, it was the most informative feature), and Ebrahimie et al. (2018a) used electrical conductivity as well as milk composition data as an input in their model. Panchal et al. (2016) reported a sensitivity of 98% and a specificity of 95% for same-day prediction of subclinical mastitis in dairy cows without using SCC; however, the tuning of their train-test data splitting strategy suggests a possible degree of overfitting.
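For readers comparing the figures above, the following minimal sketch shows how sensitivity and specificity are derived from confusion-matrix counts. The counts in the example are invented for illustration and are not taken from the present study or from any of the cited studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true cases that were flagged.
    specificity = TN / (TN + FP): share of healthy records not flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Invented counts, for illustration only.
sens, spec = sensitivity_specificity(tp=70, fn=30, tn=950, fp=50)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Because subclinical cases are much rarer than healthy milk-days, a small drop in specificity can translate into many false alerts, which is one reason the two measures are reported together.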
The present study provides the first analysis of how different milk characteristic and SCC measurement frequencies influence the effectiveness of machine learning models for subclinical mastitis prediction in dairy cows. The results show that, although performance does reduce as frequency reduces, the models retain strong predictive power even at the lowest frequencies; an AUC of 0.82 was achieved even with a measurement frequency of 60 d.
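The idea of simulating a lower recording frequency by masking can be sketched as follows. This is an illustrative assumption rather than the masking procedure of the present study: the function name `mask_to_interval`, the rule of keeping a record only when at least the target interval has elapsed since the last retained record, and the example values are all hypothetical.

```python
def mask_to_interval(records, interval_days):
    """Mask routinely recorded values to simulate sparser recording.

    `records` is a list of (day, value) pairs sorted by day. A value is
    retained only if at least `interval_days` have passed since the last
    retained record; otherwise it is replaced with None (masked).
    """
    masked = []
    last_kept = None
    for day, value in records:
        if last_kept is None or day - last_kept >= interval_days:
            masked.append((day, value))
            last_kept = day
        else:
            masked.append((day, None))
    return masked


# Illustrative weekly records on days 0, 7, ..., 70 (values are made up).
weekly = [(day, 100 + day) for day in range(0, 71, 7)]
masked = mask_to_interval(weekly, interval_days=30)
kept_days = [day for day, value in masked if value is not None]
print(kept_days)
```

With weekly source records, the retained days can only fall on existing test-days, so the simulated interval is approximate (here 35 d rather than exactly 30 d).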

Practical Application
As well as affecting animal welfare, subclinical mastitis can also be extremely expensive. The opportunity cost of lost milk yield due to a subclinical mastitis case for a cow is, on average, €100 per lactation (Yalcin et al., 1999; Petrovski et al., 2006). Moreover, the review by Viguier et al. (2009) estimated that if subclinical mastitis progresses to clinical mastitis, then the cost ranges from €160 to €700 per infected cow. If mastitis infection is allowed to spread within a herd, the costs can be very large, as reported in the review by Halasa et al. (2007). Therefore, early detection of the onset of subclinical mastitis could help dairy producers to take early action by isolating or treating cows before a subclinical case progresses to the clinical state or even spreads within the herd.
The mode of deployment for the model described in the present study would be to run it daily after milking and alert the farm manager to the cows most likely to develop subclinical mastitis in the next 7 d. The choice of data used in the models described in the present study was, in part, motivated by a desire for ease of deployment, because the model features used are already readily accessible on many modern dairy farms. Milk yield and maximum milk flow rate are measured by most modern milking machines, and some frequency of measurement of milk characteristic data and SCC usually takes place. The other attributes used in the model (e.g., BW, BCS, parity, month of milking, PTA) would typically be available from farm management systems. The most likely mode in which the model would be used by a farm manager would be to highlight the 10% of cows with the highest likelihood of developing subclinical mastitis for isolation or treatment. With this approach of considering only the top 10% of the predictions, the model still performed fairly well when the simulated recording frequency of the routinely recorded data was reduced, at least up to the D45 scenario. Also, if the model predicts that a cow is likely to develop subclinical mastitis, it will keep predicting this on subsequent milk-days (assuming no action is taken in between), which demonstrates the robustness of the model.
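The top-10% alerting mode described above can be sketched as a simple ranking step applied to the daily model output. The function name `top_fraction_alerts`, the cow identifiers, and the risk scores are hypothetical and serve only to illustrate the idea; how ties and very small herds should be handled is a design choice not specified in the present study.

```python
import math


def top_fraction_alerts(risk_by_cow, fraction=0.10):
    """Return the cows with the highest predicted risk.

    `risk_by_cow` maps a cow identifier to the predicted probability of
    developing subclinical mastitis in the next 7 d. At least one cow is
    always returned (an assumption made for this sketch).
    """
    n_alerts = max(1, math.ceil(len(risk_by_cow) * fraction))
    ranked = sorted(risk_by_cow.items(), key=lambda item: item[1], reverse=True)
    return [cow for cow, _ in ranked[:n_alerts]]


# Made-up scores for a herd of 20 cows.
risk = {f"cow_{i:02d}": i / 20 for i in range(20)}
alerts = top_fraction_alerts(risk)
print(alerts)
```

Ranking by predicted risk rather than applying a fixed probability threshold keeps the daily workload for the farm manager roughly constant, whatever the recording frequency of the underlying data.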

CONCLUSIONS
The objective of the present study was to predict the onset of subclinical mastitis within a 7-d window using cow-level data routinely available on some commercial dairy farms. The predictions generated by the model can aid farmers in identifying the cows most at risk of developing subclinical mastitis, potentially triggering remedial action. The study also investigated how the predictive performance of models changes when the recording frequency of milk composition and SCC decreases in the data used to train them. It was found that the GBM model developed was effective and demonstrated good performance, even when milk composition and SCC were recorded less frequently.