If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your password
If the address matches a valid account an email will be sent to __email__ with instructions for resetting your password
Accurate forecasting of dairy cow milk yield is useful to dairy farmers, both in relation to financial planning and for detection of deviating yield patterns, which can be an indicator of mastitis and other diseases. In this study we developed a dynamic linear model (DLM) designed to forecast milk yields of individual cows per milking, as they are milked in milking robots. The DLM implements a Wood's function to account for the expected total daily milk yield. It further implements a second-degree polynomial function to account for the effect of the time intervals between milkings on the proportion of the expected total daily milk yield. By combining these 2 functions in a dynamic framework, the DLM was able to continuously forecast the amount of milk to be produced in a given milking. Data from 169,774 milkings on 5 different farms in 2 different countries were used in this study. A separate farm-specific implementation of the DLM was made for each of the 5 farms. To determine which factors would influence the forecast accuracy, the standardized forecast errors of the DLM were described with a linear mixed effects model (lme). This lme included lactation stage (early, middle, or late), somatic cell count (SCC) level (nonelevated or elevated), and whether or not the proper farm-specific version of the DLM was used. The standardized forecast errors of the DLM were only affected by SCC level and interactions between SCC level and lactation stage. Therefore, we concluded that the implementation of Wood's function combined with a second-degree polynomial is useful for dynamic modeling of milk yield in milking robots, and that this model has potential to be used as part of a mastitis detection system.
). Most studies have focused on assessing the sensors' ability to detect clinical mastitis, and were mostly based on the electrical conductivity measures of milk. Electrical conductivity measures can be combined with other sensor data, for example with milk yield measures, potentially resulting in better detection performances (
created a static linear model to describe the herd-level milk yield per milking given the time interval since the last milking. The study showed a significant quadratic effect of the interval, which is in accordance with the findings of
. This means that the milk yield for a given milking is expected to peak at a certain interval of time since the last session, after which the yield will decrease, resulting in the total daily milk yield being below the individual cow's potential. The effect of the interval on milk yield varies between herds as well as between individual cows (
). To know the deviation between expected and observed milk yields, it is important to have insight in the milk production curve. Various models have been studied for their ability to properly describe milk production curves (
. The Wood's lactation curve is based on a gamma function describing the milk yield as a function of DIM, with an initial increase, a peak value, and a final decrease in milk yield. Because few records exist in early lactation and the daily variation in milk yield is large, a robust fit of the Wood's function for the correct milk yield prediction is difficult. A method that enables prediction of milk yield with a limited amount of data is a dynamic linear model (DLM). The DLM easily handles missing data, because one-step-ahead forecasts are always produced given the available data (
). The DLM provides forecast values, which are automatically adjusted over time while considering the expected trend of the data. The forecast errors of the DLM will be more extreme when the status (e.g., health) of a cow deviates from its normal expected status. The adaptive short-term nature of the forecasts produced by DLM allows for a good short-term prediction. Such a dynamic approach for estimating the expected milk yield per milking for individual cows can be a good basis to quantify the deviation between observed milk production and predicted milk production. These estimates will be based on the expected total daily milk yield of the individual cow given its milk production curve and time since the previous milking.
Because a negative correlation has been observed between SCC and milk yield (
). To be able to observe deviation in the predicted milk production due to the cows' health state, milk production needs to be predicted on relatively healthy cows. This being the case, a reliable and dynamic milk yield prediction model could be used as part of a mastitis detection system, where overestimated milk yields could be one of the criteria for triggering an alarm.
Therefore, this study had 3 objectives: (1) to implement and demonstrate a novel method for predicting the milk yield per milking of individual dairy cows using a DLM. This was achieved by modeling the daily milk yield in a given lactation implementing a Wood's lactation curve to give the rate of change in daily milk yield, and a polynomial function to give the expected fraction of daily milk yield given the interval since last milking; (2) to test the importance of the DLM being specifically implemented and optimized for the specific farm where it is being used; and (3) to test the effect of SCC on the standardized forecast error of the DLM.
MATERIALS AND METHODS
All data processing, modeling, and visualization was done using the statistical language and environment R (
). The data included in this study were divided in a learning set of data, used to define the farm-specific DLM in accordance with aim 1 of this study, and a test set of data that was used to answer the questions posed by aims 2 and 3.
The data used in this study were provided by Lely Industries (Maassluis, the Netherlands). The raw data consisted of 1,094,780 observations of sensor data and identification data recorded by, or related to, milking robots produced by Lely. The data had been recorded between the January 1, 2015, and April 29, 2016, and originated from 57 farms in 6 different countries. These farms were selected from the thousands of farms in the Lely database, based on the availability of inline SCC measurements by the farm's milking robots. The method used by the inline SCC sensor has been described by
. The available identification data allowed the distinction between individual cows, the anonymized farm a given cow belonged to, and in which country the farm was located. Each individual cow in the data set was represented with exactly 1 lactation. The parities of the cows ranged from 1 to 8, although in this study only a distinction between primiparous and multiparous cows was made. This division was used because our model included the shape of the lactation curve, which is known to vary much between parity 1 and 2+, but not much between parity 2 and 3+, with the curve of the first lactation generally being more flat than those of the subsequent lactations (
). The median number of entries into the milking robot was 3 times per cow per day. None of the records were from cows that had been denied access to the milking robot. No information was available about which breeds were used on the included farms.
For this study, the relevant sensor data were the milk yields recorded for individual cows per milking and the inline measurements of SCC, which were used to judge the udder health of the cows. Only lactations with milk yield data on the first DIM and which lasted until at least 300 DIM were included. Only observations made between and including 1 DIM and 300 DIM were included in this study. The day of calving was considered as 0 DIM. This resulted in a total of 979,671 observations from 52 farms. Milkings where the milk yield was <1 kg were considered to be failed milkings and were removed from the data set (n = 9,208), leaving 970,463 observations from 52 farms.
In this study we wanted to predict the milk production of individual cows under the assumption that they have healthy udders. The idea is that when a cow's health state deviates from this assumption, the forecast errors of the DLM will be more extreme. Therefore, the data used for training the DLM needed to come from cows that were known to have as good an udder health as possible, as judged by the SCC measurements. It was thus necessary to be able to assess the udder health for all included cows, and for this reason only cows for which inline SCC data were available for at least once per week throughout the entire lactation were included in the study. This was the case for a total of 337 cows from 26 farms, reducing the data set to a total of 290,309 observations. The per-cow prevalence of elevated SCC was determined for each of the 337 individual cows based on the subset of recordings where inline SCC measurements were available. The thresholds for elevated SCC were set to 150,000 cells/mL for primiparous cows and 250,000 cells/mL for multiparous cows, in accordance with Dutch standard practice based on
. The overall median value of the per-cow prevalence of elevated SCC was 12%. Only individual cows with less than this per-cow prevalence of elevated SCC were allowed to be included in the learning set, which would be used for defining the farm-specific DLM to model the milk yields. All other cows would be included in the test set if they came from a farm with at least 3 primiparous cows and at least 3 multiparous in the DLM learning set. This was needed to estimate the variance components of the DLM (described later in the Estimation of Variance Components section). The final data set included a total of 169,774 observations divided over 196 cows on 5 farms in Germany and France (Table 1).
Table 1Overview of the number of cows included in the learning set and the test set for the 5 farms included in this study
) was used to forecast the milk yield of individual cows at each milking. In general, a DLM consists of an observation equation (Equation ) and a system equation (Equation ):
In our case, θ is a 2 × 1 matrix,
is a 1 × 2 matrix, V is a scalar, Gt is a 2 × 2 matrix, W is a 2 × 2 variance-covariance matrix, and vt and wt are the error terms of Equations  and , respectively. The matrices Ft, Gt, V, and W are assumed known at each observation time t. The observation equation describes how the observed milk yield (Yt) at a given time (t) depends on an unobservable parameter vector (θt). The system equation describes how the system, as expressed by the parameter vector (θt), is expected to change over time. In this study, the parameter vector was defined as
is the expected filtered mean of the daily milk yield at the time t given the DIM at observation time t and T is a trend factor, which serves to adjust the value of the trend, as given by the system matrix defined below. This trend factor is initialized to 1.
The system matrix, which determines how the expected daily milk yield changes over time, was defined as
is the expected rate of change (i.e., the trend) in milk yield at observation time t given the DIM at observation time t, which was further defined by Equation :
where a, b, and c are the parameters of the Wood's lactation function (
). Equation  states that the rate of change is defined as the total daily milk yield estimated by the Wood's function for the DIM of the current milking minus the estimated milk yield of the DIM of the previous milking. This approach means that meaningful estimates of changes can be made even if missing data result in that 2 consecutively observed milkings do not fall on 2 consecutive days. The Wood's parameters were estimated separately for each of the 2 parity groups for each of the 5 farms included in the study (see the Estimation of Wood's Parameters section).
The design matrix,
determines how individual milk yield observations are expected to deviate from the expected total daily milk yield, and is defined as
is the expected relative milk yield (i.e., the expected fraction of the expected total daily milk yield). This value was predicted by a second-degree polynomial function, which was defined separately for each farm and parity group (see Estimation of Expected Relative Milk Yield section). V is the observational variance of the milk yield, whereas W is the systematic variance-covariance matrix, describing the variance of the random change in the milk yield, random change in the systematic trend of the milk yield, and the covariance between the milk yield and the systematic trend of the milk yield. They, along with the initial prior variance of the milk yield and the systematic trend of the milk yield (C0), were estimated for each farm separately (see the Estimation of Variance Components section). The prior variance is used when updating the parameter vector at each observation time t using the Kalman filter (
The observed milk yield was compared with the forecasted milk yield at each observation as part of the Kalman filtering, and the forecast errors, et, were calculated as
Furthermore, the variance of the forecasted value of the milk yield (i.e., the forecast variance, Qt) is estimated as part of the Kalman filtering. The forecast variance can be used to standardize the forecast errors as seen in Equation :
This ensures that the standardized forecast error, ut, will follow a standard normal distribution, as long as the health state of the cows remains normal. This standardization is typically used in research where the forecast errors of a DLM are used to classify the behavior or health state of one or more animals (e.g.,
) were estimated separately for primiparous and multiparous cows for each of the included farms based on the learning set. The parameters were estimated using a genetic algorithm implemented specifically for this purpose. The objective function of the genetic algorithm was to minimize the root mean squared error (RMSE) of the average daily milk yield from DIM = 1 to DIM = 300 for each farm. All registered milkings, regardless of the corresponding interval of time since last milking, were used when aggregating the milk yields to daily total yields.
The genetic algorithm would take an initial vector with 3 values, corresponding to the 3 parameters in the Wood's function (i.e., a, b, and c), where a is a scale parameter that corresponds to the expected number of liters of milk on 1 DIM, and b and c are shape parameters. From this initial vector, a “population” with a total of 1,000 vectors was generated by randomly increasing or decreasing each element in the vector. By repeatedly removing the 80% of the population with the highest RMSE and reestablishing a population of a 1,000 vectors based on the remaining 20%, the best RMSE was lowered with each iteration. This procedure was repeated until no further reduction in RMSE was observed for 10 consecutive iterations. At this point, the best parameter vector (i.e., the one producing the lowest RMSE for a given farm) was saved to be used with the DLM for that specific farm.
, the fraction of the expected total daily milk yield at observation time t, which was expected from a given milking based on the interval of time since the last milking of the same cow (i.e., the relative milk yield), was described as a second degree polynomial function of the interval of time since last milking at observation time t in whole rounded hours. Thus, the 3 parameters α0, α1, and α2, as they apply to Equation , were estimated for each of the 5 farms separately.
The estimation of the 3 parameters was achieved with linear regression, using the built-in R function lm with the model specification poly(), with the number of degrees in the polynomial set to 2. Only observations with intervals of at least 1 h and less than 24 h were used. All observations with intervals outside this range (n = 36) were considered unreliable, in accordance with the findings of
). For each farm, an EM set and a validation set were created. The EM set, on which the EM algorithm was run, consisted of 50% of the cows in the learning set of that farm. These cows were randomly selected. The remaining 50% of the learning set cows in the farm were used as the validation set. For each iteration of the EM algorithm, the DLM was run through the milk yield observations of the cows in the validation set, using the most recently estimated variance components. The validation performance was measured as the RMSE over all observations in the validation set. Whenever the RMSE was lower than the until-then lowest RMSE, the most recent V and W were saved as the (temporarily) best versions of V and W. Whenever the RMSE was either lower or equal to the until-then lowest RMSE, a new C0 would be estimated using the smoothening algorithm (
), and the resulting new C0 would be used in the next iterations. This cycle was repeated for each farm until no improvements in RMSE had been seen for 10 consecutive iterations.
Each of the 5 farm-specific DLM was applied to the milk yield of all cows in each of the 5 farm-specific test sets (Table 1). Thus, the milk yield of all cows were modeled with 1 “proper” model (i.e., a model created specifically to describe the milk yields in the farm where the cow had been observed) and 4 “random” models (i.e., models created specifically to describe the milk yields of one of the other farms included in this study). To minimize the initial adaptation phase, where the DLM needs to adapt to the peculiarities of the individual cow, the systematic variance matrix, W, was multiplied by a factor of 20,000 for all observations on DIM 1 to 7. This factor was not used when running the EM algorithm, as described above.
To determine which factors would influence the standardized forecast errors of the DLM, a mixed effects model was fitted to describe the standardized forecast errors given the SCC level, lactation stage, and the modeling strategy, including all 2- and 3-way interactions between these 3 factors. These 3 factors were all categorized as defined in Table 2. The individual cow and farm were included as random variables with the cow variable nested within the farm variable, which is equivalent to including the effect of the farm alone as well as an interaction effect between farm and cow. This mixed effects model was fitted using the R function lme. The ANOVA function in R was applied to the fitted linear mixed effects model to identify which factors and interactions significantly (α = 5%) affected the standardized forecast errors. The Tukey honestly significant difference test was applied to the factors that had been shown by the ANOVA test to significantly affect the forecast errors, using the glht function from the R package multcomp (
The following subsections describe the farm-specific parameters for the various elements of the DLM, as they were estimated based on the learning set data from each of the respective farms.
Figure 1 provides a visual overview of the differences between the Wood's curves for the 2 parity groups in the 5 farms. The specific parameters of the Wood's functions, a, b, and c, for both parity groups are presented in the plot of each farm. As would be expected, the production levels of multiparous cows were higher than for primiparous cows, especially due to a high peak production. For multiparous cows, the highest and lowest peak production was respectively 50 kg/d (farm DE.1) and 43 kg/d (farm FR.2). For primiparous cows, the highest and lowest peak productions were, respectively, 37 kg/d (farm DE.3) and 33 kg/d (farm FR.2). The persistence of the primiparous cows was higher than of multiparous cows, leading to almost identical production levels at 300 DIM between the 2 parity groups.
Relative Milk Yield
The factors for the second-degree polynomial function, α0, α1, and α2 as seen in Equation , for each of the 5 farms are presented in Table 3. The determination coefficient, R2, shows the goodness of fit to the average relative milk yield given the interval of time since last milking.
Table 3The factors for the second-degree polynomial, describing the effect of time intervals between milkings on the relative milk yield for each parity group in each farm
α0 is the intercept, α1 is the constant factor applied to the interval of time since last milking, and α2 is the constant factor applied to the squared value of the interval of time since last milking.
No. of cows2
3The number of cows of the given parity group, which was included in the learning set for the given farm.
1 α0 is the intercept, α1 is the constant factor applied to the interval of time since last milking, and α2 is the constant factor applied to the squared value of the interval of time since last milking.
Figure 2 illustrates the polynomial models (i.e., how the relative milk yield depends on the interval of time since last milking) for the primiparous and multiparous cows on each of the 5 farms included in this study. The figure shows vastly diverse shapes between the parity groups as well as between the farms. A decreasing coefficient of the curve (i.e., the curve bends down) denotes a decreasing milk production in terms of milk production per hour. The point where the curve coefficient starts to decrease can be seen as the optimal milking interval because that interval provides the highest milk production per hour. The sharper the decrease in curve coefficient, the larger the effect of milking interval on milk production per hour. For instance, on farm DE.1, the milking frequency for both the primiparous and the multiparous cows should not move beyond 12 h. In contrast, although there are effects of milking interval on relative milk production on farm DE.2, these effects are relatively small. Hence, for farm DE.2 it is less important to have a shorter milking interval. For farms DE.3 and FR.2, the effects of a longer milking interval differ considerably between primiparous and multiparous cows; however, the difference is not similar between farms.
For the various farms, the EM algorithm ran between 25 and 38 iterations, with a median of 33 iterations, before no further reduction of the RMSE had been seen for 10 consecutive iterations.
The observational variances, V, were estimated to be between 1.05 and 2.03 kg2 (median = 1.88 kg2). This means that, for the median variance of 1.88 kg2, for any given milk yield measurement, the 95% accuracy of that measurement would be an estimated
around the measured value.
The systematic variances of the evolution of the milk yield, W1,1, were between 1.04 and 2.43 kg2 (median = 1.79 kg2), meaning that the true total daily milk yield would be expected, within a 95% accuracy, to randomly drift by up to ±2.6 kg from what would be expected given the Wood's curve. The systematic variances of the evolution of the trend component of the milk yield, W2,2, were between 0.002 and 0.004 kg2 (median = 0.003 kg2). This means that the systematic effect of the rate of change in milk yield, given by the Wood's function, would be expected, within a 95% accuracy, to drift by ±0.1 kg for each observation. The systematic co-variances between evolution of the milk yield and the evolution of the trend component, W1,2 and W2,1, were between −0.10 and −0.04 kg2 (median = −0.07 kg2). These negative co-variance should be interpreted to mean that higher levels of random drift of the true daily milk yield results in a lower drift of the systematic rate of change, and vice versa.
The initial prior variances of the milk yield, C01,1, were between 0.22 and 0.81 kg2 (median = +0.41 kg2). This means that, for the median value, the true total daily milk yield on 1 DIM, given farm and parity group, would be expected to fall within ±1.3 kg around the initial estimate with a 95% accuracy. The initial prior variances of the milk yield trend, C02,2, were between 0.0004 and 0.0002 kg2 (median = 0.0011 kg2). This means that the initial systematic rate of change in the milk yield should be 95% accurate within a factor of ±0.1 kg. The initial prior co-variances between milk yield and the milk yield trend were between −0.032 and −0.009 kg2 (median = −0.018 kg2).
This section relates to the results of the descriptive analyses done on the standardized forecast errors, which were achieved when applying the farm-specific DLM, described in the previous section, on the test set data. Table 4 shows the results of the ANOVA test applied to the mixed effects model, which was fitted to describe the standardized forecast errors given the lactation stage, SCC level, and modeling strategy, as defined in Table 2. The SCC level was associated with an increase in the standardized forecast error in the negative direction when the SCC level was elevated. In other words, an elevated SCC level would cause the DLM to overestimate the milk yield. The Tukey's honestly significant difference test, used to investigate the interaction effect between lactation stage and SCC level, showed that the standardized forecast errors were least affected in the early stage of lactation (1:30 DIM, P = 0.033), more so in the middle stage (31:80 DIM, P < 0.001), and most affected in the late stage of lactation (>80 DIM, P < 0.001).
Table 4The results of the ANOVA test applied to the mixed effects model describing the standardized forecast errors
This study showed that the observational variances of the milk yield of the farms, as estimated by the EM algorithm, varied between 1.05 and 2.03 kg2. This corresponds to standard deviations of between 7.4% and 10.3% of the average milk yield recorded at milkings where more than 10 kg were produced, regardless of parity group or health status. For comparison, the International Committee for Animal Recording (ICAR) recording guidelines state that automatic milking systems should measure the milk yield with a standard deviation of no more than 5% of the reference milk yield for milkings where more than 10 kg are produced (
Devices used by automated milking systems are similarly accurate in estimating milk yield and in collecting a representative milk sample compared with devices used by farms with conventional milk recording.
. Given this knowledge, it seems likely that the observational variance seen in our study is largely due to measurement errors on the farms.
To shorten the adaptation phase of the DLM, a factor of 20,000 was applied to the variance of the first 7 d of lactation, which ensured an accelerated adaptation to the level of the individual cow. After this accelerated adaptation, the only truly farm-specific variables were the observation variance and the system variance. If these variables had varied more between farms, it is expected that the effect of not using the proper farm-specific models would have been greater. Furthermore, applying the factor of 20,000 meant that the forecast errors would be centered on 0 for the remainder of the lactation. In contrast, not applying this factor had the consequence that the initial adaption phase would take around 30 d instead of 7 (data not shown).
The standardized forecast errors were not affected by the lactation stage alone (P = 0.68), even though a significant interaction effect was seen between lactation stage and SCC level. This result shows that the Wood's function is a useful method for taking the effect of the DIM into account when dynamically modeling the milk yield. Several other functions to describe lactation curves do, however, exist. Most of these are more complex than the Wood's function, but all of them could equally well be implemented into a DLM.
compared static implementations of 9 different lactation functions for accuracy in describing group-level mean daily milk yield. They found that the Wood's function ranked worst in terms of the adjusted correlation coefficient, and the 2 equally best models were the diphasic model (
showed that, in the dynamic context of a DLM, describing the milk yield with the assumption of a simple linear trend, which was continually adjusted by the Kalman filter, produced one-step milk yield forecasts, which were useful for detecting clinical mastitis in individual dairy cows. Furthermore,
dynamically modeled the daily milk yield without including any lactation function at all, but instead had the daily milk yield be a function of feed intake and the number of milkings per day. The model of
did, however, show a general bias toward underestimating the milk yield. For future research, it could therefore be interesting to investigate the utility of implementing different lactation functions with different levels of complexity into a DLM. Based on our results we do, however, expect that, in the context of a DLM, any added value of implementing more complex lactation curves than the Wood's curve will be extremely limited.
In this study, our focus has been on short-term milk yield predictions. A DLM like the one described in this paper can, however, also be used to predict the long-term lactation curves for individual cows, and by extension for the farm as a whole. If used in this way, the DLM would initially predict that a given cow would follow the average farm-level lactation curve given her parity group. It would, however, rapidly adapt to the level of the individual cow, just as was the case in this study, and the long-term milk yield forecasts would thus become increasingly more precise as more data are observed. Annual milk predictions can assist dairy farmers in determining the efficiency of their production on the long-run, but also would enable dairy farmers to make better informed decisions, such as whether or not to shorten the dry period, select cows for culling or breeding, and determine the optimal interval for milking (
). Using a DLM for early prediction of the long-term milk yields could therefore be used for early identification of cows that are likely to develop problems related to a negative energy balance.
No effect (P = 0.77) was found for the modeling strategy, which indicates that using a DLM which is not specifically created to describe the farm where it is used has no effect on predicting milk yield. We further looked at the effect on each of the 5 farms, when applying the model made for each of the 4 other farms, and found that the P-values ranged from 0.46 to 1 with a mean of 0.93 and a median of 1. This is further evidence that the dynamic nature of the DLM is able to correct for the effects of using parameter values for the DLM that are based on data from different farms or, in principle, from relevant scientific literature. For practical uses, however, it is of course always most prudent to build a model for a specific farm based on data from that farm, if such data are available. The practical implication of our results is that, if farm-specific data are not available (e.g., if the farmer has only recently acquired the relevant sensor system), then using a model based on data from another farm could still be meaningful. Additionally, it should be remembered that only 5 farms were included in our study, and further studies with more farms are warranted to determine whether this finding holds true in general.
Only the SCC level as well as an interaction effect between the SCC level and the lactation stage showed statistically significant effects on the standardized forecast errors of the DLM. This supports the idea that the model described in this study may be used as part of an inline disease alarm system. This also means that more accurate milk yield measurements (i.e., lower observational variance) should be expected to allow such an alarm system to be more accurate. The fact that there was an interaction effect between SCC and the lactation stage further suggests that the lactation stage should be taken into account by such an alarm system, even though the standardized forecast errors of the DLM were insensitive to the lactation stage alone. Additionally, the milk yield forecast errors could be combined with other information and used as part of a larger framework, such as artificial neural networks (
). The idea of using such a larger framework would be that a specific disease might have a specific signature in the forecast errors of multiple DLM made to describe the data from multiple different sensor systems. Some existing sensor systems are already commonly used in the dairy industry, such as electrical conductivity of the milk (
). As with milk yield, a DLM can also be used to describe the electrical conductivity, and the standardized forecast errors of such a DLM could be combined with those of a milk yield DLM in the larger framework. Further additional information could be extracted from the time intervals between the milkings. For example, if a cow develops mastitis she might become increasingly unwilling to enter the milking robot. A DLM designed to describe the mean time interval between milkings for individual cows, including a trend component, could be used. Then, if a given interval between 2 milkings is significantly higher than what would be expected for an individual cow, it could indicate a problem. Alternatively, if the trend for the interval becomes significantly greater than 0, meaning that her intervals become increasingly longer over time, it could suggest an emerging problem. Using the direction of the trend in this way has been demonstrated by
), and so a unifying framework for a disease alarm system should also be able to consider this kind of additional information. A well-made alarm system of this kind would also be more robust than the currently used 1-variable alarm system, as it would have several sources of information as the basis for the alarm, which means that missing data from 1 sensor would be less of a problem.
A DLM for forecasting milk yield at milking level for individual cows being milked in milking robots was developed and tested. The standardized forecast errors of the DLM predictions were affected by the SCC level, which suggest that this model could be used as part of a mastitis detection system. Furthermore, a significant interaction effect was observed between SCC level and the lactation stage. This suggests that if our model were to be implemented in practice as part of a mastitis detection system, the accuracy of such a system would likely benefit from considering the magnitude of the forecast errors in the context of the lactation stage before raising a mastitis alarm, as opposed to raising the alarm based on the forecast error alone. We did not see any effect from lactation stage alone. We also saw no effect from whether or not a farm-specific version of the DLM was used. Because this effect was only tested on 5 farms, we do however caution the reader to build farm-specific models whenever possible, at least until further research can confirm or deny the generalizability of this finding.
We thank Rik van der Tol from Lely Industries (Maassluis, the Netherlands) for providing the data used in this study. The work described in this paper was funded by the Wageningen University (WU) Talent Program, project number 2100949600.
Mastitis and the shape of the lactation curve in Norwegian dairy cows.
Devices used by automated milking systems are similarly accurate in estimating milk yield and in collecting a representative milk sample compared with devices used by farms with conventional milk recording.