
A machine learning proposal method to detect milk tainted with cheese whey

Open Access. Published: October 04, 2022. DOI: https://doi.org/10.3168/jds.2021-21380

      ABSTRACT

      Cheese whey addition to milk is a type of fraud with high prevalence and severe economic effects, resulting in low yield for dairy products, nutritional reduction of milk and milk-derived products, and even some safety concerns. Nevertheless, methods to detect fraudulent addition of cheese whey to milk are expensive and time consuming, and are thus ineffective as screening methods. The Fourier-transform infrared (FTIR) spectroscopy technique is a promising alternative to identify this type of fraud because a large amount of data is generated, and useful information might be extracted to be used by machine learning models. The objective of this work was to evaluate the use of FTIR with machine learning methods, such as classification tree and multilayer perceptron neural networks, to detect the addition of cheese whey to milk. A total of 520 samples of raw milk were added with cheese whey in concentrations of 1, 2, 5, 10, 15, 20, 25, and 30%; and 65 samples were used as control. The samples were stored at 7, 20, and 30°C for 0, 24, 48, 72, and 168 h, and analyzed using FTIR equipment. Complementary results of 520 samples of authentic raw milk were used. Selected components (fat, protein, casein, lactose, total solids, and solids nonfat) and freezing point (°C) were predicted using FTIR and then used as input features for the machine learning algorithms. Performance metrics included accuracy as high as 96.2% for CART (classification and regression trees) and 97.8% for multilayer perceptron neural networks, with precision, sensitivity, and specificity above 95% for both methods. The use of milk composition and freezing point predicted using FTIR, associated with machine learning techniques, was highly efficient to differentiate authentic milk from samples added with cheese whey. The results indicate that this is a potential method to be used as a high-performance screening process to detect milk adulterated with cheese whey in milk quality laboratories.

      Key words

      INTRODUCTION

      Raw milk tampering with cheese whey is a serious problem for the dairy industry, especially in developing countries. This fraudulent practice concerns food inspection agencies and consumers because it reduces the nutritional value of milk and some derived products and may even raise safety issues (
      • Brandao M.C.M.P.
      • Carmo A.P.
      • Bell M.J.V.
      • Anjos V.C.
      Characterization of milk by infrared spectroscopy.
      ;
      • Robim M.S.
      • Cortez M.A.S.
      • Silva A.C.O.
      • Filho R.A.T.
      • Gemal N.H.
      • Nogueira E.B.
      Research fraud in UHT whole milk marketed in the state of Rio de Janeiro and comparison between the methods of physicochemical officers and the method of ultrasound.
      ;
      • Tibola C.S.
      • da Silva S.A.
      • Dossa A.A.
      • Patrício D.I.
      Economically motivated food fraud and adulteration in Brazil: Incidents and alternatives to minimize occurrence.
      ). For example, milk protein dilution after cheese whey tampering may motivate the addition of cheaper materials, such as urea, or even hazardous chemicals, such as melamine, to disguise the lower protein content (
      • Handford C.E.
      • Campbell K.
      • Elliott C.T.
      Impacts of milk fraud on food safety and nutrition with special emphasis on developing countries.
      ;
      • Poonia A.
      • Jha A.
      • Sharma R.
      • Singh H.B.
      • Rai A.K.
      • Sharma N.
      Detection of adulteration in milk: A review.
      ).
      However, suitable analytical methods to investigate this fraud are usually expensive and time consuming, and have limited precision and accuracy (
      • de La Fuente M.A.
      • Juárez M.
      Authenticity assessment of dairy products.
      ;
      • de Carvalho B.M.A.
      • de Carvalho L.M.
      • dos Reis Coimbra J.S.
      • Minim L.A.
      • de Souza Barcellos E.
      • da Silva Júnior, W.F.
      • Detmann E.
      • de Carvalho G.G.P.
      Rapid detection of whey in milk powder samples by spectrophotometric and multivariate calibration.
      ;
      • Tibola C.S.
      • da Silva S.A.
      • Dossa A.A.
      • Patrício D.I.
      Economically motivated food fraud and adulteration in Brazil: Incidents and alternatives to minimize occurrence.
      ). Notably, one of the best-known methods for cheese whey detection, based on the quantification of caseinomacropeptide (CMP) by HPLC, is a subject of controversy due to false-positive results and accuracy issues (
      • Lenardon L.
      • Meneghini L.Z.
      • Hoff R.B.
      • Motta T.M.C.
      • Pizzolato T.M.
      • Ferrão M.F.
      • Bergold A.M.
      Determination of caseinomacropeptide in Brazilian bovine milk by high-performance liquid chromatography-mass spectrometry.
      ;
      • de Pádua Alves É.
      • de Alcântara A.L.D.A.
      • Guimarães A.J.K.
      • de Santana E.H.W.
      • Botaro B.G.
      • Fagnani R.
      Milk adulteration with acidified rennet whey: A limitation for caseinomacropeptide detection by high-performance liquid chromatography.
      ;
      • Raymundo N.K.L.
      • Daguer H.
      • Osaki S.C.
      • Bersot L.S.
      Correlating mesophilic counts to the pseudo-CMP content of raw milk.
      ;
      • Lobato P.R.
      • Heringer J.P.M.
      • Fortini M.E.R.
      • Ferreira L.F.
      • Feijó F.A.C.
      • Leite M.O.
      • Cerqueira M.M.O.P.
      • Penna C.A.M.
      • Souza M.R.
      • Fonseca L.M.
      Índice de CMP em leite pasteurizado comercializado em Minas Gerais, Brasil, durante os anos de 2011 a 2017.
      ).
      Current advances in the use of machine learning in analytical methods may be an answer to this problem because new and innovative techniques could be created and incorporated into laboratory routine (
      • de La Fuente M.A.
      • Juárez M.
      Authenticity assessment of dairy products.
      ). Predictive methods such as machine learning algorithms are powerful modeling tools which can detect complex, nonlinear relationships between inputs and outputs (
      • Alves da Rocha R.
      • Paiva I.M.
      • Anjos V.
      • Furtado M.A.M.
      • Bell M.J.V.
      Quantification of whey in fluid milk using confocal Raman microscopy and artificial neural network.
      ;
      • Morota G.
      • Ventura R.V.
      • Silva F.F.
      • Koyama M.
      • Fernando S.C.
      Big data analytics and precision agriculture symposium: Machine learning and data mining advance predictive big data analysis in precision animal agriculture.
      ;
      • Skansi S.
      Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence.
      ;
      • Neto H.A.
      • Tavares W.L.F.
      • Ribeiro D.C.S.Z.
      • Alves R.C.O.
      • Fonseca L.M.
      • Campos S.V.A.
      On the utilization of deep and ensemble learning to detect milk adulteration.
      ). Their use has expanded to many fields, including information technology, linguistics, medicine, finance, and marketing. In analytics, several applications have been developed, such as the use of neural networks associated with infrared analysis to investigate the addition of extraneous substances, such as sugar and starch, to milk (
      • Liakos K.G.
      • Busato P.
      • Moshou D.
      • Pearson S.
      • Bochtis D.
      Machine Learning in Agriculture: A Review.
      ;
      • Conceição D.
      • Gonçalves B.-H.
      • da Hora F.
      • Faleiro A.
      • Santos L.
      • Ferrão S.
      Use of FTIR-ATR spectroscopy combined with multivariate analysis as a screening tool to identify adulterants in raw milk.
      ;
      • Neto H.A.
      • Tavares W.L.F.
      • Ribeiro D.C.S.Z.
      • Alves R.C.O.
      • Fonseca L.M.
      • Campos S.V.A.
      On the utilization of deep and ensemble learning to detect milk adulteration.
      ).
      The decision tree is a popular machine learning method for classification and regression problems. A decision tree is a predictive technique that splits the values of the predictors according to a set of splitting rules, which divide the data into homogeneous subsets. A prediction, or decision, can be made by navigating from the root to a terminal node. Several algorithms can be used, depending on the type of tree. Decision tree learning is powerful yet simple and efficient, and can be easily understood, interpreted, and controlled (
      • Wu X.
      • Kumar V.
      • Ross Quinlan J.
      • Ghosh J.
      • Yang Q.
      • Motoda H.
      • McLachlan G.J.
      • Ng A.
      • Liu B.
      • Yu P.S.
      • Zhou Z.-H.
      • Steinbach M.
      • Hand D.J.
      • Steinberg D.
      Top 10 algorithms in data mining.
      ;
      • Ertel W.
      Introduction to Artificial Intelligence.
      ).
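The root-to-terminal-node navigation described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Node` class, the feature names, and the thresholds are hypothetical and were not fitted to the study data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """One node of a binary decision tree."""
    feature: Optional[str] = None       # predictor tested at an internal node
    threshold: Optional[float] = None   # rule: go left if value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None         # set only on terminal nodes


def predict(node: Node, sample: Dict[str, float]) -> str:
    """Walk from the root to a terminal node and return its label."""
    while node.label is None:
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Hypothetical tree: thresholds are illustrative, not taken from the study.
tree = Node(
    feature="casein", threshold=2.45,
    left=Node(label="adulterated"),
    right=Node(
        feature="lactose", threshold=4.75,
        left=Node(label="authentic"),
        right=Node(label="adulterated"),
    ),
)

print(predict(tree, {"casein": 2.60, "lactose": 4.60}))
```

Each internal node applies one splitting rule, so the path taken by a sample doubles as a readable explanation of the decision.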
      Classification and regression trees (CART) cover the use of trees as a data analysis method, and were developed by Leo Breiman and colleagues (
      • Breiman L.
      • Friedman J.
      • Stone C.J.
      • Olshen R.A.
      Classification and Regression Trees.
      ). Despite its simplicity and analytical power, the use of CART as a tool in food analytical methods has been limited (
      • Hansen L.
      • Ferrão M.F.
      Classification of milk samples using CART.
      ). In a CART tree, the dependent variable can be either categorical (classification trees) or continuous (regression trees), whereas the predictors can be both continuous and categorical (
      • Bramer M.
      Principles of Data Mining.
      ). For binary classification trees, the value of each terminal node is the mode of the observations in the corresponding subset, and the prediction accuracy is given by the percentage of correctly classified cases.
      Artificial neural networks are a powerful learning method modeled on the components of the biological brain. An artificial neural network is composed of connected nodes called neurons, and each connection transmits a signal from one neuron to neurons in a deeper layer. An artificial neuron receives a signal, which is a data input, and processes it by evaluating a computational function with a specified weight value. Finally, the neuron transmits the result as another signal to other connected neurons. Neural networks can detect complex, nonlinear relationships between inputs and outputs, and their uses are found in different fields, such as finance (
      • Xu X.
      • Zhang Y.
      Corn cash price forecasting with neural networks.
      ;
      • Ghaffarian S.
      • van der Voort M.
      • Valente J.
      • Tekinerdogan B.
      • de Mey Y.
      Machine learning-based farm risk management: A systematic mapping review.
      ), marketing (
      • Guiné R.P.F.
      • Ferrão A.C.
      • Ferreira M.
      • Correia P.
      • Mendes M.
      • Bartkiene E.
      • Szűcs V.
      • Tarcea M.
      • Sarić M.M.
      • Černelič-Bizjak M.
      • Isoldi K.
      • El-Kenawy A.
      • Ferreira V.
      • Klava D.
      • Korzeniowska M.
      • Vittadini E.
      • Leal M.
      • Frez-Muñoz L.
      • Papageorgiou M.
      • Djekić I.
      Influence of sociodemographic factors on eating motivations - modelling through artificial neural networks (ANN).
      ), physics (
      • Schiassi E.
      • Furfaro R.
      • Leake C.
      • De Florio M.
      • Johnston H.
      • Mortari D.
      Extreme theory of functional connections: A fast physics-informed neural network method for solving ordinary and partial differential equations.
      ;
      • Ma L.
      • Kashanj S.
      • Xu S.
      • Zhou J.
      • Nobes D.S.
      • Ye M.
      Flow reconstruction and prediction based on small particle image velocimetry experimental datasets with convolutional neural networks.
      ), linguistics (
      • Lakretz Y.
      • Hupkes D.
      • Vergallito A.
      • Marelli M.
      • Baroni M.
      • Dehaene S.
      Mechanisms for handling nested dependencies in neural-network language models and humans.
      ), medicine (
      • Dunnmon J.A.
      • Yi D.
      • Langlotz C.P.
      • Ré C.
      • Rubin D.L.
      • Lungren M.P.
      Assessment of convolutional neural networks for automated classification of chest radiographs.
      ;
      • Koo T.
      • Kim M.
      • Jue M.
      Automated detection of superficial fungal infections from microscopic images through a regional convolutional neural network.
      ;
      • Zhu X.
      • Zheng B.
      • Cai W.
      • Zhang J.
      • Lu S.
      • Li X.
      • Xi L.
      • Kong Y.
      Deep learning-based diagnosis models for onychomycosis in dermoscopy.
      ), among others (
      • Alpaydin E.
      Introduction to Machine Learning.
      ;
      • Witten I.
      • Frank E.
      • Hall M.
      • Pal C.
      Data Mining: Practical Machine Learning Tools and Techniques.
      ).
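As a rough illustration of the single artificial neuron described above, the sketch below computes a weighted sum of the inputs and passes it through an activation function. The hyperbolic tangent is used here because it is one of the activations mentioned for MLP; the input values, weights, and bias are hypothetical.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a tanh activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return math.tanh(z)


# Hypothetical inputs and weights, for illustration only.
signal = neuron([0.5, -1.0], [0.8, 0.2], bias=0.1)
print(signal)
```

The returned value is the signal that would be passed on to the neurons of the next layer.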
      Multilayer perceptron (MLP) is a feedforward network with 3 or more layers (one input, one output, and one or more hidden layers), which usually employs a sigmoid or a hyperbolic tangent function as an activation function. Indices such as accuracy, precision, sensitivity, and specificity may be valuable tools to estimate the performance of a prediction method, together with the receiver operating characteristic curve (
      • Neto H.A.
      • Tavares W.L.F.
      • Ribeiro D.C.S.Z.
      • Alves R.C.O.
      • Fonseca L.M.
      • Campos S.V.A.
      On the utilization of deep and ensemble learning to detect milk adulteration.
      ).
      The most used infrared equipment for raw milk analysis today is based on Fourier-transform infrared (FTIR) spectroscopy in the mid-infrared range, 649 to 3,999 cm−1 (
      • Ho P.N.
      • Luke T.D.W.
      • Pryce J.E.
      Validation of milk mid-infrared spectroscopy for predicting the metabolic status of lactating dairy cows in Australia.
      ). This type of equipment is used worldwide for daily compositional analyses of millions of raw milk samples, for quality control and inspection in the dairy industry and for dairy herd improvement programs. Combining this technology with machine learning algorithms might be an optimization tool to screen milk compositional data for authenticity (
      • Oliveira M.C.P.P.
      • Silva N.M.A.
      • Bastos L.P.F.
      • Fonseca L.M.
      • Cerqueira M.M.O.P.
      • Leite M.O.
      • Conrrado R.S.
      Fourier transform infrared spectroscopy (FTIR) for MUN analysis in normal and adulterated milk.
      ;
      • Gondim C.S.
      • Junqueira R.G.
      • Souza S.V.C.
      • Ruisánchez I.
      • Callao M.P.
      Detection of several common adulterants in raw milk by MID-infrared spectroscopy and one-class and multi-class multivariate strategies.
      ;
      • Neto H.A.
      • Tavares W.L.F.
      • Ribeiro D.C.S.Z.
      • Alves R.C.O.
      • Fonseca L.M.
      • Campos S.V.A.
      On the utilization of deep and ensemble learning to detect milk adulteration.
      ;
      • Brito R.F.
      • Rodrigues R.
      • Diniz S.A.
      • Fonseca L.M.
      • Leite M.O.
      • Souza M.R.
      • Conrrado R.S.
      • Veríssimo S.A.O.
      • Valente G.L.S.
      • Cerqueira M.M.O.P.
      Analysis of the freezing point of milk by precision method and by Fourier transform infrared (FTIR) spectroscopy.
      ).
      Although multivariate analysis has been used to detect cheese whey adulteration of milk (
      • Valente G.F.S.
      • Guimarães D.C.
      • Gaspardi A.L.A.
      • Oliveira L.A.
      Applying artificial neural networks as a test to detect milk fraud by whey addition.
      ;
      • Vinciguerra L.L.
      • Marcelo M.C.A.
      • Motta T.M.C.
      • Meneghini L.Z.
      • Bergold A.M.
      • Ferrao M.F.
      Chemometric tools and FTIR-ATR spectroscopy applied in milk adulterated with cheese whey.
      ), the novelty of our study lies in the use of supervised machine learning methods applied to a large set of real samples of bulk tank raw milk, with potential for use as a quality control tool in the routine of a milk quality laboratory. Artificial neural networks may be classified as a robust nonlinear multivariate analysis technique (
      • Witten I.
      • Frank E.
      • Hall M.
      • Pal C.
      Data Mining: Practical Machine Learning Tools and Techniques.
      ;
      • Kubat M.
      An Introduction to Machine Learning.
      ).
      The objective of this work was to discriminate between raw milk and milk adulterated with cheese whey using machine learning methods applied to FTIR results. This is an innovative screening method that could speed up the analysis of raw milk samples, with practical implications.

      MATERIALS AND METHODS

      Ethical approval was waived by the local Ethics Committee of the University (CEP-UFMG) in view of the nature of the study and all the procedures.

      Milk and Cheese Whey

      The experiment was done in the Laboratory for Milk Quality Analysis (ISO/IEC 17025 accredited), School of Veterinary Medicine, Universidade Federal de Minas Gerais (UFMG), Brazil. Five batches of refrigerated raw milk were collected from a refrigerated farm bulk tank, from May to December 2019, at a research farm with a herd of about 100 lactating cows of varying proportions of Holstein and Gyr genetics. The milk was processed to obtain Minas cheese (a typical Brazilian cheese) by rennet addition (chymosin) and coagulation (
      • Andreatta E.
      • Fernandes A.M.
      • Santos M.V.
      • Mussarelli C.
      • Marques M.C.
      • Gigante M.L.
      • Oliveira C.A.F.
      Qualidade de queijo minas frescal preparado com leite com diferentes quantidades de células somáticas.
      ). Briefly, 10-L batches of raw milk were pasteurized (low temperature, long time) at 64°C for 30 min and, after cooling to approximately 35°C, liquid rennet (Ha-la, Christian-Hansen) was added (0.8 mL/L). After coagulation (about 40 min), the gel was cut into cubes with sides of approximately 1.5 cm and stirred for 30 min. At the end, whey was collected and filtered through qualitative filter paper (11 µm). The resulting cheese whey was heated to 72–75°C for 10 min to denature chymosin, and then cooled to 20°C for immediate experimental use. For each repetition, whey was added to raw milk in 50-mL vials at different concentrations (0, 1, 2, 5, 10, 15, 20, 25, and 30%), and a preservative was added (Broad Spectrum MicroTabs, 8 mg of bronopol and 0.30 mg of natamycin; Advanced Instruments). After randomization, samples were stored at 7, 20, and 30°C for 0, 24, 48, 72, and 168 h, generating a total of 585 samples (Figure 1). Samples from each treatment were randomly positioned in racks specific for the FTIR equipment. Complementary results of 520 samples of authentic bulk tank raw milk, analyzed in 2019 and 2020, were collected from the laboratory server, giving a total of 1,105 samples.
      Figure 1. Sample preparation scheme with cheese whey addition to milk (1, 2, 5, 10, 15, 20, 25, and 30%, and control without cheese whey addition), stored at times 0, 24, 48, 72, and 168 h, under 7°C, 20°C, and 30°C.
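The sample counts reported above can be checked with a short calculation. One way to arrive at 585 samples, with 65 per whey concentration, is to assume that the 0-h samples were measured once per batch rather than once per storage temperature. That reading is an assumption made for this sketch; the text does not state it explicitly.

```python
concentrations = [0, 1, 2, 5, 10, 15, 20, 25, 30]  # % whey, from the text
temperatures = [7, 20, 30]                          # degrees C, from the text
storage_times = [24, 48, 72, 168]                   # h, excluding time 0
batches = 5                                         # from the text

# Assumption: time 0 counts once, then every temperature x storage time pair.
conditions = 1 + len(temperatures) * len(storage_times)
per_concentration = conditions * batches
experimental = per_concentration * len(concentrations)
adulterated = per_concentration * (len(concentrations) - 1)
total = experimental + 520  # plus authentic routine samples, from the text

print(per_concentration, experimental, adulterated, total)
```

Under this assumption the numbers agree with the reported 65 samples per treatment, 520 adulterated samples, 585 experimental samples, and 1,105 samples overall.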

      Fourier-Transform Infrared Analyses

      Fourier-transform infrared analyses were done in the Laboratory for Milk Quality Analysis, Veterinary School, UFMG, Brazil. This is an ISO 17025 accredited laboratory, which can analyze about 80,000 samples of raw milk per month.
      Raw milk and raw milk with added cheese whey were analyzed for composition and freezing point using FTIR equipment (CombiScope FTIR 400, Delta Instruments) containing a validated multivariate calibration model (partial least squares;
      • Delta Instruments
      Datascope with 20 FT components.
      ). Instrument verification was based on standard milk samples (Valacta). Sample results included composition (fat, protein, lactose, TS, SNF, casein, and MUN) and freezing point (°C). The mid-infrared region was used for FTIR measurement (900–3,000 cm−1). Statistical analysis for the analytical FTIR measurement was done according to the
      • ISO/IDF
      ISO 9622/IDF 141, Milk and liquid milk products — Guidelines for the application of mid-infrared spectrometry.
      official method. Briefly, the following instrumental and analytical factors were verified for compliance with ISO 9622:2013 (
      • ISO/IDF
      ISO 9622/IDF 141, Milk and liquid milk products — Guidelines for the application of mid-infrared spectrometry.
      ): repeatability, reproducibility, zero stability, homogenization, linearity, and carryover. The quality of the analytical procedure was monitored with control charts (
      • ISO/IDF
      ISO 9622/IDF 141, Milk and liquid milk products — Guidelines for the application of mid-infrared spectrometry.
      ).

      Machine Learning

      Machine learning workflows usually divide data into specific sets: training, validation, and test. The training set consists of samples used to fit the model. During training, the model can split the training set and define a validation set, which is used to provide an unbiased evaluation and guide the algorithm in tuning the model hyperparameters. Finally, the test set consists of held-out samples to which the fitted model is applied (
      • James G.
      • Witten D.
      • Hastie T.
      • Tibshirani R.
      An introduction to statistical learning: With applications in R.
      ).
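A random three-way split of this kind can be sketched as follows. The study relied on the splitting routines built into the statistical software, so this standalone function is only a stand-in; the function name, the seed, and the rounding rule are assumptions, while the 1,105 samples and the approximate 55:25:20 ratio come from the text.

```python
import random


def split_indices(n, ratios=(0.55, 0.25, 0.20), seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(1105)
print(len(train), len(val), len(test))
```

Fixing the seed makes the split reproducible, which helps when two models are to be compared on the same partition.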
      The CART Classification Tree (Minitab 19.2020) was used as the classification method, and the resulting predictive algorithms were applied to the test data set with the objective of distinguishing authentic milk from adulterated milk. For binary classification with the CART algorithm, the value of each terminal node was the mode of the observations in the corresponding subset, and the prediction accuracy was given by the percentage of correctly classified cases. The tree structure was composed of a root node, internal nodes, and terminal nodes. Each internal node divided the instance space into 2 or more spaces according to a discrete function of the input attributes. Each terminal node represented a decision on the target attribute. The parameters included prior probabilities matching sample frequencies; training, validation, and test sets randomly split from the 1,105 samples at a ratio of approximately 55:25:20, respectively; and the Gini splitting method, with the tree selected within one standard error of the minimum misclassification cost.
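To make the Gini splitting criterion and the mode-based terminal-node value concrete, here is a minimal sketch for a single predictor. It is not the Minitab implementation: it omits pruning and the one-standard-error rule, it tests observed values rather than midpoints as candidate thresholds, and the casein values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted Gini) of the best split 'value <= t'."""
    best = (None, gini(labels))
    n = len(values)
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


def node_value(labels):
    """Value of a terminal node: the mode of its observations."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical casein values (g/100 g) and class labels.
casein = [2.1, 2.2, 2.3, 2.6, 2.7, 2.8]
classes = ["adulterated"] * 3 + ["authentic"] * 3
print(best_split(casein, classes))
```

A full tree repeats this search over every predictor at every node and keeps the split that lowers the weighted impurity the most.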
      The procedure for an MLP neural network (IBM SPSS Modeler 18.2) included the following parameters: training, validation, and test samples randomly split by the software algorithm at a ratio of approximately 55:25:20; an input layer with the selected features as covariates; training optimized with the scaled conjugate gradient algorithm; a maximum training time of 15 min; and training epochs computed automatically. The network had one hidden layer containing 3 units, excluding the bias unit; the activation function for this layer was the hyperbolic tangent and the activation function for the output layer was softmax. Cross-entropy was used as the loss function for optimization of the neural network.
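The forward pass of a network with this shape (one hidden tanh layer with 3 units, a softmax output, and a cross-entropy loss) can be sketched in plain Python. The weights below are hypothetical and untrained, the 4 inputs stand for standardized features chosen only for illustration, and the scaled conjugate gradient training step is not shown.

```python
import math


def forward(x, W1, b1, W2, b2):
    """One hidden tanh layer followed by a softmax output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])


# Hypothetical weights: 4 inputs -> 3 hidden units -> 2 classes.
W1 = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2], [0.1, 0.1, -0.2, 0.3]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.5, -0.4, 0.3], [-0.5, 0.4, -0.3]]
b2 = [0.0, 0.0]
x = [0.1, -0.2, 0.3, 0.5]                 # illustrative standardized inputs

probs = forward(x, W1, b1, W2, b2)
loss = cross_entropy(probs, true_class=0)
print(probs, loss)
```

Softmax turns the two output activations into class probabilities that sum to 1, and training would adjust the weights to reduce the cross-entropy over the training samples.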
      Because the reference CMP index method, using HPLC, detects added cheese whey with certainty only above 1%, treatments with low levels of whey addition (1% and below) were additionally tested as nondetectable cheese whey in both methods, CART and MLP (noted as CART1 and MLP1). The full set of input features comprised protein, casein, lactose, SNF, TS, fat, freezing point (°C), and MUN (CART all features and MLP all features, respectively). A simpler model excluding fat, TS, freezing point, and MUN was also tested because of the lower relative importance of these features.
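The two labeling schemes and the two feature sets described above can be written down explicitly. This is a sketch under assumptions: the identifier names and the class labels "authentic" and "adulterated" are hypothetical, while the whey levels, the 1% cutoff, and the feature lists follow the text.

```python
ALL_FEATURES = ["protein", "casein", "lactose", "SNF",
                "TS", "fat", "freezing_point", "MUN"]
EXCLUDED = {"fat", "TS", "freezing_point", "MUN"}   # lower relative importance
REDUCED_FEATURES = [f for f in ALL_FEATURES if f not in EXCLUDED]


def label(whey_percent, detection_floor=0.0):
    """Class label: adulterated only if whey exceeds the detection floor."""
    return "adulterated" if whey_percent > detection_floor else "authentic"


levels = [0, 1, 2, 5, 10, 15, 20, 25, 30]            # % whey, from the text
standard = [label(p) for p in levels]                 # any added whey counts
variant_1 = [label(p, detection_floor=1.0) for p in levels]  # CART1 / MLP1

print(REDUCED_FEATURES)
print(list(zip(levels, standard, variant_1)))
```

Only the 1% treatment changes class between the two schemes, so any difference in performance between them reflects how well that level can be separated from authentic milk.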

      Statistics

      Statistical analyses of the compositional data included descriptive and multivariate analyses (SPSS 22.0, IBM; JMP 16.0.0, SAS Institute Inc.). Tukey's test was used for post hoc comparison among the treatments at the significance level of 5% (
      • Dean A.
      • Voss D.
      • Draguljić D.
      Design and Analysis of Experiments.
      ).
      Performance metrics were evaluated based on accuracy, precision, sensitivity, and specificity, where
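      \[ \text{Accuracy} = \frac{\text{samples correctly predicted}}{\text{total samples}}, \]

      \[ \text{Precision} = \frac{\text{true positive}}{\text{true positive} + \text{false positive}}, \]

      \[ \text{Sensitivity} = \frac{\text{true positive}}{\text{true positive} + \text{false negative}}, \]

      \[ \text{Specificity} = \frac{\text{true negative}}{\text{true negative} + \text{false positive}}. \]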
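These four metrics follow directly from the counts of a binary confusion matrix, as the short sketch below shows. The function name and the counts are hypothetical and are not results from the study.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, and specificity from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, for illustration only.
m = metrics(tp=90, fp=5, tn=95, fn=10)
print(m)
```

Here "positive" would mean a sample classified as adulterated, so sensitivity measures how many adulterated samples are caught and specificity how many authentic samples are cleared.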


      The receiver operating characteristic curve plots the true positive rate, also known as power, on the y-axis, and the false-positive rate, also known as type I error, on the x-axis. Hence, in a hypothetical situation in which a classification tree perfectly separates the classes, the area under the curve would be 1. On the other hand, if the tree does not classify better than a random process, the area under the curve would be 0.5 (
      • Neto H.A.
      • Tavares W.L.F.
      • Ribeiro D.C.S.Z.
      • Alves R.C.O.
      • Fonseca L.M.
      • Campos S.V.A.
      On the utilization of deep and ensemble learning to detect milk adulteration.
      ).
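The two reference values of the area under the curve (1 for perfect separation, 0.5 for a random classifier) can be reproduced with a small sketch. It uses the rank-based equivalence between the area under the curve and the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one; the scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as P(positive score > negative score), counting ties as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores: perfectly separated classes, then uninformative scores.
print(roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
print(roc_auc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a check of this size but would be replaced by a sort-based computation on large data sets.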

      RESULTS AND DISCUSSION

      Fourier-Transform Infrared Compositional Results

      Compositional FTIR results are shown in Table 1. Cheese whey addition to raw milk reduced the concentration of milk components, except for lactose, whose concentration increased with increasing amounts of added cheese whey (R² = 0.60; P < 0.001). No difference was found for freezing point among the treatments. However, the concentrations of fat, protein, casein, TS, SNF, lactose, and MUN were affected by cheese whey addition (P < 0.05), with a noticeable dilution effect for fat and protein (Table 1). It is important to note that the raw milk samples without cheese whey addition comprised the bulk tank raw milk samples used for the treatments and authentic bulk tank raw milk from routine analysis. These findings are expected because a significant amount of the milk solids will be retained in the curd during renneting. Consequently, whey addition to the milk will result in lower solids concentration due to a dilution effect (
      • Lou Y.
      • Ng-Kwai-Hang K.F.
      Effects of protein and fat levels in milk on cheese and whey compositions.
      ;
      • Cortez M.A.S.
      • Dias V.G.
      • Maia R.G.
      • Costa C.C.A.
      Physicochemical characteristics and sensorial evaluation of pasteurized milk added with water, cheese whey, 0.9% sodium chloride solution and 5.0% dextrose solution.
      ;
      • Condé V.A.
      • Valente G.F.S.
      • Minighin E.C.
      Milk fraud by the addition of whey using an artificial neural network.
      ).
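The dilution effect can be illustrated with a simple linear mixing rule, in which the concentration of a component in the blend is the weighted average of its concentrations in milk and in whey. This is a simplified sketch: it ignores density differences between milk and whey, and the compositions used are hypothetical values chosen for illustration, not measurements from the study.

```python
def mix(c_milk, c_whey, whey_fraction):
    """Component concentration after blending milk and whey (linear mixing)."""
    return (1.0 - whey_fraction) * c_milk + whey_fraction * c_whey


# Hypothetical compositions (g/100 g), for illustration only.
milk_protein, whey_protein = 3.3, 0.8
milk_lactose, whey_lactose = 4.5, 5.0

for fraction in (0.0, 0.10, 0.30):
    protein = mix(milk_protein, whey_protein, fraction)
    lactose = mix(milk_lactose, whey_lactose, fraction)
    print(f"{fraction:.0%} whey: protein {protein:.2f}, lactose {lactose:.2f}")
```

A component that is lower in whey than in milk falls as whey is added, whereas one that is higher in whey rises, which is the pattern described for protein and lactose, respectively.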
      Table 1Composition and freezing point (FP; mean and SD) of raw milk and cheese whey added to raw milk analyzed with Fourier-transform infrared spectroscopy
      Component (g/100 g)Cheese whey added to raw milk
      65 samples for each treatment.
      (% vol/vol)
      01251015202530
      FatMean3.67
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.59
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.57
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.49
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.33
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.19
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.06
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.90
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.78
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      SD0.260.220.210.230.230.230.220.230.22
      ProteinMean3.29
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.42
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.40
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.33
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.22
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.11
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      3.01
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.90
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.80
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      SD0.110.080.080.080.080.070.080.080.08
      CaseinMean2.56
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.67
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.66
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.60
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.50
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.40
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.31
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.22
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      2.13
      Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      SD0.110.080.080.080.080.070.080.080.08
      Solids        Mean   12.46   12.65   12.63   12.50   12.25   12.03   11.85   11.61   11.43
                    SD      0.34    0.25    0.24    0.25    0.27    0.28    0.25    0.28    0.27
      SNF           Mean    8.92    9.33    9.33    9.27    9.17    9.08    9.03    8.93    8.87
                    SD      0.32    0.09    0.09    0.09    0.09    0.09    0.10    0.10    0.11
      Lactose       Mean    4.52    4.67    4.69    4.71    4.74    4.77    4.83    4.85    4.90ª
                    SD      0.14    0.03    0.03    0.03    0.02    0.03    0.03    0.03    0.03
      MUN           Mean   13.42   11.58   11.84   11.62   11.14   11.38   11.28   10.98   10.70
                    SD      3.32    2.14    2.06    2.05    2.17    2.03    1.75    1.89    1.85
      FP (°C)       Mean   0.519   0.526   0.527   0.527   0.525   0.524   0.526   0.525   0.525
                    SD     0.024   0.003   0.002   0.003   0.003   0.004   0.003   0.005   0.005
      a–g Means within a row with different superscripts differ using Tukey's test (P < 0.05).
      1 65 samples for each treatment.
      The spectra of raw milk and of milk adulterated with cheese whey did not show distinct bands, because both absorb at the same wavenumbers and the band positions therefore overlap. However, absorption intensity differed in some of the bands, because the intensity of the vibrational modes is proportional to the concentration of the constituents (Figure 2).
      Figure 2. Fourier-transform infrared spectra of authentic milk and milk adulterated with cheese whey (10 and 30%). Main functional groups are pointed out.
      It is noteworthy that, even at some high levels of cheese whey addition, the average concentration of components remained within the legally accepted range for raw milk (Brazil, 2018; Table 2). For example, protein concentration met the Brazilian legal requirement (at least 2.9 g/100 g) in all samples even after 15% of cheese whey addition to milk. For treatments with 20 and 25% of cheese whey, about 80% of the samples remained within legal parameters. Only at 30% of whey addition were the majority of the samples (98.5%) noncompliant with the legal requirements. Similar trends were found for fat. This is a concerning finding: even at high levels of adulteration with cheese whey, gross composition might remain in the acceptable range, so routine surveillance may fail to detect these samples.
      Table 2. Relative number for samples of raw milk and cheese whey added to raw milk within minimum concentration for selected parameters, analyzed by Fourier-transform infrared spectroscopy

      Component %¹        Cheese whey added to raw milk² (% vol/vol)
      (g/100 g)             0      1      2      5     10     15     20     25     30
      Fat >3.0%           100    100    100    100   98.5   80.0   56.9   20.0   20.0
      Protein >2.9%       100    100    100    100    100    100   80.0   78.5    1.5
      Lactose >4.3%       100    100    100    100    100    100    100    100    100
      SNF >8.4%           100    100    100    100    100    100    100    100    100
      TS >11.4%           100    100    100    100    100    100    100   69.2   33.8

      ¹ Minimum composition according to Brazilian legislation (Brazil, 2018).
      ² 65 samples for each treatment.
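
      The compliance percentages in Table 2 amount to counting, for each component, how many samples of a treatment reach the legal minimum. The following is a minimal sketch of that calculation, not the authors' procedure. The thresholds are those listed in Table 2; the sample values, the function name `percent_compliant`, and the use of "at or above" as the comparison are assumptions made for illustration.

```python
# Minimum composition (g/100 g) as listed in Table 2 (Brazil, 2018).
LEGAL_MINIMUM = {"fat": 3.0, "protein": 2.9, "lactose": 4.3, "snf": 8.4, "ts": 11.4}


def percent_compliant(samples, minimum=LEGAL_MINIMUM):
    """Return, per component, the percentage of samples at or above the minimum."""
    result = {}
    for component, limit in minimum.items():
        n_ok = sum(1 for sample in samples if sample[component] >= limit)
        result[component] = round(100.0 * n_ok / len(samples), 1)
    return result


# Hypothetical FTIR results for four samples of one treatment (illustrative only,
# not data from the study).
samples = [
    {"fat": 3.4, "protein": 3.1, "lactose": 4.6, "snf": 8.9, "ts": 12.3},
    {"fat": 3.1, "protein": 3.0, "lactose": 4.7, "snf": 8.8, "ts": 11.9},
    {"fat": 2.8, "protein": 2.9, "lactose": 4.8, "snf": 8.7, "ts": 11.5},
    {"fat": 2.6, "protein": 2.7, "lactose": 4.9, "snf": 8.6, "ts": 11.2},
]

print(percent_compliant(samples))
```

      A sample can pass every one of these checks and still contain added whey, which is why fixed compositional limits are a weak screen on their own.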
      Similar results were found in another study, in which average protein values stayed within the legal range even after the addition of 30% cheese whey to milk, despite a tendency toward lower concentration. Lactose concentration, however, showed a slight reduction. Fat values were 2.7% after the addition of 15% cheese whey. This decreases the chances of identifying fraudulent addition of cheese whey to milk (Cortez et al., 2010).
      In several countries, HPLC based on the CMP index is the standard method to detect cheese whey addition to dairy products (Olieman and Bedem, 1983; Olieman and Riel, 1989; Brazil, 2019). However, some reports have demonstrated accuracy and performance problems with this method due to several factors, such as whey acidity and storage conditions. For more reliable results using the CMP method, milk samples should be analyzed immediately or frozen until analysis. Otherwise, proteases of microbial origin may hydrolyze κ-CN close to the cleavage point of chymosin, which results in pseudo-CMP formation and, consequently, overestimation of the cheese whey addition (de Pádua Alves et al., 2018; Raymundo et al., 2018; Lobato et al., 2020). Hence, an alternative method that is not susceptible to such factors is a major need for the dairy industry.
      Because of this potential pseudo-CMP production due to quality problems, some countries establish acceptable limits for CMP. For example, the Brazilian limit for CMP is up to 30 mg/L for an equivalent of liquid milk (Brazil, 2019). Nevertheless, raw milk CMP levels remain within this legal limit even after 1% of cheese whey addition to raw milk. Based on this fact, additional predictive and classification models were processed with samples containing 1% of added cheese whey assigned as “no whey detected.” In fact, among the evaluated predictive methods, the best prediction results were reached when the treatment with 1% of whey added to milk was treated as nondetectable.
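
      The relabeling described above can be sketched as a simple rule that maps each treatment level to a binary class. This is an illustrative sketch only: the function name `label_sample` and the parameter `detectable_from` are hypothetical, and the cutoff of 2% is an assumption taken from the table captions, which list 2 to 30% as the whey-added class.

```python
def label_sample(whey_percent, detectable_from=2.0):
    """Binary class label; levels below `detectable_from` count as no whey detected."""
    if whey_percent >= detectable_from:
        return "whey added"
    return "no whey detected"


# Treatment levels used in the study (% vol/vol); 0 is the control.
levels = [0, 1, 2, 5, 10, 15, 20, 25, 30]
labels = {level: label_sample(level) for level in levels}

for level, label in labels.items():
    print(f"{level:>2}% -> {label}")
```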

      Classification and Regression Trees Classification

      The CART method was processed with all input features, except milk urea nitrogen. The nodes that are mostly blue indicate a strong proportion of the event level (chance of whey addition to milk), contrasting with the mostly red nodes, which indicate a strong proportion of the nonevent level (chance of authentic milk; Figure 3).
      Figure 3. Example of CART (classification and regression trees) classification method for milk and milk with cheese whey added based on composition obtained using Fourier-transform infrared spectroscopy. Node view is complete and part is amplified below. Gray nodes in the node view presented a stronger influence (rectangular part below is amplified from tree above).
      Although the CART model draws on several variables with positive importance, the relative rankings indicate how many of these variables are needed for a given application, because the change in relative importance from one variable to the next helps to decide which variables to control or monitor. This metric helps to explain the predictive power of each feature in the data set. Relative importance values range from 0 to 100%, and the most important variable is assigned a relative importance of 100%. Variables with low relative importance contribute little and are automatically eliminated from the tree.
      For example, in these data, the most important predictor for this model was lactose concentration, with a relative importance of 100%, compared with protein, which had a relative importance of 60.8%. This means that the importance of protein is about 60% of that of lactose in this classification tree. The feature with the lowest relative importance was MUN (18.8%). The misclassification cost for this simulation was 0.027 for the training samples and 0.054 for the test samples. The most accurate tree is the one with the lowest misclassification cost. Misclassification may occur when a feature that is not suitable for classification is selected (IBM SPSS Modeler 18.2).
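
      Relative importance on a 0 to 100% scale can be obtained by dividing each raw importance score by the largest one. The sketch below illustrates that scaling under stated assumptions: the raw scores are hypothetical numbers chosen only so that the scaled values match the percentages quoted above, and `relative_importance` is an illustrative helper, not a function of the software used in the study.

```python
def relative_importance(raw_scores):
    """Scale raw importance scores so that the top feature is 100%."""
    top = max(raw_scores.values())
    if top <= 0:
        raise ValueError("at least one feature must have positive importance")
    return {name: round(100.0 * score / top, 1) for name, score in raw_scores.items()}


# Hypothetical raw importance scores (arbitrary units, not taken from the study).
raw = {"lactose": 500, "protein": 304, "mun": 94}

rel = relative_importance(raw)
# Rank features from most to least important.
for name, value in sorted(rel.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<8} {value:5.1f}%")
```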
      The receiver operating characteristic (ROC) curve is an important tool for visualizing the performance of the method (Figure 4). With an area under the curve of 0.994 for the training data and 0.980 for the test data, the ROC curve indicates very high classification performance, so the model may be applied for prediction purposes, because it presents high levels of correct predictions for each class (Dunnmon et al., 2019; Neto et al., 2019).
      Figure 4. Receiver operating characteristic (ROC) curve for CART (classification and regression trees) of milk and milk with cheese whey added, and features based on composition (g/100 g) of fat, protein, lactose, TS, SNF, and casein, and freezing point (°C), measured with Fourier-transform infrared spectroscopy.
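
      The area under the ROC curve can be read as the probability that a randomly chosen adulterated sample receives a higher score than a randomly chosen authentic sample. The sketch below computes it by direct pairwise comparison. It is a minimal illustration with hypothetical labels and scores, not the computation performed by the software used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = whey added, 0 = raw milk; scores are model outputs.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]

print(round(roc_auc(labels, scores), 3))
```

      A value of 1.0 means that every adulterated sample scores above every authentic one, and 0.5 corresponds to random ranking.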
      The high classification performance can be confirmed using the classification matrix (Table 3), which indicates high rates of correctly predicted samples and low rates of misclassification in both the training and the test sets. Correct predictions to detect tainted samples were as high as 96.2% in the training set and 97.2% in the test set of samples.
      Table 3. Classification matrix for a CART (classification and regression tree) with fat, protein, casein, lactose, TS, SNF, and freezing point as input features, and binomial output as raw milk and cheese whey added to raw milk (2, 5, 10, 15, 20, 25, and 30%)

      Sample       Observed           Predicted
                                      Raw milk    Whey added    Percent correct
      Training     Raw milk           365         10            97.3
                   Whey added         10          250           96.2
                   Overall percent                              96.9
      Validation   Raw milk           141         12            92.2
                   Whey added         3           104           97.2
                   Overall percent                              94.2
      Testing      Raw milk           121         1             99.2
                   Whey added         8           35            94.3
                   Overall percent                              97.1
      The best model was applied to the test set of samples (Figure 5) and performed well, with an accuracy of 0.962 and precision, sensitivity, and specificity of 0.965, 0.943, and 0.975, respectively.
      Figure 5. Test performance of CART (classification and regression trees) method to detect cheese whey added to raw milk based on compositional data obtained with Fourier-transform infrared spectroscopy method.
      Decision tree algorithms such as CART provide a better understanding of the whole classification process and also give meaningful information about each feature, such as feature importance. Ensemble methods, such as random forest, would make the algorithm's decisions much more complex to explain. This matters even more in our proposed application, milk screening in laboratories, where a simple and explainable solution is desired. The main advantages of decision trees are that they are easy to visualize and interpret, can handle all types of predictors, work well when the relationship between variables is nonlinear, and make no assumption about the distribution of the variables (decision tree learning is a nonparametric method). However, disadvantages may include the possibility of overfitting the training set and lower predictive accuracy in the holdout (test) set (Miller and Miller, 2010).
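
      Part of what makes a tree easy to interpret is that each node applies a single threshold to a single feature. The sketch below shows how one such threshold could be chosen by minimizing the weighted Gini impurity of the two child nodes. It is a simplified illustration: the Gini criterion is assumed here rather than confirmed for the model in this work, and the lactose values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a binary label list (0 = raw milk, 1 = whey added)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)  # equals 1 - p**2 - (1 - p)**2


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical lactose values (g/100 g) and class labels, for illustration only.
lactose = [4.50, 4.52, 4.55, 4.70, 4.75, 4.85]
is_adulterated = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(lactose, is_adulterated)
print(threshold, impurity)
```

      A full tree repeats this search over every feature at every node, which is also where the risk of overfitting mentioned above comes from.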

      Multilayer Perceptron Networks

      The same trends were found using MLP networks, with the best results for the MLP with protein, casein, lactose, SNF, TS, and freezing point as input features and the 1% cheese whey treatment assigned as “no detectable whey” in the training data set. As in the CART, MUN was eliminated as a feature because it worsened the performance of the models (Figure 6). The neural network architecture is exemplified with 8 input neurons related to the components of milk and of milk with cheese whey, plus an additional input neuron for bias. Each neuron in the input layer is connected to every neuron in the second (hidden) layer, but neurons within the same layer are not interconnected. The output of this second layer, which also has a bias node, is processed for the final classification as raw milk or raw milk with added cheese whey (Skansi, 2018).
      Figure 6. Network diagram for a multilayer perceptron network with fat, protein, casein, lactose, TS, SNF, and freezing point (FP) as input features, and binomial output as raw milk and raw milk with cheese whey added (2, 5, 10, 15, 20, 25, and 30%). H = hidden layer.
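
      The layered structure described above (inputs plus bias, one hidden layer plus bias, one output) corresponds to the forward pass sketched below. This is an illustration under assumptions, not the trained network: the hidden layer size, the tanh and sigmoid activations, and all weights and input values are hypothetical, and the inputs are assumed to be standardized versions of the seven features named in the Figure 6 caption.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> tanh hidden layer -> sigmoid output probability."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical standardized inputs in the order:
# fat, protein, casein, lactose, TS, SNF, freezing point.
x = [0.2, -0.5, -0.4, 1.1, -0.8, -0.3, 0.1]

# Hypothetical weights: two hidden neurons, each connected to all seven inputs.
w_hidden = [
    [0.1, -0.3, -0.2, 0.9, -0.4, -0.1, 0.2],
    [-0.2, 0.4, 0.3, -0.7, 0.5, 0.2, -0.1],
]
b_hidden = [0.05, -0.05]  # bias terms of the hidden layer
w_out = [1.5, -1.5]       # hidden-to-output weights
b_out = 0.0               # bias term of the output neuron

probability = forward(x, w_hidden, b_hidden, w_out, b_out)
label = "whey added" if probability >= 0.5 else "raw milk"
print(round(probability, 3), label)
```

      Training consists of adjusting the weights and biases so that this output matches the known class of each training sample.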
      The MLP model created with raw milk added with different levels of cheese whey was validated with real samples analyzed in the laboratory routine.
      The training set resulted in 2.6% of incorrect predictions, whereas for the testing set, incorrect predictions were 1.6%, as shown in Table 4. These results indicate that this MLP network has excellent predictive power, with strong potential for testing real samples, as further indicated. To our knowledge, reports of the use of machine learning methods to detect fraudulent addition of cheese whey to milk are scarce. The use of artificial neural networks was reported elsewhere, using the compositional results of routine analyses of milk samples as input variables. Cheese whey was added to milk at levels of 0, 5, 10, and 20%, and samples were analyzed for fat, SNF, density, protein, lactose, minerals, and freezing point, totaling 164 samples, of which 60% were used for network training, 20% for network validation, and 20% for neural network testing. Although the authors stated that the use of neural networks proved to be efficient, they suggested the use of a more accurate method to confirm the fraud. Despite being a model for quantification, model performance metrics were not presented (Condé et al., 2020).
      Table 4. Classification matrix for a multilayer perceptron network with protein, casein, lactose, TS, fat composition, and freezing point as input features, and binomial outputs as raw milk and cheese whey added to raw milk (2, 5, 10, 15, 20, 25, and 30%)

      Sample       Observed           Predicted
                                      Raw milk    Whey added    Percent correct
      Training     Raw milk           358         8             97.8
                   Whey added         8           249           96.9
                   Overall percent                              97.4
      Validation   Raw milk           153         1             99.4
                   Whey added         3           94            96.9
                   Overall percent                              98.4
      Testing      Raw milk           126         4             96.9
                   Whey added         1           100           99.0
                   Overall percent                              97.8
      The implemented models were applied to data sets randomly split in the same ratios, and MLP and CART gave similar performance outcomes. The MLP model, with 1% of added whey treated as nondetectable, was applied to the test set of samples; total accuracy, precision, sensitivity, and specificity were 0.978, 0.96, 0.99, and 0.973, respectively, which indicates very good performance for a screening method (Figure 7). It is important to note that current FTIR analytical equipment for raw milk can process up to 600 samples/h. Therefore, a screening method with such performance, combined with database processing using this type of MLP algorithm, would be an important tool for stricter surveillance of suspected farms and dairy plants, and would reduce the use of more expensive and time-consuming precision methods, such as HPLC.
      Figure 7. Test performance of multilayer perceptron networks (MLP) to detect cheese whey added to raw milk based on compositional data.
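
      Accuracy, precision, sensitivity, and specificity all follow from the four cells of a classification matrix. The sketch below computes them from the counts in the testing rows of Table 4, taking "whey added" as the positive class (an assumption about how the metrics were defined). The helper name `screening_metrics` is illustrative, and values computed this way need not agree with the reported figures to the last digit.

```python
def screening_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),    # share of flagged samples that are adulterated
        "sensitivity": tp / (tp + fn),  # share of adulterated samples that are flagged
        "specificity": tn / (tn + fp),  # share of raw milk samples that are cleared
    }


# Testing rows of Table 4: raw milk 126 correct and 4 flagged as whey added;
# whey added 100 correct and 1 predicted as raw milk.
metrics = screening_metrics(tp=100, fn=1, fp=4, tn=126)

for name, value in metrics.items():
    print(f"{name:<12} {value:.3f}")
```

      For a screening method, sensitivity is usually the figure to watch, because samples that are wrongly cleared never reach the confirmatory analysis.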
      Although we did not find reports using mid-infrared FTIR associated with neural networks to detect cheese whey addition to milk, other techniques have been studied. For example, radial basis function and MLP networks were applied to analytical results of milk and of milk with added cheese whey obtained using an ultrasound analyzer. Classification error was reported as less than 5%; however, the number of samples was limited (101, 33, and 33 samples for training, validation, and testing, respectively; Valente et al., 2014).
      None of the previously reported studies that evaluated the identification of cheese whey fraud in raw milk by FTIR spectroscopy used a similar machine learning methodology or obtained results superior to those of this work.
      An overall strength of both methods is the use of compositional data, easily obtained through FTIR or other analytical methods, and the optimal performance without additional data preprocessing. However, the raw milk samples represented a specific population of cows with a different genetic ratio of Holstein and Gyr cattle. Hence, milk of different origin might require a different structural approach for the evaluated machine learning methods. It is important to note that this work was aimed at bulk raw milk, not milk from individual cows, whose composition is more variable.

      CONCLUSIONS

      The CART and MLP network, associated with milk features predicted with FTIR spectroscopy, presented high-performance metrics to detect cheese whey added to raw milk, with high levels of correctly predicted samples and reduced misclassifications. Such performance is practically relevant because it might allow future implementation of both techniques in a laboratory routine for milk quality analysis to screen suspected milk samples, which can be directed for complementary analyses to confirm fraud.

      ACKNOWLEDGMENTS

      The authors acknowledge the Milk Quality Analysis Laboratory of the Veterinary School, Universidade Federal de Minas Gerais (LabUFMG; ISO/IEC 17025 accredited) for the continuous support. FAPEMIG–APQ-02740-17; CNPq; FINEP (Financiadora de Estudos e Projetos), Brazil. The authors have not stated any conflicts of interest.

      REFERENCES

        • Alpaydin E. Introduction to Machine Learning. 3rd ed. The MIT Press, 2014
        • Alves da Rocha R., Paiva I.M., Anjos V., Furtado M.A.M., Bell M.J.V. Quantification of whey in fluid milk using confocal Raman microscopy and artificial neural network. J. Dairy Sci. 2015; 98 (25828656): 3559-3567
        • Andreatta E., Fernandes A.M., Santos M.V., Mussarelli C., Marques M.C., Gigante M.L., Oliveira C.A.F. Qualidade de queijo minas frescal preparado com leite com diferentes quantidades de células somáticas. Pesqui. Agropecu. Bras. 2009; 44: 320-326
        • Bramer M. Principles of Data Mining. in: Undergraduate Topics in Computer Science. 3rd ed. Springer, 2016
        • Brandao M.C.M.P., Carmo A.P., Bell M.J.V., Anjos V.C. Characterization of milk by infrared spectroscopy. Rev. Inst. Laticínios Cândido Tostes. 2010; 65: 30-33
        • Brazil. Diário Oficial da União. Vol. 76/2018. Brasília, 2018
        • Brazil. Manual de métodos oficiais para análise de alimentos de origem animal. 2nd ed. Rev. ed. Brasília, 2019
        • Breiman L., Friedman J., Stone C.J., Olshen R.A. Classification and Regression Trees. Taylor & Francis, 1984
        • Brito R.F., Rodrigues R., Diniz S.A., Fonseca L.M., Leite M.O., Souza M.R., Conrrado R.S., Veríssimo S.A.O., Valente G.L.S., Cerqueira M.M.O.P. Analysis of the freezing point of milk by precision method and by Fourier transform infrared (FTIR) spectroscopy. Arq. Bras. Med. Vet. Zootec. 2020; 72: 1713-1718
        • Conceição D., Gonçalves B.-H., da Hora F., Faleiro A., Santos L., Ferrão S. Use of FTIR-ATR spectroscopy combined with multivariate analysis as a screening tool to identify adulterants in raw milk. J. Braz. Chem. Soc. 2019; 30: 780-785
        • Condé V.A., Valente G.F.S., Minighin E.C. Milk fraud by the addition of whey using an artificial neural network. Cienc. Rural. 2020; 50: e20190312
        • Cortez M.A.S., Dias V.G., Maia R.G., Costa C.C.A. Physicochemical characteristics and sensorial evaluation of pasteurized milk added with water, cheese whey, 0.9% sodium chloride solution and 5.0% dextrose solution. Rev. Inst. Laticínios Cândido Tostes. 2010; 65: 18-25
        • de Carvalho B.M.A., de Carvalho L.M., dos Reis Coimbra J.S., Minim L.A., de Souza Barcellos E., da Silva Júnior W.F., Detmann E., de Carvalho G.G.P. Rapid detection of whey in milk powder samples by spectrophotometric and multivariate calibration. Food Chem. 2015; 174 (25529644): 1-7
        • de La Fuente M.A., Juárez M. Authenticity assessment of dairy products. Crit. Rev. Food Sci. Nutr. 2005; 45 (16371328): 563-585
        • de Pádua Alves É., de Alcântara A.L.D.A., Guimarães A.J.K., de Santana E.H.W., Botaro B.G., Fagnani R. Milk adulteration with acidified rennet whey: A limitation for caseinomacropeptide detection by high-performance liquid chromatography. J. Sci. Food Agric. 2018; 98 (29277909): 3994-3996
        • Dean A., Voss D., Draguljić D. Design and Analysis of Experiments. in: Springer Texts in Statistics. 2nd ed. Springer, 2017
        • Delta Instruments. Datascope with 20 FT components. Delta Instruments, 2009
        • Dunnmon J.A., Yi D., Langlotz C.P., Ré C., Rubin D.L., Lungren M.P. Assessment of convolutional neural networks for automated classification of chest radiographs. Radiology. 2019; 290 (30422093): 537-544
        • Ertel W. Introduction to Artificial Intelligence. in: Undergraduate Topics in Computer Science. 2nd ed. Springer, 2017
        • Ghaffarian S., van der Voort M., Valente J., Tekinerdogan B., de Mey Y. Machine learning-based farm risk management: A systematic mapping review. Comput. Electron. Agric. 2022; 192: 106631
        • Gondim C.S., Junqueira R.G., Souza S.V.C., Ruisánchez I., Callao M.P. Detection of several common adulterants in raw milk by MID-infrared spectroscopy and one-class and multi-class multivariate strategies. Food Chem. 2017; 230 (28407966): 68-75
        • Guiné R.P.F., Ferrão A.C., Ferreira M., Correia P., Mendes M., Bartkiene E., Szűcs V., Tarcea M., Sarić M.M., Černelič-Bizjak M., Isoldi K., El-Kenawy A., Ferreira V., Klava D., Korzeniowska M., Vittadini E., Leal M., Frez-Muñoz L., Papageorgiou M., Djekić I. Influence of sociodemographic factors on eating motivations - modelling through artificial neural networks (ANN). Int. J. Food Sci. Nutr. 2020; 71 (31771374): 614-627
        • Handford C.E., Campbell K., Elliott C.T. Impacts of milk fraud on food safety and nutrition with special emphasis on developing countries. Compr. Rev. Food Sci. Food Saf. 2016; 15 (33371582): 130-142
        • Hansen L., Ferrão M.F. Classification of milk samples using CART. Food Anal. Methods. 2020; 13: 13-20
        • Ho P.N., Luke T.D.W., Pryce J.E. Validation of milk mid-infrared spectroscopy for predicting the metabolic status of lactating dairy cows in Australia. J. Dairy Sci. 2021; 104 (33551158): 4467-4477
        • ISO/IDF
        ISO 9622/IDF 141, Milk and liquid milk products — Guidelines for the application of mid-infrared spectrometry.
        2nd ed. ISO, 2013
        • James G., Witten D., Hastie T., Tibshirani R. An introduction to statistical learning: With applications in R. Springer, 2017
        • Koo T., Kim M., Jue M. Automated detection of superficial fungal infections from microscopic images through a regional convolutional neural network. PLoS One. 2021; 16 (34403443): e0256290
        • Kubat M. An Introduction to Machine Learning. 2nd ed. Springer, 2017
        • Lakretz Y., Hupkes D., Vergallito A., Marelli M., Baroni M., Dehaene S. Mechanisms for handling nested dependencies in neural-network language models and humans. Cognition. 2021; 213 (33941375): 104699
        • Lenardon L., Meneghini L.Z., Hoff R.B., Motta T.M.C., Pizzolato T.M., Ferrão M.F., Bergold A.M. Determination of caseinomacropeptide in Brazilian bovine milk by high-performance liquid chromatography-mass spectrometry. Anal. Lett. 2017; 50: 2068-2077
        • Liakos K.G., Busato P., Moshou D., Pearson S., Bochtis D. Machine Learning in Agriculture: A Review. Sensors (Basel). 2018; 18 (30110960): 2674
        • Lobato P.R., Heringer J.P.M., Fortini M.E.R., Ferreira L.F., Feijó F.A.C., Leite M.O., Cerqueira M.M.O.P., Penna C.A.M., Souza M.R., Fonseca L.M. Índice de CMP em leite pasteurizado comercializado em Minas Gerais, Brasil, durante os anos de 2011 a 2017. Arq. Bras. Med. Vet. Zootec. 2020; 72: 641-646
        • Lou Y., Ng-Kwai-Hang K.F. Effects of protein and fat levels in milk on cheese and whey compositions. Food Res. Int. 1992; 25: 445-451
        • Ma L., Kashanj S., Xu S., Zhou J., Nobes D.S., Ye M. Flow reconstruction and prediction based on small particle image velocimetry experimental datasets with convolutional neural networks. Ind. Eng. Chem. Res. 2022; 61: 8504-8519
        • Miller J.N., Miller J.C. Statistics and Chemometrics for Analytical Chemistry. Pearson Education Limited, 2010
        • Morota G., Ventura R.V., Silva F.F., Koyama M., Fernando S.C. Big data analytics and precision agriculture symposium: Machine learning and data mining advance predictive big data analysis in precision animal agriculture. J. Anim. Sci. 2018; 96 (29385611): 1540-1550
        • Neto H.A., Tavares W.L.F., Ribeiro D.C.S.Z., Alves R.C.O., Fonseca L.M., Campos S.V.A. On the utilization of deep and ensemble learning to detect milk adulteration. BioData Min. 2019; 12 (31320927): 13
        • Olieman K., Bedem J. A sensitive HPLC method of detecting and estimating rennet whey total solids in skim milk powder. Int. Dairy J. 1983; 37: 27-36
        • Olieman K., Riel J. Detection of rennet whey solids in skim milk and buttermilk powder with reversed-phase HPLC. Neth. Milk Dairy J. 1989; 43: 171-184
        • Oliveira M.C.P.P., Silva N.M.A., Bastos L.P.F., Fonseca L.M., Cerqueira M.M.O.P., Leite M.O., Conrrado R.S. Fourier transform infrared spectroscopy (FTIR) for MUN analysis in normal and adulterated milk. Arq. Bras. Med. Vet. Zootec. 2012; 64: 1360-1366
        • Poonia A., Jha A., Sharma R., Singh H.B., Rai A.K., Sharma N. Detection of adulteration in milk: A review. Int. J. Dairy Technol. 2017; 70: 23-42
        • Raymundo N.K.L.
        • Daguer H.
        • Osaki S.C.
        • Bersot L.S.
        Correlating mesophilic counts to the pseudo-CMP content of raw milk.
        Arq. Bras. Med. Vet. Zootec. 2018; 70: 1660-1664
        • Robim M.S.
        • Cortez M.A.S.
        • Silva A.C.O.
        • Filho R.A.T.
        • Gemal N.H.
        • Nogueira E.B.
        Research fraud in UHT whole milk marketed in the state of Rio de Janeiro and comparison between the methods of physicochemical officers and the method of ultrasound.
        Rev. Inst. Laticínios Cândido Tostes. 2012; 67: 43-50
        • Schiassi E.
        • Furfaro R.
        • Leake C.
        • De Florio M.
        • Johnston H.
        • Mortari D.
        Extreme theory of functional connections: A fast physics-informed neural network method for solving ordinary and partial differential equations.
        Neurocomputing. 2021; 457: 334-356
        • Skansi S.
        Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence.
        Springer, 2018
        • Tibola C.S.
        • da Silva S.A.
        • Dossa A.A.
        • Patrício D.I.
        Economically motivated food fraud and adulteration in Brazil: Incidents and alternatives to minimize occurrence.
        J. Food Sci. 2018; 83: 2028-2038 (PMID: 30020548)
        • Valente G.F.S.
        • Guimarães D.C.
        • Gaspardi A.L.A.
        • Oliveira L.A.
        Applying artificial neural networks as a test to detect milk fraud by whey addition.
        Rev. Inst. Laticínios Cândido Tostes. 2014; 69: 425-432
        • Vinciguerra L.L.
        • Marcelo M.C.A.
        • Motta T.M.C.
        • Meneghini L.Z.
        • Bergold A.M.
        • Ferrão M.F.
        Chemometric tools and FTIR-ATR spectroscopy applied in milk adulterated with cheese whey.
        Quim. Nova. 2019; 42: 249-254
        • Witten I.
        • Frank E.
        • Hall M.
        • Pal C.
        Data Mining: Practical Machine Learning Tools and Techniques.
        4th ed. Morgan Kaufmann, 2016
        • Wu X.
        • Kumar V.
        • Ross Quinlan J.
        • Ghosh J.
        • Yang Q.
        • Motoda H.
        • McLachlan G.J.
        • Ng A.
        • Liu B.
        • Yu P.S.
        • Zhou Z.-H.
        • Steinbach M.
        • Hand D.J.
        • Steinberg D.
        Top 10 algorithms in data mining.
        Knowl. Inf. Syst. 2008; 14: 1-37
        • Xu X.
        • Zhang Y.
        Corn cash price forecasting with neural networks.
        Comput. Electron. Agric. 2021; 184: 106120
        • Zhu X.
        • Zheng B.
        • Cai W.
        • Zhang J.
        • Lu S.
        • Li X.
        • Xi L.
        • Kong Y.
        Deep learning-based diagnosis models for onychomycosis in dermoscopy.
        Mycoses. 2022; 65: 466-472 (PMID: 35119144)