Research| Volume 105, ISSUE 9, P7242-7252, September 2022

# Simultaneous detection for adulterations of maltodextrin, sodium carbonate, and whey in raw milk using Raman spectroscopy and chemometrics

Open Access. Published: July 18, 2022

## ABSTRACT

To achieve rapid on-site identification of raw milk adulteration and simultaneously quantify the levels of various adulterants, we combined Raman spectroscopy with chemometrics to detect 3 of the most common adulterants. Raw milk was artificially adulterated with maltodextrin (0.5–15.0%; wt/wt), sodium carbonate (10–100 mg/kg), or whey (1.0–20.0%; wt/wt). A partial least square discriminant analysis (PLS-DA) classification model and a partial least square (PLS) regression model were established using Raman spectra of 144 samples, of which 108 were used for training and 36 for validation. A model with excellent performance was obtained by spectral preprocessing with the first derivative and variable selection with variable importance in the projection. The classification accuracy of the PLS-DA model was 95.83% for maltodextrin, 100% for sodium carbonate, 95.84% for whey, and 92.25% for pure raw milk. The PLS model had a detection limit of 1.46% for maltodextrin, 4.38 mg/kg for sodium carbonate, and 2.64% for whey. These results suggest that Raman spectroscopy combined with PLS-DA and PLS models can rapidly and efficiently detect adulteration of raw milk with maltodextrin, sodium carbonate, and whey.

## INTRODUCTION

In light of the increasing types of milk products available (such as milk powder, yogurt, and cheese) and the boosted consumption of dairy products, the quality and safety of raw milk are regarded as the cornerstone for the development of milk-based products (
• He H.
• Sun D.W.
• Pu H.
• Chen L.
• Lin L.
Applications of Raman spectroscopic techniques for quality and safety evaluation of milk: A review of recent developments.
). Milk adulteration with the addition of various chemicals such as melamine, caustic soda, formalin, and hydrogen peroxide is still very common; however, these chemicals have the potential to cause serious health-related problems (
• Kamal M.
• Karoui R.
Analytical methods coupled with chemometric tools for determining the authenticity and detecting the adulteration of dairy products: A review.
;
• Abdallah Musa Salih M.
• Yang S.
Common milk adulteration in developing countries cases study in China and Sudan: A review.
;
• Poonia A.
• Jha A.
• Sharma R.
• Singh H.B.
• Rai A.K.
• Sharma N.
Detection of adulteration in milk: A review.
;
• He H.
• Sun D.W.
• Pu H.
• Chen L.
• Lin L.
Applications of Raman spectroscopic techniques for quality and safety evaluation of milk: A review of recent developments.
). The purpose of adulteration is to increase volume of products, to compensate for undesirable factors for consumption, to enhance the content of protein and fat, as well as to improve economic benefits (
• Moore J.C.
• Spink J.
• Lipp M.
Development and application of a database of food ingredient fraud and economically motivated adulteration from 1980 to 2010.
;
• Cattaneo T.M.P.
• Holroyd S.E.
New applications of near infrared spectroscopy on dairy products.
;
• Santos P.M.
• Pereira-Filho E.R.
• Rodriguez-Saona L.E.
Rapid detection and quantification of milk adulteration using infrared microspectroscopy and chemometrics analysis.
). Adding water to raw milk can be masked by an addition of a thickener, such as maltodextrin (
• Tronco V.M.
Manual para inspeção da qualidade do leite.
;
• Capuano E.
• Boerrigter-Eenling R.
• Koot A.
• van Ruth S.M.
Targeted and untargeted detection of skim milk powder adulteration by near-infrared spectroscopy.
;
• de Souza Gondim C.
• Cesar Santos de Souza R.
• de Paula Penna e Palhares M.
• Junqueira R.G.
• Carvalho de Souza S.V.
Performance improvement and single laboratory validation of classical qualitative methods for the detection of adulterants in milk: Starch, chlorides and sucrose.
;
• Bergana M.M.
• Harnly J.
• Moore J.C.
• Xie Z.
Non-targeted detection of milk powder adulteration by 1H NMR spectroscopy and conformity index analysis.
). The lactic acid created during long-term storage can significantly affect the quality of raw milk; therefore, neutralizers such as sodium carbonate or sodium bicarbonate are added to reduce acidity (
• de Souza Gondim C.
• Cesar Santos de Souza R.
• de Paula Penna e Palhares M.
• Junqueira R.G.
• Carvalho de Souza S.V.
Performance improvement and single laboratory validation of classical qualitative methods for the detection of adulterants in milk: Starch, chlorides and sucrose.
;
• Chakraborty M.
• Biswas K.
Limit of detection for five common adulterants in milk: A study with different fat percent.
). In Brazil, the addition of cheese whey to milk has been widely reported to increase the volume and fat content without significantly altering the sensory characteristics (
• Aquino L.
• Silva A.
• Freitas M.Q.
• Felicio T.L.
• Cruz A.G.
• Conte-Junior C.A.
Identifying cheese whey an adulterant in milk: Limited contribution of a sensometric approach.
;
• Farah J.S.
• Cavalcanti R.N.
• Guimarães J.T.
• Balthazar C.F.
• Coimbra P.T.
• Pimentel T.C.
• Esmerino E.A.
• Duarte M.C.K.H.
• Freitas M.Q.
• Granato D.
• Neto R.P.C.
• Tavares M.I.B.
• Silva M.C.
• Cruz A.G.
Differential scanning calorimetry coupled with machine learning technique: An effective approach to determine the milk authenticity.
). In practice, adulterants are rarely added to raw milk one at a time; for instance, adding whey lowers the density of raw milk, which calls for maltodextrin to restore it, and sodium carbonate may be added at the same time to extend shelf life.
However, most of the current literature on detecting raw milk adulteration by Raman spectroscopy (RS) focuses on a single adulterant or on multiple adulterants of the same type. Confocal Raman microscopy and an artificial neural network have been used to quantify whey in milk (
• Alves da Rocha R.
• Paiva I.M.
• Anjos V.
• Bell M.J.
Quantification of whey in fluid milk using confocal Raman microscopy and artificial neural network.
). Several studies have reported an effective screening method for detecting melamine, dicyandiamide, ammonium sulfate, and urea in milk (
• Nieuwoudt M.K.
• Holroyd S.E.
• McGoverin C.M.
• Simpson M.C.
• Williams D.E.
Raman spectroscopy as an effective screening method for detecting adulteration of milk with small nitrogen-rich molecules and sucrose.
,
• Nieuwoudt M.K.
• Holroyd S.E.
• McGoverin C.M.
• Simpson M.C.
• Williams D.E.
Screening for adulterants in liquid milk using a portable Raman miniature spectrometer with immersion probe.
). This work is the first in which 3 types of adulterants are detected simultaneously by RS; detecting only a single adulterant at a time is inefficient. A good detection method is also key to improving the efficiency of adulteration detection. Liquid chromatography (
• MacMahon S.
• Begley T.H.
• Diachenko G.W.
• Stromgren S.A.
A liquid chromatography–Tandem mass spectrometry method for the detection of economically motivated adulteration in protein-containing foods.
), infrared spectroscopy (
• Kene Ejeahalaka K.
• On S.L.W.
Effective detection and quantification of chemical adulterants in model fat-filled milk powders using NIRS and hierarchical modelling strategies.
), front-face fluorescence spectroscopy (
• Ullah R.
• Khan S.
• Ali H.
• Bilal M.
Potentiality of using front face fluorescence spectroscopy for quantitative analysis of cow milk adulteration in buffalo milk.
), and RS (
• Xu Y.
• Zhong P.
• Jiang A.
• Shen X.
• Li X.
• Xu Z.
• Shen Y.
• Sun Y.
• Lei H.
Raman spectroscopy coupled with chemometrics for food authentication: A review.
) have been used to detect milk adulteration, among which RS is more popular because it is nondestructive, rapid, and does not require sample pretreatment.
Chemometric methods can provide data processing, variable selection, feature extraction, and pattern recognition. Compared with other chemometric methods, support-vector machines and partial least square (PLS) discriminant analysis (PLS-DA) classification models are more accurate (
• Jiménez-Carvelo A.M.
• Osorio M.T.
• Koidis A.
Chemometric classification and quantification of olive oil in blends with any edible vegetable oils using FTIR-ATR and Raman spectroscopy.
). However, support-vector machines are highly time-consuming algorithms and are prone to overfitting (
• Xu N.
• Liu M.H.
• Yuan H.
• Huang S.G.
• Song Y.X.
Classification of sulfadimidine and sulfapyridine in duck meat by surface enhanced Raman spectroscopy combined with principal component analysis and support vector machine.
). Conversely, PLS-DA is a model with simple operation and excellent performance; specifically, near-infrared spectroscopy combined with PLS-DA presents 100% sensitivity and specificity in calibration, cross-validation, and prediction in detecting water, urea, bovine whey, and cow milk in goat milk samples (
• Teixeira J.L.P.
• Caramês E.T.S.
• Baptista D.P.
• Gigante M.L.
• Pallone J.A.L.
Vibrational spectroscopy and chemometrics tools for authenticity and improvement the safety control in goat milk.
). Additionally, PLS is one of the methods most widely used with vibrational spectroscopy to quantify various components.
• Pereira E.V.S.
• Fernandes D.D.S.
• de Araújo M.C.U.
• Diniz P.H.G.D.
• Maciel M.I.S.
Simultaneous determination of goat milk adulteration with cow milk and their fat and protein contents using NIR spectroscopy and PLS algorithms.
used near-infrared spectroscopy and PLS algorithms to quantify goat milk adulteration by adding cow milk, reaching a linear correlation coefficient of the cross-validation set (RCV) of 0.9996 and a linear correlation coefficient of the prediction set (RPred) of 0.9955. Nevertheless, no studies have combined RS with PLS-DA and PLS to simultaneously identify and quantify adulterations of whey, maltodextrin, and sodium carbonate in milk samples.
Herein, we establish, for the first time, a method for the simultaneous identification of 3 types of adulterants (whey, maltodextrin, and sodium carbonate) in raw milk using RS with PLS-DA. Additionally, the concentration of adulterants is determined by a PLS quantitative model that offers efficient processing with low error. This strategy represents a rapid and promising analytical method to identify the type of substance used for adulteration and to predict the level of adulteration.

## MATERIALS AND METHODS

### Sample Collection and Preparation

Raw milk samples were collected from Shanghai No. 4 Dairy Product Factory Co. Ltd. over a 3-mo period (from September to December of 2020). The overall process was under standard quality control. The whey was prepared in the laboratory. First, 2 mL of a milk coagulating enzyme (MT2200) solution was added to 1.5 L of mixed raw milk, incubated at 40°C for 40 min, and then stirred slowly and continuously for 30 min. Finally, we filtered the mixture and collected the whey for later use. Maltodextrin was purchased from Adamas Reagent Co. Ltd., and sodium carbonate was purchased from Sigma-Aldrich Trading Co. Ltd. No animals were used in this study, and ethical approval for the use of animals was thus deemed unnecessary.
The whey-, maltodextrin-, and sodium carbonate-adulterated samples were prepared in the laboratory. Whey was added to the raw milk (10 mL) at different mixture percentages (1.0, 2.0, 3.0, 4.0, 7.0, 9.0, 11.0, 12.0, 14.0, 16.0, 18.0, and 20.0%; wt/wt), simulating 12 levels of adulteration. Similarly, maltodextrin was added to raw milk (10 mL) by appropriate dilutions of 10.0% to give final mixture percentages of 0.5, 1.0, 2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 11.0, 13.0, and 15.0%. For sodium carbonate, 10, 25, 30, 40, 45, 50, 60, 70, 75, 80, 90, and 100 mg/kg solutions in milk (10 mL) were prepared by diluting a 50-mL solution of 100 mg/kg sodium carbonate in milk. Two replicates of adulterated raw milk were prepared for each concentration. A total of 144 test samples, including 72 adulterated raw milk samples (3 adulterants × 12 levels × 2 replicates) and 72 pure raw milk samples, were prepared for analysis.

### Raman Spectroscopy

For confocal Raman microscopic observations, 100 µL of each sample was pipetted onto a silicon chip and air dried for 10 min. Raman spectra were collected by a DXR laser micro-Raman spectrometer (Thermo Fisher Scientific) with a diode-pumped solid-state laser source (532 nm), which produced Raman scattering of irradiated molecules. The sample needed to be focused by a microscope before spectral collection. It was difficult to achieve microscopic focusing with the liquid sample (liquid milk), so the milk droplets needed to be dried. The spectral regions from 50 to 3,000 cm−1 were recorded at a resolution of 0.2 nm, with 5 s of acquisition time and 100 mW of laser power. For each sample, 3 spectra were collected at 3 different points, and the average values were calculated. All measurements were carried out at room temperature.

### Data Analysis

The data analysis workflow is shown in Figure 1, and the principal component analysis (PCA) was carried out for outlier detection. Multivariate analysis of RS data was carried out using PLS-DA and PLS. Before building a classification model, we investigated the effects of various preprocessing methods on the classification model (
• Bērziņš K.
• Harrison S.D.L.
• Leong C.
• Fraser-Miller S.J.
• Harper M.J.
• Diana A.
• Gibson R.S.
• Houghton L.A.
• Gordon K.C.
Qualitative and quantitative vibrational spectroscopic analysis of macronutrients in breast milk.
). The first transformation applied to the Raman spectra was smoothing by median filtering, using a gap size of 3, because the spectra were noisy and exhibited systematic variations in the baseline. The first derivative (1st DER), orthogonal signal correction, Savitzky-Golay first derivative (1st DSG), Savitzky-Golay second derivative (2nd DSG), standard normal variate, multiplicative scatter correction, and adjacent-averaging were then applied to the smoothed spectral data.
The PLS-DA was performed to identify the adulteration substances added to raw milk, including whey, maltodextrin, or sodium carbonate. The PLS-DA was calibrated with combined regression and discriminant analyses. To do this, the whey, maltodextrin, sodium carbonate, and pure raw milk spectra were pooled to create a total of 144 spectra, which were then divided into sets of 108 training and 36 validation spectra according to the Kennard-Stone data set partitioning method (both the training and validation sets had a 1:1 ratio of adulterated to nonadulterated samples;
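As a rough illustration of this preprocessing chain, the following sketch applies a 3-point median filter, a finite-difference first derivative, and a standard normal variate transform to a single spectrum. It is not the authors' MATLAB or SIMCA code: the function names are hypothetical, the "gap size of 3" is assumed to mean a 3-point window, and the edge handling (repeating the end points) is an assumption.

```python
from statistics import mean, median, stdev

def median_filter(y, window=3):
    """Sliding-window median; edges are padded by repeating the end points (assumption)."""
    half = window // 2
    padded = [y[0]] * half + list(y) + [y[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(y))]

def first_derivative(x, y):
    """Finite-difference first derivative dy/dx between adjacent spectral points."""
    return [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(y) - 1)]

def snv(y):
    """Standard normal variate: centre a spectrum and scale it by its own standard deviation."""
    m, s = mean(y), stdev(y)
    return [(v - m) / s for v in y]

# Toy spectrum (hypothetical values): a single spike at the third point.
shift = [100, 101, 102, 103, 104]      # Raman shift, cm^-1
intensity = [1, 1, 9, 1, 1]
smoothed = median_filter(intensity)     # the spike is removed
derivative = first_derivative(shift, smoothed)
```

A first derivative removes any constant baseline offset, which is one plausible reason it helps with spectra that show systematic baseline variation.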
• Monzón P.
• Ramón J.E.
• Gandía-Romero J.M.
• Valcuende M.
• Soto J.
• Palací-López D.
PLS multivariate analysis applied to corrosion studies on reinforced concrete.
;
• Sun Y.
• Yuan M.
• Liu X.
• Su M.
• Wang L.
• Zeng Y.
• Zang H.
• Nie L.
A sample selection method specific to unknown test samples for calibration and validation sets based on spectra similarity.
). A PLS regression model was then established on the same data set.
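The Kennard-Stone partitioning mentioned above can be sketched as follows. This is a minimal, unoptimized version of the general algorithm using Euclidean distance, not the authors' implementation; the function name is hypothetical, and the sketch does not enforce the 1:1 ratio of adulterated to nonadulterated samples (one possible way to do that, as an assumption, is to run the selection separately within each group).

```python
from math import dist

def kennard_stone(X, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(X)
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda ij: dist(X[ij[0]], X[ij[1]]))
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Add the candidate whose nearest already-selected sample is farthest away.
        nxt = max(remaining, key=lambda k: min(dist(X[k], X[s]) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Toy data (hypothetical): four one-variable "spectra".
toy = [[0.0], [1.0], [2.0], [10.0]]
train_idx = kennard_stone(toy, 3)
valid_idx = [k for k in range(len(toy)) if k not in train_idx]
```

For the data set described here, the call would select 108 of the 144 spectra for training and leave the remaining 36 for validation.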

### Model Performance Evaluation Method

The total classification accuracy (ACC, %), sensitivity or true positive rate (TPR), specificity or true negative rate (TNR), root mean square error (RMSE), and receiver operating characteristic (ROC) were used as the model performance evaluation indexes. Among them, ACCT and ACCP represent the classification accuracy of the training set and prediction set, respectively, and ACC is the arithmetic mean of ACCT and ACCP. The TPRT and TNRT represent the sensitivity and specificity of the training set, respectively, and TPRP and TNRP represent the sensitivity and specificity of the prediction set, respectively. The RMSECV is the root mean square error of cross-validation, which is used to evaluate the discriminant error of the training set, whereas RMSEP is the root mean square error of prediction, which is used to evaluate the discriminant error of the validation set. The above parameters were calculated according to methods reported in the literature (
• Bassbasi M.
• De Luca M.
• Ioele G.
• Oussama A.
• Ragno G.
Prediction of the geographical origin of butters by partial least square discriminant analysis (PLS-DA) applied to infrared spectroscopy (FTIR) data.
).
$\text{Accuracy}=\frac{\text{number of correct predictions}}{\text{total number of predictions}}=\frac{TP+TN}{TP+TN+FP+FN},$
[1]

$TPR=\frac{TP}{TP+FN},$
[2]

$TNR=\frac{TN}{TN+FP},$
[3]

where TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives.
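Equations [1] to [3], together with the root mean square error used for RMSECV and RMSEP, can be computed as in the short sketch below. The function names and the confusion-matrix counts are hypothetical and chosen only to illustrate the arithmetic.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Equation [1]: fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def true_positive_rate(tp, fn):
    """Equation [2]: sensitivity."""
    return tp / (tp + fn)

def true_negative_rate(tn, fp):
    """Equation [3]: specificity."""
    return tn / (tn + fp)

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

# Hypothetical counts for a 36-spectrum validation set with one missed adulterated sample.
tp, tn, fp, fn = 17, 18, 0, 1
acc = accuracy(tp, tn, fp, fn)          # 35 / 36
tpr = true_positive_rate(tp, fn)        # 17 / 18
tnr = true_negative_rate(tn, fp)        # 18 / 18
```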
Receiver operating characteristic curves were drawn with the TPR (or sensitivity) as the ordinate and the false positive rate (1 − specificity) as the abscissa. The area under the curve (AUC) ranges from 0.5 to 1.0; when the AUC was above 0.9, the model was considered to have high accuracy. Selecting different thresholds yields different sensitivity and specificity values for the model. The ideal classification method would produce a point in the upper left corner of the ROC space, where the sensitivity and specificity of the model are both at a maximum; this is the best threshold point (
• Ballabio D.
• Consonni V.
Classification tools in chemistry. Part 1: Linear models. PLS-DA.
).
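The ROC construction described above can be sketched in a few lines: each distinct model output is tried as a threshold, the resulting (false positive rate, TPR) points are collected, the AUC is obtained by the trapezoidal rule, and the best threshold is taken as the point closest to the upper left corner. The function names and toy scores are hypothetical, and the closest-to-corner rule is one common reading of "best threshold point", used here as an assumption.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for every distinct score used as a '>=' threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= thr and l == 0)
        points.append((thr, fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve, starting from the origin (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area

def best_threshold(points):
    """Threshold whose ROC point lies closest to the upper left corner (FPR = 0, TPR = 1)."""
    return min(points, key=lambda p: p[1] ** 2 + (1 - p[2]) ** 2)[0]

# Toy example (hypothetical scores): 1 = adulterated, 0 = pure raw milk.
labels = [0, 0, 1, 1]
scores = [0.1, 0.2, 0.7, 0.9]
pts = roc_points(labels, scores)
```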

### Software

Raman spectra were acquired using OMNIC-Atlµs image software from Thermo Fisher Scientific. Chemometric statistical analyses were carried out using MATLAB 2019b (MathWorks) and SIMCA version 14.1 (Umetrics).

## RESULTS AND DISCUSSION

### Raman Spectra of Milk Samples and Adulterations

Figure 2 (a) shows the spectra of milk and of the pure adulterants whey, sodium carbonate, and maltodextrin. The strongest band for milk appeared at 2,888 cm−1, which was caused by superposition of vibrations at 2,900 cm−1 (H–C asymmetric stretching) and 2,854 cm−1 (H–C symmetric stretching) of lactose molecules (
• Pijls K.E.
• Smolinska A.
• Jonkers D.M.
• Dallinga J.W.
• Masclee A.A.
• Koek G.H.
• van Schooten F.J.
A profile of volatile organic compounds in exhaled air as a potential non-invasive biomarker for liver cirrhosis.
). The weaker peak near 1,646 cm−1 was attributed to C=O stretching from amide I, associated with the COOH and COC deformation modes of phenylalanine (
• Almeida M.R.
• Oliveira K.S.
• Stephani R.
• de Oliveira L.F.C.
Fourier-transform Raman analysis of milk powder: A potential method for rapid quality screening.
). The peak at 1,442 cm−1 was due to C–H scissoring from lipid molecules (
• Pijls K.E.
• Smolinska A.
• Jonkers D.M.
• Dallinga J.W.
• Masclee A.A.
• Koek G.H.
• van Schooten F.J.
A profile of volatile organic compounds in exhaled air as a potential non-invasive biomarker for liver cirrhosis.
); meanwhile, a peak at 1,300 cm−1 corresponded to saturated fatty acids (
• Ullah R.
• Khan S.
• Ali H.
• Bilal M.
Potentiality of using front face fluorescence spectroscopy for quantitative analysis of cow milk adulteration in buffalo milk.
). However, no significant differences were observed between the Raman spectra of milk and whey because of their similar chemical compositions [Figure 2 (c)]. Sodium carbonate had a strong peak at 1,080 cm−1, which was due to the C–O stretching vibration in CO₃²⁻. For maltodextrin, 2 bands were seen at 2,900 and 1,122 cm−1 due to C–H and C–O stretching, respectively (
• Rodrigues Júnior, P.H.
• de Sá Oliveira K.
• Almeida C.E.R.
• De Oliveira L.F.C.
• Stephani R.
• Pinto M.S.
• Carvalho A.F.
• Perrone Í.T.
FT-Raman and chemometric tools for rapid determination of quality parameters in milk powder: Classification of samples for the presence of lactose and fraud detection by addition of maltodextrin.
).
The 1,122 and 2,900 cm−1 bands seen in maltodextrin samples were near the 1,150 and 2,888 cm−1 bands seen for milk. As a result, the bands overlapped in mixtures of maltodextrin and milk, resulting in imprecise peak height measurements of the spectra [Figure 2 (b)]. As seen in Figure 2 (c), visual inspection of the spectra could not clearly distinguish pure raw milk and whey-adulterated milk. This led to nonlinear relationships between peak intensity and concentration in the training set, particularly for maltodextrin and whey; fitting such relationships is computationally complicated and prone to overfitting when the data are insufficient. However, peak heights at 1,080 and 2,900 cm−1 can be used to analyze sodium carbonate content qualitatively and quantitatively in raw milk.

### Spectral Preprocessing and Variable Selection

Two different classification models were considered: (1) four 2-class PLS-DA models, one for each of the 4 sample types (3 adulterated and 1 nonadulterated milk) and (2) one multiclass PLS-DA model for all sample types. First, the influence of different spectral preprocessing methods on the quality and discrimination ability of the multiclass PLS-DA model was examined. We found that these procedures significantly influenced the classification models, as reflected in variations in sensitivity and specificity (Table 1). The 1st DER, multiplicative scatter correction, adjacent-averaging, and 1st DSG improved classification ability, whereas the orthogonal signal correction and 2nd DSG reduced classification ability, and the standard normal variate had no obvious effect on classification ability. It should be emphasized that after selection of the optimal signal preprocessing method, the whole set of spectra was processed in the same way. Detection conditions were optimized to ensure that the relative standard deviation of parallel samples remained below 10%. Finally, the data with 1st DER preprocessing were determined to provide the best sensitivity and specificity for the training and validation sets, compared with unprocessed spectra (Table 1). Therefore, all spectra (108 spectra of the training set and 36 spectra of the validation set) were preprocessed using the median filter together with the 1st DER method before using them for PLS analysis and PLS-DA (Figure 3).

Table 1. Results of spectral pretreatment and variable selection on the performance of the multiclass partial least square discriminant analysis model

| Variable selection | Pretreatment¹ | Classification accuracy (%) | RMSECV² |
| --- | --- | --- | --- |
| None | None | 94.40 | 0.3665 |
| | AAv | 95.13 | 0.3372 |
| | MSC | 95.80 | 0.3619 |
| | SNV | 92.44 | 0.4113 |
| | OSC | 90.12 | 0.5149 |
| | 1st DSG | 96.47 | 0.2731 |
| | 2nd DSG | 91.87 | 0.4513 |
| | 1st DER | 97.39 | 0.2776 |
| VIP³ > 0.8 | 1st DER | 97.46 | 0.2649 |
| VIP > 1 | | 98.06 | 0.2231 |
| VIP > 1.2 | | 94.07 | 0.2491 |
| VIP > 1.3 | | 90.44 | 0.2186 |
| VIP > 1.4 | | 89.72 | 0.2106 |

¹ AAv = adjacent-averaging; MSC = multiplicative scatter correction; SNV = standard normal variate; OSC = orthogonal signal correction; 1st DSG = Savitzky-Golay first derivative; 2nd DSG = Savitzky-Golay second derivative; 1st DER = first derivative.
² RMSECV = root mean square error that is used to evaluate the discriminant error of the training set.
³ VIP = variable importance in the projection.

Using a plot of variable importance in the projection (VIP), the regions of the Raman spectra that were decisive for the PLS-DA classification model were identified; the curve is shown in Figure 4. In this study, VIP scores were used for data reduction, and the model established using variables with VIP >1 was found to have the highest classification accuracy. The RMSECV value was lower for this model than for the PLS-DA model lacking VIP variable selection (0.2231 vs. 0.2776; Table 1). Therefore, subsequent studies used the 1st DER method to preprocess the spectral data, and spectral points with VIP >1 were selected for PLS-DA and PLS modeling.
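For readers unfamiliar with VIP scores, the sketch below computes them from the weights, scores, and y-loadings of a fitted single-response PLS model using the commonly cited definition, VIP_j = sqrt(p · Σ_a SS_a (w_ja / ‖w_a‖)² / Σ_a SS_a) with SS_a = q_a² · t_aᵀt_a. The source does not state which formula its software applies, so this definition, the function name, and the toy numbers are assumptions.

```python
from math import sqrt

def vip_scores(W, T, q):
    """VIP per variable.

    W: list of weight vectors, one per latent variable (each of length p)
    T: list of score vectors, one per latent variable
    q: list of y-loadings, one per latent variable
    """
    p = len(W[0])
    # Amount of y-variance explained by each latent variable.
    ss = [qa ** 2 * sum(t * t for t in ta) for qa, ta in zip(q, T)]
    total = sum(ss)
    vip = []
    for j in range(p):
        weighted = sum(s * w[j] ** 2 / sum(v * v for v in w) for s, w in zip(ss, W))
        vip.append(sqrt(p * weighted / total))
    return vip

# Toy model (hypothetical): one latent variable, two spectral variables.
vip = vip_scores(W=[[3.0, 4.0]], T=[[1.0, -1.0]], q=[2.0])
selected = [j for j, v in enumerate(vip) if v > 1]   # keep variables with VIP > 1
```

Because the squared VIP scores average to 1 across variables, a cutoff of 1 keeps the variables that contribute more than average, which is consistent with the VIP >1 rule adopted here.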

### PCA of the Raman Spectra Data of Samples

Principal component analysis is used in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the variation in the data as possible. The first principal component (PC1) can equivalently be defined as the direction that maximizes the variance of the projected data. The nth principal component can be taken as the direction, orthogonal to the first n − 1 principal components, that maximizes the variance of the projected data. Therefore, in spectral analysis, PCA projects the data points of all spectra into a low-dimensional space and calculates new variables called principal components, which are linear combinations of the original spectral data.
First, PCA was used to process the spectra. Principal component analysis can determine the main characteristics of the spectra and highlight the relationships between descriptive variables. Seven principal components were extracted in a PCA of the 144 spectra. The contribution rates of PC1 and the second principal component (PC2) were 57.9 and 14.7%, respectively. The total contribution rate of the first 2 principal components was 71.6%. This indicated that the first 2 principal components contain most of the information about the variables, so they were selected for data visualization. Figure 5 shows the results on a 2-dimensional scatter plot, in which PC1 is on the x-axis and PC2 is on the y-axis. The adulterated raw milk samples were dispersed along the PC1 axis according to adulterant concentration, showing an obvious classification trend. However, the scores of many samples with low adulterant concentrations were close to those of the unadulterated raw milk samples. To better distinguish adulterated and unadulterated raw milk samples, a PLS-DA model was required. The PLS-DA model was trained with 108 spectra, including 54 spectra without adulteration and 54 spectra from milk samples containing maltodextrin, whey, or sodium carbonate.
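The contribution rate of a principal component is the share of the total variance captured by its scores. The following sketch estimates these shares with power iteration and deflation on mean-centred data; it is a small illustrative substitute for the PCA routines in MATLAB or SIMCA, and the function name and toy data are hypothetical.

```python
from math import sqrt

def explained_variance_ratios(X, n_components=2, n_iter=500):
    """Fraction of total variance captured by each of the first principal components."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]   # mean-centred data
    total = sum(v * v for row in E for v in row)
    ratios = []
    for _ in range(n_components):
        v = [1.0 / sqrt(p)] * p                  # starting direction
        for _ in range(n_iter):                  # power iteration on E^T E
            t = [sum(a * b for a, b in zip(row, v)) for row in E]
            v_new = [sum(E[i][j] * t[i] for i in range(n)) for j in range(p)]
            norm = sqrt(sum(a * a for a in v_new))
            if norm == 0:
                break
            v = [a / norm for a in v_new]
        t = [sum(a * b for a, b in zip(row, v)) for row in E]  # scores on this component
        ratios.append(sum(a * a for a in t) / total)
        # Deflate: remove the variance explained by this component.
        E = [[E[i][j] - t[i] * v[j] for j in range(p)] for i in range(n)]
    return ratios

# Toy data (hypothetical): most of the spread lies along the first variable.
ratios = explained_variance_ratios([[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
cumulative = sum(ratios)
```

The cumulative contribution of the first k components is simply the sum of their individual ratios.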

### PLS-DA for Screening of Adulterated Milk Samples

Partial least square discriminant analysis is a supervised discriminant analysis method that assigns objects to classes according to their observed or measured variable values. In spectral analysis, PLS-DA establishes a model relating the signal intensities at the spectral points to the sample category, so that the category of a sample can be predicted. First, four 2-class PLS-DA models were established to investigate the effect of pairwise classification of the 4 sample types (one sample type as one category and the remaining 3 sample types as the other).
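To make the idea of a 2-class PLS-DA model concrete, the sketch below fits a single-response PLS model (NIPALS algorithm) to class labels coded as 0 and 1 and then classifies by comparing the predicted value with a threshold. This is a minimal sketch under stated assumptions, not the SIMCA or MATLAB implementation used in the study; the function names, toy data, and 0.5 threshold are hypothetical.

```python
from math import sqrt

def plsda_fit(X, y, n_lv):
    """Fit a binary PLS-DA model: PLS1 (NIPALS) on 0/1 class labels."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centred spectra
    f = [v - y_mean for v in y]                                 # centred labels
    components = []
    for _ in range(n_lv):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm == 0:                       # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]   # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate the spectra and the labels before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def plsda_predict(model, x):
    """Predicted class value for one spectrum; compare it with a threshold to classify."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

# Toy data (hypothetical): two variables, class 1 has a high first variable.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 0.1], [1.1, 0.0]]
y = [0, 0, 1, 1]
model = plsda_fit(X, y, n_lv=1)
predicted = [1 if plsda_predict(model, x) >= 0.5 else 0 for x in X]
```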
The modeling process includes the selection of the optimal latent variables (LV). When the LV number is too large, it will lead to over-fitting of the model; however, when the LV number is too small, the spectral information of the sample cannot be fully expressed. Both cases result in decreases of the prediction ability of the model (
• Luo L.
• Bao S.
• Mao J.
Adaptive selection of latent variables for process monitoring.
). Therefore, through cross-validation misclassification, the optimal number of LV (4–5) was selected for each PLS-DA adulteration discriminant model. We examined the TPR (or sensitivity) and TNR (or specificity) for the training and prediction sets, as shown in Table 2. In all four 2-class PLS-DA models, the TPR and TNR of the prediction set were 1; additionally, the sodium carbonate model had the highest classification accuracy, reaching 100%, indicating that the model had good prediction ability. The classification accuracies of the maltodextrin and whey models were 97.92 and 97.83%, respectively, indicating that these 2 classification models could classify effectively. For the classification model of pure raw milk samples, the accuracy was 97.02%, slightly lower than that of the other 3 classification models. Therefore, RS combined with PLS-DA can identify and accurately distinguish the 4 sample types using 2-class models.

Table 2. Parameters of the 2-class partial least square discriminant analysis models

| Parameter¹ | Maltodextrin | Sodium carbonate | Whey | Pure raw milk |
| --- | --- | --- | --- | --- |
| LV | 5 | 4 | 4 | 4 |
| TPRT | 0.99 | 1 | 0.97 | 0.98 |
| TNRT | 1 | 1 | 1 | 1 |
| TPRP | 1 | 1 | 1 | 1 |
| TNRP | 1 | 1 | 1 | 1 |
| ACC (%) | 97.92 | 100 | 97.83 | 97.02 |
| Threshold | 0.3748 | 0.6241 | 0.3579 | 0.3263 |

¹ LV = latent variable; TPRT = true positive rate of training set; TNRT = true negative rate of training set; TPRP = true positive rate of prediction set; TNRP = true negative rate of prediction set; ACC = total classification accuracy.

Furthermore, we investigated whether RS combined with chemometric data analysis can successfully identify multiple adulterants in an unknown raw milk sample. To answer this question, a multiclass PLS-DA model was established to simultaneously detect maltodextrin, sodium carbonate, and whey adulteration in raw milk. Using cross-validation misclassification, the optimal number of LV was 8 for the multiclass PLS-DA adulteration discrimination model, as shown in Table 3. The multiclass PLS-DA model can accommodate all 4 milk sample types. The ROC curve of the model is shown in Figure 6. The AUC values of maltodextrin-, whey-, and sodium carbonate-adulterated milk samples were greater than 0.99, indicating that the model was very effective. The upper left corner of the ROC space represents maximum sensitivity and specificity, corresponding to the intersection point of the sensitivity-specificity curve; the threshold at this point is the optimal threshold of the model. A deviation from the optimal threshold point was seen for adulterated raw milk and pure raw milk. Considering the specific classification process, it was hypothesized that the misjudgment rates of the adulterated raw milk samples were the lowest. Therefore, the ROC curve threshold of adulterated raw milk was selected as the discriminant threshold of the PLS-DA model. When the predicted value of the discriminant model was greater than or equal to the threshold, the sample was identified as positive; when it was less than the threshold, the sample was identified as negative.

Table 3. Parameters of the multiclass partial least square discriminant analysis model

| Parameter¹ | Maltodextrin | Sodium carbonate | Whey | Pure raw milk |
| --- | --- | --- | --- | --- |
| LV | 8 | 8 | 8 | 8 |
| TPRT | 0.99 | 1 | 0.92 | 0.95 |
| TNRT | 1 | 1 | 0.96 | 0.97 |
| TPRP | 1 | 1 | 1 | 1 |
| TNRP | 1 | 1 | 1 | 1 |
| ACC (%) | 95.83 | 100 | 95.84 | 92.25 |
| Threshold | 0.4782 | 0.6390 | 0.4209 | 0.5819 |

¹ LV = latent variable; TPRT = true positive rate of training set; TNRT = true negative rate of training set; TPRP = true positive rate of prediction set; TNRP = true negative rate of prediction set; ACC = total classification accuracy.

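The threshold rule of the multiclass model (a sample is positive for a class when its predicted value is greater than or equal to that class's threshold) can be written out as below, using the thresholds listed in Table 3. The function name and the predicted values are hypothetical, and how a sample flagged for several classes or for none would be resolved is not stated in the text, so the sketch simply returns every class that is flagged.

```python
# Class thresholds of the multiclass PLS-DA model, as listed in Table 3.
THRESHOLDS = {
    "maltodextrin": 0.4782,
    "sodium carbonate": 0.6390,
    "whey": 0.4209,
    "pure raw milk": 0.5819,
}

def flag_classes(predicted):
    """Return every class whose predicted value is greater than or equal to its threshold."""
    return [name for name, thr in THRESHOLDS.items() if predicted[name] >= thr]

# Hypothetical model outputs for one spectrum.
example = {
    "maltodextrin": 0.10,
    "sodium carbonate": 0.64,
    "whey": 0.42,
    "pure raw milk": 0.05,
}
flags = flag_classes(example)
```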
The results of multiclass model discrimination are shown in Figure 7. Four distinct areas corresponding to maltodextrin-, whey-, and sodium carbonate-adulterated milk and pure raw milk samples were observed. The classification performance of the PLS-DA model for all samples is shown in Table 3. The TPR and TNR values of the prediction set are equal to 1, showing excellent sensitivity and specificity. Among the classification accuracies, that of sodium carbonate is the best, at 100%. The maltodextrin and whey models are slightly poorer, with classification accuracies of 95.83 and 95.84%, respectively, indicating that the Raman spectra of these 2 adulteration samples are not significantly different from those of the other 3 samples. For the classification of pure raw milk samples, the Raman spectra of raw milk were compared with those of the 3 adulteration samples at the same time. Because of the small spectral difference between maltodextrin- and whey-adulterated samples and raw milk samples, pure raw milk may be classified as adulterated, or adulterated samples may be classified as pure raw milk. When setting the threshold, adulterated samples should be classified as pure raw milk as rarely as possible. As a result, the PLS-DA model accuracy for pure raw milk was 92.25%.
de Souza Gondim et al. (2015) showed that a multiclass model established using mid-infrared spectroscopy and soft independent modeling of class analogy techniques could provide 82% correct classifications of unadulterated and formaldehyde-, hydrogen peroxide-, citrate-, hydroxide-, and starch-adulterated milk samples. By contrast, the prediction accuracy of the model established in our research was higher. Moreover, our strategy was highly efficient, especially when a large number of samples required analysis, and it greatly reduced the required experimental time with very low error rates. This method can also be applied to other signals, samples, or adulterants.
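
As a reading aid for Table 3, the following minimal sketch shows how a true positive rate, true negative rate, and accuracy can be computed for one class once PLS-DA scores are compared against a class threshold. It assumes the standard one-vs-rest definitions, which this section does not spell out. The function name `one_vs_rest_rates` and the score values are hypothetical; only the threshold 0.4782 is taken from Table 3 (maltodextrin).

```python
def one_vs_rest_rates(scores, is_member, threshold):
    """Return (TPR, TNR, accuracy) for one class of a thresholded classifier.

    scores    -- predicted class scores (e.g., PLS-DA responses), one per sample
    is_member -- True where the sample really belongs to the class
    threshold -- a score at or above this value is assigned to the class
    """
    tp = fn = tn = fp = 0
    for score, member in zip(scores, is_member):
        predicted = score >= threshold
        if member and predicted:
            tp += 1
        elif member and not predicted:
            fn += 1
        elif not member and not predicted:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one member and one non-member")
    tpr = tp / (tp + fn)              # sensitivity
    tnr = tn / (tn + fp)              # specificity
    acc = (tp + tn) / (tp + fn + tn + fp)
    return tpr, tnr, acc


# Hypothetical scores for illustration only; they are not data from the study.
scores = [0.91, 0.62, 0.40, 0.55, 0.12, 0.30, 0.05, 0.20]
is_member = [True, True, True, False, False, False, False, False]
tpr, tnr, acc = one_vs_rest_rates(scores, is_member, threshold=0.4782)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}, ACC = {100 * acc:.2f}%")
```

Raising the threshold in such a scheme trades false positives for false negatives, which is the balance described above for pure raw milk.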

### Quantification of Adulterant in Milk

The PLS models were developed to quantify the level of the adulterants in the milk samples. A PLS scatter plot (Figure 8) shows the correlation between the reference values and the values predicted by the PLS Raman method. Models were developed using 5 to 6 LV and could explain more than 90% of the variance in the multispectral data set. The correlations for the training and validation sets of each of the 3 adulterants were all greater than 0.95, showing that the predictive accuracy of the quantitative models is satisfactory. According to Table 4, the RMSE and RMSEP values were 0.53 and 0.49% for maltodextrin, 1.46 and 1.09 mg/kg for sodium carbonate, and 0.67 and 0.86% for whey. Our PLS model for whey performed better than the model of Santos et al. (2013), who reported an RMSEP of 2.33% for milk adulterated with whey. Low RMSE and RMSEP and high R² confirmed that RS was a suitable method for milk adulteration detection. However, the accuracy and reliability of the PLS model used for quantitative detection need to be assessed according to relative error of prediction (REP), limit of detection (LOD), and limit of quantification (LOQ). The REP was calculated according to sensitivity, and LOD and LOQ values were calculated according to previously described methods (Cattaneo and Holroyd, 2013; Allegrini and Olivieri, 2014). The REP of maltodextrin, sodium carbonate, and whey were 9.39, 11.57, and 10.31, whereas the LOD were 1.46%, 4.38 mg/kg, and 2.64%, respectively. The 4.38 mg/kg LOD obtained here for sodium carbonate is much lower than the 2 g/L obtained in a recent study using electrochemical sensors for sodium bicarbonate in milk (Chakraborty and Biswas, 2018).
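
To make the error statistics discussed above concrete, the sketch below computes a root mean square error, a relative error of prediction, and a residual prediction deviation from paired reference and predicted values. It assumes commonly used definitions (REP as RMSE divided by the mean reference value, RPD as the sample standard deviation of the reference values divided by RMSE); the authors' exact formulas may differ, because the text states that REP was calculated according to sensitivity. The function name `regression_metrics` and the example values are hypothetical.

```python
import math
import statistics


def regression_metrics(reference, predicted):
    """Return (RMSE, REP in %, RPD) for paired reference and predicted values."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two or more paired values")
    squared_errors = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    # Assumed definition: RMSE relative to the mean reference value, in percent.
    rep = 100.0 * rmse / statistics.mean(reference)
    # Assumed definition: sample SD of the reference values divided by RMSE.
    rpd = statistics.stdev(reference) / rmse
    return rmse, rep, rpd


# Hypothetical adulterant levels (% wt/wt), for illustration only.
reference = [1.0, 5.0, 10.0, 15.0, 20.0]
predicted = [1.5, 4.5, 10.5, 14.5, 20.5]
rmsep, rep, rpd = regression_metrics(reference, predicted)
print(f"RMSEP = {rmsep:.2f}, REP = {rep:.2f}%, RPD = {rpd:.2f}")
```

Applied to a training set the first value corresponds to RMSE, and applied to a held-out prediction set it corresponds to RMSEP.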

Table 4. Statistical parameters of the partial least square models

| Parameter¹ | Maltodextrin (%) | Sodium carbonate (mg/kg) | Whey (%) |
| --- | --- | --- | --- |
| LOD | 1.46 | 4.38 | 2.64 |
| LOQ | 4.38 | 13.14 | 7.92 |
| RSD | 8.75 | 9.98 | 7.56 |
| REP | 9.39 | 11.57 | 10.31 |
| RMSE | 0.53 | 1.46 | 0.67 |
| RMSEP | 0.49 | 1.09 | 0.86 |
| RPD | 5.61 | 9.17 | 8.79 |
| Applicability | Qualitative detection | Quantitative detection | Quantitative detection |

¹ LOD = limit of detection; LOQ = limit of quantification; RSD = relative error of standard deviation; REP = relative error of prediction; RMSE = root mean square error of training set; RMSEP = root mean square error of prediction; RPD = residual prediction deviation.

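
The LOQ values in Table 4 are each 3 times the corresponding LOD. The relation LOQ = 3 × LOD is an inference from the tabulated values, not a formula stated in this section, but under that assumption the arithmetic can be checked as follows:

```latex
\begin{aligned}
\text{Maltodextrin:} \quad & 3 \times 1.46\,\% = 4.38\,\% \\
\text{Sodium carbonate:} \quad & 3 \times 4.38\ \text{mg/kg} = 13.14\ \text{mg/kg} \\
\text{Whey:} \quad & 3 \times 2.64\,\% = 7.92\,\%
\end{aligned}
```
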
Considering that adulteration is typically carried out to obtain economic benefits, adulteration at fractions below 5% is not typically found (Rocha et al., 2015). Therefore, the performance of the method is considered satisfactory and sufficient for qualitative and quantitative detection of adulterants. Residual prediction deviation (RPD) estimates indicated that RS combined with PLS is excellent for all analytical tasks when RPD > 8 (Nieuwoudt et al., 2016, 2017; de Oliveira Mendes et al., 2020). For sodium carbonate and whey, with RPD of 9.17 and 8.79 (Table 4), the PLS model is suitable for quantitative detection, whereas for maltodextrin, the RPD of 5.61 qualifies it for use in qualitative detection alone.

## CONCLUSIONS

The present study demonstrated a simultaneous and effective strategy based on RS and chemometrics for the rapid screening of adulterants in raw milk. Maltodextrin, whey, and sodium carbonate could be identified in raw milk using RS and PLS-DA, and the concentration of each adulterant was quantified by PLS. The classification accuracy of pure raw milk, maltodextrin, sodium carbonate, and whey in the PLS-DA model was 92.25, 95.83, 100, and 95.84%, respectively, whereas the LOD of maltodextrin, sodium carbonate, and whey in the PLS model was 1.46%, 4.38 mg/kg, and 2.64%, respectively. The low RMSE and RMSEP and high R² verified that RS is a suitable technique for qualitative and quantitative detection of adulterants in raw milk. Raman fingerprints were recorded from dry milk drops without sample preparation or addition of chemicals, making the technique suitable for rapid on-site screening. Simultaneous analysis of the 3 adulterants in raw milk was possible at low levels as a consequence of the high sensitivity of RS and the use of multivariate analysis. With the increasing popularity of competitively priced portable mini-Raman systems and microchip applications, the technique shows potential for on-site deployment to achieve rapid and reliable detection of adulterated milk.

## ACKNOWLEDGMENTS

This research was supported by “Science and Technology Innovation Action Plan” Agricultural Project of Shanghai Science and Technology Commission (grant number 19391902600; China) and Shanghai Engineering Technology Research Center of Shanghai Science and Technology Commission (grant number 20DZ2255600; China). The authors have not stated any conflicts of interest.

## REFERENCES

• Abdallah Musa Salih M.
• Yang S.
Common milk adulteration in developing countries cases study in China and Sudan: A review.
J. Adv. Dairy Res. 2017; 5: 1000192
• Allegrini F.
• Olivieri A.C.
IUPAC-consistent approach to the limit of detection in partial least-squares calibration.
Anal. Chem. 2014; 86 (25008998): 7858-7866
• Almeida M.R.
• Oliveira K.S.
• Stephani R.
• de Oliveira L.F.C.
Fourier-transform Raman analysis of milk powder: A potential method for rapid quality screening.
J. Raman Spectrosc. 2011; 42: 1548-1552
• Alves da Rocha R.
• Paiva I.M.
• Anjos V.
• Bell M.J.
Quantification of whey in fluid milk using confocal Raman microscopy and artificial neural network.
J. Dairy Sci. 2015; 98 (25828656): 3559-3567
• Aquino L.
• Silva A.
• Freitas M.Q.
• Felicio T.L.
• Cruz A.G.
• Conte-Junior C.A.
Identifying cheese whey an adulterant in milk: Limited contribution of a sensometric approach.
Food Res. Int. 2014; 62: 233-237
• Ballabio D.
• Consonni V.
Classification tools in chemistry. Part 1: Linear models. PLS-DA.
Anal. Methods. 2013; 5: 3790-3798
• Bassbasi M.
• De Luca M.
• Ioele G.
• Oussama A.
• Ragno G.
Prediction of the geographical origin of butters by partial least square discriminant analysis (PLS-DA) applied to infrared spectroscopy (FTIR) data.
J. Food Compos. Anal. 2014; 33: 210-215
• Bergana M.M.
• Harnly J.
• Moore J.C.
• Xie Z.
Non-targeted detection of milk powder adulteration by 1H NMR spectroscopy and conformity index analysis.
J. Food Compos. Anal. 2019; 78: 49-58
• Bērziņš K.
• Harrison S.D.L.
• Leong C.
• Fraser-Miller S.J.
• Harper M.J.
• Diana A.
• Gibson R.S.
• Houghton L.A.
• Gordon K.C.
Qualitative and quantitative vibrational spectroscopic analysis of macronutrients in breast milk.
Spectrochim. Acta A Mol. Biomol. Spectrosc. 2021; 246 (33017792): 118982
• Capuano E.
• Boerrigter-Eenling R.
• Koot A.
• van Ruth S.M.
Targeted and untargeted detection of skim milk powder adulteration by near-infrared spectroscopy.
Food Anal. Methods. 2015; 8: 2125-2134
• Cattaneo T.M.P.
• Holroyd S.E.
New applications of near infrared spectroscopy on dairy products.
J. Near Infrared Spectrosc. 2013; 21: 307-310
• Chakraborty M.
• Biswas K.
Limit of detection for five common adulterants in milk: A study with different fat percent.
IEEE Sens. J. 2018; 18: 2395-2403
• de Oliveira Mendes T.
• Manzolli Rodrigues B.V.
• Simas Porto B.L.
• Alves da Rocha R.
• de Oliveira M.A.L.
• de Castro F.K.
• dos Anjos V.C.
• Bell M.J.V.
Raman spectroscopy as a fast tool for whey quantification in raw milk.
Vib. Spectrosc. 2020; 111: 103150
• de Souza Gondim C.
• Cesar Santos de Souza R.
• de Paula Penna e Palhares M.
• Junqueira R.G.
• Carvalho de Souza S.V.
Performance improvement and single laboratory validation of classical qualitative methods for the detection of adulterants in milk: Starch, chlorides and sucrose.
Anal. Methods. 2015; 7: 9692-9701
• Farah J.S.
• Cavalcanti R.N.
• Guimarães J.T.
• Balthazar C.F.
• Coimbra P.T.
• Pimentel T.C.
• Esmerino E.A.
• Duarte M.C.K.H.
• Freitas M.Q.
• Granato D.
• Neto R.P.C.
• Tavares M.I.B.
• Silva M.C.
• Cruz A.G.
Differential scanning calorimetry coupled with machine learning technique: An effective approach to determine the milk authenticity.
Food Control. 2021; 121: 107585
• He H.
• Sun D.W.
• Pu H.
• Chen L.
• Lin L.
Applications of Raman spectroscopic techniques for quality and safety evaluation of milk: A review of recent developments.
Crit. Rev. Food Sci. Nutr. 2019; 59 (30614242): 770-793
• Jiménez-Carvelo A.M.
• Osorio M.T.
• Koidis A.
Chemometric classification and quantification of olive oil in blends with any edible vegetable oils using FTIR-ATR and Raman spectroscopy.
Lebensm. Wiss. Technol. 2017; 86: 174-184
• Kamal M.
• Karoui R.
Analytical methods coupled with chemometric tools for determining the authenticity and detecting the adulteration of dairy products: A review.
Trends Food Sci. Technol. 2015; 46: 27-48
• Kene Ejeahalaka K.
• On S.L.W.
Effective detection and quantification of chemical adulterants in model fat-filled milk powders using NIRS and hierarchical modelling strategies.
Food Chem. 2020; 309 (31732247): 125785
• Luo L.
• Bao S.
• Mao J.
Adaptive selection of latent variables for process monitoring.
Ind. Eng. Chem. Res. 2019; 58: 9075-9086
• MacMahon S.
• Begley T.H.
• Diachenko G.W.
• Stromgren S.A.
A liquid chromatography–tandem mass spectrometry method for the detection of economically motivated adulteration in protein-containing foods.
J. Chromatogr. A. 2012; 1220: 101-107
• Monzón P.
• Ramón J.E.
• Gandía-Romero J.M.
• Valcuende M.
• Soto J.
• Palací-López D.
PLS multivariate analysis applied to corrosion studies on reinforced concrete.
J. Chemometr. 2019; 33: e3096
• Moore J.C.
• Spink J.
• Lipp M.
Development and application of a database of food ingredient fraud and economically motivated adulteration from 1980 to 2010.
J. Food Sci. 2012; 77 (22486545): R118-R126
• Nieuwoudt M.K.
• Holroyd S.E.
• McGoverin C.M.
• Simpson M.C.
• Williams D.E.
Raman spectroscopy as an effective screening method for detecting adulteration of milk with small nitrogen-rich molecules and sucrose.
J. Dairy Sci. 2016; 99 (26874427): 2520-2536
• Nieuwoudt M.K.
• Holroyd S.E.
• McGoverin C.M.
• Simpson M.C.
• Williams D.E.
Screening for adulterants in liquid milk using a portable Raman miniature spectrometer with immersion probe.
Appl. Spectrosc. 2017; 71 (27329831): 308-312
• Pereira E.V.S.
• Fernandes D.D.S.
• de Araújo M.C.U.
• Diniz P.H.G.D.
• Maciel M.I.S.
Simultaneous determination of goat milk adulteration with cow milk and their fat and protein contents using NIR spectroscopy and PLS algorithms.
Lebensm. Wiss. Technol. 2020; 127: 109427
• Pijls K.E.
• Smolinska A.
• Jonkers D.M.
• Dallinga J.W.
• Masclee A.A.
• Koek G.H.
• van Schooten F.J.
A profile of volatile organic compounds in exhaled air as a potential non-invasive biomarker for liver cirrhosis.
Sci. Rep. 2016; 6 (26822454): 19903
• Poonia A.
• Jha A.
• Sharma R.
• Singh H.B.
• Rai A.K.
• Sharma N.
Detection of adulteration in milk: A review.
Int. J. Dairy Technol. 2017; 70: 23-42
• Rodrigues Júnior P.H.
• de Sá Oliveira K.
• Almeida C.E.R.
• De Oliveira L.F.C.
• Stephani R.
• Pinto M.S.
• Carvalho A.F.
• Perrone Í.T.
FT-Raman and chemometric tools for rapid determination of quality parameters in milk powder: Classification of samples for the presence of lactose and fraud detection by addition of maltodextrin.
Food Chem. 2016; 196 (26593531): 584-588
• Santos P.M.
• Pereira-Filho E.R.
• Rodriguez-Saona L.E.
Rapid detection and quantification of milk adulteration using infrared microspectroscopy and chemometrics analysis.
Food Chem. 2013; 138 (23265450): 19-24
• Sun Y.
• Yuan M.
• Liu X.
• Su M.
• Wang L.
• Zeng Y.
• Zang H.
• Nie L.
A sample selection method specific to unknown test samples for calibration and validation sets based on spectra similarity.
Spectrochim. Acta A Mol. Biomol. Spectrosc. 2021; 258 (33957450): 119870
• Teixeira J.L.P.
• Caramês E.T.S.
• Baptista D.P.
• Gigante M.L.
• Pallone J.A.L.
Vibrational spectroscopy and chemometrics tools for authenticity and improvement the safety control in goat milk.
Food Control. 2020; 112: 107105
• Tronco V.M.
Manual para inspeção da qualidade do leite.
Vania Maria Tronco, 2010
• Ullah R.
• Khan S.
• Ali H.
• Bilal M.
Potentiality of using front face fluorescence spectroscopy for quantitative analysis of cow milk adulteration in buffalo milk.
Spectrochim. Acta A Mol. Biomol. Spectrosc. 2020; 225 (31518755): 117518
• Xu N.
• Liu M.H.
• Yuan H.
• Huang S.G.
• Song Y.X.
Classification of sulfadimidine and sulfapyridine in duck meat by surface enhanced Raman spectroscopy combined with principal component analysis and support vector machine.
Anal. Lett. 2020; 53: 1-12
• Xu Y.
• Zhong P.
• Jiang A.
• Shen X.
• Li X.
• Xu Z.
• Shen Y.
• Sun Y.
• Lei H.
Raman spectroscopy coupled with chemometrics for food authentication: A review.
Trends Analyt. Chem. 2020; 131: 116017