Casein in fluid milk determines cheese yield and affects cheese quality. Traditional methods of measuring casein in milk involve lengthy sample preparations with labor-intensive nitrogen-based protein quantifications. The objective of this study was to quantify casein in fluid milk with different casein-to-crude-protein ratios using front-face fluorescence spectroscopy (FFFS) and chemometrics. We constructed calibration samples by mixing microfiltration and ultrafiltration retentate and permeate in different ratios to obtain different casein concentrations and casein-to-crude-protein ratios. We developed partial least squares regression and elastic net regression models for casein prediction in fluid milk using FFFS tryptophan emission spectra and reference casein contents. We used a set of 20 validation samples (including raw, skim, and ultrafiltered milk) to optimize and validate model performance. We externally tested another independent set of 20 test samples (including raw, skim, and ultrafiltered milk) by root mean square error of prediction (RMSEP), residual prediction deviation (RPD), and relative prediction error (RPE). The RMSEP for casein content quantification in raw, skim, and ultrafiltered milk ranged from 0.12 to 0.13%, and the RPD ranged from 3.2 to 3.4. The externally validated error of prediction was comparable to the existing rapid method and showed practical model performance for quality-control purposes. This FFFS-based method can be implemented as a routine quality-control tool in the dairy industry, providing rapid quantification of casein content in fluid milk intended for cheese manufacturing.
Improving cheese yield is a constant pursuit among cheese manufacturers. Dairy food scientists have identified multiple factors that can influence cheese yield. Milk compositions—namely the amount of casein and fat—have been highlighted in multiple studies as indicators that determine cheese yield (
). Several determinants of cheese yield include curd firmness, syneresis rate, and moisture retention, and these qualities of the cheese curd have been partially linked to the casein content of the cheese milk (
). Therefore, standardization of cheese milk has become a common practice in cheese manufacturing. By adjusting the casein-to-fat ratio, standardized milk can maximize cheese yield without losing excessive fat and casein into whey (
). Moreover, with advancements in membrane processing technologies, cheese can be produced from UF and microfiltered (MF) milk. Both UF and MF milk contain more casein than regular cheese milk, which can increase cheese yield and improve vat utilization (
standard methods (990.20 and 990.21). Casein content can be directly measured from isolated casein solids or indirectly calculated as the difference between total protein and non-casein proteins using a Kjeldahl-based method. The Kjeldahl-based method has good repeatability and reproducibility and has served as the industry standard method for casein quantification since 1938 (
). However, this quantification process is laborious and time-consuming, and it uses multiple hazardous chemical reagents. Dairy food researchers have proposed 2 general alternative approaches to measuring casein based on separation techniques and infrared spectroscopies. High-performance liquid chromatography methods based on reverse phase, gel permeation, and size exclusion have been developed to quantify caseins in skim and raw milk (
). Although the primary goal of these methods was protein separation, quantification of casein can also be achieved using appropriate standards. Infrared spectroscopic methods of casein quantification have been developed with the help of multivariate statistical models. One early attempt at near-infrared measurement of casein was based on an indirect approach, taking the difference between total protein and serum phase protein (
). With consistent CN:CP, casein measurement becomes a secondary quantification by measuring CP. Foss Electric (Hillerød, Denmark) has also implemented calibration options for measuring casein in fluid milk using its FTIR models MilkoScan FT 120 and MilkoScan FT2 (
). However, published FTIR-based methods have reported only cross-validation results. The lack of external validation has made it difficult to fully examine the robustness and practicality of this method when measuring unknown samples.
Multiple intrinsic fluorophores in milk are suitable for fluorescence spectroscopic analysis. Front-face fluorescence spectroscopy (FFFS) is known for its sensitivity and ability to analyze turbid samples. Tryptophan, a commonly studied fluorophore in milk, can be used to measure milk coagulation, degrees of heat treatment, and dairy powder solubility during storage (
studied tryptophan emission spectra and characterized the acid coagulation process in milk. The study indicated that in acidic conditions, casein in milk yielded fluorescence spectral differences from milk with a native pH as a result of structural changes in casein, such as solubilization of micellar calcium phosphate and partial micellar disintegration. These structural changes in casein affect surface tryptophan exposure and lead to an increase in tryptophan fluorescence intensity.
measured acid-precipitated casein tryptophan fluorescence in raw milk and established multivariate calibration models to quantify casein. This study extends the previous study to develop and externally validate an FFFS-based method to quantify casein in raw, skim, and UF milk with various CN:CP.
MATERIALS AND METHODS
We constructed calibration samples by mixing permeates and retentates obtained from UF and MF pasteurized skim milk in different ratios to obtain a range of different casein concentrations and CN:CP. We developed multivariate calibration models using tryptophan emission spectra and reference values of casein content and CN:CP based on the Kjeldahl method. We subsequently optimized the calibration models using a set of validation samples including raw, skim, and UF milk and evaluated the final model performance for casein and CN:CP measurement using another independent set of test samples including raw, skim, and UF milk. Detailed methods are described in the following sections.
Calibration, Validation, and Test Samples
The UF retentate and permeate (about 5× concentrated) made from 1 lot of pasteurized skim milk were donated by a commercial milk protein concentrate manufacturer in the United States. The MF retentate and permeate (about 3× concentrated) from 1 lot of pasteurized skim milk were donated by the Southeast Dairy Foods Research Center (Raleigh, NC). Detailed UF and MF sample preparation is described in
After the casein content was measured in the UF and MF permeate and retentate, the UF and MF retentates were diluted with varying amounts of UF and MF permeates to vary casein content and CN:CP in the calibration samples. We prepared 30 calibration samples (ncal = 30), with casein content ranging from 1.21 to 4.45% and CN:CP ranging from 0.66 to 0.88, and we used them to develop the calibration model (Table 1).
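The dilution step above is a two-component mass balance. The following is a minimal sketch, assuming casein content is expressed as % (wt/wt) and mixes linearly by mass. The function name and the example values are hypothetical and are not taken from the study, and the sketch covers a single retentate and permeate pair only.

```python
def retentate_fraction(target, retentate, permeate):
    """Mass fraction of retentate needed to reach a target casein content (%).

    Assumes linear mixing by mass:
        target = f * retentate + (1 - f) * permeate
    """
    if retentate == permeate:
        raise ValueError("retentate and permeate have identical casein content")
    if not min(retentate, permeate) <= target <= max(retentate, permeate):
        raise ValueError("target must lie between permeate and retentate values")
    return (target - permeate) / (retentate - permeate)


# Hypothetical example values (not measurements from this study).
fraction = retentate_fraction(target=2.5, retentate=10.0, permeate=0.0)
print(f"retentate mass fraction: {fraction:.2f}")
```

Varying CN:CP independently of casein content additionally requires combining UF and MF streams, because the two permeates differ in their whey protein content.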
Table 1. Mean (range) protein fractions of calibration samples (n = 30), ultrafiltered milk (n = 10), pasteurized skim milk (n = 10), and raw milk (n = 20)
For validation and test samples, we purchased 10 samples of pasteurized skim milk and 10 samples of UF milk with different production days from local supermarkets. An additional 20 raw milk samples from individual cows were randomly collected from the Kansas State University Dairy Cattle Teaching and Research Unit (Manhattan, KS). The validation and test samples (nval/test = 40) were stored at 5°C until further analysis.
Reference Measurement of Casein and Calibration Sample Preparation
We analyzed CP and NPN in the UF and MF retentate, permeate, and validation and test milk samples using
. We determined casein content from the difference between CP and noncasein nitrogen content, multiplied by 6.38. We calculated CN:CP using casein content divided by CP to represent the proportion of casein in relationship to the total protein of the milk sample.
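The calculation described above can be sketched as follows. This is an illustration only: it assumes the inputs are Kjeldahl total nitrogen and noncasein nitrogen, both in % (wt/wt), and that CP is total nitrogen multiplied by 6.38. The function name and the example values are hypothetical.

```python
KJELDAHL_FACTOR = 6.38  # nitrogen-to-protein conversion factor stated in the text


def casein_and_ratio(total_n, noncasein_n):
    """Return (casein %, CP %, CN:CP) from Kjeldahl nitrogen values in %."""
    crude_protein = total_n * KJELDAHL_FACTOR
    casein = (total_n - noncasein_n) * KJELDAHL_FACTOR
    return casein, crude_protein, casein / crude_protein


# Hypothetical nitrogen values (not measurements from this study).
casein, cp, cn_cp = casein_and_ratio(total_n=0.50, noncasein_n=0.10)
print(f"casein = {casein:.2f}%, CP = {cp:.2f}%, CN:CP = {cn_cp:.2f}")
```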
Tryptophan Fluorescence Collection for Calibration and Validation Samples
Based on preliminary studies, completely precipitating casein at pH 4.6 yielded distinctive spectra compared with the rest of the pH-adjusted and native samples. The FFFS spectral collection was achieved according to
. Prior to FFFS measurement, 7 mL of sample was taken in a 10-mL test tube and mixed with 0.6 mL of 10% acetic acid (Fisher Scientific, Hampton, NH) to ensure a pH of 4.60 ± 0.05. The mixture was vortexed for 15 s and transferred immediately into a quartz cuvette (Starna Cells Inc., Atascadero, CA), ensuring no phase separation. Tryptophan emission spectra were immediately acquired using a spectrofluorimeter fitted with a 1% attenuator (LS-55; Perkin Elmer, Waltham, MA) at an excitation wavelength of 280 nm and an emission scan of 300 to 440 nm, with a scan speed of 300 nm/min. We performed triplicate measurements on freshly precipitated calibration samples at 25°C and averaged them to improve signal-to-noise ratio. In total, we collected tryptophan fluorescence spectra on 30 calibration and 40 validation and test samples to develop the chemometric model.
Chemometric Model Development, Optimization, and Validation
Developing chemometric models involves optimization and testing of the finalized models (
). The detailed model-development approach followed in this study can be found in Figure 1. We developed calibration models using FFFS tryptophan spectra, reference casein content, and the CN:CP of the 30 calibration samples. The 40 validation and test samples were randomly partitioned into a validation set (nval = 20) and a test set (ntest = 20). A summary of the casein content and CN:CP of the validation and test sets can be found in Table 1. We used the validation set to validate and optimize the preliminary models, and we used the test set to evaluate the optimized model for quantification of casein content and CN:CP. The quantification results from the test set provided estimates of the future performance of the developed model.
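The random partition of the 40 samples can be sketched as below. The source does not say whether the split was stratified by milk type or which random seed was used, so the unstratified shuffle and the seed here are assumptions, and the function name is hypothetical.

```python
import random


def split_validation_test(sample_ids, n_validation, seed=0):
    """Randomly partition sample IDs into a validation set and a test set."""
    rng = random.Random(seed)  # fixed seed so the partition is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_validation], shuffled[n_validation:]


validation_ids, test_ids = split_validation_test(range(40), n_validation=20)
print(len(validation_ids), len(test_ids))
```

Keeping the test set untouched until the model parameters are fixed is what allows its error to serve as an estimate of future performance.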
Spectral Preprocessing and Construction of Calibration Models.
Preprocessing tools such as normalization, derivation, and smoothing are commonly used to reduce drift noise and reveal spectral overlays before model development (
). In this study, raw fluorescence spectra were transformed using Savitzky–Golay smoothing and first-derivative algorithms with a 9-point window to reduce the spectral noise from directly measuring turbid milk samples and reveal additional spectral information.
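A minimal sketch of 9-point Savitzky–Golay smoothing and first-derivative filtering is given below. The source does not state the polynomial order or the wavelength step, so a quadratic fit and the `step_nm` parameter are assumptions. The coefficients are the standard 9-point quadratic convolution weights. Only the interior points are returned, so no edge-handling choice is implied, and this is not the authors' in-house R program.

```python
# Standard 9-point Savitzky-Golay convolution weights for a quadratic fit.
SMOOTH_9 = [c / 231 for c in (-21, 14, 39, 54, 59, 54, 39, 14, -21)]
DERIV_9 = [c / 60 for c in (-4, -3, -2, -1, 0, 1, 2, 3, 4)]


def convolve_valid(values, weights):
    """Apply a symmetric-window filter; return only fully covered points."""
    half = len(weights) // 2
    return [
        sum(w * values[i + k - half] for k, w in enumerate(weights))
        for i in range(half, len(values) - half)
    ]


def smooth(spectrum):
    """9-point Savitzky-Golay smoothing (output is 8 points shorter)."""
    return convolve_valid(spectrum, SMOOTH_9)


def first_derivative(spectrum, step_nm=1.0):
    """9-point Savitzky-Golay first derivative per nm of emission wavelength."""
    return [d / step_nm for d in convolve_valid(spectrum, DERIV_9)]


# Synthetic check: a quadratic signal is reproduced exactly by a quadratic fit.
signal = [float(x * x) for x in range(20)]
print(smooth(signal)[:3])
print(first_derivative(signal)[:3])
```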
We developed supervised prediction models using partial least squares regression (PLSR) and elastic net regression (ENR). In chemometrics, PLSR is a popular method for relating 2 data matrices using a linear multivariate model, capable of handling a large number of variables with noise and collinearity (
). The model input consisted of the smoothed or first derivative of the tryptophan emission spectra, and casein content was predicted independently using the preprocessed spectra. In this study, we considered up to 15 latent variables in the initial model development of PLSR and used them as model optimization parameters. For ENR, the elastic net parameter (α) and regularization parameter (λ) were considered optimization parameters. In this study, we used an increment of 0.1 for α optimization, using leave-one-out cross-validation to find the best-performing λ. A total of 10 models from ENR were produced from the calibration step, and they were later optimized by the validation set for the optimal α value.
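For reference, the roles of α and λ can be read from the elastic net objective as it is parametrized in the glmnet package. Here N is the number of calibration samples, x_i is the preprocessed spectrum of sample i, and y_i is its reference casein content. Mapping the symbols onto this study's variables is an assumption made for illustration.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{\top}\beta\right)^{2}
+\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
+\alpha\,\lVert\beta\rVert_1\right]
```

Setting α = 1 gives the lasso penalty and α = 0 gives the ridge penalty, so α controls the balance between the two and λ sets the overall strength of regularization.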
We optimized the established calibration models by predicting the validation set. We evaluated model performance using root mean square error of validation (RMSEV) and the coefficient of determination (R2) between the reference and predicted values. We selected optimization parameters for PLSR (number of latent variables) and ENR (α and λ values) based on the lowest RMSEV. The R2 evaluated the linearity of the model prediction to the reference values, and we used calibration transfer based on linear models to correct the estimated bias from the preliminary PLSR or ENR predictions. We recorded the optimal parameters of PLSR and ENR and used them for the final test set predictions.
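The validation metrics and the linear bias correction can be sketched as follows. Two assumptions are made here: R2 is taken as the squared Pearson correlation between reference and predicted values, and the correction is an ordinary least squares line that maps predictions onto reference values. The source does not give either formula, and the data below are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(
        sum((p - r) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    )


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_r, mean_p = sum(reference) / n, sum(predicted) / n
    srr = sum((r - mean_r) ** 2 for r in reference)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    srp = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    return srp ** 2 / (srr * spp)


# Hypothetical validation data showing a proportional overestimation.
reference = [2.0, 2.5, 3.0, 3.5]
predicted = [2.4, 3.0, 3.6, 4.2]

slope, intercept = fit_line(predicted, reference)
corrected = [slope * p + intercept for p in predicted]

print(f"RMSEV before correction: {rmse(reference, predicted):.3f}")
print(f"RMSEV after correction:  {rmse(reference, corrected):.3f}")
print(f"R2: {r_squared(reference, predicted):.3f}")
```

The example shows why a high R2 can coexist with a high RMSEV: the predictions are perfectly linear in the reference values but offset by a slope, which a linear correction removes.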
Model Performance Evaluation.
Model evaluation was achieved by predicting the test set using the finalized PLSR and ENR models. We evaluated the final model performance using root mean square error of prediction (RMSEP), showing the difference between predicted values and reference values. We calculated residual prediction deviation (RPD) as the standard deviation of the reference values divided by the RMSEP and used it as a parameter to estimate the model's prediction power. We calculated relative prediction error (RPE) by dividing the RMSEP by the average reference value, expressed as a percentage, to evaluate the relative error of the prediction to the reference method. We conducted spectral preprocessing, statistical model building, and evaluation using an in-house program developed in the R programming language with the caret, pls, and glmnet packages.
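The three test-set metrics can be sketched as below, taking RPD as the standard deviation of the reference values divided by RMSEP and RPE as RMSEP relative to the mean reference value. Those definitions are consistent with the values reported in this study (RPD of 3.2 to 3.4 and RPE of 4.9 to 5.1% at an RMSEP of 0.12 to 0.13%). The use of the sample standard deviation (n − 1) is an assumption, and the data are hypothetical.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction on the test set."""
    return math.sqrt(
        sum((p - r) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    )


def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


def rpe(reference, predicted):
    """Relative prediction error (%): RMSEP / mean reference value * 100."""
    return 100.0 * rmsep(reference, predicted) / statistics.fmean(reference)


# Hypothetical test-set values (not data from this study).
reference = [2.0, 2.4, 2.8, 3.2]
predicted = [2.1, 2.3, 2.9, 3.1]

print(f"RMSEP = {rmsep(reference, predicted):.3f}")
print(f"RPD   = {rpd(reference, predicted):.2f}")
print(f"RPE   = {rpe(reference, predicted):.2f}%")
```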
monitored pooled raw milk for 1 year and found that casein content ranged from 2.08 to 2.52%. With the recent popularity of high-protein beverages, UF milk as a consumer product has entered the market. According to a high-protein milk application developed by
, casein in UF and delactosed milk can range from 2 to 8% during production, and for the finished product, CP content can range from 4.9 to 5.2%. Although casein content is not specified for the finished product, it is assumed to be less than the reported total protein content. Table 1 summarizes the protein fractions of UF milk obtained for this study. For the 10 commercial UF milk samples we obtained, casein content ranged from 2.90 to 3.34%, and CP ranged from 3.54 to 3.93%. For the commercial pasteurized skim milk samples, casein content ranged from 2.03 to 2.64%, and CP ranged from 2.75 to 3.42%. Casein content for the raw milk samples in this study ranged from 1.88 to 2.66%, and CP ranged from 2.57 to 3.39%. The casein content variation observed in this study suggested that the calibration range for measuring casein needed to cover the casein range for raw, skim, and UF milk. In Table 1, the casein content of the calibration samples ranged from 1.21 to 4.45%, providing a sufficient calibration range for measuring casein in these milk types.
The casein content variation in milk can also cause variations in CN:CP.
). We also observed variations in CN:CP in this study. The CN:CP ranged from 0.82 to 0.86 in UF milk, 0.71 to 0.79 in pasteurized skim milk, and 0.73 to 0.80 in raw milk (Table 1). The variation in milk samples required a set of calibration samples that covered the target CN:CP range. Table 1 shows that the calibration sample range was obtained by mixing various amounts of UF and MF retentate and permeate, which produced CN:CP of 0.66 to 0.88. With the wide range of CN:CP in the calibration set, we were able to avoid the collinearity effects between casein and CP and ensure robust casein predictions in various types of milk.
Tryptophan Fluorescence Emission Spectra of Calibration Samples
We collected tryptophan fluorescence emission spectra of acid-precipitated calibration samples using FFFS. The emission maxima (λmax) of the calibration samples (ncal = 30) ranged from 338 to 341 nm. According to
, emission maxima of approximately 340 nm confirm the fluorophore to be tryptophan. The calibration set varied in terms of casein content and CN:CP (Table 1) and led to differences in tryptophan emission intensity. Figure 2A shows 4 representative calibration samples with low CP (samples I and II) and high CP (samples III and IV). In the samples with low CP, the casein content in sample I (1.6%) was less than in sample II (2.1%). The non-casein nitrogen levels for samples I and II were also different, at 0.9 and 0.4%, respectively. The CP levels for samples I and II were very similar, at 2.5%. In Figure 2B, the tryptophan emission spectra of samples I and II showed a large difference in fluorescence emission intensity, with a 23.3% increase in intensity at λmax. On the other hand, a similar amount of casein was present in samples II and III (2.1%), but their CP contents were different, because the non-casein nitrogen of sample III was higher than that of sample II (1.3 and 0.4%, respectively). The emission spectra of samples II and III appeared to be similar, with a 2.3% change in intensity at λmax. Samples III and IV had a similar amount of CP (3.4%), but different casein content (2.1 and 2.7%, respectively). The difference in casein content is again highlighted in the change of emission spectra shown in Figure 2B. The emission spectral differences among samples I, II, III, and IV illustrate that the tryptophan emission spectra of the acid-precipitated casein were more sensitive to casein content than to CP content or CN:CP in milk.
At pH 4.6, casein reaches its isoelectric point, aggregating in the milk dispersion system, and the serum phase remains as a transparent liquid. The tryptophan-containing casein aggregates can absorb excitation light (280 nm) and emit fluorescence at 300 to 400 nm (
). The scattered excitation light may be again absorbed and emit more fluorescence from the casein aggregates. In the serum phase, tryptophan-containing whey proteins, peptides, and free amino acids will also absorb the excitation light and emit fluorescence (
). However, due to the low optical density of the serum phase, fewer scattering effects could occur. With the scattering of casein aggregates, the tryptophan fluorescence observed in Figure 2 may reflect a change in casein content more than a change in CP content in a given milk system at a pH of 4.6.
Calibration Model Development and Optimization
We constructed PLSR and ENR models using the acid-precipitated casein tryptophan emission spectra as inputs. Because the calibration samples were laboratory-constructed, a validation set was necessary to ensure the model's validity for real milk samples. For PLSR model validation and optimization, the number of latent variables determined the model performance in terms of RMSEV and R2 (
A typical latent variable selection process is shown in Figure 3A for PLSR prediction of casein content using the Savitzky–Golay smoothing preprocessing technique. The RMSEV (0.66%) reached its minimum with 2 latent variables, indicating the lowest prediction error for casein content. The R2 (0.90) had the highest value with 2 latent variables, so we selected the PLSR model with 2 latent variables for the prediction model. However, we observed a high RMSEV from validation, resulting in an RPE of 26.3%. The prediction of the validation set is visualized in Figure 3B. Although we found a good linear trend between reference and predicted casein contents, the high-casein samples showed a proportional overestimation of prediction. Considering the high R2 obtained from the validation, we applied a linear model to correct the overestimation in the validation set. This linear model correction approach is known as calibration transfer. Calibration transfer methods are more commonly seen in the development of near-infrared spectroscopy models to account for instrument signal drifts (
). It is less common to use the calibration transfer method on the same instrument. However, because the calibration samples were laboratory-constructed, they may have yielded different fluorescence intensities because of possible variations in total solids and mineral contents. The validation set was able to capture the high estimation bias and correct it as part of the model development process.
Model Optimization for ENR.
To optimize the ENR model, we selected the optimal α using the lowest RMSEV generated from the validation set. In Figure 4A, a representative elastic net parameter selection process is shown for prediction of casein content using Savitzky–Golay smoothing as the preprocessing step. The lowest RMSEV observed was 0.64% when α was 0.9. When α was 0.1, R2 reached the highest value of 0.9. When we observed different α values for optimal RMSEV and R2, we chose RMSEV as the evaluating criterion because it judges the true prediction power of the model (
). Therefore, we chose an α value of 0.9 to optimize the ENR model. As in the PLSR model, we used linear model correction to adjust the biased estimation plotted in Figure 4B for ENR predictions.
Optimized Casein Content Prediction Models.
Table 2 shows the optimized PLSR and ENR calibration models for casein quantifications, with corresponding optimization factors. For casein content prediction, different preprocessing and regression techniques yielded similar prediction power. The PLSR and ENR models showed similar performance when predicting casein, in agreement with another study that compared PLSR with ENR (
, an R2 value greater than 0.95 indicates reliable model prediction power in food analysis. The low RMSEV for the casein predictions also indicated that the model carried potential as a rapid casein quantification method. However, the optimized models needed to be tested externally with unknown samples to estimate their final performance.
Table 2. Optimized partial least squares regression (PLSR) and elastic net regression (ENR) calibration model performance of casein and CN:CP quantification
The externally tested model performance for casein quantification in UF, skim, and raw milk testing samples is shown in Table 3. For casein prediction, PLSR and ENR yielded similar test results, with RMSEP ranging from 0.12 to 0.13% and an R2 of 0.91 (Figure 5). In further analysis of casein quantification error, the RPD of the models ranged from 3.2 to 3.4. According to Williams and Norris (2001), an RPD greater than 3 indicates very good prediction power for food analysis purposes. The RPE of the casein predictions ranged from 4.9 to 5.1%, showing the relative error in the context of the average casein content of the test set. According to
, a linear regression between the reference and predicted casein content showed no significant difference from a slope of 1 and an intercept of 0, indicating that the FFFS-based method had no significant proportional or constant bias for the prediction of casein content (P > 0.05) compared with the reference method.
Table 3. Final partial least squares regression (PLSR) and elastic net regression (ENR) test model performance of casein and CN:CP quantification
first reported an FTIR-based method for casein quantification on raw milk samples. The method was developed using multivariate statistical models and validated with a set of 20 raw milk samples. The casein measurement range of this method was 2.71 to 3.62%, with an error of 0.08 to 0.1%. A follow-up study by
using standard milk samples showed a casein measurement range of 1.8 to 4.5% with a lower measurement error of 0.046 to 0.08%. It appeared that the increase of calibration range and high sample homogeneity improved the accuracy of FTIR-based measurements of casein. A mid-infrared-based method published by
aimed to quantify milk casein and free amino acids. The method was designed to capture casein content variation from different genetic breeds. Although the range of the casein measurement was not reported, the error from the study was 0.48%, almost 10 times higher than the FTIR method.
applied an enzymatic precipitation approach to quantify casein in milk with various CN:CP, using FTIR combined with PLSR. The study reported a cross-validated RMSE of 0.12%, which was very similar to the current study's externally validated RMSEP (0.12 to 0.13%). The FFFS-based prediction results from PLSR and ENR were comparable to existing methods. Test samples from UF, skim, and raw milk improved the robustness of the method, measuring casein contents from various types of milk with different CN:CP.
Cheese milk standardization accounts for natural variations in casein and CN:CP to ensure consistent product quality. In this study, we developed and validated an FFFS-based method to measure casein in fluid milk with various CN:CP. Using PLSR and ENR with external validations, the prediction models quantified casein in raw, skim, and UF milk with an RMSEP of 0.12 to 0.13%, an RPD of 3.2 to 3.4, and an RPE of 4.9 to 5.1% relative to the reference method. The FFFS-based method provides practical prediction power to serve as a rapid tool for measuring casein content in fluid milk, indicating cheese yield and standardizing cheese milk in dairy farms and processing plants.
The authors thank MaryAnne Drake from the Southeast Dairy Foods Research Center (Raleigh, NC) for the MF retentate and permeate donations. The authors appreciate the technical discussion with Dr. Inès Birlouez-Aragon from Spectralys Innovation (Romainville, France). This project is Kansas State Research and Extension contribution number 12-278-J. The authors have not stated any conflicts of interest.