A rapid method to quantify casein in fluid milk by front-face fluorescence spectroscopy combined with chemometrics

Open Archive. Published: November 05, 2020. DOI: https://doi.org/10.3168/jds.2020-18799

      ABSTRACT

      Casein in fluid milk determines cheese yield and affects cheese quality. Traditional methods of measuring casein in milk involve lengthy sample preparations with labor-intensive nitrogen-based protein quantifications. The objective of this study was to quantify casein in fluid milk with different casein-to-crude-protein ratios using front-face fluorescence spectroscopy (FFFS) and chemometrics. We constructed calibration samples by mixing microfiltration and ultrafiltration retentate and permeate in different ratios to obtain different casein concentrations and casein-to-crude-protein ratios. We developed partial least squares regression and elastic net regression models for casein prediction in fluid milk using FFFS tryptophan emission spectra and reference casein contents. We used a set of 20 validation samples (including raw, skim, and ultrafiltered milk) to optimize and validate model performance. We externally tested another independent set of 20 test samples (including raw, skim, and ultrafiltered milk) by root mean square error of prediction (RMSEP), residual prediction deviation (RPD), and relative prediction error (RPE). The RMSEP for casein content quantification in raw, skim, and ultrafiltered milk ranged from 0.12 to 0.13%, and the RPD ranged from 3.2 to 3.4. The externally validated error of prediction was comparable to the existing rapid method and showed practical model performance for quality-control purposes. This FFFS-based method can be implemented as a routine quality-control tool in the dairy industry, providing rapid quantification of casein content in fluid milk intended for cheese manufacturing.

      INTRODUCTION

      Improving cheese yield is a constant pursuit among cheese manufacturers. Dairy food scientists have identified multiple factors that can influence cheese yield. Milk composition, namely the amounts of casein and fat, has been highlighted in multiple studies as a determinant of cheese yield (
      • Barbano D.M.
      • Sherbon J.W.
      Cheddar cheese yields in New York.
      ;
      • Emmons D.B.
      • Modler H.W.
      Invited review: A commentary on predictive cheese yield formulas.
      ). Several determinants of cheese yield include curd firmness, syneresis rate, and moisture retention, and these qualities of the cheese curd have been partially linked to the casein content of the cheese milk (
      • Cipolat-Gotet C.
      • Cecchinato A.
      • De Marchi M.
      • Bittante G.
      Factors affecting variation of different measures of cheese yield and milk nutrient recovery from an individual model cheese-manufacturing process.
      ). Therefore, standardization of cheese milk has become a common practice in cheese manufacturing. By adjusting the casein-to-fat ratio, standardized milk can maximize cheese yield without losing excessive fat and casein into whey (
      • Lucey J.
      • Kelly J.
      Cheese yield.
      ). Moreover, with advancements in membrane processing technologies, cheese can be produced from ultrafiltered (UF) and microfiltered (MF) milk. Both UF and MF milk contain more casein than regular cheese milk, which can increase cheese yield and improve vat utilization (
      • Kumar P.
      • Sharma N.
      • Ranjan R.
      • Kumar S.
      • Bhat Z.F.
      • Jeong D.K.
      Perspective of membrane technology in dairy industry: A review.
      ).
      The current standard method of casein measurement involves isoelectrically precipitating casein at pH 4.6 in milk and separating the casein from the non-casein fraction by filtration according to
      • AOAC International
      Official Methods of Analysis.
      standard methods (990.20 and 990.21). Casein content can be directly measured from isolated casein solids or indirectly calculated as the difference between total protein and non-casein proteins using a Kjeldahl-based method. The Kjeldahl-based method has good repeatability and reproducibility and has served as the industry standard method for casein quantification since 1938 (
      • Rowland S.J.
      176. The determination of the nitrogen distribution in milk.
      ;
      • Lynch J.M.
      • Barbano D.M.
      • Fleming J.R.
      Indirect and direct determination of the casein content of milk by Kjeldahl nitrogen analysis: Collaborative study.
      ). However, this quantification process is laborious and time-consuming, and it uses multiple hazardous chemical reagents. Dairy food researchers have proposed 2 general alternative approaches to measuring casein based on separation techniques and infrared spectroscopies. High-performance liquid chromatography methods based on reverse phase, gel permeation, and size exclusion have been developed to quantify caseins in skim and raw milk (
      • Dimenna G.P.
      • Segall H.J.
      High-performance gel-permeation chromatography of bovine skim milk proteins.
      ;
      • van der Ven C.
      • Gruppen H.
      • de Bont D.B.A.
      • Voragen A.G.J.
      Reversed phase and size exclusion chromatography of milk protein hydrolysates: Relation between elution from reversed phase column and apparent molecular weight distribution.
      ;
      • Bonfatti V.
      • Grigoletto L.
      • Cecchinato A.
      • Gallo L.
      • Carnier P.
      Validation of a new reversed-phase high-performance liquid chromatography method for separation and quantification of bovine milk protein genetic variants.
      ). Similarly, a capillary electrophoresis-based method has been developed to quantify whey protein and casein in milk (
      • Recio I.
      • Olieman C.
      Determination of denatured serum proteins in the casein fraction of heat-treated milk by capillary zone electrophoresis.
      ). Although the primary goal of these methods was protein separation, quantification of casein can also be achieved using appropriate standards. Infrared spectroscopic methods of casein quantification have been developed with the help of multivariate statistical models. One early attempt at near-infrared measurement of casein was based on an indirect approach, taking the difference between total protein and serum phase protein (
      • Barbano D.M.
      • Dellavalle M.E.
      Rapid method for determination of milk casein content by infrared analysis.
      ). With the advancement of Fourier-transform infrared spectroscopy (FTIR),
      • Hewavitharana A.K.
      • van Brakel B.
      Fourier transform infrared spectrometric method for the rapid determination of casein in raw milk.
      and
      • Luginbühl W.
      Evaluation of designed calibration samples for casein calibration in fourier transform infrared analysis of milk.
      have both developed and validated casein quantification with FTIR and multivariate statistical models.
      The milk samples used in previous spectroscopic studies had similar casein-to-crude-protein ratios (CN:CP), making prediction dependent on the collinearity between casein and CP (
      • Baum A.
      • Hansen P.W.
      • Nørgaard L.
      • Sørensen J.
      • Mikkelsen J.D.
      Rapid quantification of casein in skim milk using Fourier transform infrared spectroscopy, enzymatic perturbation, and multiway partial least squares regression: Monitoring chymosin at work.
      ). With consistent CN:CP, casein measurement becomes a secondary quantification by measuring CP. Foss Electric (Hillerød, Denmark) has also implemented calibration options for measuring casein in fluid milk using its FTIR models MilkoScan FT 120 and MilkoScan FT2 (
      • Foss Electric
      Application note No. 102. Calibration for casein in cow milk. MilkoScan FT 120.
      ;
      • Baum A.
      • Hansen P.W.
      • Nørgaard L.
      • Sørensen J.
      • Mikkelsen J.D.
      Rapid quantification of casein in skim milk using Fourier transform infrared spectroscopy, enzymatic perturbation, and multiway partial least squares regression: Monitoring chymosin at work.
      ). However, published FTIR-based methods have reported only cross-validation results. The lack of external validation has made it difficult to fully examine the robustness and practicality of this method when measuring unknown samples.
      Multiple intrinsic fluorophores in milk are suitable for fluorescence spectroscopic analysis. Front-face fluorescence spectroscopy (FFFS) is known for its sensitivity and ability to analyze turbid samples. Tryptophan, a commonly studied fluorophore in milk, can be used to measure milk coagulation, degrees of heat treatment, and dairy powder solubility during storage (
      • Herbert S.
      • Riaublanc A.
      • Bouchet B.
      • Gallant D.J.
      • Dufour E.
      Fluorescence spectroscopy investigation of acid-or rennet-induced coagulation of milk.
      ;
      • Kulmyrzaev A.A.
      • Levieux D.
      • Dufour É.
      Front-face fluorescence spectroscopy allows the characterization of mild heat treatments applied to milk. Relations with the denaturation of milk proteins.
      ;
      • Babu K.S.
      • Amamcharla J.K.
      Application of front-face fluorescence spectroscopy as a tool for monitoring changes in milk protein concentrate powders during storage.
      ).
      • Herbert S.
      • Riaublanc A.
      • Bouchet B.
      • Gallant D.J.
      • Dufour E.
      Fluorescence spectroscopy investigation of acid-or rennet-induced coagulation of milk.
      studied tryptophan emission spectra and characterized the acid coagulation process in milk. The study indicated that in acidic conditions, casein in milk yielded fluorescence spectral differences from milk with a native pH as a result of structural changes in casein, such as solubilization of micellar calcium phosphate and partial micellar disintegration. These structural changes in casein affect surface tryptophan exposure and lead to an increase in tryptophan fluorescence intensity.
      • Ma Y.B.
      • Birlouez-Aragon I.
      • Amamcharla J.K.
      Development and validation of a front-face fluorescence spectroscopy-based method to determine casein in raw milk.
      measured acid-precipitated casein tryptophan fluorescence in raw milk and established multivariate calibration models to quantify casein. The present study extends that work by developing and externally validating an FFFS-based method to quantify casein in raw, skim, and UF milk with various CN:CP.

      MATERIALS AND METHODS

      Experimental Approach

      We constructed calibration samples by mixing permeates and retentates obtained from UF and MF pasteurized skim milk in different ratios to obtain a range of different casein concentrations and CN:CP. We developed multivariate calibration models using tryptophan emission spectra and reference values of casein content and CN:CP based on the Kjeldahl method. We subsequently optimized the calibration models using a set of validation samples including raw, skim, and UF milk and evaluated the final model performance for casein and CN:CP measurement using another independent set of test samples including raw, skim, and UF milk. Detailed methods are described in the following sections.

      Calibration, Validation, and Test Samples

      The UF retentate and permeate (about 5× concentrated) made from 1 lot of pasteurized skim milk were donated by a commercial milk protein concentrate manufacturer in the United States. The MF retentate and permeate (about 3× concentrated) from 1 lot of pasteurized skim milk were donated by the Southeast Dairy Foods Research Center (Raleigh, NC). Detailed UF and MF sample preparation is described in
      • Carter B.
      • Patel H.
      • Barbano D.M.
      • Drake M.
      The effect of spray drying on the difference in flavor and functional properties of liquid and dried whey proteins, milk proteins, and micellar casein concentrates.
      . Both UF and MF milk fractions arrived under refrigerated conditions and were analyzed for casein content based on the Kjeldahl method (
      • AOAC International
      Official Methods of Analysis.
      ; methods 990.20 and 990.21).
      After the casein content was measured in the UF and MF permeate and retentate, the UF and MF retentates were diluted with varying amounts of UF and MF permeates to vary casein content and CN:CP in the calibration samples. We prepared 30 calibration samples (ncal = 30), with casein content ranging from 1.21 to 4.45% and CN:CP ranging from 0.66 to 0.88, and we used them to develop the calibration model (Table 1).
      Table 1. Mean (range) protein fractions of calibration samples (n = 30), ultrafiltered milk (n = 10), pasteurized skim milk (n = 10), and raw milk (n = 20)

      Item                         Calibration samples   Validation and test samples
                                                         Raw milk            Pasteurized skim milk   Ultrafiltered milk
      CP (%)                       3.29 (1.82–5.17)      2.91 (2.57–3.34)    3.16 (2.75–3.42)        3.77 (3.54–3.93)
      Casein (%)                   2.57 (1.21–4.45)      2.25 (1.88–2.66)    2.37 (2.03–2.64)        3.17 (2.90–3.34)
      NPN (% protein-equivalent)   0.14 (0.10–0.26)      0.12 (0.09–0.14)    0.12 (0.10–0.14)        0.17 (0.14–0.20)
      CN:CP ratio                  0.77 (0.66–0.88)      0.77 (0.73–0.80)    0.76 (0.70–0.79)        0.84 (0.82–0.86)

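
      The dilution of retentate with permeate described above amounts to a two-component mass balance. The following is a minimal sketch in Python (the study itself used R), assuming mixing by mass. The retentate and permeate casein values are hypothetical, because the measured values for the donated fractions are not reported here, and `blend_fraction` and `blend_content` are illustrative helpers rather than part of the authors' program.

```python
def blend_fraction(target, retentate, permeate):
    """Mass fraction of retentate needed to reach a target casein content (%)."""
    if retentate == permeate:
        raise ValueError("retentate and permeate must differ in casein content")
    fraction = (target - permeate) / (retentate - permeate)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("target lies outside the range spanned by the two streams")
    return fraction


def blend_content(fraction, retentate, permeate):
    """Casein content (%) of a blend with the given retentate mass fraction."""
    return fraction * retentate + (1.0 - fraction) * permeate


# Hypothetical casein contents (%), for illustration only.
RETENTATE_CASEIN = 12.0
PERMEATE_CASEIN = 0.0

f = blend_fraction(3.0, RETENTATE_CASEIN, PERMEATE_CASEIN)
print(f"retentate mass fraction: {f:.3f}")
print(f"blend casein: {blend_content(f, RETENTATE_CASEIN, PERMEATE_CASEIN):.2f}%")
```

      The same balance applies to CP, so choosing the retentate and permeate streams and their proportions sets both casein content and CN:CP of a calibration sample.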
      For validation and test samples, we purchased 10 samples of pasteurized skim milk and 10 samples of UF milk with different production days from local supermarkets. An additional 20 raw milk samples from individual cows were randomly collected from the Kansas State University Dairy Cattle Teaching and Research Unit (Manhattan, KS). The validation and test samples (nval/test = 40) were stored at 5°C until further analysis.

      Reference Measurement of Casein and Calibration Sample Preparation

      We analyzed CP and NPN in the UF and MF retentate, permeate, and validation and test milk samples using
      • AOAC International
      Official Methods of Analysis.
      standard methods (990.20 and 990.21). Because of the high protein content in the MF and UF retentate, we measured non-casein nitrogen according to the method of
      • Zhang H.
      • Metzger L.E.
      Noncasein nitrogen analysis of ultrafiltration and microfiltration retentate.
      . We determined casein content as CP minus noncasein nitrogen expressed as protein (noncasein nitrogen × 6.38). We calculated CN:CP as casein content divided by CP, representing the proportion of casein in the total protein of the milk sample.
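
      The reference calculation can be sketched as follows. This is a minimal Python illustration, assuming total nitrogen and noncasein nitrogen are both expressed in g of N per 100 g of milk; the function names and the input values are hypothetical.

```python
KJELDAHL_FACTOR = 6.38  # nitrogen-to-protein conversion factor for milk


def casein_percent(total_n, noncasein_n):
    """Casein (%) = (total N - noncasein N) x 6.38."""
    return (total_n - noncasein_n) * KJELDAHL_FACTOR


def cn_to_cp(total_n, noncasein_n):
    """Casein-to-crude-protein ratio (CN:CP)."""
    crude_protein = total_n * KJELDAHL_FACTOR
    return casein_percent(total_n, noncasein_n) / crude_protein


# Hypothetical nitrogen values (g N per 100 g milk), for illustration only.
total_n, noncasein_n = 0.50, 0.12

print(f"CP:     {total_n * KJELDAHL_FACTOR:.2f}%")
print(f"Casein: {casein_percent(total_n, noncasein_n):.2f}%")
print(f"CN:CP:  {cn_to_cp(total_n, noncasein_n):.2f}")
```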

      Tryptophan Fluorescence Collection for Calibration and Validation Samples

      Based on preliminary studies, completely precipitating casein at pH 4.6 yielded distinctive spectra compared with the rest of the pH-adjusted and native samples. The FFFS spectral collection was achieved according to
      • Ma Y.B.
      • Birlouez-Aragon I.
      • Amamcharla J.K.
      Development and validation of a front-face fluorescence spectroscopy-based method to determine casein in raw milk.
      . Prior to FFFS measurement, 7 mL of sample was taken in a 10-mL test tube and mixed with 0.6 mL of 10% acetic acid (Fisher Scientific, Hampton, NH) to ensure a pH of 4.60 ± 0.05. The mixture was vortexed for 15 s and transferred immediately into a quartz cuvette (Starna Cells Inc., Atascadero, CA), ensuring no phase separation. Tryptophan emission spectra were immediately acquired using a spectrofluorimeter fitted with a 1% attenuator (LS-55; Perkin Elmer, Waltham, MA) at an excitation wavelength of 280 nm and an emission scan of 300 to 440 nm, with a scan speed of 300 nm/min. We performed triplicate measurements on freshly precipitated calibration samples at 25°C and averaged them to improve signal-to-noise ratio. In total, we collected tryptophan fluorescence spectra on 30 calibration and 40 validation and test samples to develop the chemometric model.
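
      The acquisition arithmetic and the triplicate averaging can be sketched as below. This is an illustrative Python snippet, not the authors' code: the volumes and scan settings come from the text, whereas the intensity values and the helper `mean_spectrum` are hypothetical.

```python
def mean_spectrum(replicates):
    """Average replicate spectra point by point (all on one wavelength grid)."""
    n_points = len(replicates[0])
    if any(len(r) != n_points for r in replicates):
        raise ValueError("replicate spectra differ in length")
    return [sum(values) / len(replicates) for values in zip(*replicates)]


# Acquisition settings taken from the text.
EM_START_NM, EM_END_NM = 300, 440
SCAN_SPEED_NM_PER_MIN = 300
scan_seconds = (EM_END_NM - EM_START_NM) / SCAN_SPEED_NM_PER_MIN * 60

# Volume fraction of milk after adding 0.6 mL of acid to 7 mL of sample.
milk_fraction = 7.0 / (7.0 + 0.6)

# Hypothetical intensities at three wavelengths, for illustration only.
triplicate = [
    [100.0, 250.0, 120.0],
    [104.0, 246.0, 118.0],
    [102.0, 254.0, 122.0],
]
averaged = mean_spectrum(triplicate)

print(f"time per emission scan: {scan_seconds:.0f} s")
print(f"milk volume fraction after acidification: {milk_fraction:.3f}")
print(averaged)
```

      Each emission scan takes about half a minute at these settings, which fits the requirement to measure freshly precipitated samples before phase separation.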

      Chemometric Model Development, Optimization, and Validation

      Developing chemometric models involves optimization and testing of the finalized models (
      • Bevilacqua M.
      • Bro R.
      • Marini F.
      • Rinnan Å.
      • Rasmussen M.A.
      • Skov T.
      Recent chemometrics advances for foodomics.
      ). The detailed model-development approach followed in this study can be found in Figure 1. We developed calibration models using FFFS tryptophan spectra, reference casein content, and the CN:CP of the 30 calibration samples. The 40 validation and test samples were randomly partitioned into a validation set (nval = 20) and a test set (ntest = 20). A summary of the casein content and CN:CP of the validation and test sets can be found in Table 1. We used the validation set to validate and optimize the preliminary models, and we used the test set to evaluate the optimized model for quantification of casein content and CN:CP. The quantification results from the test set provided estimates of the future performance of the developed model.
      Figure 1. Chemometric model development overview with casein (range) and CN:CP (range). CN:CP = casein-to-crude-protein ratio; ENR = elastic net regression; ncal = number of calibration samples; nval = number of validation samples; ntest = number of test samples; PLSR = partial least squares regression; RMSE = root mean square error; RPD = residual prediction deviation; RPE = relative prediction error; SG = Savitzky–Golay.
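
      The random partition of the 40 milk samples into validation and test sets can be sketched as follows. This is a minimal Python illustration under assumptions: the sample IDs, the seed, and the simple unstratified shuffle are hypothetical, because the text states only that the partition was random.

```python
import random


def split_validation_test(sample_ids, n_validation, seed):
    """Randomly partition sample IDs into a validation set and a test set."""
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_validation], shuffled[n_validation:]


# Hypothetical IDs for the 40 milk samples (20 raw, 10 skim, 10 UF).
ids = (
    [f"raw_{i:02d}" for i in range(1, 21)]
    + [f"skim_{i:02d}" for i in range(1, 11)]
    + [f"uf_{i:02d}" for i in range(1, 11)]
)

validation_set, test_set = split_validation_test(ids, n_validation=20, seed=2020)
print(len(validation_set), len(test_set))
```

      Fixing the seed makes the partition reproducible; the test set is then left untouched until the model parameters have been finalized on the validation set.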

      Spectral Preprocessing and Construction of Calibration Models.

      Preprocessing tools such as normalization, differentiation, and smoothing are commonly used to reduce baseline drift and noise and to resolve overlapping spectral features before model development (
      • Brown C.D.
      • Vega-Montoto L.
      • Wentzell P.D.
      Derivative preprocessing and optimal corrections for baseline drift in multivariate calibration.
      ). In this study, raw fluorescence spectra were transformed using Savitzky–Golay smoothing and first derivative algorithms with 9-point neighbor values to reduce the spectral noise from directly measuring turbid milk samples and reveal additional spectral information (
      • Savitzky A.
      • Golay M.J.E.
      Smoothing and differentiation of data by simplified least squares procedures.
      ).
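
      A 9-point Savitzky–Golay filter can be written as a fixed-weight convolution. The sketch below is a plain Python illustration, not the authors' R code. It assumes a quadratic polynomial fit, because the polynomial order is not stated in the text, and it returns only the points that have a full 9-point window. The weights are the standard tabulated values for that window and order, and `savgol_9` is an illustrative name.

```python
# Savitzky-Golay convolution weights for a 9-point window and a quadratic fit.
SMOOTH_9 = [w / 231 for w in (-21, 14, 39, 54, 59, 54, 39, 14, -21)]
DERIV_9 = [w / 60 for w in (-4, -3, -2, -1, 0, 1, 2, 3, 4)]


def savgol_9(spectrum, weights, step=1.0, derivative_order=0):
    """Apply 9-point weights; returns len(spectrum) - 8 points (full windows only)."""
    half = len(weights) // 2
    out = []
    for center in range(half, len(spectrum) - half):
        window = spectrum[center - half:center + half + 1]
        value = sum(w * y for w, y in zip(weights, window))
        out.append(value / step ** derivative_order)
    return out


# Synthetic quadratic signal sampled at unit spacing, for illustration only.
signal = [0.5 * x * x for x in range(12)]

smoothed = savgol_9(signal, SMOOTH_9)
first_derivative = savgol_9(signal, DERIV_9, step=1.0, derivative_order=1)
print(smoothed)
print(first_derivative)
```

      Because the filter fits a quadratic in each window, a quadratic test signal passes through the smoother unchanged and its first derivative is recovered exactly, which makes a convenient sanity check before applying the filter to measured spectra.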
      We developed supervised prediction models using partial least squares regression (PLSR) and elastic net regression (ENR). In chemometrics, PLSR is a popular method for relating 2 data matrices using a linear multivariate model and can handle a large number of noisy, collinear variables (
      • Wold S.
      • Sjöström M.
      • Eriksson L.
      PLS-regression: A basic tool of chemometrics.
      ). Elastic net regression is a type of penalized linear regression with the ability to eliminate and shrink variable contributions in multivariate models (
      • Chen B.
      • Lewis M.J.
      • Grandison A.S.
      Effect of seasonal variation on the composition and properties of raw milk destined for processing in the UK.
      ). Chemometric researchers have applied both PLSR and ENR to model spectral data because of their ability to handle large numbers of predictors (
      • Filzmoser P.
      • Gschwandtner M.
      • Todorov V.
      Review of sparse methods in regression and classification with application to chemometrics.
      ). The model input consisted of the smoothed or first derivative of the tryptophan emission spectra, and casein content was predicted independently using the preprocessed spectra. In this study, we considered up to 15 latent variables in the initial model development of PLSR and used them as model optimization parameters. For ENR, the elastic net parameter (α) and regularization parameter (λ) were considered optimization parameters. In this study, we used an increment of 0.1 for α optimization, using leave-one-out cross-validation to find the best-performing λ. A total of 10 models from ENR were produced from the calibration step, and they were later optimized by the validation set for the optimal α value.
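
      For reference, the roles of α and λ can be read from the elastic net objective as parameterized in the glmnet package that the study reports using. The notation here is an assumption introduced for illustration: N is the number of calibration samples, x_i the preprocessed spectrum of sample i, y_i its reference casein content, and β the vector of wavelength coefficients.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2+\alpha\,\lVert\beta\rVert_1\Bigr]
```

      With α = 1 the penalty reduces to the lasso, which can set coefficients exactly to zero, and with α = 0 it reduces to ridge regression, which only shrinks them; λ controls the overall strength of the penalty.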

      Model Optimizations.

      We optimized the established calibration models by predicting the validation set. We evaluated model performance using root mean square error of validation (RMSEV) and the coefficient of determination (R2) between the reference and predicted values. We selected optimization parameters for PLSR (number of latent variables) and ENR (α and λ values) based on the lowest RMSEV. The R2 evaluated the linearity of the model prediction to the reference values, and we used calibration transfer based on linear models to correct the estimated bias from the preliminary PLSR or ENR predictions. We recorded the optimal parameters of PLSR and ENR and used them for the final test set predictions.
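
      Parameter selection by lowest RMSEV can be sketched as below. This is a minimal Python illustration, not the authors' R program. The reference values and predictions are hypothetical, the function names are illustrative, and R2 is computed here as the squared Pearson correlation between reference and predicted values, which is an assumption about the definition used.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_r, mean_p = sum(reference) / n, sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_r * var_p)


def select_by_rmsev(reference, predictions_by_setting):
    """Return the setting (e.g., number of latent variables) with the lowest RMSEV."""
    return min(
        predictions_by_setting,
        key=lambda s: rmse(reference, predictions_by_setting[s]),
    )


# Hypothetical validation data: reference casein (%) and predictions per setting.
reference = [2.0, 2.4, 2.8, 3.2]
candidates = {
    1: [2.5, 2.7, 3.0, 3.1],
    2: [2.1, 2.5, 2.9, 3.4],
    3: [1.7, 2.6, 2.5, 3.6],
}

best = select_by_rmsev(reference, candidates)
print(best, round(rmse(reference, candidates[best]), 3))
```

      The same selection function applies to ENR by keying the dictionary on α instead of the number of latent variables.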

      Model Performance Evaluation.

      Model evaluation was achieved by predicting the test set using the finalized PLSR and ENR models. We evaluated the final model performance using root mean square error of prediction (RMSEP), which summarizes the difference between predicted values and reference values. We calculated residual prediction deviation (RPD) as the standard deviation of the reference values divided by the RMSEP and used it to estimate the model's prediction power. We calculated relative prediction error (RPE) as the RMSEP divided by the mean of the reference values, expressed as a percentage, to evaluate the error of the prediction relative to the reference method. We conducted spectral preprocessing, statistical model building, and evaluation using an in-house program developed in the R programming language with the caret, pls, and glmnet packages (
      • Mevik B.-H.
      • Wehrens R.
      The pls Package: principal component and partial least squares regression in R.
      ;
      • Kuhn M.
      Building predictive models in R using the caret package.
      ;
      • Friedman J.
      • Hastie T.
      • Tibshirani R.
      Regularization paths for generalized linear models via coordinate descent.
      ;
      • R Core Team
      R: A language and environment for statistical computing.
      ).
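
      The three test-set metrics can be sketched as below, using the conventional definitions RPD = SD of reference values / RMSEP and RPE = 100 × RMSEP / mean of reference values. These orientations are consistent with the magnitudes reported in this article (RPD above 3, RPE given as a percentage). The snippet is a Python illustration rather than the authors' R code, the data are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


def rpe(reference, predicted):
    """Relative prediction error (%): 100 x RMSEP / mean of reference values."""
    return 100.0 * rmsep(reference, predicted) / statistics.mean(reference)


# Hypothetical test-set casein values (%), for illustration only.
reference = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.1, 3.4]

print(f"RMSEP = {rmsep(reference, predicted):.2f}%")
print(f"RPD   = {rpd(reference, predicted):.2f}")
print(f"RPE   = {rpe(reference, predicted):.2f}%")
```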

      RESULTS AND DISCUSSION

      Reference Casein and CN:CP Measurements

      In general, the average casein content in milk is 2.6 to 2.8% (
      • Walstra P.
      • Jenness R.
      Dairy Chemistry and Physics.
      ;
      • Fox P.F.
      • McSweeney P.L.
      • Paul L.H.
      Dairy Chemistry and Biochemistry.
      ). However, average casein content can be affected by season, diet, and genetic variations in dairy cows (
      • Lin Y.
      • O'Mahony J.A.
      • Kelly A.L.
      • Guinee T.P.
      Seasonal variation in the composition and processing characteristics of herd milk with varying proportions of milk from spring-calving and autumn-calving cows.
      ).
      • Lin Y.
      • O'Mahony J.A.
      • Kelly A.L.
      • Guinee T.P.
      Seasonal variation in the composition and processing characteristics of herd milk with varying proportions of milk from spring-calving and autumn-calving cows.
      evaluated the casein content of pooled pasteurized skim milk from Holstein Friesians over 1 year and found that casein content ranged from 2.61 to 3.02%.
      • Chen B.
      • Lewis M.J.
      • Grandison A.S.
      Effect of seasonal variation on the composition and properties of raw milk destined for processing in the UK.
      monitored pooled raw milk for 1 year and found that casein content ranged from 2.08 to 2.52%. With the recent popularity of high-protein beverages, UF milk as a consumer product has entered the market. According to a high-protein milk application developed by

      Ur-Rehman, S., B. Kopesky, S. Backinoff, T. P. Doelman, and C. White, inventors. 2017. Fractionating milk and UHT sterilization of milk fractions. Fairlife LLC, assignee. U.S. Pat. No. 15/446,032.

      , casein in UF and delactosed milk can range from 2 to 8% during production, and for the finished product, CP content can range from 4.9 to 5.2%. Although casein content is not specified for the finished product, it is assumed to be less than the reported total protein content. Table 1 summarizes the protein fractions of UF milk obtained for this study. For the 10 commercial UF milk samples we obtained, casein content ranged from 2.90 to 3.34%, and CP ranged from 3.54 to 3.93%. For the commercial pasteurized skim milk samples, casein content ranged from 2.03 to 2.64%, and CP ranged from 2.75 to 3.42%. Casein content for the raw milk samples in this study ranged from 1.88 to 2.66%, and CP ranged from 2.57 to 3.39%. The casein content variation observed in this study suggested that the calibration range for measuring casein needed to cover the casein range for raw, skim, and UF milk. In Table 1, the casein content of the calibration samples ranged from 1.21 to 4.45%, providing a sufficient calibration range for measuring casein in these milk types.
      The casein content variation in milk can also cause variations in CN:CP.
      • Lin Y.
      • O'Mahony J.A.
      • Kelly A.L.
      • Guinee T.P.
      Seasonal variation in the composition and processing characteristics of herd milk with varying proportions of milk from spring-calving and autumn-calving cows.
      reported that the CN:CP for pasteurized skim milk from Holstein Friesian cows ranged from 0.75 to 0.81 over 1 yr of observation. According to
      • Schaar J.
      • Hansson B.
      • Pettersson H.-E.
      Effects of genetic variants of κ-casein and β-lactoglobulin on cheesemaking.
      , genetic variants affected κ-CN and β-LG synthesis during lactation, which led to variations in casein numbers (CN:CP × 100) and cheese composition (
      • Lundén A.
      • Nilsson M.
      • Janson L.
      Marked effect of β-lactoglobulin polymorphism on the ratio of casein to total protein in milk.
      ). We also observed variations in CN:CP in this study. The CN:CP ranged from 0.82 to 0.86 in UF milk, 0.71 to 0.79 in pasteurized skim milk, and 0.73 to 0.80 in raw milk (Table 1). The variation in milk samples required a set of calibration samples that covered the target CN:CP range. Table 1 shows that the calibration sample range was obtained by mixing various amounts of UF and MF retentate and permeate, which produced CN:CP of 0.66 to 0.88. With the wide range of CN:CP in the calibration set, we were able to avoid the collinearity effects between casein and CP and ensure robust casein predictions in various types of milk.

      Tryptophan Fluorescence Emission Spectra of Calibration Samples

      We collected tryptophan fluorescence emission spectra of acid-precipitated calibration samples using FFFS. The emission maxima (λmax) of the calibration samples (ncal = 30) ranged from 338 to 341 nm. According to
      • Andersen C.M.
      • Mortensen G.
      Fluorescence spectroscopy: A rapid tool for analyzing dairy products.
      , emission maxima of approximately 340 nm confirm the fluorophore to be tryptophan. The calibration set varied in terms of casein content and CN:CP (Table 1) and led to differences in tryptophan emission intensity. Figure 2A shows 4 representative calibration samples with low CP (samples I and II) and high CP (samples III and IV). In the samples with low CP, the casein content in sample I (1.6%) was less than in sample II (2.1%). The non-casein nitrogen levels for samples I and II were also different, at 0.9 and 0.4%, respectively. The CP levels for samples I and II were very similar, at 2.5%. In Figure 2B, the tryptophan emission spectra of samples I and II differed markedly in fluorescence emission intensity, with a 23.3% increase in intensity at λmax. On the other hand, a similar amount of casein was present in samples II and III (2.1%), but their CP contents were different, because the non-casein nitrogen of sample III was higher than that of sample II (1.3 and 0.4%, respectively). The emission spectra of samples II and III were similar, with a 2.3% change in intensity at λmax. Samples III and IV had a similar amount of CP (3.4%), but different casein content (2.1 and 2.7%, respectively). The difference in casein content is again highlighted in the change of emission spectra shown in Figure 2B. The emission spectral differences among samples I, II, III, and IV illustrate that the tryptophan emission spectra of the acid-precipitated casein were more sensitive to casein content than to CP content or CN:CP in milk.
      Figure 2. (A) Protein fractions (casein and non-casein nitrogen) of 4 representative calibration samples of low CP (samples I and II) and high CP (samples III and IV). (B) Tryptophan emission spectra of the corresponding samples.
      At pH 4.6, casein reaches its isoelectric point, aggregating in the milk dispersion system, and the serum phase remains as a transparent liquid. The tryptophan-containing casein aggregates can absorb excitation light (280 nm) and emit fluorescence at 300 to 400 nm (
      • Herbert S.
      • Riaublanc A.
      • Bouchet B.
      • Gallant D.J.
      • Dufour E.
      Fluorescence spectroscopy investigation of acid-or rennet-induced coagulation of milk.
      ). At the same time, the casein aggregates have surface protuberances that could randomly scatter the excitation light (
      • McMahon D.J.
      • Du H.
      • McManus W.R.
      • Larsen K.M.
      Microstructural changes in casein supramolecules during acidification of skim milk.
      ). The scattered excitation light may be absorbed again by the casein aggregates, producing additional fluorescence. In the serum phase, tryptophan-containing whey proteins, peptides, and free amino acids will also absorb the excitation light and emit fluorescence (
      • Birlouez-Aragon I.
      • Sabat P.
      • Gouti N.
      A new method for discriminating milk heat treatment.
      ). However, because of the low optical density of the serum phase, less scattering is expected there. Because the casein aggregates scatter light, the tryptophan fluorescence observed in Figure 2 may reflect a change in casein content more than a change in CP content in a given milk system at a pH of 4.6.

      Calibration Model Development and Optimization

      We constructed PLSR and ENR models using the acid-precipitated casein tryptophan emission spectra as inputs. Because the calibration samples were laboratory-constructed, a validation set was necessary to ensure the model's validity for real milk samples. For PLSR model validation and optimization, the number of latent variables determined the model performance in terms of RMSEV and R2 (
      • Wold S.
      • Sjöström M.
      • Eriksson L.
      PLS-regression: A basic tool of chemometrics.
      ).

      Model Optimization for PLSR.

      A typical latent variable selection process is shown in Figure 3A for PLSR prediction of casein content using the Savitzky–Golay smoothing preprocessing technique. The RMSEV (0.66%) reached its minimum with 2 latent variables, indicating the lowest prediction error for casein content. The R2 (0.90) was also highest with 2 latent variables, so we selected the PLSR model with 2 latent variables for the prediction model. However, this RMSEV was high, corresponding to an RPE of 26.3%. The prediction of the validation set is visualized in Figure 3B. Although we found a good linear trend between reference and predicted casein contents, the high-casein samples showed a proportional overestimation of prediction. Considering the high R2 obtained from the validation, we applied a linear model to correct the overestimation in the validation set. This linear model correction approach is known as calibration transfer. Calibration transfer methods are more commonly seen in the development of near-infrared spectroscopy models to account for instrument signal drifts (
      • Bouveresse E.
      • Hartmann C.
      • Massart D.L.
      • Last I.R.
      • Prebble K.A.
      Standardization of near-infrared spectrometric instruments.
      ;
      • Liu Y.
      • Cai W.
      • Shao X.
      Linear model correction: A method for transferring a near-infrared multivariate calibration model without standard samples.
      ). It is less common to use the calibration transfer method on the same instrument. However, because the calibration samples were laboratory-constructed, they may have yielded different fluorescence intensities because of possible variations in total solids and mineral contents. The validation set was able to capture the high estimation bias and correct it as part of the model development process.
      Figure 3. (A) Example of parameter optimization for partial least squares regression (PLSR) for latent variable selections using the validation set. The solid line represents the change in root mean square error of validation (RMSEV), and the dotted line represents the change in coefficient of determination (R2). (B) Predicted versus reference casein (%) of the validation set using PLSR. The solid trend line represents the least square fit of the scatter plot, and the dotted line represents the ideal prediction target (x = y).
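
      The linear model correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not give the exact fitting procedure, so the sketch assumes an ordinary least squares line regressing reference values on raw model predictions from the validation set. The function names and all numbers are hypothetical and are not data from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def correct(predictions, slope, intercept):
    """Apply the fitted correction line to raw model predictions."""
    return [slope * p + intercept for p in predictions]


# Hypothetical validation data (% casein), NOT values from the study.
reference = [2.0, 2.5, 3.0, 3.5, 4.0]
# Raw predictions with a proportional overestimation at high casein.
raw_pred = [2.1, 2.8, 3.5, 4.2, 4.9]

# Assumption: regress reference on raw predictions, then reuse the line.
slope, intercept = fit_line(raw_pred, reference)
corrected = correct(raw_pred, slope, intercept)
```

      In this toy case the bias is exactly linear, so the corrected values coincide with the reference values. With real spectra the correction would only remove the linear part of the bias, and the same slope and intercept would then be applied unchanged to the independent test set.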

      Model Optimization for ENR.

      To optimize the ENR model, we selected the optimal α using the lowest RMSEV generated from the validation set. In Figure 4A, a representative elastic net parameter selection process is shown for prediction of casein content using Savitzky–Golay smoothing as the preprocessing step. The lowest RMSEV observed was 0.64% when α was 0.9. When α was 0.1, R2 reached the highest value of 0.9. When different α values were optimal for RMSEV and R2, we chose RMSEV as the evaluating criterion because it judges the true prediction power of the model (Geladi, 2002). Therefore, we chose an α value of 0.9 to optimize the ENR model. As in the PLSR model, we used linear model correction to adjust the biased estimation plotted in Figure 4B for ENR predictions.
      Figure 4. (A) Example of parameter optimization for elastic net regression (ENR) for elastic net parameter selections using the validation set. The solid line represents the change in root mean square error of validation (RMSEV), and the dotted line represents the change in coefficient of determination (R2). (B) Predicted versus reference casein (%) of the validation set using ENR. The solid trend line represents the least square fit of the scatter plot, and the dotted line represents the ideal prediction target (x = y).
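
      The selection rule used for both models (keep the tuning parameter that gives the lowest RMSEV on the validation set, even when R2 favors another value) can be illustrated with a short Python sketch. The helper names are hypothetical, R2 is assumed here to be the squared Pearson correlation between reference and predicted values, and the numbers are invented to show how the two criteria can disagree; they are not results from the study.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def r_squared(reference, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(reference)
    mr, mp = sum(reference) / n, sum(predicted) / n
    sxy = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    sxx = sum((r - mr) ** 2 for r in reference)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def select_by_rmsev(reference, predictions_by_param):
    """Return the parameter with the lowest validation RMSE, plus all scores."""
    scores = {k: rmse(reference, v) for k, v in predictions_by_param.items()}
    return min(scores, key=scores.get), scores


# Hypothetical validation predictions (% casein) for two candidate alphas.
reference = [2.0, 2.5, 3.0, 3.5, 4.0]
predictions = {
    0.1: [2.6, 3.1, 3.6, 4.1, 4.6],  # perfectly correlated but biased
    0.9: [2.1, 2.4, 3.1, 3.4, 4.1],  # small scattered errors, no bias
}

best_alpha, scores = select_by_rmsev(reference, predictions)
r2_by_alpha = {k: r_squared(reference, v) for k, v in predictions.items()}
```

      In this toy case the first candidate has the higher R2 because a constant offset does not affect correlation, while the second has the much lower RMSEV and is the one selected. This is one reason a correlation-based R2 alone can overstate prediction power.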

      Optimized Casein Content Prediction Models.

      Table 2 shows the optimized PLSR and ENR calibration models for casein quantifications, with corresponding optimization factors. For casein content prediction, different preprocessing and regression techniques yielded similar prediction power. The PLSR and ENR models showed similar performance when predicting casein, in agreement with another study that compared PLSR with ENR (Giglio and Brown, 2018). According to Williams and Norris (2001), an R2 value greater than 0.95 indicates reliable model prediction power in food analysis. The low RMSEV for the casein predictions also indicated that the model carried potential as a rapid casein quantification method. However, the optimized models needed to be tested externally with unknown samples to estimate their final performance.
      Table 2. Optimized partial least squares regression (PLSR) and elastic net regression (ENR) calibration model performance of casein and CN:CP quantification
      Item        Preprocessing  PLSR R2  PLSR RMSEV  PLSR Nlv  ENR R2  ENR RMSEV  ENR α  ENR λ
      Casein (%)  SG-S           0.95     0.18        2         0.96    0.18       0.90   0.088
      Casein (%)  SG-1st         0.97     0.14        2         0.97    0.17       0.30   0.33
      RMSEV = root mean square error of validation; Nlv = number of latent variables; SG-S = Savitzky–Golay smoothing; SG-1st = Savitzky–Golay first derivative; α = elastic net parameter; λ = regularization parameter.

      External Testing of Casein Quantifications

      The externally tested model performance for casein quantification in UF, skim, and raw milk testing samples is shown in Table 3. For casein prediction, PLSR and ENR yielded similar test results, with RMSEP ranging from 0.12 to 0.13% and an R2 of 0.91 (Figure 5). In further analysis of casein quantification error, the RPD of the models ranged from 3.2 to 3.4. According to Williams and Norris (2001), an RPD greater than 3 indicates very good prediction power for food analysis purposes. The RPE of the casein predictions ranged from 4.9 to 5.1%, showing the relative error in the context of the average casein content of the test set. Following the approach of Piñeiro et al. (2008), a linear regression between the reference and predicted casein content showed no significant difference from the slope of 1 and intercept of 0, indicating that the FFFS-based method had no significant proportional or constant bias for the prediction of casein content (P < 0.05) compared with the reference method.
      Table 3. Final partial least squares regression (PLSR) and elastic net regression (ENR) test model performance of casein and CN:CP quantification
      Item        Preprocessing  PLSR R2  PLSR RMSEP  PLSR RPD  PLSR RPE (%)  ENR R2  ENR RMSEP  ENR RPD  ENR RPE (%)
      Casein (%)  SG-S           0.91     0.12        3.4       4.9           0.91    0.13       3.3      5.0
      Casein (%)  SG-1st         0.91     0.13        3.2       5.1           0.91    0.13       3.3      5.0
      RMSEP = root mean square error of prediction; RPD = residual prediction deviation; RPE = relative prediction error; SG-S = Savitzky–Golay smoothing; SG-1st = Savitzky–Golay first derivative.
      Figure 5. (A) Test model of predicted versus reference casein (%) using Savitzky–Golay smoothed spectra and partial least squares regression. (B) Test model of predicted versus reference casein (%) using Savitzky–Golay smoothed spectra and elastic net regression. The solid trend line represents the least square fit of the scatter plot, and the dotted line represents the ideal prediction target (x = y).
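
      For readers who want to reproduce the three test statistics reported in Table 3, the following Python sketch shows one common way to compute them. It assumes the usual definitions: RMSEP as the root mean square difference between reference and predicted values, RPD as the standard deviation of the reference values divided by RMSEP, and RPE as RMSEP expressed as a percentage of the mean reference value. The use of the sample standard deviation is an assumption, and the data are hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP."""
    return stdev(reference) / rmsep(reference, predicted)


def rpe(reference, predicted):
    """Relative prediction error: RMSEP as a % of the mean reference value."""
    return 100 * rmsep(reference, predicted) / mean(reference)


# Hypothetical test-set values (% casein), NOT data from the study.
reference = [2.0, 2.5, 3.0, 3.5, 4.0]
predicted = [2.1, 2.4, 3.1, 3.4, 4.1]

test_rmsep = rmsep(reference, predicted)
test_rpd = rpd(reference, predicted)
test_rpe = rpe(reference, predicted)
```

      Because RPD scales the error by the spread of the reference values, a test set that spans a wide casein range yields a higher RPD for the same RMSEP, which is worth keeping in mind when comparing RPD across studies.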
      Hewavitharana and van Brakel (1997) first reported an FTIR-based method for casein quantification on raw milk samples. The method was developed using multivariate statistical models and validated with a set of 20 raw milk samples. The casein measurement range of this method was 2.71 to 3.62%, with an error of 0.08 to 0.1%. A follow-up study by Luginbühl (2002) using standard milk samples showed a casein measurement range of 1.8 to 4.5% with a lower measurement error of 0.046 to 0.08%. It appeared that the increase of calibration range and high sample homogeneity improved the accuracy of FTIR-based measurements of casein. A mid-infrared-based method published by McDermott et al. (2016) aimed to quantify milk casein and free amino acids. The method was designed to capture casein content variation from different genetic breeds. Although the range of the casein measurement was not reported, the error from the study was 0.48%, almost 10 times higher than the FTIR method. Baum et al. (2016) pointed out that varying CN:CP can break existing FTIR-based casein quantification methods, because they rely on collinearity between casein and CP to make predictions. Baum et al. (2016) applied an enzymatic precipitation approach to quantify casein in milk with various CN:CP, using FTIR combined with PLSR. The study reported a cross-validated RMSE of 0.12%, which was very similar to the current study's externally validated RMSEP (0.12 to 0.13%). The FFFS-based prediction results from PLSR and ENR were comparable to existing methods. Test samples from UF, skim, and raw milk improved the robustness of the method, measuring casein contents from various types of milk with different CN:CP.

      CONCLUSIONS

      Cheese milk standardization accounts for natural variations in casein and CN:CP to ensure consistent product quality. In this study, we developed and validated an FFFS-based method to measure casein in fluid milk with various CN:CP. Using PLSR and ENR with external validations, the prediction models quantified casein in raw, skim, and UF milk with an RMSEP of 0.12 to 0.13%, an RPD of 3.2 to 3.4, and an RPE of 4.9 to 5.1% relative to the reference method. The FFFS-based method provides practical prediction power to serve as a rapid tool for measuring casein content in fluid milk, indicating cheese yield, and standardizing cheese milk in dairy farms and processing plants.

      ACKNOWLEDGMENTS

      The authors thank MaryAnne Drake from the Southeast Dairy Foods Research Center (Raleigh, NC) for the MF retentate and permeate donations. The authors appreciate the technical discussion with Dr. Inès Birlouez-Aragon from Spectralys Innovation (Romainville, France). This project is Kansas State Research and Extension contribution number 12-278-J. The authors have not stated any conflicts of interest.

      REFERENCES

        • Andersen C.M.
        • Mortensen G.
        Fluorescence spectroscopy: A rapid tool for analyzing dairy products.
        J. Agric. Food Chem. 2008; 56 (18173241): 720-729
        • AOAC International
        Official Methods of Analysis.
        20th ed. AOAC International, Gaithersburg, MD, 2016
        • Babu K.S.
        • Amamcharla J.K.
        Application of front-face fluorescence spectroscopy as a tool for monitoring changes in milk protein concentrate powders during storage.
        J. Dairy Sci. 2018; 101 (30316594): 10844-10859
        • Barbano D.M.
        • Dellavalle M.E.
        Rapid method for determination of milk casein content by infrared analysis.
        J. Dairy Sci. 1987; 70 (3668027): 1524-1528
        • Barbano D.M.
        • Sherbon J.W.
        Cheddar cheese yields in New York.
        J. Dairy Sci. 1984; 67: 1873-1883
        • Baum A.
        • Hansen P.W.
        • Nørgaard L.
        • Sørensen J.
        • Mikkelsen J.D.
        Rapid quantification of casein in skim milk using Fourier transform infrared spectroscopy, enzymatic perturbation, and multiway partial least squares regression: Monitoring chymosin at work.
        J. Dairy Sci. 2016; 99 (27265175): 6071-6079
        • Bevilacqua M.
        • Bro R.
        • Marini F.
        • Rinnan Å.
        • Rasmussen M.A.
        • Skov T.
        Recent chemometrics advances for foodomics.
        TrAC Trend. Anal. Chem. 2017; 96: 42-51
        • Birlouez-Aragon I.
        • Sabat P.
        • Gouti N.
        A new method for discriminating milk heat treatment.
        Int. Dairy J. 2002; 12: 59-67
        • Bonfatti V.
        • Grigoletto L.
        • Cecchinato A.
        • Gallo L.
        • Carnier P.
        Validation of a new reversed-phase high-performance liquid chromatography method for separation and quantification of bovine milk protein genetic variants.
        J. Chromatogr. A. 2008; 1195 (18495141): 101-106
        • Bouveresse E.
        • Hartmann C.
        • Massart D.L.
        • Last I.R.
        • Prebble K.A.
        Standardization of near-infrared spectrometric instruments.
        Anal. Chem. 1996; 68: 982-990
        • Brown C.D.
        • Vega-Montoto L.
        • Wentzell P.D.
        Derivative preprocessing and optimal corrections for baseline drift in multivariate calibration.
        Appl. Spectrosc. 2000; 54: 1055-1068
        • Carter B.
        • Patel H.
        • Barbano D.M.
        • Drake M.
        The effect of spray drying on the difference in flavor and functional properties of liquid and dried whey proteins, milk proteins, and micellar casein concentrates.
        J. Dairy Sci. 2018; 101 (29501331): 3900-3909
        • Chen B.
        • Lewis M.J.
        • Grandison A.S.
        Effect of seasonal variation on the composition and properties of raw milk destined for processing in the UK.
        Food Chem. 2014; 158 (24731334): 216-223
        • Cipolat-Gotet C.
        • Cecchinato A.
        • De Marchi M.
        • Bittante G.
        Factors affecting variation of different measures of cheese yield and milk nutrient recovery from an individual model cheese-manufacturing process.
        J. Dairy Sci. 2013; 96 (24094531): 7952-7965
        • Dimenna G.P.
        • Segall H.J.
        High-performance gel-permeation chromatography of bovine skim milk proteins.
        J. Liq. Chromatogr. 1981; 4: 639-649
        • Emmons D.B.
        • Modler H.W.
        Invited review: A commentary on predictive cheese yield formulas.
        J. Dairy Sci. 2010; 93 (21094725): 5517-5537
        • Filzmoser P.
        • Gschwandtner M.
        • Todorov V.
        Review of sparse methods in regression and classification with application to chemometrics.
        J. Chemometr. 2012; 26: 42-51
        • Foss Electric
        Application note No. 102. Calibration for casein in cow milk. MilkoScan FT 120.
        Foss Electric, Hillerød, Denmark, 1997
        • Fox P.F.
        • McSweeney P.L.
        • Paul L.H.
        Dairy Chemistry and Biochemistry.
        Blackie Academic & Professional, London, UK, 1998
        • Friedman J.
        • Hastie T.
        • Tibshirani R.
        Regularization paths for generalized linear models via coordinate descent.
        J. Stat. Softw. 2010; 33 (20808728): 1-22
        • Geladi P.
        Some recent trends in the calibration literature.
        Chemom. Intell. Lab. Syst. 2002; 60: 211-224
        • Giglio C.
        • Brown S.D.
        Using elastic net regression to perform spectrally relevant variable selection.
        J. Chemometr. 2018; 32: e3034
        • Herbert S.
        • Riaublanc A.
        • Bouchet B.
        • Gallant D.J.
        • Dufour E.
        Fluorescence spectroscopy investigation of acid-or rennet-induced coagulation of milk.
        J. Dairy Sci. 1999; 82: 2056-2062
        • Hewavitharana A.K.
        • van Brakel B.
        Fourier transform infrared spectrometric method for the rapid determination of casein in raw milk.
        Analyst (Lond.). 1997; 122: 701-704
        • Kuhn M.
        Building predictive models in R using the caret package.
        J. Stat. Softw. 2008; 28: 1-26
        • Kulmyrzaev A.A.
        • Levieux D.
        • Dufour É.
        Front-face fluorescence spectroscopy allows the characterization of mild heat treatments applied to milk. Relations with the denaturation of milk proteins.
        J. Agric. Food Chem. 2005; 53 (15686393): 502-507
        • Kumar P.
        • Sharma N.
        • Ranjan R.
        • Kumar S.
        • Bhat Z.F.
        • Jeong D.K.
        Perspective of membrane technology in dairy industry: A review.
        Asian-Australas. J. Anim. Sci. 2013; 26 (25049918): 1347-1358
        • Lin Y.
        • O'Mahony J.A.
        • Kelly A.L.
        • Guinee T.P.
        Seasonal variation in the composition and processing characteristics of herd milk with varying proportions of milk from spring-calving and autumn-calving cows.
        J. Dairy Res. 2017; 84 (28929997): 444-452
        • Liu Y.
        • Cai W.
        • Shao X.
        Linear model correction: A method for transferring a near-infrared multivariate calibration model without standard samples.
        Spectrochim. Acta A Mol. Biomol. Spectrosc. 2016; 169 (27380302): 197-201
        • Lucey J.
        • Kelly J.
        Cheese yield.
        Int. J. Dairy Technol. 1994; 47: 1-14
        • Luginbühl W.
        Evaluation of designed calibration samples for casein calibration in fourier transform infrared analysis of milk.
        Lebensm. Wiss. Technol. 2002; 35: 554-558
        • Lundén A.
        • Nilsson M.
        • Janson L.
        Marked effect of β-lactoglobulin polymorphism on the ratio of casein to total protein in milk.
        J. Dairy Sci. 1997; 80 (9406093): 2996-3005
        • Lynch J.M.
        • Barbano D.M.
        • Fleming J.R.
        Indirect and direct determination of the casein content of milk by Kjeldahl nitrogen analysis: Collaborative study.
        J. AOAC Int. 1998; 81 (9680702): 763-774
        • Ma Y.B.
        • Birlouez-Aragon I.
        • Amamcharla J.K.
        Development and validation of a front-face fluorescence spectroscopy-based method to determine casein in raw milk.
        Int. Dairy J. 2019; 93: 81-84
        • McDermott A.
        • Visentin G.
        • De Marchi M.
        • Berry D.P.
        • Fenelon M.A.
        • O'Connor P.M.
        • Kenny O.A.
        • McParland S.
        Prediction of individual milk proteins including free amino acids in bovine milk using mid-infrared spectroscopy and their correlations with milk processing characteristics.
        J. Dairy Sci. 2016; 99 (26830742): 3171-3182
        • McMahon D.J.
        • Du H.
        • McManus W.R.
        • Larsen K.M.
        Microstructural changes in casein supramolecules during acidification of skim milk.
        J. Dairy Sci. 2009; 92 (19923590): 5854-5867
        • Mevik B.-H.
        • Wehrens R.
        The pls Package: principal component and partial least squares regression in R.
        J. Stat. Softw. 2007; 18
        • Piñeiro G.
        • Perelman S.
        • Guerschman J.P.
        • Paruelo J.M.
        How to evaluate models: Observed vs. predicted or predicted vs. observed?
        Ecol. Modell. 2008; 216: 316-322
        • R Core Team
        R: A language and environment for statistical computing.
        R Foundation for Statistical Computing, Vienna, Austria, 2016
        • Recio I.
        • Olieman C.
        Determination of denatured serum proteins in the casein fraction of heat-treated milk by capillary zone electrophoresis.
        Electrophoresis. 1996; 17 (8855409): 1228-1233
        • Rowland S.J.
        176. The determination of the nitrogen distribution in milk.
        J. Dairy Res. 1938; 9: 42-46
        • Savitzky A.
        • Golay M.J.E.
        Smoothing and differentiation of data by simplified least squares procedures.
        Anal. Chem. 1964; 36: 1627-1639
        • Schaar J.
        • Hansson B.
        • Pettersson H.-E.
        Effects of genetic variants of κ-casein and β-lactoglobulin on cheesemaking.
        J. Dairy Res. 1985; 52: 429-437
        • Ur-Rehman S.
        • Kopesky B.
        • Backinoff S.
        • Doelman T.P.
        • White C.
        Fractionating milk and UHT sterilization of milk fractions.
        Fairlife LLC, assignee. U.S. Pat. No. 15/446,032. 2017
        • van der Ven C.
        • Gruppen H.
        • de Bont D.B.A.
        • Voragen A.G.J.
        Reversed phase and size exclusion chromatography of milk protein hydrolysates: Relation between elution from reversed phase column and apparent molecular weight distribution.
        Int. Dairy J. 2001; 11: 83-92
        • Walstra P.
        • Jenness R.
        Dairy Chemistry and Physics.
        John Wiley & Sons, Hoboken, NJ, 1984
        • Williams P.C.
        • Norris K.
        Near-Infrared Technology in the Agricultural and Food Industry.
        American Association of Cereal Chemists, Inc., St. Paul, MN, 2001
        • Wold S.
        • Sjöström M.
        • Eriksson L.
        PLS-regression: A basic tool of chemometrics.
        Chemom. Intell. Lab. Syst. 2001; 58: 109-130
        • Zhang H.
        • Metzger L.E.
        Noncasein nitrogen analysis of ultrafiltration and microfiltration retentate.
        J. Dairy Sci. 2011; 94 (21427004): 2118-2125