Perspective: Challenges with product testing in powdered infant formula

PERSPECTIVE
Right now, many concerned parents are probably asking: how could Cronobacter sakazakii get through the powdered infant formula (PIF) production system? Shouldn't a food safety monitoring system find such a serious problem? Aren't those products tested before arriving on shelves? The challenging answer is yes, these products were likely tested with a standardized sampling and testing method consistent with regulatory guidance. In fact, in an official statement, the company said all finished products are tested for Salmonella spp. and Cronobacter sakazakii before they are released, and the product samples the company retained tested negative for the implicated microorganism related to the complaints (Abbott, 2022). This raises two questions: how powerful can the current sampling and testing program be, and is there an end-product testing strategy that would guarantee that this would not happen again? The short answers are that current sampling and testing are not that powerful, and no, testing will never guarantee complete safety.
A thorough investigation of a PIF batch previously recalled for Cronobacter spp. (Jongenburger et al., 2011a) shows that the actual hazard tends to be localized, with high within-lot variability and a low concentration. The estimated contamination level was −2.8 ± 1.1 log10 (cfu/g), an average of about one cell per kilogram of product, with the high variability likely due to clusters of relatively more contaminated product. The FAO and WHO (2006) summary of industry routine testing data from nonrecalled batches also supports low-level contamination of C. sakazakii in PIF, with an estimated contamination level of −3.8 ± 0.7 log10 (cfu/g), an average of about one cell per 10 kg of product. Critically, these data highlight 2 issues with sampling: (1) high variability of contamination within a lot implies that end-product testing will not absolutely rule out the possibility of C. sakazakii contamination; and (2) the low level of contamination further complicates the challenge of finding the proverbial needle (C. sakazakii) in the haystack (PIF). Current sampling guidelines for PIF provided by the US Food and Drug Administration specify 60 grabs of 25-g samples (1,500 g total) for Salmonella spp. and 30 grabs of 10-g samples (300 g total) for Cronobacter spp. from each production lot (FDA, 2018). Some producers are likely using autosamplers (discussed below), which take more frequent yet smaller grabs, but their use is not codified in official guidance. Meanwhile, modern infant formula processing is done on a very large scale, with lot sizes of 22,700 kg or larger. The massive scale of production compared with the relatively trivial scale of sample testing (tens of grabs, hundreds of grams of product) creates issues of representativeness (catching small clusters) and power (catching low levels).
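The scale comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of any cited tool: the function names are hypothetical, the log10 level is simply back-transformed as a point value (ignoring the reported ± spread), and the lot size is taken as the 22,700 kg figure quoted above.

```python
def cells_per_kg(log10_cfu_per_g: float) -> float:
    """Back-transform a log10 (cfu/g) level to cells per kilogram.

    Assumption: the log10 mean is treated as a single point concentration;
    the reported standard deviation is ignored here.
    """
    return (10.0 ** log10_cfu_per_g) * 1000.0


def fraction_of_lot_tested(n_grabs: int, grab_mass_g: float, lot_mass_kg: float) -> float:
    """Fraction of the lot mass that ends up in the analytical samples."""
    return (n_grabs * grab_mass_g) / (lot_mass_kg * 1000.0)


if __name__ == "__main__":
    lot_kg = 22_700  # lot size quoted in the text
    print("recalled batch, cells/kg:", cells_per_kg(-2.8))
    print("routine data, cells/kg:", cells_per_kg(-3.8))
    print("Cronobacter plan (30 x 10 g):", fraction_of_lot_tested(30, 10, lot_kg))
    print("Salmonella plan (60 x 25 g):", fraction_of_lot_tested(60, 25, lot_kg))
```

Under these assumptions the 30 × 10 g Cronobacter plan examines roughly 0.0013% of a 22,700-kg lot, which is the mismatch in scale the paragraph describes.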
Statistical tools that calculate the power of sampling plans have been developed by ICMSF (2020) and JEMRA (2022). We used these tools to evaluate the current sampling guidelines (30 grabs of 10 g), assuming a large lot with low-level contamination consistent with the estimate for the recalled batch (around 1 cell/kg), where the proportion being tested is small (300 g of ~23,000 kg). The ICMSF tool uses the mean, standard deviation, and sampling plan to predict the chance that at least one sample contains at least one pathogen cell, and predicts about a 97% chance to reject the lot. Conversely, the JEMRA tool, which additionally accounts for the very large total lot size in relation to the small sampled size, predicts only about a 30% chance to reject the lot. This suggests that the current sampling plan guidelines are not likely to detect localized low-level contamination in large lots. Therefore, product testing is not likely to be useful for verifying that the system is operating with a very low level of residual food safety risk (Zwietering et al., 2021). Instead, it would likely catch only major contamination failures. Unfortunately, events such as the current recall demonstrate that PIF may cause meaningful harm when C. sakazakii is present in very large lots, even if present at a relatively low level.
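To make the first kind of calculation concrete, here is a minimal Poisson-lognormal sketch of "the chance that at least one sample contains at least one cell". It is not the ICMSF or JEMRA tool. It assumes that each grab's concentration is an independent draw from a lognormal distribution with the stated log10 mean and standard deviation, that cells within a grab are Poisson distributed, and that the assay detects a single cell. The function names and the numerical integration settings are illustrative choices.

```python
import math


def prob_grab_positive(log10_mean: float, log10_sd: float,
                       grab_mass_g: float, steps: int = 4000) -> float:
    """P(one grab holds >= 1 cell) under a Poisson-lognormal model.

    Integrates over the standard normal variable z with a midpoint rule;
    the concentration for a given z is 10**(log10_mean + log10_sd * z) cfu/g.
    """
    z_min, z_max = -8.0, 8.0
    dz = (z_max - z_min) / steps
    total = 0.0
    for i in range(steps):
        z = z_min + (i + 0.5) * dz
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        expected_cells = grab_mass_g * 10.0 ** (log10_mean + log10_sd * z)
        total += density * (1.0 - math.exp(-expected_cells)) * dz
    return total


def prob_reject_lot(log10_mean: float, log10_sd: float,
                    n_grabs: int, grab_mass_g: float) -> float:
    """P(at least one of n independent grabs is positive)."""
    p = prob_grab_positive(log10_mean, log10_sd, grab_mass_g)
    return 1.0 - (1.0 - p) ** n_grabs


if __name__ == "__main__":
    # Recalled-batch estimate and the 30 x 10 g plan from the text.
    print(prob_reject_lot(-2.8, 1.1, n_grabs=30, grab_mass_g=10))
    # Same mean with no between-grab variability, for comparison.
    print(prob_reject_lot(-2.8, 0.0, n_grabs=30, grab_mass_g=10))
```

With the recalled-batch parameters this sketch returns a rejection probability above 90%, in the same range as the ICMSF figure quoted above. Setting the standard deviation to zero drops it to roughly 38%, which shows how strongly the answer depends on the assumed variability. The sketch does not attempt the lot-size correction attributed to the JEMRA tool.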
Existing research points to ways to improve sampling plan guidelines, and it consistently suggests that taking many small samples increases the likelihood of correctly identifying contaminated lots. Empirical data from a recalled Cronobacter-contaminated PIF batch describe a localized, low-level contamination event (the previously mentioned Jongenburger et al., 2011a). Statistical research supports that collecting many small incremental samples using an autosampler can increase the detection power in powder sampling (Thevaraja et al., 2021). These autosamplers can be applied in the baghouse to potentially capture contamination introduced up to the point of packaging. Other research on the best pattern for taking incremental samples with autosamplers has shown that when localized hazards exist, systematic sampling can detect the hazard better than simple random sampling (Jongenburger et al., 2011b), and that stratified random sampling may be a good hybrid approach that detects both localized hazards and periodic, systematic hazards (Jongenburger et al., 2015).
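The difference between the three sampling patterns can be illustrated with a toy model. This sketch is not the method of the cited studies. It assumes the production run is a unit interval, the hazard is a single contiguous cluster covering a fixed fraction of the run, and any grab that lands inside the cluster tests positive. The function names, the 5% cluster width, and the seed are hypothetical.

```python
import random


def sample_positions(plan: str, n: int, rng: random.Random) -> list:
    """Return n grab positions in [0, 1) along the production run."""
    if plan == "simple_random":
        return [rng.random() for _ in range(n)]
    if plan == "systematic":
        # Evenly spaced grabs with one random starting offset.
        offset = rng.random() / n
        return [offset + k / n for k in range(n)]
    if plan == "stratified_random":
        # One random grab inside each of n equal strata.
        return [(k + rng.random()) / n for k in range(n)]
    raise ValueError(f"unknown plan: {plan}")


def hit_rate(plan: str, n: int = 30, cluster_width: float = 0.05,
             trials: int = 5000, seed: int = 1) -> float:
    """Fraction of simulated lots in which at least one grab lands in the cluster."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        start = rng.random() * (1.0 - cluster_width)
        end = start + cluster_width
        if any(start <= p < end for p in sample_positions(plan, n, rng)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for plan in ("simple_random", "systematic", "stratified_random"):
        print(plan, hit_rate(plan))
```

In this toy setting the systematic plan always lands a grab in the cluster because the cluster is wider than the spacing between grabs, while simple random sampling misses it in a noticeable share of lots. The model does not include periodic hazards, which is where the cited work suggests a stratified random pattern becomes attractive.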
Our research group developed a simulation approach to guide improvements to sampling plans for large-scale bulk products (Cheng and Stasiewicz, 2021). We used this to study detection of small hotspots of aflatoxin in large corn bins, and concluded that (1) clustering reduces the power of all sampling and testing plans, but (2) taking a greater number of smaller samples with some type of randomization improves detection when clustering is high. We currently have a funded project to adapt this approach to simulate testing for powdered products such as PIF. Because this is a simulation, the work could ultimately allow stakeholders (producers, retailers, regulators) to experiment on their own with developing sampling plans specific to the production scales, hazard profiles, and sampling capabilities in their specific systems of concern. The work of our group and others cited could help guide improvements to infant formula sampling and improve the identification of contaminated lots. Still, practical limits to the scale of testing in relation to the scale of production mean no testing strategy will guarantee pathogen-free infant formula. After all, one cannot test the entire lot, and even setting an autosampler to take 1 g from every can of formula during packing would still leave a substantial fraction not tested.
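The two conclusions above, that clustering reduces power and that more, smaller samples help when clustering is high, can be reproduced in a closed-form toy model. This is a simplified illustration and not the simulation of Cheng and Stasiewicz (2021). It assumes all cells sit in a cluster occupying a fixed fraction of the lot at uniform concentration, grabs are placed independently at random, cells within a grab are Poisson distributed, and a single cell is always detected. The function name and the 1% cluster fraction are hypothetical.

```python
import math


def detection_probability(mean_cfu_per_g: float, cluster_fraction: float,
                          n_grabs: int, grab_mass_g: float) -> float:
    """P(at least one positive grab) when all cells sit in one cluster.

    cluster_fraction = 1.0 means contamination is spread evenly over the lot.
    """
    # Same lot-wide mean, concentrated into the contaminated fraction.
    conc_in_cluster = mean_cfu_per_g / cluster_fraction
    p_positive_if_in_cluster = 1.0 - math.exp(-conc_in_cluster * grab_mass_g)
    p_grab_positive = cluster_fraction * p_positive_if_in_cluster
    return 1.0 - (1.0 - p_grab_positive) ** n_grabs


if __name__ == "__main__":
    mean = 10.0 ** -2.8  # cfu/g, point value for the recalled batch
    print("uniform lot, 30 x 10 g:", detection_probability(mean, 1.0, 30, 10))
    print("clustered lot, 30 x 10 g:", detection_probability(mean, 0.01, 30, 10))
    print("clustered lot, 300 x 1 g:", detection_probability(mean, 0.01, 300, 1))
```

All three plans test the same 300 g. In this sketch the clustered lot is detected less often than the uniform lot by the 30 × 10 g plan, and splitting the same mass into 300 grabs of 1 g recovers part of that loss.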
This outbreak is a signal that the infant formula industry needs to collaborate around modern food safety plans that proactively prevent contamination with pathogens such as Cronobacter through continuous improvement in good manufacturing practices, sanitation standard operating procedures, and hygienic equipment and plant design. Other industries have done this. For example, after high-profile outbreaks in almonds, new USDA rules required almonds to be pasteurized with a validated process, and in response, the Almond Board of California developed a voluntary action plan (USDA, 2007). Perhaps more relevant, after being linked to many listeriosis outbreaks, the US deli meat industry has put many resources into collective action to improve Listeria monocytogenes environmental monitoring, tracking contamination and eliminating it both at that time and in the future with the "seek and destroy" approach (Ferreira et al., 2014; Malley et al., 2015). Optimized product sampling and testing programs for detecting the organisms of concern can play an important, though limited, role in such a comprehensive food safety system, both as a verification tool and as a method to identify significant failures for follow-up study and future control.

ACKNOWLEDGMENTS
This perspective paper was created in response to an invitation by the editor-in-chief of the Journal of Dairy Science. Although the authors received no funding specific to this perspective piece, M. J. Stasiewicz does have an active research project, Simulating Powdered Product Sampling to Improve Food Safety Sampling Plans. That work is supported by the Institute for the Advancement of Food and Nutrition Sciences (IAFNS; Washington, DC), a nonprofit science organization that pools funding from industry and advances science through the in-kind and financial contributions from private and public sector members; IAFNS had no role in the design, analysis, interpretation, or presentation of the data and results for the larger project, nor this perspective paper. The authors have not stated any conflicts of interest.