Identification of new reference genes for colony counting by reverse-transcription quantitative PCR in Bifidobacterium animalis

Bifidobacterium animalis , one of the predominant bacteria in the intestines of humans and other mammals, is widely added to dairy products. We employed RNA sequencing to analyze gene expression variance on a genome-wide scale and found stable reference genes (RG) in B. animalis . A total of 1,665 genes were identified by analyzing the data from the transcriptome under 4 different conditions, and 13 probable candidate RG with variation coefficient values <0.1 were validated using reverse-transcription quantitative PCR (RT-qP-CR). The amplification efficiency of candidate RG were ranging from 94.16% to 126.25%. We integrated the analysis results of BestKeeper, geNorm, NormFinder, and RefFinder algorithms and revealed that rplD and atpA comprehensive ranked 1.68 and 2.82, respectively, which were more stable than traditional RG. Compared with plate count (1.58 × 10 6 cfu/mL), the concentrations of B. animalis AR668 by RT-qPCR using rplD , atpA , and 16S rRNA as RG were 2.27 × 10 6 , 2.24 × 10 6 , and 6.66 × 10 6 cfu/mL, respectively, after 10 h of fermentation in fermented skim milk. It suggested that rplD and atpA as RG can be accurate for colony counting of B. animalis . Our study provides the foundation for more accurate analysis of colony counting by RT-qPCR of B. animalis in dairy foods.


INTRODUCTION
Bifidobacterium animalis are one of the generally recognized safe bacteria with nonpathogenic, fermentative, and anaerobic characteristics (Yasmin et al., 2020).Dairy products containing B. animalis provide nutritional benefits and probiotic functions, which could play an important role in human gut flora through generating organic acids, reducing gastric juice pH, and boosting mineral bioavailability (García-Burgos et al., 2020).In addition, B. animalis can convert linoleic acid to conjugated linoleic acid and decompose lactose in fermented dairy products, which has beneficial effects in preventing cancer, cardiovascular diseases, diabetes mellitus, atherosclerosis, and lactose intolerance (Esmaeilnejad Moghadam et al., 2019).Therefore, B. animalis is widely added in dairy products, such as beverages, yogurt, cheese, and ice cream (Shori, 2021).
The most widely used approach for measuring gene expression in various biological systems is reverse-transcription quantitative PCR (RT-qPCR).The proper transcript abundance estimation depends significantly on standardization, which demands the use of reference genes (RG) that are consistently expressed under analytical circumstances (Pombo et al., 2017).Most of the traditionally used RG for RT-qPCR are adapted from northern blot and reverse-transcription semiquantitative PCR investigations (Connors et al., 2019) such as GAPDH (glyceraldehyde 3-phosphate dehydrogenase), 16S rRNA (16S ribosomal RNA), recA (DNA recombination/repair protein), elongation factor 1 α (EF1), and actin (ACT).These common RG are used in probiotics (Bunesova et al., 2017), which were arbitrarily chosen without any stability test or explanation (Zhang et al., 2019).As we all know, the expression of any type of RG cannot remain constant across all experimental settings.The expression of RG generally varies between cell types and phases of cell development (Huggett et al., 2005).To achieve more legitimate and accurate results, RG should be validated before gene expression analysis, and only the relatively stable internal RG can be used to correct the expression of the target genes.When identifying species within the genus Bifidobacterium, Delétoile et al. (2010) used a multilocus sequence typing method based on the housekeeping genes clpC (encoding for a protease), fusA (GTP-binding elongation factor EF-G), gyrB (DNA gyrase, subunit B), ileS (isoleucyl-tRNA synthetase), purF (amidophosphoribosyltransferase), rplB (50S ribosomal subunit protein L2), and rpoB (β-subunit of RNA polymerase).Unfortunately, no exhaustive examination of stable RG in B. animalis has been performed (Huyghe, 2017).Most investigations of gene expression in bacteria by RT-qPCR still arbitrarily pick certain traditional RG without stability testing or even interpretation (Zhang et al., 2019).
The emergence of high-throughput sequencing technology has made it simpler to discover these RG that are expressed in a consistent manner (Pombo et al., 2019).Recently, RNA sequencing (RNA-seq) has been used to identify more trustworthy RG for RT-qPCR research in a variety of species and experimental circumstances (Medina-Lozano et al., 2023).The choice of a trustworthy RG has a major influence on the accuracy and interpretation of RT-qPCR results.Using inadequate RG to normalize target gene expression will result in erroneous or even deceptive findings.The ideal RG is an internal control that demonstrates low or no change in expression levels across a range of experimental circumstances (Xiong et al., 2020).Our previous work also employed RNA-seq to systematically examine gene expression variation under different treatments and identified 3 novel RG with lower variation for RT-qPCR in Streptococcus thermophilus (Xiong et al., 2020).To date, no in-depth investigation of the stable RG in B. animalis has occurred.The aim of this work was to discover the stable RG expressed for RT-qPCR at a genome-wide level based on RNA-seq, which is important for the identification and reliable counting of B. animalis in dairy products.

MATERIALS AND METHODS
No human or animal subjects were used, so this analysis did not require approval by an Institutional Animal Care and Use Committee or Institutional Review Board.
Overnight cultures of B. animalis AR668 and AR668-R1 were inoculated in 10% (wt/vol) reconstituted skim milk supplemented with 1% yeast extract and incubated at 37°C for 30 h in an anaerobic environment without shaking.Fermented skim milk samples were collected every 4 h after 10 h for plate counting and RT-qPCR analysis.The diluted samples were incubated anaerobically on BS agar plates at 37°C for 48 h, and the plate with a colony count of 30 to 300 cfu was chosen for counting.

Identification and Annotation of Stably Expressed Genes Based on Transcriptome Datasets in B. animalis
To identify stably expressed genes in B. animalis, the cDNA libraries from B. animalis AR668 and AR668-R1 mRNA were sequenced and assembled using the pairedend sequencing method of Illumina (Supplemental Table S1, https://doi.org/10.6084/m9.figshare.23089547.v2;Liu et al., 2023).The Blast2GO (https: / / www .blast2go.com/ ) and Diamond (http: / / ab .inf.uni-tuebingen .de/software/ diamond) software programs were used to add functional annotations to the Gene Ontology (http: / / www .geneontology.org/ ) and the Kyoto Encyclopedia of Genes and Genomes (http: / / www .genome.jp/kegg/ ) for these genes.The R package DEGseq (V1.48) was used to examine the gene expression between samples or groups.Over the whole set of transcriptome data, the transcripts per million reads, standard deviation (SD), and coefficient of variation (CV) values were determined for each gene.The CV value for each gene was computed by dividing the SD by the mean expression.Based on the CV value of gene expression, the potential novel RG with reduced variation were selected for RT-qPCR.The nucleotide sequences of the potential novel RG were retrieved from the genomic data in B. animalis (Supplemental Table S2, https:// doi.org/10.6084/m9.figshare.23089547.v2;Liu et al., 2023) and generated the qPCR primers (Table 1) by using NCBI Primer-BLAST (https: / / www .ncbi.nlm.nih.gov/tools/ primer -blast).

Total RNA Extraction, cDNA Synthesis, and Primer Sequences and Amplicon Characteristics of Candidate RG for RT-qPCR
Total RNA was isolated by total RNA extraction reagent (Sangon Biotech, Shanghai, China).Due to the thickness of the cell wall of B. animalis, lysozyme was used for 1 h pretreatment to extract high-quality RNA (Villa-Rodríguez et al., 2018).The resulting cDNA from B. animalis AR668 and AR668-R1 was balanced to different gradient concentrations by using sterile distilled water.The cDNA dilutions (1, 1:10, 1:100, 1:1,000, 1:10,000, and 1:100,000) were employed for RT-qPCR.The qPCR reactions were done in 96-well plates (Light-Cycler 8-Tube Strips), and 3 repetitions were made for each sample.Amplification efficiency (E) was measured as 10 −1/slope and represented as a percentage (Table 1).

Analysis of RG Expression Stability
The original cycle amplification values (Ct values) of RG were analyzed by LightCycler 96 (Roche), and the data were sorted to examine differences in the expression levels of chosen RG.BestKeeper (https: / / www .gene-quantification .com/bestkeeper .html),geNorm (https: / / genorm .cmgg.be/), and NormFinder (https: / / www .moma.dk/normfinder -software), 3 widely used Microsoft Excel-based software, were used to statistically analyze the expression stability of the chosen RG.The raw Ct values were converted to relative numbers using the formula 2 (−ΔΔCT) .RefFinder (release year: 2012; http: / / leonxie .esy.es/RefFinder/ ) was used to deliver a thorough ranking recommendation based on the geometric mean of ranking data (Singh et al., 2019).

Standard Curves for Colony Counting Using RT-qPCR
Standard curves were generated by plotting the Ct values of the RT-qPCR performed on 10-fold serial di-lutions of purified cDNA from B. animalis AR668 and AR668-R1 against plate counts.The use of RT-qPCR cannot identify when the amount of B. animalis is below 10 4 cfu/mL.The generated standard curve presented coefficient of determination (R 2 ), indicating that the results were linear over the tested range of bacterial concentrations (10 4 -10 8 cfu/mL).All values were presented as mean ± SD (log cfu/mL) of triplicate trials.The regression equation can be used to determine the log cells (x) and copy number (y) for respective RG in B. animalis.

Statistical Analysis
The results were statistically analyzed by one-way ANOVA from the statistical software package SPSS 26.0.The statistical degree of significance was set at P < 0.05.All the Ct values are averages of at least 3 repetitions.

Identification of Candidate RG by RNA-seq
Using RNA-seq is the most effective way for screening more reliable RG in RT-qPCR research (Nong et al., 2019).A total of 1,665 genes were predicted after analyzing the data from the transcriptome.Using Venn analysis, we expressed 1,552 genes under 4 different  1).We calculated the CV value that defines as the ratio between SD and the average reads per kilobase of transcript per million mapped reads of each gene expression.The lower CV indicated more stable gene expression across all conditions.In this study, the cutoff CV value for stable gene expression was set as <0.1 in line.Thirteen genes were candidate RG (Table 2) and were subsequently analyzed by RT-qPCR, including 16S rRNA, atpA (ATP synthase α chain), dnaN (DNA polymerase III subunit β), GAPDH, gene0484 (histidine kinase, DNA gyrase and HSP90-like ATPase family protein), gene0589 (ABC transporter AA-binding protein), gene1692 (hypothetical protein), hisE (Phosphoribosyl-ATP pyrophosphatase), recA, rpmG (LSU ribosomal protein L33P), rplD (50S ribosomal protein L4), and rplI (50S ribosomal protein L9), and rplX (50S ribosomal protein L24).For this set of genes, CV ranged from 0.36% to 9.51%, which was satisfied with CV <0.1.The most stable genes were atpA (0.36%), rpmG (0.69%), hisE (1.16%), and rplX (1.33%), which have the potential as novel RG with lower CV values (<0.03) in B. animalis.However, higher CV values of the traditional RG 16S rRNA (3.58%), recA (9.32%), and GAPDH (9.51%) suggested that these common RG do not have the most stable gene expression in B. animalis.In comparison to recA and GAPDH, the 16S rRNA gene showed a low CV value.Hence, 16S rRNA was chosen as the control for further validation using RT-qPCR.

Efficiency of PCR Amplification and Expression of Candidate RG in B. animalis
Ct value was used to assess the possibility of 13 genes as RG for RT-qPCR by the dilutions of cDNA.It showed the primer sequences specific to each selected candidate genes for B. animalis (Table 2).Except for gene0395, all other primers had E values from 94.16% to 126.25% within the allowed range for a reliable RT-qPCR analysis.Furthermore, the standard curves revealed an acceptable R 2 value, confirming the reliability of the primer pairs in the RT-qPCR analysis.The differences in transcript levels between the candidate genes were given as Ct values (Figure 2).Average Ct values ranging from 16.05 (gene0426) to 37.74 (gene0840) indicated that the majority of the genes (with the exception of a section of the 1:100,000 cDNA dilution) had appropriate Ct values (15-30) for RT-qPCR.The E and Ct values indicated that the 13 candidate genes were reliable as RG.

Expression Stability of RG in B. animalis
Four software tools, geNorm, NormFinder, Best-Keeper and RefFinder, were used to evaluate the expression stability of our selected RG in B. animalis.BestKeeper analyzes unconverted Ct values, but geNorm and NormFinder need Ct values to be converted to relative quantification data.The statistical computations provided by the BestKeeper, geNorm, NormFinder, and RefFinder algorithms ranked the genes with the best stable expression across all conditions examined (Table 3).GeNorm ranked RG based on expression stability (M value) and pairwise variation (V value), with stepwise exclusion of the least stable gene ranked evaluated genes.The results of the geNorm analysis showed that the M values and V values of the 13 RG were all less than 1.0 and 0.15, respectively, in B. animalis AR668 and AR668-R1 (Figure 3), indicating that these genes were acceptable as RG.A gene with a lower M value is more stable.The most stable genes were atpA, rplD, and rplI, with M values of 0.134, 0.146, and 0.188, respectively, in B. animalis AR668 by the geNorm algorithm.The analysis indicated with M values that rplD (0.178), atpA (0.202), and rplX (0.253) were the most stable genes in B. animalis AR668-R1.All the analyzed genes showed lower M values than the common RG recA, GAPDH, and 16S rRNA (M values all >0.5).Other RG exhibited higher M values but still <1.0, which were suitable as normalization factors in RT-qPCR analysis.
NormFinder was also used to investigate expression stability of these RG, which can calculate intra-and intergroup variance to rank RG based on the lowest variance in the sample set.According to the Norm-Finder program, the gene is more stable when it shows lower stability value.The results by NormFinder analysis indicated rplD (0.46), rplI (0.59), atpA (0.66), and dnaN (0.66) as the top ranked genes.Similar to the ge-Norm result, NormFinder analysis showed that all RG have lower stability value than recA, GAPDH, and 16S rRNA (stability values were 0.86, 1.47, and 1.48, respectively).Furthermore, the rows were ordered according to increasing stability value by NormFinder algorithm (Supplemental Table S3, https://doi.org/10.6084/m9.figshare.23089547.v2;Liu et al., 2023), including the group difference, the group standard deviation, the individual group difference, and individual group standard deviation.The predicted stable genes (rplD, atpA, rplI, and dnaN) had low standard deviation and difference, which verify the accuracy of the result.
BestKeeper estimated the expression stability of RG using the CV and SD values of the Ct values (Supplemental Table S4, https://doi.org/10.6084/m9.figshare.23089547.v2;Liu et al., 2023), which allows the genes to be sorted from most stably expressed (lowest SD [±Ct]) to least stably expressed (highest SD [±Ct]).Furthermore, the BestKeeper algorithm performed pairwise correlation tests between each gene examined and the geometric mean Ct value of all RG combined and symbolized the Pearson correlation coefficient by r.Genes with the lower SD and CV values and higher r value are more stable.The most stable genes were rplD, rplX,and atpA, with r values of 0.998, 0.986, and 0.985, respectively, in B. animalis AR668, according to the BestKeeper algorithm.The analysis indicated with r values that rplD (0.996), rplI (0.977), and atpA (0.973) were the most stable genes in B. animalis AR668-R1.All RG showed higher r values than traditional RG GAPDH and 16S rRNA (r values all lower than 0.8).But the results also showed that another common RG, recA, was considered to be a stable gene (r values >0.9).
Bifidobacterium animalis uses lactose and synthesized organic acid byproducts (i.e., lactic acid), which can provide the preservatives and flavors during the milk fermentation process (Nguyen et al., 2012).The qPCR-based techniques are routinely used to detect and quantify a wide variety of microorganisms, such as B. animalis in dairy products (Agrimonti et al., 2019).But there is no consensus for an ideal and universal RG for B. animalis.The 16S rRNA gene is still employed as RG in bacteria such as Streptococcus, Lactobacillus, and Bifidobacterium.Our result showed that rplD and atpA as RG can be more accurate than 16S rRNA for B. animalis enumeration in fermented skim milk.Since the 16S rRNA gene exists in numerous copies in the Bifidobacterium genome, quantitative data derived from 16S rRNA-based qPCR may be inflated (Mianzhi and Shah, 2017).It can obtain more accurate quantitative data by targeting only a single-copy RG in RT-qPCR.Thus, rplD and atpA, as the more stable genes as RG, can be accurate for counting of B. animalis in dairy products.

CONCLUSIONS
We used RNA-seq for genome-wide identification of new stable RG in B. animalis.To our knowledge, this is the first systematic attempt to find RG for RT-  (16S rRNA, atpA, dnaN, GAPDH, gene0484, gene0589, gene1692, hisE, recA, rpmG, rplD, rplI, and rplX) was selected and validated by using RT-qPCR based on the expression of all 1,665 genes from transcriptome.
In the analysis of our results, the traditional RG such as recA, 16S rRNA, and GAPDH exhibited higher CV values than our tested RG in B. animalis.Based on the analysis of expression stability by BestKeeper, Norm-Finder, geNorm, and RefFinder algorithms, rplD and atpA were more stable than traditional RG.This work suggested that rplD and atpA as RG can achieve more accurate gene expression analyses and total bacterial quantification in B. animalis for dairy products.
Liu et al.: NEW REFERENCE GENES IDENTIFIED USING RT-qPCR

Figure 1 .
Figure1.Analysis of co-expressed and specific genes between samples using the Venn diagram.Wild-type Bifidobacterium animalis AR668 and oxygen-tolerant domesticated AR668-R1 in this study were isolated from infant feces.The 2 strains were cultivated in Bifidobacterium liquid medium (Hope Bio, Qingdao, China) at 37°C for 12 h under aerobic (AE) and anaerobic (AN) conditions, respectively.
Figure 2. Cycle amplification values (Ct values) of 13 reference genes (RG) under experimental conditions.The box and whisker plot depict the Ct values of each chosen RG under 4 different treatments, as described in Figure 1.The upper and lower edges of boxes indicate the 25th and 75th percentiles.The lines across the boxes depict the median.Minimum and maximum values are shown by whiskers.

Table 1 .
Liu et al.:NEW REFERENCE GENES IDENTIFIED USING RT-qPCR Primer sequences and amplicon characteristics of candidate reference genes for reverse-transcription quantitative PCR