Single-step genome-wide association analyses for selected infrared-predicted cheese-making traits in Walloon Holstein cows

This study aimed to perform genome-wide association study to identify genomic regions associated with milk production and cheese-making properties (CMP) in Walloon Holstein cows. The studied traits were milk yield, fat percentage, protein percentage, casein percentage (CNP), calcium content, somatic cell score (SCS), coagulation time, curd firmness after 30 min from rennet addition, and titratable acidity. The used data have been collected from 2014 to 2020 on 78,073 first-parity (485,218 test-day records), 48,766 second-parity (284,942 test-day records), and 21,948 third-parity (105,112 test-day records) Holstein cows distributed in 671 herds in the Walloon Region of Belgium. Data of 565,533 single nucleotide polymorphisms (SNP), located on 29 Bos taurus autosomes (BTA) of 6,617 animals (1,712 males), were used. Random regression test-day models were used to estimate genetic parameters through the Bayesian Gibbs sampling method. The SNP solutions were estimated using a single-step genomic BLUP approach. The proportion of the total additive genetic variance explained by windows of 50 consecutive SNPs (with an average size of ~216 KB) was calculated, and regions accounting for at least 1.0% of the total additive genetic variance were used to search for positional candidate genes. Heritability estimates for the studied traits ranged from 0.10 (SCS) to 0.53 (CNP), 0.10 (SCS) to 0.50 (CNP), and 0.12 (SCS) to 0.49 (CNP) in the first, second, and third parity, respectively. Genome-wide association analyses identified 6 genomic regions (BTA1, BTA14 [4 regions], and BTA20) associated with the considered traits.


INTRODUCTION
The cheese-making properties (CMP) of bovine milk are economically important for the dairy industry since a significant and growing fraction of the milk produced worldwide is used to make cheese (Wedholm et al., 2006;Cassandro et al., 2008; Food and Agriculture Organization of the United Nations, 2015; The World Dairy Situation, 2016).Milk coagulation is a key step that strongly influences the efficiency of cheese production.Furthermore, the composition, in particular casein and fat, and physicochemical properties of milk have considerable effects on its processing properties and the quality of resulting dairy products (Walstra et al., 2005;Visentin et al., 2017b;Nilsson et al., 2019).The coagulation ability of milk is also associated with the pH of milk, protein composition, SCC, milk fatty acids (FA), and milk calcium content (CC) (Bencini, 2002;Pastorino et al., 2003;Wedholm et al., 2006;Bobbo et al., 2016).The coagulation ability of milk can be evaluated using developed milk coagulation properties such as rennet coagulation time (RCT, min), which is the time from the addition of coagulant to milk to the beginning of coagulation, the time to a curd firmness of 20 mm, and curd firmness at 30 min after coagulant addition (a30, mm) (De Marchi et al., 2007;Bittante, 2011;Visentin et al., 2015).It has been documented that the CMP can be affected by environmental factors such as feeding, udder health, season, and physiological stage (e.g., parity, lactation stage), but they are also genetically influenced (De Marchi et al., 2007;Cassandro et al., 2008;Visentin et al., 2017a;Atashi et al., 2022a).Therefore, genetic selection can be used to improve cheese-making traits, and consequently, to improve the quality and the amount of cheese yield per volume of milk.
There are different laboratory methods developed for direct recording of individual milk coagulation properties (Pretto et al., 2011;Bittante et al., 2012;Bobbo et al., 2016); however, they are time consuming and expensive to implement on a large scale; then, genetic selection based on direct measures of milk coagulation properties is limited (Sanchez et al., 2018(Sanchez et al., , 2019)).Fourier-transform mid-infrared (MIR) spectroscopy has been proposed as an alternative method for prediction of various milk characteristics, including fractions of protein, fat, casein, minerals, and milk FA contents (De Marchi et al., 2009, 2014;Soyeurt et al., 2009Soyeurt et al., , 2011;;Visentin et al., 2015).The MIR technology also is considered as a cheap method to measure individual CMP on a large scale, which is needed for breeding programs in dairy cattle (De Marchi et al., 2009, 2014;Sanchez et al., 2018).
However, enough knowledge on the genetic background of cheese-making traits is needed before they are included into the breeding program.Furthermore, identification of genomic regions and individual genes responsible for genetic variation in CMP will improve our understanding about the biological pathways involved and can be used for improving cheese-making traits.Atashi et al. (2022a) investigated the genetic parameters and genomic regions associated with selected infrared-predicted cheese-making traits in the Dual-Purpose Belgian Blue (DPBB) population, which is the second most important cattle breed reared by dairy farmers in the Walloon Region of Belgium.However, to the best of our knowledge, no comprehensive study has been performed to investigate genetic background and the genetic architecture of cheese-making traits in Walloon Holstein cows.Therefore, the aim of this study was to estimate genetic parameters and identify genomic regions associated with milk production and selected infrared-predicted cheese-making traits in Walloon Holstein cows.

Phenotypic Data
No human or animal subjects were used, so this analysis did not require approval by an Institutional Animal Care and Use Committee or Institutional Review Board.The data used consisted of test-day records of traits including milk yield (MY), SCC, MIR predicted fat percentage (FP), protein percentage (PP), casein percentage (CNP), CC, coagulation time (CT), a30, and titratable acidity (TA).The FP and PP records were generated by the official milk recording in the Walloon Region of Belgium using MIR spectrometry and commercially available instruments and calibrations from FOSS (Foss Electric A/S).Test-day SCC records were log-transformed to SCS based on the following equation: SCS = log 2 (SCC/100,000) + 3.
The MIR prediction equations used to predict CNP, CC, CT, a30, and TA were obtained from various studies (Soyeurt et al., 2009;Colinet et al., 2010Colinet et al., , 2013Colinet et al., , 2015)).Table 1 shows the calibration and the cross-validation statistics of the used MIR prediction equations.Milk MIR spectra were obtained by the analysis of individual milk samples on MilkoScan FT6000 spectrometer (Foss, Hillerød, Denmark).The MIR spectra were preprocessed to remove baseline variation and then were standardized.The MIR prediction equations were applied on standardized spectra from individual milk samples collected in the frame of the Walloon milk recording scheme.Full cross-validation was performed to assess the accuracy of the developed equations.The cross-validation coefficients of determination were 0.95, 0.82, 0.63, 0.42, and 0.68 for CNP, CC, CT, a30, and TA, respectively.The ratio performance/deviation of cross-validation, the ratio of standard deviation (SD) to standard error of cross-validation, was 4.47, 2.34, 1.64, 1.31, and 1.77 for CNP, CC, CT, a30, and TA, respectively.
Data were edited to include only cows with known birth date, calving date, and parity number.Only records from the first 3 parities that had data for all included traits on a given test-day were kept.Records from DIM lower than 5 d and greater than 365 d were eliminated.Age at the first calving was calculated as the difference between birth date and first calving date and restricted to the range of 540 to 1,200 d.Daily MY, FP, and PP were restricted to the range from 3 to 99 kg, 1 to 9%, and 1 to 7%, respectively (ICAR, 2022).Test-day records of the other considered traits were edited to remove records outside the range of mean ± 5 SD.Within cow, if parity 3 was present, parities 1 and 2 were also present, and if parity 2 was present, parity 1 was also present.The number of test-day records in the first-, second-, and third-parity cows were 485,218 (on 78,073 cows), 284,942 (on 48,766 cows), and 105,112 (on 21,941 cows), respectively.The data were collected from 2014 to 2020 on 78,073 animals distributed in 671 herds in the Walloon Region of Belgium.On average across the data set, 5.88 test-day records were available per cow per lactation.Pedigree depth of the animals was traced back to 5 generations.The used pedigree consisted of 186,548 females and 10,076 males.

Genotypic Data
Genotypic data were available for 6,617 (1,712 males and 4,905 females) phenotyped animals or those animals included in the pedigree.Individuals were genotyped using the BovineSNP50 Beadchip v1 to v3 and EuroG MD (SI) v9 (Illumina, San Diego, CA).Single nucleotide polymorphisms in common among the 4 chips were kept.Nonmapped SNP, SNP located on sexual chromosomes, and triallelic SNPs were excluded.A minimum GenCall Score of 0.15 and a minimum GenTrain Score of 0.55 were used to keep SNP.Then, the genotypes were imputed to HD with a reference population of 4,352 HD individuals (1,046 males and 3,288 females) using FImpute V2.2 software (Sargolzaei et al., 2014).Single nucleotide polymorphisms with Mendelian conflicts and those with minor allele frequency (MAF) less than 5% were excluded.The difference between the observed and expected heterozygosity was estimated, and if the difference was greater than 0.15, the SNP was excluded (Wiggans et al., 2009).Finally, 565,533 SNPs located on 29 BTA were used in the genomic analyses.

Variance Component Estimation
The (co)variance components and breeding values for the considered cheese-making properties were estimated based on the integration of the random regression test-day model (RR-TDM) into the single-step GBLUP procedure (SS RR-TDM) using the following single-trait, multiple-lactation (first 3 lactations) model (Paiva et al., 2022):   tively, the random regression coefficients of herd-year of calving, permanent environmental, and additive effects modeled using Legendre polynomials of order 3; and e ijklmn is the residual effect.The herd-year of calving, permanent environment, additive genetic, and residual variances were assumed as follows: where HY is the 12 × 12 covariance matrix of the herd-year of calving regression coefficients; I is an identity matrix, ⊗ represents the Kronecker product function, P is the 12 × 12 covariance matrix of the permanent environmental regression coefficients; Ga is the 12 × 12 covariance matrix of the additive genetic regression coefficients, R is the residual covariance matrix, and blocks within R r p = + ∑ contain residual variance  Coagulation time is defined here as the sum of the rennet coagulation time plus the time to a curd firmness of 20 mm measured by the computerized renneting meter.
(r) that depends on parity (p).Residual variance was the same within each parity.The H is a matrix that combines pedigree and genomic relationships, and its inverse consists of the integration of additive and genomic relationship matrices, A and G, respectively (Aguilar et al., 2010): where A is the numerator relationship matrix based on the pedigree for all animals, A 22 is the numerator relationship matrix for genotyped animals, and G is the weighted genomic relationship matrix obtained using the following function: The G* is the genomic relationship matrix obtained using the following function described by VanRaden (2008): where Z is a matrix of gene content adjusted for allele frequencies (0, 1, or 2 for aa, Aa, and AA, respectively); D is a diagonal matrix of weights for SNP variances (D = I); M is the number of SNPs, and p i is the MAF of the ith SNP.The H matrix was built scaling G based on A 22 considering that the average of the diagonal of G is equal to the average of the diagonal of A 22 , and the average of the off-diagonal G is equal to the average of the off-diagonal A 22 .
The (co)variance components were estimated by Bayesian inference using the GIBBS3F90 software (Aguilar et al., 2018).Gibbs sampling was used to obtain marginal posterior distributions for the various parameters using a single chain of 200,000 iterates with a sampling interval of 20 samples.The first 50,000 iterates of the chain were regarded as a burn-in period to allow sampling from the proper marginal distributions.Post-Gibbs analysis was performed using the software POSTGIBBSF90 (Aguilar et al., 2018), using the retained 7,500 samples.Genetic (co)variances on each test-day were calculated using the equation described by Jamrozik and Schaeffer (1997).Daily heritability was defined as the ratio of genetic variance to the sum of the additive genetic, permanent environmental, herd-year calving, and residual variances at a given DIM.
The vector of genomic estimated breeding values (GEBV) of the included traits for each animal i, which included daily GEBV from all DIM (5 to 365) in each parity, was estimated by multiplying the vector of additive genetic predicted regression coefficients by the matrix of Legendre orthogonal polynomial covariates; that is, GEBV Tg i i = ˆ, where ĝi is the vector of additive genetic predicted regression coefficients for animal i, and T is a matrix of orthogonal covariates associated with the Legendre orthogonal polynomial functions.

Genome-Wide Association Study
The GWAS analyses were performed for all included traits in the first 3 parities considering following 3 lactation stages: (1) from 5 to 60 DIM, representing the ascending production stage and lactation peak; (2) from 61 to 200 DIM, representing the middle lactation stage; and (3) from 201 to 365 DIM, representing the production decline up to the end of the lactation (Oliveira et al., 2019).Therefore, the GEBV for each lactation stage of each animal i (for each trait in each parity) were obtained by averaging (summing for MY) the daily GEBV solutions of the specific DIM; that is, where GEBV i ˆ, 1 GEBV i ˆ, 2 and GEBV i ˆ3 are the GEBV for the first, second, and third lactation stages of animal i obtained by averaging (summing for MY) the GEBV from 5 to 60, 61 to 200, and 201 to 365 DIM, respectively.Furthermore, the GEBV of animal i through the entire lactation were obtained by summing (averaging) the daily GEBV solutions of all DIM; that is, where GEBVe i ˆ is the GEBV of animal i through the entire lactation, obtained by averaging (summing for MY) the GEBV from 5 to 365.
The SNP effects for each lactation stage were estimated individually for each trait in each parity using the postGSf90 software (Aguilar et al., 2014).The animal effects were decomposed into those for genotyped (a g ) and ungenotyped animals (a n ).The animal effects of genotyped animals are a function of the SNP effects, a g = Zu, where Z is a matrix relating genotypes of each locus and u is a vector of the SNP marker effect.The variance of animal effects was assumed as where D is a diagonal matrix of weights for variances of markers (D = I), and σ u 2 is the additive genetic variance captured by each SNP marker when the weighted relationship matrix (G) was built with no weight.The SNP effects were obtained using the following equation: where λ was defined by VanRaden ( 2008) as a normalizing constant, as described below: The percentage of the total additive genetic variance explained by the ith genomic region was estimated as follows: where a i is the genetic value of the ith region that consists of 50 adjacent SNPs, σ a 2 is the total additive genetic variance, Z j is the vector of the SNP content of the jth SNP for all individuals, and ûj is the marker effect of the jth SNP within the ith region.The additive genetic variance explained by 50-SNP moving windows, with an average size of ~216 KB, was calculated across the whole genome, and those windows explaining at least 1.0% of the total additive genetic variance were considered promising regions and used to identify positional candidate genes.The concept of grouping SNP into windows was adopted as a way to better capture the genetic information such as the extent of linkage disequilibrium (LD) in neighboring SNPs (Habier et al., 2011).

Identification of Positional Candidate Genes for the Studied Traits
The animals included in this study were genotyped using the BovineSNP50 Beadchip v1 to v3 and EuroG MD (SI) v9 (Illumina, San Diego, CA); then, the genotypes were imputed to the BovineHD Beadchip, which is based on the bovine reference genomes assembly UMD3.1.However, new bovine reference genome assembly ARS-UCD1.2,assembled using long sequencing reads, filled gaps and resolved repetitive regions of the UMD3.1 assembly, and has more credible annotation information (Rosen et al., 2020).The Lift Genome Annotations tool, available through a simple web interface (https: / / genome .ucsc.edu/cgi -bin/ hgLiftOver), was used to convert coordinate ranges of the identified genomic regions from the UMD3.1 to the ARS-UCD1.2assembly.Then, to identify possible candidate genes associated with the considered traits, genes located within the identified genomic regions (i.e., between the start and end of genomic coordinates of the identified regions based on the ARS-UCD1.2assembly) were further investigated.We identified genes using the National Center for Biotechnology Information (NCBI) Map Viewer tool for the ARS-UCD1.2assembly as the reference map.

RESULTS AND DISCUSSION
The lactation curves of daily average of phenotypic records for the considered traits are presented in Figure 1.The peak of MY occurred at the DIM 41 (27.59 kg), 35 (35.03 kg), and 32 (39.40 kg) for the first, second, and third parity, respectively.The FP, PP, and CNP curves were observed to be lower for first-parity cows than for second-and third-parity cows.The TA curves were higher for first-parity cows than for second-and third-parity cows.The descriptive statistics for studied traits in the first 3 lactations are presented in Table 2. Daily MY averaged 24.2, 27.9, and 30.0 kg in the first 3 lactations, which is comparable to those previously reported for Walloon Holstein cows (Bastin et al., 2013).Somatic cell score was the trait with the greatest coefficient of variation (53.54 to 59.45%), whereas a30 had the lowest coefficient of variation (7.17 to 7.54%).Cassandro et al. (2008) reported that among milk coagulation and production traits (MY, FP, PP, CNP, SCS, TA, RCT, and TA) of Italian Holstein, SCS has the highest and casein percentage has the lowest coefficient of variation.The average a30 ranged from 32.11 to 32.33 mm, which is in agreement with previous studies (Cassandro et al., 2008;Atashi et al., 2022a).However, mean a30 reported for Finnish Ayrshire cows ranged from 25 to 27 mm (Ikonen et al., 2004).The average (SD) SCS were 2.25 (1.36), 2.46 (1.46), and 2.77 (1.48) in the first, second, and third parity, respectively.Averaged MY, and SCS were higher with increasing parity in line with previous studies (Bastin et al., 2013;Atashi and Hostens, 2021), whereas averaged TA was lower with increasing parity as has been reported for DPBB cows (Atashi et al., 2022a).
Average daily heritability (h 2 ) estimates for the studied traits are shown in Table 3. Heritability estimates ranged from 0.10 (SCS) to 0.53 (CNP), 0.10 (SCS) to 0.50 (CNP), and 0.12 (SCS) to 0.49 (CNP) in the first, second, and third parity, respectively.Heritability estimates were generally higher in first parity than in later parities.The mean daily h 2 estimates for CT, a30, and TA ranged from 0.47 to 0.52, 0.46 to 0.47, and 0.46 to 0.51 in agreement with previously reported estimates for dairy cattle (Vallas et al., 2010;Tiezzi et al., 2013).Atashi et al. (2022a) reported that h 2 of infraredpredicted CT, a30, and TA in DPBB ranged from 0.40 to 0.48, 0.36 to 0.39, and 0.41 to 0.50, respectively.The high h 2 estimated for milk coagulation properties supports possible genetic improvement of milk coagulation ability in the population of Walloon Holstein cows.Cecchinato et al. (2011) reported that h 2 of RCT (min), a30, and TA ranged, respectively, from 0.22 to 0.58, 0.05 to 0.32, and 0.11 to 0.42 in Italian Holstein cows.Colinet et al. (2012) reported that the daily h 2 of infrared-predicted TA in the first-parity Holstein cows in Wallonia ranged from 0.45 to 0.60 with an average of 0.57.Mean daily h 2 of CC ranged from 0.47 to 0.50, which is in line with those reported for Montbéliarde dairy cows (Sanchez et al., 2021).The variation found for h 2 of CMP in the literature can be explained by the differences in the studied breeds, structure of the data, number of records, statistical models, and the length of the period of data collection.
Typically, GWAS methods are based on testing the significance of SNP effects on the traits of interest.However, SNPs within a genomic region can be highly correlated and jointly influence the phenotype.Furthermore, the genetic information in neighboring SNPs, such as the extent of LD, is not used in the GWAS depends on single SNP (Bao and Wang, 2017).Therefore, window-based GWAS procedure have been proposed as an effective procedure to estimate the combined effect of several consecutive SNPs in a specific region and identify genomic regions explaining a given amount of genetic variance (Aguilar et al., 2019).Window-based GWAS may use different window types (distinct or slid- Coagulation time is defined here as the sum of the rennet coagulation time plus the time to a curd firmness of 20 mm measured by the computerized renneting meter. ing windows) and variable window sizes (defined as the number of SNP or the number of base pairs).However, the absence of a universal approach for hypothesis testing is an important challenge of window-based GWAS, even though it is quite a common procedure in genetic studies.The common form for declaring significance is to use a threshold on the additive genetic variance explained by individual window (Aguilar et al., 2019).However, it is unclear what window size is optimal, and no standard presently exists to define the threshold on explained genetic variance.Therefore, determining the proper window size is usually subjective, and researchers often have not justified their choices or sometimes have acknowledged that their choices are arbitrary (Beissinger et al., 2015).Medeiros de Oliveira Silva et al. ( 2017), using the BovineHD SNP panel, considered 50 adjacent SNP windows (with an average of 280 KB) that explained at least 0.50% of additive genetic variance as the threshold to declare significance.Atashi et al. (2020), using the BovineHD SNP panel, considered 50 adjacent SNP windows that explained more than 1% of the total additive genetic variance as the threshold to declare significance for milk production and lactation curve parameters.Han and Peñagaricano (2016) considered 1.5-MB windows that explained at least 0.50% of the total genetic variance as the threshold to declare significance.Suwannasing et al. (2018) considered windows that explained more than 1% of the total genetic variance as the threshold to declare significance.Tiezzi et al. (2015) calculated the variance absorbed by 10-SNP moving windows and reported the 10 windows explaining the largest amount of genomic variance as the most important windows.In this study, a window-based GWAS through the single-step genomic best linear unbiased predictor (ssGBLUP) was used.The results were presented by the proportion of total genetic variance explained by a window of 50 adjacent SNPs with an average size of ~216 KB and windows explaining for at least 1.0% of the total additive genetic variance were used to search for positional candidate genes.We used 1 SNP as the moving step of the window, which ensured that we do not miss genomic regions potentially associated with the traits due to the combination of SNPs.General information (starts and end SNP numbers, window size, start and end genomic positions, and the variance explained by each windows) about the results of single-step GWAS for the included milk production and cheese-making traits are presented in Supplemental Data S1-S108 (9 traits [MY, FP, PP, CNP, CC, SCS, CT, a30, and TA] × 3 parities × 4 stages per parity; https: / / github .com/hadiatashi/ Holstein -Cheese).The Manhattan plots of the proportion of total additive ge-netic variance explained by 50-SNP windows are shown in Supplemental Figures S1-S9 (https: / / github .com/hadiatashi/ Holstein -Cheese).The windows associated with the studied traits along with corresponding genes are presented in Table 4.In total, 6 genomic regions distributed over 3 chromosomes (BTA1, BTA14 [4 regions], and BTA20) were identified that are associated with one or more of the included traits.However, there was no genomic region that explained more than 1.0% of the total additive genetic variance of SCS.The following are the results discussed by chromosome.

BTA1
The genomic region located from 144.38 to 144.47 MB (UMD3.1 assembly) on BTA1 was associated with TA and CT.This window was 84.82 KB in size and explained more than 2.0% of the total additive genetic variance of TA in all defined lactation stages and more than 1.0% of the total additive genetic variance of CT in the first 2 stages of first 3 lactations.Moderate negative genetic correlations were found between TA and CT (−0.46 in the first 3 lactations), which may explain why these traits are affected by the same genomic region (Supplemental Tables S1-S3, https: / / github .com/hadiatashi/ Holstein -Cheese).This region has been previously reported to be associated with PP, TA, CT, a30, and CNP in the DPBB cows (Atashi et al., 2022a,b).Iung et al. (2019) reported that this region is associated with SCS and MY in the Brazilian Holstein population.Sanchez et al. (2021) reported that this region is associated with milk mineral content of magnesium (Mg), potassium (K), sodium (Na), and phosphorus (P) in Montbéliarde dairy cows.This region harbors the solute carrier family 37 member 1 (SLC37A1) gene.SLC37A1 is highly expressed in the mammary glands (Chamberlain et al., 2015;Raven et al., 2016) and encodes a glucose-6-phosphate transporter involved in the blood glucose homeostasis and sugar transport (Pan et al., 2011).It has been reported that the SLC37A1 gene has a very strong effect on milk mineral contents (Sanchez et al., 2021;Zaalberg et al., 2021) and is associated with MY, FP, and PP (Kemper et al., 2015;Raven et al., 2016;Pausch et al., 2017;Sanchez et al., 2017), milk FA profile and SCS (Iung et al., 2019), and milk's CMP (Sanchez et al., 2019) in dairy cows.Sanchez et al. (2019) reported that the SLC37A1 plays a role in inorganic anion transport and is a good candidate for CMP and milk composition.This gene has been shown to be associated with casein phosphorylation (Fang et al., 2019) and participates in fat metabolism or mammary gland development in Holstein cows (Wang et al., 2022).Positions of the identified genomic regions based on the UMD3.1 assembly. 3 Positions of the identified genomic regions based on the ARS-UCD1.2assembly.
5 MY = milk yield; FP = fat percentage; PP = protein percentage; CNP = casein percentage; CC = calcium content (mg/kg milk); SCS = somatic cell score, defined as SCS = log 2 (SCC/100,000) + 3: CT = coagulation time defined as the sum of the rennet coagulation time plus the time to a curd firmness of 20 mm measured by the computerized renneting meter; a30 = curd firmness, defined as the curd firmness measured 30 min after enzyme addition by the computerized renneting meter; and TA = milk titratable acidity measured in Dornic degrees (°D). 6 The GWAS analyses were performed for all considered traits in each parity, considering 4 stages of lactation: (1) from 5 to 60 DIM, representing the ascending production stage and lactation peak; (2) from 61 to 200 DIM, representing the lactation persistency stage; (3) from 201 to 365 DIM, representing the production decline up to the end of the lactation, and (e) from 5 to 365 DIM, representing the entire lactation.Table 4 (Continued).Genomic regions associated with milk yield traits and cheese-making properties in the first 3 parities in Walloon Holstein cows 1

BTA14
Four genomic regions located from 1.52 to 2.15 MB, 2.19 to 2.57 MB, 2.67 to 2.98 MB, and 3.13 to 3.38 MB (UMD3.1 assembly) on BTA14 were identified to be associated with one or more of the included traits.Hereafter, these regions are identified as BTA14-I, BTA14-II, BTA14-III, and BTA14-IV.These 4 regions combined explained 23.82 to 26.92% of the total genetic variance of FP.Genomic regions of BTA14-I, BTA14-II, and BTA14-III combined explained 7.5 to 10.87%, 7.09 to 9.73%, and 6.72 to 9.29% of the total genetic variances of CC, CNP, and PP, respectively.Genomic regions of BTA14-I and BTA14-III combined explained 3.27 to 5.08% of the total genetic variances of MY.The association between this region (1.52 to 3.38 MB on BTA14) and milk production traits has been reported by previous studies (Pausch et al., 2015;Jiang et al., 2019;Pedrosa et al., 2021).Iung et al. (2019) reported that the region located from 1.19 to 3.70 MB (UMD3.1 assembly) on BTA14 explained the highest proportions of genetic variance for FP and milk FA profile in the Brazilian Holstein population.The following are the results discussed by regions identified on BTA14.
The results showed moderate negative genetic correlated between CT and milk composition traits, whereas high positive genetic correlations were estimated between milk composition traits and a30 and TA.Therefore, these traits are correlated and can be affected by the same genes.Chen et al. (2023) reported that this region is associated with nitrogen efficiency index and milk true protein nitrogen in Walloon Holstein cows.Clancey et al. (2019) reported that SNP inside this region are associated with MY in Holstein cows.This region was 633.27 KB in size and harbors 30 genes including the scratch family zinc finger 1 (SCRT1), diacylglycerol O-acyltransferase 1 (DGAT1), cleavage and polyadenylation specific factor 1 (CPSF1), tonsoku like, DNA repair protein (TONSL), and spermatogenesis and centriole associated 1 (SPATC1).Most of the positional candidate genes found in this region support results from previous studies for MY and milk composition in dairy cattle (Kolbehdari et al., 2009;Li et al., 2010;Capomaccio et al., 2015;Jiang et al., 2019).DGAT1, involved in the last step of the synthesis of triacylglycerol, has a major effect on milk production traits (Jiang et al., 2010;Maxa et al., 2012;Nayeri et al., 2016;Clancey et al., 2019;Cruz et al., 2019).Nayeri et al. (2016) reported that SNP located within CPSF1 and TONSL are associated with MY in Canadian dairy Holstein cattle.The SHANK associated RH domain interactor (SHARPIN) gene product is inside this region.This product of this gene involved in the regulation of immune and inflammatory responses (Wang et al., 2012) and has been reported to be associated with the colostrum and serum albumin concentrations in Holstein cows (Lin et al., 2020).Sanchez et al. (2017) reported DGAT1, maestro heat like repeat family member 1 (MROH1), and BOP1 ribosomal biogenesis factor (BOP1) as the most important genes explaining the majority of the variability of milk protein composition in Montbéliarde, Normande, and Holstein dairy cattle.
BTA14-II.Furthermore, the region located from 2.19 to 2.57 MB on BTA14 was associated with milk composition traits including FP, PP, CC, and CNP.This region explained 3.33 to 3.71%, 1.08 to 1.38%, 1.09 to 1.42%, and 1.24 to 1.67% of the total additive genetic variance of FP, PP, CNP, and CC, respectively.Milk composition traits including FP, PP, CNP, and CC are strongly correlated and are expected to be influenced be the same genomic regions.The genetic correlations estimated among FP, PP, CNP, and CC ranged from 0.56 (FP and CC) to 0.97 (PP and CNP), 0.61 (FP and CC) to 0.97 (PP and CNP), and 0.58 (FP and CC) to 0.97 (PP and CNP) for the first, second, and third lactations, respectively.Atashi et al. (2020) reported that the region located from 1.86 to 2.12 MB on BTA14 is associated with 305-d MY and the lactation curve parameters in Holstein dairy cows.This region was 374.22 KB in size and harbors 22 genes.Among genes inside this region, nicotinate phosphoribosyltransferase (NAPRT), family with sequence similarity 83 member H (FAM83H), eukaryotic translation elongation factor 1 delta (EEF1D), pyrroline-5-carboxylate reductase 3 (PYCR3), scribbled planar cell polarity protein (SCRIB), lymphocyte antigen 6 family member H (LY6H), and mitogen-activated protein kinase 15 (MAPK15) have been previously reported as candidate genes for milk production traits in Holstein cows (Li et al., 2010;Buitenhuis et al., 2014;Ning et al., 2017;Wang et al., 2019).MAPK15 may affect milk composition through downregulating transactivation of the glucocorticoid receptor, as glucocorticoid is an important hormone in maintaining milking (Saelzler et al., 2006).The pyrroline-5-carboxylate reductase 3 (PYCR3) gene is located inside this region.This gene encodes a protein that belongs to the pyrroline-5-carboxylate reductase family of enzymes that responds to inflammatory, nutrient, and oxidative stress (Kuo et al., 2016) and has been reported to be associated with colostrum and serum albumin concentrations in Holstein cows (Lin et al., 2020).
BTA14-III.Additional region explaining large proportions of the genetic variance of MY and milk composition (CC, CNP, FP, PP) were found between 2.67 to 2.98 MB on BTA14, where 15 genes are located.Milk composition traits are strongly correlated and are expected to be affected by same polytrophic genes.This region contains lymphocyte-antigen-6 complex (LY6) including LY6K, SLURP1, LYNX1, PSCA, LYPD2, and LY6D in regulating the major histocompatibility complex.Tiezzi et al. (2015) reported that this region is associated with clinical mastitis in US Holsteins.Atashi et al. (2020) reported that this region is highly associated with 305-d MY and peak yield in Holstein cows.Among genes inside this region, adhesion G protein-coupled receptor B1 (ADGRB1), glycosylphosphatidylinositol anchored molecule like (GML), and lymphocyte antigen 6 family member D (LY6D) have been reported as candidate genes for MY, FP, PP, fat yield (FY), protein yield (PY), and milk FA profile in Holstein cows (Cole et al., 2011;Buitenhuis et al., 2014;Ning et al., 2017;Jiang et al., 2019).Costa et al. (2019) reported that the variant within ADGRB1 accounted for 2.44% of the total additive genetic variance for lactose yield in Fleckvieh cattle.The LY6D gene has been reported to be associated with MY and FY in Holstein cows (Jiang et al., 2014;Suchocki et al., 2016).The cytochrome P450 family 11 subfamily B member 1 (CYP11B1) gene is involved in glucose and lipid metabolism and has been reported as a functional candidate gene for milk production in dairy cows (Bülow and Bernhardt, 2002;Kaupe et al., 2007).
BTA14-IV.The genomic region located from 3.13 to 3.38 MB on BTA14, where the t-SNARE domain containing 1 (TSNARE1) gene is located, was associated with FP.This region was 246.80 KB in size and explained 1.12 to 1.25% of the total additive genetic variance of FP.TSNARE1 plays a role in intracellular protein transport and synaptic vesicle exocytosis (Smith et al., 2012;Luo et al., 2021) and has been reported as a candidate gene for traits including MY, FY, FP, PY, PP, and milk FA profile in Holstein cows (Buitenhuis et al., 2014;Jiang et al., 2019;Freitas et al., 2020;Bohlouli et al., 2022).Luo et al. (2021) evaluated physiological indicators of heat stress in Holstein dairy cows and reported a strong association of TSNARE1 with rectal temperature.

BTA20
On BTA20, the window located from 58.87 to 58.97 MB (UMD3.1 assembly), where the trio Rho guanine nucleotide exchange factor (TRIO) gene is located, was associated with TA.TRIO encodes a large protein that functions as a guanosine diphosphate (GDP) to guanosine triphosphate (GTP) exchange factor (Bateman et al., 2000).It has also been shown that the TRIO gene is associated with FY and milk BHB concentration in Holstein dairy cattle (Nayeri et al., 2019).The SNP inside this region has been reported to be associated with MY, FP, PP, FY, PY, SCS, and udder morphology traits in Holstein cows (Bennewitz et al., 2004;Schrooten et al., 2004;Höglund et al., 2009;Cole et al., 2011;Wang et al., 2022), and weaning weight and MY in beef cattle (Michenet et al., 2016).

CONCLUSIONS
Milk's CMP are among important breeding traits in dairy cattle breeds, especially in modern animal husbandry environments.This study aimed to estimate genetic parameters and to identify genomic regions associated with milk's cheese-making traits in Walloon Holstein cows.The findings showed that the included CMP are moderately heritable and could be included into the breeding program currently used for Walloon Holstein cows.Using available genotyping data, additional analyses were done to identify genomic regions associated with milk's cheese-making traits.Different milk production and cheese-making traits were associated with 6 genomic regions distributed over 3 chromosomes (BTA1, BTA14, and BTA20), which could further be used for genomic prediction purposes.The results confirmed most previously identified genes for milk production traits and identified several novel candidate genes (including SLC37A1, TRIO, and genes inside located from 1.52 to 2.15 MB on BTA14) for the studied cheese-making traits including CT, a30, and TA.Future research based on gene enrichment analysis might complement the GWAS results and help to deepen the understanding of the biological pathways related to the studied milk's cheese-making traits.).Genotyping was facilitated through the support of the Fonds de la Recherche Scientifique-FNRS under grant no.J.0174.18 (CDR "PREDICT-2").Relevant information supporting the results not presented here is provided in supplemental data (https: / / github .com/hadiatashi/ Holstein -Cheese).None of the data were deposited in an official repository because they are the property of the breeding organizations, and they are available upon reasonable request.Author contributions are as follows: Hadi Atashi: conceptualization, formal analysis, investigation, methodology, writing the original draft; Yansen Chen: reviewing and editing the draft; Hélène Wilmot: reviewing and editing the draft; Catherine Bastin: development of original trait definitions, reviewing and editing the draft; Sylvie Vanderick: Reviewing and editing the draft; Xavier Hubin: data curation, reviewing and editing the draft; and Nicolas Gengler: conceptualization, funding acquisition, project administration, resources, supervision, and reviewing and editing the draft.The authors have not stated any conflicts of interest.

Figure 1 .
Figure 1.Lactation curves for milk yield (MY), fat percentage (FP), protein percentage (PP), casein percentage (CNP), milk calcium content (CC) expressed as milligrams per kilogram of milk, SCS, coagulation time (CT) expressed in minutes, curd firmness after 30 min from rennet addition (a30) expressed in millimeters, and titratable acidity (TA) in Dornic degrees for the first (blue), second (red), and third (green) parity in Walloon Holstein cows.

H
. Atashi acknowledges the support of the Walloon Government (Service Public de Wallonie, Direction Générale Opérationnelle Agriculture, Ressources Naturelles et Environnement, SPW-DGARNE, Namur, Belgium) for its financial support facilitating his stay in Belgium through the WALLeSmart Project (D31-1392 and D65-1435).H. Wilmot, as a current research fellow, and N. Gengler, as a former senior research associate, acknowledge the support of the Fonds de la Recherche Scientifique-FNRS (Brussels, Belgium).The authors thank the INTERREG VA France-Wallonie-Vlaan-Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS deren program and the Walloon Government (Service Public de Wallonie, Direction Générale Opérationnelle Agriculture, Ressources Naturelles et Environnement, SPW-DGARNE, Namur, Belgium) for their financial support through the BlueSter project and previous projects.The authors also acknowledge the technical support by the Walloon Breeders Association (Elevéo, Ciney, Belgium).The University of Liège-Gembloux Agro-Bio Tech (Gembloux, Belgium) supported computations through the technical platform Calcul et Modélisation Informatique (CAMI) of the TERRA Teaching and Research Centre, partly supported by the Fonds de la Recherche Scientifique-FNRS under grant no.T.0095.19(PDR "DEEPSELECT" Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS

Table 1 .
Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS The calibration and the cross-validation statistics of the mid-infrared prediction equations used

Table 2 .
Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS Descriptive statistics for milk yield traits and cheese-making properties in Walloon Holstein cows 1

Table 3 .
Mean (SD) daily heritability for milk yield traits and cheese-making properties estimated across the lactation in the first 3 parities in Walloon Holstein cows 1 3

Table 4 .
Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS Atashi et al.: GENOME-WIDE ASSOCIATION STUDY FOR CHEESE-MAKING TRAITS Genomic regions associated with milk yield traits and cheese-making properties in the first 3 parities in Walloon Holstein cows 1 Chromosome 2