Genome-wide association study of anogenital distance and its (co)variances with fertility in growing and lactating Holstein-Friesian dairy cattle

Anogenital distance (AGD) is a moderately heritable trait that can be measured at a young age that may provide an opportunity to indirectly select for improved fertility in dairy cattle. In this study, we characterized AGD and its genetic and phenotypic relationships with a range of body stature and fertility traits. We measured AGD, shoulder height, body length, and body weight in a population of 5,010 Holstein-Friesian and Holstein-Friesian × Jersey crossbred heifers at approximately 11 mo of age (AGD1). These animals were born in 2018 across 54 seasonal calving, pasture-based dairy herds. A second measure of AGD was collected in a subset of herds (n = 17; 1,956 animals) when the animals averaged 29 mo of age (AGD2). Fertility measures included age at puberty (AGEP), then time of calving, breeding, and pregnancy during the first and second lactations. We constructed binary traits reflecting the animal’s ability to calve during the first 42 d of their herd’s seasonal calving period (CR42), be presented for breeding during the first 21 d of the seasonal breeding period (PB21) and become pregnant during the first 42 d of the seasonal breeding period (PR42). The posterior mean of sampled heritabilities for AGD1 was 0.23, with 90% of samples falling within a credibility interval (90% CRI) of 0.20 to 0.26, whereas the heritability of AGD2 was 0.29 (90% CRI 0.24 to 0.34). The relationship between AGD1 and AGD2 was highly positive, with a genetic correlation of 0.89 (90% CRI 0.82 to 0.94). Using a GWAS analysis of 2,460 genomic windows based on 50k genotype data, we detected a region on chromosome 20 that was highly associated with variation in AGD1, and a second region on chromosome 13 that was moderately associated with variation in AGD1. We did not detect any genomic regions associated


INTRODUCTION
The reproductive success of a dairy cow is a crucial determinant of its economic value within a farm system (Verkerk, 2003).This is especially true in grazing-based systems, where cows are generally required to maintain a strict annual seasonal calving cycle to coincide feed demand with pasture growth and availability (Macdonald and Roche, 2023).Cows with superior reproductive performance calve earlier in the calving season and thus have more time for postpartum recovery before the start of the next seasonal breeding period.The former leading to longer lactations and the latter to an increased chance of becoming pregnant in a timely manner, and therefore having longer productive lives in the herd.
The accuracy of fertility EBVs may be improved though selection for traits that are genetically correlated to the breeding objective.For example, Sun et al. (2010) reported improved model stability and predictive ability when using milk production traits as predictors for fertility EBVs.However, large ranges in estimated genetic correlations between fertility traits and potential predictor traits such as milk production have been reported (Berry et al., 2014).Furthermore, milk production traits may not be suitable predictors of fertility, because antagonistic genetic correlations have been reported between milk production and fertility traits (Berry et al., 2014).Milk production is a trait that is under direct selection in most countries and using milk production traits as predictors of fertility would make it difficult to identify bulls with superior EBVs for both production and fertility.It is therefore preferable to focus on more suitable predictor traits that do not exhibit a strong antagonistic association with traits that are already under direct selection.
One possible candidate for an early in life and moderately heritable (Gobikrushanth et al., 2019) predictor of fertility is anogenital distance (AGD).In females, AGD can be defined as the distance between the anus and the clitoris (Figure 1; Gobikrushanth et al., 2017) and it has been associated with fertility performance in mice (Zehr et al., 2001;Welsh et al., 2008), humans (Mendiola et al., 2012) and cattle (Gobikrushanth et al., 2017;Carrelli et al., 2021;Grala et al., 2021;Madureira et al., 2022).The process for measuring AGD is straightforward, and this trait could be measured by farmers in the future.There is some evidence, however, that the association between AGD and fertility performance in cattle is not consistent across populations and parities.Gobikrushanth et al. (2019) were not able to detect an association in an Irish population of mixed-parity dairy cattle and an interaction between the association of AGD with fertility performance and parity has been reported (Gobikrushanth et al., 2017;Madureira et al., 2022).Hence, it is of interest to understand not only the phenotypic and genetic associations between AGD and fertility traits in cattle, but also the potential interactions of age or parity on these associations.Furthermore, a GWAS of a predictor trait may improve understanding of the genetic architecture of a low heritability target trait.Some genomic regions that are associated with a predictor trait will also be associated with the target trait, and enrichment of SNP chips in these regions might improve the accuracy of genomic EBVs for the target trait (Xiang et al., 2021).
The present study had 3 objectives.First, to calculate (co)variances of AGD measured relatively early in life, at approximately 11 mo of age (AGD1) around the time Holstein-Friesian and Jersey heifers reach puberty (Hickson et al., 2011;Meier et al., 2021), and at approximately 29 mo of age (AGD2) during first lactation.Second, to identify regions of the genome associated with variation in AGD.Third, to estimate the covariances between AGD, yearling body stature traits and fertility phenotypes.We hypothesized that AGD1 would exhibit a high genetic correlation with AGD2, and that AGD at either age would exhibit moderate phenotypic and genetic relationships with reproductive phenotypes.

Animals
The Ruakura Animal Ethics Committee (Hamilton, New Zealand) approved the design of this study and all manipulations (AE application: 14448).Heifer phenotypes (age at puberty [AGEP], shoulder height, body length, BW, and AGD) were collected from 5,010 animals born between July and September 2018 that had been reared on 54 seasonal calving, pasturebased herds located across 3 regions (Waikato: n = 35 herds, 3,197 animals; Taranaki: n = 15 herds, 1,277 animals; Otago: n = 4 herds, 536 animals) of New Zealand.On average, there were 92 (SD = 49) heifers in each herd.The herds were selected based on the quality of existing data records, the proportion of the heifer cohort that was >90% Holstein-Friesian, their ability to enhance the diversity of sires contributing to the final pool of animals and the willingness of the herd owners to make animals available for measurements.Most of the animals in this study were >90% Holstein-Friesian (n = 2,340) and Holstein-Friesian × Jersey crossbred (n = 2,276).A small number of were >90% Jersey (n = 24).There were 48 animals could not be assigned to the breed categories of Holstein-Friesian, Jersey or Holstein-Friesian × Jersey, these animals were predominantly Ayrshire cross Holstein-Friesian.Some 322 animals were excluded from our analysis due to incomplete parentage records or missing genotypes or both.
A total of 257 sires were represented in the data set, and 56 of those sires had at least 10 daughters distributed over at least 3 of the 54 herds.Most of these heifers were followed through to the completion of their second lactation.First-lactation calving (n = 4,327), breeding (n = 4,111), and pregnancy (n = 3,939) phenotypes were recorded between June 1, 2020, and May 31, 2021.Second lactation calving (n = 3,575), breeding (n = 3,507) and pregnancy (n = 3,353) phenotypes were recorded between June 1, 2021, and May 31, 2022.Animals with missing phenotypes in first or second lactation were either culled, sold, or deceased.A subset of herds (n = 17; Waikato: n = 9 herds, 1,015 animals; Taranaki: n = 6 herds, 629 animals; Otago: n = 2 herds, 312 animals) comprising 1,956 animals had AGD measured for a second time during first lactation.Some 131 of these animals were excluded from our analysis due to incomplete parentage records, missing genotypes, or both.Herds were selected to be included in the second measure of AGD based on their quality of recording and the willingness of their owners to make animals available for measurement.A total of 128 sires were represented in the subset of animals with AGD2 phenotypes, and 33 of those sires had at least 10 daughters distributed over at least 3 of the 17 herds.

Anogenital Distance
We defined AGD as the distance between the center of the anus and the base of the clitoris (Figure 1), measured using stainless steel digital calipers as described by Gobikrushanth et al. (2017).The AGD1 phenotype was measured when the average age of heifers was 11 mo (SD = 0.5 mo).The AGD2 phenotype was measured approximately 11 to 14 wk after the herd's planned start of the seasonal breeding period during the first lactation, when the animals were, on average, 29 mo old (SD = 0.65 mo).

Age at Puberty
We used the time at which blood plasma concentrations of progesterone (BP4) first exceeded 1 ng/mL (Meier et al., 2021) to represent the trait AGEP.This phenotype was based on 3 herd visits at monthly inter-vals between May and August 2019.The BP4 levels of a postpubertal heifer follow a 3-wk cycle, and generally BP4 will be basal for 1 wk, followed by 2 weeks of elevation (Williams and Cardoso, 2021).Cyclic BP4 levels will result in some false negatives when using BP4 concentration measured at monthly intervals to represent pubertal status.Some animals will happen to be in a basal phase of their 3 wk cycle in the first visit following their onset of puberty, and so they will not be identified as postpubertal until the following herd visit, around 1 mo later.Therefore, as well as the reduced precision arising from a monthly interval between visits, the monthly visit schedule implemented here will also introduce some upward bias in mean AGEP phenotypes.Although our phenotyping approach yields an AGEP phenotype that lacks some precision, reducing visit frequency is an established strategy to manage costs and enable phenotype collection across thousands of animals (Johnston et al., 2009;Wolcott et al., 2018).Furthermore, Stephen et al. (2022a) have previously used simulated AGEP phenotypes to demonstrate that estimated variance parameters and breeding values are remarkably robust to infrequent herd visits.The average age of these 2018-born heifers was 10, 11, and 12 mo at these visits.The visits were timed to target 45% of animals being postpubertal on the second visit, which was predicted according to the stochastic model of Dennis et al. (2018).Blood samples were collected from a coccygeal vessel into blood tubes containing lithium heparin (BD Vacutainers, BD New Zealand, Auckland, New Zealand).These samples were placed on ice for transport and then centrifuged (at 4°C, 1,900 × g for 12 min) within 24 h of collection.The blood plasma was stored at −20°C.The BP4 concentrations were determined using a commercial radioimmune assay kit (ImmuChem Progesterone Double Antibody RIA, MP Biomedicals LLC, Irvine, CA), following the same protocol described by Meier et al. (2021).
Each animal's AGEP phenotype was censored, in that it is not known precisely, but rather determined to fall within an upper and lower bound.This type of phenotype censoring is described in detail by Stephen et al. (2022a) who used simulated data to produce an identical phenotyping strategy to that used here.Briefly, the lower bound for an animal that attained puberty during the period of measurement was their age at the visit immediately before the visit when the animal was first observed to have BP4 >1 ng/mL.For example, the lower bound for an animal with first BP4 elevation on the second visit was their age on the first visit, because we observed them as prepubertal (that is, they did not have BP4 elevation) at that age, and so we assumed that they did not reach puberty any younger than this.The lower bound for an animal that attained puberty before the period of measurement was its birth date as we do not have information about how old the animal was when BP4 first became elevated.The upper bound for an animal that attained puberty during the measurement period was the age on the visit when the animal was first observed to have BP4 >1 ng/mL.For example, the upper bound for an animal with elevated BP4 on the second visit was the age at the day of the second visit.The upper bound for an animal that attained puberty after the measurement period was 600 d in the current analysis.This represents an age at which we would expect every animal to have attained puberty (Dennis et al., 2018).

Height, Length, and BW
Animals had their height, length, and BW measured on the same day as AGD1 (at approximately 11 mo old).Height was defined as the distance (cm) from the ground to the top of the shoulder.Length was defined as the distance (cm) between the shoulders and the base of the tail.Body weight (kg) was measured using calibrated electronic weigh scales, while animals were stationary.

Calving and Breeding in First and Second Lactation
Animals were seasonally bred to calve for the first time between June and November 2020 when they were, on average, 24 mo old (SD = 0.75).The mean calving date across all herds was July 31, 2020, and 95% of cows calved before September 4, 2020.These cows were then re-bred using either AI or natural breeding between October and December of the same year as per routine farm management under a seasonal calving, pasture-based grazing system.The length of the breeding periods (AI only) in each herd ranged from 61 to 100 d (mean ± SD: 77 ± 9 d).Animals calved for the second time between June and November 2021 when they were, on average, 36 mo old (SD = 0.80).The mean calving date across all herds was August 10, 2021, and 95% of cows calved before October 26, 2021.These cows were then re-bred between October and December of the same year using either AI or natural breeding.The length of the 2021 breeding periods (AI only) for the 2018-born animals in each herd ranged from 28 to 95 d (mean ± SD: 61 ± 19 d).Hormone-based fertility interventions to advance estrous are a routine management tool in the New Zealand dairy industry and were used strategically on later calving cows in most herds involved in this study; however, producer recording of these interventions was incomplete and therefore these were ignored in the current analysis.Calving and breeding dates were recorded by herd managers.All available first lactation data were downloaded from the New Zealand national dairy database in June 2021, at the completion of that cohort's first lactation.Second lactation data were downloaded in June 2022.In both lactations, calving and breeding rates were defined as binary scores.We assigned animals that calved within the first 42 d of their herd's calving season (CR42) a score of 1, and those who calved later than this a score of 2. Animals that were removed from the herd due to reproductive failure (late pregnancy or failure to become pregnant) in first lactation were assigned a score of 2 for CR42 in second lactation.This phenotype recognizes that if they had not been removed from the herd, they would have either calved after the first 42 d of the herd's calving season, or they would have failed to calve.The start of calving for each herd was defined as the mean calving date of the first 10% of 2018 born cows to calve.This method of deriving a calving phenotype is consistent with national evaluations in New Zealand (Bowley et al., 2015).We assigned animals that were presented for breeding within the first 21 d of their herd's seasonal breeding period (PB21) a score of 1, and those who were mated later than this, or did not present for breeding, a score of 2. The start of breeding for each herd was defined as the mean breeding date of the first 10% of 2018 born cows to be mated (Bowley et al., 2015).

Pregnancy in First and Second Lactation
All 2018-born animals underwent pregnancy testing for fetal-age using ultrasonography as per standard practice offered commercially by veterinary providers.Pregnancy testing was carried out 11 to 14 wk after the herd's planned start of the seasonal breeding period.Animals were pregnancy tested in both first and second lactations.Conception dates were confirmed by verifying that the size of the fetus was consistent with the predicted age based on insemination records.Pregnancy rate was defined as a binary score.We assigned cows that attained pregnancy within the first 42 d of their herd's seasonal breeding period (PR42) a score of 1, whereas cows that did not attain pregnancy within this timeframe were assigned a score of 2. Animals that were removed from the herd in the first lactation were not assigned a phenotype for PB21 or PR42 in the second lactation.

Genotypes
All animals were genotyped using the Weatherbys Versa 50k SNP array (Illumina, USA).The SNP call rates were consistently high, and only 12 samples were excluded from our analysis due to low call rates Stephen et al.: VARIANCE PARAMETERS FOR ANOGENITAL DISTANCE (<90%).We imputed a small proportion of missing genotypes using FImpute software (Sargolzaei et al., 2014).We disregarded SNP located on the X chromosome, unmapped SNP, as well as 2,120 SNP with minor allele frequency <1%.This left around 47,000 SNP to be included in our analyses.

Analyses
We first estimated genetic, residual, and phenotypic variances for each trait using univariate analyses.We used trivariate analyses to estimate the covariances between body traits (height, length, body weight, AGD1, and AGD2) and fertility traits in first and second lactations (CR42, PB21, and PR42).For example, the covariances between AGD1, CR42 in first lactation and CR42 in second lactation were estimated using a trivariate analysis to lessen the implications of fertility-driven selection occurring between first and second lactation, where animals that do not become pregnant during the seasonal breeding period in first lactation are removed from the herd.We used bivariate analyses to estimate the covariances between the pair-wise combinations of height, length, BW, AGEP, AGD1, and AGD2.We used bivariate analyses to estimate covariances between fertility traits using the pair-wise combinations of CR42, PB21, and PR42 during first and second lactations.

Model Equation
We fitted a marker effects model (Garrick et al., 2014) to estimate random additive marker effects, fixed herd effects, genetic and residual variance parameters, and in the case of bivariate and trivariate analyses, covariance parameters.Univariate analyses of AGD1 and AGD2 were used to undertake a GWAS.Matrix representation of the marker effects model equation is where y is a vector of phenotypes (1 phenotype per study animal) and b is a vector of fixed effects.Fixed effects for AGD1, height, length, and BW were herd and age (in d), whereas only herd was fitted as a fixed effect for AGD2, AGEP, and fertility measured during lactation.Bayesian multiple-regression analysis is robust to population structure (Toosi et al., 2018); therefore, we did not fit breed proportions as fixed effects.
The vector a contains additive marker effects.The vector e represents residuals corresponding to each of the phenotypes.The incidence matrix X relates each phenotype record to relevant fixed herd effects, and in the case of AGD1, height, length, or LWT analyses, the fixed effects of age as a covariate.The covariate matrix M is a matrix of genotype covariates (coded as 0, 1, or 2) and relates each phenotype record to the number of 1 of the alleles at each SNP marker position.M has a column for each SNP marker, and a row for each genotyped animal with a phenotype.The unknowns sampled in each analysis include the vectors b and a and the scalars representing genetic and residual (co)variances.For the AGEP trait, where phenotypes are censored, only a lower and upper bound is available.Therefore, for the analysis involving AGEP, the vector y is also unknown, and this variable is sampled within the Gibbs sampling process, alongside other unknowns.

Software and Solver
Genetic analysis was performed using the JWAS package (Cheng, 2018) implemented in Julia (Bezanson et al., 2017).A Markov-chain Monte Carlo (MCMC) technique was applied using a single site Gibbs sampler to obtain samples from the posterior distributions of variance parameters and fixed and marker effects.BayesC methodology was used, where Pi, the scalar proportion of markers with no effect, was set to 0.99 for univariate analyses.In bivariate and trivariate analyses, the vector Pi was sampled along with other unknowns (Habier et al., 2011).
For the analysis of AGEP, where phenotypes were censored, we used a data augmentation approach, implemented within the Gibbs sampler to obtain plausible values for each animal's phenotype.The AGEP phenotypes were re-sampled at each iteration at the Gibbs sampler, along with each of the other unknown variables (Stephen et al., 2022a).
We tested the chains of MCMC samples in our analyses for evidence of nonconvergence of parameters, we made inference on using the method described by (Geweke, 1992).In addition, we assessed MCMC convergence by grouping post burn-in samples consecutively in quartile groups.That is, group 1 included the first 25% of MCMC samples, group 2 included next 25% of MCMC samples and so forth).We compared the mean and distribution of unknown variance parameters calculated from each of the 4 MCMC sample quartiles.Each analysis was considered converged when there was consistent alignment in the 25% and 75% percentile (50% credibility interval) for parameters across the consecutive quartiles of MCMC samples.The univariate, bivariate, and trivariate analyses comprised 100,000, 300,000, and 600,000 samples, respectively.We selected suitable chain lengths by undertaking several test analyses where the number of MCMC iterations was varied and tested for evidence of nonconvergence.The first 2,000 samples were disregarded as a burn-in.
Prior values for genetic and residual variances estimated using univariate analysis were obtained from existing literature (see Supplemental Material, https: / / doi .org/ 10 .17632/bzsd2kr7mm .1;Stephen, 2023).Prior values for genetic and residual variances estimated using bivariate or trivariate analysis were obtained from our univariate analysis, whereas prior values for covariances were obtained from the literature (see Supplemental Material, https: / / doi .org/ 10 .17632/bzsd2kr7mm .1;Stephen, 2023).
The Julia packages CSV, StatsPlots, and DataFrames were used to postprocess the results.We produced credibility intervals based on thresholds for the 5% (lower bound) and 95% (upper bound) percentiles (90% CRI).

Genome-Wide Association Study
We defined genomic windows as spanning 20 contiguous SNP, which resulted in 2,460 windows, that were generally 1 cM long.The JWAS software computes WPPA (window posterior probability of association) for each of the 2,460 genomic windows, which represents the probability that a given region is associated with variance in a trait (Fernando et al., 2017).The WPPA metric can be used as a proxy for the probability that a region harbors a QTL.We considered windows that explained at least 1% of the total genetic variance and used a WPPA threshold of 0.8 to identify regions that were associated with genetic variance in AGD.At that threshold, we would expect the proportion of false positives to be restricted to 0.20 (Fernando et al., 2017).We used Ensembl (http: / / ensembl .org/ ) to view the University of Maryland assembly of the Bos taurus 3.1 genome build (UMD 3.1, College Park, MD) and to search for protein-coding genes within genomic regions that exhibited a WPPA above threshold.

Correlations-AGD with Fertility Traits Measured During First and Second Lactation
The genetic and phenotypic correlations between AGD measured at 2 ages and fertility traits during first and second lactations are presented in Table 3.The genetic correlations between AGD1 and CR42 in first and Heritabilities are displayed on the diagonal.Correlations were calculated using pair-wise bivariate analyses.Heritabilities were calculated using univariate analyses.Values in parentheses represent 90% credibility intervals.Heritabilities are displayed on the diagonal.Correlations were calculated using pair-wise bivariate analyses.Heritabilities were calculated using univariate analyses.Values in parentheses represent 90% credibility intervals.
second lactations were 0.24 and 0.32, respectively.The genetic correlations between AGD1 and PB21 in first and second lactations were 0.52 and 0.40, respectively.The genetic correlation was 0.28 for AGD1 and PR42 in first lactation, whereas it was 0.19 for AGD1 and PR42 in second lactation.The phenotypic correlations between AGD1 and fertility traits measured during lactation ranged from −0.01 to 0.03.The genetic correlations between AGD2 and fertility traits measured during first and second lactations were 0.58 and 0.63 for CR42, 0.57 and 0.52 for PB21, and 0.46 and 0.48 for PR42, respectively.Phenotypic correlations between AGD2 and fertility traits measured during lactation were weakly positive, ranging from 0.03 to 0.11.

AGD Correlations with Prelactation Heifer Traits
The genetic and phenotypic correlations between AGD measured at 2 ages with prelactation heifer traits are presented in Table 1 and Table 2.The AGD1 and AGD2 traits exhibited a strong positive genetic correlation of 0.89, whereas the phenotypic correlation was weaker at 0.33.In general, both genetic and phenotypic correlations between AGD and body stature traits were weak (<0.16).Genetic and phenotypic correlations were 0.16 and 0.06 between AGD1 and height, 0.06 and 0.06 between AGD1 and length, and 0.13 and 0.12 between AGD1 and BW, respectively.The genetic and phenotypic correlations were 0.15 and 0.04 between AGD2 and height, −0.10 and 0.03 between AGD2 and length, and 0.12 and 0.04 between AGD2 and BW, respectively.The genetic relationship between AGD and AGEP varied depending on the timing of the AGD measure.The genetic correlation between AGD1 and AGEP was weak at 0.10.Conversely, the genetic correlation between AGD2 and AGEP was moderate at 0.3.The phenotypic relationship between AGD and AGEP was also dependent on the timing of the AGD measure.The phenotypic correlation between AGD1 and AGEP was −0.02, whereas the phenotypic correlation between AGD2 and AGEP was 0.09.

Body Stature Correlations with Prelactation Heifer Traits
Genetic and phenotypic correlations between prelactation body stature phenotypes were consistently positive, and relationships ranged from moderate to high (Table 1).Height exhibited a genetic correlation of 0.63 with length and 0.67 with BW, and length and BW exhibited a genetic correlation of 0.82.Phenotypically, height exhibited a correlation of 0.32 with length and 0.47 with BW, and length and BW exhibited a phenotypic correlation of 0.48.Heritabilities are displayed on the diagonal.CR42: a binary score denoting whether a cow calved within the first 42 d of her herd's seasonal calving period (success = 1, failure = 2) during her first (L1) or second lactation (L2).PB21: a binary score denoting whether a cow was mated within the first 21 d of her herd's seasonal breeding period (success = 1, failure = 2) during her first (L1) or second lactation (L2).PR42: a binary score denoting whether a cow became pregnant within the first 42 d of her herd's seasonal breeding period (success = 1, failure = 2) during her first (L1) or second (L2) lactation.Correlations involving CR42, PB21, and PR42 were calculated using trivariate analyses so that both L1 and L2 phenotypes were included.All other correlations were calculated using pair-wise bivariate analyses.Heritabilities were calculated using univariate analyses.Values in parentheses represent 90% credibility intervals.A positive correlation between an AGD trait and a fertility trait (CR42, PB21, PR42) indicates that selection for shorter AGD would result in improved fertility outcomes.

Correlations Between First and Second Lactation Fertility Traits
The genetic and phenotypic correlations between fertility traits measured during lactation are presented in Table 3. Genetic correlations among first lactation fertility phenotypes were high and positive, ranging from 0.73 to 0.78, whereas genetic correlations among second lactation fertility phenotypes were moderately to highly positive, ranging from 0.36 to 0.69.Genetic correlations between first lactation and second lactation fertility phenotypes were also high and positive, ranging from 0.64 to 0.91.Phenotypic correlations between fertility phenotypes were weakly positive, ranging from 0.08 to 0.19 between first lactation traits and from 0.20 to 0.26 between second lactation traits.Phenotypic correlations between first lactation and second lactation fertility phenotypes also tended to be weakly positive; however, PR42 in first lactation exhibited a highly positive phenotypic correlation with CR42 in second lactation (0.85).

GWAS of AGD1 and AGD2
We identified 1 genomic region in our GWAS of AGD1 with WPPA values above our nominal threshold of 0.8 (Figure 2).This region was on chromosome 20, spanning 34,158,865 to 35,034,032 bp, and had a WPPA of 0.87.The SNP with the largest effect within this window was BTA-50400-NO-RS (location: 34,474,873 bp).We detected 1 protein-coding gene within this region on Ensembl (Table 4).A second noteworthy region was identified on chromosome 13, with a WPPA of 0.78, just below our threshold of 0.80.This genomic region spanned 9,276,071 to 9,996,039 bp (Table 4).The SNP with the largest effect within this window was BOVINEHD1300002589 (location: 9,656,579 bp).We detected 1 protein-coding gene recorded within this region on Ensembl (Table 4).The GWAS of AGD2, based on only about a third of the number of animals, did not yield any regions associated with variation in the AGD trait (Figure 3).

DISCUSSION
We investigated AGD and its genetic and phenotypic relationships with fertility traits in a population of predominantly Holstein-Friesian sired dairy cattle managed in a seasonal calving, pasture-based system.Our analysis determined that AGD measured at approximately 11 and 29 mo old were moderately heritable and highly genetically correlated traits, indicating that AGD has promise as a novel predictor trait that can be measured relatively early in life.Most importantly, AGD exhibited a moderate genetic correlation with a range of low heritability fertility traits that are measured during lactation and often used for genetic selection for fertility.Together, these results indicate that inclusion of AGD as a predictor trait could improve the accuracy of fertility EBVs, much earlier in the life of cows (and their sires) than conventional fertility phenotypes allow.Furthermore, we identified 2 genomic regions (chromosome 20 and 13) that were associated with AGD1.Investigation of these genomic regions could improve our understanding of the genetic architecture of calving-, breeding-and pregnancy-related fertility traits and offer opportunities to accelerate genetic improvement.

Heritability of AGD
Anogenital distance was a moderately heritable trait when measured at either age, but there was a suggestion that measuring AGD later in life may yield a higher heritability.An increased heritability in older animals could reflect their greater maturity.This logic is consistent with Rajesh et al. (2022), who reported that AGD became more repeatable as animals got older, possibly due to an interaction between AGD and maturity.Animals in the present study were born over a 3-mo period, as is typical for replacement heifers in a seasonal, pasture-based system, which means animal age and, therefore, maturity varied on the day that we measured AGD.We attempted to account for the maturity-related variance in AGD by fitting age as a fixed covariate within our analysis.For our analysis of AGD1, the effect of age was approximately 0.01 cm per day, which represents a difference between the oldest and youngest animals in this study of approximately 1 phenotypic standard deviation in AGD1.Conversely, the age effect was negligible when analyzing AGD2 phenotypes, and it was subsequently removed from the model.
The heritability of AGD has not been widely published; however, Gobikrushanth et al. (2019) reported a heritability of 0.37 (±0.08) in a population of dairy cows 2 to 4 yr old, which is slightly higher than the heritability of 0.29 for AGD when measured at 29 mo old in our study.In the context of fertility-related traits, a candidate predictor trait with a moderate heritability is of high interest, as target fertility traits that tend to be measured during lactation generally have very low heritabilities.In New Zealand, the target fertility trait is a combination of calving rate (CR42) during second, third and fourth lactation.Calving rate phenotypes have previously been reported to have heritabilities of less than 0.10 (Harris et al., 2006;Bowley et al., 2015) and, in the present study, we have also found low heritabilities for fertility phenotypes measured during the first and second lactations.For any given trait, response to selection is dependent on the reliability of EBVs and that is a function of heritability and number of observations.One strategy for improving the response to selection in a low heritability trait is to make use of traits with a nonzero genetic correlation.The large differential between the heritability of AGD and fertility traits measured during lactation means that AGD may add value as a predictor trait.

Correlation Between AGD and Fertility Measured During Lactation
To our knowledge, this is the first time that genetic associations between AGD and reproductive traits have been reported in dairy cattle.Our analysis indicates that the genetic relationships between AGD and calving, breeding and pregnancy rate phenotypes are moderately positive, indicating that shorter AGD is associated with earlier calving, breeding, and pregnancy dates in a seasonal calving system.Therefore, the use of AGD as a predictor trait would increase the reliability of fertility EBVs, particularly among animals that have an AGD phenotype, but do not have direct fertility phenotypes.In New Zealand, the fertility EBV represents a combination of calving rate in first, second, third and fourth parity.We have only considered first and second parity in our analysis, but we did not detect an interaction of parity on the genetic association between AGD and fertility traits measured during lactation.The consistency of our results across first and second lactation suggests that, in this population, AGD is predictive of fertility performance regardless of parity.
Our analysis indicates that the genetic correlation between AGD and target fertility traits are highest when AGD is measured in older animals.In this study, our second measure of AGD coincided with early pregnancy testing during the seasonal breeding period in first lactation, which would not confer a timing advantage over using pregnancy rate directly in fertility evaluations and would actually be available later than calving rate measured in first lactation.For both AGD1 and AGD2, however, our estimated genetic associations were accompanied by large 90% CRI.The lack of precision in these estimates makes it difficult to know if the genetic correlations truly depend on the age that AGD is measured, especially as AGD1 and AGD2 were themselves highly genetically correlated (0.89) with a tight 90% CRI.We recommend that AGD and fertility phenotypes are measured on further cohorts of animals, as additional data will improve the precision of our estimated correlations.If the genetic correlation between AGD and fertility does depend on the timing of the AGD measure, then it would be useful to understand this dependency in more detail.It is likely that there is an age for measuring AGD when the trade-off between the strength of the genetic correlation and the potential earlier timing advantage are optimized; a longitudinal study of AGD is therefore required.Ideally, this would involve repeat measurements of AGD at least bimonthly in a population of animals from birth through to 30 mo of age so that the genetic correlations between AGD measured at many ages and fertility during lactation can be estimated directly.
In the population of dairy cattle used in the present study, the phenotypic associations between AGD and fertility traits were consistently close to 0; however, they tended to be weakly positive when animals were older at the time of the AGD measurement (AGD2).That is, shorter AGD2 exhibited a weak phenotypic association with superior fertility outcomes in our population.Our results indicate that there may be an interaction of age at the time of AGD measure, on the phenotypic association between AGD and fertility.This potential interaction is an important consideration for those who aim to predict fertility using AGD phenotypes.Similar to the genetic associations discussed previously, we did not detect an interaction of parity on the phenotypic associations we determined.Our results are supported by a general consensus in the literature that shorter AGD is associated with superior fertility (Gobikrushanth et al., 2017;Akbarinejad et al., 2019;Grala et al., 2021;Carrelli et al., 2022;Madureira et al., 2022).Gobikrushanth et al. (2017) measured AGD in a population of North American dairy cattle managed in continuous calving, housed systems.They identified moderate phenotypic associations between AGD and reproductive traits, whereby a shorter AGD in first and second parity cows was associated with a greater pregnancy rate to first artificial insemination (P/AI) and a greater likelihood of pregnancy by 250 DIM.The authors noted, however, the possibility of an interaction between parity and AGD, as they did not observe an association between AGD and these fertility outcomes in third parity cows and greater.These higher parity groups are likely comprised of cows with superior genetic merit for fertility, as poorer fertility cows are generally culled from the herd.The work of Gobikrushanth et al. (2017) was later repeated in a mixed-parity population that included first, second, and third plus parity Irish dairy cows managed in a seasonal calving, pasture-based system (Gobikrushanth Stephen et al.: VARIANCE PARAMETERS FOR ANOGENITAL DISTANCE et al., 2019).In that second study, authors reported a null association between AGD and fertility phenotypes, which included the PB21 and PR42 traits reported in the current study.Gobikrushanth et al. (2019) considered that aggressive selection for reproductive traits in their population of Irish cattle may have altered the phenotypic relationship between AGD and reproduction traits.However, more recently Grala et al. (2021) reported a moderate phenotypic association between AGD and CR42 in second lactation (shorter AGD was associated with earlier calving) in a herd of around 475 seasonally managed, grazing Holstein-Friesian cows in New Zealand.Fertility is a priority trait in New Zealand, and it has been directly incorporated into the national selection index since 2005 (Pryce et al., 2014).Instead, the null associations reported by Gobikrushanth et al. (2019) may be due to the greater representation of third parity and older animals in their population, as we detected some evidence that the association between AGD and fertility phenotypes is dependent on parity (Gobikrushanth et al., 2017).Madureira et al. (2022) and Carrelli et al. (2022) also grouped the cows in their studies by parity and reported that shorter AGD was phenotypically associated with improved performance for a range of fertility-related phenotypes in both their younger (first parity) and older (second parity and greater) cow groups.They also detected an interaction between AGD and parity for 1 of the phenotypes they measured (the relative increase in activity during estrus) and their results suggested a stronger relationship between AGD and estrus activity change in later parity cows.Gobikrushanth et al. (2019) analyzed cows from multiple parities together, and it is possible they would have detected an association between AGD and fertility if they had partitioned animals by parity and accounted for the fact that the least fertile animals were likely to have been culled before their third parity.Nevertheless, it is also possible that the association between AGD and fertility traits is population specific, which supports the need to investigate this novel phenotype across different cow populations.
The drivers of genetic variance in AGD and the physiological mechanisms underpinning its genetic and phenotypic association with fertility are not well understood; however, it has been established that AGD is a marker for prenatal exposure to androgens, and dysfunction in prenatal androgen exposure can have long standing effects on the health and fertility outcomes of mammals.For example, exposure to excess prenatal testosterone concentrations has been shown to masculinize external genitalia of female rats (Hotchkiss et al., 2007) and sheep (Lamm et al., 2012).Conversely, androgen deficiencies, engineered by the administration of androgen or androgen receptor antagonists, resulted in a significantly reduced AGD in male rats (Welsh et al., 2008;MacLeod et al., 2010).Furthermore, inappropriate prenatal androgen exposure is associated with several metabolic and reproductive disorders in mammals, including insulin resistance (Bruns et al., 2004), polycystic ovaries (Abbott et al., 1998), and permanent deformation of reproductive organs (Welsh et al., 2008;Lamm et al., 2012).It is likely that both AGD and fertility traits are influenced by prenatal androgen exposure, and this is the basis of the associations exhibited between the 2 traits.

GWAS
Our GWAS of AGD1 identified 1 genomic region on chromosome 20 associated with variance in the trait, and a second region on chromosome 13 that was borderline using our WPPA threshold of 0.8.As a WPPA threshold is lowered, the chance of false positives in a GWAS is increased (Fernando et al., 2017).Conversely, using a higher WPPA threshold introduces risk that true positives will be disregarded.We attempted to balance these risks by setting our WPPA threshold as 0.80.To our knowledge, this is the largest data set examined to date for genomic regions or SNP associated with AGD, but an even larger population size for phenotyping AGD1 would improve the statistical power of the GWAS presented here and perhaps lead to the identification of additional genomic regions above this WPPA threshold.Further, neither of these regions were associated with AGD2 and our GWAS of AGD2 did not identify any additional genomic regions of interest.The discordance of our results across the 2 AGD phenotypes may simply be due to population size, as fewer animals had AGD2 phenotypes measured.In future, it would be beneficial to measure AGD2 in a larger population and repeat our GWAS analysis for this later phenotype.Enriching SNP chips across genomic regions associated with variation in AGD would potentially increase the prediction accuracy of genomic EBVs for AGD, improving its utility as a predictor trait for fertility.Identifying specific gene effects on AGD could improve the utility of AGD as a predictor trait for fertility in 2 ways.First, if a gene that is associated with AGD is also associated with fertility, then enriching a SNP chip in the region of that gene could directly improve genomic the prediction of fertility.Second, if a gene that is associated with AGD is not associated with fertility then removing genetic variation in AGD caused by this genomic region could ultimately improve the predictive ability of the adjusted ADG trait.The genomic region that we identified on chromosome 20 has been previously identified in a GWAS of AGD in 908 mixedparity Irish Holstein-Friesian cows (Gobikrushanth et al., 2019).Although Gobikrushanth et al. (2019) did not identify any SNP significantly associated with variance in AGD after adjustment for multiple testing; they did highlight 6 SNP of suggestive association on chromosomes 6, 15, 20 and 26.The SNP identified on chromosome 20 in their study is approximately 10Mb from the boundary of our region of interest on chromosome 20.Therefore, our GWAS provides some support of an association between a QTL in this region, and variance in AGD, although given the distance of 10 Mb it is possible that we have identified an entirely different QTL in the present study.One candidate gene is located within this region, named dab2.The human analog of the dab2 gene (DAB2) has been previously established as a tumor suppressant (Zhang et al., 2014).Morris et al. (2002) investigated the prenatal function of this gene using a knockout mouse model and reported that embryos that lacked the gene failed to progress past a very early development stage.In particular, the authors determined that the dab2 gene was critical for normal development of the visceral endoderm, an extraembryonic cell layer that is involved with early embryo development.It is well established that AGD is permanently influenced during prenatal development, and so it is plausible that a gene associated with normal early embryo development and general growth via regulation of cell proliferation could be associated with AGD.That said, further work is required to validate the association between dab2 and AGD in dairy cattle.The second genomic region that we identified in our GWAS of AGD1 is located on chromosome 13.To our knowledge, this is the first time that this region has been associated with variance in AGD.We detected 1 candidate gene within this region, called MACROD2, a gene associated with a reduction in antral follicle count (AFC), and anti-Mullerian hormone (AMH) in humans (Schuh-Huerta et al., 2012).Although these associations have not been validated in cattle, both AFC, which is a count of the number of antral follicles on a female's ovaries, and AMH, which is a hormonal marker for AFC, have been established as predictors of reproductive competence in cattle (Alward and Bohlen, 2020).Given that the region of the MACROD2 gene has now been associated with variance in AGD, and the gene itself has been associated with AFC and AMH, it is plausible that this gene is responsible for some of the covariance between AGD and fertility traits we determined.The MACROD2 gene has also been associated with survival in dairy cattle (Shabalina et al., 2020), resistance to bovine tuberculosis (González-Ruiz et al., 2019), and cell growth mediation and proliferation in humans (Mohseni et al., 2014).Although this region on chromosome 13 exhibited a borderline WPPA in our GWAS analysis, the previous associations identified between MACROD2 and fertility, immunity and cell growth mean that further investigation into the MAC-ROD2 gene is warranted.
Two previously published GWAS on AGD have detected associations between regions on chromosome 26, and variance in AGD (Gobikrushanth et al., 2019;Grala et al., 2021).In particular, Grala et al. (2021) used a small population of 538 first parity Holstein-Friesian cows managed in a seasonal calving, pasturebased dairy system in New Zealand (Grala et al., 2021); our data set did not validate their result in a population of animals with similar genetic selection history and managed under a similar grazing-based dairy system.However, this different outcome may reflect the ad-mixed Holstein-Friesian and Jersey breed composition of our population, as it is possible that the QTL previously identified on chromosome 26 segregates to a lesser extent in the Jersey population, relative to Holstein-Friesians, and thus our ability to detect an effect at this locus may be reduced.We tested the effect of the 24 purebred Jersey animals on our GWAS results by excluding them and repeating the GWAS analysis.Excluding these Jersey animals did not meaningfully change our GWAS results.

AGD1 and AGD2 Share a Highly Positive Genetic Correlation
We detected a highly positive genetic correlation between AGD1 and AGD2.This is important because AGD is more valuable as a predictor trait for fertility if it can be measured relatively early in an animal's life.The phenotypic correlation between the 2 AGD traits was moderate, and this aligns well with the results presented by Rajesh et al. (2022) who reported that AGD measures taken roughly 12 mo apart had a phenotypic correlation of around 0.30.Rajesh et al. (2022) concluded that animals must be at least 6 mo old before their AGD becomes phenotypically repeatable.The genetic correlations between AGD at ages younger than 6 mo old, and AGD at breeding age was not reported in this previous study.Here, we report a high genetic correlation, and a moderate phenotypic correlation between AGD measured at 11 mo and 28 mo of age.Thus, a low phenotypic correlation between AGD measured in animals <6 mo old is not necessarily indicative of a low genetic correlation.It would be beneficial to establish the genetic correlations between fertility and AGD at various early stages of life (<6 mo) to assist with designing large-scale phenotyping strategies if AGD is included as a predictor trait for fertility in routine genetic evaluations.

Correlations Between AGD and Body Stature Traits
Both phenotypic and genetic correlations between AGD and body stature (height, length, BW) phenotypes were weakly positive, demonstrating that longer AGD is associated with increased height, length, and BW.To our knowledge, this is the first time that genetic correlations between AGD and body confirmation traits have been reported for dairy cattle.The weak positive phenotypic correlations presented here align with these previous cattle studies (Gobikrushanth et al., 2017(Gobikrushanth et al., , 2019;;Rajesh et al., 2022).Correlations reported for humans are varied, ranging from weak to moderate.For example, Sathyanarayana et al. (2010) determined a moderate correlation of 0.4 to 0.5 between AGD and birth weight and birth length in human infants, whereas Thankamony et al. (2009) reported only weak associations between AGD and weight and length in humans.Weak correlations between AGD and body confirmation traits has important implications for the use of AGD as a marker of reproductive success, as it indicates that the AGD trait is largely independent of these routinely measured phenotypes.As such, AGD represents a truly novel phenotype.

Correlation Between AGD and AGEP
In our population, AGD1 exhibited a low genetic correlation with AGEP.This genetic correlation improved slightly when AGD was measured at an older age, but still remained low to moderate.Age at puberty has previously been identified as a candidate trait to predict reproductive success (Morris et al., 2000;Mialon et al., 2001;Lefebvre et al., 2021).It is useful to know that AGD and AGEP do not share a high genetic covariance as this suggests that they could have complementary value as predictor traits for reproductive success.Anogenital distance is more straightforward to measure then AGEP, and it has the advantage of being easily measured by farmers.It is worth noting that AGEP was measured in this population using a phenotyping strategy designed to be cost effective and practical at a large scale, and as a result, measurement precision was compromised.We have previously established that the reduced precision of our AGEP phenotype is unlikely to have implication for our genetic analyses presented here (Stephen et al., 2022a,b)

CONCLUSIONS
Our results indicate that female AGD can provide value as a predictor for fertility EBVs in dairy cattle.The AGD trait can be measured much earlier in a cow's life than economically important fertility traits such as calving, breeding and pregnancy rates, and using AGD to predict fertility EBVs may increase the reliability of fertility EBVs.This increase in reliability may improve the accuracy of selection for fertility, and thus increase the rate of genetic gain.Further, the strength of the genetic correlation between AGD and fertility may depend on the timing of the AGD measure.

Figure 1 .
Figure 1.Photo demonstrating the method used to measure anogenital distance.The calipers were positioned to measure the distance from the center of the anus to the base of the clitoris, as previously defined by Gobikrushanth et al. (2017).
Figure 2. Genome-wide association study for anogenital distance measured at approximately 11 mo old, in approximately 5,000 Holstein-Friesian or Holstein-Friesian × Jersey crossbred heifers.WPPA: window posterior probability of association for each 20-marker (approximately 1 Mb) genomic window; Chr: chromosome.
Figure 3. Genome-wide association study for anogenital distance measured at approximately 29 mo old, in approximately 2,000 Holstein-Friesian, Holstein-Friesian × Jersey crossbred heifers during their first lactation.WPPA: window posterior probability of association for each 20-marker (approximately 1 Mb) genomic window.

Table 1 .
Stephen et al.: VARIANCE PARAMETERS FOR ANOGENITAL DISTANCE Genetic (above the diagonal) and phenotypic (below the diagonal) correlations between anogenital distance (AGD) measured at 2 ages (AGD1: approximately 11 mo of age; AGD2: approximately 29 mo of age) and heifer (prelactation) traits including height, length, and BW 1

Table 2 .
Genetic (above the diagonal) and phenotypic (below the diagonal) correlations between anogenital

Table 3 .
Stephen et al.: VARIANCE PARAMETERS FOR ANOGENITAL DISTANCE Genetic (above the diagonal) and phenotypic (below the diagonal) correlations between anogenital distance (AGD) measured at 2 ages (AGD1: approximately 11 mo of age; AGD2: approximately 29 mo of age) and fertility phenotypes recorded during first and second lactation 1

Table 4 .
Genomic regions of suggestive association with anogenital distance measured at 11 mo of age in approximately 5,000 Holstein-Friesian and Holstein-Friesian × Jersey crossbreed dairy cattle (annotated using Stephen et al.: VARIANCE PARAMETERS FOR ANOGENITAL DISTANCE