Advertisement
Research| Volume 106, ISSUE 3, P1925-1941, March 2023

Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region

Open AccessPublished:January 27, 2023DOI:https://doi.org/10.3168/jds.2021-21625

      ABSTRACT

      Although the quantitative trait locus (QTL) on chromosome 18 (BTA18) associated with paternal calving ease and stillbirth in Holstein Friesian cattle and its cross has been known for over 20 years, to our knowledge, the exact causal genetic sequence has yet escaped identification. The aim of this study was to re-examine the region of the published QTL on BTA18 and to investigate the possible reasons behind this elusiveness. For this purpose, we carried out a combined linkage disequilibrium and linkage analysis using genotyping data of 2,697 German Holstein Friesian (HF) animals and subsequent whole-genome sequencing (WGS) data analyses and genome assembly of HF samples. We confirmed the known QTL in the 95% confidence interval of 1.089 Mbp between 58.34 and 59.43 Mbp on BTA18. Additionally, these 4 SNPs in the near-perfect linkage disequilibrium with the QTL haplotype were identified: rs381577268 (on 57,816,137 bp, C/T), rs381878735 (on 59,574,329 bp, A/T), rs464221818 (on 59,329,176 bp, C/T), and rs472502785 (on 59,345,689 bp, T/C). Search for the causal mutation using short and long-read sequences, and methylation data of the BTA18 QTL region did not reveal any candidates though. The assembly showed problems in the region, as well as an abundance of segmental duplications within and around the region. Taking the QTL of BTA18 in Holstein cattle as an example, the data presented in this study comprehensively characterize the genomic features that could also be relevant for other such elusive QTL in various other cattle breeds and livestock species as well.

      Key words

      INTRODUCTION

      Holstein Friesian (HF) is one of the most economically important dairy cattle all over the world. Certain pathological conditions in HF such as low fertility, dystocia, or the production of weak calves, which die during the first 24 h of birth, are some of the major issues having significant economic impact on the global dairy industry. According to one estimate (
      • Bellows D.
      • Ott S.
      • Bellows R.
      Cost of reproductive diseases and conditions in cattle.
      ), all these conditions are estimated to lead to a total loss of about $53.20 (€46.74–47.81) per dairy cow per year.
      The above financial losses are caused in part by genetic effects (i.e., by QTL affecting fertility). For more than 20 yr, different research groups (Table 1) have been working on the decryption of the significant QTL on chromosome 18. This QTL is proposed to have a major effect on calving performance, stillbirth, and other fertility traits, as well as on conformation traits. Notably, it was only discovered in purebred or upgraded breeds of different HF breeding populations. All these studies had in common that the most significant mapping results were located in a region between 50 to 60 Mbp on BTA18 (
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      ;
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      ;
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ;
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ;
      • Purfield D.C.
      • Evans R.D.
      • Berry D.P.
      Breed- and trait-specific associations define the genetic architecture of calving performance traits in cattle.
      ).
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      identified an association between the SNP rs109478645 (57,137,302 bp on ARS-UCD1.2) and the following traits: body depth, maternal and paternal calving ease, maternal and paternal stillbirth, rump width, stature, and strength. The SNP was in an intron of SIGLEC5. Because this gene is highly expressed in human placenta, they proposed it as the candidate gene affecting fertility traits in cattle. Several other studies confirmed an association between the SNP (rs109478645) and fertility traits (
      • Maltecca C.
      • Gray K.A.
      • Weigel K.A.
      • Cassady J.P.
      • Ashwell M.
      A genome-wide association study of direct gestation length in US Holstein and Italian Brown populations.
      ;
      • Sahana G.
      • Guldbrandtsen B.
      • Lund M.S.
      Genome-wide association study for calving traits in Danish and Swedish Holstein cattle.
      ;
      • Purfield D.C.
      • Bradley D.G.
      • Kearney J.F.
      • Berry D.P.
      Genome-wide association study for calving traits in Holstein-Friesian dairy cattle.
      ;
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      ;
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ;
      • Cunningham F.
      • Achuthan P.
      • Akanni W.
      • Allen J.
      • Amode M.R.
      • Armean I.M.
      • Bennett R.
      • Bhai J.
      • Billis K.
      • Boddu S.
      • Cummins C.
      • Davidson C.
      • Dodiya K.J.
      • Gall A.
      • Girón C.G.
      • Gil L.
      • Grego T.
      • Haggerty L.
      • Haskell E.
      • Hourlier T.
      • Izuogu O.G.
      • Janacek S.H.
      • Juettemann T.
      • Kay M.
      • Laird M.R.
      • Lavidas I.
      • Liu Z.
      • Loveland J.E.
      • Marugan J.C.
      • Maurel T.
      • McMahon A.C.
      • Moore B.
      • Morales J.
      • Mudge J.M.
      • Nuhn M.
      • Ogeh D.
      • Parker A.
      • Parton A.
      • Paricio M.
      • Salam A.I.A.
      • Schmitt B.M.
      • Schuilenburg H.
      • Sheppard D.
      • Sparrow H.
      • Stapleton E.
      • Szuba M.
      • Taylor K.
      • Threadgold G.
      • Thormann A.
      • Vullo A.
      • Walts B.
      • Winterbottom A.
      • Zadissa A.
      • Chakiachvili M.
      • Frankish A.
      • Hunt S.E.
      • Kostadima M.
      • Langridge N.
      • Martin F.J.
      • Muffato M.
      • Perry E.
      • Ruffier M.
      • Staines D.M.
      • Trevanion S.J.
      • Aken B.L.
      • Yates A.D.
      • Zerbino D.R.
      • Flicek P.
      Ensembl 2019.
      ). Interestingly,
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      identified another SNP, rs136283363 (at position 57,089,460 bp on ARS-UCD1.2), as the variant with the largest effects on calving traits. The close locations of the 2 candidate SNPs, rs109478645 and rs136283363 on BTA18 (48 kb), resulted in a high linkage disequilibrium (LD = 0.98). In a study by
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      the putative QTL harboring the peak position of the likelihood ratio test (LRT) at 58,905,582 bp was located 1.7 Mbp apart from the position of the SNP rs109478645 published by
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      . They explained their divergent results by the fact that in their research a combined LD and linkage analysis (cLDLA) was performed instead of a GWAS. Additionally, they identified the unique haplotype (
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ) that completely explained the observed QTL effect in their mapping model and segregated with a frequency of 13.3% in the HF population. At least 2 studies (
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ;
      • Purfield D.C.
      • Evans R.D.
      • Berry D.P.
      Breed- and trait-specific associations define the genetic architecture of calving performance traits in cattle.
      ) confirmed the significant QTL on BTA18, but both studies identified the SNP, rs381577268 (57,816,137 bp in ARS-UCD1.2), in association with the calving and its related traits in HF cattle. This SNP was located downstream of zinc finger protein 613 (ZNF613). Further,
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      investigated whether differences in methylation could be associated with extreme differences, which they observed in fertility-related traits such as gestation length, sire calving ease, body depth, and cow conception rate in HF population. Interestingly, they identified a 2-kb region (∼57.800 and ∼57.802 Mbp) in the second intron of ZNF613 which showed differential methylation rates between animals of extreme phenotypes. Nevertheless, further research is needed to establish a direct link between the possible inactivation of genes by methylation and the fertility-related traits in HF cattle.
      Table 1Previous studies, which identified the significant QTL between 50 to 60 Mbp on BTA18 for fertility traits in Holstein cattle
      ReferenceYearSample sizePopulation
      • Kühn C.
      • Bennewitz J.
      • Reinsch N.
      • Xu N.
      • Thomsen H.
      • Looft C.
      • Brockmann G.A.
      • Schwerin M.
      • Weimann C.
      • Hiendleder S.
      • Erhardt G.
      • Medjugorac I.
      • Forster M.
      • Brenig B.
      • Reinhardt F.
      • Reents R.
      • Russ I.
      • Averdunk G.
      • Blumel J.
      • Kalm E.
      Quantitative trait loci mapping of functional traits in the German Holstein cattle population.
      2003872 bullsGerman Holstein
      • Thomasen J.R.
      • Guldbrandtsen B.
      • Sorensen P.
      • Thomsen B.
      • Lund M.S.
      Quantitative trait loci affecting calving traits in Danish Holstein cattle.
      20082,297 bullsDanish Holstein
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      20095,285 bullsNorth American Holstein
      • Sahana G.
      • Guldbrandtsen B.
      • Lund M.S.
      Genome-wide association study for calving traits in Danish and Swedish Holstein cattle.
      20112,062 bullsDanish and Swedish Holstein
      • Purfield D.C.
      • Bradley D.G.
      • Kearney J.F.
      • Berry D.P.
      Genome-wide association study for calving traits in Holstein-Friesian dairy cattle.
      20141,970 bullsIrish Holstein
      • Cole J.B.
      • Waurich B.
      • Wensch-Dorendorf M.
      • Bickhart D.M.
      • Swalve H.H.
      A genome-wide association study of calf birth weight in Holstein cattle using single nucleotide polymorphisms and phenotypes predicted from auxiliary traits.
      201431,984 bullsGerman Holstein
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      20166,113 bullsNordic Holstein
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      20172,527 bullsGerman Holstein
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      201927,214 bullsNorth American Holstein
      • Purfield D.C.
      • Evans R.D.
      • Berry D.P.
      Breed- and trait-specific associations define the genetic architecture of calving performance traits in cattle.
      202060,747 bullsAngus (8,304), Charolais (17,175), Limousin (16,794), Irish Holstein (18,474)
      Despite the economic and ethical impact of this QTL, none of the existing publications successfully pinpointed the causal genes or mutations (
      • Ma L.
      • Cole J.B.
      • Da Y.
      • VanRaden P.M.
      Symposium review: Genetics, genome-wide association study, and genetic improvement of dairy fertility traits.
      ). Therefore, the aim of the present study was to confirm the position of the significant QTL that
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      identified by using, improved methods (cLDLA) material, data (ARS-UCD1.2) and investigate the region of interest in a subsequent sequencing analysis. To characterize the genetic architecture of the critical region in detail, we conducted several analyses including de novo assembly on long-read sequence data as well as the identification of SNP, segmental duplications (SEGD), structural variations (SV), and DNA-methylation using both long as well as short-read whole-genome sequence data.

      MATERIALS AND METHODS

      Animals and Phenotypes

      The previous study by
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      had a sample size of 2,572 daughter-proven Holstein sires. This data set was further extended by adding 125 animals; therefore, the final data set consisted of 2,697 animals; of these, 2,525 were bulls and 172 were cows. Because all animal samples and data came from previous projects (
      • Thaller G.
      • Kramer W.
      • Winter A.
      • Kaupe B.
      • Erhardt G.
      • Fries R.
      Effects of DGAT1 variants on milk production traits in German cattle breeds.
      ;
      • Medugorac I.
      • Seichter D.
      • Graf A.
      • Russ I.
      • Blum H.
      • Gopel K.H.
      • Rothammer S.
      • Forster M.
      • Krebs S.
      Bovine polledness–an autosomal dominant trait with allelic heterogeneity.
      ;
      • Gehrke L.J.
      • Capitan A.
      • Scheper C.
      • Konig S.
      • Upadhyay M.
      • Heidrich K.
      • Russ I.
      • Seichter D.
      • Tetens J.
      • Medugorac I.
      • Thaller G.
      Are scurs in heterozygous polled (Pp) cattle a complex quantitative trait? Genetics, selection, evolution.
      ,
      • Gehrke L.J.
      • Upadhyay M.
      • Heidrich K.
      • Kunz E.
      • Klaus-Halla D.
      • Weber F.
      • Zerbe H.
      • Seichter D.
      • Graf A.
      • Krebs S.
      • Blum H.
      • Capitan A.
      • Thaller G.
      • Medugorac I.
      A de novo frameshift mutation in ZEB2 causes polledness, abnormal skull shape, small body stature and subfertility in Fleckvieh cattle.
      ), no ethical approval was required.
      Pedigree records and EBV for all animals were obtained from the official breeding value evaluation from April 2019 by Vereinigte Informationssysteme Tierhaltung (VIT, Verden, Germany). For further analyses on BTA18, we focused on the calving traits that showed significant association in
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ; these traits included paternal calving ease and paternal stillbirth. Calving ease is classified into the following 4 classes: easy, normal, heavy, and with veterinary assistance. Stillbirth is specified as an “all-or-nothing” trait and defined as calving where the calf was born dead or died within 48 h (
      • Vereinigte Informationssysteme Tierhaltung w.V.
      Trend - Fakten - Zahlen 2019.
      ).

      SNP Array Genotypes

      All 2,697 animals were genotyped as part of the projects mentioned above with the Illumina BovineSNP50 BeadChip version 1 or version 2. Additionally, 256 of the 2,697 animals were also genotyped with the BovineHD BeadChip. The HD-genotyped animals were 85 bulls (mostly sires of half-sib families) and 172 cows that were genotyped for other purpose such as to investigate genotypic basis of scurs phenotypes in cattle (
      • Gehrke L.J.
      • Capitan A.
      • Scheper C.
      • Konig S.
      • Upadhyay M.
      • Heidrich K.
      • Russ I.
      • Seichter D.
      • Tetens J.
      • Medugorac I.
      • Thaller G.
      Are scurs in heterozygous polled (Pp) cattle a complex quantitative trait? Genetics, selection, evolution.
      ), but were also recorded for calving ease and stillbirth. Version 1 of the 50K-chip included 54,001 SNP, version 2 included 54,609 SNP and the HD-chip included 777,962 SNP (Illumina). Physical marker positions were determined on the latest bovine reference genome assembly ARS-UCD1.2 (
      • Rosen B.D.
      • Bickhart D.M.
      • Schnabel R.D.
      • Koren S.
      • Elsik C.G.
      • Tseng E.
      • Rowan T.N.
      • Low W.Y.
      • Zimin A.
      • Couldrey C.
      • Hall R.
      • Li W.
      • Rhie A.
      • Ghurye J.
      • McKay S.D.
      • Thibaud-Nissen F.
      • Hoffman J.
      • Murdoch B.M.
      • Snelling W.M.
      • McDaneld T.G.
      • Hammond J.A.
      • Schwartz J.C.
      • Nandolo W.
      • Hagen D.E.
      • Dreischer C.
      • Schultheiss S.J.
      • Schroeder S.G.
      • Phillippy A.M.
      • Cole J.B.
      • Van Tassell C.P.
      • Liu G.
      • Smith T.P.L.
      • Medrano J.F.
      De novo assembly of the cattle reference genome with single-molecule sequencing.
      ) using National Center for Biotechnology Information (NCBI) remap service (https://www.ncbi.nlm.nih.gov/genome/tools/remap) as well as UCSC liftover tool (https://genome.ucsc.edu/cgi-bin/hgLiftOver). The SNP fulfilling one of the following criteria were excluded from further analyses: (1) markers with a call rate below 95%, (2) unknown or ambiguous map position according to the ARS-UCD1.2 reference genome, (3) SNPs with a heterozygosity of less than 0.025, (4) frequent paternity conflicts in animals with approved paternity (i.e., markers with Mendelian error rates greater than 0.2%), and (5) markers located on another chromosome than BTA18. After these quality controls, 1,144 SNP remained in the marker set.

      Genotypes from Illumina Whole-Genome Sequences

      We also downloaded publicly available whole-genome Illumina paired end sequences from additional 21 Holstein samples (Supplemental Table S1, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ), a Hereford dam (SRR8324584) and a Brahman Zebu bull (SRR2016745). The data were downloaded from the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra/) (
      • Katz K.
      • Shutov O.
      • Lapoint R.
      • Kimelman M.
      • Brister J.R.
      • O’Sullivan C.
      The Sequence Read Archive: A decade more of explosive growth.
      ) and correspond to the following bio-projects accession numbers: PRJEB273739 (Holstein), PRJNA494431 (Hereford), and PRJNA277147 (Brahman).
      Raw sequencing data were trimmed and filtered using the default settings (–qual-threshold 20, –length-threshold 20) of sickle version 1.33 (
      • Joshi N.
      • Fass J.
      Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33).
      ). Subsequently, the filtered data were mapped against the ARS-UCD1.2 reference genome using bwa v0.7.17 (
      • Li H.
      • Durbin R.
      Fast and accurate short read alignment with Burrows-Wheeler transform.
      ). The resulting SAM files were converted into BAM files and subsequently sorted based on positions with SAMtools 1.9 (
      • Li H.
      • Handsaker B.
      • Wysoker A.
      • Fennell T.
      • Ruan J.
      • Homer N.
      • Marth G.
      • Abecasis G.
      • Durbin R.
      The sequence alignment/map format and SAMtools.
      ). Next, Picard version 2.20.7 (
      • Broad Institute
      Picard Toolkit.
      ) was used to remove the duplicated reads. The quality report and alignment statistic of each sample are listed in the Supplemental Table S1. Visualization of the alignment in the regions of interest was carried out in Integrative Genomics Viewer (IGV) 2.6.3 (
      • Thorvaldsdóttir H.
      • Robinson J.T.
      • Mesirov J.P.
      Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration.
      ); this was carried out to visualize genomic alignment and depth in the target region and to assess if there are any obvious genomic SV in the QTL region (
      • Lagler D.K.
      • Hannemann E.
      • Eck K.
      • Klawatsch J.
      • Seichter D.
      • Russ I.
      • Mendel C.
      • Lühken G.
      • Krebs S.
      • Blum H.
      • Upadhyay M.
      • Medugorac I.
      Fine-mapping and identification of candidate causal genes for tail length in the Merinolandschaf breed.
      ). To obtain the variants at each SNP position of the BovineHD BeadChip in a VCF file, a modified pipeline based on the best practices recommendations (
      • Bailey J.A.
      • Gu Z.
      • Clark R.A.
      • Reinert K.
      • Samonte R.V.
      • Schwartz S.
      • Adams M.D.
      • Myers E.W.
      • Li P.W.
      • Eichler E.E.
      Recent segmental duplications in the human genome.
      ;
      • Van der Auwera G.A.
      What is a VCF and How Should I Interpret It?.
      ) with the tools HaplotypeCaller, GenotypeGVCFs, and VariantFiltration as implemented in GATK 4.1.4.0 was used.
      When running the GenotypeGVCFs tool, the option IncludeNonVariantSites was added to the default parameters to obtain the genotypes at each position in the genome irrespective of whether a variant is present or not. In addition to the recommended settings of the VariantFiltration tool, a filter for read depth on every position was set to more than 7 reads. According to the binomial distribution that on a specific position the reference or the alternative allele was chosen, the probability of estimating the wrong genotype with 7 reads is less than 0.01 (i.e., 0.57 = 0.0078). The genotype quality corresponding to a normalized phred quality score was set to a minimum of 22. Consequently, the probability that the second most likely genotype is actually true was less than 0.01 (i.e., 10−2.2). Afterward the generated genotypes were implemented in an additional run of the following haplotyping and imputation analyses.

      Phasing of the SNP Genotypes

      As the cLDLA pipeline requires the phased data as input, Beagle 5.0 (
      • Browning S.R.
      • Browning B.L.
      Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.
      ;
      • Browning B.L.
      • Zhou Y.
      • Browning S.R.
      A one-penny imputed genome from next-generation reference panels.
      ) was applied for reconstruction of haplotypes and imputation of missing genotypes for the entire animal set. To increase haplotyping accuracy we used all available genotypes (i.e., also animals that otherwise were not included in the cLDLA).

      Combined LD and Linkage Analysis

      To confirm the results by
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      and to identify the position of the most significant QTL on the new assembly ARS-UCD1.2, cLDLA analysis was performed. This approach was based on the well-described method by
      • Meuwissen T.H.
      • Karlsen A.
      • Lien S.
      • Olsaker I.
      • Goddard M.E.
      Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping.
      and is already described in our previous studies (
      • Gehrke L.J.
      • Capitan A.
      • Scheper C.
      • Konig S.
      • Upadhyay M.
      • Heidrich K.
      • Russ I.
      • Seichter D.
      • Tetens J.
      • Medugorac I.
      • Thaller G.
      Are scurs in heterozygous polled (Pp) cattle a complex quantitative trait? Genetics, selection, evolution.
      ,
      • Gehrke L.J.
      • Upadhyay M.
      • Heidrich K.
      • Kunz E.
      • Klaus-Halla D.
      • Weber F.
      • Zerbe H.
      • Seichter D.
      • Graf A.
      • Krebs S.
      • Blum H.
      • Capitan A.
      • Thaller G.
      • Medugorac I.
      A de novo frameshift mutation in ZEB2 causes polledness, abnormal skull shape, small body stature and subfertility in Fleckvieh cattle.
      ). In brief, the following mixed linear model was used to carry out variance component analysis using ASReml:
      y = + Z1u + Z2q + e,


      where y is the vector of paternal calving ease traits (deregressed proofs), and β is the vector of fixed effects (overall mean and sex), u represents the vector of random polygenic effects, q is random additive-genetic QTL effects, e is the vector of random residual effects, and X, Z1, and Z2 are the incidence matrices for β, u, and q, respectively. The estimates of variance components and likelihood were then used in calculation of likelihood ratio test statistic (LRT). The statistic follows a χ2 distribution with one degree of freedom and were calculated as
      LRTp = −2 × [log(L0) − log(L1p)],


      where log(L0) is the logarithm of likelihood for model without random QTL effect and log(L1p) is the logarithm of likelihood for model with QTL effect at position p.
      The same mixed linear model was applied for the trait of stillbirth as well, in which y represented the vector of binary traits of stillbirth.

      Samples for Long-Read Sequencing

      Based on the availability of blood and sperm samples, and on the results of haplotype analysis, 4 HF animals were whole-genome sequenced with long-read Oxford Nanopore Technology (ONT). We sequenced 2 Holstein cows (DHFnano01 and DHFnano02, blood samples) both heterozygous for the harmful haplotype (i.e., group q/Q) and 2 Holstein bulls (DHFnano03 and DHFnano04, sperm samples) with a homozygous haplotype at the target QTL, where bull DHFnano03 was homozygous for the harmful haplotype (i.e., group Q/Q), and bull DHFnano04 from group q/q was homozygous for the common haplotype which is not associated with increased dystocia and stillbirth.
      Furthermore, we selected a Kärntner Blondvieh (KBVnano05) from the in-house database and previously published Fleckvieh Bull (FVnano06) from the study by
      • Gehrke L.J.
      • Capitan A.
      • Scheper C.
      • Konig S.
      • Upadhyay M.
      • Heidrich K.
      • Russ I.
      • Seichter D.
      • Tetens J.
      • Medugorac I.
      • Thaller G.
      Are scurs in heterozygous polled (Pp) cattle a complex quantitative trait? Genetics, selection, evolution.
      to validate the results with other breeds. Those 2 Alpine breed bulls were also sequenced with ONT.

      Generation of Oxford Nanopore Sequences and its Alignment

      To prepare the sequencing libraries with the Oxford Nanopore's LSK109 Kit, app. Three micrograms of unsheared genomic DNA were used. The DNA was first end-repaired and A-tailed for 10 min at 20°C using the Ultra II end-repair module provided by New England Biolabs. The inactivation of enzymes was conducted at 65°C for 5 min and 1× volumes of Ampure XP beads (Beckman Coulter) was used to purify the DNA. After that the end-repaired DNA was eluted from the beads for 20 min at 55°C. For ligating the DNA fragments, a sequencing adapter with attached motor protein was added with the T4 Quick ligation module (New England Biolabs) and components from the LSK109 sequencing kit. With 0.45 volumes of Ampure beads, the purification of the adapted library was performed and eluted in LSK109 EB buffer for 20 min at 37°C. Approximately 400 ng of the completed reagent was loaded onto a PromethION flow cell and sequenced on a PromethION β sequencer for 72 h (ONT, Oxford, UK). Thereafter, base calling was performed offline on the PromethION compute unit's GPU by using Guppy version 3.2.2 from Oxford Nanopore (
      • Payne A.
      • Holmes N.
      • Clarke T.
      • Munro R.
      • Debebe B.J.
      • Loose M.
      Readfish enables targeted nanopore sequencing of gigabase-sized genomes.
      ). Using Minimap2 (
      • Li H.
      Minimap2: Pairwise alignment for nucleotide sequences.
      ), we aligned all long read to the latest cattle assembly, ARS-UCD1.2. Regions of interest were then further analyzed in Ensembl (
      • Cunningham F.
      • Achuthan P.
      • Akanni W.
      • Allen J.
      • Amode M.R.
      • Armean I.M.
      • Bennett R.
      • Bhai J.
      • Billis K.
      • Boddu S.
      • Cummins C.
      • Davidson C.
      • Dodiya K.J.
      • Gall A.
      • Girón C.G.
      • Gil L.
      • Grego T.
      • Haggerty L.
      • Haskell E.
      • Hourlier T.
      • Izuogu O.G.
      • Janacek S.H.
      • Juettemann T.
      • Kay M.
      • Laird M.R.
      • Lavidas I.
      • Liu Z.
      • Loveland J.E.
      • Marugan J.C.
      • Maurel T.
      • McMahon A.C.
      • Moore B.
      • Morales J.
      • Mudge J.M.
      • Nuhn M.
      • Ogeh D.
      • Parker A.
      • Parton A.
      • Paricio M.
      • Salam A.I.A.
      • Schmitt B.M.
      • Schuilenburg H.
      • Sheppard D.
      • Sparrow H.
      • Stapleton E.
      • Szuba M.
      • Taylor K.
      • Threadgold G.
      • Thormann A.
      • Vullo A.
      • Walts B.
      • Winterbottom A.
      • Zadissa A.
      • Chakiachvili M.
      • Frankish A.
      • Hunt S.E.
      • Kostadima M.
      • Langridge N.
      • Martin F.J.
      • Muffato M.
      • Perry E.
      • Ruffier M.
      • Staines D.M.
      • Trevanion S.J.
      • Aken B.L.
      • Yates A.D.
      • Zerbino D.R.
      • Flicek P.
      Ensembl 2019.
      ). Additionally, we generated a combined FASTQ file from the heterozygous HF samples (DHFnano01 and DHFnano02) and this combined data were subjected to the same analysis as described previously for the single ONT HF samples.
      For further validation, we implemented an additional alignment analysis in this study using the Bos taurus UOA_Angus_1 assembly (
      • Low W.Y.
      • Tearle R.
      • Liu R.
      • Koren S.
      • Rhie A.
      • Bickhart D.M.
      • Rosen B.D.
      • Kronenberg Z.N.
      • Kingan S.B.
      • Tseng E.
      • Thibaud-Nissen F.
      • Martin F.J.
      • Billis K.
      • Ghurye J.
      • Hastie A.R.
      • Lee J.
      • Pang A.W.C.
      • Heaton M.P.
      • Phillippy A.M.
      • Hiendleder S.
      • Smith T.P.L.
      • Williams J.L.
      Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle.
      ). The alignment was performed on the ONT data of the 2 heterozygous HF cows (DHFNano01 and DHFnano02) and the usage of the identical tools as previously reported in the approach including the ARS-UCD1.2 assembly as reference. To compare critical regions between the different alignments, positions were converted with NCBI genome remapping service (
      • NCBI Resource Coordinators
      Database resources of the National Center for Biotechnology Information.
      ).

      De Novo Assemblies of BTA18 Using ONT Sequences

      To avoid reference-bias in ONT sequence analyses, we also constructed 7 de novo assemblies of BTA18 using long-read sequencing data of 6 animals (4 Holstein samples from each haplotype group, one Kärntner Blondvieh, one Fleckvieh sample, and pooled long reads of 2 heterozygous HF cows). From the raw sequences, adapter trimming was carried out using the tool Porechop v.0.2.4 (
      • Wick R.R.
      • Judd L.M.
      • Gorrie C.L.
      • Holt K.E.
      Completing bacterial genome assemblies with multiplex MinION sequencing.
      ). The reconstruction of contigs was performed with wtdbg2 (readbean) version 2.5 assembler (
      • Ruan J.
      • Li H.
      Fast and accurate long-read assembly with wtdbg2.
      ) and the default parameters for Oxford Nanopore reads (option –x ont). The error correction of the newly generated assembly was executed with up to 3 rounds of Racon version 1.3.3 (
      • Vaser R.
      • Sovic I.
      • Nagarajan N.
      • Sikic M.
      Fast and accurate de novo genome assembly from long uncorrected reads.
      ). In the last step the generated FASTA files of the 7 de novo assemblies were mapped to the ARS-UCD1.2 reference (
      • Rosen B.D.
      • Bickhart D.M.
      • Schnabel R.D.
      • Koren S.
      • Elsik C.G.
      • Tseng E.
      • Rowan T.N.
      • Low W.Y.
      • Zimin A.
      • Couldrey C.
      • Hall R.
      • Li W.
      • Rhie A.
      • Ghurye J.
      • McKay S.D.
      • Thibaud-Nissen F.
      • Hoffman J.
      • Murdoch B.M.
      • Snelling W.M.
      • McDaneld T.G.
      • Hammond J.A.
      • Schwartz J.C.
      • Nandolo W.
      • Hagen D.E.
      • Dreischer C.
      • Schultheiss S.J.
      • Schroeder S.G.
      • Phillippy A.M.
      • Cole J.B.
      • Van Tassell C.P.
      • Liu G.
      • Smith T.P.L.
      • Medrano J.F.
      De novo assembly of the cattle reference genome with single-molecule sequencing.
      ) and for further validation to the UOA_Angus_1 assembly (
      • Low W.Y.
      • Tearle R.
      • Liu R.
      • Koren S.
      • Rhie A.
      • Bickhart D.M.
      • Rosen B.D.
      • Kronenberg Z.N.
      • Kingan S.B.
      • Tseng E.
      • Thibaud-Nissen F.
      • Martin F.J.
      • Billis K.
      • Ghurye J.
      • Hastie A.R.
      • Lee J.
      • Pang A.W.C.
      • Heaton M.P.
      • Phillippy A.M.
      • Hiendleder S.
      • Smith T.P.L.
      • Williams J.L.
      Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle.
      ).

      Identification of SV from ONT Sequences

      Two approaches were followed to identify SV from the ONT sequences. In the first approach, NanoVar-v1.3.8 (
      • Tham C.Y.
      • Tirado-Magallanes R.
      • Goh Y.
      • Fullwood M.J.
      • Koh B.T.H.
      • Wang W.
      • Ng C.H.
      • Chng W.J.
      • Thiery A.
      • Tenen D.G.
      • Benoukraf T.
      NanoVar: Accurate characterization of patients’ genomic structural variants using low-depth nanopore sequencing.
      ) was used to identify SV from ONT sequences of HF samples that were aligned onto ARS-UCD1.2 reference assembly. In brief, NanoVar considers the read depth of split reads and hard-clipped alignments to identify SV and subsequently identifies the high-confident SV and estimate their zygosity by employing Artificial Neural Network algorithm. Although NanoVar has internal utility to perform alignment of ONT sequences using HS-BLASTN, we supplied BAM files generated after Minimap2 alignment as an input.
      In the second approach, to alleviate the reference-bias, SV were also identified based on the assembly-based approach. To this end, first, we aligned the de novo assemblies of the samples, DHFnano03 (haplotype, Q/Q) and DHFnano03 (haplotype, q/q), against the ARS-UCD1.2 cattle assembly using minimap2 with the settings (-a -cx asm5). Subsequently, SV were called from this alignment using the tool SVIM-asm (
      • Heller D.
      • Vingron M.
      SVIM-asm: Structural variant detection from haploid and diploid genome assemblies.
      ).

      Identification of Putative Segmental Duplications in the Bos taurus Genome

      We learned that previous studies have identified an abundance of SEGD in the region of BTA18 which harbored QTL of our interest. However, all such studies (
      • Liu G.E.
      • Ventura M.
      • Cellamare A.
      • Chen L.
      • Cheng Z.
      • Zhu B.
      • Li C.
      • Song J.
      • Eichler E.E.
      Analysis of recent segmental duplications in the bovine genome.
      ) were carried out on UMD 3.1 cattle assembly. Therefore, we identified putative SDs in ARS-UCD1.2 cattle assembly using SDDetector (
      • Dallery J.-F.
      • Lapalu N.
      • Zampounis A.
      • Pigné S.
      • Luyten I.
      • Amselem J.
      • Wittenberg A.H.J.
      • Zhou S.
      • de Queiroz M.V.
      • Robin G.P.
      • Auger A.
      • Hainaut M.
      • Henrissat B.
      • Kim K.-T.
      • Lee Y.-H.
      • Lespinet O.
      • Schwartz D.C.
      • Thon M.R.
      • O’Connell R.J.
      Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.
      ) and examined their distribution along BTA18. SDDetector detects SEGD based on the bioinformatics principle as described in
      • Khaja R.
      • MacDonald J.R.
      • Zhang J.
      • Scherer S.W.
      Methods for identifying and mapping recent segmental and gene duplications in eukaryotic genomes.
      . In the first step, we carried out the sequence alignment of BTA18 against itself using MegaBLAST (
      • Zhang Z.
      • Schwartz S.
      • Wagner L.
      • Miller W.
      A greedy algorithm for aligning DNA sequences.
      ). Later, the XML output as generated by MegaBLAST was provided as input to SDDetector pipeline (-g 3000 -l 5000 -t 1000) to identify paralogous sequences with similarity greater than 90% across the alignment length which should be greater than 5,000 nucleotides. Subsequently, we filtered the putative SEGD that overlapped the repeats annotated by RepeatMasker by at least 10% of its length; for this purpose, a bed file of the RepeatMasker annotation was downloaded from UCSC table browser. A visual representation of the putative SEGD was afterward conducted with Circos v0.52 (
      • Krzywinski M.
      • Schein J.
      • Birol I.
      • Connors J.
      • Gascoyne R.
      • Horsman D.
      • Jones S.J.
      • Marra M.A.
      Circos: An information aesthetic for comparative genomics.
      ).

      Identification of SNPs, Indels, and SV from Illumina WGS

      The SNP, indels, and SV were identified from the 21 Illumina-sequenced HF samples (Supplemental Table S1) to identify the causative candidate variants. The SNP and indels were identified between the region of 40 Mbp and 65 Mbp on BTA18 using HaplotypeCaller as implemented in GATK-4.1.8.1. The SVs were identified using Lumpy v0.3.1 (
      • Layer R.M.
      • Chiang C.
      • Quinlan A.R.
      • Hall I.M.
      LUMPY: A probabilistic framework for structural variant discovery.
      ), BreakDancer v1.4.5 (
      • Chen K.
      • Wallis J.W.
      • McLellan M.D.
      • Larson D.E.
      • Kalicki J.M.
      • Pohl C.S.
      • McGrath S.D.
      • Wendl M.C.
      • Zhang Q.
      • Locke D.P.
      • Shi X.
      • Fulton R.S.
      • Ley T.J.
      • Wilson R.K.
      • Ding L.
      • Mardis E.R.
      BreakDancer: An algorithm for high-resolution mapping of genomic structural variation.
      ), and Manta v1.6.0 (
      • Chen X.
      • Schulz-Trieglaff O.
      • Shaw R.
      • Barnes B.
      • Schlesinger F.
      • Källberg M.
      • Cox A.J.
      • Kruglyak S.
      • Saunders C.T.
      Manta: Rapid detection of structural variants and indels for germline and cancer sequencing applications.
      ). For Lumpy we used the recommended pipeline smooth (https://github.com/brentp/smoove). All the tools used for SV detections were run using the default settings.

      Validation of Candidate Variants Using PCR

      Additionally, target SNP genotyping was performed by PCR-RFLP and PCR-based KASP technology on 428 animals from 8 breeds (Table 2). For 7 non-HF breeds, we selected the most unrelated animals whose DNA was available from previous projects. For the HF breed, we created 2 samples, the first consisting of random, preferably unrelated animals (group N/N, Table 2) and the second consisting of HF animals used for mapping, belonging to 3 haplotype groups (Q/Q, q/Q, and q/q, Table 2) and having DNA available. Not all 428 were successfully genotyped for all 4 SNPs. For PCR-RFLP genotyping of rs381577268 the DNA was first amplified by PCR using the program with 33 cycles. The reaction mixture of 15.0 µL was made up with 3.0 µL 5× buffer, 1.5 µL dNTP 10 mM, 0.4 µL each of 10 µm forward and reverse primer, 1 µL DNA (15 ng/µL), and 0.07 µL GoTaqG2 DNA Polymerase (Promega). In the next step to identify the genotypes from each sample the amplified DNA fragments were cut with the site-specific restriction enzyme NlaIII. For the preparation of the reaction mixture, we used 0.2 µL of the enzyme NlaIII, 2.0 µL DNA (PCR product), 0.2 µL Cut Smart Buffer, and H2O for a total reaction volume of 20 µL. The reaction mixture was afterward incubated for 3 h at 37°C. In the final step we sorted the DNA segments by size and visualized them by GelRed-stained agarose gel electrophoresis. Only DNA samples containing the reference allele were cut into 2 fragments with a length of 151 and 61 bp, whereas fragments containing the alternative allele had a length of 221 bp.
      Table 2Allele counts of potential candidate SNP in 8 breeds as well as in 4 groups of Holsteins
      Genotyped breeds are Belgian Blue (BB), Braunvieh (BV), Charolais (CH), Fleckvieh (FV), Gelbvieh (GV), Jersey (JY), Limousin (LM), and Holstein Friesian (HF). Holstein Friesian samples are divided in 4 groups: (1) HF q/q group represented animals with different haplotypes, which had nothing in common with the putative harmful haplotype associated with dystocia and stillbirth, (2) HF q/Q group harbored samples of individuals with one copy of the putative harmful haplotype, (3) HF Q/Q group included samples of individuals homozygous for the putative harmful haplotype at the target QTL, and (4) HF N/N group represented animals that are not used for mapping directly but are ancestors or relatives useful for derivation of haplotypes and imputation. Note that not all animals are successfully genotyped for all 4 candidate SNPs. Therefore, the genotyped animal set is largely overlapping but not identical for each SNP.
      rs-ID; position (bp); geneGenotype
      Reference allele is the first allele of the first genotype.
      BBBVCHFVGVJYLMHFHF q/qHF q/QHF Q/QHF N/N
      rs381577268; 57,816,137; ZNF613CC242392222171316614125
      CT1
      Only one non-Holstein animal contained one copy of alternative allele at all 4 candidate SNPs. According to the pedigree records, this Fleckvieh bull contained 1.7% Holstein genome.
      988612
      TT33312
      rs381878735; 59,574,329; ZNF665, ZNF677-likeAA232392122161314113110
      AT1
      Only one non-Holstein animal contained one copy of alternative allele at all 4 candidate SNPs. According to the pedigree records, this Fleckvieh bull contained 1.7% Holstein genome.
      86833
      TT3434
      rs464221818; 59,329,176; ZNF665-likeCC242392222161214613511
      CT1
      Only one non-Holstein animal contained one copy of alternative allele at all 4 candidate SNPs. According to the pedigree records, this Fleckvieh bull contained 1.7% Holstein genome.
      86833
      TT3434
      rs472502785; 59,345,689; ZNF665-likeTT242392022161314213111
      TC1
      Only one non-Holstein animal contained one copy of alternative allele at all 4 candidate SNPs. According to the pedigree records, this Fleckvieh bull contained 1.7% Holstein genome.
      86833
      CC3434
      1 Genotyped breeds are Belgian Blue (BB), Braunvieh (BV), Charolais (CH), Fleckvieh (FV), Gelbvieh (GV), Jersey (JY), Limousin (LM), and Holstein Friesian (HF). Holstein Friesian samples are divided in 4 groups: (1) HF q/q group represented animals with different haplotypes, which had nothing in common with the putative harmful haplotype associated with dystocia and stillbirth, (2) HF q/Q group harbored samples of individuals with one copy of the putative harmful haplotype, (3) HF Q/Q group included samples of individuals homozygous for the putative harmful haplotype at the target QTL, and (4) HF N/N group represented animals that are not used for mapping directly but are ancestors or relatives useful for derivation of haplotypes and imputation. Note that not all animals are successfully genotyped for all 4 candidate SNPs. Therefore, the genotyped animal set is largely overlapping but not identical for each SNP.
      2 Reference allele is the first allele of the first genotype.
      3 Only one non-Holstein animal contained one copy of alternative allele at all 4 candidate SNPs. According to the pedigree records, this Fleckvieh bull contained 1.7% Holstein genome.
      The PCR-based KASP genotyping technology (LGC, Biosearch Technologies) was also used for analyzing the candidate variants at SNPs rs464221818, rs472502785 and rs381878735. This method is based on a competitive allele-specific PCR using fluorescence signals, allowing bi-allelic evaluation of SNP and indel at a specific site in the chromosome. In the first step we prepared 5.14 µL of KASP genotyping-mix, consisting of 5.00 µL KASP 2× reaction-mix and 0.14 µL primer-mix for each specific SNP locus (KASP by design = KBD). Afterward, we put 5.00 µL of the KASP genotyping-mix with 2.00 µL DNA/Lysat and 3.00 µL H2O for a total reaction volume of 10 µL per sample. The PCR was performed with the thermocycler program touchdown. The DNA from each sample was then amplified in the hydrocycler. To detect the fluorescence signals at the specific locus of each SNP we used the Infinite200 PRO multimode plate reader (Tecan Trading AG). Finally, the genotyping analysis was conducted with the Kraken software (LGC Biosearch Technologies).

      Analysis of Whole-Genome Bisulfite Sequencing Data

      Raw whole-genome bisulfite sequencing (WGBS) data of 6 cattle, 3 with low calving ease (LCE) and 3 with high calving ease (HCE), from a previous study (
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ; NCBI accession number: GSE119263) were downloaded from NCBI SRA. Filtering of this data was carried out using the default settings in TrimGalore v 6.0.2 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/;
      • Martin M.
      Cutadapt removes adapter sequences from high-throughput sequencing reads.
      ). Subsequently, bowtie v.2.3.5 (
      • Langmead B.
      • Salzberg S.L.
      Fast gapped-read alignment with Bowtie 2.
      ) was used to map the filtered data to the cattle reference genome (ARS-UCD1.2). Next, Bismark v.0.22.3 (
      • Krueger F.
      • Andrews S.R.
      Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications.
      ) was used to extract context dependent (CpG/CHG/CHH) methylation sites. Differentially methylated regions (DMR) were detected across the BTA18 in the window tiling of 2,000 bp with step size of 2,000 bp using the R package MethylKit (
      • Akalin A.
      • Kormaksson M.
      • Li S.
      • Garrett-Bakelman F.E.
      • Figueroa M.E.
      • Melnick A.
      • Mason C.E.
      methylKit: A comprehensive R package for the analysis of genome-wide DNA methylation profiles.
      ). Subsequently, a logit regression model as implemented in CalculateDiffMeth was applied on the tiled data on the samples that were already assigned to its group (LCE vs. HCE). Later, the regions with >5% of difference in methylation and q-value of 0.05 were termed as DMR and extracted using the function GetMethylDiff. Additionally, DMR were also identified using SeqMonk (https://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) following the steps described here (https://www.bioinformatics.babraham.ac.uk/training/Methylation_Course/Differential%20methylation%20Exercises.pdf). To make the results comparable with MethylKit, we created nonoverlapping probes with 2 kb windows and 2 kb of the step size. Later, the assessment of methylation was carried out in the probes with at least 5 CpG positions and at least 1× read depth in each group (LCE vs. HCE). Subsequently, DMR were identified by applying a chi-squared test on the probes with at least 5 CpG observations. For chi-squared test, probes with multiple testing corrected P-value of less than 0.05 were considered as DMR. MethylExtract (
      • Barturen G.
      • Rueda A.
      • Oliver J.
      • Hackenberg M.
      MethylExtract: High-quality methylation maps and SNV calling from whole genome bisulfite sequencing data.
      ) was also used to identify SNP from WGBS data of each sample. It is noteworthy that although (
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ) classified the calving ease into 2 categories: low and high, in the present study, it was divided into 4 categories as described previously in this section.

      RESULTS

      Confirmation of the Major QTL on BTA18

      The cLDLA pipeline using 1,144 SNPs (from 50K SNP array) on BTA18 identified a region between 54,233,746 and 63,206,119 bp, with LRT values beyond the significance threshold of 24.185 (Figure 1). Thereby LRT value of 24.185 corresponds to Bonferroni-corrected P-value of 0.001 (i.e., chi-squared of 24.184 = 0.001/1,144). It indicates that this region harbored the QTL for the traits of paternal calving ease. Indeed, a chromosome-wide maximum peak with a LRTmax = 122.9 was observed at position 58,860,538 bp. The boundaries of the corresponding confidence interval (CI 95%), defined by the 2-LOD drop-off criterion, were set at 58,343,346 to 59,432,662 bp.
      Figure thumbnail gr1
      Figure 1Distribution of the likelihood ratio test values (LRT) of paternal calving ease on BTA18 with the maximum peak (LRTmax = 122.9) at 58,860,538 bp and the confidence interval for the potential QTL at 58,343,346 to 59,432,662 bp (dark red bar). The LRT values exceeding the significance threshold of 24.158 (blue horizontal line) are located in the light red area (54,233,746–63,206,119 bp). The x-axis represents the position on the chromosome in mega base pairs and the y-axis represents the LRT values.
      After comparing the maternal and paternal haplotypes within the 40-SNP window harboring the LRTmax in the middle marker interval, we were able to identify a common putative harmful haplotype in 608 individuals that showed increased calving difficulties and stillbirth. Fifty-six animals were homozygous for the putative harmful haplotype at the target QTL (i.e., group Q/Q) and exhibited diplotype effects as estimated by ASReml in a range between −8.149 and −6.027, whereas the group q/Q with one copy of the putative harmful haplotype consisted of 552 animals, which reached diplotype effects from −5.661 to −1.766. The remaining 2,089 individuals were members of the group q/q and, thus, characterized by the absence of the abovementioned putative harmful haplotype. The corresponding diplotype effects ranged from −1.766 to +3.520.
      Furthermore, using HD marker data of animals from group Q/Q and q/Q, the putative harmful haplotype could be narrowed down to a region from 57,922,208 to 60,057,741 bp. In the following sections this chromosomal segment that was characterized by an identical haplotype of animals with the lowest diplotype effects and maximum LRT values was defined as “causal haplotype.”

      Illumina and Nanopore Sequences of Different Cattle Breeds

      We sequenced 4 HF animals with ONT and on average produced 55.46 giga base pairs (Gbp) sequencing data per animal with an average reads N50 of 18,604 bp. A mean of 88.63% of the generated long reads could be mapped successfully to the ARS-UCD1.2 assembly with an average read coverage of 17.49. A detailed overview from each HF sample and the corresponding quality parameters of the previously sequenced Kärntner Blondvieh and Fleckvieh are listed for comparison in Table 3.
      Table 3Quality parameters of the samples sequenced with Oxford Nanopore Technology
      Sample nameBreedSexGbpReads N50 (bp)Reads coverageNumber of mapped readsMapped reads (%)
      DHFnano01HolsteinFemale51.9527,42816.527,629,09490.06
      DHFnano02HolsteinFemale67.6827,67821.237,463,21986.59
      DHFnano03HolsteinMale48.739,37515.3114,826,50988.11
      DHFnano04HolsteinMale53.509,93616.9115,624,50389.76
      KBVnano05Kärntner BlondviehMale71.6111,81617.7826,582,40185.46
      FVnano06FleckviehMale53.6412,10012.8311,660,44591.41
      Focusing on the region surrounding the QTL peak at 58,860,538 bp, both the 2 homozygous (Q/Q and q/q) and the 2 heterozygous (q/Q) HF sequences showed a chromosomal segment of about 16,000 bp (58,846,130–58,861,709 bp) where no read could be mapped successfully to the reference genome and therefore, it appeared as a widespread gap. In other words, cLDLA placed the LRTmax exactly within a chromosomal segment without any sequence reads. Also, with the pooled sample of 2 heterozygous HF cows, it was not possible to detect any long read within this chromosomal segment. In the further text below, the mentioned region without any mapped reads is referred to as 16-kb gap. As the QTL of interest was only identified in Holstein breeds, we investigated whether this 16-kb gap is limited to the Holstein samples. To this end, we examined the same region in long-read sequences of non-Holstein samples (Table 3). The examination revealed that this 16-kb gap was also presented in all the non-Holstein samples investigated in this study (Figure 2, Supplemental Figures S1–S3, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ). To test the possibility that some long reads had continuous sequence without a 16-kb gap, we performed linked complementary alignment analysis with IGV. Only 3 samples (DHFnano01, DHFnano02, and KBVnano05) show linked complementary alignment (i.e., more than one instance of the alignment for the same read) with low frequency (∼11% of reads per sample). The 2 HF samples carrying the opposite homozygous haplotype (Q/Q-DHFnano03 and q/q-DHFnano04) as well as FVnano06 do not show linked complementary alignment.
      Figure thumbnail gr2
      Figure 2Integrative Genomics Viewer (IGV) screenshots of long- and short-read sequences mapped to the ARS-UCD1.2 assembly. The screenshots illustrate the 16-kb gap and its neighboring regions on chromosome 18. On the upper panel are the Oxford Nanopore Technology (ONT) sequences of the heterozygous Holstein cow DHFnano01 (A) and of the Kärntner Blondvieh Bull (B), and the lower panel shows the Illumina sequences of a Holstein bull (ERR2694948) (C) and a Hereford dam (SRR8324584) (D). A gap of 16,000 bp (58,846,130–58,861,709 bp), where no read could be detected, is clearly visible in all 4 panels. Furthermore, the 16-kb gap harbored the position of the LRTmax at 58,860,538 bp, whereas the CI 95% of the QTL (58,343,346 to 59,432,662 bp) extended beyond it.

      Validation-Alignment Using the UOA_Angus_1 Assembly as Reference

      We performed an additional alignment of both heterozygous HF cows DHFnano01 and DHFnano02 onto the improved taurine assembly, UOA_Angus_1. The results revealed that 89.54% (DHFnano01) and 86.78% (DHFnano02) of long reads mapped with a coverage of 17.34 and 22.36. We converted the positions of the 16-kb gap in the target assembly using the NCBI genome remapping service and noticed a switched orientation of the chromosome. It should be noted that although the search for the 16-kb region (18:58,846,130–58,861,709 in ARS-UCD1.2) in UOA_Angus_1 itself did not reveal the corresponding region, the start (58,846,129) and end position (58,861,708) of the gap corresponded to 6,496,595 and 6,733,227 bp including the position of the LRTmax at 6,697,993 bp. The CI95% of the significant QTL was located between 6,143,282 to 7,212,507 bp. This implies that the sequences in the region of the 16-kb gap in ARS-UCD1.2 might have defragmented around 236,632 bp in the Angus assembly and seemed to be covered by long reads. Although no comparable gap could be observed in the region of the 16-kb gap, we detected close to this location (2,517 bp apart; Supplemental Figure S4, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ) a chromosomal segment of 501 bp (6,716,809–6,717,310 bp), which was affected in the same way as the 16-kb gap (Figure 2, Supplemental Figures S1–S3) and showed no mapped reads.
      By analyzing the marker positions in the Angus reference, we identified 84 HD markers with ambiguous SNP-positions. Whereas the same 84 markers could be clearly located to one position in the ARS-UCD1.2 assembly. In both assemblies, 76 of the affected 84 markers were assigned to 7 blocks based on their closed positions to each other. However, those blocks occurred twice only in the Angus genome. Moreover, 3 of these 7 blocks were located within the CI 95% of the QTL (6,146,282–7,212,507 bp) and reached an average length of 24 kb. Such ambiguous marker positions indicated that the affected SNPs and the region between them have been duplicated (National Center for Biotechnology Information, 2020). In view of these results, the QTL or the causal haplotype could not be clearly localized in the Angus genome. Consequently, further analyses with the UOA_Angus_1 assembly to detect candidate genes or causal mutations were not performed.

      Marker Distribution of BovineHD BeadChip on ARS-UCD1.2 Assembly

      As the SNPs in regions with mapping difficulties are hard to characterize, they are mostly excluded from a commercial SNP array (
      • Bianco L.
      • Cestaro A.
      • Sargent D.J.
      • Banchi E.
      • Derdak S.
      • Di Guardo M.
      • Salvi S.
      • Jansen J.
      • Viola R.
      • Gut I.
      • Laurens F.
      • Chagne D.
      • Velasco R.
      • van de Weg E.
      • Troggio M.
      Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).
      ). Therefore, we investigated the marker density of the BovineHD BeadChip on BTA18 and confirmed significantly (P < 1 × 10−4) greater average distance (mean ∼11.85 kb, SD: ∼12.23 kb) between the 2 consecutive SNPs in the region between 58.4 and 59.9 Mbp than the remaining chromosome (mean ∼3.38 kb, SD: ∼4.90 kb). In fact, no SNP could be located in or close to the 16-kb gap, which was only bounded by 2 SNPs that were ∼40.2 kb apart from each other.

      Analyses of the Oxford Nanopore Sequences Using De Novo Assembly Techniques

      To avoid reference-bias we de novo assembled the ONT data of 6 Holstein and non-Holstein samples and one pool using wtdbg2 (readbean) V2.5. The generated contigs were then mapped in the first run to the ARS-UCD1.2 and in a second one to the UOA_Angus_1 assembly. On average we accomplished a mean N50 contiguity of 6,064,162 bp and a mean total sequence length of 2.62 Gbp (Table 4). However, despite a high mean N50 contiguity (>6 Mbp), no contig reached a length of more than 10 kb between 58 to 60 Mbp (on ARS-UCD1.2) and similarly between 4.5 and 6.5 Mbp (on UOA_Angus_1) on BTA18 (Supplemental Figure S4). Thus, we were unable to resolve the most critical region on BTA18.
      Table 4Statistical report of the contig parameters from each Oxford Nanopore Technology (ONT) sequenced sample
      AssemblyBreedSexTotal sequence length (Gbp)Contig N50 (Mbp)Mean contig length (bp)Number of contigs
      DHFnano01HolsteinFemale2.684.56480,2385,494
      DHFnano02HolsteinFemale2.6510.28543,0984,883
      DHFnano03HolsteinMale2.551.83373,1196,836
      DHFnano04HolsteinMale2.572.43460,8985,578
      DHFnano01/DHFnano02HolsteinFemale2.6816.95434,3026,181
      KBVnano05Kärntner BlondviehMale2.645.43338,3937,806
      FVnano06FleckviehMale2.590.97254,10710,198

      Self-Alignment of the ARS-UCD1.2 Reference Genome to Identify Segmental Duplications

      Based on the survey of the literature, we learned that previous studies have identified many segmental duplications in the distal part of BTA18 in UMD3.1 cattle assembly. Therefore, we sought to investigate whether the QTL identified in our study falls around the segmental duplications. Using SDDetector pipeline, we observed that the region at the distal end of the chromosome and between 58 and 60 Mbp contains many SEGD compared with the remaining chromosome (P < 1 × 10−4; Supplemental Figure S5, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ).

      Identification of Potential Candidate SNPs or Linked SNPs

      By using the GATK HaplotypeCaller, we detected 46,998 biallelic variants (SNPs and indels) in the region between 55 and 60 Mbp on chromosome 18. However, only 155 variants showed concordant zygosity with predicted QTL-haplotypes. Further, visual inspection of these 155 variants in ONT sequences revealed only 4 SNPs showing concordant zygosity with the predicted QTL-haplotypes (Table 2). Except, rs381577268 (57,816,137 bp), which is located downstream of the zinc finger protein 613 (ZNF613) gene, all SNPs are intergenic. In the next step, we sought to validate these SNPs in the large population of HF and non-HF samples. The results (Table 2) clearly showed the presence of alternative alleles only in HF samples and one Fleckvieh bull indicating the high LD between these SNPs and QTL haplotype. This Fleckvieh bull shows HF ancestry on both the paternal (∼2.8%) and maternal (0.5%) side of the pedigree. Among the HF samples with known diplotype effects, the zygosity at these 4 loci were in concordance with that of the zygosity predicted by haplotypes effect. Specifically, based on the genotypes at these 4 loci, these HF sample were divided in these 3 groups: (1) HF q/q- homozygous for the wild type, (2) HF q/Q - the heterozygous samples with one copy of QTL, (3) HF QQ- homozygous for the QTL. Random HF samples with unknown diplotype effects (N/N group) confirmed complete LD between 4 candidate SNPs.

      Detection of SV in the Region of the Quantitative Trait Locus Using ONT and Illumina Sequence Data

      After filtering low quality SV, in the critical region of BTA18, we detected 7 SV in the sample DHFnano03 carrying Q/Q haplotype, 8 (DHFnano01) and 9 (DHFnano02) SV in the 2 individuals carrying Q/q haplotype, and 5 SV in the sample DHFnano04 carrying q/q haplotype (Supplemental Table S2, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ). Although we could not identify any SV which complied with the inheritance pattern of the QTL, there was one deletion exclusively present in the individuals carrying the harmful QTL haplotype. However, this deletion was present homozygous in all the 3 individuals carrying the harmful QTL haplotype (i.e., also in the 2 heterozygous cows, which showed only one copy of the putative harmful haplotype). Therefore, it is unlikely to be the candidate variant.
      In a second approach, we analyzed the Illumina sequences of the 21 Holstein bulls downloaded from NCBI (Supplemental Table S1). For this, we used the tools Lumpy, BreakDancer and Manta. Although 2 animals (ERR2694951 and SRR4449830) were identified as carrier (carrying one copy of the causal haplotype), no structural variant was detected exclusively in heterozygous form in any of these samples.
      Interestingly, using an assembly-based approach, we identified a 591 bp homozygous deletion (chrm18: 58,083,874–58,084,465) in the long-read assembly of the sample carrying QQ haplotype. This deletion (Supplemental Figure S6, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ) was present in the heterozygous form in the sample carrying Qq haplotype and was not detected in the sample carrying qq haplotype. Further, of the 21 Illumina samples, this deletion was present in the heterozygous form in 2 samples carrying Qq haplotypes and one sample carrying qq haplotype. Because this deletion was also present in qq animal, it was not investigated further.

      Methylation Analyses

      First, we checked whether the WGS samples from a previous study (
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ) assigned to the category LCE carried the same haplotypes that we identified as Q in this study. Interestingly, although
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      also detected rs381577268 as being associated with fertility traits in HF, their LCE samples did not carry the Q haplotype (https://genome.ucsc.edu/s/maulik23/HF_LCE_Samples) but can be assigned to the group q/q and thus, characterized by the absence of the putative harmful haplotype by criterion that we established here. Further, we did not observe any DMR (Supplemental Tables S3 and S4, https://doi.org/10.6084/m9.figshare.21857988;
      • Dachs N.
      • Upadhyay M.
      • Hannemann E.
      • Hauser A.
      • Krebs S.
      • Seichter D.
      • Russ I.
      • Gehrke L.J.
      • Thaller G.
      • Medugorac I.
      Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
      ) in the second intron of the ZNF613, which
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      had reported. In fact, none of the DMR encompassed any region between 57,807,061 and 57,816,044 (coordinates of ZNF613).

      DISCUSSION

      Previous studies have reported on distal region of BTA18 a QTL which is associated with many economically important traits such as ease in calving and still birth. Notably, this QTL is only identified in HF and HF-derived cattle breeds. However, despite its mapping more than a decade ago, to our knowledge, no study has yet identified exact causal mutation associated with these phenotypic traits. Therefore, we characterized this QTL in greater detail by analyzing genomic data comprehensively.

      Major QTL at Position 58,860,538 bp

      In agreement with the previous studies (Table 1), we identified in HF a major QTL for calving traits at the interval from 58,343,346 to 59,432,662 bp with the peak value located at 58,860,538 bp. Interestingly, in our previous study (
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ) the same QTL had the peak at 58,905,582 bp. The distance of 45,044 bp between the peak (58,860,538 bp) reported in that study and the peak identified in the present study (58,905,582 bp) could be explained by multiple factors such as (1) more samples in the current data set supplemented with female animals, which led to a higher number of tested haplotypes and so more accurate results; (2) the mixed linear model for the variance component analysis implemented by
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      considered top 110 principal components of UAR matrix, while our model included the entire UAR matrix and thus random polygenic effects for each animal; (3) the improved assembly, ARS-UCD1.2 in present study versus UMD 3.1 in
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ; (4) fundamentally improved haplotyping using Beagle 5 in the present study versus Beagle 3 in
      • Müller M.P.
      • Rothammer S.
      • Seichter D.
      • Russ I.
      • Hinrichs D.
      • Tetens J.
      • Thaller G.
      • Medugorac I.
      Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
      ; and (5) the animal set which consisted of substantially increased number of pairs and trios in the present study, improving the haplotype inference.
      Furthermore, we confirmed the presence of one distinct haplotype strongly associated with increased dystocia and stillbirth using a 50K SNP array (57,922,208–60,057,741 bp). The haplotype distribution indicated the population-wide LD between the detected harmful haplotype and the underlying negative QTL allele (i.e., allele Q). The relatively large harmful haplotype (2.1 Mbp) could be attributed to the chromosomal segment corresponding to the region with significantly lower marker density compared with the entire chromosome (Figure 3). In addition, an increased number of segmental duplications (Supplemental Figure S5) and significantly short contig length (length < 10 kb; Figure 4) were detected between 58 and 60 Mbp. These features are frequently observed in regions harboring complex chromosomal rearrangements (
      • Zhang F.
      • Carvalho C.M.
      • Lupski J.R.
      Complex human chromosomal and genomic rearrangements.
      ;
      • Vollger M.R.
      • Dishuck P.C.
      • Sorensen M.
      • Welch A.E.
      • Dang V.
      • Dougherty M.L.
      • Graves-Lindsay T.A.
      • Wilson R.K.
      • Chaisson M.J.P.
      • Eichler E.E.
      Long-read sequence and assembly of segmental duplications.
      ). Moreover, such rearrangements could also indicate misassembled region in the distal part of BTA18. This means that even with a significantly larger number of HD-genotyped samples, we could not increase the power of the imputation analysis and consequently there was no possibility to further narrow down the region of the causal haplotype.
      Figure thumbnail gr3
      Figure 3Plot of the Bos taurus marker density of the BovineHD BeadChip on chromosome 18 in the ARS-UCD1.2 assembly. The mean intermarker distance between adjacent markers is 3,381 bp with a standard deviation of 4,908 bp. In the figure, the intermarker distance is calculated in the window of 100 kbp. The red bar marks the outstanding region from 58.4 to 59.9 Mbp with an increased mean marker distance.
      Figure thumbnail gr4
      Figure 4Coverage of Bos taurus chromosome 18 by contigs from de novo assemblies. For comparison, the alignments were either mapped to the ARS-UCD1.2 (a) or the UOA_Angus_1 assembly (b). Additionally, we implemented in the upper panel the chromosome wide identical by state and the position of the LRTmax at 58,860,538 bp (red vertical line). The white bars marked the regions where no contig reached a length of more than 10 kb. The more vividly colored lines within an assembly represent a higher number of short contigs (but longer than 10 kb) in that section. Regardless of the used reference, the aligner was unable to reconstruct a continuous sequence between 58 and 60 Mbp on the ARS-UCD1.2 and between 4.5 and 6.5 Mbp on the UOA_Angus_1 assembly.

      Missing Reads in a Region of 16kb—A Misassembly in ARS-UCD 1.2 Reference?

      The in-depth examination of 4 HF long-read genomes in IGV revealed the identical segment of about 16 kb (58,846,130–58,861,709 bp), where no read could be mapped to ARS-UCD1.2 assembly. This chromosomal segment encompassed the major QTL peak at position 58,860,538 bp and could be located around the central position of the estimated confidence interval (58,343,346–59,432,662 bp). Further, the 16-kb gap coincided with the region characterized by a significantly lower marker density, contigs with a length of less than 10 kb and an increased number of SEGD compared with the entire chromosome. However, this 16-kb gap was observed in all HF as well in several non-HF samples, consequently, we excluded this feature as a causal deletion inducing dystocia and stillbirth; we hypothesized that this could be the result of a general assembly problem in this region. Using the database of Genomic Variants Archive (https://www.ebi.ac.uk/dgva/) from Ensembl, we identified 2 copy number variations that encompass the QTL region. These were a tandem duplication from 58,820,108 to 59,381,142 bp (
      • Boussaha M.
      • Esquerré D.
      • Barbieri J.
      • Djari A.
      • Pinton A.
      • Letaief R.
      • Salin G.
      • Escudie F.
      • Roulet A.
      • Fritz S.
      • Samson F.
      • Grohs C.
      • Bernard M.
      • Klopp C.
      • Boichard D.
      • Rocha D.
      Genome-wide study of structural variants in bovine Holstein, Montbeliarde and Normande dairy breeds.
      ) and an insertion from 58,742,045 to 59,010,707 bp (
      • Bickhart D.M.
      • Hou Y.
      • Schroeder S.G.
      • Alkan C.
      • Cardone M.F.
      • Matukumalli L.K.
      • Song J.
      • Schnabel R.D.
      • Ventura M.
      • Taylor J.F.
      • Garcia J.F.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Eichler E.E.
      • Liu G.E.
      Copy number variation of individual cattle genomes using next-generation sequencing.
      ). Moreover, a recent study (
      • Iannuzzi A.
      • Braun M.
      • Genualdo V.
      • Perucatti A.
      • Reinartz S.
      • Proios I.
      • Heppelmann M.
      • Rehage J.
      • Hulskotter K.
      • Beineke A.
      • Metzger J.
      • Distl O.
      Clinical, cytogenetic and molecular genetic characterization of a tandem fusion translocation in a male Holstein cattle with congenital hypospadias and a ventricular septal defect.
      ), examining the genetic background of the disorder of sexual development hypospadias in male ruminants, identified between the telomeres of BTA18 and the centromere of BTA27 several deletions as well as a tandem fusion translocation of DNA fragment. However, the SV as detected in the study (
      • Iannuzzi A.
      • Braun M.
      • Genualdo V.
      • Perucatti A.
      • Reinartz S.
      • Proios I.
      • Heppelmann M.
      • Rehage J.
      • Hulskotter K.
      • Beineke A.
      • Metzger J.
      • Distl O.
      Clinical, cytogenetic and molecular genetic characterization of a tandem fusion translocation in a male Holstein cattle with congenital hypospadias and a ventricular septal defect.
      ) were not located within the region of the causal haplotype identified in this study, but it encourages the hypothesis of a cluster of complex chromosomal rearrangements at the distal end of the chromosome, BTA18.

      Does the Major QTL Harbor the Complex Structural Rearrangement?

      Large and complex rearrangements also have an essentially negative effect on reconstructing genomes. Indeed, de novo assemblies of the cattle samples (HF as well as non-HF) carried out in this study, showed the stronger contig fragments around the region of the QTL compared with the remaining chromosome (Figure 4). Further, the design of SNP chips in this region also appears to be negatively affected. The lower quality of the reference sequence in the regions with a cluster of SV resulted in a significantly decreased marker density (Figure 3). However, the lower marker density is not only the result of assembly errors, but also of tools used for SNP calling and marker filtering. That implies that haplotyping as well as mapping analysis are not as accurate as in regions which are not affected not affected by complex aberrations (
      • Zhang F.
      • Carvalho C.M.
      • Lupski J.R.
      Complex human chromosomal and genomic rearrangements.
      ;
      • Pu L.
      • Lin Y.
      • Pevzner P.A.
      Detection and analysis of ancient segmental duplications in mammalian genomes.
      ). It should be noted that using the UOA_Angus_1 assembly also did not improve analysis to decrypt the significant QTL as 84 markers, of which 15 markers were located within the CI 95%, had ambiguous positions. Such ambiguous SNPs indicate that the affected markers and regions between them might be duplicated or collapsed SEGD resulting in misassembly.

      SNPs in LD with QTL and Possible Underlying Mechanism of QTL

      Although 15 markers in the QTL region showed ambiguous positions, there were still markers showing a unique position after remappping. Therefore, the genomic region around QTL was investigated to determine whether markers linked to this QTL could be identified. Indeed, our analysis and subsequent validation identified 4 candidate SNPs that are most likely associated with an increased dystocia and stillbirth rate in HF cattle (Table 2). Except one SNP (rs381577268) that falls in the downstream region, all the remaining SNPs are intergenic. Although these SNPs might itself not be causal, a possibility of causal variants being present on topologically associated domains (TAD) or other unknown elements, which are still not annotated in the bovine genome cannot be ruled out. The disruption of these elements can affect the regulatory architecture leading to pathological or abnormal phenotypes. Indeed, there is evidence (
      • Jablonski K.P.
      • Carron L.
      • Mozziconacci J.
      • Forné T.
      • Hütt M.-T.
      • Lesne A.
      Contribution of 3D genome topological domains to genetic risk of cancers: A genome-wide computational study.
      ) in humans showing that for certain diseases and cancers, TAD were predominantly rich in SNPs. Of the 4 SNPs, one (rs381577268) was also identified by
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      and
      • Purfield D.C.
      • Evans R.D.
      • Carthy T.R.
      • Berry D.P.
      Genomic regions associated with gestation length detected using whole-genome sequence data differ between dairy and beef cattle.
      as significantly associated with several fertility and body conformation traits. Further, they identified the closely located gene of the zinc finger protein 613 (ZNF613) as the potential candidate gene. The authors hypothesized that one or more mutations affecting the ZNF613 would prolong gestation, causing unborn calves to continue to grow and increase in size. Subsequently, this would significantly increase the likelihood of dystocia and stillbirths. A comparable pathomechanism was also described by
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      and
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      , which they attributed to the nearby genes such as SIGLEC-13 (57,136,157–57,142,779 bp) and CD33 (57,122,286–57,131,819 bp). Mutation-induced impaired leptin binding of gene products from SIGLEC-13 and CD33 could also result in a prolongation of the gestation and consequently heavy births. However, the SNP rs109478645 at 57,137,302 bp (
      • Cole J.B.
      • VanRaden P.M.
      • O’Connell J.R.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Schnabel R.D.
      • Taylor J.F.
      • Wiggans G.R.
      Distribution and location of genetic effects for dairy traits.
      ) and rs136283363 at 57,089,460 bp (
      • Mao X.
      • Kadri N.K.
      • Thomasen J.R.
      • De Koning D.J.
      • Sahana G.
      • Guldbrandtsen B.
      Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
      ) with a significant association to the calving complex in their studies were not identified as candidate SNPs in this study. Other associated SNPs, rs464221818 (59,239,179 bp) and rs472502785 (59,345,689 bp) from this study encompassed the ZNF665-like protein. Interestingly,
      • Zhang Q.
      • Guldbrandtsen B.
      • Thomasen J.R.
      • Lund M.S.
      • Sahana G.
      Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds.
      showed an association of these SNPs (rs464221818 and rs472502785) with longevity in Holstein individuals. Nevertheless, to our knowledge, no study to date has demonstrated a clear association of these SNPs on the calving complex in HF cattle. In addition to >260 HF animals, we genotyped 4 candidate SNPs also in a random sample of 127 to 130 non-HF animals from 7 different European cattle breeds. The allele counts presented in the Table 2 supported presence of harmful QTL allele only in HF breed. The HF cattle are frequently used in upgrading the local cattle breeds. Therefore, sporadic introgression of harmful QTL allele in upgraded breeds could be expected and should be monitored. The 4 SNPs, which are identified in population-wide LD with “causal haplotype” in the present study, could additionally be used for monitoring of harmful QTL allele in HF, as well as in HF upgraded breeds.

      Is the DMR in the Second Intron of ZNF613 Associated with QTL?

      A previous study (
      • Fang L.
      • Jiang J.
      • Li B.
      • Zhou Y.
      • Freebern E.
      • Vanraden P.M.
      • Cole J.B.
      • Liu G.E.
      • Ma L.
      Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
      ) had associated the DMR in the second intron of ZNF613 with various fertility traits in cattle. However, keeping all the settings and thresholds similar to their analysis, we could not identify the same DMR in our analysis. This could be attributed to the difference in the cattle assembly used for the analysis (UMD3.1 vs. ARS-UCD 1.2) and the number of categories in which calving trait was divided in the statistical model. Further, despite identifying the same SNP as associated with calving ease from the genomic data, their samples did not carry haplotype that we defined as “causal haplotype” in this study. Therefore, we cannot exclude that DMR is associated with dystocia QTL on BTA18. The long reads produced by ONT can be used for methylation analyses. Unfortunately, DNA samples of HF animals used in our study originate from different tissues: blood (DHFnano01 and DHFnano02) and sperm (DHFnano03 and DHFnano04). Previous study (
      • Asenius F.
      • Gorrie-Stone T.J.
      • Brew A.
      • Panchbhaya Y.
      • Williamson E.
      • Schalkwyk L.C.
      • Rakyan V.K.
      • Holland M.L.
      • Marzi S.J.
      • Williams D.J.
      The DNA methylome of human sperm is distinct from blood with little evidence for tissue-consistent obesity associations.
      ) has shown that methylation pattern in peripheral blood and sperm is significantly different (practically uncorrelated) and thus, cannot be combined in single DMR analysis.

      Abundance of Segmental Duplications in and Around the Region of QTL

      Our results suggest the presence of segmental duplications in the distal region of BTA18. Furthermore, a large number of SEGD were detected in and around the region of the QTL (Supplemental Figure S5). In fact, previous studies have already reported a high number of segmental duplications in the distal part of BTA18 in UMD3.1 assembly (between 55 and 60 Mbp;
      • Liu G.E.
      • Ventura M.
      • Cellamare A.
      • Chen L.
      • Cheng Z.
      • Zhu B.
      • Li C.
      • Song J.
      • Eichler E.E.
      Analysis of recent segmental duplications in the bovine genome.
      ;
      • Seroussi E.
      • Glick G.
      • Shirak A.
      • Yakobson E.
      • Weller J.I.
      • Ezra E.
      • Zeron Y.
      Analysis of copy loss and gain variations in Holstein cattle autosomes using BeadChip SNPs.
      ;
      • Bickhart D.M.
      • Hou Y.
      • Schroeder S.G.
      • Alkan C.
      • Cardone M.F.
      • Matukumalli L.K.
      • Song J.
      • Schnabel R.D.
      • Ventura M.
      • Taylor J.F.
      • Garcia J.F.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Eichler E.E.
      • Liu G.E.
      Copy number variation of individual cattle genomes using next-generation sequencing.
      ;
      • Boussaha M.
      • Esquerré D.
      • Barbieri J.
      • Djari A.
      • Pinton A.
      • Letaief R.
      • Salin G.
      • Escudie F.
      • Roulet A.
      • Fritz S.
      • Samson F.
      • Grohs C.
      • Bernard M.
      • Klopp C.
      • Boichard D.
      • Rocha D.
      Genome-wide study of structural variants in bovine Holstein, Montbeliarde and Normande dairy breeds.
      ;
      • Zhang Q.
      • Ma Y.
      • Wang X.
      • Zhang Y.
      • Zhao X.
      Identification of copy number variations in Qinchuan cattle using BovineHD Genotyping Beadchip array. Molecular genetics and genomics.
      ;
      • Gao Y.
      • Jiang J.
      • Yang S.
      • Hou Y.
      • Liu G.E.
      • Zhang S.
      • Zhang Q.
      • Sun D.
      CNV discovery for milk composition traits in dairy cattle using whole genome resequencing.
      ;
      • Keel B.N.
      • Keele J.W.
      • Snelling W.M.
      Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds.
      ;
      • Asenius F.
      • Gorrie-Stone T.J.
      • Brew A.
      • Panchbhaya Y.
      • Williamson E.
      • Schalkwyk L.C.
      • Rakyan V.K.
      • Holland M.L.
      • Marzi S.J.
      • Williams D.J.
      The DNA methylome of human sperm is distinct from blood with little evidence for tissue-consistent obesity associations.
      ). However, SEGD can also be the consequences of the misassembly (
      • Zimin A.V.
      • Delcher A.L.
      • Florea L.
      • Kelley D.R.
      • Schatz M.C.
      • Puiu D.
      • Hanrahan F.
      • Pertea G.
      • Van Tassell C.P.
      • Sonstegard T.S.
      • Marcais G.
      • Roberts M.
      • Subramanian P.
      • Yorke J.A.
      • Salzberg S.L.
      A whole-genome assembly of the domestic cow, Bos taurus.
      ;
      • Kelley D.R.
      • Salzberg S.L.
      Detection and correction of false segmental duplications caused by genome mis-assembly.
      ). Nevertheless,
      • Liu G.E.
      • Ventura M.
      • Cellamare A.
      • Chen L.
      • Cheng Z.
      • Zhu B.
      • Li C.
      • Song J.
      • Eichler E.E.
      Analysis of recent segmental duplications in the bovine genome.
      using fluorescent in situ hybridization confirmed at least one SEGD in the distal region of BTA18.
      The detected segmental duplications were identified from the reference assembly itself. This indicates that the chromosomal region with significantly increased number of SEGD in cattle generally might include more common mutations especially present in HF cattle influencing calving performance. However, these mutations are difficult to decrypt, because complex aberrations negatively affect sequencing and de novo assembling techniques (
      • Liu G.E.
      • Ventura M.
      • Cellamare A.
      • Chen L.
      • Cheng Z.
      • Zhu B.
      • Li C.
      • Song J.
      • Eichler E.E.
      Analysis of recent segmental duplications in the bovine genome.
      ;
      • Vollger M.R.
      • Dishuck P.C.
      • Sorensen M.
      • Welch A.E.
      • Dang V.
      • Dougherty M.L.
      • Graves-Lindsay T.A.
      • Wilson R.K.
      • Chaisson M.J.P.
      • Eichler E.E.
      Long-read sequence and assembly of segmental duplications.
      ;
      • Spealman P.
      • Burrell J.
      • Gresham D.
      Inverted duplicate DNA sequences increase translocation rates through sequencing nanopores resulting in reduced base calling accuracy.
      ). Furthermore, there is a possibility that one or more segmental duplications may themselves increase dystocia and stillbirth in HF cattle. As a result of nonallelic homologous recombination events in the region of the causal haplotype could alter the underlying genes and gene products that influence calving (
      • Stankiewicz P.
      • Lupski J.R.
      Genome architecture, rearrangements and genomic disorders.
      ;
      • Koszul R.
      • Fischer G.
      A prominent role for segmental duplications in modeling eukaryotic genomes.
      ). This means that animals carrying the “causal haplotype” could have additional SEGD in an already complex chromosomal region. This additional complexity could further increase the probability of nonallelic homologous recombination and thus increase the frequency of gametes with aberrant sequences in the QTL region. When one or 2 of these gametes eventually contribute to a zygote, dystocia, and stillbirths occur. This hypothetical pattern could explain the fact that the presence of a causative mutation only increases the frequency of offspring with dystocia and stillbirth, but does not necessarily lead to this phenotype (i.e., Q/Q animals also produce enough nonaberrant gametes and some offspring without dystocia and stillbirth). It is kind of “chance and necessity.” However, the abundance of SEGD in the CI 95% of QTL (Supplemental Figure S5) may lead to suboptimal mapping in the QTL regions and consequentially, may mask the causal and regular SEGD, SV, indels, or SNPs in HF cattle. Further, there is evidence that ONT sequences performs underwhelmingly in resolving SEGD when compared with PacBio sequencing technology perhaps because of its relatively high base error rate, therefore, future studies could use the HiFi long reads from PacBio sequencing technology to improve the detection of causal variants in this SEGD-rich region (
      • Vollger M.R.
      • Dishuck P.C.
      • Sorensen M.
      • Welch A.E.
      • Dang V.
      • Dougherty M.L.
      • Graves-Lindsay T.A.
      • Wilson R.K.
      • Chaisson M.J.P.
      • Eichler E.E.
      Long-read sequence and assembly of segmental duplications.
      ). The results and hypothetical causal mechanisms presented in this study could be helpful for improving the designs of further research which is still required to decrypt causal mutation(s) and identify the associated genes.

      CONCLUSIONS

      In this study, we confirmed the most significant QTL associated with calving traits on the distal part of BTA18 in HF animals. Additionally, comprehensive analysis of the regions around QTL revealed 4 SNPs that are in population-wide LD with underlying causal QTL alleles. It also revealed the abundance of segmental duplications in and around the critical region, which could have negatively affected sequencing and de novo assembly. Further, this observation indicates the likelihood of one or more additional copy of segmental duplication or other complex structural variation in the affected HF animals. Overall, here we comprehensively characterized the genomic features that could also be relevant for other such enigmatic QTL in various other cattle breeds and livestock species as well.

      ACKNOWLEDGMENTS

      The authors acknowledge the German Federal Ministry of Education and Research for funding the FUGATO-plus project GENOTRACK (grant no. 0315134A) from which we received genotype data. Nina Dachs and partly Lilian Johanna Gehrke were funded by a PhD Research Fellowship from the Tierzuchtforschung e.V. München (Poing, Germany). Lilian Johanna Gehrke was also cofunded by the H. Wilhelm Schaumann Stiftung (Hamburg, Germany). The authors have not stated any conflicts of interest.

      REFERENCES

        • Akalin A.
        • Kormaksson M.
        • Li S.
        • Garrett-Bakelman F.E.
        • Figueroa M.E.
        • Melnick A.
        • Mason C.E.
        methylKit: A comprehensive R package for the analysis of genome-wide DNA methylation profiles.
        Genome Biol. 2012; 13 (23034086): R87
        • Asenius F.
        • Gorrie-Stone T.J.
        • Brew A.
        • Panchbhaya Y.
        • Williamson E.
        • Schalkwyk L.C.
        • Rakyan V.K.
        • Holland M.L.
        • Marzi S.J.
        • Williams D.J.
        The DNA methylome of human sperm is distinct from blood with little evidence for tissue-consistent obesity associations.
        PLoS Genet. 2020; 16 (33048947)e1009035
        • Bailey J.A.
        • Gu Z.
        • Clark R.A.
        • Reinert K.
        • Samonte R.V.
        • Schwartz S.
        • Adams M.D.
        • Myers E.W.
        • Li P.W.
        • Eichler E.E.
        Recent segmental duplications in the human genome.
        Science. 2002; 297 (12169732): 1003-1007
        • Barturen G.
        • Rueda A.
        • Oliver J.
        • Hackenberg M.
        MethylExtract: High-quality methylation maps and SNV calling from whole genome bisulfite sequencing data.
        F1000Research. 2014; 2 (24627790): 217
        • Bellows D.
        • Ott S.
        • Bellows R.
        Cost of reproductive diseases and conditions in cattle.
        Prof. Anim. Sci. 2002; 18: 26-32
        • Bianco L.
        • Cestaro A.
        • Sargent D.J.
        • Banchi E.
        • Derdak S.
        • Di Guardo M.
        • Salvi S.
        • Jansen J.
        • Viola R.
        • Gut I.
        • Laurens F.
        • Chagne D.
        • Velasco R.
        • van de Weg E.
        • Troggio M.
        Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).
        PLoS One. 2014; 9 (25303088)e110377
        • Bickhart D.M.
        • Hou Y.
        • Schroeder S.G.
        • Alkan C.
        • Cardone M.F.
        • Matukumalli L.K.
        • Song J.
        • Schnabel R.D.
        • Ventura M.
        • Taylor J.F.
        • Garcia J.F.
        • Van Tassell C.P.
        • Sonstegard T.S.
        • Eichler E.E.
        • Liu G.E.
        Copy number variation of individual cattle genomes using next-generation sequencing.
        Genome Res. 2012; 22 (22300768): 778-790
        • Boussaha M.
        • Esquerré D.
        • Barbieri J.
        • Djari A.
        • Pinton A.
        • Letaief R.
        • Salin G.
        • Escudie F.
        • Roulet A.
        • Fritz S.
        • Samson F.
        • Grohs C.
        • Bernard M.
        • Klopp C.
        • Boichard D.
        • Rocha D.
        Genome-wide study of structural variants in bovine Holstein, Montbeliarde and Normande dairy breeds.
        PLoS One. 2015; 10 (26317361)e0135931
        • Broad Institute
        Picard Toolkit.
        Broad Institute, Massachusetts Institute of Technology, 2019
        • Browning B.L.
        • Zhou Y.
        • Browning S.R.
        A one-penny imputed genome from next-generation reference panels.
        Am. J. Hum. Genet. 2018; 103 (30100085): 338-348
        • Browning S.R.
        • Browning B.L.
        Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.
        Am. J. Hum. Genet. 2007; 81 (17924348): 1084-1097
        • Chen K.
        • Wallis J.W.
        • McLellan M.D.
        • Larson D.E.
        • Kalicki J.M.
        • Pohl C.S.
        • McGrath S.D.
        • Wendl M.C.
        • Zhang Q.
        • Locke D.P.
        • Shi X.
        • Fulton R.S.
        • Ley T.J.
        • Wilson R.K.
        • Ding L.
        • Mardis E.R.
        BreakDancer: An algorithm for high-resolution mapping of genomic structural variation.
        Nat. Methods. 2009; 6 (19668202): 677-681
        • Chen X.
        • Schulz-Trieglaff O.
        • Shaw R.
        • Barnes B.
        • Schlesinger F.
        • Källberg M.
        • Cox A.J.
        • Kruglyak S.
        • Saunders C.T.
        Manta: Rapid detection of structural variants and indels for germline and cancer sequencing applications.
        Bioinformatics. 2016; 32 (26647377): 1220-1222
        • Cole J.B.
        • VanRaden P.M.
        • O’Connell J.R.
        • Van Tassell C.P.
        • Sonstegard T.S.
        • Schnabel R.D.
        • Taylor J.F.
        • Wiggans G.R.
        Distribution and location of genetic effects for dairy traits.
        J. Dairy Sci. 2009; 92 (19448026): 2931-2946
        • Cole J.B.
        • Waurich B.
        • Wensch-Dorendorf M.
        • Bickhart D.M.
        • Swalve H.H.
        A genome-wide association study of calf birth weight in Holstein cattle using single nucleotide polymorphisms and phenotypes predicted from auxiliary traits.
        J. Dairy Sci. 2014; 97: 3156-3172
        • Cunningham F.
        • Achuthan P.
        • Akanni W.
        • Allen J.
        • Amode M.R.
        • Armean I.M.
        • Bennett R.
        • Bhai J.
        • Billis K.
        • Boddu S.
        • Cummins C.
        • Davidson C.
        • Dodiya K.J.
        • Gall A.
        • Girón C.G.
        • Gil L.
        • Grego T.
        • Haggerty L.
        • Haskell E.
        • Hourlier T.
        • Izuogu O.G.
        • Janacek S.H.
        • Juettemann T.
        • Kay M.
        • Laird M.R.
        • Lavidas I.
        • Liu Z.
        • Loveland J.E.
        • Marugan J.C.
        • Maurel T.
        • McMahon A.C.
        • Moore B.
        • Morales J.
        • Mudge J.M.
        • Nuhn M.
        • Ogeh D.
        • Parker A.
        • Parton A.
        • Paricio M.
        • Salam A.I.A.
        • Schmitt B.M.
        • Schuilenburg H.
        • Sheppard D.
        • Sparrow H.
        • Stapleton E.
        • Szuba M.
        • Taylor K.
        • Threadgold G.
        • Thormann A.
        • Vullo A.
        • Walts B.
        • Winterbottom A.
        • Zadissa A.
        • Chakiachvili M.
        • Frankish A.
        • Hunt S.E.
        • Kostadima M.
        • Langridge N.
        • Martin F.J.
        • Muffato M.
        • Perry E.
        • Ruffier M.
        • Staines D.M.
        • Trevanion S.J.
        • Aken B.L.
        • Yates A.D.
        • Zerbino D.R.
        • Flicek P.
        Ensembl 2019.
        Nucleic Acids Res. 2019; 47 (30407521): D745-D751
        • Dachs N.
        • Upadhyay M.
        • Hannemann E.
        • Hauser A.
        • Krebs S.
        • Seichter D.
        • Russ I.
        • Gehrke L.J.
        • Thaller G.
        • Medugorac I.
        Quantitative trait locus for calving traits on Bos taurus autosome 18 in Holstein cattle is embedded in a complex genomic region.
        • Dallery J.-F.
        • Lapalu N.
        • Zampounis A.
        • Pigné S.
        • Luyten I.
        • Amselem J.
        • Wittenberg A.H.J.
        • Zhou S.
        • de Queiroz M.V.
        • Robin G.P.
        • Auger A.
        • Hainaut M.
        • Henrissat B.
        • Kim K.-T.
        • Lee Y.-H.
        • Lespinet O.
        • Schwartz D.C.
        • Thon M.R.
        • O’Connell R.J.
        Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.
        BMC Genomics. 2017; 18 (28851275): 667
        • Fang L.
        • Jiang J.
        • Li B.
        • Zhou Y.
        • Freebern E.
        • Vanraden P.M.
        • Cole J.B.
        • Liu G.E.
        • Ma L.
        Genetic and epigenetic architecture of paternal origin contribute to gestation length in cattle.
        Commun. Biol. 2019; 2 (30886909): 100
        • Gao Y.
        • Jiang J.
        • Yang S.
        • Hou Y.
        • Liu G.E.
        • Zhang S.
        • Zhang Q.
        • Sun D.
        CNV discovery for milk composition traits in dairy cattle using whole genome resequencing.
        BMC Genomics. 2017; 18 (28356085): 265
        • Gehrke L.J.
        • Capitan A.
        • Scheper C.
        • Konig S.
        • Upadhyay M.
        • Heidrich K.
        • Russ I.
        • Seichter D.
        • Tetens J.
        • Medugorac I.
        • Thaller G.
        Are scurs in heterozygous polled (Pp) cattle a complex quantitative trait? Genetics, selection, evolution.
        Genet. Sel. Evol. 2020; 52 (32033534): 6
        • Gehrke L.J.
        • Upadhyay M.
        • Heidrich K.
        • Kunz E.
        • Klaus-Halla D.
        • Weber F.
        • Zerbe H.
        • Seichter D.
        • Graf A.
        • Krebs S.
        • Blum H.
        • Capitan A.
        • Thaller G.
        • Medugorac I.
        A de novo frameshift mutation in ZEB2 causes polledness, abnormal skull shape, small body stature and subfertility in Fleckvieh cattle.
        Sci. Rep. 2020; 10 (33046754)17032
        • Heller D.
        • Vingron M.
        SVIM-asm: Structural variant detection from haploid and diploid genome assemblies.
        Bioinformatics. 2020; 36 (33346817): 5519-5521
        • Iannuzzi A.
        • Braun M.
        • Genualdo V.
        • Perucatti A.
        • Reinartz S.
        • Proios I.
        • Heppelmann M.
        • Rehage J.
        • Hulskotter K.
        • Beineke A.
        • Metzger J.
        • Distl O.
        Clinical, cytogenetic and molecular genetic characterization of a tandem fusion translocation in a male Holstein cattle with congenital hypospadias and a ventricular septal defect.
        PLoS One. 2020; 15 (31923267)e0227117
        • Jablonski K.P.
        • Carron L.
        • Mozziconacci J.
        • Forné T.
        • Hütt M.-T.
        • Lesne A.
        Contribution of 3D genome topological domains to genetic risk of cancers: A genome-wide computational study.
        Hum. Genomics. 2022; 16 (35016721): 2
        • Joshi N.
        • Fass J.
        Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33).
        https://github.com/najoshi/sickle
        Date: 2011
        Date accessed: January 10, 2023
        • Katz K.
        • Shutov O.
        • Lapoint R.
        • Kimelman M.
        • Brister J.R.
        • O’Sullivan C.
        The Sequence Read Archive: A decade more of explosive growth.
        Nucleic Acids Res. 2022; 50: D387-D390
        • Keel B.N.
        • Keele J.W.
        • Snelling W.M.
        Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds.
        Anim. Genet. 2017; 48 (27775157): 141-150
        • Kelley D.R.
        • Salzberg S.L.
        Detection and correction of false segmental duplications caused by genome mis-assembly.
        Genome Biol. 2010; 11 (20219098): R28
        • Khaja R.
        • MacDonald J.R.
        • Zhang J.
        • Scherer S.W.
        Methods for identifying and mapping recent segmental and gene duplications in eukaryotic genomes.
        Methods Mol. Biol. 2006; 338 (16888347): 9-20
        • Koszul R.
        • Fischer G.
        A prominent role for segmental duplications in modeling eukaryotic genomes.
        C. R. Biol. 2009; 332 (19281956): 254-266
        • Krueger F.
        • Andrews S.R.
        Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications.
        Bioinformatics. 2011; 27 (21493656): 1571-1572
        • Krzywinski M.
        • Schein J.
        • Birol I.
        • Connors J.
        • Gascoyne R.
        • Horsman D.
        • Jones S.J.
        • Marra M.A.
        Circos: An information aesthetic for comparative genomics.
        Genome Res. 2009; 19 (19541911): 1639-1645
        • Kühn C.
        • Bennewitz J.
        • Reinsch N.
        • Xu N.
        • Thomsen H.
        • Looft C.
        • Brockmann G.A.
        • Schwerin M.
        • Weimann C.
        • Hiendleder S.
        • Erhardt G.
        • Medjugorac I.
        • Forster M.
        • Brenig B.
        • Reinhardt F.
        • Reents R.
        • Russ I.
        • Averdunk G.
        • Blumel J.
        • Kalm E.
        Quantitative trait loci mapping of functional traits in the German Holstein cattle population.
        J. Dairy Sci. 2003; 86: 360-368
        • Lagler D.K.
        • Hannemann E.
        • Eck K.
        • Klawatsch J.
        • Seichter D.
        • Russ I.
        • Mendel C.
        • Lühken G.
        • Krebs S.
        • Blum H.
        • Upadhyay M.
        • Medugorac I.
        Fine-mapping and identification of candidate causal genes for tail length in the Merinolandschaf breed.
        Commun. Biol. 2022; 5 (36068271): 918
        • Langmead B.
        • Salzberg S.L.
        Fast gapped-read alignment with Bowtie 2.
        Nat. Methods. 2012; 9 (22388286): 357-359
        • Layer R.M.
        • Chiang C.
        • Quinlan A.R.
        • Hall I.M.
        LUMPY: A probabilistic framework for structural variant discovery.
        Genome Biol. 2014; 15 (24970577): R84
        • Li H.
        Minimap2: Pairwise alignment for nucleotide sequences.
        Bioinformatics. 2018; 34 (29750242): 3094-3100
        • Li H.
        • Durbin R.
        Fast and accurate short read alignment with Burrows-Wheeler transform.
        Bioinformatics. 2009; 25 (19451168): 1754-1760
        • Li H.
        • Handsaker B.
        • Wysoker A.
        • Fennell T.
        • Ruan J.
        • Homer N.
        • Marth G.
        • Abecasis G.
        • Durbin R.
        The sequence alignment/map format and SAMtools.
        Bioinformatics. 2009; 25 (19505943): 2078-2079
        • Liu G.E.
        • Ventura M.
        • Cellamare A.
        • Chen L.
        • Cheng Z.
        • Zhu B.
        • Li C.
        • Song J.
        • Eichler E.E.
        Analysis of recent segmental duplications in the bovine genome.
        BMC Genomics. 2009; 10 (19951423): 571
        • Low W.Y.
        • Tearle R.
        • Liu R.
        • Koren S.
        • Rhie A.
        • Bickhart D.M.
        • Rosen B.D.
        • Kronenberg Z.N.
        • Kingan S.B.
        • Tseng E.
        • Thibaud-Nissen F.
        • Martin F.J.
        • Billis K.
        • Ghurye J.
        • Hastie A.R.
        • Lee J.
        • Pang A.W.C.
        • Heaton M.P.
        • Phillippy A.M.
        • Hiendleder S.
        • Smith T.P.L.
        • Williams J.L.
        Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle.
        Nat. Commun. 2020; 11 (32350247)2071
        • Ma L.
        • Cole J.B.
        • Da Y.
        • VanRaden P.M.
        Symposium review: Genetics, genome-wide association study, and genetic improvement of dairy fertility traits.
        J. Dairy Sci. 2019; 102 (30268602): 3735-3743
        • Maltecca C.
        • Gray K.A.
        • Weigel K.A.
        • Cassady J.P.
        • Ashwell M.
        A genome-wide association study of direct gestation length in US Holstein and Italian Brown populations.
        Anim. Genet. 2011; 42 (22034999): 585-591
        • Mao X.
        • Kadri N.K.
        • Thomasen J.R.
        • De Koning D.J.
        • Sahana G.
        • Guldbrandtsen B.
        Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle.
        J. Anim. Breed Genet. 2016; 133: 207-218
        • Martin M.
        Cutadapt removes adapter sequences from high-throughput sequencing reads.
        EMBnet J. 2011; 17: 3
        • Medugorac I.
        • Seichter D.
        • Graf A.
        • Russ I.
        • Blum H.
        • Gopel K.H.
        • Rothammer S.
        • Forster M.
        • Krebs S.
        Bovine polledness–an autosomal dominant trait with allelic heterogeneity.
        PLoS One. 2012; 7 (22737241)e39477
        • Meuwissen T.H.
        • Karlsen A.
        • Lien S.
        • Olsaker I.
        • Goddard M.E.
        Fine mapping of a quantitative trait locus for twinning rate using combined linkage and linkage disequilibrium mapping.
        Genetics. 2002; 161 (12019251): 373-379
        • Müller M.P.
        • Rothammer S.
        • Seichter D.
        • Russ I.
        • Hinrichs D.
        • Tetens J.
        • Thaller G.
        • Medugorac I.
        Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18.
        J. Dairy Sci. 2017; 100 (28109604): 1987-2006
        • NCBI Resource Coordinators
        Database resources of the National Center for Biotechnology Information.
        Nucleic Acids Res. 2017; 46: D8-D13
        • Payne A.
        • Holmes N.
        • Clarke T.
        • Munro R.
        • Debebe B.J.
        • Loose M.
        Readfish enables targeted nanopore sequencing of gigabase-sized genomes.
        Nat. Biotechnol. 2021; 39: 442-450
        • Pu L.
        • Lin Y.
        • Pevzner P.A.
        Detection and analysis of ancient segmental duplications in mammalian genomes.
        Genome Res. 2018; 28 (29735604): 901-909
        • Purfield D.C.
        • Bradley D.G.
        • Kearney J.F.
        • Berry D.P.
        Genome-wide association study for calving traits in Holstein-Friesian dairy cattle.
        Animal. 2014; 8: 224-235
        • Purfield D.C.
        • Evans R.D.
        • Berry D.P.
        Breed- and trait-specific associations define the genetic architecture of calving performance traits in cattle.
        J. Anim. Sci. 2020; 98 (32365208)skaa151
        • Purfield D.C.
        • Evans R.D.
        • Carthy T.R.
        • Berry D.P.
        Genomic regions associated with gestation length detected using whole-genome sequence data differ between dairy and beef cattle.
        Front. Genet. 2019; 10 (31749838)1068
        • Rosen B.D.
        • Bickhart D.M.
        • Schnabel R.D.
        • Koren S.
        • Elsik C.G.
        • Tseng E.
        • Rowan T.N.
        • Low W.Y.
        • Zimin A.
        • Couldrey C.
        • Hall R.
        • Li W.
        • Rhie A.
        • Ghurye J.
        • McKay S.D.
        • Thibaud-Nissen F.
        • Hoffman J.
        • Murdoch B.M.
        • Snelling W.M.
        • McDaneld T.G.
        • Hammond J.A.
        • Schwartz J.C.
        • Nandolo W.
        • Hagen D.E.
        • Dreischer C.
        • Schultheiss S.J.
        • Schroeder S.G.
        • Phillippy A.M.
        • Cole J.B.
        • Van Tassell C.P.
        • Liu G.
        • Smith T.P.L.
        • Medrano J.F.
        De novo assembly of the cattle reference genome with single-molecule sequencing.
        Gigascience. 2020; 9 (32191811)giaa021
        • Ruan J.
        • Li H.
        Fast and accurate long-read assembly with wtdbg2.
        Nat. Methods. 2019; 17: 155-158
        • Sahana G.
        • Guldbrandtsen B.
        • Lund M.S.
        Genome-wide association study for calving traits in Danish and Swedish Holstein cattle.
        J. Dairy Sci. 2011; 94 (21183059): 479-486
        • Seroussi E.
        • Glick G.
        • Shirak A.
        • Yakobson E.
        • Weller J.I.
        • Ezra E.
        • Zeron Y.
        Analysis of copy loss and gain variations in Holstein cattle autosomes using BeadChip SNPs.
        BMC Genomics. 2010; 11 (21114805): 673
        • Spealman P.
        • Burrell J.
        • Gresham D.
        Inverted duplicate DNA sequences increase translocation rates through sequencing nanopores resulting in reduced base calling accuracy.
        Nucleic Acids Res. 2020; 48: 4940-4945
        • Stankiewicz P.
        • Lupski J.R.
        Genome architecture, rearrangements and genomic disorders.
        Trends Genet. 2002; 18 (11818139): 74-82
        • Thaller G.
        • Kramer W.
        • Winter A.
        • Kaupe B.
        • Erhardt G.
        • Fries R.
        Effects of DGAT1 variants on milk production traits in German cattle breeds.
        J. Anim. Sci. 2003; 81 (12926772): 1911-1918
        • Tham C.Y.
        • Tirado-Magallanes R.
        • Goh Y.
        • Fullwood M.J.
        • Koh B.T.H.
        • Wang W.
        • Ng C.H.
        • Chng W.J.
        • Thiery A.
        • Tenen D.G.
        • Benoukraf T.
        NanoVar: Accurate characterization of patients’ genomic structural variants using low-depth nanopore sequencing.
        Genome Biol. 2020; 21 (32127024): 56
        • Thomasen J.R.
        • Guldbrandtsen B.
        • Sorensen P.
        • Thomsen B.
        • Lund M.S.
        Quantitative trait loci affecting calving traits in Danish Holstein cattle.
        J. Dairy Sci. 2008; 91: 2098-2105
        • Thorvaldsdóttir H.
        • Robinson J.T.
        • Mesirov J.P.
        Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration.
        Brief. Bioinform. 2013; 14 (22517427): 178-192
        • Van der Auwera G.A.
        What is a VCF and How Should I Interpret It?.
        Broad Institute, Massachusetts Institute of Technology, 2017
        • Vaser R.
        • Sovic I.
        • Nagarajan N.
        • Sikic M.
        Fast and accurate de novo genome assembly from long uncorrected reads.
        Genome Res. 2017; 27 (28100585): 737-746
        • Vereinigte Informationssysteme Tierhaltung w.V.
        Trend - Fakten - Zahlen 2019.
        Vereinigte Informationssysteme Tierhaltung w.V., Verden2019
        • Vollger M.R.
        • Dishuck P.C.
        • Sorensen M.
        • Welch A.E.
        • Dang V.
        • Dougherty M.L.
        • Graves-Lindsay T.A.
        • Wilson R.K.
        • Chaisson M.J.P.
        • Eichler E.E.
        Long-read sequence and assembly of segmental duplications.
        Nat. Methods. 2019; 16 (30559433): 88-94
        • Wick R.R.
        • Judd L.M.
        • Gorrie C.L.
        • Holt K.E.
        Completing bacterial genome assemblies with multiplex MinION sequencing.
        Microb. Genom. 2017; 3 (29177090)e000132
        • Zhang F.
        • Carvalho C.M.
        • Lupski J.R.
        Complex human chromosomal and genomic rearrangements.
        Trends Genet. 2009; 25: 298-307
        • Zhang Q.
        • Guldbrandtsen B.
        • Thomasen J.R.
        • Lund M.S.
        • Sahana G.
        Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds.
        J. Dairy Sci. 2016; 99 (27289149): 7289-7298
        • Zhang Q.
        • Ma Y.
        • Wang X.
        • Zhang Y.
        • Zhao X.
        Identification of copy number variations in Qinchuan cattle using BovineHD Genotyping Beadchip array. Molecular genetics and genomics.
        Mol. Genet. Genomics. 2015; 290 (25248638): 319-327
        • Zhang Z.
        • Schwartz S.
        • Wagner L.
        • Miller W.
        A greedy algorithm for aligning DNA sequences.
        J. Comput. Biol. 2000; 7: 203-214
        • Zimin A.V.
        • Delcher A.L.
        • Florea L.
        • Kelley D.R.
        • Schatz M.C.
        • Puiu D.
        • Hanrahan F.
        • Pertea G.
        • Van Tassell C.P.
        • Sonstegard T.S.
        • Marcais G.
        • Roberts M.
        • Subramanian P.
        • Yorke J.A.
        • Salzberg S.L.
        A whole-genome assembly of the domestic cow, Bos taurus.
        Genome Biol. 2009; 10 (19393038): R42