The landscape of long non-coding RNA expression in the goat brain

The brain regulates multiple metabolic processes, such as food intake, energy expenditure, insulin secretion, hepatic glucose production and glucose/fatty acid metabolism in adipose tissue, which are fundamental for the maintenance of energy and glucose homeostasis during lactation and pregnancy. Besides, brain expression has a fundamental impact on the development of maternal behavior. Although brain functions are partly regulated by long non-coding RNAs (lncRNAs), their expression profiles have not been characterized in depth in any ruminant species. We have sequenced the transcriptome of 12 brain tissues from 3 1-mo pregnant and 4 non-pregnant goats to investigate their lncRNA expression patterns. Between 4,363 (adenohypophysis) and 4,604 (olfactory bulb) lncRNAs were expressed in brain tissues, leading to establish a set of 794 already annotated lncRNAs and 5,098 novel lncRNA candidates. The detected lncRNAs shared features with those of other mammals, and tissue-specific lncRNAs were enriched in brain development-related terms. Differential expression analyses between 1 mo-pregnant and non-pregnant goats showed that the lncRNA expression profiles of certain brain regions experience substantial changes associated with early pregnancy (238 lncRNAs are differentially expressed in the olfactory bulb), while others do not. Enrichment analysis showed that differentially expressed lncRNAs from the olfactory bulb are co-expressed with genes previously linked to behavioral changes related to pregnancy. These findings provide a first characterization of the landscape of lncRNA expression in the goat brain and provides valuable clues to understand the molecular events triggered by early pregnancy in the central nervous system.


INTRODUCTION
The mammalian brain is a complex organ composed of highly specialized anatomical structures that control multiple physiological processes, such as memory, metabolism, energy homeostasis, motor skills, visuospatial processing, body temperature and behavior, among others.While the ruminant brain shows anatomical similarities with those of other mammals, it also possesses particular features such as a deep depression of the insula, pronounced cortical gyri, a dominant position of the visual and olfactory systems, and a relatively large size of the diencephalon (Schmidt et al., 2012).Morphometric modifications of the brain reflect specialized functional requirements: for instance, the more prominent olfactory structures in goat (compared with human) are explained by the importance of olfaction in reproductive activities including mating and estrus and mother-neonate interactions (Kavoi and Jameela, 2011).The maternal brain is remarkably plastic and adult neurogenesis has emerged as an important mechanism that contributes to such neuroplasticity (Leuner and Sabihi, 2016).
Brain expression influences food intake, energy expenditure, insulin secretion, hepatic glucose production and glucose/fatty acid metabolism in adipose tissue, which have a key role in the maintenance of energy and glucose homeostasis during lactation and other physiological challenges (Roh et al., 2016).Besides, anatomical structures in the brain, or closely associated to it, are involved in the hormonal control of lactation through the release of oxytocin and prolactin (Bhimte et al., 2018).Pregnancy, lactation and motherhood can have a long-term impact on the neural and biological states of adult females, e.g., studies in sheep have shown that oxytocin and vasopressin induce a greater increase in acetylcholine and noradrenalin in multiparous ewes when compared with their nulliparous counterparts, and glutamate and GABA release was significantly increased in the olfactory bulb at parturition in multiparous ewes (Bridges, 2016).
Data about the gene expression profiles of brain regions and structures in ruminants are very scarce, not only in goats (Muriuki et al., 2019), but also in sheep (Clark et al., 2017) and cattle (Liu et al., 2022).Moreover, such data are mostly restricted to protein-coding genes (PCGs) while the repertoire of brain long noncoding RNAs (lncRNAs), i.e., transcripts longer than 200 nucleotides that lack any protein-coding capability, remains to be fully characterized in any dairy species.Although long non-coding RNAs have relevant biological roles in the brain physiology of humans (Srinivas et al., 2023) and mouse (Goff et al., 2015), their functional impact in the brain of livestock species is mostly unknown.Indeed, very few studies have characterized the lncRNA catalog of brain tissues in cattle, pigs or chicken (Kosinska-Selbi et al., 2020).
Long non-coding RNAs can regulate gene expression at the transcriptional and post-transcriptional level by: (1) altering the chromatin environment through the establishment of interactions with DNA; (2) suppressing gene expression by interfering with the transcription machinery; (3) pairing with other RNAs to recruit protein complexes; (4) exerting cis-effects on the expression of neighbor genes (which can be mediated by the lncRNA transcript itself or be an effect of its transcription); and (5) sponging miRNAs (Statello et al., 2021).
Long non-coding RNAs are particularly enriched in the brain, and they have key roles in neuroplasticity, cognition, and differentiation of neural stem cells (Fernandes et al., 2018).For instance, a study of lncRNA expression in the adult subventricular zone-olfactory bulb (SVZ-OB) neurogenesis showed that distal-less homeobox 1 antisense (Dlx1as) lncRNA was required selectively for the SVZ neuronal lineage, and that its knockdown caused a decrease in Dlx1/2 expression suggesting a role in neuronal differentiation through the regulation of its homeobox gene neighbors (Ramos et al., 2013).Pinky (Pnky) is another lncRNA likely regulating the production of neurons from ventricularsubventricular zone neural stem cells in the developing mouse brain (Ramos et al., 2015).Although several lncRNAs are specifically expressed or enriched in the neurogenic regions of the brain, their exact function in post-natal and adult neurogenesis remains unknown (Fernandes et al., 2018).In addition to neurogenesis, lncRNAs have been related to a broad array of neuro-logical processes.In a study of postpartum depression in mice, upregulation of Gm14205 lncRNA resulted in a decrease in oxytocin receptors and activation of the NLRP3 inflammasome (Zhu and Tang, 2020).Moreover, the expression of MALAT-1 lncRNA is upregulated by the neurotransmitter oxytocin during lactation (Richard et al., 2018), while the expression of multiple lncRNAs is correlated to that of genes related to lactation metabolism (including oxytocin) in the mammary gland of sows (Shi et al., 2022) and cattle (Marceau et al., 2023).In another work, it was found that genes targeted by lncRNAs differentially expressed in cows exposed to heat stress vs normal temperature were enriched in the prolactin signaling pathway (Zeng et al., 2023).In most cases, the exact mechanism by which lncRNAs may interact with oxytocin and prolactin remains unknown.
The main goal of the current work was to establish the full catalog of lncRNAs expressed in the goat brain by sequencing the transcriptome of 12 distinct regions in 7 Murciano-Granadina goats.By taking advantage that 3 of these goats where 1-mo pregnant, we have also investigated whether early pregnancy has any impact on the lncRNA expression of the 12 brain tissues under study.

Ethics statement
Goats were slaughtered as a result of a routine culling procedure managed by the Service of Farms and Experimental Fields of the Universitat Autònoma de Barcelona (UAB) completely unrelated to this project.The slaughtering protocol was approved by the Ethics Committee on Animal and Human Experimentation of the Universitat Autònoma de Barcelona (project approval code: CEEAH: 4790).To minimize pain, goats were administered pentobarbital (150 mg/kg) in the jugular vein.A lethal intravenous dose of sodium pentobarbital causes a goat to lose consciousness within seconds and results in clinical death within just minutes.

Data sets
Seven nonlactating Murciano-Granadina goats (6.28 ± 1.38 years) raised at the Farm and Experimental Fields service of the Universitat Autònoma de Barcelona were slaughtered for reasons unrelated with this project.To minimize pain, goats were administered pentobarbital (150 mg/kg) in the jugular vein.Three of these goats were 1-mo pregnant at the time of slaughtering.To have a representative collection of brain tissues, samples were obtained from the 3 main parts of the goat brain: the cerebrum (frontal neocortex, olfactory bulb, hypothalamus, hippocampus and rostral colliculus), the cerebellum (hemisphere and trunk), and the brainstem (pons and medulla oblongata); plus 3 glandular tissues that have a key role in reproduction: the pineal gland, which synthesizes melatonin, an hormone affecting follicular growth, oocyte maturation, ovulation, and luteal function (Olcese, 2020); the neurohypophysis, which triggers parturition and milk ejection by secreting oxytocin (Christ-Crain and Ball, 2022); and the adenohypophysis which is involved in producing follicle-stimulating hormone, luteinizing hormone and prolactin (Rennels and Herbert, 1979).Importantly, several of the sampled tissues are deeply involved in the modulation of behavior.For instance, the brainstem is implicated in the regulation of emotional behavior (Venkatraman et al., 2017), while cerebellum has been traditionally assumed to regulate motor coordination but there is increasing evidence of its influence on influencing reward and social behaviors (Carta et al., 2019).The 4 cerebrum tissues are also strongly connected to behavioral traits.For instance, the olfactory bulb (Lévy et al., 2004), the prefrontal cortex (Glat et al., 2022) and the hypothalamus (Orikasa, 2021) are deeply involved in the regulation of maternal behavior, while the hippocampus has a key role in the formation of relational memory (Rubin et al., 2014).An expert anatomist carefully dissected and biopsied the 12 brain tissues mentioned before and samples were immediately submerged in RNAlater (Thermofisher Scientific, Barcelona, Spain) to be stored at −80°C until processing.Once submerged in RNAlater, samples were immediately frozen in normal ice at 0°C.The samples were in ice between 1h and 5h until they were carried to a −80°C freezer.
For total RNA purification, tissue samples were mixed with 1 mL QIAzol (QIAGEN Inc., Barcelona, Spain) and homogenized using the Lysing Matrix D reagent (MP Biomedicals, Santa Ana, CA) in a Precellys 24 tissue homogenizer (Bertin Instruments, Rockville, MD).RNA was extracted from each tissue using the RNeasy lipid tissue mini kit (QIAGEN Inc., Barcelona, Spain) following the protocol described by the manufacturer.The concentration and purity of extracted RNA molecules were analyzed using the Nanodrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Barcelona, Spain) and RNA integrity was assessed with a Bioanalyzer-2100 equipment (Agilent Technologies, Santa Clara, CA) using the RNA 6000 Nano Kit 4.2 (Agilent Technologies, Santa Clara, CA).Poly-A enriched RNA-seq libraries were prepared with the KAPA Stranded mRNA-seq Illumina Platforms Kit (Roche, Sant Cugat, Spain) by using 500 ng total RNA as template.The libraries were sequenced in an HiSeq 4000 platform (Illumina, San Diego, CA) with pairedend sequencing (2 × 50 bp) at the Centre Nacional d'Anàlisi Genòmica (CNAG).

Alignment and transcriptome assembly
Adaptor sequence removal and quality filtering were performed in the raw data with Trimmomatic v0.39 (Bolger et al., 2014) with the following criteria: (1) trimming of bases if their quality in both ends drops below a Phred value of 20; (2) trimming of sequences if the quality within a sliding window of 5 nucleotides drops below a Phred value of 20; and (3) removal of reads with a length shorter than 36 nucleotides.Then, reads of rRNA origin were discarded by aligning sequences to the SILVA (release 138.1) rRNA database (Quast et al., 2013) with bbduk v38.96 (Bushnell 2022).The remaining sequences were aligned to the Ensembl 106 goat reference genome assembly ARS1 (Bickhart et al., 2017) with STAR v2.7.10a (Dobin et al., 2013).For the detection of non-annotated transcripts, the transcriptome was reconstructed with StringTie v2.2.1 (Pertea et al., 2015) and by considering the goat reference annotation (ARS1) from Ensembl.To obtain a non-redundant set of transcripts, all transcriptome assemblies were merged into a unique annotation file with the -merge option of StringTie.

lncRNA characterization
The transcripts from the new annotation file were classified based on their location relative to the goat reference annotation with GffCompare (Pertea and Pertea, 2020).Candidate lncRNAs were selected among the transcripts classified as unknown or intergenic (u), intronic (i), antisense (x), and other same strand overlap (o).Those transcripts classified as intronic by Gff-Compare (Pertea and Pertea, 2020) but located in the opposite strand of an annotated gene were reclassified as antisense.Then, the selected transcripts were further filtered by length and coding potential.Multi-exonic transcripts with a length less than 200 nucleotides and mono-exonic transcripts with a length less than 2000 nucleotides were filtered.Regarding the coding potential, Coding Potential Calculator 2 or CPC2 (Kang et al., 2017) and Coding Potential Assessment Tool or CPAT (Wang et al., 2013) were used to classify transcripts as protein coding or non-coding, while HM-MER (Eddy, 2011) was used to detect protein domains from Pfam (Mistry et al., 2021) in any of the 3 possible translation frames.Transcripts classified as non-coding by CPC2 and CPAT and without any protein domain were selected for further analysis.Due to the expected low sequence conservation, lncRNAs are usually classified according to their position relative to that of the closest gene.The following criteria were used to classify lncRNAs: (1) intronic, for transcripts fully contained in an annotated intron; (2) antisense, for transcripts overlapping an annotated gene in the opposite strand; (3) overlapping, for transcripts partially overlapping an annotated gene in the same strand; (4) intergenic, for transcripts without any proximal annotated gene in a 5 kb window in both ends; (5) sense upstream, located in a 5 kb window upstream of an annotated gene in the same strand; (6) sense downstream, located in a 5 kb window downstream of an annotated gene in the same strand; (7) divergent, transcript start within a 5 kb window of the 5′ end of an annotated gene and in opposite strand; and (8) convergent, transcript end within a 5 kb window of the 3′ end of an annotated gene and in the opposite strand.
Furthermore, detected transcripts were named following the guidelines from the HUGO Gene Nomenclature Committee (Seal et al., 2020).The proposed nomenclature also refers to the genomic context provided by the annotated genes.Thus, lncRNA genes were named with the following criteria: (1) intergenic transcripts were assigned the root symbol "LINC" followed by a unique 5-digit number; (2) antisense transcripts were assigned the symbol format [gene]-AS#; (3) divergent transcripts were assigned the symbol format [gene]-DT#; (4) intronic transcripts were assigned the symbol format [gene]-IT#; and (5) overlapping transcripts were assigned the symbol format [gene]-OT#, where # refers to a sequential number in all instances.If a lncRNA gene had multiple transcripts, a _# tag was added at the end for the transcript name.lncRNA genes with more than 10 isoforms were filtered out from the analysis.

lncRNA quantification
Once an annotation file with the annotated Ensembl genes and candidate lncRNAs was obtained, quantification at the gene level was performed with the pseudoaligment tool Kallisto v0.44.0 (Bray et al., 2016) and the gene-level estimations from tximport v1.24.0 R package (Soneson et al., 2016).It has been shown that differential isoform usage can lead to an inflated false positive rate in differential expression analyses at the gene level (Soneson et al., 2016).However, this problem can be solved by adding an offset calculated from transcript-level abundance estimates that corrects for changes to the average transcript length across samples (Soneson et al., 2016).In the case of lncRNA quantification, pseudoalignment methods have been shown to outperform simple gene-level quantification tools such as HTSeq (Putri et al., 2022) and featureCounts (Liao et al., 2014), which often underestimate lncRNA expression (Zheng et al., 2019).In our pipeline, Kallisto (Bray et al., 2016) was used with default parameters to perform a pseudoalignment to the transcriptome, while tximport was employed to calculate gene-level estimated counts and gene-level offsets.For downstream analyses, only genes with a count per million (CPM) value greater than 0.5 in at least 4 samples in a tissue were considered as expressed.A schematic representation of the full lncRNA characterization pipeline is shown in Supplementary Figure S1.

Tissue specificity
Tissue-specific expression of novel lncRNAs and annotated lncRNAs (those classified as lincRNA in the Ensembl annotation) was assessed through the tissuespecificity index or τ (Yanai et al., 2005).The τ index was calculated in R on log 2 transformed mean CPM values, and it ranges from 0 (for housekeeping genes) to 1 (for strictly tissue-specific genes).In our study, lncRNAs with a τ value less than 0.15 were defined as housekeeping, while those lncRNAs with values greater than 0.85 were classified as having tissue-specific expression.Correlations among novel lncRNAs and the closest annotated PCG in a 100 kb window were calculated to infer potential cis-regulation of mRNAs by lncRNAs.Statistical significances of these correlations were obtained by performing an R test for the association between paired samples (cor.test)and, subsequently, a correction for multiple testing with the false discovery rate (FDR) method (Benjamini and Hochberg, 1995).Those lncRNA-mRNA gene pairs with an absolute Pearson correlation (between their expression levels) higher than 0.6 and an adjusted P-value less than 0.05 were selected.Protein-coding genes with a τ value greater than 0.85 and significantly correlated, in terms of expression, with a given lncRNA were tested for enrichment of GO terms in each tissue with the gprofiler2 R package (Peterson et al., 2020).The set of all expressed PCGs was selected as background and a Benjamini-Hochberg FDR of 0.05 was used for the selection of significant pathways.

Differential expression analysis
For exploratory purposes, a principal component analysis (PCA) and a hierarchical clustering analysis were performed with the prcomp and hclust functions from the stats R package (R Core Team, 2022).Euclidean distances and the Ward D agglomeration method were used to carry out the hierarchical clustering.The DESeq2 R package (Love et al., 2014) was selected for the differential expression analysis.Correction for multiple testing was applied using the FDR method (Benjamini and Hochberg, 1995).Significant changes in expression levels between pregnant (numerator) vs. non-pregnant (denominator) goats were defined as those with an adjusted P-value threshold below 0.05 and a fold change greater than 1.5 or lower than 0.667.Each tissue was analyzed independently.

Gene co-expression analysis
A weighted gene co-expression network analysis was performed with the WGCNA R package (Langfelder and Horvath, 2008).Before the analysis, all genes with a tissue-specific expression profile (PCGs and lncRNAs included) were filtered out, retaining only those that were expressed consistently in at least 6 different tissues.First, a co-expression similarity matrix was constructed from normalized expression data using the biweight midcorrelation, which is more robust against outliers.Next, the adjacency matrix was calculated raising the similarity matrix to a power β.A value of 18 was selected for β, which was the minimum value required to get a scale-free topology network (R 2 > 0.8).Modules, or clusters of interconnected genes, were defined by hierarchical clustering over the dissimilarity of the Topological Overlap Matrix (TOM).A minimum module size of 30 genes was required and modules with similar expression profiles were merged after selecting a height cut-off threshold of 0.15.
Once the modules were defined, strong correlations between the tissue of origin or the pregnancy status and the module eigengenes were searched.Module eigengenes are defined as the first principal component of each module and they were used as the weighted average of the gene expression profile in the module.Pearson correlations were calculated, and their associated P-values were adjusted for multiple testing with the qvalue function.Modules with an absolute correlation higher than 0.6 and a q-value less than 0.05 were selected, and further tested for enrichment of GO terms and KEGG pathways using the gprofiler2 R package (Peterson et al., 2020).The set of all expressed PCGs was selected as background and a q-value of 0.05 was used for the selection of significant pathways.Under the assumption that genes that cluster together do so because they have a coordinated expression and, in consequence, they are involved in similar biological processes, lncRNA function was inferred based on the guilt-by-association principle.The enriched GO terms from the biological process category were visualized with Cytoscape (Shannon et al., 2003) and the Enrich-mentMap plugins (Merico et al., 2010).The Enrich-mentMap Cytoscape App generates a network in which nodes represent enriched terms that are connected if they share genes.Terms with common genes often represent similar or related biological processes and are grouped together as sub-networks.

Statistics for RNA-seq data
The 84 RNA preparations from 12 brain tissues retrieved from 7 goats displayed an average RNA integrity number (RIN) of 7.53 ± 0.62 (with a minimum and maximum value of 6.2 and 9, respectively).Such RIN values are within the expected range for a conservative cut-off (RIN = 6.4-7.9)defined by Gallego Romero et al. (2014).Brain samples were sequenced with an average depth range of 37.79 (adenohypophysis) to 42.12 (cerebellar trunk) million reads.As shown in Table 1, between 90.13% (rostral colliculus) and 92.04% (neurohypophysis) of the sequences remained after adapter removal and quality filtering with Trimmomatic.From the sequences retained after filtering, the percentage of rRNA sequences detected by bbduk per tissue ranged from 3.02% (pineal gland) to 7.53% (neurohypophysis).After obtaining the gene expression count matrix, it was confirmed that there were very few sequences that could be assigned to rRNA genes.The reads that passed the previous filtering steps were aligned to the reference genome.The read alignment rate for uniquely aligned reads ranged from 77.57% (pons) to 89.55% (pineal gland), while for multimapping reads (no more than 10 different loci) ranged from 9.53% (pineal gland) to 21.67% (pons).For a more detailed description of the alignment outputs for each sample, please see Table S1.

lncRNA characterization
Once transcripts classified as unknown intergenic, intronic, antisense and other same strand overlap were selected from the GffCompare output, they were filtered by length.There were 18,745 transcripts meeting our length criteria.Then, transcripts were further filtered by their coding potential and protein domain detection, reducing the list to 11,368 novel lncRNA transcripts.Candidate lncRNA genes were evenly distributed across chromosomes, with some hotspots on chromosomes 18, 19 and 25 (Figure 1A).Subsequently, lncRNAs were classified and named in accordance with their position to the closest annotated gene, while lncRNA genes with more than 10 isoforms were filtered out.By doing so, we obtained 10,213 transcripts from 6,347 lncRNA genes.Long non-coding RNA genes classified as intergenic (35.47%) were the most common ones, followed Varela-Martínez et al.: lncRNAs in the goat brain by antisense (26.33%) and divergent (13.66%) lncRNA genes (Figure 1B and 1C).Novel lncRNAs and Ensembl annotated PCGs showed a similar distribution of length when considering long transcripts (>2000 nucleotides), while lncRNAs showed a greater proportion of short transcripts than PCGs (Figure 1D).Goat brain lncRNAs were composed of fewer exons when compared with PCGs, being the majority of them biexonic (Figure 1E).Protein coding genes had a higher expression than lncRNAs in all tissues, while novel lncRNAs were more expressed than long intergenic non-coding RNA (lincRNA) transcripts already annotated in Ensembl (Figure S2A-L).

Expression profile and tissue specificity
Only genes with a CPM value greater than 0.5 in at least 4 samples were retained in each tissue for further analysis.The number of expressed lncRNAs (novel and annotated) in each tissue ranged from 4,363 in adenohypophysis to 4,604 in the olfactory bulb, making possible to establish a catalog of 5,892 distinct lncRNAs that were expressed in at least one tissue (Table S2).From these 5,892 lncRNAs, 5,098 were novel lncRNAs while 794 had been already annotated.A PCA based on the lncRNA expression data demonstrated that most samples clustered by tissue of origin.There were 5 brain tissues with a very divergent expression pattern when compared with the remaining ones, namely adenohypophysis, neurohypophysis, pineal gland, cerebellar hemisphere and cerebellar trunk (Figure 2A).The removal of these samples from the PCA showed that the remaining samples clustered by tissue of origin rather than pregnancy status (Figure 2B).A hierarchi-cal clustering analysis also supported such data structure (Figure S3).Thus, the tissue expression profile of lncRNAs was closely aligned to that observed by Luigi Sierra et al. (2023) for PCGs.
The distribution of the tissue specificity τ index was skewed to the left, and such bias was more pronounced for PCGs (Figure 3A).Moreover, 578 (9.8%) lncRNAs had τ values below 0.15, which was indicative of a likely housekeeping role in the goat brain.In addition, 1,042 (17.68%) lncRNAs had values greater than 0.85 (Figure 3B), of which 428 (7.26%) showed a strict tissue specific pattern of expression (Figure 3C).With 163 lncRNAs displaying τ values of 1, the pineal gland was the organ with the highest level of tissuespecific expression.In contrast, only 13 and 16 lncRNA genes with tissue-specific expression were detected in the rostral colliculus and hippocampus, respectively.The expression of 1186 lncRNAs was significantly correlated with those of their respective closest genes in 100 kb windows (Table S3).Tissue-specific lncRNAs (τ > 0.85) were characterized by performing an enrichment analysis using the significantly correlated genes as input.As shown in Supplementary Tables S4A-J, the expression of several of the tissue specific lncRNAs was significantly correlated to the expression of PCGs related to brain development, with terms such as forebrain development in adenohypophysis (q-value = 0.0069) and neurohypophysis (q-value = 0.0058), pituitary gland development in adenohypophysis (q-value = 0.016) and neurohypophysis (q-value = 0.015), planar cell polarity pathway involved in neural tube closure in cerebellar hemisphere (q-value = 0.032) and cerebellar trunk (q-value = 0.028), telencephalon regionalization in hippocampus (q-value = 0.0049), endocrine system  development in medulla oblongata (q-value = 0.0016), olfactory bulb development in olfactory bulb (q-value = 0.048), glossopharyngeal nerve development in pons (q-value = 0.029) and hindbrain development in rostral colliculus (q-value = 0.034).In addition, other pathways in adenohypophysis and neurohypophysis were enriched in a similar fashion in both tissues, e.g., neuron fate specification (q-value = 1.84E-05 and 1.64E-05 in adenohypophysis and neurohypophysis, respec-tively), positive regulation of neuroblast proliferation (q-value = 0.0064 and 0.0059 in adenohypophysis and neurohypophysis, respectively), forebrain cell migration (q-value = 0.0069 and 0.0059 in adenohypophysis and neurohypophysis, respectively) and regulation of neuron apoptotic process (q-value = 0.00987 and 0.0087 in adenohypophysis and neurohypophysis, respectively).
A weighted gene co-expression network was constructed to investigate the coordinated expression of PCGs and lncRNAs as well as to predict lncRNA putative functions by guilt-by-association. Genes with similar expression patterns were clustered in 59 modules with a size range from 54 to 1272 genes (Figure 4A and Figure S4).The eigengene of each module, which defines the principal component of the genes in the module, was calculated, and correlations between the tissue of origin or the pregnancy status and the module eigengenes were calculated.There was no module correlated to the pregnancy status, while 6 modules were significantly correlated to specific tissues (Figure 4B).For instance, the floralwhite and coral2 modules were correlated to both adenohypophysis (floralwhite, ρ = 0.66 and coral2, ρ = −0.64)and neurohypophysis (floralwhite, ρ = 0.64 and coral2, ρ = −0.61),while the darkseagreen3 and palevioletred3 modules were correlated to both cerebellar hemisphere (darkseagreen3, ρ = −0.61 and palevioletred3, ρ = 0.67) and trunk (darkseagreen3, ρ = −0.59 and palevioletred3, ρ = 0.62).Besides, the honeydew (ρ = 0.69) and skyblue4 (ρ = 0.9) modules were correlated with pineal gland.Most of the modules were composed of both PCGs and lncRNAs, although in differing proportions (Figure 4C and Figure S5).
When integrating the differential expression and coexpression analyses, most of the modules contained, at least, one lncRNA differentially expressed in at least one tissue (Figure S6).S6).These modules were tested for gene enrichment and, by doing so, several biological pathways became evident (Figure 5 and Table S7).The darkolivegreen4 and daksseagreen4 modules shared multiple terms related to synaptic signaling and neuron cell morphogenesis, while the darkseagreen4 module had terms specific to axon morphogenesis such as axonogenesis (q-value = 1.55E-02) and axon development (q-value = 4.45E-02).In contrast, the darkolivegreen4 module was enriched in genes related with dendrite development (q-value = 2.38E-02) and regulation of dendrite development (qvalue = 1.82E-02).Moreover, terms related to actin filaments such as regulation of actin filament length (q-value = 2.91E-03), actin polymerization or depolymerization (q-value = 2.91E-03) and Arp2/3 complex-mediated actin nucleation (q-value = 1.12E-02) were enriched in the set of genes included in the darksea-green4 module.Regarding the brown4 module, it was mainly enriched in terms from the cellular component and molecular function GO categories related to mitochondria and the respiratory chain complex.In addition to GO terms, enrichment for KEGG pathways was tested (Table S7).The dopaminergic synapse (q-value = 4.41E-02), calcium signaling pathway (q-value = 4.41E-02) and phosphatidylinositol signaling pathway (q-value = 4.54E-02) were enriched in the set of genes included in the darkolivegreen4 module.In contrast, the brown4 module showed enrichment in pathways such as retrograde endocannabinoid signaling (q-value = 1.93E-03), chemical carcinogenesis-reactive oxygen species (q-value = 1.93E-03) and aldosterone-regulated sodium reabsorption (q-value = 2.14E-02), in addition to multiple pathways related to neurological conditions, e.g., Huntington disease, amyotrophic lateral sclerosis, prion disease, Parkinson disease and Alzheimer disease.

The landscape of long non-coding RNAs expressed in the goat brain
The mammalian brain is a complex organ formed by multiple highly specialized and interacting structures.Transcriptomic studies may allow to understand the functional roles of these structures and their involvement in multiple physiological processes.In the present study, RNA sequencing was performed in 12 different regions of the adult goat brain to characterize their lncRNA expression profiles.This work represents a significant improvement of the current catalog of annotated goat lncRNAs, since 10,213 novel transcripts originating from 6,347 lncRNA genes have been identified.In humans, 14,100 lncRNA genes across 49 tissues have been annotated so far and high numbers of these lncRNAs are preferentially expressed in the testis, brain, blood, and skin tissues (de Goede et al., 2021).The novel goat lncRNAs shared similar features with previously reported mammal lncRNAs, i.e., exon number and expression levels were lower than those of PCGs.Moreover, most of lncRNAs were classified as intergenic or antisense (Chen et al., 2019;Zheng et al., 2019;Bridges et al., 2021).It must be pointed out that the classification of lncRNAs is highly dependent on the reference annotation status of the organism under study.Thus, the overrepresentation of the intergenic class in detriment of other classification categories is expected.Interestingly, the expression levels of novel lncRNAs were higher than those of reference lncRNAs retrieved from Ensembl.
Previous reports have evidenced a higher tissue specificity of lncRNAs when compared with PCGs (Ward et al., 2015;Marttila et al., 2020), and our results are consistent with such findings.Indeed, PCGs showed a more pronounced left skewness of the τ index, indicative of a higher proportion of genes with housekeeping roles.According to Luigi Sierra et al. (2023), in the goat brain 5,655 (33.16%)PCGs showed a ubiquitous pattern of expression (τ < 0.15) and 1,363 (7.99%) PCGs displayed a tissue-preferential expression (τ > 0.85), while in our study only 578 (9.8%) and 1,042 (17.68%) lncRNAs had a ubiquitous or tissue-specific pattern of expression, respectively.The tissue specific expression of lncRNAs might be due to the fact that they exert highly specific functions, to spurious transcription, or to effects of thresholding on detecting lowly expressed genes (de Goede et al., 2021).It has been shown that lncRNAs play crucial roles in the developing mammalian brain (Fernandes et al., 2018).Indeed, several of the tissue specific lncRNAs were correlated with region-specific brain development-related genes.
The exact function of lncRNAs in goats and other livestock species are mostly unknown.Even more, the profiles of expression of lncRNAs in the brain of cattle or pigs have been just characterized in a few brain tissues.However, it is worth to mention that in a recent study of the lncRNA transcriptome of the mammary gland of Yak (Wu et al., 2020), the LIM homeobox 3 (LHX3) gene was pointed out as a potential cis-target of a lncRNA in close proximity (<100 Kb).The LHX3 transcription factor is critical for pituitary gland formation and specification of the anterior pituitary hormonesecreting cell types (Sloop et al., 2001) and variations in this gene have been significantly associated with milk performance in goats (Liu et al., 2011).In our investigation, we found that the intergenic LINC03304 lncRNA was located in close proximity (<100 Kb) to LHX3 and, moreover, it was specifically expressed in the adenohypophysis and the neurohypophysis (τ = 0.91) with a significant high Pearson correlation to LHX3 expression (ρ = 0.97 and q-value = 6.96E-48).Cis-regulatory lncRNAs may display an enhancer-like activity promoting expression of neighboring genes.

Regional differences in the expression profiles of long non-coding RNAs
We have observed that tissue of origin is the main factor determining lncRNA expression patterns in the goat brain, as previously reported for PCGs by Luigi Sierra et al. (2023).There were 3 clusters of samples that showed highly differentiated lncRNA expression profiles: one composed by adenohypophysis and neurohypophysis, another one by cerebellum (cerebellarhemisphere and cerebellar-trunk), and a third one by pineal-gland.The 3 clusters were composed of the tis- sues with the highest number of tissue-specific lncRNAs (τ > 0.85).The pattern shown by cerebellum samples is consistent with previous reports, in which the expression profile of the cerebellum happens to be strongly different from that of other brain regions (Mahfouz et al., 2015).The cerebellum is the most neuronal dense region of the human brain (it contains around 80% of all the neurons in the brain) but it only represents 10% of the brain volume (Azevedo et al., 2009).In goats, the cerebellum constitutes 10.11% of the brain volume (Aydoğdu and Koçyiğit, 2023).While the estimated fraction of neurons in cerebellum may vary between mammals (from 60% in the mouse to 70% in the rat, guinea pig and macaque), the vast majority of neurons are located into the cerebellum across a range of mammals (Herculano-Houzel, 2010).This particularity might have contributed to the highly differentiated transcriptomic profile of the goat cerebellum observed by us.
In addition to the cerebellum, there were 3 glandular tissues (pineal gland and adeno/neurohypophysis) displaying a divergent lncRNA expression pattern.As a central component of the endocrine system, the pineal gland controls day-night rhythms and exerts an important role, especially in seasonally breeding animals, in the neuroendocrine regulation of reproductive physiology (Aleandri et al., 1996).The pituitary gland acts as a master gland controlling the function of most other endocrine glands and it secretes a great variety of hormones encompassing from adrenocorticotropic hormone, prolactin, luteinizing hormone, growth hormone and thyroid stimulating hormone in the adenohypophysis or anterior lobe, to oxytocin and arginine vasopressin in the neurohypophysis or posterior lobe (Ganapathy and Tadi, 2023).In a study analyzing 723 RNA-seq data from 91 tissues from cattle, it was observed that brain endocrine system tissues (stalk median eminence, adeno/neurohypophysis and pineal gland) clearly clustered in close proximity to central nervous system tissues (Fang et al., 2020).
After removing these 3 highly differentiated clusters from the PCA for visualization purposes, we observed (see first axis of the PCA plot) that the remaining samples grouped following a craniocaudal direction (frontal-neocortex/olfactory-bulb > hypothalamus/ hippocampus/rostral-colliculus > medulla oblongata/ pons).In a study using in situ hybridization data from the Allen Brain Atlas (Mercer et al., 2008), it was shown that the majority of the 849 ncRNAs (from 1,328 that were examined) in adult mouse brain were associated with specific neuroanatomical regions.Furthermore, it has been observed, in an adult human brain transcriptome atlas, that transcriptional regulation varies enormously by anatomical location and that such varia-tion reflects the distribution of major cell classes from different brain regions (Hawrylycz et al., 2012).In a recently published mouse brain atlas, it has been observed that transcriptionally related cell types are often found in the same region or in regions with a common developmental origin (Yao et al., 2023).As a whole, the expression pattern of many lncRNAs seems to be highly regulated in the goat brain and to be strongly affected by tissue-associated factors.

Early pregnancy triggers changes in the lncRNA expression profiles of the olfactory bulb, pons, hippocampus and frontal neocortex
A differential expression analysis was performed to investigate the effect of early pregnancy on the patterns of brain lncRNA expression.The number of biological replicates for pregnant (n = 3) and non-pregnant (n = 4) goats is low.In consequence, the sensitivity to detect DE lncRNAs is quite limited and results must be interpreted cautiously.DESeq2 was chosen for the differential expression analysis due to its better FDR control, power and stability, in experimental designs with small sample sizes, when compared with other tools (Li et al., 2022).Regardless, results obtained from small collections of samples need to be interpreted with caution since the true FDR may be several times higher than the selected FDR threshold (Soneson and Delorenzi, 2013).Our main aim was to determine which brain regions are more affected by pregnancy based on the number of differentially expressed genes.The effect of limited sample size on statistical power to detect DE lncRNAs should have a similar impact on all tissues, so the number of differentially expressed genes should be a reliable indicator of which brain regions are mostly affected by pregnancy (Luigi Sierra et al., 2023).Noteworthy, there is a very good match between the results obtained by us for brain lncRNAs and those reported by Luigi-Sierra et al. (2023) regarding mRNAs because the brain tissues that show a high or moderate differential expression for lncRNAs also show a high or moderate differential expression for mRNAs.
The comparison of 1-mo pregnant (n = 3) and nonpregnant (n = 4) goats revealed that the olfactory bulb was the tissue with the most relevant expression changes, with 238 differentially expressed lncRNAs.Pons, hippocampus and frontal neocortex showed a less substantial expression change (between 35 and 22 differentially expressed lncRNAs), while the remaining tissues showed hardly any change on their expression patterns.These results support that pregnancy does not affect all regions of the goat brain to the same degree.The olfactory bulb sends information about perceived smells to be further processed in other brain regions.In ungulates, like sheep and goats, the olfactory bulb is involved in 2 social situations: 1.Expression of maternal care and establishment of a bond with the offspring within the first hours after parturition; 2. The "male effect," in which introduction of a male among seasonally anestrous females leads to ovulation and sexual receptivity (Keller and Lévy, 2012).Although the 7 goats involved in our study were not subjected to any of these 2 conditions, and thus expression changes are not triggered by odorant signals from offspring, the strong changes in expression may indicate that the olfactory bulb is also affected by early pregnancy, which would be concordant with the results obtained by Luigi Sierra et al. (2023) when analyzing PCGs.Indeed, Luigi-Sierra et al. (2023) showed that the olfactory bulb, with 1207 differentially expressed genes (381 downregulated and 826 upregulated in pregnant goats) is, among the brain organs analyzed in their study, the one that experiences the most dramatic changes in gene expression as a consequence of early pregnancy.Many studies in laboratory rodents have shown that the estrous cycle, pregnancy and parturition regulate hippocampal and olfactory neurogenesis (Lévy et al., 2017).Moreover, the production of new olfactory interneurons in mice is a maternal adaptation initiated early in pregnancy and mediated by prolactin (Shingo et al., 2003).Elevated levels of prolactin during early pregnancy have been related to high maternal gestational neurogenesis in the subventricular zone of the lateral ventricle, which results in an increased incorporation of new neurons in the olfactory bulb, and it is critical for the establishment of adaptative changes in mood and behavior during the postpartum period (Larsen and Grattan, 2012).Indeed, granular cell turnover plays a fundamental role in the olfactory learning of novel odorants molecules, thus modulating the olfactory response to offspring (Larsen and Grattan, 2012).Depending on the species, such response might be crucial to induce maternal motivation, offspring recognition, nest building, crouching, nurturing, licking-grooming, and many other behaviors aimed to maximize offspring survival.
In the study performed by Luigi- Sierra et al. (2023), it was demonstrated that many of the PCGs that showed differential expression in the olfactory bulb of pregnant vs non-pregnant goats were related to human behavioral traits such as maternal care, sociability and exploration, anxiety and depression, aggression, cognition, memory and learning.Moreover, in mice the ablation of the olfactory bulb leads to the suppression of maternal behavior (e.g., nest building) and even to cannibalism within 12 h after parturition (Gandelman et al., 1971).The olfactory bulb undergoes extensive changes caused by the olfactory cues delivered by pups during and after parturition (Lévy et al., 2004).The main implication of our study, and of that of Luigi-Sierra et al. (2023), is that in goats the olfactory bulb undergoes extensive transcriptomic changes well before parturition, and for obvious reasons such changes are not promoted by olfactory signals.Evidence of this is provided by the dramatic variation that we have observed in the profile of lncRNA (current study) and mRNA (Luigi- Sierra et al., 2023) expression in the olfactory bulb of 1-mo pregnant vs non-pregnant goats as well as by the implication of many of these differentially expressed mRNA genes in behavior (Luigi-Sierra et al., 2023).In this context, we hypothesize that lncRNAs are deeply involved in the regulation of the extensive transcriptomic changes that take place in the olfactory bulb due to pregnancy, although this remains to be demonstrated.
The set of olfactory bulb lncRNAs that are DE in pregnant and non-pregnant goats were co-expressed in 3 different modules, darkseagreen4, darkolivegreen4 and brown4.The darkseagreen4 module was enriched in terms such as axonogenesis and axon development, while the darkolivegreen4 module was enriched in terms related to dendrite development, among others.Olfactory neurogenesis has been linked to a range of social behaviors and disturbances associated with pregnancy, but the exact function performed by these newborn neurons has not been fully elucidated (Leuner and Sabihi, 2016).In addition, the dopaminergic synapse KEGG pathway was enriched in the darkolivegreen4 module.An increase of dopamine neurotransmitter has been observed in the olfactory bulb of primiparous and multiparous sheep ewes during the pre-partum period and a few hours postpartum (Keverne et al., 1993).In a recent study comparing pup-naïve late-pregnant and virgin mice, it was observed in late-pregnant mice that processing of chemosignals due to pup exposure involved co-activation of the olfactory and vomeronasal systems at the level of the olfactory bulb (Navarro-Moreno et al., 2020).
Although maternal behavior is expressed with highest intensity after delivery, because of the interaction between the newborn and its mother, behavioral changes are already initiated during pregnancy (Hillerer et al., 2014).Indeed, the developmental transition to motherhood is very dynamic and requires gene expression changes in diverse brain regions during pregnancy, parturition and postpartum period (Ray et al., 2016).Our study and that of Luigi Sierra et al. (2023) indicate that the olfactory bulb experiences dramatic changes in gene expression during early pregnancy, and such changes are not triggered by odorant signals.Taken together, these findings suggest that the olfactory bulb has a relevant role in the process of neurological adap-tation to motherhood, and that such adaptive process is initiated well before parturition.

CONCLUSIONS
In the present study, we have characterized novel ln-cRNAs from multiple brain tissues, thus improving the current catalog of non-coding transcripts in goat.The high number of unannotated lncRNA transcripts, when compared with Ensembl annotation, is consistent with the poor characterization of these molecules in goat and other ruminants.In the goat brain, cerebellum and several glandular tissues showed highly differentiated patterns of expression, while the remaining tissues clustered following a craniocaudal axis which indicates that lncRNAs expression follows a well-defined spatial pattern.Early pregnancy did not affect lncRNA expression in all brain regions to the same degree.The olfactory bulb was the tissue experiencing the most relevant expression changes in early pregnancy and such changes might be related to neurogenesis induced by pregnancy signals as well as to other neurological adaptations.The differential expression analysis has pointed out to several olfactory bulb lncRNAs that would be interesting to investigate further, i.e., mainly those co-expressed with genes that in model organisms have been previously linked to behavioral changes during pregnancy.
Varela-Martínez et al.: lncRNAs in the goat brain Varela-Martínez et al.: lncRNAs in the goat brain Varela-Martínez et al.: lncRNAs in the goat brain

Figure 1 .
Figure 1.General features of lncRNAs expressed in the goat brain.(A) Distribution of the number of lncRNAs per Mb in goat chromosomes.(B) Classification of lncRNA genes relative to their closest protein coding gene.Percentage of lncRNAs in each category is shown between parentheses.(C) Schematic representation of the lncRNA classification criteria based on the position of the lncRNA relative to that of the nearest gene: 1) Intronic; 2) Antisense; 3) Overlapping; 4) Intergenic; 5) Sense upstream; 6) Sense downstream; 7) Divergent; and 8) Convergent.(D) Transcript length distribution in novel lncRNAs and PCGs.(E) Exon number distribution in novel lncRNAs and annotated PCGs.
Figure 2. (A) Principal component analysis (PCA) based on the lncRNA expression profiles of 12 brain tissues from 7 Murciano-Granadina female goats (B) Same PCA as in 2A, but excluding adenohypophysis, neurohypophysis, pineal gland, cerebellar hemisphere and cerebellar trunk samples.
Figure 3. Tissue-specificity of lncRNAs detected in 12 brain tissues.(A) Distribution of tissue-specificity index (τ) for protein coding genes, annotated lncRNAs and novel lncRNAs.(B) Number of lncRNAs with a tissue-specific index greater than 0.85 in each brain organ.(C) Number of tissue-specific lncRNAs (τ = 1) in each brain organ.

Figure 4 .
Figure 4. Results of the weighted gene co-expression network analysis (WGCNA).(A) Gene dendrogram obtained by average linkage hierarchical clustering.The colored bars depict the module assignment before (Dynamic Tree Cut) and after (Merged Dynamic) merging modules with similar expression profiles.(B) Module trait associations.Each row and column correspond to a module eigengene (the first principal component of the expression matrix of the corresponding module) and to a trait, respectively.Each cell contains the corresponding correlation (top value) and adjusted P-values (bottom value).Only modules associated with at least one trait are shown.(C) Number of lncRNAs in the trait-associated modules.
Figure 5. : Network of enriched GO terms as determined by g:Profiler in the Darkseagreen4 (left half of the node) and Darkolivegreen4(right half of the node) modules.Node size is proportional to the number of genes in the term; edge size represents the number of genes that overlap between different terms; green and blue edges corespond to the Darkseagreen4 and Darkolivegreen4 modules, respectively; node color represents the false discovery rate (q-value) of the enriched term, each half of the node representing a module.Grey color means that the term is not enriched in the corresponding module.
Varela-Martínez et al.: lncRNAs in the goat brain