Characterizing the diagnostic sensitivity and specificity of pain biomarkers in cattle using receiver operating characteristic curves

Biomarkers are used to assess pain and analgesic drug efficacy in livestock. However, the diagnostic sensitivity and specificity of these biomarkers for different painful conditions over time often have not been described. Receiver operating characteristic (ROC) curves are graphical plots that illustrate the diagnostic ability of a test as its discrimination threshold is varied. The objective of this analysis was to use area under the curve (AUC) values derived from ROC analysis to characterize the predictive value of potential pain biomarkers at specific time points following a painful stimulus. The biomarkers included in the analysis were plasma cortisol, salivary cortisol, hair cortisol, infrared thermography (IRT), mechanical nociceptive threshold (MNT), substance P, kinematic gait analysis, and a visual analog scale for pain. A total of 7,992 biomarker outcomes collected from 7 pain studies involving pain associated with castration, dehorning, lameness, and abdominal surgery were included in the analysis. Each study consisted of 3 treatments: uncontrolled pain (tissue damage), no pain (handled controls), and analgesic use (tissue damage, administered a nonsteroidal anti-inflammatory drug). Results comparing analgesic effects to uncontrolled pain consistently yielded AUC values >0.7 (95% confidence interval: 0.40 to 0.99) for plasma cortisol (time points: 1.5, 2, 3, 4, 6, and 8 h), hair cortisol (time point: 62 d), and IRT (time point: 72 h). Results comparing analgesic effects to uncontrolled pain consistently yielded AUC values <0.7 (95% confidence interval: 0.28 to 0.90) for salivary cortisol (6, 13, 20, 34, 48, and 62 d); MNT (6, 25, and 49 h); substance P (1, 2, 3, 4, 6, 8, 12, 18, 24, 48, 72, 96, 120, 144, 312, 480, 816, 1,152, and 1,488 h); kinematic gait analysis including area (8, 16, 48, 72, 96, and 120 h), force (8, 16, 24, 48, 72, 96, and 120 h), and pressure (8, 16, 24, 48, 72, 96, and 120 h); and a visual analog scale for pain (1, 2, 3, 4, 5, and 6 d).
These results indicate that ROC analysis can be used to characterize the predictive value of pain biomarkers and provide new knowledge on the diagnostic accuracy of pain biomarkers within this data set. This analysis, using data from 7 studies, was a preliminary approach to identify biomarkers and collection time points that could inform additional analytical approaches or meta-analyses with larger sample sizes, which are needed to further validate these hypotheses and conclusions.


INTRODUCTION
Pain results from mechanical, chemical, or thermal stimulation of nerve endings containing nociceptors (Hudson et al., 2008). Under farm conditions, cattle may undergo elective procedures such as castration and dehorning, which cause pain. They may also experience pain from conditions (such as lameness) or postoperatively (such as after a cesarean section). In a survey of urban citizens, dehorning without pain mitigation was viewed as contentious and not supported (Cardoso et al., 2017), leading to the need for further research into analgesic strategies. Agents that may provide analgesia in cattle include local anesthetics, nonsteroidal anti-inflammatory drugs (NSAIDs), opioids, α2-agonists, and N-methyl-d-aspartate receptor antagonists (Coetzee, 2013). The American Veterinary Medical Association (AVMA) policy on castration and dehorning states that, because these procedures cause pain and discomfort, medications to alleviate pain should be used under the Animal Medicinal Drug Use Clarification Act (AMDUCA), which allows for extra-label drug use for pain relief under the oversight of a veterinarian (AVMA, 2019). In the United States, approved analgesic drugs for use in livestock are limited to NSAIDs. Flunixin meglumine, as a transdermal formulation, is the only US Food and Drug Administration (FDA)-approved analgesic to specifically control pain in cattle, with an indication only for foot rot (FDA, 2017). In the United States, no analgesics are approved to control pain in cattle from castration, dehorning, or surgery. In Canada, meloxicam is approved for cattle to relieve pain and inflammation from castration (Solvet, 2019).
Biomarkers are often used to assess pain and analgesic drug efficacy in livestock. During the drug development process, when submitting data to the FDA for biomarker qualification, one of the components is the characterization of the relationships between the biomarker, the clinical outcomes, and the treatment (Amur et al., 2015). Further characterizing the relationship between analgesic use and biomarker outcomes could be beneficial for future drug approvals. In this study, we chose to characterize biomarkers that were repeatedly collected across 7 pain studies and are commonly used to assess pain in food animals (Stafford and Mellor, 2005a; Heinrich et al., 2010; Stewart et al., 2010; Bustamante et al., 2015; Kleinhenz et al., 2019c). Among the biomarkers currently used to assess pain, the outcomes included in this analysis were cortisol, substance P, infrared thermography (IRT), mechanical nociceptive threshold (MNT), kinematic gait analysis, and visual analog scale (VAS) pain assessment. These biomarkers have been used to quantify an animal's physiologic and behavioral response to a combination of stress, inflammation, and pain; however, some are more invasive than others, some are influenced by restraint, and some can change due to stress rather than pain, which makes it difficult to specifically identify pain. Collecting biomarkers at different time points throughout the stress response following a painful procedure or condition allows us to assess pain over time. However, the nature of biomarker collection can increase stress due to restraint, and the animal may become sensitized or desensitized over time. Thus, determining which time points yield the best diagnostic accuracy would allow for biomarkers to be collected at fewer time points and be less confounded by continued sampling. This brings into question which biomarkers yield acceptable diagnostic accuracy when comparing uncontrolled pain and inflammation to NSAID use at varying time points throughout the stress response following a painful procedure or condition.
Diagnostic accuracy depends on the sensitivity and specificity of pain biomarkers and has not been described for different painful conditions in cattle over time. Receiver operating characteristic (ROC) curves are graphical plots that illustrate the diagnostic ability of a test as its discrimination threshold is varied. Plotting the true-positive rate (sensitivity) against the false-positive rate (1 − specificity) across all possible cut-off values generates a ROC curve (Hajian-Tilaki, 2013). The area under the ROC curve (AUC) can be used to measure discriminative ability (the probability that a randomly chosen positive subject is rated or ranked as more likely to be positive than a randomly chosen negative subject; Hajian-Tilaki, 2013; Figure 1). Additionally, AUC values can be compared between ROC curves (Ekelund, 2012). A ROC curve can be constructed for each time point throughout a pain study. The objective of this analysis was to use AUC values derived from ROC analysis to assess the predictive value of pain biomarkers at specific sample collection time points following painful events using already published data from a series of studies. Each study consisted of 3 treatments: (1) uncontrolled pain (tissue damage) and physiologic changes following a painful procedure; (2) no pain (handled controls); and (3) analgesic use mitigating some pain (tissue damage, administered an NSAID) and physiologic changes following a painful procedure. The null hypothesis was that there would be no difference in AUC values across biomarkers or at different sample collection time points.
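The two descriptions of the AUC given above (area under the sensitivity versus 1 − specificity curve, and the probability that a random positive subject ranks above a random negative subject) can be checked against each other with a minimal Python sketch. The function names and the biomarker values are hypothetical and invented for illustration; they are not data from the studies, and the sketch assumes that higher values indicate the positive (uncontrolled pain) state.

```python
def roc_points(pos, neg):
    """Return (false-positive rate, true-positive rate) pairs, one per cut-off.

    A subject is called positive when its value is >= the cut-off.
    """
    cutoffs = sorted(set(pos) | set(neg), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every value: nothing is called positive
    for c in cutoffs:
        tpr = sum(v >= c for v in pos) / len(pos)  # sensitivity
        fpr = sum(v >= c for v in neg) / len(neg)  # 1 - specificity
        points.append((fpr, tpr))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def auc_rank(pos, neg):
    """AUC as P(random positive > random negative), ties counted as 1/2."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cortisol-like values (ng/mL), invented for illustration only.
pain = [22.0, 18.5, 30.1, 12.0, 25.4]       # "positive" group
analgesic = [10.2, 14.8, 9.9, 18.5, 11.3]   # "negative" group

curve = roc_points(pain, analgesic)
print(auc_trapezoid(curve), auc_rank(pain, analgesic))
```

For these made-up values both calculations give an AUC of approximately 0.9, which illustrates why the area and the ranking probability can be used interchangeably.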

Study Design
Each of the studies was reviewed and approved by the Institutional Animal Care and Use Committee at Iowa State University, at Kansas State University (IACUC# 4002), the University of Calgary (ACC14-0159), and the Lethbridge Research Centre Animal Care Committee (ACC# 1410 and 1428).
A total of 7,992 biomarker outcomes were included in the analysis. These outcomes were collected from 351 animals enrolled in 7 studies using cattle. These studies investigated the use of an NSAID administered alone to one treatment group. Each study collected overlapping biomarkers and used similar biomarker collection methods. The studies included in the analysis quantified pain in cattle associated with typical husbandry procedures and conditions, including castration, dehorning, lameness, and abdominal surgery (Table 1; Kleinhenz et al., 2017, 2018, 2019a; Meléndez et al., 2017, 2018; Martin et al., 2020). The same group of calves was involved in 3 of the studies (Kleinhenz et al., 2017, 2018, 2019b); calves were completely healed from the previous study and rerandomized across treatments for each study. The biomarkers included in the analysis were plasma cortisol, salivary cortisol, hair cortisol, IRT, MNT, substance P, kinematic gait analysis, and VAS. Animals were restrained for collection of all biomarker samples except kinematic gait analysis, in which they were walked across the mat, and VAS scoring. Biomarker outcomes for each of the 7 studies are outlined in Table 1, with collection time points outlined in Table 2. All sample time points collected were analyzed; plasma cortisol and substance P were collected multiple times on the first collection day, and all other outcomes were collected on the days following the painful procedure or condition. All data were collected between 2016 and 2019. Data from multiple castration studies were combined for the substance P outcome, which was collected and analyzed in the same manner at differing time points. All cattle enrolled in the castration studies were castrated surgically.
In the dehorning study, cattle were dehorned using electro-cautery; in the lameness study, lameness was experimentally induced; and in the abdominal surgery study, surgery was performed to evaluate pain postoperatively. Cattle received flunixin transdermally in 1 dehorning, 1 lameness, and 2 surgical castration studies, and intravenously in 1 abdominal surgery study (Table 1). Cattle received meloxicam subcutaneously in 2 surgical castration studies (Table 1). Each study consisted of 3 treatments: uncontrolled pain (tissue damage), no pain (handled controls), and analgesic use (tissue damage, administered a nonsteroidal anti-inflammatory drug). Analgesics administered once at the time of the procedure were NSAIDs: flunixin meglumine (3.3 mg/kg of BW transdermally or 2.2 mg/kg of BW intravenously) or meloxicam (0.5 mg/kg of BW subcutaneously). Baseline time points were not included in the ROC analysis due to the lack of comparison between uncontrolled pain, no pain, and analgesic use, as animals had not experienced the painful procedure or condition. However, baseline values can be referenced from individual studies in their previous publications (Kleinhenz et al., 2017, 2018, 2019a; Meléndez et al., 2017, 2018; Martin et al., 2020).

Table 2. Biomarker collection time points
Plasma cortisol, h: 1.5, 2, 3, 4, 6, 8, 12, 24, 48, 72
Salivary cortisol, d: 6, 13, 20, 34, 48, 62
Hair cortisol, d: 34, 62
IRT, h: 1, 2, 3, 4, 6, 8, 12, 24, 48, 72
MNT, h: 6, 25, 49
Substance P, h: 1, 2, 3, 4, 6, 8, 12, 18, 24, 48, 72, 96, 120, 144, 312, 480, 816, 1,152, 1,488
Kinematic gait analysis, h: 8, 16, 24, 48, 72, 96, 120
VAS, d: 1, 2, 3, 4, 5, 6

Physiological and Behavioral Biomarkers
Plasma Cortisol. A total of 1,564 plasma cortisol samples from 77 animals made up this data set. The samples were obtained as described by Kleinhenz et al. (2017). Blood was obtained using a 14-gauge jugular catheter or by jugular venipuncture using a 20-mL syringe (Monoject) and 16-gauge 3.8-cm needle (Monoject) while cattle were restrained in headlocks (Kleinhenz et al., 2018, 2019a). The blood was immediately transferred to a tube containing sodium heparin and was centrifuged at 3,000 × g for 10 min. The plasma was pipetted into cryovials, placed on dry ice, and stored at −80°C until analysis. Cortisol concentrations were determined using a commercially available radioimmunoassay (MP Biomedicals) with a detection range of 0.64 to 150 ng/mL.

Salivary Cortisol. A total of 523 salivary cortisol samples from 106 animals made up this data set. As described in Meléndez et al. (2018), calves were restrained in a hydraulic squeeze chute (Cattlelac Cattle, Reg Cox Feedmixers Ltd.), where saliva samples were collected with a cotton swab, immediately stored in a plastic tube, and frozen at −20°C for further cortisol analysis using an ELISA kit (Salimetrics) with a sensitivity of <0.007 µg/dL.

Hair Cortisol. A total of 175 hair cortisol samples from 72 animals made up this data set. As described in Meléndez et al. (2018), hair from the forehead was clipped and stored in plastic bags at room temperature for further cortisol analysis. Cortisol was quantified using an ELISA (Salimetrics) with a sensitivity of <0.007 µg/dL.

IRT. A total of 724 IRT measures from 71 animals made up this data set. Infrared images of the medial canthus of the eye were included for the castration, dehorning, and surgery studies, and images of lame feet were included to quantify changes in inflammation in the foot. Mean and maximum temperatures and temperature differentials between the left and right feet from the lameness study were included in the analysis, as discussed in Alsaaod et al. (2015). Infrared thermography images were obtained using a research-grade infrared camera (FLIR SC 660; FLIR Systems AB). The IRT camera was calibrated before use with ambient temperature and relative humidity. As described in Kleinhenz et al. (2017, 2018), an image of the lateral aspect of the head was obtained so that the image contained the medial canthus of the eye. As described in Kleinhenz et al. (2019b), images of the foot were obtained at a 45° angle, 1 m from the coronary band; 3 images at each time point were averaged, the maximum was also used for analysis, and the difference between the temperatures of the left and right hind feet (left hind minus right hind) was determined for each time point. Infrared images were analyzed using research-grade computer software by drawing a circle around the medial canthus of the eye or the coronary band of the feet and recording the maximum, minimum, and average temperature provided by the software (FLIR ExaminIR Inc.).

Mechanical Nociception Threshold. A total of 1,065 MNT measures from 24 animals made up this data set. Calves were restrained using a halter and blindfolded. As described in Kleinhenz et al. (2017), using a handheld pressure algometer (Wagner Instruments), a force was applied perpendicularly at a rate of approximately 1 kg of force (kgf) per second at 2 locations (lateral and caudal) adjacent to the horn bud. A third control location between the eyes was used to evaluate MNT. A withdrawal response was indicated by an overt movement away from the applied pressure algometer, at which time the investigator immediately removed the algometer. Values were recorded by a second investigator to prevent bias from the first investigator, who did not look at the values. Throughout the study, the same investigator applied the algometer and the same second investigator recorded the values. Locations were tested 3 times in sequential order, and the values were averaged for statistical analysis.
Substance P. A total of 1,402 substance P samples from 207 animals made up this data set. Calves were restrained in a hydraulic squeeze chute (Cattlelac Cattle, Reg Cox Feed Mixers Ltd.) for blood collection. Blood samples were collected via jugular venipuncture into vacuum tubes (BD Vacutainer; Becton Dickinson). As described in Kleinhenz et al. (2017), 200 µg of benzamidine was added to EDTA blood tubes (BD Vacutainer) 48 h before the start of the studies. During sample collection, 6 mL of blood was added to the spiked EDTA tube. The samples were immediately placed on ice, centrifuged at 1,500 × g for 10 min at 4°C within 30 min of collection, and the plasma was placed into cryovials. The cryovials were stored at −80°C until analysis. Substance P levels were determined using the methods described by Van Engen et al. (2014) using nonextracted plasma. A similar method was used by Meléndez et al. (2018), except that benzamidine (200 µg), which helps prevent substance P breakdown, was not added to the blood tubes before the start of the study. The limit of detection was 10 pg/mL, and the limit of quantitation was 20 pg/mL.

Kinematic Gait Analysis. A total of 239 gait analysis readings from 30 animals made up this data set. As described in Kleinhenz et al. (2019a), a commercially available kinematic gait system (MatScan, Tekscan Inc.) was used to record gait and biomechanical parameters from adult cows who walked across the pressure mat at their own pace. The system was calibrated using a known mass daily and before each use of the computer software to ensure accuracy of the measurements at each time point. Video synchronization was used to ensure consistent gait between and within cows at each time point. Using research-specific software (Hugemat Research 5.83, Tekscan Inc.), force, contact pressure, and impulse in the affected feet were assessed, which are the parameters also reported by Schulz et al. (2011).
Visual Analog Scale. A total of 2,300 VAS scores from 192 animals made up this data set. As described in Martin et al. (2020), a daily VAS pain assessment was conducted by 2 trained evaluators blinded to treatment allocations on calves of stocker age. The VAS used was a 100-mm (10-cm) line anchored by descriptors of "no pain" on the left (0 cm) and "severe pain" on the right (10 cm). Five parameters were used to assess pain: depression, tail swishing or flicking, stance, head carriage, and foot stomping or kicking (Table 3). "No pain" was characterized by being alert and quick to show interest, no tail swishing, a normal stance, head held above spine level, and absence of foot stomping. "Severe pain" was characterized by being dull and showing no interest, more than 3 tail swishes per minute, legs abducted, head held below spine level, and numerous stomps. The evaluator marked the line between the 2 descriptors to indicate the pain intensity. A millimeter scale was used to measure the score from the zero anchor point to the evaluator's mark. The VAS measures of the 2 evaluators were averaged into one score for statistical analysis.
ROC Curve Determination. All statistics were performed using statistical software (JMP Pro 14.0, SAS Institute Inc.). Receiver operating characteristic curves were created for each time point, with AUC values comparing uncontrolled pain × no pain (handled controls) and uncontrolled pain × analgesic use, with uncontrolled pain as the positive control. The biomarker outcome was plotted as the x, continuous regressor, and status (uncontrolled pain, no pain, analgesic use) was plotted as the y, categorical response. Bootstrapping via fractional weights was used to generate confidence intervals for each AUC value; AUC values ≥0.7 were reported due to the ROC rule of thumb used for acceptable discriminative ability described in Yang and Berdine (2017). The maximum (AUC = 1) means that the diagnostic test is perfect in differentiation, AUC = 0.5 means subject discrimination is due to chance, and AUC = 0 means the test incorrectly identifies all subjects (Hajian-Tilaki, 2013). Specific cut-off values were generated through the ROC analysis based upon optimized specificity and sensitivity values by minimizing the square distance between the upper left-hand corner of ROC space and any point on the ROC curve and are presented.
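The cut-off rule and the confidence intervals described above can be illustrated with a minimal Python sketch. It is not a reproduction of the JMP procedure: it assumes that "fractional weights" corresponds to the Bayesian bootstrap, in which each observation is given a random positive weight rather than being resampled, and it uses a simple percentile interval. The function names and the biomarker values are hypothetical.

```python
import random


def best_cutoff(pos, neg):
    """Cut-off whose ROC point lies closest to the top-left corner (0, 1).

    Values >= cut-off are called positive.
    Returns (cutoff, sensitivity, specificity).
    """
    best = None
    for c in sorted(set(pos) | set(neg)):
        sens = sum(v >= c for v in pos) / len(pos)
        spec = sum(v < c for v in neg) / len(neg)
        dist2 = (1 - sens) ** 2 + (1 - spec) ** 2  # squared distance to (0, 1)
        if best is None or dist2 < best[0]:
            best = (dist2, c, sens, spec)
    return best[1:]


def weighted_auc(pos, neg, w_pos, w_neg):
    """Weighted P(positive > negative), ties counted as 1/2."""
    num = 0.0
    for p, wp in zip(pos, w_pos):
        for n, wn in zip(neg, w_neg):
            num += wp * wn * ((p > n) + 0.5 * (p == n))
    return num / (sum(w_pos) * sum(w_neg))


def fractional_weight_ci(pos, neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the AUC from randomly re-weighted observations.

    Assumption: Exponential(1) weights (Bayesian bootstrap) stand in for
    the fractional weights used by the statistical software.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        w = [rng.expovariate(1.0) for _ in range(len(pos) + len(neg))]
        aucs.append(weighted_auc(pos, neg, w[:len(pos)], w[len(pos):]))
    aucs.sort()
    lo = aucs[int(n_boot * alpha / 2)]
    hi = aucs[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Hypothetical values (ng/mL), invented for illustration only.
pain = [22.0, 18.5, 30.1, 12.0, 25.4]
analgesic = [10.2, 14.8, 9.9, 18.5, 11.3]

print(best_cutoff(pain, analgesic))
print(fractional_weight_ci(pain, analgesic))
```

With these made-up values the selected cut-off is 18.5, where sensitivity and specificity are both 0.8. With only 5 animals per group the interval is wide, which mirrors the wide confidence intervals reported for small data sets.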

Plasma Cortisol
Plasma cortisol AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4, with rankings in Table 5. Out of 12 time points, the time points that yielded acceptable diagnostic accuracy for surgical castration, dehorning, and lameness when comparing analgesic use to uncontrolled pain are outlined below. Surgical castration study results for plasma cortisol (Figure 2A) … 9.85, 16.47, 11.43, 7.64, 9.92, 6.07, 11.19, 4.43, 4.11, 6.25, 5.33, 7.33, 4.81, and 3.00 ng/mL, respectively). Plasma cortisol AUC values comparing no pain versus uncontrolled pain are outlined in Table 6, with rankings in Table 7. Out of 12 time points, the time points … Tables 8, 9, and 10, respectively.

Hair Cortisol
Hair cortisol AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4, with rankings in Table 5. Out of 2 time points, surgical castration study results comparing analgesic use versus uncontrolled pain yielded acceptable diagnostic accuracy (AUC >0.7; 95% CI: 0.51 to 0.85) for hair cortisol at 62 d, with a cut-off value of 10.33 pg/mL.
Hair cortisol AUC values comparing no pain versus uncontrolled pain are outlined in Table 6. Out of 2 time points, surgical castration study results comparing no pain versus uncontrolled pain yielded unacceptable diagnostic accuracy (AUC <0.7; 95% CI: 0.44 to 0.71) for hair cortisol at all time points examined.

Infrared Thermography
Ocular IRT AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4, with … Tables 8, 9, and 11, respectively.

Mechanical Nociception Threshold
Mechanical nociception threshold AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4. Out of 3 time points, dehorning study results for MNT yielded unacceptable diagnostic accuracy (AUC <0.7; 95% CI: 0.40 to 0.65) at all time points examined (6, 25, and 49 h).
Mechanical nociception threshold AUC values comparing no pain versus uncontrolled pain are outlined in Table 6, with rankings in Table 7. Out of 3 time points, dehorning study results for MNT yielded acceptable diagnostic accuracy (AUC >0.7; 95% CI: 0.75 to 0.95) at time points 6, 25, and 49 h, with cut-off values decreasing over time (cut-off values: 1.05, 0.89, and 0.78 kgf, respectively).

Kinematic Gait Analysis
Gait AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4, with rankings in Table 5. Out of 7 time points, the time points that yielded acceptable diagnostic accuracy for lameness when comparing analgesic use to uncontrolled pain are outlined below. Area yielded acceptable results (AUC >0.7; 95% CI: 0.47 to 0.89) at 24 h, with a cut-off value of 0.02 cm².

Visual Analog Scale
Visual analog scale AUC values comparing analgesic use versus uncontrolled pain are outlined in Table 4. Out of 6 time points, surgical castration study results comparing analgesic versus uncontrolled pain yielded unacceptable diagnostic accuracy (AUC <0.7; 95% CI: 0.43 to 0.74) for VAS scores at all time points examined (1, 2, 3, 4, 5, and 6 d).
Visual analog scale AUC values comparing no pain versus uncontrolled pain are outlined in Table 6, with rankings in Table 7. Out of 6 time points, surgical castration study results comparing no pain versus uncontrolled pain yielded acceptable diagnostic accuracy (AUC >0.7; 95% CI: 0.65 to 0.81) for VAS scores at 1 and 4 d, with cut-off values of 7.5 and 3.0.

DISCUSSION
It is essential that noninvasive measures of acute and chronic stress be developed for assessment of animal welfare (Stewart et al., 2005). One avenue is the development of robust biomarkers to objectively quantify pain and evaluate analgesic treatment regimen efficacy during routine elective animal husbandry procedures such as castration and dehorning (Coetzee, 2011). Cortisol is a corticosteroid hormone that is commonly used as an indicator of acute stress responses, as well as pain (Glynn et al., 2013). Cortisol levels are dependent on activation of the hypothalamic-pituitary-adrenal axis, reflect the sensory component of pain, and do not require processing by the central nervous system (Ede et al., 2019). Cortisol responses to painful procedures of the type included in the present analysis are often characterized by a rapid increase in concentration following a procedure that peaks, rapidly declines, and then reaches a plateau; however, many limitations exist, and the stress from restraint and invasiveness may affect cortisol response. Studies have shown that plasma cortisol concentrations are influenced by different procedural methods and have high individual variability (Stafford et al., 2002), and some animals have low responses likely due to higher pain thresholds (Stafford and Mellor, 2005b). Difficulties include obtaining true baseline measurements, missing rapid response times, the effect of circadian rhythms on hormone levels, differences in breed and temperament, and the nature of blood sampling, which may itself cause a stress response (Coetzee, 2011). Measures that can be taken to attempt to overcome these challenges include assigning animals to a control treatment that experiences the stress of restraint, as was done with the "no pain" (handled controls) analysis in the present study, collecting samples at the same time each day to account for circadian rhythms, accounting for time of day in the statistical model, and sampling frequently during the acute phase response. The AUC values included in this analysis indicated that plasma cortisol can yield acceptable diagnostic accuracy (AUC >0.7) for characterizing the sensory component of pain for castration, dehorning, and lameness immediately after the procedure and can continue to yield sound results for multiple days following lameness induction in the studies examined. Plasma cortisol yielded acceptable diagnostic accuracy for identifying uncontrolled pain, as well as detecting NSAID analgesic effects. Salivary cortisol data included in the analysis were collected days, rather than hours, after the surgical castration procedure, which could have been a factor leading to poor diagnostic accuracy. Peak plasma cortisol levels have been found to occur 10 min after a stressor, with peak salivary cortisol values lagging 10 min behind (occurring 20 min after a stressor; Hernandez et al., 2014). It would be beneficial for future analyses for salivary cortisol to be collected closer to the time of the procedure. Hair cortisol data included in the analysis yielded promising results at the 62-d time point but were only measured twice following surgical castration due to the nature of hair growth, so data were limited. The most appropriate method to quantify cortisol via plasma, saliva, or hair may be based on duration of response, invasiveness, and sampling capabilities.
A decrease in eye temperature has been observed following castration in calves (Stewart et al., 2010); however, a lack of analgesic effect could be due to an NSAID alone not effectively controlling pain or may suggest that temperature change is more indicative of stress and less accurate as a sole indicator of pain (Glynn et al., 2013). Stimuli that induce fear and anxiety may trigger similar kinds of physiological responses (Weary et al., 2017) and often accompany painful ex- … effects of analgesia (Heinrich et al., 2009, 2010). Downfalls of MNT determination include a high degree of intra-individual variation, producing an avoidance response, and a lack of a nonpainful control site (Raundal et al., 2014). Interobserver reliability has been shown to increase when MNT values are averaged rather than taking single measurements (Tapper et al., 2013), and averaging of MNT measures was employed for the data included in the present analysis. The AUC values included in this analysis indicated that MNT had better diagnostic accuracy for identifying pain than for analgesic effects due to higher AUC values comparing uncontrolled pain to nonpainful controls rather than for animals administered an analgesic.

Coetzee et al. (2008) suggested that substance P measurement may discriminate between a stressful event, which will cause a cortisol response, and a more specific nociceptive stimulus. Substance P concentrations have been found to be significantly higher in castrated calves compared with controls and in lame cattle with laminitis (Bustamante et al., 2015). Substance P levels may not be altered by NSAID use and may be affected by other factors based on the findings from the present analysis, along with those from Coetzee et al. (2008).
Approval for the use of transdermal flunixin for control of pain due to foot rot was achieved through gait analysis using a floor-based pressure mat system (FDA, 2017). One downfall of the system is that if an animal stops or slows down while walking across the pressure mat, the measurement is often lost (Maertens et al., 2011). The AUC values included in this analysis indicated that kinematic gait analysis may not yield acceptable (AUC >0.7) diagnostic accuracy for experimentally induced lameness at the time points examined. The use of this type of gait analysis in naturally occurring clinical lameness has not been well described in the literature.
Visual analog scale assessment is a method of evaluating pain intensity based on a subjective integration of behavioral parameters. For the data included in the present analysis, the parameters used to assess pain were depression, tail swishing, stance, head carriage, and foot stomping (Martin et al., 2020). Some disadvantages exist when using subjective visual assessment; behavioral observations may lack sensitivity because of individual animal variation or because many behaviors are socially facilitated, and cattle behavior can be influenced by outside factors such as predation, social interactions, and their environment (Kluever et al., 2008). The AUC values included in this analysis indicated that VAS assessment was not affected by NSAID use.
The data in the present analysis were combined from studies that consisted of 3 treatments: (1) cattle experiencing uncontrolled pain, tissue damage, and physiologic changes following a painful procedure; (2) cattle that were controls who were handled and restrained but not subjected to a painful procedure; and (3) cattle who experienced tissue damage and physiologic changes following a painful procedure but were administered an NSAID to mitigate some pain. The data were from studies where only NSAIDs were used for analgesia. Results would likely have differed if a local anesthetic had been included in the protocol, as local anesthetics have been shown to decrease acute pain immediately following the procedure (Winder et al., 2018). However, less than 15% of US producers report always using local anesthesia or analgesia at the time of disbudding in calves <2 mo of age, and less than 25% of producers report always using local anesthesia or analgesia for surgical castration in calves <12 mo of age (Johnstone et al., 2021). Nonsteroidal anti-inflammatory drugs prevent inflammation by inhibiting cyclooxygenase enzymes that produce prostaglandins. Use of an NSAID alone does not effectively reduce the acute stress or pain associated with many elective procedures but provides analgesic and anti-inflammatory effects during the postoperative period (Coetzee, 2011). The mean half-life has been observed to be 6.42 h for topical flunixin, 4.99 h for intravenous flunixin meglumine administration (Kleinhenz et al., 2016), and approximately 26 h for meloxicam (Heinrich et al., 2010). Differences in duration of analgesic effect along with drug absorption may have affected the diagnostic accuracy of analgesia versus pain comparisons, with AUC values likely decreasing as the analgesic effect wore off.
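To give a rough sense of how quickly the drug concentrations behind these half-lives decline, the following Python sketch computes the fraction of drug remaining at a few time points. It assumes simple first-order (exponential) elimination and ignores absorption, which is a simplifying assumption made here for illustration and not a pharmacokinetic model from the studies; the function name and the chosen time points are hypothetical.

```python
def fraction_remaining(hours, half_life_h):
    """Fraction of drug left after `hours`, assuming first-order elimination.

    Illustrative assumption only: absorption and distribution are ignored.
    """
    return 0.5 ** (hours / half_life_h)


# Mean half-lives quoted in the text (h); meloxicam is approximate.
half_lives = {
    "flunixin, topical": 6.42,
    "flunixin, intravenous": 4.99,
    "meloxicam": 26.0,
}

for drug, t_half in half_lives.items():
    row = ", ".join(
        f"{t} h: {fraction_remaining(t, t_half):.2f}" for t in (8, 24, 72)
    )
    print(f"{drug}: {row}")
```

Under this assumption, a drug with a longer half-life retains a larger fraction at every later time point, which is consistent with the expectation that analgesia versus pain comparisons lose discriminative ability sooner for shorter-acting drugs.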
By including the raw data from 7 studies conducted with similar methods using only an NSAID to mitigate pain, our goal was to limit confounders such as different collection and analysis methods, as well as the use of different classes of analgesic compounds with different durations and mechanisms of action. The purpose of the analysis was to evaluate the sensitivity and specificity of a similar group of data across different time points and painful procedures and conditions to preserve the internal validity of the analysis, which limits the scope of this analysis to a specific study type where NSAIDs were used. Animals from the 7 studies were not all the same age group or breed, which could have introduced additional variation to the biomarker results. Due to the limited number of studies included in the ROC analysis, confidence intervals for AUC values and cutoff values varied. Area under the curve values ≥0.7 are reported to have acceptable diagnostic accuracy (Yang and Berdine, 2017) but should be interpreted along with the confidence intervals, cut-off values, and patterns across different collection time points. Thus, AUC values can be valuable for comparing the discriminative ability of a biomarker at one time point compared with another time point, or for comparing the collection of 2 biomarkers at the same time point.
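The point that an AUC ≥0.7 should be read together with its confidence interval can be sketched as a small Python helper. The labels, the function name, and the example numbers are hypothetical and chosen for illustration; they are not a classification scheme used in the analysis.

```python
def describe_auc(auc, ci_low, ci_high, threshold=0.7):
    """Label an AUC estimate using the 0.7 rule of thumb and its 95% CI."""
    if not (ci_low <= auc <= ci_high):
        raise ValueError("point estimate must lie inside the interval")
    if auc < threshold:
        return "below threshold"
    if ci_low > threshold:
        return "acceptable; whole interval above threshold"
    if ci_low > 0.5:
        return "acceptable; interval excludes chance but spans threshold"
    return "acceptable point estimate; interval includes chance"


# Hypothetical (AUC, lower, upper) triples, invented for illustration only.
examples = [(0.85, 0.75, 0.95), (0.72, 0.51, 0.85), (0.71, 0.40, 0.99), (0.55, 0.40, 0.65)]
for auc, lo, hi in examples:
    print(auc, (lo, hi), describe_auc(auc, lo, hi))
```

Two estimates that both clear the 0.7 rule of thumb can therefore carry quite different levels of support, depending on whether the lower confidence limit stays above 0.5.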
In future ROC analyses, examining the diagnostic accuracy of pain biomarkers following administration of a local anesthetic, or a combination of systemic analgesia and a local anesthetic, would help further characterize the predictive value of pain biomarkers. Examining additional biomarkers, along with analgesics with differing durations of action, would be beneficial to researchers designing future pain studies. This ROC analysis was a novel approach that did not include all pain biomarkers or collection time points that currently exist in the literature. Additional analyses that include a larger number of studies, and therefore a larger sample size, are needed to further validate the diagnostic accuracy of the biomarkers used in the present analysis.
When assessing pain, the choice of biomarkers and of appropriate time points for sample collection depends on many factors. First, calves respond to both the pain of the procedure (such as castration or dehorning) and the physical restraint (Faulkner and Weary, 2000). Following a procedure such as dehorning or castration, an acute painful response is observed, followed by a period of inflammatory pain (Stock et al., 2013). Acute pain is capable of producing an acute stress response by activating the sympathetic nervous system and stimulating the secretion of glucocorticoids (Anderson and Muir, 2005). The duration of the stress response likely varies among procedures and conditions, as does the amount of pain the animal is experiencing at different time points. The effect of analgesia in restoring homeostasis depends on whether the analgesia is administered before the procedure, the mechanism of action and duration of the analgesic regimen, and the duration of the stress response. Many of the biomarkers described may be influenced by the animal's physiologic response to the painful procedure, even if pain is mitigated. Finally, selecting biomarkers that will provide acceptable diagnostic accuracy requires considering whether the objective is to identify the physiologic effects of a painful procedure or to quantify the effects of analgesia.

CONCLUSIONS
Results from the present study comparing NSAID analgesic effects to uncontrolled pain consistently yielded acceptable diagnostic accuracy for plasma cortisol, hair cortisol, and IRT. The objectives of the study, length of the stress response, and duration of the analgesic regimen administered should all be considered when selecting biomarkers to assess painful procedures and conditions. This analysis was a preliminary approach to identify biomarkers and collection time points that could inform additional analyses using a larger number of studies to further validate the accuracy of these biomarkers. These results indicate that ROC analysis can be used to characterize the predictive value of pain biomarkers, and we have provided new knowledge on the diagnostic accuracy of pain biomarkers within this data set. These results can be used to guide refinement of future research regarding painful procedures and analgesic efficacy.