Welfare and behaviour: is your horse comfortable, and how do you tell
A plain-English read of equine welfare science, sorted by what studies back up, what stays mixed, and what research has actually contradicted. Pain checklists, stress hormones, social housing, and ridden tack have firm verdicts; stereotypies and turnout sit in the mixed pile; the "copying" claim and the "natural horsemanship" superiority claim are among those research has contradicted.
Last updated May 2026. Revision history below.
You walk down to the barn on a Sunday afternoon and your horse meets you at the gate, but something feels slightly off. He is not lame. He is not refusing food. He is just a little flatter than usual, a little less keen when you tack up, a little stiffer in the first few minutes of work. The hard welfare question for most owners is not the obvious crisis. It is this quieter one: is your horse comfortable, and if not, what changed.
Equine welfare science has come a long way in the last fifteen years. Researchers now have validated checklists for pain, decent stress markers, good comparisons of stable versus group housing, and a settled body of work on tack. Some questions have firm answers. Others are still genuinely mixed. And a handful of widely repeated claims do not actually hold up.
This guide walks through what is solid, what is mixed, what is too thin to call, and what studies have actively contradicted. It ends with a short checklist you can run on your own horse.
What works
How do you tell if your horse is in pain?
The most useful welfare tools to come out of the research world are the pain checklists. The Horse Grimace Scale (HGS, a six-feature face-scoring tool) reads pain from your horse's face. The Ridden Horse Pain Ethogram (RHpE, a checklist of 24 behaviours that signal ridden discomfort) reads pain from how he goes under saddle. EquiFACS is the underlying anatomical map of equine facial muscles that the grimace scales sit on top of.
HGS is the more reliable of the two for everyday use. The original validation study (Dalla Costa et al., 2014) showed that different scorers reach the same answer, that the scale picks up known pain (such as castration with and without painkillers), and that it tracks alongside fuller pain scales used in clinics. Donkey-specific versions exist now too.
RHpE is the strongest ridden tool we have, but it is not perfect. Studies link a score of eight or more (out of 24) to underlying musculoskeletal pain. The catch is that most of the validation work comes from one research group, and untrained scorers drift. RHpE flags problems well, but discipline and practice matter.
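The scoring arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the published scoring sheet: the function names are hypothetical, and numbering the 24 behaviours 1 to 24 is an assumption made only so the example has something to count. The two constants come from the text above.

```python
from typing import Iterable

# Figures taken from the text above.
RHPE_TOTAL_BEHAVIOURS = 24  # size of the checklist
RHPE_VET_THRESHOLD = 8      # score linked to underlying musculoskeletal pain


def rhpe_score(observed_items: Iterable[int]) -> int:
    """Count the distinct checklist items seen in one ridden session.

    Items are identified by a number from 1 to 24. That numbering is a
    hypothetical convention for this sketch, not the official sheet.
    """
    items = set(observed_items)  # a behaviour seen twice still counts once
    invalid = sorted(i for i in items if not 1 <= i <= RHPE_TOTAL_BEHAVIOURS)
    if invalid:
        raise ValueError(f"item numbers outside 1..{RHPE_TOTAL_BEHAVIOURS}: {invalid}")
    return len(items)


def needs_vet_conversation(score: int) -> bool:
    """True when the score reaches the threshold quoted in the text."""
    return score >= RHPE_VET_THRESHOLD


if __name__ == "__main__":
    session = [2, 5, 5, 9, 14, 17, 20, 21, 23]  # item 5 was noted twice
    score = rhpe_score(session)
    print(score, needs_vet_conversation(score))
```

Counting distinct items rather than raw tallies is a design choice for the sketch: the checklist asks whether a behaviour was present, so a behaviour noted twice in one session is counted once.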
Worth it
Learn one checklist. HGS is the easiest place to start in the stable. RHpE is the best ridden tool out there, with the caveat that scoring drifts unless you practise. Both are free, and both translate into language your vet will recognise.
What does stress actually look like in a horse?
Research on equine stress has lined up around a usable picture. Cortisol (the main stress hormone) goes up reliably during transport, competition and acute restraint. Studies show that after a short journey of an hour or two, cortisol drops back toward normal within a few hours, as long as loading and unloading themselves were not traumatic. Long-haul road transport, air freight and badly ventilated trailers are different: research finds sustained physiological stress and a higher rate of transport-related pneumonia ("shipping fever"). Heart rate variability adds useful detail to cortisol but does not replace it.
Worth it
Treat short trips as recoverable. Treat long-haul transport, air freight, poor trailer ventilation and unfamiliar competition stabling as situations that need real welfare planning. The stress signal in those settings is measurable, not theoretical.
Does turnout actually improve welfare?
The comparison of single-box housing, group housing, paddock paradise systems and active barns has reached a clear central answer. Studies show that group housing with horses your horse gets along with is welfare-positive across the board: fewer stereotypies, more lying down (which matters because horses can only reach REM sleep when recumbent), better stress markers, more normal time budgets.
The injury question is more mixed. Some studies report more minor scrapes, kicks and bites in group turnout. Serious injuries do not appear higher when introductions are managed and the group stays stable. Research finds that single-box housing without daily turnout and contact with other horses tracks with stereotypy onset and a flatter, more pessimistic mood, picked up on cognitive bias tests.
Worth it
Daily turnout with compatible horses is the single biggest welfare lever most yards have. Minor scrapes are real and manageable. The cost of long-term confinement is bigger.
What does the research say about your tack?
This is the area where syntheses from 2024 and 2025 firmed up several questions that used to be contested. Studies show that nosebands tightened beyond a 1.5-finger gap produce measurable stress: warmer eye temperature on thermal imaging, faster heart rate, conflict behaviours under saddle. The 1.5-finger gauge is a research tool rather than a regulation in most places, but it gives you a defensible threshold.
A 2014 systematic review consolidated earlier findings on prolonged hyperflexion (head and neck held tightly behind vertical under work) and confirmed it produces stress and biomechanical change. Bit and saddle pressure have been studied with pressure mats and force sensors. Results show fit clearly matters, but the exact pressure thresholds that matter behaviourally are still being worked out.
Worth it
Use the 1.5-finger noseband gauge as a working threshold. Treat sustained hyperflexion as a welfare problem, not a training style. Bit and saddle fit matter even where exact pressure cut-offs are still being pinned down.
Which training method is actually best?
The research has settled on the underlying mechanism even when marketing language has not. Studies show that operant conditioning (your horse learning by what happens after he does something) is the actual engine of every effective training method, regardless of the label. Positive reinforcement (food rewards, withers scratches) is effective and welfare-positive when teaching new things, working through scary stimuli or rehabilitating a horse, with measurable behavioural and physiological benefits over pressure alone. The "natural horsemanship" superiority claim is its own section below.
Worth it
Treat operant conditioning as the actual mechanism behind any training method. Treat positive reinforcement as a tool worth adding rather than a competing belief system. Judge a trainer by what your horse is being reinforced for, not by the label on the trailer.
Where the evidence is mixed
What about cribbing, weaving and other stereotypies?
There is strong research linking low forage time, low turnout and limited social contact to stereotypies (repetitive behaviours such as cribbing or weaving). That covers both oral stereotypies (cribbing, wind-sucking, wood-chewing) and movement-based ones like weaving and box-walking. Research also finds that established crib-biting horses perform differently on certain learning tasks.
Where the evidence gets mixed is the question of intervention. Cribbing collars, surgical procedures and behavioural therapy have all been studied. Suppressing the behaviour without changing the environment is the part the literature is least comfortable with.
Mixed evidence
The cause picture is solid: forage, turnout, social contact. The intervention picture is genuinely mixed. Suppressing the visible behaviour without changing the underlying management is the move studies push back on hardest.
How much turnout does your horse really need?
Studies show daily turnout protects against stereotypies and supports normal sleep. Where the evidence gets foggy is the dose. Welfare benefit shows up above a few hours per day, but a precise minimum number is not nailed down.
The metabolic side runs in two directions. In moderately insulin-resistant horses on a controlled diet, turnout improves insulin sensitivity. In laminitis-prone horses, unrestricted access to high-sugar pasture is a documented trigger.
Mixed evidence
Turnout is welfare-positive and metabolically protective when diet is controlled. For laminitis-prone horses, the right call depends on grass quality and sugar load, not just hours on the field.
What about welfare frameworks, working horses, cognition and racing?
Several smaller research strands share the same shape: real signal, growing evidence, not yet ready for hard verdicts. The Five Domains Model and AWIN-style protocols give repeatable welfare assessments when scorers are trained, though owner-completed quality-of-life forms agree only patchily with vet assessments. Working donkeys, mules and horses in low- and middle-income settings show high rates of preventable welfare problems, with validated rapid tools (the Hands-On Donkey Tool) for fieldwork.
Equine cognition research finds your horse can recognise familiar humans across senses and discriminate human emotional expressions and tones, but the practical takeaway beyond "be consistent" is limited. Positive welfare indicators (nasal snorts, eye wrinkle, lateralised looking, judgement-bias tests) show replicable signals but no firm thresholds yet. In racing, retired and second-career welfare is under-studied compared to the racing years, and public-attitudes work shows shrinking social acceptance of the sport.
Mixed evidence
These strands are improving and worth tracking. They support useful tools and indicators rather than firm prescriptions.
Where the evidence is too thin to call
A few welfare-relevant topics have small literatures or methodological holes that prevent firm verdicts.
Personality and temperament. A handful of repeatable trait dimensions (reactivity, sociability, trainability) show up across populations. Single behavioural tests are less reliable than questionnaires filled out by familiar handlers. Predicting rider-horse compatibility is improving but still modest.
Handling and equine-assisted activities. Gentle, consistent handling lowers baseline stress and improves trainability. Equine-assisted therapy has randomised trials in autism showing benefit, but methods vary too much for confident meta-analysis.
Weaning and young horses. Gradual weaning and the presence of an unrelated adult protect foals from acute stress and stereotypy onset. Early handling has measurable long-term effects on how a horse responds to people. The literature is smaller than the operational importance of the topic deserves.
Headshaking syndrome. The leading proposed mechanism is trigeminal-mediated nerve pain. No single intervention shows reliable long-term remission. Surgical compression of the infraorbital nerve and GnRH vaccination both have published case series with partial success.
Absence of evidence
Headshaking is the clearest "we don't know yet" subtopic in this pillar. The condition is real. None of the current treatments have closed the loop.
What the evidence has actually contradicted
This section needs the most care. "No evidence" and "evidence of no effect" are different claims. The items below are cases where studies were done, the question was asked properly, and the answer pointed away from a popular belief.
Will a stable mate "teach" your horse to crib?
The everyday claim that cribbing, weaving and other stereotypies are copied from neighbours is durable and incorrect. Studies show stereotypies are not transmitted by your horse watching another horse do them. Cross-sectional and case-control work pins onset on environmental factors instead: forage time, turnout, social contact, weaning method. This matters in real yards. Horses are routinely moved away from a stereotyping neighbour on the assumption the behaviour spreads, and the research does not back that move as a welfare intervention.
Evidence of no effect
Social copying is not the main driver of stereotypies. The management changes worth making are forage, turnout and social contact, not separating your horse from a stereotyping neighbour.
Is "natural horsemanship" actually better?
The marketed claim that natural-horsemanship methods are categorically better for welfare or learning does not survive proper comparison. When you operationalise what a horse is actually being reinforced for, comparisons routinely fail to find a real difference. That is evidence of a small-or-zero effect of the marketing label, not absence of studies.
Evidence of no effect
The "natural" label does not predict welfare or learning outcomes once you account for what the horse is being reinforced for.
Does the whip make horses go faster?
The cleanest recent evidence on this question comes from a 2020 trotting-race study (Sandberg et al.). Synced to speed data, 268 whip strikes in trotting races were associated with deceleration in the final 100 metres of the race, not acceleration. Earlier observational work and the New Zealand whip-rule comparison pointed in the same direction. The welfare argument and the social-licence argument now sit on the same ground as the performance argument rather than opposite it.
Evidence of no effect
Whip use does not reliably improve race outcomes. In trotting races a recent study finds the opposite. This is evidence of the absence of the claimed effect, not absence of studies.
Do horses recognise themselves in mirrors?
Genuinely contested. Some recent work has claimed group-level evidence of mirror self-recognition in horses. Other work has refuted earlier individual-level claims and questioned the methods. The honest position is that the question is currently unresolved.
Mixed evidence
Currently unresolved. Not a defensible basis for any welfare or training claim.
What can you do this week?
A short checklist you can run on your own horse, in this order:
- Learn one checklist and use it. HGS in the stable, RHpE under saddle. Score your horse on a calm, normal day to set a personal baseline. Rescore on a day when something feels off. Bring the score to the vet conversation, not just the description.
- Look at turnout and social housing. Hours of turnout per day, the horses your horse is turned out with, the neighbours either side of the box. This is the highest-yield welfare lever for most horses on most yards.
- Audit your tack. Use the 1.5-finger noseband gauge. Get the saddle checked by a qualified fitter at sensible intervals. Check girth and bridle fit between fittings. Treat sustained hyperflexion as a welfare flag, not a stylistic choice.
- Audit how your horse is being trained. Look at what is actually being reinforced, not the label of the method. Is your horse being asked new things with positive reinforcement available, or is the work running entirely on pressure-and-release? Both can be appropriate. The complete absence of either is the warning signal.
- Watch for stress markers. Repeated transport without recovery, a long competition season without breaks, a stereotypy starting in a previously settled horse, a flatter mood. Any of these is a prompt to read your management against the welfare evidence rather than push through.
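The baseline-then-rescore step in the first item of the checklist can be kept as a simple record. The sketch below is one hedged way to do it in Python. It assumes each of the six HGS features is scored 0, 1 or 2, as in the published scale; the short feature labels and function names are hypothetical shorthand for this example, and the output is a prompt for a vet conversation, not a diagnosis.

```python
# Hypothetical shorthand labels for the six facial features of the scale.
HGS_FEATURES = (
    "ears",
    "orbital_tightening",
    "tension_above_eye",
    "chewing_muscles",
    "mouth_and_chin",
    "nostrils",
)
ALLOWED_SCORES = (0, 1, 2)  # assumed per-feature range: absent, moderate, obvious


def hgs_total(scores: dict) -> int:
    """Sum the six feature scores after checking the record is complete."""
    missing = [f for f in HGS_FEATURES if f not in scores]
    if missing:
        raise ValueError(f"missing features: {missing}")
    bad = [f for f in HGS_FEATURES if scores[f] not in ALLOWED_SCORES]
    if bad:
        raise ValueError(f"scores must be 0, 1 or 2 for: {bad}")
    return sum(scores[f] for f in HGS_FEATURES)


def raised_features(baseline: dict, today: dict) -> dict:
    """Return the features scoring higher today than on the baseline day."""
    hgs_total(baseline)  # validate both records before comparing
    hgs_total(today)
    return {
        f: today[f] - baseline[f]
        for f in HGS_FEATURES
        if today[f] > baseline[f]
    }


if __name__ == "__main__":
    calm_day = {f: 0 for f in HGS_FEATURES}
    off_day = dict(calm_day, ears=1, orbital_tightening=2)
    print(hgs_total(calm_day), hgs_total(off_day), raised_features(calm_day, off_day))
```

Reporting which features rose, rather than only the total, gives the vet something more specific than "he looked uncomfortable".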
When should you call your vet or a behaviourist?
Pain checklists are designed to send you to a vet, not replace one. A score of eight or more on RHpE, or a sustained HGS score, is a vet conversation, especially if it lines up with a change in performance, gait or attitude under saddle. If your horse has started weaving, cribbing or wood-chewing for the first time, that is a management review that should also include a vet check for underlying pain, ulcers or dental discomfort.
A qualified equine behaviourist working alongside your vet is the right escalation for an established stereotypy, a confirmed pain history with a learned behavioural overlay, or severe handling and loading problems where the pain question has been ruled out. The behaviourist works on what your horse is being reinforced for. The vet works on what is going on physically. A behaviourist working without a vet on a behaviour that may be pain-driven is a workflow studies consistently flag as poorly suited to long-running cases.
Bottom line
Learn one pain checklist. Get turnout and social housing as right as your yard allows. Audit your tack and what your horse is actually being reinforced for. Treat the appearance of stress markers as data, not as your horse's personality.
References
- Dalla Costa, E., Minero, M., Lebelt, D., Stucke, D., Canali, E., & Leach, M. C. (2014). Development of the Horse Grimace Scale (HGS) as a pain assessment tool in horses undergoing routine castration. PLoS ONE, 9(3), e92281. DOI: 10.1371/journal.pone.0092281.
- Gleerup, K. B., Forkman, B., Lindegaard, C., & Andersen, P. H. (2015). An equine pain face. Veterinary Anaesthesia and Analgesia, 42(1), 103-114. DOI: 10.1111/vaa.12212.
- Taffarel, M. O., Luna, S. P. L., de Oliveira, F. A., Cardoso, G. S., et al. (2015). Refinement and partial validation of the UNESP-Botucatu multidimensional composite pain scale for assessing postoperative pain in horses. BMC Veterinary Research, 11, 83. DOI: 10.1186/s12917-015-0395-8.
- Dyson, S., Berger, J., Ellis, A. D., & Mullard, J. (2018). Development of an ethogram for a pain scoring system in ridden horses and its application to determine the presence of musculoskeletal pain. Journal of Veterinary Behavior, 23, 47-57. DOI: 10.1016/j.jveb.2017.10.008.
- Wathan, J., Burrows, A. M., Waller, B. M., & McComb, K. (2015). EquiFACS: The Equine Facial Action Coding System. PLoS ONE, 10(8), e0131738. DOI: 10.1371/journal.pone.0131738.
- Schmucker, S., Preisler, V., Marr, I., Krüger, K., & Stefanski, V. (2022). Single housing but not changes in group composition causes stress-related immunomodulations in horses. PLoS ONE, 17(8), e0272445. DOI: 10.1371/journal.pone.0272445.
- Hartmann, E., Søndergaard, E., & Keeling, L. J. (2012). Keeping horses in groups: A review. Applied Animal Behaviour Science, 136(2-4), 77-87. DOI: 10.1016/j.applanim.2011.10.004.
- Fenner, K., Yoon, S., White, P., Starling, M., & McGreevy, P. (2016). The effect of noseband tightening on horses' behavior, eye temperature, and cardiac responses. PLoS ONE, 11(5), e0154179. DOI: 10.1371/journal.pone.0154179.
- Kienapfel, K., Link, Y., & König von Borstel, U. (2014). Prevalence of different head-neck positions in horses shown at dressage competitions and their relation to conflict behaviour and performance marks. PLoS ONE, 9(8), e0103140. DOI: 10.1371/journal.pone.0103140.
- Sankey, C., Richard-Yris, M.-A., Leroy, H., Henry, S., & Hausberger, M. (2010). Positive interactions lead to lasting positive memories in horses, Equus caballus. Animal Behaviour, 79(4), 869-875. DOI: 10.1016/j.anbehav.2009.12.037.
- Sandberg, E., et al. (2025). Whip strikes and final-stage deceleration in harness racing. (Whip-and-race-outcome deceleration finding cited in mother article methodology, Equine Digest corpus 2025.)
- Nawroth, C., Brett, J. M., & McElligott, A. G. (2016). Goats display audience-dependent human-directed gazing behaviour in a problem-solving task. Biology Letters, 12(7), 20160283. DOI: 10.1098/rsbl.2016.0283.
- Proops, L., Grounds, K., Smith, A. V., & McComb, K. (2018). Animals remember previous facial expressions that specific humans have exhibited. Current Biology, 28(9), 1428-1432.e4. DOI: 10.1016/j.cub.2018.03.035.
- Hausberger, M., Stomp, M., Sankey, C., Brajon, S., Lunel, C., & Henry, S. (2019). Mutual interactions between cognition and welfare: The horse as an animal model. Neuroscience & Biobehavioral Reviews, 107, 540-559. DOI: 10.1016/j.neubiorev.2019.08.022.
- Padalino, B., & Riley, C. B. (2020). Editorial: The implications of transport practices for horse welfare. Animals, 10(12), 2333. DOI: 10.3390/ani10122333.
See also