EQUINE CLICKER TRAINING.....
using precision and positive reinforcement to teach horses and people
How to use Negative Reinforcement as a Clicker Trainer
This is the second in a series of articles on how clicker training can be used with the quadrants of operant conditioning. The first article (four quadrants) is an introduction to the four quadrants of operant conditioning and includes a basic description of each quadrant and how it can be used (or not) by clicker trainers. If you are not familiar with the 4 quadrants of operant conditioning and have not read that article, I suggest you do so before continuing on here. I do want to state that I have no formal training in learning and behavior theory, but I am very interested, so I have been reading, watching relevant DVDs, attending seminars, and doing some thinking on the subject.
It may seem odd to have one article about all the quadrants and another one about just using negative reinforcement, or combining it with positive reinforcement, but of all the quadrants, negative reinforcement seems to be the most complicated to use. The fact that it is complicated, combined with the fact that it is the type of operant conditioning most associated with traditional horsemanship, means that it merits a closer look. Most horse trainers have already been using negative reinforcement but that doesn't mean they understand it. What seems to be missing is a deeper knowledge of the nuances of using negative reinforcement so that it is a humane training tool. We all recognize when we see it being used well, but we don't all know how to learn those skills for ourselves.
I am going to start out with some basics about negative reinforcement. Some of this is a repeat from the last article, and I considered leaving it out, but I think it never hurts to go over the basics again and every time I read something again, I learn something new.
This article is long and packed with information and there are a lot of interconnected parts. I tried to refer back to previous sections that were relevant instead of repeating information but so much of training is interwoven that this was not always possible. I suggest that you read the whole article through the first time. If you want to find a specific section, I have made links so you can navigate around. The sections are not intended as "stand alone" discussions. They are:
What is negative reinforcement?
In operant conditioning terms, "negative reinforcement is an increase in the future frequency of a behavior when the consequence is the removal of an aversive stimulus" (Wikipedia.) Or according to Paul Chance, "in negative reinforcement, a behavior is strengthened by the removal of, or decrease in the order of intensity of, a stimulus. This stimulus, called a negative reinforcer, is ordinarily something the animal tries to escape or avoid."
I also looked up the word "aversive" and the definition used by psychologists is "aversives are unpleasant stimuli which induce changes in behavior through punishment." (Wikipedia) The article goes on to say that aversives can also be used in negative reinforcement. I looked around at other uses of the word aversive and found it was used by animal trainers to mean different things. All of them had the same general meaning but different people made different distinctions. I found one site that said an aversive was something the animal disliked but was not going to change behavior in the same way punishment would. I am sharing that interpretation because one important point about using aversives is to recognize that there are many degrees of aversives. I am not sure I agree with how that site defined aversive, but it shows that people have their own ways of describing aversives and that in common usage they do think of them as being related to, but not the same as punishment.
I think the simplest way to look at it is to say that an aversive is something an animal will work to avoid. This ties in with Paul Chance's definition of negative reinforcement, but I do want to point out that he says it is "ordinarily something the animal tries to escape or avoid." The reason I have spent time on definitions here is because I wanted to see if there was complete agreement that negative reinforcement always involves an aversive and that an aversive was always a bad thing. I think I was also hoping to find that negative reinforcement did not require an aversive because when I use negative reinforcement well, it doesn't feel like I am using an aversive. But instead I found that I needed to broaden my own understanding of what defines an aversive and recognize that it is just something an animal will avoid. Different things will be aversive in different situations and the definition of "unpleasant stimuli" covers a broad range from mild annoyance to pain. I think this is important for clicker trainers because we want to use negative reinforcement with an aversive that is on the milder end of this spectrum.
Another reason I looked up these definitions is because I have found that of all the quadrants, negative reinforcement seems to be the most confusing to precisely define, both in a technical sense and in the application. The name seems to be part of the problem, more so than some of the other quadrant names. Perhaps it is because people do use the term negative reinforcement without understanding it and it seems to get mixed up with negative feedback. I do sometimes hear people using the term negative reinforcement and I have never heard the average person talk about using positive punishment or negative punishment in any context except operant conditioning.
Some people have such an automatic response to the word "negative" that they can't get past that part. And some people think that if it is reinforcement, that means it is pleasurable to the animal (as opposed to reinforcing a specific behavior). Some trainers find it difficult to think of training in terms of removing something. It does not help that in application, negative reinforcement is often confused with positive punishment. But I also think one reason it is hard to understand is because there is such a range of ways to apply negative reinforcement and the look and feel of negative reinforcement can be very different depending upon the stimulus.
I want to remind you of the variety of ways in which negative reinforcement can be applied or occurs in everyday life. In the previous article, I listed a number of ways that negative reinforcement happens and I am going to include them again here. These examples come from a few different web sites (http://www.princeton.edu/~yael/LearningCourse/Notes/Examples.doc, http://www.utexas.edu/courses/svinicki/ald320/negrnf.html) and from my notes from various speakers and books.
If you want to read a great story about applying negative reinforcement to a work situation go to http://www.intropsych.com/ch05_conditioning/using_negative_reinforcement.html.
Even though there is a huge amount of variation in these examples, the list above shows that all applications of negative reinforcement have one thing in common: a behavior increases if it precedes the removal of a stimulus. The stimulus could range from something as mild as having the sun in your eyes to having a headache or taking on an unpleasant task. In negative reinforcement, the stimulus that is applied is called the negative reinforcer. For most of this article, I am going to use the word "stimulus" to refer to the negative reinforcer or aversive because it seems the most neutral term. At times I use "negative reinforcer" or "aversive" instead, either because it seems more appropriate or to make a point, so I want to clarify that all three terms refer to the same thing here.
In the previous article I wrote about the connection between negative reinforcement and positive punishment. If you did not read that part, I suggest you go back and do so as it is very relevant to using negative reinforcement as a clicker trainer. As a brief reminder here, I will just say that if I use negative reinforcement in a training situation, I often have to both apply and remove the stimulus. The negative reinforcer that I choose could also be acting as a positive punisher if the application of the stimulus decreases the behavior that was happening when I apply the stimulus. In simpler terms, when I use negative reinforcement, I have to be aware that I could be punishing one behavior at the same time I am reinforcing another.
The fact that the same stimulus can act as both a negative reinforcer and a positive punisher becomes important when it comes to choosing the stimulus and evaluating my training. It will come up again when I write about poisoned cues and escalating pressure. I wanted to mention it now because it relates to another aspect of negative reinforcement which I did not discuss in the previous article and this is the escape-avoidance aspect of negative reinforcement.
In some applications of negative reinforcement, the subject can learn to minimize the exposure to the stimulus (escape) or avoid it entirely (avoidance) and this is why negative reinforcement is sometimes called "escape-avoidance learning." Looking at the examples above, you can see how the negative reinforcer could be escaped from or minimized (by responding sooner) or avoided entirely by responding before the stimulus is applied. I was doing some reading in Paul Chance's book, "Learning and Behavior," and there seems to be some controversy over how negative reinforcement works because if negative reinforcement is effective, the subject can avoid the stimulus entirely.
Avoidance can happen if the subject finds a way to predict the application of the negative reinforcer before it happens. In this case, it would seem as if negative reinforcement was not being used at all, but avoiding the negative reinforcer is still driving the behavior. This can make it hard to see when negative reinforcement is still influencing behavior because the original stimulus is not presented at all in order to maintain the behavior. I think this has some implications for horse training and evaluating our use of negative reinforcement so I want to explain a bit more about "escape" and "avoidance."
The easiest way to understand this is to pick a common example. Let's say that my mother wants me to wash the dirty dishes so when I come into the kitchen, she starts nagging me to do them. If I do them when she nags me, then this is a successful example of negative reinforcement. Doing the dishes (the desired behavior) removes the nagging (stimulus) and I am now trained to do the dishes when she nags me. So far, so good, but depending upon how I feel about her nagging, this could change in several ways.
If I continue to find her nagging more unpleasant than doing dishes, then I will continue to do them when she nags. Over time I might start doing them after less nagging because I have learned that the sooner I start, the less nagging I have to hear (this would be escape). And if I still find any nagging unpleasant, then I might start to look for other indicators that I need to do the dishes and do them before she starts nagging. I might learn to do the dishes as soon as I see them in the sink and therefore avoid the nagging altogether (this would be avoidance). But this is assuming I could not avoid the situation entirely. In some cases, avoidance leads to the subject avoiding the stimulus condition under which the aversive occurs.
The progression from escape to avoidance is a common scenario in situations using negative reinforcement and whether or not a subject stays in escape mode or goes to avoidance mode depends upon a number of factors. One way to think about this is to realize that in some situations, negative reinforcement often has an element of choice on the part of the subject. I noticed this when I was looking at the examples because I found it interesting that some of them were more like deal making or negotiating than how I usually think of negative reinforcement. When negative reinforcement is applied to people in work related situations (job, school, chores at home), there is often an agreement up front about how it will work. Some common scenarios are that if you do this additional job/chore/project, then I won't make you do this other job/chore/project.
This implies there is an element of choice when using negative reinforcement and at first, I thought it only applied to this use of negative reinforcement (deal making.) But actually I think there is some element of choice in most applications of negative reinforcement, especially when the aversive quality of the stimulus is low. I could put up with my mother's nagging. I could watch the bad movie. I could put up with the headache and so on. This element of choice is part of what makes negative reinforcement an effective tool for clicker trainers. How does the subject choose whether to tolerate the stimulus or change his/her behavior to escape it?
In the examples, the subject is choosing between tolerating the negative reinforcer or making a change in his behavior. This is going to depend upon how the subject feels about both the negative reinforcer and desired behavior. If the negative reinforcer is easily tolerated and a change in behavior would be difficult, then negative reinforcement would not be happening at all. If the subject finds the negative reinforcer sufficiently aversive and prefers to do the desired behavior, then negative reinforcement can work well. What if the subject finds the aversive too strong? Then the subject might look for other options. I might work out a way to avoid the situation entirely. Another scenario would be if I find the aversive stimulus and the desired behavior equally aversive, then I am going to feel as if I have no choice and my behavior is going to be unpredictable, with many possible side effects.
Going back to the dishwashing example, if I found the nagging unpleasant and I really hated doing dishes, I might start looking for other options. I might start avoiding my mother by coming home later or when she was not there. In that case negative reinforcement would not have been effective at reinforcing the behavior of doing the dishes because the aversive actually reinforced a different behavior - staying away from home. Since staying away from home was reinforced, any behavior that would happen at home decreased. I think this is important because the subject's tolerance for the negative reinforcer has a huge effect on how effectively I can use negative reinforcement and whether or not there are unwanted side effects.
This example shows that using an appropriate stimulus is an important consideration when choosing to use negative reinforcement. If I am using negative reinforcement alone, I have options for how to change my training if my horse is not responding, but I am limited to changing some quality of the stimulus. I can make the stimulus bigger, smaller, longer, shorter or different in some other way, but I am only playing with one side of the equation. By the equation I mean looking at one side (the stimulus) vs. the other side (doing the behavior). The most common option is to increase the intensity or duration of the stimulus. This makes tolerating the stimulus less likely and choosing to do the right behavior more likely. An alternative would be to make the behavior easier for the animal to perform so that the animal is more likely to choose that option.
But I think the escape-avoidance aspect of negative reinforcement and the idea of choice also opens the door for the trainer to manipulate the situation in a positive way. If I add positive reinforcement to the equation, I have the option of increasing the reinforcement by removing the stimulus AND adding something the animal wants. This is adding something to the behavior side of the equation and this is how most clicker trainers use negative reinforcement. By adding something positive to the "doing the behavior" side of the equation, we can increase the likelihood that the animal will make that choice.
This also avoids the problem of desensitization, which can lead to cues that are constantly escalating. Increasing the intensity of the stimulus to get the behavior can lead into a never-ending cycle of desensitization on the horse's part followed by escalation on the trainer's part which leads to more and more abusive training practices. Trainers who are skilled in using negative reinforcement are always watching out for this and will avoid falling into this downward spiral. Even when using negative reinforcement, there are always ways to change it without escalating.
Adding positive reinforcement can make negative reinforcement more effective, but we are still using negative reinforcement and this brings us back to the escape-avoidance aspect of negative reinforcement. This is because once negative reinforcement is working, it is the subject's tolerance for the negative reinforcer combined with the ability to do the desired behavior that is going to determine whether the subject chooses escape or avoidance.
This is what caught my interest because, when I am using negative reinforcement with horses, I often end up with a cue that is related to how I used negative reinforcement to shape the behavior. For example, I can teach my horse to move off from leg pressure by using pressure and release. Over time, the horse learns to move off from a slight squeeze of the leg and this is now the cue. So leg pressure was used to prompt the behavior and is now used to cue the behavior. Looking at this from the escape-avoidance point of view, my horse is in "escape" mode meaning it tolerates a little bit of the stimulus before it moves off.
If you have ridden or trained more than one horse, you know that there is a lot of variation in how sensitive different horses are to leg cues. Some require a big kick, others a light squeeze and others move off as soon as you think it, or even before. This is for a lot of different reasons, but thinking about the escape-avoidance aspect of negative reinforcement gave me a bit of insight into this being more than just an issue of sensitivity or rider skill. Sensitivity and rider skill are important considerations, but I can look at it more scientifically too. If I want to use negative reinforcement to prompt, train and cue a behavior and I want to keep a version of the same negative reinforcer as the final cue, I need to make sure my horse doesn't find it so aversive that it wants to avoid it. I want to keep my horse in the "escape" range where it allows me to apply the cue and then changes its behavior.
I do want to mention that at some point, the behavior is "on cue" and I could argue about whether it is being maintained by negative reinforcement or through some other means. If I am positively reinforcing the behavior at times, the cue might no longer be aversive and the behavior is being maintained by positive reinforcement. I asked around to see if a behavior that was trained with negative reinforcement could ever be totally separated from its shaping history and the answer I got was "it depends." Some people felt that as long as the cue was one that evolved out of a shaping history using negative reinforcement, there was an implicit threat behind it. The animal is aware that I could make it more aversive if I wanted to.
I think that in truth, it is really hard to keep a cue that was shaped with negative reinforcement from staying tied to its shaping history. This is because when the horse does not respond to the cue, most people's tendency is to go back to negative reinforcement to get a response. That is not necessarily a bad training choice, but every time I do that, I am reminding the horse of the threat behind the cue. I am going to write a little later about some ways to avoid getting caught back up in using negative reinforcement once you have a behavior on cue.
This may seem like an academic discussion but it explained a lot to me about why my horses sometimes anticipate and I end up with the behavior disconnected from the cue. I think some of it comes from eagerness because I use negative reinforcement combined with positive reinforcement, but I think there is something else going on too. I think recognizing that escape and avoidance do happen can help me evaluate whether my horse is anticipating because of eagerness to earn reinforcement or whether it is trying to avoid the stimulus. My horse's body language and attitude are going to be important sources of information about what is driving any change in my horse's response to my cues.
When I am using negative reinforcement, I am using a negative reinforcer as a stimulus to prompt a change in behavior. This negative reinforcer could also be acting as a positive punisher. Being able to identify when I am punishing one behavior as I increase another behavior is an important step in learning to use negative reinforcement well. In the previous article, I gave some examples of using negative reinforcement with and without positive punishment and some guidelines for evaluating your own training.
In that article I spent time on the difference between negative reinforcement and positive punishment because I wanted people to see how they could move away from training that was based on stopping "bad" behaviors to training that was based on training "good" behaviors. For most people, that is an important step, but it is also important to learn how to use negative reinforcement as a teaching tool where it is being used to train new behaviors by prompting, guiding or giving feedback to the horse.
I am going to share some of the things that have worked for me and other people I know. Clicker training is a very flexible tool and there are going to be other ways to do things. I am hoping that sharing what I do will give people ideas to get started and they can move on from there.
Combining Positive and Negative Reinforcement
As I noted in the previous article, most traditional horse training uses negative reinforcement, but negative reinforcement has some drawbacks. When clicker training was first being applied to horses, a lot of the training was done using only positive reinforcement and the model of dolphin training was the one most people used. This was true of dog training too. A lot of early dog training emphasized free shaping and the animal was at liberty and able to choose to participate in the training or not. This worked very well for training many types of behaviors but was not quite the same as training a horse to be ridden. Unless horse people wanted to throw out everything and start over, this meant horse people had the interesting challenge of figuring out how to combine a positive reinforcement based training system with a negative reinforcement based training system.
The easiest way to do this is to use traditional training methods, often based on pressure and release, and add positive reinforcement on top of that. So when the horse gives the correct answer, the trainer responds as usual (by removing the "aversive" stimulus) and in addition, she clicks and reinforces. When training a new behavior the trainer might click and reinforce every correct effort so the horse is getting reinforced through negative reinforcement and positive reinforcement. Once the behavior is more established, the trainer might click and reinforce only the best efforts and just use the reinforcement provided by the removal of the stimulus to reinforce other efforts that were correct but not of the same quality.
This is what Alexandra Kurland refers to as "piggy-backing" and a lot of her early work is based on combining John Lyons' training methods with clicker training. For a lot of people, adding a click and treat to what they are already doing is a good way to start using clicker training. This allows new clicker trainers to start with a system that they already know and as they get more familiar with clicker training, they usually start to see places in their training where they can make changes in how they use negative reinforcement as well as where they can take advantage of the power of clicker training.
Alexandra Kurland's own approach to clicker training has evolved over the years. She has put a lot of time and effort into coming up with better and better ways to educate people about how to use negative reinforcement in such a manner that it does not require aversives, or shut down the horse's behavior. She has also come up with new ways to train behaviors using positive reinforcement alone that combine well with more traditional approaches. This has led to a form of horse training that has the benefits of both types of operant conditioning. If you are interested in learning more about her methods, I have written a lot about them on this site and she has her own web site (www.theclickercenter.com) as well as numerous books and videos.
But some people come to clicker training with a horse that has had bad experiences with traditionally applied negative reinforcement or that doesn't respond to it well in the first place. Others come to clicker training with a strong commitment to only using positive reinforcement. Or they might be learning how to clicker train horses after having previous experience with clicker training other animals. For these people, starting with a more positive reinforcement only focused approach works better. There can be a lot of emotional baggage associated with past training or using negative reinforcement and by staying with positive reinforcement, the horse and trainer can build a solid relationship and learn more about each other while still training new behaviors.
These days I find myself in the interesting position of being in the middle of these groups. On one hand, I can understand that there are some people for whom piggy-backing the clicker on to their own style of training is the easiest way to start clicker training and they are happy with the results. On the other hand, I can see the value in trying to shift to a more and more positive approach or even starting by avoiding any use of aversives in training. I think this is an area we need to explore more and I am thrilled when someone is willing to put the time and energy into training behaviors with the emphasis on positive reinforcement alone. At the same time, I am a practical person and I value training that is effective and not stressful for the horse, regardless of what quadrant I am using.
This means if you put me in a room of clicker trainers who say you can only use positive reinforcement, I will find myself arguing the value of negative reinforcement. But if you put me in the middle of a group of negative reinforcement/punishment based traditional trainers, I will be the one arguing for using positive reinforcement. This is not a question of there being a right answer because every horse and person combination is different and we all have different goals. That is one reason I think it is important to become educated about our training options and find a system that works for you. Good trainers are always changing as they learn new things and we are all at different stages in our journey.
I started out as a traditional horse person and many of the horse skills I have learned over the years involved negative reinforcement, and even punishment. When I found clicker training, I was thrilled to learn there might be ways to train horses with a strong focus on positive reinforcement. I started out by doing some shaping work that was only positive reinforcement, but I also added the click and treat on to what I was already doing, which was basic dressage training with some trail riding and jumping. I had mixed results until I started working with Alexandra Kurland and learned more about how to use negative reinforcement as a clicker trainer. She uses negative reinforcement as part of her training plan but the emphasis is on finding and rewarding the good moments. She also does not escalate when using negative reinforcement but gives the horse time to find the answer. If the horse gets stuck, she breaks it down into simpler pieces. I learned more about using Alexandra's techniques for combining negative and positive reinforcement and I am very pleased with our progress.
In the last few years, I have started playing around with variations on her exercises and trying some of my own things and this has helped me understand how to become an even better user of negative reinforcement as well as how flexible clicker training is and how the same behavior can be trained in so many different ways, with varying combinations of positive and negative reinforcement.
The rest of this article is going to focus on the progression from a negative reinforcement based training system to a more positive reinforcement based training system. I am writing this with the traditional horse person in mind who is now trying to apply clicker training to their horses and needs a bit of a "road map" for how to get to the point where they are using the full potential of clicker training, or to find the place where they are comfortable with the blend of positive and negative reinforcement that they are using. I think this is a process. I think there are some people who can jump in with both feet and only use positive reinforcement and maybe they will find some ideas in here too. I am hoping this will give people a way to evaluate where they are and give them some idea of the next level that is around the corner. Sometimes people don't know what is possible because they don't know how to look for it.
I am going to start at the beginning with the simplest way to add clicker training to your program. This does not mean that these are the first steps you would take to clicker train your horse. I always recommend that people start with targeting and train some simple behaviors before they try to work on current training issues. There are a couple of reasons for this. First, it takes a new clicker trainer a while to develop good timing and a sense of how to shape behaviors so that the horse doesn't stall out and get frustrated or get stuck at an intermediate step along the way to the finished behavior. It is better to practice and learn these skills on behaviors where there is no previous baggage or emotional investment. But even as a more experienced trainer, I start with simple things because I want the horse to be motivated to play the game before I ask for any behaviors that might be difficult either physically or because of past emotional baggage. If you are not sure where to start, Alexandra Kurland has foundation exercises that teach practical skills to both horses and handlers.
Once you are past the beginning stages and your horse understands about the click and treat, what do you do next? You can take the foundation exercises and use them as building blocks for a lot of other behaviors (this is the subject of another article, coming later). Or you can look at what you already do and see if it can be improved by adding positive reinforcement. This usually means you are adding positive reinforcement on to a negative reinforcement based program and I think that is fine, provided you follow a few basic guidelines. The nice thing about adding positive reinforcement is that it will make you better at using negative reinforcement and over time, if you are conscientious and observant, your program will shift toward more emphasis on positive reinforcement.
You might be wondering if there is value in becoming better at negative reinforcement if you are going to shift to a positive reinforcement based training system, but I think there is. I think that if we are going to work with horses in any manner where we are connected to them, either by a lead rope or sitting on them, then we owe it to our horses and ourselves to become very sophisticated users of negative reinforcement. While I might be able to set up my training and teach new behaviors using only positive reinforcement, when I ride there are going to be times when I end up using negative reinforcement. And there might be situations where using negative reinforcement is the most effective way to train. My goal is to have a horse that understands about negative reinforcement so that I can use the minimum amount of stimulus to get a response in most situations. But I also want my horse to be ok with those times when I do have to use a stronger stimulus so that in case it encounters extra pressure, it does not panic.
I mentioned earlier that just adding positive reinforcement to a training program based on negative reinforcement can make a significant change by helping trainers become more skilled at using negative reinforcement as well as by making the training experience more pleasant for the horse. There are some other advantages to piggy-backing clicker training on to an existing training program as compared to using a traditional negative reinforcement based system alone. Some of these are:
1. Horses learn faster because the click clearly marks the right moment and the reinforcement makes them motivated to repeat the behavior or keep looking for the right answer.
2. Adding in the click makes the trainer more aware of the timing of the release which is important for training using pressure and release.
3. The act of stopping, delivering and allowing the horse to get his reinforcement creates a pause in the training process which gives both horse and trainer time to think.
4. Eating itself (if that is the chosen reinforcement) is reinforcing for most horses and food is an easy reinforcer to use once horses are past the initial stages.
5. Clicker trained horses are looking for the right answer and give excellent feedback to the trainer on how clear they are with their cues and what body language they pick up on. Because they are looking for the answer, they often anticipate, and this teaches handlers to have excellent body awareness and body language and teaches them a lot about cues.
6. There is an advantage for a trainer in starting with a system they already know as it means they are not learning as many new things at once and they can be more consistent.
7. Because the emphasis is not on "making the horse do it", we give the horse time to respond to our request and we create a willing partner.
8. The use of positive reinforcement makes the stimulus less aversive, partly through classical conditioning. This is good in that it makes some horses more tolerant of people when they are learning basic riding skills and mechanics.
Unfortunately there are some drawbacks too:
1. If your horse has had bad experiences with a training system, all that emotional baggage is going to still be present and it might make it harder to get the horse interested in clicker training.
2. Any training system that relies heavily on punishment or escalating negative reinforcement is not going to combine well with clicker training. In order for clicker training to work well, the horse has to want to participate and a horse that has been punished a lot is not going to want to interact with the trainer or offer behavior.
3. A trainer who is very familiar with a specific training program might find it is hard to change old habits.
4. Most training programs do not break behavior down into enough small pieces so you end up relying on capturing behaviors instead of learning how to shape them.
5. Most training programs have not only procedures and exercises but also a philosophy. If the philosophy is not compatible with clicker training, then it is hard to make the mental shift that needs to happen to fully embrace clicker training.
6. Using a lot of negative reinforcement does not take advantage of the power of clicker training. It is what Alexandra Kurland calls "sugar coating same old, same old." Yes, it is better than using only negative reinforcement, but it does not necessarily create a thinking partner. What you get are horses that are good at following directions, but not necessarily horses that are really operant.
In the list of drawbacks, I noted that some training programs are not going to combine well with clicker training. These are usually programs that use strong aversives to change behavior. This could be punishment or it could be negative reinforcement that escalates well beyond a horse's tolerance zone. Adding positive reinforcement to these programs can lead to very mixed results unless the program is modified to be more clicker friendly. Behavior that is trained by force, scaring the horse, or sending the horse into flight mode is not going to combine well with clicker training because an important component of clicker training is giving the animal freedom to experiment and encouraging the horse to think.
The term "poisoned cue" was first used by Jesus Rosales-Ruiz and his graduate students at the University of North Texas when they studied the effects of combining positive and negative reinforcement to train the same behavior. What they found was that training a new behavior with a combination of negative and positive reinforcement was less effective than using positive reinforcement alone. While the dog learned the behavior in both cases, the dog's attitude, performance and emotional response to the cue was different when negative reinforcement was used to train it. This has led to further work on poisoned cues and what happens when you mix positive and negative reinforcement. I was able to attend sessions on the poisoned cue at Clicker Expo in 2006, 2007 and 2008 and saw the video of the training sessions they presented. The video clips clearly showed that training a behavior with a combination of positive and negative reinforcement can be problematic. I do want to note that if they had included training using negative reinforcement alone, it would be easier to see if the cue was poisoned by any use of negative reinforcement or if it was the combination of positive and negative reinforcement that caused the dog's change in behavior and decreased rate of learning.
In the study they trained a dog to come using only positive reinforcement (a treat) and compared it to training the dog to come using a combination of negative reinforcement (tugging the leash) and positive reinforcement (a treat). When the cue was trained with only positive reinforcement, the dog would eagerly come up to the trainer when the cue "ven" was given. The dog looked happy and animated with its tail wagging and bouncy steps. But when the cue "punir" (trained with the combination of positive and negative reinforcement) was given, the dog came with droopy tail and a depressed attitude. The addition of the treat was not enough to overcome the dog's emotional response to the leash pull.
In the next part of the experiment, they wanted to see if they could use the cue as a reinforcer. One of the qualities of a cue that has been trained with positive reinforcement is that you should be able to use it to reinforce another behavior. If the cue is associated with positive reinforcement, then hearing the cue should be reinforcing to the dog because it indicates a chance to earn reinforcement. So in the next experiment, they did separate trials using either the cue "ven" or the cue "punir." The cues "ven" and "punir" were used to mark a specific behavior as part of the shaping process or to capture the behavior. The target behaviors were going to a specific location or backstepping. In this process, instead of clicking, the trainer said either "ven" or "punir" and then clicked and rewarded the dog for coming. Then they measured to see if the target behavior increased.
The results showed that the cue "ven" acted as a reinforcer because the dog started going to the marked square, but the cue "punir" did not act as a reinforcer because the dog's behavior did not change in any consistent way. In the trials with "punir," the dog either stayed close to the handler, drifted aimlessly around or tried to offer the behavior that had been reinforced by "ven." In the trials with "punir," the dog was also very subdued. This is only the first part of the study; they did some more experiments afterward to further explore using "ven" and "punir" and see if there were other variables that were affecting the dog's response.
The conclusion presented at the first Clicker Expo where I saw this was that it was the combination of positive and negative reinforcement that poisoned the cue. Poisoning the cue means that the cue is no longer just a possibility to earn reinforcement. It is also a threat that an aversive is coming. In a later Clicker Expo, I noticed the wording had changed slightly and it was now a combination of positive reinforcement and an aversive that poisoned a cue. I noticed this because it meant that it was not any application of negative reinforcement that could poison a cue, it depended upon the aversive. And this makes sense. Whenever I use negative reinforcement, I run the risk of using positive punishment too.
This study has sparked a lot of interest among horse clicker trainers because so many of us do use negative reinforcement and it has made us all look more closely at how we use it, which is a good thing. The first time I heard the presentation, I thought it meant that any combination of positive and negative reinforcement was going to lead to the kind of depressed and ineffective training that the study showed when the cue "punir" was trained. But it turns out that the study was looking at a very specific way of combining positive and negative reinforcement. It was designed to answer the question of what happens if you use positive reinforcement as a consequence for correct answers and negative reinforcement as a consequence for incorrect answers or to "make" the behavior happen, and it was comparing this to what happens if you only reward correct answers. In simple terms, is it easier to learn something if you are told "yes" and "no" as compared to if you are just told "yes"? A positive reinforcer for "yes" and a negative reinforcer for "no" were added as consequences to see if they made the difference between "yes" and "no" that much clearer.
I do think this is important research and horse people should take note of it because a lot of crossover trainers embrace the idea of adding positive reinforcement, but are reluctant to totally give up the idea of needing to react when the horse is wrong. They end up falling into a situation where they are rewarding correct responses and reacting with punishment or an aversive to incorrect responses. But I have to wonder about a few things. For starters, the dog in the study had no previous experience with leash tugs and its past experience had been with positive reinforcement whereas most horses I meet have already been trained with negative reinforcement. Most of these horses do very well with the addition of positive reinforcement. Is adding positive reinforcement to the training of an animal that has been exposed mostly to negative reinforcement different than adding negative reinforcement to the training of an animal that has been exposed mostly to positive reinforcement? What would happen if you compared positive and negative reinforcement with using negative reinforcement alone?
I also want to make a point about the poisoned cue study and how the cues were trained. In the trials with positive reinforcement only, the trainer said the cue "ven" and if the dog met the criteria, the dog was reinforced with a treat. In the trials using positive and negative reinforcement, the trainer said "punir" and if the dog met criteria, it was reinforced with a treat. If the dog did not meet criteria, a leash drag was used to pull the dog into position, where it was then reinforced with a treat. I would say the dog was essentially being "paid" for being pulled into position. I did not see the dog learning that the leash drag meant it could earn reinforcement or learning how to avoid the leash drag. In fact, they changed the leash attachment from a collar to a harness and then discontinued the leash drags in the "punir" study because they were concerned about harming the dog. That implies that the leash drag was not working as an effective reinforcer for coming toward the handler. What the leash drag did increase was the behavior of staying by the handler. In later studies, they had trouble getting the dog to go far enough away in order to ask it to come. That makes sense too. The reinforcement was being delivered when the dog was in position near the handler.
So putting aside the emotional considerations about training with negative reinforcement, why were the leash drags so ineffective as a training tool? Is this a problem with negative reinforcement or with this application of negative reinforcement? From my observations and reading the study, I think the use of negative reinforcement was unsuccessful for at least two reasons. One was the timing of the introduction of the verbal cue and the other was the use of the leash. The leash was used to pull the dog into the final position. It was not used to teach the dog to come from gentle pressure on the leash through shaping or successive approximations. I think this is just a classic case of lumping. If the handler had applied a bit of tension, waited, and then released for any movement toward the handler, the results might have been different.
I also think that using the cue early as part of the shaping process meant the cue was associated with the aversive use of the leash. If the leash had been used differently to ask the dog for a change instead of to make the dog change, it might not have mattered that the cue preceded the leash drag. But because the cue was added so early when the behavior was still being learned and the leash was still being used in an aversive way, the cue was permanently associated with the unpleasant part of training the dog to respond to the cue "punir." The cue became a predictor of punishment, not reinforcement.
I am not writing this to criticize or discount the poisoned cue study which I think is very important. The reason I am going into this much detail is that I think the poisoned cue study is a great opportunity to study one way to combine positive and negative reinforcement and because it is so well documented, it is easy to look at lots of variables to see what horse trainers want to do differently to avoid getting the same results. I think it does clearly show how the punishment aspect of negative reinforcement can be so aversive that it overshadows any attempt to soften it by adding positive reinforcement. I wanted to be clear about the set-up because it shows us ways that we can do things differently and avoid having the same results.
I noted earlier that the definition of a poisoned cue had changed slightly from a combination of positive and negative reinforcement to a combination of positive reinforcement and an aversive. Recently I find that people are referring to poisoned cues as any cue that has been associated with an aversive, punishment or a "correction." I am not sure if this is accurate, but if it is, then the problem with poisoned cues is not the ambiguity of the cue, but just that it has been associated with an aversive. I hope to get more information on this at the next Clicker Expo.
I do have some concern about putting the emphasis on poisoned cues on the ambiguity because that would imply that we are better off using positive reinforcement alone or negative reinforcement alone. Someone asked me once if they were better off just using negative reinforcement so the horse did not feel ambiguous about the cue. I certainly don't want to start recommending that. I think what we need to do is look more carefully at how we use negative reinforcement in training new cues and in maintaining behavior. If I want my cues to remain as green lights for reinforcement, then I need to be careful about how I react if the horse does not respond to the cue. I think the key points to remember about poisoned cues are that they are out there but that we can avoid creating them if we are aware of how we use negative reinforcement in our training.
I hope there will be more research on poisoned cues, specifically on how to avoid them or how to un-poison them. What I have been told is that the behavior is not poisoned, it is the cue that is poisoned. Jesus Rosales-Ruiz recommends that you just choose a new cue. Do you have to retrain the behavior to avoid the association with the old cue or can you just add a new cue? He doesn't think retraining is necessary, but I imagine this is one of those areas where it depends. In the initial study, the dog was trained using positive reinforcement and an aversive for about 50 trials. After that, only positive reinforcement was used. Even after 100 or 150 trials, the dog's emotional response to the cue was unchanged. Just removing the negative reinforcement and continuing to use positive reinforcement was not enough. Would it have worked if they had continued to use positive reinforcement for a longer period of time? I don't know. But based on that, I think in most cases it is better to reshape the behavior or choose a new cue rather than try to un-poison the existing one.
One way to think about this is to think of some things that might be poisoned cues for you. In his presentations, Dr. Rosales-Ruiz says that for most of us, our name is a poisoned cue. Thinking about my own name, I can see how the difference between a poisoned cue and a non-poisoned cue can be very subtle. I respond differently to different people saying my name. Depending upon who says it, how they say it and my previous experience with what happens after I hear my name, I have different emotional responses. So it is not just my name itself. It depends upon context. Using our names as an example also shows why it is so hard to un-poison a cue.
Let's say I have a friend who sometimes phones to yell at me and sometimes calls to invite me out to do something fun. Because I sometimes like hearing her voice say my name and I sometimes cringe when I hear her say my name, her voice saying my name is a poisoned cue. If she changes her ways and now only calls me to invite me out, I might find that my emotional response to her voice changes over time, and if she only calls to invite me out for a LONG time, I might start to feel pretty positive about her calling me. But I am never going to feel the same way as I do about a friend whose phone calls I have always enjoyed. And if after some period of time, she (the first friend) calls me up and yells at me, that ambivalence is going to be back and it might take even longer to get rid of it. Or if this happens enough times, I might find that both emotional responses are muted. I don't have a strong emotional response when she yells at me or when she asks me to do something fun.
Does it matter when she starts to offer different behaviors (consequences) after I respond to her saying my name? Just thinking about this, I think it might. If she calls me a number of times to invite me out and then calls to yell at me, I am going to feel better about her than if she calls to yell at me the first time and then calls to invite me out. To be honest, if she calls me back after the first time and I have a choice, I might not even pick up the phone. Even if she calls the next few times to invite me out, my gut reaction when she calls is going to be a negative emotional response. I think of this as being about first impressions and they are very important. I am going to come back to how to create good first impressions later. In this case, the ambivalence might end up affecting everything about her, so it will go beyond hearing her say my name.
Back to horses and what you do if you have a poisoned cue. Sometimes I can't change the cue. If I have a horse that has had bad experiences and associates them with a saddle, bridle, halter or some other standard equipment, I might not have the option of choosing a new cue. In that case, I think I would play around with doing things differently. Horses can be surprisingly particular about the context of things. One solution to a girthy horse, once any real issues have been resolved, is to girth the horse up on the other side. Tightening the girth on one side triggers anxiety and old responses. Tightening the girth on the other side is fine. If my horse's halter is a poisoned cue, can I change the halter itself? I don't think we know if that works, but it would be one option to try. Can I change the way I present the halter, change the order I do things, or teach my horse another unrelated behavior using the halter so the halter is now associated with a cue to do something fun too?
Another approach to dealing with a poisoned cue is demonstrated in Alexandra Kurland's DVD "Overcoming Fear and the Power of Cues." She is working with a horse that has become scared of the saddle because of an accident. She teaches the horse head lowering and then uses the saddle as a cue for head lowering. The horse learns that by dropping her head, she can make the saddle go away. Once she learns she can control the saddle and that this leads to the saddle moving away and the addition of positive reinforcement, she starts to accept the saddle. Using the saddle as a cue for a positively reinforced behavior changes the meaning of the saddle from an aversive cue to a more positive cue. I am not sure if the saddle was a poisoned cue to the horse in the sense that it was ambiguous but it certainly was associated with a negative consequence.
I think we can all agree that we don't want our horses to find a cue aversive because we want them to associate us and our cues with good things. Poisoned cues can damage the relationships we have with our horses. One of the reasons many people choose clicker training is because they want the animal to be a willing partner. But beyond this, there are some problems with poisoned cues that are particularly relevant to clicker trainers.
To understand the significance of poisoned cues for clicker trainers, you have to understand that in clicker training, a cue means that if the animal performs a certain behavior in response to the presented cue, it can earn reinforcement. A poisoned cue means that when the cue is presented, the animal can earn reinforcement if it does the behavior correctly OR it can expect some kind of aversive if it does not perform the behavior. Because the cue is no longer just an indicator that something good could happen, the cue itself becomes ambiguous. To the animal, it now predicts either reinforcement or punishment and this means that the animal has a mixed emotional response to the cue.
I don't think most horse trainers put as much emphasis into having horses that love to be trained as clicker trainers do. In traditional training, there does seem to be an attitude that whether or not the horse responds is the first priority and the emotional response is secondary. I am not trying to bad mouth traditional trainers or say they don't want happy animals, but I do see many who have accepted that training horses is about teaching them to do what the trainer wants and that the emotional response to the cue is secondary. Their main concern is if the animal responds to the cue correctly. But for clicker trainers, the emotional response to the cue is very important because we want animals to love cues, we want them to view cues as the doorway to reinforcement and we don't want anything to decrease the enthusiasm our animals have for cues. The more we rely on positive reinforcement, the more important it is to have animals that look for and love cues.
One way a poisoned cue can be a problem is in creating a chain or sequence. In a chain, each behavior is being reinforced by the cue for the next behavior. Being asked to do another behavior means the animal has another opportunity to earn reinforcement. This only works if the cue itself is reinforcing so that when the animal is cued, it has a positive emotional response to that cue. If the animal does not have a positive emotional response, then the cue is not going to reinforce the previous behavior and the chain or sequence will fall apart. In addition, poisoned cues can lead to problems with reliability, attitude, and the speed of learning. In the poisoned cue study, not only did the dog show a significant difference in attitude when the poisoned cue was trained, but the dog did not learn the cue as quickly or with as few errors as when it was trained with positive reinforcement only.
When I first heard about poisoned cues, I was already using Alexandra Kurland's methods which combine negative and positive reinforcement and it made me re-evaluate what I was doing. The good news is that because of how she uses negative reinforcement, Alexandra minimizes the likelihood of creating poisoned cues which means it is possible to train with negative reinforcement without creating them. This was encouraging to me and by looking at her system and doing some experimenting on my own, I think there are some general guidelines that a trainer can follow to use negative reinforcement in her training without creating poisoned cues or poisoning existing cues.
If you are interested in learning more about poisoned cues, Alexandra Kurland is working on a DVD that will be on this topic and it should be out in 2009.
As a clicker trainer, I use negative reinforcement in three different ways. I use it to prompt or generate behavior, to redirect unwanted behavior, and to maintain behavior. From a scientific viewpoint, these are all the same thing in that I am increasing a behavior, but I find it useful to recognize that I can be using negative reinforcement to address different aspects of training. In most teaching situations, I am using it to prompt or generate behavior. I apply a stimulus, wait for a change and remove the stimulus. I need to be careful about my choice of stimulus to avoid using punishment. Once a behavior has been learned, or is past the early stages, I am going to be using negative reinforcement to maintain it. And in some situations I am going to be using negative reinforcement to redirect a horse from doing an unwanted behavior.
In all these cases, I am choosing to introduce a stimulus so that I can use negative reinforcement to reinforce a desired behavior. It is worth pointing out that there are times when I will use negative reinforcement because there is something about my training situation that allows me to use the removal of an item, or the addition of distance (between the horse and the aversive), to facilitate my training goals. I can take advantage of negative reinforcement when I am introducing my horse to scary items. If he stands while I approach with the scary item, he gets a click and treat (positive reinforcement) and I remove the item (negative reinforcement). If my horse is scared of a fixed object or location, I can reinforce the horse for behaviors I like (approaching it, standing quietly, etc.) by allowing him to put more distance between him and the item as the reinforcement.
In this manner negative reinforcement can be used alone or in combination with positive reinforcement to work through fear issues where the emphasis is not on a specific behavior or interaction with the object but more to reinforce acceptance or reduce stress associated with the stimulus. Since this is a different use of negative reinforcement than the main focus of this article, I am not going to go into more details here. I just mention it here because I think it is important to recognize when a powerful negative reinforcer is available, and that taking advantage of it can be part of an effective training strategy.
The rest of this article is going to focus on the application of negative reinforcement to generate, redirect, and maintain behavior. There are similarities in how I use negative reinforcement in all three situations but there are some differences and some choices I need to make to minimize any effects of punishment if that is important to me. I am going to start by talking about the "aversive" stimulus because that is the common element in all these uses of negative reinforcement.
An aversive is something an animal will work to avoid. Negative reinforcement often works because the animal is motivated to avoid or remove something that the trainer adds. There are all levels of aversives. If you go back to the section on negative reinforcement, you can read about some examples of aversives. Some aversives are just annoying (nagging, loud music, boring class). Others can be downright painful (headache, loud fire alarm, various kinds of physical contact). We can all think of some ways in which negative reinforcement is used with horses that make it similar to punishment. When I tap my horse with a whip to ask it to go forward, am I using negative reinforcement or punishment? Is the horse looking for the right answer or avoiding a consequence? If I use the whip effectively and with good timing to increase another behavior, then I am using negative reinforcement too, but depending upon how I use the whip, will there be unwanted side effects?
Luckily for us, most horses will tell us when we are using something that they find aversive and we can use the horse's reactions to help us use negative reinforcement with a minimal use of aversives. In the poisoned cue study, the dog's whole attitude indicated that there was something unpleasant going on. They measured which behaviors were increasing and decreasing. The leash pull that was applied as the negative reinforcer was aversive enough that a lot of other behaviors decreased such as tail wagging and certain types of movement. They noted that in the trials using "punir," the dog started staying nearer and nearer the handler, trying to avoid or escape the leash pull. I think that seeing a decrease in a behavior (moving around the room freely) means that in addition to using negative reinforcement to train the dog to come, punishment was going on too.
This section is titled "be careful about choosing aversives" and I could have said be careful about choosing your stimulus. I prefer to use the word "stimulus" when writing about negative reinforcement because I don't think every application of negative reinforcement uses aversives and aversive is such a loaded word. But I am using the word aversive here because I think that it is important to remember that there are going to be times when the stimulus is an aversive. It would be nice if we could use negative reinforcement without using aversives at all, but I am not sure that is possible. There are always going to be some situations in real life where we don't have a lot of options and using an aversive is the best choice for various reasons.
I do think that it is possible to keep aversives to a minimum by choosing them carefully, monitoring the horse's response and educating the horse so that it accepts the stimulus as a request for a change in behavior and does not just perceive it as punishment. If I do use an aversive, I want to be aware of what I am doing and plan it carefully so that I do not end up using a stimulus that is more aversive than necessary. Any use of stronger aversives is kept to a minimum and if I have to keep using a strong aversive or find myself escalating, then I need to come up with a different training plan. Using one once to get a horse's attention "might" be ok, but I sure don't want to stay there long and I would rather not go there in the first place.
If I want to use negative reinforcement to prompt or initiate behaviors as part of clicker training, then I have to choose my stimulus carefully. There are lots of ways to approach this problem, but here are three strategies for choosing a stimulus that is not aversive. One is to choose something that has no meaning and teach the horse to associate it with positive reinforcement. Another is to use the stimulus at a level below that at which the animal finds it aversive, and the third is to desensitize the horse to it through positive reinforcement. In some ways, these are all the same thing as they are all about taking a stimulus and changing its meaning or association. They are just three different degrees of the same thing. In the first case, the horse has no response to the stimulus so I want to give it meaning. In the second case, the horse is overly sensitive to the stimulus but I can still use it if I am careful. And in the third case, the horse is so sensitive to the stimulus that I can't use it at all unless I spend time desensitizing the horse to it.
In the first case, I can choose something that has no meaning to the horse and use positive reinforcement to give it meaning. It is hard to find something that has absolutely no meaning, especially once your clicker trained horse catches on to this game. But the idea here is to take something that was neutral and give it a positive association and then use it to train other behaviors. It becomes a prompt for new behaviors and you can use it in a manner that looks similar to negative reinforcement (where the removal of the stimulus is what the horse is looking for) but in reality you are already using the stimulus as a cue or prompt. In this case, the horse is looking for the removal of the stimulus because that means it is going to get positively reinforced, not because it wants to get rid of the stimulus.
I can place my hand gently on the horse and click and remove it a number of times and this shows the horse that the hand has meaning. Then, with a clickerwise horse, I can start looking for small changes. I have done this kind of thing for years and just thought of it as a way of letting the horse know the game is on. But now I realize that it is more than that. It is a way to give a stimulus meaning and start out training with the horse in thinking mode from the beginning.
On the other hand, if I have a stimulus that I really want to use because it seems to be the most useful or appropriate, but my horse finds it aversive, I can often train the horse to respond to it by keeping its use below the level at which the horse finds it aversive. A horse usually finds something aversive because of the way it is used or applied and by recognizing early signs of discomfort, I can work within the horse's comfort zone. Using the whip as an example, if I have a horse that is afraid of a whip but I want to use it, I might spend some time exploring my horse's response to the whip. Is the horse ok if I just hold the whip? Is the horse ok if I move it a little? Can I move it around more or wiggle it? I can click and reinforce the horse for moving off from a tiny movement of the whip where I have the horse's attention but it is no longer alarmed. The more I reinforce the horse for responding to little changes in the whip's movement, the more the horse will accept it.
But if I have a horse that is so fearful of the whip that I cannot work within its comfort zone, then I have to do more desensitizing. There are lots of ways to do this and I am not going to go into it here, but I could just get the horse used to me carrying the whip and doing nothing. I could teach the horse to associate the whip with positive reinforcement through using it as a target or as an object to interact with in a game such as fetch. Desensitization can take some time and I find that if a horse is fearful of an object, some of that will go away as the horse learns to trust the handler more and gets more settled because of the positive environment of clicker training. If I have a horse that is really unhappy about me using something, I often just leave it alone for a while and revisit it every now and then to see if anything has changed and the horse is now more ok with it. There are always other ways to train behaviors so if my horse doesn't want me to use something, that's ok.
How successful this is going to be depends upon several factors and goes back to the idea of poisoned cues. If my horse is a bit unsure about a stimulus but I know it has no previous bad history with it, I will spend time desensitizing the horse or using the stimulus in other situations where it is only associated with positive reinforcement. But if my horse has a lot of anxiety about the stimulus, I am probably going to end up creating poisoned cues because that stimulus is now going to be associated with both positive and negative reinforcement. It is possible to work through this, but it helps to recognize the problem going in.
Touch is often used as an example of negative reinforcement and certain kinds of touch can certainly be aversive. But with most animals, there is a level of contact that they will accept. By working within that comfort zone, we can teach the horse that touch is not a bad thing. It is just a cue to do something. If the horse learns that responding to the touch earns reinforcement, the horse's perception of touch will change over time. The simplest way to think about this is if you want to use a stimulus to train a horse, the horse has to accept it and remain in thinking mode when you use it to prompt new behavior.
Alexandra Kurland uses this method in a lot of her exercises. She has the trainer wait the horse out by applying the stimulus at a level below that at which the horse would normally react and then waiting. Even a mild feel on the rope or touch on the side will cause the horse to want to change if you wait long enough. The horse moves or responds and she removes the stimulus. This is still negative reinforcement at work, but the punishment aspect is so diminished that the horses quickly start focusing on what earns the click and the stimulus is not perceived as aversive. By stabilizing and waiting, we give the horse time to respond to our request and we create a thinking horse instead of one who just learns to let us pull her around.
Of course in order to do any of this successfully, you have to be able to tell what your horse finds aversive. I find that most horses will change their posture, facial expression, or use other horse body language to tell me when they don't like something. Some common responses are obvious ones such as ear pinning, neck snaking, swishing the tail, kicking out or any kind of threatening or aggressive move. But there are more subtle ones too. Sometimes I just see some tension in the face, wrinkles around the muzzle area or a worried expression. My horse might become slightly high-headed or inverted. The feet might start to move. If I have a new horse, I can start to pay attention to how the horse is in different situations and I will start to be able to read the stress level and what behaviors are associated with tension and anxiety.
Sometimes it is hard to know exactly what is going on and I will just monitor certain things to see if the horse gets better or worse. For the first few months when I had Stella, she had an odd movement she made with her mouth, a bit like a combination of grinding her teeth and snapping her jaw, but quieter. I never saw her do it except when we were training and at that point, she would do it when we were free shaping with only positive reinforcement as well as when we were working on other behaviors, so I was pretty sure it was not related to using aversives. It was a good indicator to me for when she was feeling anxious and as her training progressed, it went away. If it had been associated with any particular cue or behavior, I would have adjusted my training program to see if it went away.
There are a lot of things that can cause anxiety or tension in your horse. It is not always related to your use of negative reinforcement or aversives. You could be triggering an emotional response that is connected to past training. There could be physical issues. Sometimes it is as simple as asking for too much. If I am training a behavior and the training is not going well, or I am seeing signs of anxiety, I always check to see if I am lumping things too much. I might ask myself if my rate of reinforcement is high enough. I might check my timing or my presentation of the cue. Am I being clear? I am focusing on the use of the stimulus here because that is often a source of tension or anxiety, but other factors can be important too.
One way to evaluate how your horse is responding to a combination of positive and negative reinforcement is to mix in some exercises that are based purely on positive reinforcement. I do this on a regular basis and it helps me identify changes in a horse's attitude that might not be obvious if everything I did was a combination of positive and negative reinforcement. I also think it is a nice break and change of pace for the horses. Mixing in some positive-reinforcement-only exercises is also good for teaching your horse to think and keeps the horse working toward behavior. If I become overly dependent on using negative reinforcement, I can end up with a horse that is very cooperative and does what I ask, but that is not a full participant in the training process. In some cases, that is ok. If you are just trying to improve some aspect of your horse's training and you don't want the horse to be very operant, maybe because it is handled by a lot of different people, or you have not worked a lot on stimulus control, then more emphasis on negative reinforcement at this stage of the horse's training might be more appropriate.
Doing some exercises with positive reinforcement alone is also a good way to find out if your horse is really ok with something or if he is just "shut down." Clicker trainers use the phrase "shut down" to mean an animal that has learned to put up with things it dislikes because it feels it has no choice. Often I meet horses that are labeled as bomb-proof or good for beginners because they are not bothered by a lot of things that most horses do react to. In some cases, these horses really are ok, but in other cases, once I show them that they do have choices, I suddenly discover that they do have opinions about things and that they are not ok with everything as I thought. Some people find this disconcerting as their previously well trained horse is now showing some opinions, but in the long run it is a good thing. A lot of these horses are the ones that suddenly 'lose it' for no apparent reason. Once they start to open up a little bit, you can see early warning signs and that helps to avoid a bigger problem.
But what if you have a cue that you really want to use, the horse hates it, and you can't find a level of it that is acceptable to the horse? There are other options. When I learned how to ride, I was taught that there were cues for particular behaviors. There are standard cues for riding and I have read books that explain how these cues work by tapping into some physical response on the horse's part. If you had asked me a few years ago, I would have said it was important to train a horse using standard cues so that other people could ride it and because those cues were chosen for a reason.
But since then, I have realized that cues are very flexible and that while it is nice if all my riding horses use the same cues, I don't have to get there by training them all the same way. For example, when I first started Rosie, she did not like leg cues much and I had a hard time using them at all. She found them annoying and pressure on her side was more likely to make her stop and stomp her foot than move forward. So, rather than use a cue that she found aversive, I used a different cue for forward. Over time I added in my legs and she grew to accept them as part of the cue. When I wanted to teach her to canter off my outside leg aid, she didn't like that one either. So I taught her to canter off a change in my pelvis and once she knew about cantering and it had a strong reinforcement history, I was able to go back and put it on a standard leg cue.
This is a very common pattern. I have had a number of cues that the horses disliked at first because they had no meaning for them and then later they accepted the cue readily when it was added to an already trained behavior. I do have to say that knowing what I know now, I probably could have set up things better so that they accepted the stimulus I wanted to use as a cue, but I still think that in some cases, there are stimuli that are going to trigger unwanted reactions. The point of this little section is to show that even if your horse finds a stimulus aversive at one time, that doesn't mean it will always be aversive.
Kathy Sdao talks about how emotions travel backwards from the reinforcement to the marker signal to the cue. This happens through classical conditioning. If you are not familiar with classical conditioning, I suggest you do a bit of reading on it. It is worth learning about as when I am training, I am using both operant and classical conditioning, even if my focus is on operant conditioning. I am not going to explain more about classical conditioning here as it would make this article even longer, and it has been explained well in other places.
A reinforcer is something the horse will work to gain. It could be a reinforcer because it is something the horse wants without any training such as food, release from pressure, or safety and these are usually called primary reinforcers. Or it could be a reinforcer because I have paired it with something else that is reinforcing to the horse. When I do this, I am using classical conditioning to transfer the emotions associated with the primary reinforcer to create a secondary reinforcer. Because of classical conditioning, an event that occurs consistently before a pleasurable event will also become pleasurable. This is how the clicker becomes conditioned. The click becomes associated with the reward because it predicts the reinforcer. So the click now takes on meaning that it did not have before. Kathy says that this process can go back another step and if the click is now associated with good things, the cue will become associated with good things and the presentation of the cue will become a positive thing for the animal.
I think this means that we have three choices if a horse finds a stimulus aversive but we still want to use it. The stimulus could be a previously learned cue or a novel stimulus. As I said in the previous section, depending upon the horse's reaction to the stimulus, I can either try to use the stimulus within the horse's comfort zone or I can partly desensitize the horse to the stimulus. In both cases, I am continuing to use the stimulus at the lowest level possible and reinforcing a lot so that the reinforcement history trumps the horse's initial response. In many cases, this works and the horse will learn to accept the stimulus. But remember poisoned cues? Using a stimulus that the horse already dislikes means that the early stages of training that behavior are not going to be completely positive for the horse.
The third and better option might be to use a different cue and then change cues later. Then when you add the original cue back in, the animal is so sure what you want and the behavior already has such a strong positive association that the transfer of emotion from the reinforcer to the cue occurs much faster. I find that horses will accept previously aversive physical cues when they know exactly what the cues mean, and you can often use a gentler version than if you were shaping with the cue.
I think one thing that helps us out here is that horses are looking to avoid aversives too and if we are using positive reinforcement combined with negative reinforcement, the horses will often anticipate and choose a cue that is not aversive. I think this comes into play naturally when we train horses because of the fact that horses are good at reading body language and they will start anticipating cues. If the horse learns to trot off a leg cue, I can keep that leg cue, but there are going to be windows of opportunity where the horse offers to trot off a pre-cue. I have to make a decision about whether or not I want to change the cue. The advantage to changing the cue is that I can end up with a cue that does not have the negative associations that might have come from a cue that was trained with pressure and release. This goes back to the escape-avoidance aspect of negative reinforcement and I just want to mention again here that when a horse starts anticipating, it is wise to start looking for signs that the horse is eager to earn reinforcement vs. signs that the horse finds the cue aversive.
The next two sections are based on some experimenting I have been doing with my horses. I was looking for ways to change my application of negative reinforcement to avoid punishment and I found both these strategies useful. The first one is about changing the way you think about training and putting aside some ideas about how we use negative reinforcement. The second one is about using negative reinforcement in a more pro-active way so that the use of negative reinforcement makes the active use of punishment less likely.
One of the things I have wondered is if I am going to use a stimulus that has the potential to be a positive punisher, is it better to start with the stimulus up front so the animal learns what to expect from day one, or is it better to start the behavior, then add the cue and fine tune it from there? For example, if I want to teach my horse to turn from a rein cue, should I start with the rein cue so that I can use negative reinforcement to shape the behavior, or should I shape a head turn and then add the rein cue later?
In some cases, the choice is obvious as my horse might have a negative response to the stimulus I want to use, so negative reinforcement is not a good option. But in almost any training with negative reinforcement where there is a physical connection such as a rope, there is the potential for some awkward moments when the horse is figuring things out. Is it better to just avoid any possibility of this happening?
If I start with the rein cue, I can shape it using negative reinforcement and when I am done shaping I will have the behavior and it will be on cue. There are some advantages to this approach and one of them is that if the horse does not respond, I can just go back and reshape it using negative reinforcement. Every time I do that the cue gets stronger and there is a lot of consistency in how I trained and how I maintain that cue.
Another advantage is that by shaping with negative reinforcement, I can teach the horse about all the nuances of a rein cue as I go. A rein cue does not just tell a horse to take its nose to the side. It tells the horse how far to take the nose. If I train this with pressure and release, my horse is getting information about how far to turn by when I release. But if the horse finds the rein cue aversive? I have started off by doing something he dislikes. If I train this by adding in positive reinforcement, my horse might over time find the rein cue less and less aversive, but will that first impression remain?
On the other hand, I could start by teaching the horse to turn his head to the side through targeting or free shaping and then add a rein cue. The horse would associate turning his head to the side with positive reinforcement and I could add the rein cue once he already knew the behavior. Would he still find it aversive? I think that once horses know the right answer, they are more accepting of various stimuli as cues. Some of this happens through classical conditioning, but I think there is another component too because with some horses, this happens very quickly. I can train a behavior through shaping or capturing and put it on cue in a few sessions, even if the cue is something the horse previously disliked. For years, my horse Rosie's reaction to anything new was to worry about it and it has taken a long time for her to view new things as being potentially positive things. For a long time I thought she just hated everything, but I eventually realized that with her, part of it is that she doesn't like not knowing what something means. As soon as she can connect what I do with what she should do, she is ok.
For a long time I thought that, since I was going to end up using negative reinforcement anyway, it was clearer to the horse if I started off using it right from the beginning. If I start by using negative reinforcement, my horse and I learn together how to use pressure and release as a communication tool and not as a question of adding and removing aversives. And I wondered if it was more difficult for a horse to learn a cue through positive reinforcement and then have negative reinforcement added in later when the horse had no expectation of that as part of the program. When I had my foal, I taught him a bit about pressure and release, but mostly through targeting, and he never felt much of an increase in pressure because I did not escalate. But one day something happened and I really needed him to stop so I increased the pressure on the line. I was amazed at how much pressure I had to use to get his attention. It felt like he didn't understand what more pressure meant so he got more resistant before he started to respond to me.
This made me think that responding to pressure and release is an important skill for the horse as well as being a training tool for the handler. I went back and spent more time on teaching him to follow the feel in the line and he was better after that. This is one reason that I continue to use pressure and release with my horses. I find there are times when I need it and the better they get at it, the less aversive I can be. At the same time, I have a few cues that I have been training with only positive reinforcement and I am starting to see how solid they can be. So, I guess I am exploring training along two different tracks here. I am looking at ways to make the use of negative reinforcement less aversive and I am looking for ways to train the same behaviors with positive reinforcement. At this point, I think both approaches are going to lead to better training.
The thing I am not sure about here is that I think I would still have to teach the horse to follow the feeling of the rein and this ends up being negative reinforcement again. I could try to set it up as teaching a series of different rein cues but in the end, I think it comes down to the horse following the changes in pressure and release on the rein. I have played around with this a bit and I do think that at some point, you have to teach the horse to accept and follow the feel on the line. It just seems easier and more natural to use negative reinforcement to do it. But I am starting to think that you can introduce that later and get better results. Most horses go through a period where there is some resistance and bracing to a rein cue because they don't understand it. Do I want that first impression associated with my rein cue? If I teach the horse to turn his nose and then add the rein cue, can I just add a new cue on to a previously trained behavior and avoid those moments of resistance and bracing? I think this is worth exploring.
I don't have answers to these questions and I don't have any horses that are clean enough slates to experiment with this and be able to come up with any definitive answers. But I think that's ok. Most of us start with horses with previous training and we just figure things out as we go. But I do think that if I had a new horse, I might explore a little bit more presenting any stimulus I wanted to use in my training and creating a positive association with it before I used it to teach anything. I talked above about how a positively trained cue could be used to classically condition a stimulus to be more positive. This is an indirect approach. The fastest way to classically condition a stimulus to mean something good is to present the stimulus and then follow it with something the animal finds pleasant. In the case of training a horse, I might put my hand where my leg goes and then feed a treat. Repeating this over time would pair the horse's response to the treat with the feeling of my hand on the horse in that position.
This might seem like extra work but think about how we start young horses. Every time we introduce a new piece of equipment, we take the time to make sure the horse is ok with it and we want the horse to accept it without tension or resistance. Doing the same thing with any stimulus just makes good sense.
One of the drawbacks of using negative reinforcement is that the trainer can find they need to increase the aversiveness of the stimulus to get a response. This can lead to a whole bunch of problems because as the stimulus becomes more aversive, the horse can become more fearful, unpredictable and look for ways to escape. Both handler and horse lose the ability to think clearly and the focus changes from what behavior is wanted to getting rid of "bad" behavior.
The two most obvious ways a cue can become more aversive are if it becomes more intense (escalates) or if it is applied for longer duration. A cue that escalates might start off fairly mild but then if the horse does not respond, the trainer feels she has to increase the intensity of the cue until the horse listens to it. On the other hand, maintaining the same stimulus for a longer period of time (using duration) is going to produce fewer side effects and is more compatible with positive reinforcement. Going hand in hand with this is being sure to release the negative reinforcer at the first sign of a change. The next sections are about some strategies to avoid being in situations where increasing the aversiveness of the cue through escalation is the only option. When used as a teaching tool, negative reinforcement works best when it is used to prompt behavior, not make the horse do it through physical means. A horse that lets you pull it around is different than a horse that learns to move its own body in response to a change in request from the trainer.
When I started writing this article, I found I was repeating a description of the rein mechanics that Alexandra Kurland uses. She has developed a system of using negative reinforcement without escalating by teaching the trainer how to find a stable point of contact and wait the horse out. While this is pretty straightforward in one sense (don't pull), it requires a bit of preparation and attention to detail in the set-up and implementation and I didn't see any point in going into details here. She has created teaching materials that show how to do this and describing it in adequate detail is beyond the scope of this article. If you are interested in more about her system, I suggest you find some of her resources.
I do think that being aware of when you are using escalating pressure is important and that just paying attention to this one detail can make a difference in your training so I wanted to mention it separately here. There are many components that a trainer has to weave together to come up with the right training solutions for any horse and trainer combination and this is an important one. You will see that the idea of avoiding escalation is woven into some of the strategies I discuss so it is a recurring theme.
Think of using cues, not negative reinforcement
I said earlier that riding is based on negative reinforcement and that if we are going to handle horses with some physical connection (lead rope, riding, etc.), we have to learn to use negative reinforcement well. I really don't see how to get around the fact that I am going to end up using some negative reinforcement with my horses, but I have found that besides being mindful about using aversives, there is another significant way I can make my training more positive.
I started thinking about this last year, partly because of the work on poisoned cues and partly because of some issues I was having with one of my horses. This started as a mental exercise and then grew as I changed my own training methods and started to see other options for training besides using negative reinforcement. There is a lot of literature and instruction available about developing a connection with your horse, developing feel, riding in harmony and so on. The general idea is that you can train a horse to respond to your body movements and the horse will learn to follow and respond to changes in your position or intent and you can dance together. It sounds so wonderful and I wanted to be there. But I was not always very happy with the results I was getting using negative reinforcement and I felt it sometimes put too much emphasis on removing the aversive as the reward.
I had used negative reinforcement a lot combined with clicker training and while I could see the benefits of the combined approach, I wanted to experiment a bit with trying to be more focused on positive reinforcement. So I started looking a bit more closely at how I used negative reinforcement and whether or not there was some way to teach horses tactile cues with positive reinforcement. I think there is a general tendency to view all cues based on touch or pressure as being trained with negative reinforcement because it is so easy for a touch cue to evolve from pressure and release. It is one of the simplest ways to generate behavior and put it on cue because the cue evolves out of the shaping process.
For example, if I want to teach my horse to move away from my hand, I place my hand on his side. If I am using negative reinforcement, I am going to leave my hand there until the horse makes a change in behavior such as a weight shift. I remove my hand when he does so and the removal of my hand reinforces the weight shift. If the horse does not respond when I put my hand there, I have some choices.
If I am using negative reinforcement, I am going to either wait him out by just keeping my hand the same and waiting for a shift or I am going to push a bit and escalate pressure to see if there is a pressure to which the horse will respond. The horse learns that to get my hand away, he must increase the behavior of shifting his weight in the direction I want. This can be done very gently as we have described before, so that it is not overly aversive to the horse, but technically, it is still negative reinforcement. The nice thing is that if I want my hand to be the cue to move away, that cue is already there because I used my hand to prompt the behavior. This is an easy one step process and as my horse gets better at understanding the hand means move away, I can make that cue lighter and lighter until it is just a light touch or not even a touch at all, but just a gesture.
But can I train the same behavior, using the same prompt and not be using negative reinforcement? I may be splitting hairs here, but I think so and I think it is a matter of timing. In the traditional use of pressure and release, the release from the pressure is meaningful to the horse and it is very important that you not release until the horse changes his behavior toward the direction you want. So I am training a horse to give to pressure using Alexandra Kurland's slide to the point of contact, I want to slide down the rein to the point of contact and wait until the horse gives. Then I release. If I am a clicker trainer, I also click. If I slide down to the point of contact, wait and release when the horse is not doing what I want, I have just reinforced the horse for the wrong behavior and I might get more of it. This could be pulling against the line, throwing the head etc... If I then click on top of that, I have compounded my problem. So in order to use pressure and release, I have to have good timing. With some horses you have a lot of leeway and they figure it out despite some mis-timed clicks and other errors.
But some horses are more challenging either because they find the rein cue too aversive, or they are too quick or they offer other behaviors. In this case, a lot of unwanted behavior and garbage is happening between when I find the point of contact and the horse finds the right answer. And if the horse is reacting because he feels restricted or trapped, all those negative emotions are getting connected to the rein cue. So I started wondering if I could avoidthose problems by thinking of the rein aid as just a prompt and not as applying pressure. This means that I have to let go of the idea (at least initially) that I have to wait for something to happen when I ask for something with the rein. Think of the rein prompt as working in the same way as a target stick. I present the target stick and wait for the horse to touch it. If the horse does not, I might move it change the presentation slightly or I might wiggle to make it more noticeable. I might take it away and present it again. Removing the target stick and asking again does not show the horse anything except that they did not find the right answer.
So how do we think of the rein cue as a prompt? It is really quite easy and I found it evolved quite naturally out of Alex's work. When Alex first teaches rein mechanics, she teaches the handler to slide down and stabilize. With most horses the handler finds a point of contact and waits. But a few horses are so reactive to the rein cue that they won't let me slide down. They are already reacting to the rein as soon as I touch it, and not necessarily in a way that is conducive to building behavior. So one strategy she has us use is to slide down and release without clicking. This is done quite quickly so the horse doesn't feel trapped, but it is smooth and fluid so the horse is not surprised either. The idea is to teach the horse to accept the rein cue through counter-conditioning (pairing the slide down the line with something good) and once the horse is allowing the handler to slide to the point of contact, she can click the horse for allowing her to use the rein. Then she can start stabilizing at the point of contact and start asking the horse to soften and release to the pressure on the line.
This is one way to teach a horse to accept a rein cue and it worked well with my horses. But last year I took on a rescue pony, Stella, who had a history of rearing. Any use of the lead was difficult. She had no idea about pressure and release so she would barge and tow me a bit when leading, but at the same time if she hit the lead in certain ways where the pressure was uncomfortable, she would rear. So on one hand she ignored a lot of pressure, but on the other hand, she was so sensitive to it in some situations that she would rear. The bottom line was she needed to learn about pressure and release so that she could understand what I wanted when I used it, but I could not start by using pressure and release to teach it.
I started using my normal approach but found I had to modify it right away. She was so sensitive to pressure and would get her head flinging around that it was hard for me to control the situation. There is a drawback to using pressure and release on a large animal to which I am connected by a lead. The drawback is that I cannot precisely control how much pressure is being applied. If I don't want to escalate pressure, I can stabilize my hand and wait. Now I am not pulling and the pressure the horse feels is that which she is putting in the line. But there is nothing to prevent the horse from putting more and more pressure on herself by pulling harder against my hand. I didn't want to go there with Stella. In addition, she did so much head flinging that it was hard to find a stable point of contact.
So what I did instead was slide down and release very quickly before she had a chance to do anything undesirable. I was working on desensitizing her to the rein. What I found was that as soon as I added in positive reinforcement by clicking and reinforcing any acceptance of my hand on the rein, she went into thinking mode. I think one reason she went into thinking mode was the click and treat but I also think throwing in the release was important. By releasing a lot early on, even for no particular response to the rein cue, I showed her that the release existed and I was not going to trap her. She was not going to be held in position.
Whether or not I clicked made more impact on her learning than the timing of my release. At this point I was still just getting her to accept my hand on the rein and I wasn't asking for anything so I was not thinking a lot about pressure and release. But the next thing I wanted to teach her was to take her nose to the side a little bit. I had previously taught her head down off a lead cue but not using negative reinforcement. So she knew that the lead could mean head down. When I asked her to take her head to the side, she wanted to put it down. If I tried to stabilize my hand and not let her put her head down, then she got very frustrated.
So I thought I would change things a bit. Instead of preventing her from putting her head down, I released the rein and just asked again. Now technically, I had just released for a behavior I did not want so I was not sure what would happen, but I just asked again. I changed my presentation slightly and continued releasing for any effort on her part, but not clicking. After a few releases, she started to think more and offered to take her nose toward me. I worked on this for a few sessions and what I found was that it didn't matter if I released for other behaviors, she clearly picked up on those that I clicked. So while she got negative reinforcement for putting her head down because I released, she only got positive and negative reinforcement for taking her nose to the side and that behavior is the one she chose to repeat.
I thought this was interesting because it showed me that in some situations I could just release and ask again instead of being concerned that since I was using pressure and release, I had to time every release perfectly especially considering there was so much undesirable behavior happening. In her case, it was better to just think of presenting the rein as a prompt and building the behavior from there.
When I first learned about using pressure and release with clicker training, I thought it was very important that pressure and release were used to shape the behaviors I would want for riding and groundwork. Riding and working a horse on a lead have a lot to do with body language and I want my horse to get very good at reading my body language and following the feel in the line. But I am not so sure any more that this means we have to shape everything with pressure and release.
In some ways what I am describing is just a variation on how to use pressure and release by adding an extra step, but I think it has greater implications. I do want to say that like any training tool, you have to evaluate if it is working for you. The end goal is to have a horse that does allow me to use the reins to set a point of contact and softens to my hand. If my horse is learning that it can avoid pressure by just flinging its head around because then I never ask for anything with the rein, then this method is not working for me.
The reason I said it has greater implications is because I found that by thinking of my cues as cues, and not as negative reinforcement, I handled things differently when the horses did not respond. This is an advantage to combining positive and negative reinforcement and would not necessarily work if I was using negative reinforcement alone. If I am training with negative reinforcement and my horse does not respond, I am aware that the timing of my release is important. And it is important that I get a response because if the horse learns that he can avoid the cue without responding correctly, then my training will start to unravel as the horse will now be negatively reinforced for not doing what I asked or doing the wrong behavior.
But a cue is different. As a clicker trainer, a cue is a green light that doing a certain behavior means a chance to earn reinforcement. If the animal doesn't respond to a cue, I have many options. What is my first option? I would say my first option is to just ask again the same way. Maybe the animal was distracted or in the wrong position. If the animal still doesn't respond, I might check my presentation of the cue. Would I make the cue bigger or louder? Well, I might but only if I thought it would help the animal, and not to "make" the animal do it. One way I think about cues is to imagine I am talking to someone who speaks a different language. If I say something and they don't understand so they ignore me, does it help to yell it louder? Well, it might get their attention, but it is probably not going to help them figure out what I want.
There are some advantages to thinking this way. Remember I said this started as a mental experiment? In some ways, thinking of negative reinforcers as cues allowed me to make a change in my mind about what a cue means (it is not a command) and gave me permission to respond in lots of different ways when the cue was not followed by the correct behavior. There are lots of reasons animals do not respond to cues and considering those reasons made me a more thoughtful trainer. I think it made me less likely to end up using an aversive such as escalating pressure when the horse did not respond. This goes back to the poisoned cues. If I train a behavior with a lot of positive reinforcement, I want to be careful about now adding in an aversive when the horse does not respond. It is better to go back and figure out why the horse did not respond and fix the relevant training hole than to think the problem is in the horse ignoring the cue.
I think that negative reinforcement can be used to provide structure and set boundaries for horses and the horses can learn to accept this without losing their enthusiasm for clicker training. It is easy to get caught up in the idea that we can only be positive with our horses and that by being nice and giving them choices, we are making their life better. But what I have found with my horses is that sometimes more direction and structure is helpful. They are a lot like kids. There are situations in which I need to set boundaries and providing structure and direction is not always a bad thing. I want my horses to both offer behavior and be creative, as well as be able to follow directions. To me, setting boundaries is about creating situations where the horse is more likely to choose the right answer because I have provided some kind of limits on what it can do.
When training with positive reinforcement alone, the animal is not restricted and has a lot of choices. It can choose whether or not to participate in the training game and it can choose what kinds of behaviors to offer. In a lot of the early clicker training that I have seen, the animals were at liberty and the trainer just reinforced behaviors she liked out of the huge range of behaviors that the animal offered. But over the years, I have seen a shift. There is still the focus on rewarding behavior, but there is also much more thought given to setting the animal up for success by creating an environment where the desired behaviors are more likely to occur.
This means that people are more aware of how to set up situations where the right behavior is more likely to occur because of the set-up, and the options for other behaviors are a bit more limited. Any animal has many choices at any given time but a training environment where the dog has 4 obvious choices and one is correct is going to be preferable over one where there are 10 obvious choices and one is correct. This is as simple as stacking the odds in your favor. In the simplest sense this could just be training the animal in a distraction free environment. I watched a dog training tape the other day and there was a dog in a gym or some other big room with the trainer sitting on a chair and that was it. My first thought was "how lucky to have such a distraction free training environment."
But it can go beyond this. If I put a dog in a narrow hallway and throw food underneath it, the dog will probably back up. This method is likely to produce better results than if I just stood out in the yard and threw food under it. The dog might not back up to get the food because it has lots of other options including turning around and moving sideways. Another approach would be to put the dog in a narrow hallway and walk toward it to get the dog to back up. I could use my body language to make the dog move backward. This is often presented as training using positive reinforcement, but there is an element of negative reinforcement going on too. When I lean or move forward, I am putting pressure on the dog and the dog backs up to remove that pressure.
Training horses can certainly be done using positive reinforcement alone and allowing the animal a lot of freedom to make different choices. This would be similar to dolphin training in a tank where the trainer is not setting limitations on what the dolphin can do. But I think that horse trainers can also set up situations where the horse has limited options and is therefore more likely to make the right choice. We can certainly set up situations similar to the dog training example. A common way to teach a horse to move sideways is to put it facing a fence and apply a bit of pressure asking it to move away. The fence limits its options, but not in a punitive way. It is just there. This reminds me of various quotes by famous horse trainers about setting the horse up for success or making the right thing obvious. I always like it when I find a new piece to add to my training that works with the philosophy of clicker training and is something I see in good horse training. I like to think that the really good horse trainers are using the same principles as clicker trainers and I can learn from both of them.
In both of those examples, I used negative reinforcement to generate behavior and I used physical boundaries (the narrow hallway, the fence) to limit the animal's options. I could argue that those physical boundaries worked because the dog or horse was avoiding them, which is also a behavior that is being maintained through negative reinforcement. So in reality, I was using negative reinforcement on two (or more) sides to help generate the behavior I wanted. When I write about setting boundaries, I am referring to this use of negative reinforcement to define a workspace or provide information about the type of behavior that I want.
What do I mean by workspace? I mean the area in which the animal has to be to earn reinforcement. This could be a physical location on the floor or ground, or it could be a position relative to me. It is the space in which he must be to earn positive reinforcement. If you are familiar with the idea of setting criteria and slowly adding more criteria for the animal to earn a click, then the idea of boundaries is similar to this. Part of the criteria to earn reinforcement is that you perform the behavior within this space or while meeting other criteria that have to do with how you carry your body in response to the boundaries set by my aids.
This can start in a simple way. I wrote that I use negative reinforcement to prompt, maintain or redirect behavior. When I prompt behavior I am usually asking the horse to move some body part in a specific direction by applying the stimulus. My horse is standing quietly. I pick up the lead and ask him to bend his neck toward me which moves one or more body parts in the direction I ask. Even the idea of prompting behavior is about setting a few boundaries. When I pick up the lead and ask the horse to come around, I am showing him I do not want him to have his head and neck in certain positions.
In the same way, when I redirect behavior, I am also asking the horse to move a body part in a specific direction by applying a stimulus. My horse is moving around me and I want him to stand still. I pick up the lead and every time he moves his feet, I ask him to back up. I am "closing" the front door and asking him to move his body back. Every time he comes forward, I use negative reinforcement to re-establish that boundary until he stops pushing past me. Yes, I am using negative reinforcement to ask him to back, but my goal is no longer to get the backing. It is to tell him that going forward is not an option. I am defining the front end of my workspace. At this point I am using negative reinforcement to redirect the horse and he might still be focused on the forward and back motion.
But at some point, things change a bit. I might still be using negative reinforcement if the horse creeps forward but the nature of it changes. Instead of the horse moving back to avoid the negative reinforcer and the process being about not going forward, the horse starts to be more focused on staying in position and when I apply the negative reinforcer, it is more of a reminder or piece of information that "reinforcement is not in that direction." By information I mean that the horse encounters the boundary and backs up because the boundary indicates the edge of the zone where he can earn positive reinforcement. I know this might seem picky but I think it is important because this is the point at which the horse starts to self correct. Once the horse starts to self-correct, it means he can now start thinking about what I do want and not just react to the stimulus in an automatic manner.
For the horse who was barging, I might start out clicking him for backing. Then I am going to start clicking for him standing. And as this progresses, I might start clicking for some behavior while the horse is standing. If the horse is offering behaviors and starts to go forward, all I have to do is close that door (by using negative reinforcement but this can get very subtle) and the horse instantly knows that going forward was not part of what I wanted. He then offers something else. I have effectively defined my workspace as an area in front of me that does not include any movement past me.
I did this for years with my horses and I just thought of it as setting criteria for what would earn a click. And it is that. My criteria change from moving backward, to standing still, to some aspect of standing still as I get closer and closer to the desired behavior. But when I started to do a lot of work in-hand and started to work my horses on more of a contact than a float, I realized it was more than that because over time my horses started to recognize my body position and how I set them up as a larger cue for what we were going to be working on. Then looking back, I realized that most of the time when I used negative reinforcement past the beginning stages, I used it to either cue or define boundaries for what was clickable.
So for example, when I first teach a horse to take his nose toward me off a rein cue, I am clicking for movement toward me. At some point, the horse knows the rein cue means bring the nose toward me and I start to be more selective. So the nose might come around and I only click those turns where the elevation stays the same. The horse is going to start to try and figure out what I am clicking. He might come around and if I don't click, he might move his head a bit. If he moves his head so he is closer to where I want him to be, I click. In this situation, I am asking the horse for the nose to turn with my rein, but I am clicking for some fine tuning of the behavior without actively prompting it with the rein. I did not use the rein to ask the horse to maintain elevation. I just waited.
Am I still using negative reinforcement? Yes, because when I ask for the nose and wait, I set my hand position so that some options are not available to him. I have applied a stimulus and it is the release of that stimulus that reinforces the behavior. I don't want the head to go down or to the other side so I position myself to make those choices unavailable. I have set boundaries on those sides. As this exercise progresses, the horse will feel those and move away into the place where positive reinforcement is possible, which is what I would call my workspace.
An interesting thing can happen here. At some point my position becomes a cue for the new behavior and I can take a more passive role and this becomes an exercise in positive reinforcement. I don't have to go through the process of defining the boundaries because as soon as I get in position, my horse uses the entire setup as a cue for what exercise we are doing. What is nice is that I find I can use negative reinforcement to set some boundaries, maybe prompt some behavior and then I can allow the horse to experiment within that framework. When I taught haunches-in in hand, I used my reins to help guide the front of the horse and I used a whip cue to teach the horse to bring his hips over. The horse learned this body position and after a few sessions, I could get in "haunches-in position" and without actively using rein or whip, I had cued the horse to start offering haunches-in. At this point, I let the horse take over for a while and experiment with ways to organize his body within that position and I just clicked the ones I liked.
As my horse gets more advanced and familiar with this exercise, my workspace can become smaller because I am now fine tuning a learned behavior and I might be looking for a shift of half an inch in jaw alignment or slight softening of the feel in my hand. My workspace becomes smaller and how I define it becomes more and more subtle. To an onlooker it can look like I am holding the horse in position or doing nothing, but my horse and I have a very sophisticated conversation going on that is all about minor adjustments and changes in tone or feel in one or both of us. This is an example of how negative reinforcement can get so refined that it is more like a conversation or a team working together than it is about making the horse do something.
But that is my goal. I can't try to start there. When I first teach this, I have to be careful about setting the boundaries so that my horse does not feel trapped. I want to limit the options but still have the horse feel free to experiment and move around. Going back to the dog example, if I took a dog and put it in a really narrow hallway where it could barely move, it might just give up or freeze. If I leaned toward it, it might feel so threatened that it acted out toward me. With horses I want to be careful about setting boundaries that create those kinds of responses. I start off with a big generous workspace and then as the horse becomes more sophisticated and more body aware and as his trust in me grows, I can make the workspace smaller.
I used a groundwork example to explain this, but the same thing happens under saddle. Ideally I want my horse to be balanced between my rein, seat and leg aids. When I ride my body defines the boundaries of the horse's workspace. I can use my seat, legs and reins to ask the horse to stay in a certain frame or work in a certain spot. I can ask him to organize in a certain alignment underneath me. Just like the dog trainer who might use physical objects to define his workspace, I can use my own position to help the horse find where he is most likely to earn positive reinforcement.
When I started writing this, I wondered if it was good to use the word "boundaries." To me boundaries could convey the idea of making restrictions and keeping something contained. My concern was that people would think I was recommending that they use negative reinforcement to hold the horse in position. But to me boundaries are also about setting limits and defining space and what I really want people to take away from this section is that negative reinforcement can be used to define a work space in a positive manner. When I write about setting boundaries with the reins, I am not suggesting we all pick up both reins, ride on a strong contact and call that negative reinforcement.
On one of the lists, they had a discussion about whether or not side reins were negative reinforcement. The use of side reins was presented as an example of negative reinforcement because the behavior of keeping the head down increased. This is certainly one way of looking at it. I would argue that initially the behavior of sticking the head up was punished and the side reins were only an effective use of negative reinforcement if the horse did learn to keep his head down and didn't learn any other avoidance behaviors. Some horses find side reins so aversive that instead of doing the desired behavior, they find alternate behaviors such as rearing, stopping, or going behind the vertical (ok, this is putting the head down but perhaps not what the trainer wanted). If a horse responds to side reins in this way, then it is pretty clear that there is more positive punishment going on than negative reinforcement.
The person who disagreed said that side reins were not an example of negative reinforcement because the trainer does not remove them when the horse puts his head down. Negative reinforcement is about removing an aversive and if the side reins themselves were the aversive, then this would make sense. But it is not the side reins themselves that are aversive. It is the pressure they exert on the horse that is the aversive. Looking at it this way, when the horse puts his head down, the pressure on the bit is decreased or removed and this could make the horse more likely to put his head down. In this manner, the side reins would be acting as a negative reinforcer because when the horse contacts them, they add pressure and when the horse backs off them, the pressure is removed.
Fast forward a few sessions and the horse is now going around on the side reins with his head down. I am discounting the effect of the trainer and whip because that would make this too complicated and is not relevant to my point. Just looking at the side reins, is it still negative reinforcement? I think it depends. If the horse has just given up and accepted some level of pressure and is leaning on the side reins, the pressure created by the side reins is not acting as a negative reinforcer because it is never removed. Yes there might be some slight variations in pressure as the horse moves, but those are not driving any change in behavior. It seems to me that it is more likely the horse has been desensitized to the pressure of the side reins and for all practical purposes, no behavior is increasing or decreasing. So in this case, I am no longer using the side reins to train with negative reinforcement.
But what if the horse has learned to hold his head in a position so he no longer feels the pressure from the side reins? I would say that if he is actively avoiding the boundaries of the side reins and the behavior of staying balanced between the side reins is increasing, then the side reins are still being used as a negative reinforcer. He is probably still going to encounter them every now and then and it depends upon whether he is trying to "escape" them or "avoid" them. Remember the escape-avoidance of negative reinforcement? Are they still acting as positive punishers? Probably, but the longer he goes without hitting them, the more likely negative reinforcement and not positive punishment is the main quadrant at work.
Enter the clicker trainer who starts reinforcing the horse for being balanced between the side reins. What happens now? I am clicking for the horse being balanced between the side reins in a position where they are not applying any pressure. Therefore, I am adding positive reinforcement. Am I using negative reinforcement too? Yes, because the horse is probably still aware of the side reins and using them for information about his head position. Is the horse still going to encounter the side reins at times? Probably, but if the horse is focused on staying in the middle because that is what is being reinforced, at some point, the side reins are going to become information about where he should be. As long as this is true and he does not stick his head up again, they will be providing feedback and not punishing sticking the head up.
Of course one way to find out what is happening is to take the side reins off and either lunge the horse without side reins and see what he does, or ride the horse and see what kind of response he has to taking a feel on the reins. If the horse goes back to sticking his head up in the air or pulls hard on the rider's hands, then clearly the side reins were not teaching the horse anything beyond how to avoid punishment or how to cope with an aversive. Or the horse is using the presence of the side reins as a cue and has not generalized the behavior to other situations.
The point of this little discussion is not to recommend using side reins or explain how to use them correctly. I actually don't use them at all, but they are a good example because most people know what they are. What I wanted to explain was that it is not enough to just limit a horse's options and pass it off as a good use of negative reinforcement. I have to build it in steps so that the horse is an active part of the process and learns to accept and understand that the boundaries are information. I have to constantly evaluate what is happening. Just sticking side reins on a horse and sending him around does not necessarily teach him about negative reinforcement and the nuances of reacting to a rider's change in rein aids. This only happens if the horse has been shown from the beginning that he can remove the aversive through his actions and then later, that he can use the same stimulus (hopefully now no longer aversive) to get information about how to get positive reinforcement.
This is one reason that single rein riding can be helpful. It breaks down the process of being on two reins into little pieces so that as you build toward two reins and choose the kind of contact you want, the horse stays operant and uses the rider's aids as information. When the rider uses one rein and releases for correct responses, the horse learns to accept direction and learns that the rider can set limits without him feeling trapped. Once the horse is comfortable on one rein, you can add in the second rein as just another way to define the workspace and the horse will accept it.
For most of this discussion I used the example of setting physical boundaries and the idea that the workspace gets smaller and smaller as the physical boundaries change. I used physical boundaries as an example because it is easier for people to understand and see them. But a boundary doesn't have to mean the horse is physically running into the rein or my leg. There are other ways to use negative reinforcement through body language and pressure and the idea of boundaries applies there too.
This may not seem related to punishment, but one thing I have found is that if my horses are really comfortable with me defining a small workspace, it minimizes my use of aversives or punishment. This is because if they are comfortable with me limiting their options in certain situations, I can avoid giving them enough room or slack or space to offer behaviors I don't want. I used to think I could handle difficult situations by giving the horses room to make choices and reinforcing them for correct choices. Sometimes this worked and sometimes it didn't. The part I didn't like was that if it didn't work, I sometimes ended up reacting to their "bad" behavior, instead of setting them up so they were more likely to do what I wanted in the first place. There are still times when I choose to give them more room and options as I think it is the better choice. But I have also learned the value in limiting their options so they are less likely to make "bad" choices. What I have found is that having both options available covers most circumstances.
I have made some suggestions for how to avoid using punishment when training a cue with negative reinforcement. Now I want to write a little bit about what comes after that. I think that when I first start off teaching a new cue, I am very careful to make sure that I keep the horse working toward behavior and I am using negative reinforcement as information and not in a situation where the stimulus could become too aversive. There is a period in the early training when I am careful to set up the situation and choose times to train where the horse is going to be successful. If I am teaching head lowering, I might need it when the horse is anxious and upset, but I recognize that is not the time to train it. I need to train it when the horse is feeling calmer and I have some chance of success.
What I did not realize for a long time was that there is a period after a behavior has been learned, but before it is really solid, when I need to be careful about asking for it. If I ask for a behavior in a situation where the horse is going to be reluctant to give it, I am putting myself in a situation where the cue might end up being aversive because either I or the horse end up making it "louder" by increasing the stimulus. For a long time, I was not quite sure how it would be possible to avoid poisoning some cues because unexpected things can happen. Even if I was very careful to avoid poisoning my rein cues, one day when I was trail riding, my horse could spook and I might end up using the reins as an aversive to get him to stop. Since I knew this could happen, it seemed more important to make sure the horse responded to a rein cue at any time than to worry about poisoning the rein cue. In that situation I thought it was more important to get the right response even if the cue did become more aversive than to let the horse ignore the cue.
But over the past year or so, I have realized that one way to get a really solid behavior is to allow it to develop and build over time without stressing it. I want each behavior to have a very solid base and I do this by carefully building the foundation in situations where the horse can do it. I think that there is a tendency to be too quick to assume the horse knows something and stop practicing. Then we end up either testing it in a new situation without adequate preparation or maybe even testing it because we think we need to challenge the horse. What I am suggesting is that you practice the behavior many, many times in situations where the horse can do it, and avoid testing it or using it in "iffy" situations until you know it is really solid.
This is not about drilling the horse or making the horse into a robot. This is about recognizing that every training day is different and that being able to repeat the same exercise every day is going to mean that over time, the horse learns to do it in lots of different situations. Just think about all the variables that can come up if you are working in an outdoor ring. For starters, both you and your horse are going to have good and bad days. Some days your horse is going to be energetic, other days it might be slow. You might have a bad day at the office or be distracted. You might be feeling stiff or sore or your horse might be. It might be hot, cold, windy, start raining. Things might happen. A dog might run by, horses might get loose, someone might be doing something distracting next door. If your horse can respond to the cue under all these conditions, that is a huge start on making it solid.
I have always understood that solid basics were important and I have no problem with taking the time to go back and fill in training holes, review the basics, and I realize the importance of finding the root of the problem instead of trying to patch it up by addressing some of the symptoms. But the value of taking the time to allow this slow and steady development of both confidence and the complete understanding and ability to respond to an aid in many situations was not revealed to me until the past year or so. That was when I realized how solid the foundation behaviors are with my older clicker horses. I have incorporated them into our daily warm-up and while I am constantly tweaking them and coming up with little variations, they all build on the same basic behaviors. The end result (so far) is that my horses really know these behaviors and if I ask for them in a stressful situation, they can do it. I am just amazed at how consistent they are and how easy it is for me to ask for them.
On a related note, I have started a few young horses in the past few years. One of the things they have to learn to do is be ridden off the farm to go for a trail ride. I don't have any help so we go alone. Luckily there is a field next door that I can take them around. They start walking on the other side of the hedgerow but then we go up the hill and out of sight. When I trail rode horses in the past, I took them lots of different places and if I wanted to challenge them more, I took them somewhere new. But with these young horses I found that I didn't need to purposely look for new challenges or take them lots of places to build their confidence.
What I realized was that there was a lot of value in continuing to ride around the field even after we had done it successfully a few times. This is because there was always something new, so it was not the same ride each time. One day we met a deer. Another day the neighbor's dog was loose. We met a boy on an ATV. The school bus went by when we were near the road. One day it was very windy and the trees were creaking. He got to experience all these scary things in a setting that was different than his own farm, but not so scary that he couldn't handle it. And then when we did go further and he met some of these same things, they didn't seem so bad. Being able to handle all these little incidents made my horse more confident and gave me a chance to practice some basic skills under different conditions. And the good days were important too. Every time we went out around the field and had a nice pleasant ride, that built his confidence more too.
I often read that a lot of riding and horse handling problems stem from a lack of the basics and I think this applies to clicker training as well as any other kind of training. Take the time to practice those behaviors that are important to you. Once you are confident your horse can still respond in various situations, you can start to make it more challenging. Just remember the point is to set up a situation where the horse is still able to be successful at some level, not find one where the horse fails. Subtle changes in performance are indicators that a horse is finding something challenging and you want to work where the level of difficulty is increased, but the horse can still respond.
Training in Real life
Every trainer is going to have their own personal style that is based on how much they use each quadrant. This is true for all trainers, not just for clicker trainers. But I think clicker trainers have a strong commitment to doing as much training in the positive reinforcement quadrant as they can. Clicker trainers are focused on reinforcing the behavior they want and on keeping the animal involved in the training process. I am hoping that this article has shown that it is possible to use negative reinforcement with the same kind of focus on looking for the "right" behavior and keeping the animal happy and involved as can be done with positive reinforcement alone.
One of the reasons I wanted to write this article is because of my own evolution as a horse trainer and because I wanted to help more horse people see how to use clicker training with their horses. When I first learned clicker training, it was all about only using positive reinforcement and I did a lot of free shaping, playing games and teaching tricks. If someone had asked me about clicker training, I would have said clicker training was about using all positive reinforcement and that was all I used. But I keep my horses at home and I handle them all daily. So while I might tell someone that I was clicker training and only used positive reinforcement, that was only during designated training sessions. Between training sessions, I still used negative reinforcement and other traditional horse handling techniques.
Since then I have learned that clicker trainers can also use negative reinforcement and negative punishment. I have learned a lot from Alexandra Kurland about how to combine positive and negative reinforcement and done a lot of reading and experimenting on my own. At this stage in my development as a trainer, I find that the two balance each other nicely and that I can accomplish things by using both methods. It was difficult to write this article because I wanted to share some of what I was thinking about and doing with my horses, but I am far from having a finished horse training program. I am constantly tinkering and while I might be exploring negative reinforcement more with one horse, I might have another one where I am using more positive reinforcement. It was hard to avoid contradicting myself in writing because there are so many "it depends" situations and different horses might require different approaches. I thought it was worth putting some of what I am doing out there in case it was helpful to somebody.
Negative reinforcement is a very useful tool and I think clicker trainers can take advantage of the benefits of negative reinforcement without all of the problems if they use it carefully. I enjoy just going out and playing with my horses and experimenting with different ways to train behaviors, but I also have some practical riding goals. This means that I do find that there are times when I choose to train something using negative reinforcement as used in more traditional horse training, but I try to modify it to make it better for the horse. Sometimes following the same steps I already know is easier than finding a new way to teach it. I have found with my horses that if I am using negative reinforcement and it is not going well, then I need to take a better look at how I am using it. Is the stimulus appropriate? Is my timing good? Am I asking for too much? Is my reinforcement rate high enough?
At the same time, I am constantly looking for new ways to use positive reinforcement in my training and re-evaluating how I do things. I think that there is a natural progression when horse people start using clicker training and different people end up at different places. I started as a traditional horse person and found clicker training (totally by accident) when I was looking for something fun to do with Rosie, who was 2 and very crabby. I was heavily into doing tricks at the time, so that was how I started it. But it slowly crept into other areas of my horse training and my philosophy until it pretty much took over. I started out using positive reinforcement for fun and games and using negative reinforcement for most of my ground and ridden work. Over time, the lines have gotten blurred and I now use positive reinforcement alone more than I used to. I think this is probably a common pattern for anyone who is actively learning and trying to improve their clicker skills.
One thing I do want to say is that while the main focus of this article has been on showing ways to use negative reinforcement with minimal use of aversives, my intention is not to say that this is always possible or that I never use aversives when I am using negative reinforcement. I am working with our horses in the real world. Stuff happens and I can't always do things the way I want. In addition to learning how to clicker train, many of us are also learning to ride or improve our riding and doing the same thing with our horse handling skills on the ground. We are going to make mistakes and sometimes the horses are going to have to do things they don't like. But I do think that the goal of trying to use fewer aversives is important. As long as I believe that there might be better ways to do things, I am going to keep learning and changing and this is going to make me a better trainer in the long run.
Thank you for reading. If you have any questions, comments, ideas or suggestions, please email me (email@example.com).
At some point I will update this article to reflect any changes in my training methods or philosophy and make any necessary revisions. This article started off as a little idea and grew and grew. I did a lot of reading and because it was for my own education, I did not take note of where I got different pieces of information. Although some of the content of this article is my own thinking, I read so much that I cannot entirely separate out where I got my ideas and some ideas that came to me when I was writing might have been based on information I had read in the past.
I guess this is my long-winded way of saying that I am not attempting to take credit for anyone else's work and this article is a synthesis of a lot of different material combined with my own experimentation and work. Some of the resources I used that I do want to acknowledge are:
Alexandra Kurland's books, DVDs and many conversations with her at clinics.
Katie Bartlett, 2009 - please do not copy or distribute without my permission