EQUINE CLICKER TRAINING.....
using precision and positive reinforcement to teach horses and people
I didn’t write any clinic reports this year because the focus of Alexandra Kurland’s clinics was Loopy Training, and it was so new that I was not quite sure how to share it. Now that I have seen Alex’s presentation a few times and gotten to explore it with my own horses, I thought it would be fun to share what I found of interest in Loopy Training and some practical applications. If you have not heard of Loopy Training, Alex has four posts that discuss it in detail. They are posts 6476 (Jan 2009), 7852 (July 2009) and 8226 (Sept 2009) on the_click_that_teaches yahoo group and post 77926 (Aug 2009) on the clickryder yahoo group. It is worth reading them one or more times as there is a lot of information in them. My effort here is to pull out some of what I considered the main points and provide some examples of how the Loopy Training model can improve the training experience for horse and trainer.
My intention was to write something short and to the point <smile>, but I guess I can’t do that anymore, so this is long. In an attempt to make people less likely to get bogged down, I have included a summary of key points at the end. My hope is that you can read the long version, print out or save the summary at the end and it will be enough to remind you of some of the important parts of Loopy Training.
Alex uses the term Loopy Training as a description of training that establishes and builds on “clean training loops.” A clean training loop is a training loop where there is no extra behavior between the desired behaviors: the animal gets clicked, gets its reinforcement and goes right back to work. She refers to them as training loops because when we are training, we are often doing more than one repetition of a behavior or of a sequence of behaviors at a time. Therefore, instead of looking at a training session as a series of isolated events each marked by a click and reinforcer, we should look at a training session as an ongoing stream of behaviors, clicks and reinforcement. A good training method starts with a small unit of behavior and progresses by creating increasingly complicated loops. As Alex says, good training is Loopy Training, so she is not inventing something new, she is just giving it a name and looking more closely at what defines good training.
When I describe Loopy Training to people, I find it is easier to divide it up into two sections. The first has to do with the idea of training loops and their implications for food delivery, looking at what happens between behaviors and how to recognize good training. The second section has to do with when to use training loops and how to build them. I don’t want to end up repeating too much of what Alex has written in her posts, but I also didn’t want to jump in here with practical ideas for training loops without presenting some information on the basic idea of Loopy Training so I am going to briefly explain the basics about Loopy Training.
I want to point out that since Loopy Training is a way of looking at what makes good clicker training, there is a lot of stuff included under the umbrella of Loopy Training. I think that different people are going to pick out different elements so I am not presenting this information as anything other than a look at what was relevant to me right now. Other people are going to find value in other aspects of Loopy Training and maybe they will share them. I am going to start with how the Loopy Training model helps us understand the meaning of the click, the importance of food delivery and how to create clean training loops.
FOOD DELIVERY AND THE CLICK AS A CUE
I think the concept of Loopy Training started when Alex was looking at food delivery and the question of why some animals struggle so much with the early stages of clicker training. Alex and Jesus Rosales-Ruiz were sharing information about food delivery and the kinds of problems that poor food delivery can cause. The first time I heard about Loopy Training was Alex’s presentation at Clicker Expo 2009 where she talked about food delivery, poisoned cues and the skills of a good trainer. At that time, there was a lot of talk about food delivery and how important it is to have good mechanics and for the animal to know how to get its reinforcement. The importance of food delivery was something Jesus Rosales-Ruiz was looking at in his research and he started to view the click as more than just a marker signal. It is also a cue.
We all have heard that the click marks behavior, but he believes that it also acts as a cue to start the reinforcement cycle. The click cues the animal to get its reinforcement and it also cues the trainer to deliver the reinforcement. If the animal doesn’t know what to do when it hears a click, it doesn’t matter if you marked the correct behavior, the training session will become frustrating for the animal. Following this line of thought, in order for the training experience to be a positive and productive one, it is very important that the animal knows how to get its reinforcement before any behavior is reinforced. This goes back to the idea of “charging” the clicker, but with the emphasis on teaching the animal how to get the reinforcement, not that the click means reinforcement is coming.
When I first talked to Alex about her presentation, she was looking at food delivery and how sloppy food delivery can disrupt the flow of a training session. If the animal doesn’t know how to get its reinforcement (Do I go to the trainer? Does the trainer throw it? Is it in a bowl?) or doesn’t have a clear understanding of what the reinforcer is (Is it a toy or food? Is it one piece or five?), the time from click to treat can be longer than ideal and it can get filled with extra behavior, either associated with frustration or with the animal going into food search mode.
By the time I saw her presentation, she had realized that looking at food delivery and how to get the animal from the click to the reinforcement and right back to offering behavior was a good example of how to set up training and that it had implications beyond food delivery. This led to the concept of Loopy Training. When most clicker trainers are taught about the click and reinforcement, they view it as a linear process. The animal does a behavior, gets clicked and gets reinforced. What Alex realized was that because clicker training is based on doing many repetitions within a session, it made more sense to view it as a circular process where reinforcement was followed by the animal offering a behavior again so that it could get clicked and treated.
So instead of viewing it as behavior -> click -> reinforcement, it was more like behavior -> click -> reinforcement -> behavior -> click -> reinforcement -> behavior and so on. Training is not a linear process, it is a loop. The basic loop for any clicker trainer is behavior -> click -> reinforcement -> behavior -> click -> reinforcement ->… When you get this loop going where you have the desired behavior followed by a click and reinforcement and the animal immediately returns to the desired behavior, you have a “clean training loop.” I think we have all had training sessions that went well and the animal was offering the behavior, getting clicked and offering the behavior again with no interruption. Kathy Sdao talks about how good training should look rhythmic and she uses this as one indicator that a behavior is ready to be put on cue.
Looking at food delivery as a “loopy” trainer means you recognize that the basic loop that must be correct for any clicker training session is the click -> food delivery -> eat loop. The animal has to recognize that the click means get your reinforcement and look for the reinforcement as soon as it hears the click. How the animal gets its reinforcement can vary, but it does need to have a default behavior that it does when it hears a click. This could be as simple as orienting to the trainer when it gets clicked, or going to a location where it knows the reinforcer is delivered. There are no rules about how reinforcement has to be delivered, just that it has to be consistent and the animal has to know how to get it so that unwanted frustration and food seeking behaviors don’t creep in.
I want to mention that when I first heard about the dual nature of the click (marker and cue), there was some talk that perhaps the smallest loop should just be deliver food -> eat -> deliver food -> eat. For a horse that had never been hand fed, the idea was to teach good food manners before introducing the click so that the horse did not associate the click with frustration over food. I’m not sure if anyone has tried this or if this idea has been followed up. But I can see that for some horses who have never been hand fed or taken treats, it would be worth spending time getting them used to feeding before adding the click. I have not tried this and I think it would be important to set this up in such a way that it was a training session and not just random hand feeding, which can create some problems.
Included in the idea of a clean training loop is the importance of paying attention to how the horse takes the treat. A change in behavior during food delivery is often one of the first signs that the animal is stressed and it is something trainers need to monitor. The importance of food delivery was one of the pieces that came out of that first session I attended on Loopy Training. We all know that food delivery is important with horses and we spend a lot of time on eliminating mugging behavior and teaching good treat taking manners. Loopy Training provides a clear model of good treat taking and reminds us of the importance of the behavior between the click and treat, as well as the behavior immediately following the treat, which leads us to the idea of “white space.”
In the section on food delivery and the click as a cue, I mentioned the importance of having a clean training loop. If we look at the training loop for a single behavior, a clean training loop would be identified by having the animal proceed directly to getting its reinforcement after hearing the click, getting its reinforcement appropriately, and then immediately returning to offering behavior. In this sequence, I could write out a clean training loop as
behavior -> click -> treat -> behavior -> click -> treat -> behavior ->…
In writing this out, I have used an -> to show that the animal should proceed directly from one step to the next because I want to show the flow of training. But animals don’t always proceed directly from one step to another and this is where the idea of “white space” comes in. In a clean training loop, there would be no extra behaviors between each step so I could just write a list of behaviors, each separated by a space. This space is what I am calling “white space,” and it represents a clean transition from one behavior to the next. If I watch an animal doing a single behavior that it knows, there are three places where I want to see “white space.” They are between the desired behavior and the click, between the click and the reinforcement, and between the reinforcement and the next repetition of the desired behavior. If I see this, then I know the animal understands what is being marked, knows how to get its reinforcement and is ok with the training process.
But what we often see in the early stages of training is that our “white space” is filled with a lot of other junk (this is normal). The animal is not sure what is being clicked so there is some experimentation on its part and because the animal is still in a difficult part of the learning curve, there may be some additional behaviors between the click and reinforcement or the reinforcement and the animal engaging with the trainer again. There is nothing wrong with any of this. Loopy Training just tells us to monitor it because what happens in the spaces between behaviors, clicks, and reinforcement is important information. As clicker trainers, we have been told two important things. One is that the click ends the behavior and the other is that by focusing on what we do want, other unwanted behaviors will drop out. Let’s look at these through the eyes of a loopy trainer.
When I first heard that the click ends the behavior, I thought it meant that what happens between click and reinforcement is not important. The whole point of the click is to add precision and buy time so that you can deliver the reinforcer in a timely manner, and the animal still knows what is being reinforced. But it is not quite as simple as that. Using food as an example, we can see that because the food is the primary reinforcer, anything that happens between the click and treat is also being reinforced. It is not reinforced as precisely as what was clicked; it tends to come along as less precise or more varied baggage. This means I need to be mindful of my horse’s behavior between click and treat. I want to make sure that the horse stops what it is doing and starts the reinforcement cycle when it hears the click and that it moves directly toward getting its reinforcement. If there is a lot of other behavior in there (mugging, pawing, etc…), then that is valuable information for me.
I think most of us are aware of when extra behavior is creeping in, but what loopy training makes more obvious is the value of looking at anything that happens between desired behaviors (the white space) as an indication of how the training is going and that what happens in the white space provides information about how to get to a clean training loop. I find that the first step is just monitoring the white space. A lot of problems will just go away as the horse becomes more confirmed in the behavior. What I watch for is behaviors that are increasing in the white space, or undesirable behaviors that I might accidentally be reinforcing.
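For anyone who likes to keep notes or review video, monitoring the white space can be sketched as a small bookkeeping exercise. The Python below is only an illustration, not something from Alex’s material: the function name white_space_report, the event labels and the sample session are all made up for the example. It walks through a list of observed events and records anything that shows up between the expected steps of the loop.

```python
def white_space_report(observed, loop=("behavior", "click", "treat")):
    """Collect extra events by the gap ("white space") in which they occurred.

    `observed` is a list of event names in the order they happened.
    `loop` is the expected clean loop. Both are hypothetical labels
    chosen for this sketch.
    """
    extras = {}
    position = 0  # how many expected steps have been seen so far
    for event in observed:
        expected = loop[position % len(loop)]
        if event == expected:
            position += 1
        else:
            # Anything else is filed under the gap between the previous
            # step and the step we are still waiting for. Extras seen
            # before the very first behavior land in the treat -> behavior gap.
            previous = loop[(position - 1) % len(loop)]
            extras.setdefault((previous, expected), []).append(event)
    return extras


# A made-up session: mugging between click and treat, and looking
# around between the treat and the next repetition.
session = ["behavior", "click", "mugging", "treat",
           "looking around", "behavior", "click", "treat"]
for gap, events in white_space_report(session).items():
    print(f"{gap[0]} -> {gap[1]}: {events}")
```

A clean training loop gives an empty report. What matters for training is less the single report than whether the same gap keeps filling up, or fills up more, from one session to the next.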
As an example, my horse Rosie has always been easily distracted during my training sessions. I was able to increase her focus and build duration by chaining together behaviors and extending duration, but as soon as I clicked, she totally disconnected from me. It was fine for her to release herself from training mode when I clicked, but over time I noticed that she was slow to take her treat after the click and she was slow to get back to work after the treat. There were days when the stop, click, and treat process took longer than what I had just reinforced. For a long time I just ignored this, assuming she would get better as she became more familiar with the work and the behaviors got easier, but it was not happening. What she had learned was that she could use those two areas of white space to do what she wanted, which was look around. I needed to come up with a training strategy to address that instead of assuming it would go away.
Her distraction showed up at the last Groton clinic in October 2009 when I was working Rosie in the indoor and she was nervous about being in one end of it. I was working on asking her for jaw flexions and she was doing them nicely but any time she got a chance, she would turn and look at the door which was in the scary end of the arena. She filled all the white space with behaviors related to being distracted. I was just working away, concentrating on the quality of the jaw flexions when I realized that this was useful information. When she stopped gawking between click and treat and between treat and the next jaw flexion, I knew she was ok with what we were doing in that part of the arena and I could move on. In this case, I didn’t have to do anything to change her behavior except keep quietly working away, but monitoring the white space told me when to move to a new spot.
Here’s another example. I was helping someone with the early stages of clicker training and the horse had learned targeting and Grown-ups are Talking. When I asked the horse to target, he eagerly touched the target, and I could click and treat. But then instead of returning to the target, he turned his head away and there would be a pause before he offered to target again. This is not a big deal and the horse might just have needed some processing time or he might have thought that was the behavior we wanted since he had learned Grown-ups.
But it is worth noting and we came up with some ideas for how to encourage him to return to targeting more quickly if the behavior did not decrease over time. What we didn’t want was for this to become a persistent behavior pattern because in a clean training loop, he would be ready to go again after the treat. The behaviors around the click and food delivery are the ones the horse practices the most during any training session so you want to make sure you are not consistently letting the horse practice patterns you don’t like. You have to remember that in the early stages, you are experimenting with what you need to do to create a clean training loop and each horse is going to be different. This is not about rushing the horse, it is about finding a way to get both of you working in synch and establishing a good foundation for future training.
TRAINING LOOPS IN REAL LIFE:
In the section on food delivery, I already looked at some of the first training loops that most of us teach. Examples of these are food delivery -> eat -> food delivery -> eat ->, or click -> food delivery -> eat -> click -> food delivery -> eat -> click ->… or behavior1 -> click -> food delivery -> eat -> behavior1 -> click -> food delivery -> eat ->… This is an important place to start because without clean and efficient food delivery, any larger loop will have a weak link.
Most horses progress pretty rapidly through these early loops and then it is time to look at using and building training loops in other situations. When Alex presented this part of Loopy Training it made sense to me, but I had to go back and think about my own training. Do I train in loops? She points out that Loopy Training is not something new, it is just a way of identifying what many clicker trainers are already doing. By identifying it, it is easier to teach and explain and that helps new clicker trainers learn how to train effectively and helps more experienced clicker trainers to troubleshoot when their training is not progressing. So I started to pay attention to my own training to see when I was already using loops and when I was not. It turns out I do use loops, mostly for training new behaviors and for teaching horses about how to work through a longer chain.
A lot of the early training loops that I use have one behavior added on to the basic loop of click -> food delivery -> eat -> click ->.., because I am building my horse’s repertoire and shaping lots of new simple behaviors. In this case, the training loop is a series of repetitions where I am asking the horse to repeat what it just did and reinforcing those behaviors that lead to my target behavior. These loops can look very messy in the beginning because the horse does not know what I want, but as the horse learns the behavior, I am looking for clean training loops where the horse clearly knows what is being clicked and can repeat it a few times.
Keep in mind that I can have a clean training loop before I get to the final behavior. The Loopy Training model tells me that when a loop is clean, it is time to move on. When shaping a new behavior (as opposed to creating a chain), moving on means that I change my criteria. As I am shaping, I go through phases where the loop is clean and then as I look for the next piece of the behavior (change the criteria), it is not unusual for it to get messy for a bit. The goal is not to try and maintain clean loops through the whole shaping process, but rather to recognize when I am getting one. When I am shaping a new behavior, a clean loop means I am doing well and it’s time to add a new criterion, whereas a loop that doesn’t improve within a few repetitions probably means I need to change something about my training (set-up, criteria, etc…).
Once my horse has done some simple training loops and has some trained behaviors, then it is time to look at another use for training loops. This would be teaching a horse to do multiple behaviors before getting clicked. There are different ways to do this, but I am just going to look at building chains or sequences as that is the method most suited to Loopy Training. A chain or sequence is a series of behaviors that the horse must perform before getting to the click at the end. It is a great way to teach horses that even if they don’t get reinforced for a behavior immediately, reinforcement is still coming.
The Loopy Training model gives us a nice way to systematically introduce the concept of chains to horses by adding behaviors one at a time to an established loop. In order to do this, I want to start by creating a basic training loop with one behavior that serves as what Alex calls the “anchor behavior.” This is the behavior that is going to be reinforced by the click and treat. When I worked Rosie at the Groton clinic, Alex and I talked about how to identify a good anchor behavior. Alex often uses stand on the mat because most horses like that behavior and it has a very clear cue and we can control access to it. But stand on the mat is just an example of a behavior that you can use as the basic behavior for your training loop. When I was working Rosie at the Groton clinic, we needed to pick a behavior for the basic training loop and we picked jaw flexions because she is very familiar with them and they were something she could easily do in that situation.
I think that choosing a good “anchor” behavior is important and you might have to experiment a bit to see what works for your horse. Some things to consider when choosing the behavior are:
1. Is it something the horse likes doing? Or does it have a strong reinforcement history?
2. Do you have it under good stimulus control? If the anchor behavior is going to be clicked more than other behaviors, it is natural for the horse to anticipate so you need to be able to redirect the horse when it gets ahead of the game. I also think you need to consider that the more strongly a behavior is preferred, the more important it is to have it under good stimulus control or you will constantly be struggling with keeping an enthusiastic horse on task. It reminds me a bit of finding the right treat. You want to find a treat that the horse will work for, but doesn’t send it over the top. With an anchor behavior, you want to choose one the horse likes but not so much that you can’t control it.
3. You are going to be repeating the anchor behavior a lot so pick something easy and simple. It also helps if it is a behavior that you can click multiple times in quick succession, such as standing on mat, head down, targeting or similar behaviors.
Once you have chosen an anchor behavior, you are going to add it to your basic loop and test it out. Your loop will look like this:
Anchor behavior -> click -> food delivery -> eat -> anchor behavior -> click -> food delivery -> eat ->…
Since the anchor behavior is the one that is going to get clicked and treated, it is important that you stick with this loop long enough to make sure the horse is really enthusiastic about that behavior. In the questions I listed above, I noted that you can set yourself up for success by starting with a behavior that the horse already likes or that has a strong reinforcement history. This will help with the next step and allow you to take advantage of Premack (a less likely behavior can be reinforced by the chance to do a more preferred behavior). By setting the loop up this way, you are also in a position to take advantage of backchaining if you wish.
Once the anchor loop is clean, then it is time to add the next behavior. This is one of the things that the Loopy Training Model tells us. It says if the loop is clean, it is time to move on, otherwise the training will not progress. To add to the anchor loop, you just choose another behavior. I think choosing the second behavior carefully is almost as important as choosing a good anchor behavior because this is the first behavior that is not going to be reinforced directly by the click and treat. Again, I want to pick a behavior that my horse knows well, has a good reinforcement history and is easy for the horse to do. I also find it sometimes helps to choose a behavior that “makes sense” to the horse. By that I mean I will often pick a behavior that the horse needs to do before it can do the anchor behavior.
For example, if my anchor behavior is “stand on the mat,” I might add the behavior “go to the mat.” If my anchor behavior is “target,” where I present the target, I might add the behavior “go to the mat” and always present the target when the horse is on the mat. If my anchor behavior is “do a jaw flexion,” I might add the behavior “halt” so that I walk the horse, ask it to halt and then it can do its jaw flexion instead of just standing and doing the jaw flexion.
I think when you set up loops, you have to be a bit flexible about how you identify behaviors because how one defines a behavior is a bit arbitrary. In the jaw flexion example, I could say that I really have three behaviors: walk -> halt -> jaw flexion, but I can’t ask for halt unless I am already moving so I could consider “walk and halt” as one unit of behavior in this case. This works if the horse has already learned walk to halt as a chain from previous work and actually points out that when I add a behavior to my loop, I can add a single behavior or I can add a previously learned chain as one unit of behavior. In backchaining, trainers often create several mini-chains and then link them together instead of adding each behavior one at a time so this would be doing a similar thing.
So I have added a behavior and my loop is now:
Behavior1 -> anchor behavior -> click -> food delivery -> eat -> behavior1 -> anchor behavior -> click -> …
I want to do this a few times to give the horse a chance to figure out the patterns and then I want to carefully evaluate how my horse is doing. I want to ask some questions:
Is my horse ok with not being clicked for behavior1?
Am I seeing any frustration or confusion? Often frustration will show up as problems around food delivery, confusion will often show up as the horse trying to skip directly to the anchor behavior.
I think there is a balance between jumping in here and changing things and letting the horse figure it out. If I am in the early stages of chaining behaviors together and the horse gets frustrated, I might just leave that loop and try a different one. Sometimes I will switch the anchor behavior and behavior1 to see if that works better or to show the horse that both behaviors can be reinforced at different times. Alex sometimes reinforces the anchor behavior with a jackpot or by clicking and treating a few times before moving on. I find that I build a lot of two and three behavior loops before I ask for longer chains. The shorter loops mean the horses are on a higher rate of reinforcement and they give me a chance to see what behaviors work well together, what the horse finds easy or hard and what the horse finds most reinforcing.
Once I am beyond the stages where I am adding one or two behaviors to the basic loop, I am going to expand the loop by adding new behaviors one at a time, adding the next one when I have a clean loop. When I first watched Alex teaching Loopy Training, I was not sure how she was setting up the loops because sometimes she was backchaining, sometimes she was forward chaining and sometimes she was adding behaviors in the middle. It turns out that it doesn’t matter where you add the behaviors. Some loops will be more suited to backchaining and some more suited to forward chaining. What she does keep the same, in the examples I have seen, is the anchor behavior, so the same behavior leads to click and treat even as the loop expands.
Just as a note, she says that she does sometimes reinforce especially good efforts by clicking and treating that specific behavior, or by allowing the horse to skip the end of the loop and go directly to the anchor behavior.
SOME MORE THOUGHTS ON CHAINING AND APPLYING CHAINING STRATEGIES TO LOOPS:
If you haven’t done much work with chains, it is worth reading up on common strategies for building long chains and/or sequences. Some people refer to chains as behaviors performed in the same order each time, and sequences as behaviors performed in different orders, as cued by the trainer. In both cases, the reinforcement comes after the last behavior and for simplicity’s sake, I am just going to refer to chains here, but the concepts of chaining apply to both. If you want to build chains successfully, it is worth knowing a few common strategies such as using Premack, backchaining, forward chaining, the micro-shaping strategy, the appropriate use of reinforcers, using cues to reinforce and shape behaviors and anticipation.
But before we get to that, I just want to point out that the Loopy Training Model tells us what to do about unwanted chains. I have already talked about the importance of getting a clean training loop when I am shaping or building behaviors and in this case, I am building a chain or loop on purpose. Sometimes in my training, I end up accidentally creating an unwanted behavior chain when I am not intending to make a chain at all. What happens is that I cue or reinforce a behavior that the horse offers after doing another behavior and instead of just getting the new behavior, I get the first behavior (often unwanted) and then the new behavior as a two-behavior chain. Common examples of this are a horse that learns to mug so it can “not mug” and get reinforced for not mugging, or the horse that comes forward so it can be asked to back again. What Loopy Training tells us is that we need to tighten up the loop by reinforcing the desired behavior when it does not follow the unwanted or extra behavior so that link of the loop is broken. Instead of behavior1 -> behavior2 -> click -> food delivery -> eat -> behavior1 -> behavior2 -> …, I want behavior2 -> click -> food delivery -> eat -> behavior2 -> … This means I want to avoid cueing behavior2 when the animal is doing behavior1. I do want to point out that this tendency for animals to chain behaviors together is not always a bad thing and I will write about it in the section on cues as markers and reinforcers.
Common Strategies for Building Chains and therefore Loops:
The Premack Principle:
We use Premack in chains because they are going to be stronger if the animal is working toward behaviors it knows or likes. Even though we are using the Loop model, there is still a strong pull toward the part of the loop containing the click and treat so we can use Premack to increase the desire to get to the end of the chain, or we can use Premack to strengthen parts of the chain by considering the order in which we assemble the links.
When we want to use Premack with animals, what we do is reinforce one behavior with the chance to do another more preferred behavior. If my horse’s favorite activity is targeting, I can reinforce another behavior such as walking toward me with the chance to target. I don’t have to reinforce walking toward me with a click and treat because I can offer the chance to target as reinforcement for approaching, and then click and treat the targeting. This gets me two behaviors for one click and treat. It is really easy to use Premack with any positively trained behavior and if you construct your chains so that the animal is working toward a preferred behavior, you can build longer chains and build enthusiasm instead of decreasing it.
As a note, you can use Premack where the desired behavior is not a trained behavior, but you do have to be able to control access to the behavior in order for it to work well. When I first took Rosie to the indoor arena, she was too nervous to eat. So instead of clicking and treating, I used Premack and I reinforced correct behaviors by the chance to walk on a loose rein and look around, which is what she really wanted.
Backchaining is a way of systematically building a chain from back to front so that it takes advantage of Premack and builds enthusiasm. In backchaining, you are teaching a behavior chain, but you do it in reverse order so if the chain has behaviors A -> B -> C -> D, you create the chain starting with the last behavior, D. You don’t have to train the separate behaviors in that order unless the chain itself requires it. Backchaining just refers to how you put the chain together. Loop 1 would be D -> click -> treat -> D -> click -> treat and so on… Loop 2 would be C -> D -> click -> treat -> C -> D -> click -> treat. Loop 3 would be B -> C -> D -> click -> treat -> B -> C -> D -> click -> treat. Loop 4 would be A -> B -> C -> D -> click -> treat -> A -> B -> C -> D -> click -> treat.
The advantage to doing this is the animal is always working towards what it knows and it is also working towards behavior with a stronger reinforcement history. Since I start with D, by the time I get to adding A, D has been reinforced many times and is able to act as a reinforcer for C which is Premack at work.
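For readers who like to see a procedure written out step by step, here is a minimal Python sketch of the backchaining sequence described above. The function name backchain_loops is made up for this illustration and is not part of Alex's material. It only reproduces the four loops already listed for the chain A -> B -> C -> D.

```python
def backchain_loops(chain):
    """Return the successive training loops for a chain, built back to front.

    Each loop starts one behavior earlier than the previous loop and always
    ends with the click and treat. (Illustrative helper, not an official tool.)
    """
    loops = []
    # Walk from the last behavior back to the first one.
    for start in range(len(chain) - 1, -1, -1):
        loops.append(chain[start:] + ["click", "treat"])
    return loops


loops = backchain_loops(["A", "B", "C", "D"])
for number, loop in enumerate(loops, start=1):
    print(f"Loop {number}: " + " -> ".join(loop))
```

Each printed line is one pass through the loop. In practice each loop would be repeated until it is clean before moving on to the next one.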
A similar approach would be to use the microshaping strategy and have one behavior that is on a higher rate of reinforcement and use that behavior to reinforce the previous behavior. When Alex sets up training loops, she often uses stand on the mat as the behavior in the first loop and as she adds new behaviors, she will continue to reinforce the mat at a higher rate of reinforcement so that the horse is highly motivated to move through the chain and get to the mat.
This is similar to the micro-shaping strategy because a familiar and easy behavior is being used to reinforce the harder behavior and it gives the horse a mental break as well as confirming that the previous behavior was correct.
Chains can also be built by adding behaviors to the end of the chain. In the case of Loopy Training, this would mean adding behaviors between the end behaviors and the anchor behavior. Alex usually (as far as I have seen) keeps the anchor behavior the same, but she will add a new behavior wherever it seems most appropriate. Some chains are more suited to backchaining and some are more suited to forward chaining. I will use forward chaining if each behavior depends upon the completion of the previous behavior in order to be done correctly. A lot of the groundwork exercises such as 3 flip 3 and Hip Shoulder Shoulder are built as forward chains. You get the first step and then when that is correct, you add the next step.
Appropriate Use of reinforcement:
When I first start chaining behaviors together, I often keep the same “pay rate” for my horses; they just get it in one feeding at the end instead of for each behavior. So if I have a one behavior chain, I feed one treat (or whatever is my standard for one behavior). If I have a five behavior chain, I feed five treats. This helps motivate the horse to work through the chain and I can slowly cut back the amount of food once the horse gets used to doing more behaviors for one click.
I might also add in some conditioned secondary reinforcers or keep going signals. These can be helpful if the horse seems to get confused about whether a behavior was correct or not. In many cases, I can just use obvious verbal encouragement if I haven’t conditioned any reinforcers and this helps. You have to remember that the horse gets two things out of the click and treat. It gets the food which serves as motivation and reinforcement, but it also gets information from the click as the click is confirmation that the behavior was correct. I think some horses miss the treat for the behavior, and some horses miss the feedback of the click saying ‘yes, that’s right’, so using another reinforcer can help answer their question or keep them motivated. I do fade this out (or at least decrease it some) as the horse learns about chains because otherwise too much just becomes background noise.
Don’t forget that doing a favorite behavior can be a reinforcement. I often insert “fun” behaviors in early chains so that the horse gets to do something it enjoys and this is a good use of Premack, as well as keeping it more interesting for the horses.
Cues as Markers and Reinforcers:
When I wrote out the examples of training loops, I purposely omitted cues because I wanted to just focus on the behaviors and how to create clean training loops. But part of having a clean training loop is having good stimulus control. The cue is what directs the horse through the sequence. Most of us think of cues as green lights to tell the animal that a particular behavior might be reinforced if the animal does it now. But cues also have other functions and these are as reinforcers and markers.
A cue acts as a reinforcer if it has been positively trained, meaning that the animal wants to be given the cue because it could lead to reinforcement. Alex talks about why poisoned cues are a problem in chains in one of her posts, so I am not going to go into that here. But I will say that you want the animal to react positively to the cue. If the animal reacts positively to the cue, then the cue will reinforce whatever the animal was doing when you gave the cue. In other words, if the animal is doing behavior1 and you ask it to do behavior2 and reinforce it every time it does behavior2, you are going to see the animal start offering behavior1 more often.
Cues are an important part of the chain in that they help hold the chain together because each cue has two functions. It tells the animal what to do next and reinforces the animal for the behavior it was just doing. What the animal learns is that in the absence of a click, a new cue is affirmation that the last behavior was done correctly and it gets to do another behavior that might lead to reinforcement. For this to work, the animal must view the presentation of the cue as reinforcement. This is why cues that imply correction don’t work well in chains because those cues are essentially telling the animal that it did it wrong, and it should do this instead. We want the cue to mean “great job, let’s do this one next.”
Because cues act as reinforcers, it turns out that they can be used to mark and shape behavior too. Most of us do not always recognize that we can take advantage of this, but it adds another level of precision if we are careful with the timing of our cues. Remember how I wrote about using the Training Loop model to get unwanted behavior out of a chain? Well, this is the plus side of the fact that animals easily chain behaviors together if they are cued for one behavior while doing another. I first heard about shaping with cues a few years ago from Jesus Rosales-Ruiz and I couldn’t figure out what he meant. It was not until I watched the poisoned cue video that I had the light bulb moment. In the video, the dog was being trained to come when called, but the rule was that the dog could not be called unless it was a certain distance away. In the beginning, there was a lot of movement on the handler’s part because she had to keep moving away from the dog so she could call it, but over time the dog learned to wait at a distance so that the handler could then give the cue to come. The dog had learned the behavior “stay this distance away from the handler” because that was the only time the cue to come was given.
With this example in mind, I could start to see how to apply it to horses. For example, I was working Rosie in head down at a walk and when I stopped, she would pop her head up a few inches and then drop it down. I tried only reinforcing efforts where her head stayed lower but I didn’t see a significant change. Then I started watching her walk and realized that even though she was in head down, there was some variation in how low she carried her head. I started asking for whoa at the moment the head was the lowest. My idea was just that she would be less likely to pop her head from that position. I got a big improvement in her halts, but interestingly enough, I also got a big improvement in how low she carried her head in general. Because she got to stop and get her click and treat when her head was at the lowest, she started carrying her head lower more of the time. I had shaped a lower head carriage by the timing of when I gave the halt cue.
I think once we are aware of this, we can use the timing of our cue in training loops to further shape the behaviors we have already included. This has the advantage of making the training more interesting to the horse because we are not just repeating the same behavior over and over again.
Anyone who has ever built a few behavior chains knows about anticipation. It is one of the challenges of chaining behaviors together with reinforcement for the last behavior. If that is the only source of reinforcement, the animal is in a hurry to get to the end of the chain. This can be channeled into enthusiasm which is good, but it can also lead to anticipation and the animal skipping steps in the chain. This is especially true with backchaining where the animal knows the sequence of behaviors and practices the end of the chain a lot.
The Loopy Training model tells us that we want a clean training loop where the animal does every behavior in order without extra behaviors. It also tells us that a clean training loop must contain all the behaviors and the animal cannot skip steps. For many trainers, building chains is a good test of their stimulus control because if the stimulus control for each behavior is not good, the animal will drop behaviors out of the chain. I learned this with Rosie who tends to skip steps if I ride the same pattern too many times. Initially I found this anticipation counterproductive, but over time I have come to see it as a good thing. I want my horse to be eager to move on, because that means the horse is not fixated on getting the click and treat for each behavior, but on doing the next behavior which gets it closer to the end of the chain. I also found that if she was anticipating the next behavior, as in preparing herself, when I did give the cue, she was ready to go.
If we look at anticipation that way, as the horse being prepared, then it can be viewed as a good thing. In their backchaining work, Morten and Cecilie use anticipation to create snappy and prompt responses to cues. The animal is just waiting for the cue and knows exactly what to do so they get strong and consistent performance. They also point out that testing (trying to skip steps in a chain) is normal and a chain that has been tested is actually stronger than one that has not.
What I have found is that if I have good stimulus control, anticipation is not a problem. If I don’t have good stimulus control, then anticipation can break down my training loop. At the same time if I am too rigid about stimulus control in the beginning, I will squelch any anticipation. So what I want to do is experiment with letting the horse tell me when it is ready to do the next step and use the horse’s body language as information about what it is ready to do. By allowing a bit of anticipation, I can learn to read my horse’s body language and learn to work with the anticipation instead of against it. What this means is that my horse and I need to work out what are “pre-cues” that are the setup for a behavior and agree on what actually triggers the behavior. It’s a bit like defining “on your mark, get set, and go” for any behavior. Some behaviors do need some preparation and using your horse’s anticipation can help sort out which cue is actually the best “go” cue.
At the clinics, Alex talked about how a training loop is an ongoing conversation between horse and trainer and that anticipation is an important part of this conversation. The trainer cues the horse to do the next behavior so part of the focus in the training loop is on directing and monitoring the horse. At the same time, the horse is giving information to the handler through its performance, white space and food delivery, and anticipation. Anticipation is useful information. When the horse does the next behavior before I ask, it is telling me it knows the answer and is ready to go. When a horse skips a step and anticipates a part farther down the chain, it is telling me I need to modify my chain, clean up my cues, or go back a step and rebuild the loop more carefully. Which leads us to the question of what do you do if your loop is not working.
WHAT IF YOU DON'T GET A CLEAN TRAINING LOOP?
The idea behind Loopy Training is that you start with a loop and when it is clean, you add a new behavior. Alex uses the presence of a clean training loop as a sign that it is time to move on. One of the problems trainers often have is knowing when to ask for the next step and when to continue with the same pattern until it is really confirmed. What Loopy Training suggests is that if the loop is clean, it is time to move on and if the loop is not clean, you need to find a way to make it clean. I think this is a nice way for people to look at it. When we train behaviors, we are constantly making choices about when to add new criteria. If we wait too long, the horse gets stuck at a certain stage and if we go too fast, the behavior falls apart and the horse gets frustrated (the trainer gets frustrated too). It definitely is an art to know how to keep training flowing along. But once you know what a clean training loop looks like, it is easy to tell when it is time to move on.
Loops can be modified by adding, subtracting or re-organizing behaviors. All the previous strategies for building loops had to do with how to organize or build loops to maximize the horse’s desire to move through the loop toward the click and treat. When I write about adding and subtracting behaviors, I am talking about troubleshooting if you don’t have a clean training loop. The first few times the horse does the behaviors in a loop, there are going to be some areas where the loop is not clean. These may show up as extra behaviors in the white spaces or they may show up as difficulties within the behaviors themselves, where the horse is not meeting the criteria or there seems to be some confusion. This is normal and I always let the horse do the loop a few times before I start tinkering with it, unless there is an obvious problem or the horse seems overly frustrated.
I have already pointed out that the Loopy Training model tells us when to move on. Does it also tell us what to do when training is not progressing? Yes. What I have found is that when I am not making progress toward a clean training loop, Loopy Training tells me I need to change something, but what I need to change is going to vary from situation to situation. It is not necessarily a linear process in which, if Loop 2 falls apart, you go back to Loop 1. I have found that when a loop doesn’t work well, it can be for various reasons and I need to realize that there are different options for what to do.
In most cases, I have three ways of working through a problem in a Loop. Probably there are more, but these three cover most situations. I can go back (subtract a behavior) to make the loop simpler, I can modify the loop by rearranging the behaviors or adding a behavior in between (perhaps I was lumping and skipped a step), or I can add a behavior.
For example, the basic loop in many of these examples is behavior -> click -> food delivery -> eat -> behavior -> click -> … We could use a simple loop such as targeting for this example. I have a horse who has good basic food delivery and I am asking the horse to target, which he is doing well, but I am getting a lot of mugging behaviors around food delivery, so my loop is not clean.
I started with Loop 1: click -> food delivery -> click -> food delivery -> … and moved on to Loop 2: target -> click -> food delivery -> target -> click ->…
But my Loop 2 actually looks like:
Target -> click -> push at handler -> nuzzle pockets -> take nose away -> food delivery -> eat -> push at handler -> target and so on…
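One way to make the extra behaviors stand out is to write the loop down and compare it with the loop you planned. The short Python sketch below does that for the example above. It is only an illustration: the function name extra_behaviors is made up, and the planned loop is assumed to be target, click, food delivery and eat, so anything else that shows up is treated as an extra behavior in the white space.

```python
def extra_behaviors(observed, planned):
    """Return the observed steps that are not part of the planned loop, in order."""
    return [step for step in observed if step not in planned]


# Assumed planned loop for this example (Loop 2 with eating written in).
planned_loop = {"target", "click", "food delivery", "eat"}

# What the horse actually did, as written out above.
observed_loop = [
    "target", "click", "push at handler", "nuzzle pockets", "take nose away",
    "food delivery", "eat", "push at handler", "target",
]

extras = extra_behaviors(observed_loop, planned_loop)
print(extras)
```

An empty list would mean the loop is clean. Here the list shows that the extra behaviors cluster between the click and food delivery and again after eating, which is where the options below focus.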
Option 1: An easy thing to try would be to just go back to food delivery and tighten up the loop. If I want to go back, I just go back to Loop 1 and make sure it is clean. Sometimes going back is all that is needed because the horse just got confused and a bit frustrated. Going back gives the horse a break and gives the trainer a chance to clean up any details that have changed. When Loop 1 is clean again, I can try moving on to Loop 2.
I would go back to Loop 1 and keep presenting the food in the same location so that the horse got in the habit of keeping its head more still and waiting in the general food delivery area instead of looking or moving toward me for food.
If this was successful, I could go back to targeting (Loop 2) and see if the horse had improved. What I might find is that as soon as I add targeting, the behavior falls apart again, so I need to add an intermediate step. This would be the second option: modifying the loop to create an intermediate step between Loop 1 and Loop 2.
Option 2: Modify Loop 1 to create an intermediate step toward Loop 2. I could add the behavior “head still” before delivering the food. My new Loop would be
Loop 1a: Head still -> click -> food delivery -> eat -> head still -> click -> food delivery -> eat and so on…
It is worth noting that this is not necessarily an “easier” loop as it still follows the format of behavior -> click -> food delivery -> behavior ->… I have just replaced the behavior of “target” with the behavior of “head still.” For some horses this might be harder than targeting, which doesn’t require much self control.
If this worked, then I would add targeting and my loop would be:
New Loop 2: Head still -> target -> click -> food delivery -> eat -> head still -> target -> click -> …
Another option for the original problem could be to decide that the horse would benefit from an added behavior.
Option 3: Adding a behavior: Perhaps I need to give the horse more direction and I am going to add a behavior such as backing for food delivery as part of the chain.
So my new loop would be:
Loop 3: Target -> click -> back -> food delivery -> eat -> target -> click -> back -> food delivery -> eat and so on…
This example shows three different ways to experiment with a training loop to clean it up. They are just to give you an idea of ways that you could change the loop to get a different result. The nice thing about Loopy Training is that it gives you a model for what good training looks like, but it doesn’t lock you into a set procedure so you can come up with lots of different solutions for different horses.
Here’s an example. At the Groton clinic where we did jaw flexions with Rosie, the behavior following the jaw flexions was walking off on a circle. If she did a good jaw flexion, she got to walk off. If she was sticky about the jaw flexions, we would work on the jaw flexion until I got a good one and then I would allow her to walk off. The walking off was reinforced with a click and treat, but it was also a break for her and a chance to move. She was a bit anxious in the indoor so movement was reinforcing to her. At some point, she started to anticipate walking off when she had done a good jaw flexion and I was losing my ability to choose when she should walk off. In a way it was kind of funny, she knew when she had done it right and was cueing herself to walk off.
I had lots of options here. I could have gone back to just reinforcing the jaw flexions which was a previous loop and would be subtracting a behavior. Or I could have added a behavior by adding a reinback after the jaw flexions. Either one would work, but I had to balance out whether I wanted to concentrate on stimulus control (don’t walk off until I tell you), letting her practice jaw flexions without walking (so she didn’t anticipate) or add a new piece so that she learned to rebalance after the jaw flexions (by adding the reinback). There is no right answer here, but lots of options and I might choose one for one training session and another for another session because they are all valid elements of getting a clean training loop.
IS EVERYTHING LOOPY TRAINING?
I want to leave this article by pointing out that training loops come in all shapes and sizes. The principles of Loopy Training can be applied whether I am working on one behavior, teaching a horse to do multiple behaviors in a row in various orders, or working on building a pattern. Once a horse has a good repertoire and the basic idea of doing several behaviors in a row, I can flow in and out of various training loops, mixing things up to keep it interesting and exploring new combinations that might benefit the horse physically or mentally.
I don’t necessarily start with a basic training loop and spend the whole session just adding on to it. I have had rides where I do that and it can be a good exercise, but Loopy Training does not mean I am locked into only using that strategy. I am still using the Loopy Training Model when I look at what happens when I click, food delivery and white space and when I repeat a pattern a few times to see how the horse does. I think the Loopy Training model brings a new awareness of some of the important elements of good training and the idea of training loops is very helpful in reminding me to do each exercise a few times before moving on so I can evaluate how my horse is doing and not just assume that if I practice it once or twice here and there, things will get better.
* * * * * *
Summary of Important Points and Checklist for Practical Considerations:
Loopy Training is a model that helps develop good clicker training skills. It is not just a specific strategy or procedure although it can lead to using some strategies more than others, but it brings an awareness of the importance of all parts of the training loop.
What Loopy Training means is that training is not an isolated series of behaviors followed by clicks and reinforcements, but a stream of behaviors, clicks, and reinforcement that flow one after another.
A clean training loop is one where each behavior follows the previous one with no interruptions, the animal gets clicked and treated, and returns to offering behavior with no extra behaviors thrown in at any point. This includes between click and treat, and between eating and offering the behavior again. What is an “extra behavior?” Something that disrupts the training process or shows confusion on the animal’s part. It is unrealistic to expect 100% focus for long periods of time, so some extra behaviors will come in here and there; you just have to evaluate them.
Food Delivery and the Click as a Cue:
According to Jesus Rosales-Ruiz, the click is a cue for the animal to get its reinforcement. It is important that the animal knows how and where to get its reinforcement so that there is not a long delay between click and treat. Uncertainty about how to get reinforcement can lead to frustration and confusion on the animal’s part and discourage the animal from continuing.
The smallest and first loop we usually teach is asking for one behavior such as targeting and can be written as (behavior -> click -> food delivery -> eat -> behavior ->…). In some cases, it may be appropriate to start with something simpler such as click -> food delivery -> eat -> click -> food delivery… or food delivery -> eat -> food delivery -> eat -> food delivery…. Problems in food delivery should be addressed early on so that clicker training is a positive experience for horse and trainer.
Practical considerations: Things to monitor:
Does the horse orient to you as soon as you click?
White space is what I am calling the gaps between behaviors, between click and reinforcement and between reinforcement and the start of the next behavior. In a clean training loop, I could write out my list of behaviors with a white space between each behavior meaning that there were no extra behaviors and the horse flows from one behavior to the next, gets clicked, gets its reinforcement and goes right back to working again.
First, identify all the places where you want white space.
Second, start monitoring each one. You are looking for patterns of behavior. Are there some behaviors where the horse flows easily from one to the other?
Are there places where the horse consistently adds in extra behavior?
Are the extra behaviors in the white space ones the horse knows and is offering (is there cue confusion) or are they stress or avoidance behaviors?
Training Loops in Real life:
The most common uses for training loops are training new behaviors, creating chains (or loops), and removing unwanted behaviors from chains.
Shaping a new behavior is a one behavior training loop where a clean training loop means it is time to advance criteria.
Training Loops start with a simple loop such as behavior -> click -> food delivery -> behavior ->… where the behavior is the anchor behavior. A good anchor behavior is reinforcing to the horse, easy to do and under stimulus control.
Once the basic loop containing the anchor behavior is clean, the trainer can add other behaviors using various chaining strategies.
Some More Thoughts on Chaining and Applying Chaining Strategies to Loops:
Chaining behaviors together is one way to teach an animal to do several behaviors before getting clicked and reinforced. Loopy Training provides a model for how to systematically build chains so that the animal understands reinforcement is coming and how to get it. The Loopy Training model also makes it easier to see how to modify a chain (or loop) that is not working and how to know when it is time to add more behaviors to the loop.
To add new behaviors, we have the options of using Premack, backchaining, forward chaining, the appropriate use of reinforcers, or just inserting a new behavior where it makes sense. We should keep in mind that chains often build enthusiasm and can show holes in our stimulus control. The microshaping strategy (reinforcing an easy behavior multiple times as reinforcement for a harder behavior) can also be used to help the horse understand the concept of chains.
Do not move on until the horse clearly is ok with being reinforced for one behavior by the chance to do another. I usually build lots of little two behavior loops before I create any three behavior loops as I want the horse to get the concept of chaining behaviors together before I make it harder.
In addition, we should be aware that cues can act as reinforcers and markers which are important in maintaining or improving the quality of a chain. Most of us view a cue as a green light to tell an animal which behavior might be reinforced. Cues should also be viewed as reinforcers in that the presentation of the cue reinforces whatever the animal was doing at the time, assuming the animal reacts positively to the cue.
And if you want to get more precise, you can use the timing of your cue to shape or refine a behavior. This can happen whether you intend to or not, so it is worth paying attention to exactly when you cue the next behavior.
Anticipation is useful information and normal when building chains or loops because the main reinforcement is at the end of the chain. In Loopy Training, we use anticipation to provide us with information about a few things such as stimulus control, the horse’s attitude and emotional state, and the construction of our loop.
Stimulus control: In order to have a clean training loop, the horse must not add or subtract behaviors. When a horse starts to anticipate, it often skips steps and omits certain behaviors. If you allow this to happen, your loop will end up containing fewer and fewer behaviors. In some training situations, we want some behaviors to drop out as the animal gets more skilled but this needs to be a conscious choice. I find that setting up a loop and doing it a few times is a great way to check my stimulus control and clean up my cues.
The horse’s attitude and emotional state: Some anticipation is a good thing. It shows that the horse is interested in the training and eager to offer behaviors. I like my horses to show me that they are ready to do the next behavior. If a horse is reluctant or sticky about doing several behaviors in a row, it tells me that I have gone too fast and the horse is not enjoying the work.
Construction of the Loop: Sometimes anticipation just points out areas of the loop that need modification. If I have a loop set up and the horse is anticipating a behavior, it might mean I need to add another step to make it cleaner. For example, if my loop contains halt -> flex -> walk forward, it is likely that at some point the horse is going to go forward after I ask it to flex, but before I cue forward. I can shorten the loop and do halt -> flex for a while until the horse stops thinking about going forward, or I can add a behavior and do halt -> flex -> back -> forward to get the horse to back up instead of going forward. Once the horse can do that, I can fade the backing until the behavior is halt -> flex -> rock back -> forward and I now have control of forward.
What to do if you don’t get a clean training loop:
If you don’t get a clean training loop or see improvement after several repetitions, you probably need to modify your loop. Three ways to modify the loop are to:
go back to a previous loop (subtract behavior)
modify the loop (add an intermediate step)
add a behavior (make it clearer to the horse)
Different options will work for different situations. Going back to a previous loop is always a good place to start to make sure the loop is clean (sometimes we rush or don’t notice things that cause problems later). Sometimes we just went too fast and we need to add a few steps between the previous and current loop. And there are times when the horse needs to be asked for more specific behaviors and we want to add steps so it doesn’t end up guessing what to do when going from one behavior to the next.
Is Everything Loopy Training?
Becoming aware of training loops is like learning to look for something new that you never noticed before. Once you recognize them, you realize they are everywhere. Being aware of them and when to use them to your advantage and how to set them up in training sessions is a good way to improve your clicker skills to create efficient and positive training sessions.
Katie Bartlett, 2009 - please do not copy or distribute without my permission.