Clicker Expo 2008: Lexington, Kentucky

I was fortunate to be able
to attend Clicker Expo in Lexington, Kentucky in March 2008. This was my
third expo and like the ones before, it was a great experience. I met some very
nice people, learned a lot, and had the opportunity to share ideas with other
clicker trainers. I thought I would share some of the highlights of my
experience because I think that this expo generated a lot of new ideas for me
about how to clicker train horses more efficiently and more creatively. I
need to point out that Clicker Expo runs for 3 days with 3 time blocks of
lectures and labs each day. Within each time block, there are five choices for
what to attend, so every attendee puts together his or her own program and the
sessions I am going to describe are just those that I attended. Friday
morning starts with an opening session where Aaron Clayton presents some
information on who is attending (what states we come from, what kinds of jobs)
and then the faculty are introduced. Don't hold me to these numbers but
someone told me there were about 400 attendees with 150 dogs. The sessions
I attended were:

Ken Ramirez: "Working for the joy of it: A Systematic Look at Non-Food Reinforcers"

Ken
Ramirez is the vice president of animal collections and training at the Shedd
Aquarium in Chicago and he is always a good presenter with a nice mix of stories
and information. The focus of his talk was how to condition and use non-food
reinforcers to provide reinforcement variety in your training. He started
by explaining the difference between primary and secondary reinforcers. Primary
reinforcers are those things that inherently satisfy a biological need and
include food, water, shelter, air, reproduction, and safety. Secondary
reinforcers are reinforcers that acquire their value through association with a
primary reinforcer. There are lots of different kinds of secondary
reinforcers including event markers and keep going signals. For the
purpose of his discussion, he was focusing on what he called "reinforcement
substitutes" which are reinforcers that a trainer can use in place of a primary
reinforcer. I want to point out that this session was an "advanced" level
one and he was very clear that novice trainers should not be using reinforcement
substitutes without supervision or guidance from more experienced trainers,
although there are exceptions.
Reinforcement substitutes are useful for trainers for those times when a primary
reinforcer cannot be used either because it is not available or because the
animal does not want it. He explained how they used reinforcement substitutes
with one of the dolphins when it was sick and would not eat. They needed to feed
her medicated fish so they actually got her to eat the fish by using
reinforcement substitutes in training sessions and including some fish eating as
a desired behavior. One of the points he made with this example is that the
value of reinforcers can change so it is useful to have several reinforcers
available to give the trainer a choice of what to use in any given situation. He
presented detailed information on how to condition reinforcement substitutes. He
uses classical conditioning to associate the new reinforcement substitute with a
primary reinforcer. The procedure starts with choosing a reinforcement
substitute you want to use. He gave us some guidelines for choosing these. It is
easier to start with something you think the animal might find reinforcing, but
you can start with something neutral. He does not recommend starting with
something the animal finds aversive. And he is very specific that even if
you choose something that you already think the animal finds reinforcing, you
still need to go through the conditioning process, because the fact that an animal likes something doesn't mean it will work for it. Later, he
mentioned that one of the cautions about using reinforcement substitutes is that
in order for them to be strong reinforcers, you need to control access to them.
Therefore if your animal has a behavior that it finds strongly reinforcing, and
you want to condition it as a reinforcement substitute, you may need to limit
access to that reinforcer under some situations. The
trainer is going to start associating the reinforcement substitute with the
primary reinforcer by presenting the reinforcement substitute and then
immediately presenting food. He does not click because he does not want the
animal to think it needs to "do anything" to earn the primary reinforcer.
At this point the reinforcement substitute is just indicating that primary
reinforcement is coming. He does this for many sessions over weeks, months and
maybe years, as long as it takes for the animal to make the association.
This is the first step in the process. The
second step is to start using the reinforcement substitute in training and see
if the animal accepts it. He had some general rules to follow. Each step
is repeated multiple times:

1. Start with an easy, well-established behavior. For the first reps, ask for the behavior, click, present the new reinforcement substitute, and follow with the primary.
2. Mix in some reps where you follow the click with the reinforcement substitute and NO primary. Only do this 3x max per session.
3. Now start using the reinforcement substitute after a harder but well-established behavior, by asking for the behavior, clicking, offering the new reinforcement substitute, and following with the primary.
4. Start following a harder but well-established behavior with the reinforcement substitute only. Again, only do this 3x max per session.
5. Increase the use of reinforcement substitutes, but do not use more reinforcement substitutes than primary reinforcers. His general guideline is 20/80 max, but this can be spread over multiple sessions, so you can have some sessions where you use more reinforcement substitutes than others.
6. You must continue to pair the reinforcement substitute with the primary reinforcer on a regular basis to keep the association strong.
7. With beginners, he had more specific rules for introducing reinforcement substitutes gradually: never use an RS (reinforcement substitute) after 2 consecutive behaviors, avoid using the same RS twice in a row, always ask for a behavior followed by a primary reinforcer more often than a secondary, and keep the association strong by reinforcing the reinforcement substitute regularly.
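To make the ratio guidelines concrete, here is a rough sketch in Python of how a planned session could be checked against them. This is just my own illustration, not anything Ken presented; the "primary" and "scratch" labels and the check_session helper are hypothetical.

MAX_RS_FRACTION = 0.20      # guideline: substitutes are at most ~20% of reinforcers
MAX_RS_PER_SESSION = 3      # guideline: at most 3 substitute-only reps per session

def check_session(reps):
    """Return warnings for one session plan.

    reps is a list of strings: "primary" or the name of a
    reinforcement substitute (RS), in the order they will be used.
    """
    warnings = []
    rs_reps = [r for r in reps if r != "primary"]

    if len(rs_reps) > MAX_RS_PER_SESSION:
        warnings.append("more than 3 substitute-only reps in one session")
    if reps and len(rs_reps) / len(reps) > MAX_RS_FRACTION:
        warnings.append("substitutes exceed ~20% of reinforcers")
    for first, second in zip(reps, reps[1:]):
        if first != "primary" and first == second:
            warnings.append("same substitute used twice in a row: " + first)
            break
    return warnings

# Example: a 10-rep plan with two different substitutes mixed in.
plan = ["primary", "primary", "primary", "primary", "scratch",
        "primary", "primary", "primary", "tactile pat", "primary"]
print(check_session(plan))   # [] -- this plan stays within the guidelines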
He showed how to use reinforcement substitutes in chains and take advantage of
Premack. He gave examples of chains where there were very few
primary reinforcers and how this initially looks like a variable reinforcement
schedule. But included in the chain, he had behaviors that had been conditioned
as reinforcement substitutes so when you identified those in the chain, it
showed that the animal was getting steady reinforcement. When
talking about introducing reinforcement substitutes, he said that it is
important to keep in mind the animal's expectations. This means that if
you have used the same reinforcement for every click, the animal expects that
type of reinforcement. If you now try adding in reinforcement substitutes, you
have to be very careful to avoid frustration because the animal has expectations
about what follows a click. I found this particular detail very relevant because
99% of the time, I reinforce my horses with food. If I want to start
adding in substitute reinforcers, I need to do it slowly and read my animal to
make sure I don't create frustration or confusion. He also
pointed out that not everyone needs to use reinforcement substitutes and not
every animal does well with them. He wanted to share how he conditions them
because he finds that people are using them whether they are aware of it or not,
and he finds that behavior can break down if the trainer is depending upon
reinforcement substitutes that have not been systematically conditioned and
introduced. On a
personal note, this session had some "ah-ha's" for me. I found the
discussion on expectations very interesting because I have run into some of that
and not been quite sure how to address it. I also found it added to my
understanding of how to create stronger chains. From Kathy Sdao, I
had gotten the piece that cues are reinforcers and they help to keep chains
strong. From Ken, I got a better idea of how to mix in reinforcement substitutes
as components of a chain to further strengthen it and to avoid having to keep
clicking, stopping and treating as I am working my horse.

Steve White: "No Problem! Solve any Training Issue in Four Steps"

Steve White trains police dogs, but recently he has been working with people and
he found that some of the problem solving strategies that people use in their
jobs work well for animal trainers too. He presented several models for
how organizations use problem solving (OODA loop, SWOT, drill down, IDEAL, SARA)
and his own DIP-IT. The
components of DIP-IT are Define the problem, Isolate the problem,
Plan your remediation, Implement your plan and Take another
look. These were the basic steps and could be used for solving
problems in current behaviors or training new behaviors. He pointed out that
teaching new behaviors is as much about problem solving as is working on
modifications to existing behaviors. Some of the problem solving
pitfalls that people encounter are focusing on the problem, asking too much,
working too long and being short-sighted. He reviewed the 10 rules of
shaping (Karen Pryor) and pointed out that he believes the ultimate solution to
many training issues is to change the motivation. He
believes that many people have problems because they stop training too soon.
Good trainers overtrain. They also train equal and opposite behaviors. The
example he used for this was a DVD clip of a police dog learning to be called
off the person wearing the sleeve. I saw this movie last year and it made a big
impression on me. In the movie, they have a police dog that is happy to go
out and grab the "bad guy" but ignores requests to let go or to abort his run
toward the victim if he is called off. To address this issue, they
sent the dog toward one person who offered no reinforcement by remaining passive
as the dog approached. They called the dog off and at the same time, they
presented a new "bad guy" and sent the dog to him. To do this, the
dog had to leave the first "bad guy" and go to the new "bad guy" who gave
the dog a good reinforcement by being very reactive to the dog's "attack."
The dog learned to listen to the handler because the handler always knew where
the best reinforcement was. When I saw this last year it made a big impression
on me about how we can teach our horses that listening to us is the best way to
earn reinforcement instead of getting carried away offering their own favorite
behaviors. He
showed us a training form, the taproot form, that he uses with his dogs and
students. It is a way of keeping data on what behaviors are being trained so
that the most important behaviors get the time they need. He has examples on his
web site in the libraries section (www.i2ik9.com). The form has boxes for
behaviors with the most important ones in the middle and additional behaviors on
each side. The trainer works on a behavior in the middle and then can go to one
of the "side" behaviors but then has to return to a middle behavior. He ended
He ended with some general thoughts on training. Training is a mixture of art and
science. If you want to break the "rules," then you need to know the
rules. It is important to be creative and think outside the box,
especially with problem solving, where most people are tempted to lose their sense of where they are in the training process. He pointed out that knowing
where you are is very important and you have to be realistic about it. He
compared it to using Mapquest to get directions. Mapquest doesn't care where you
have been, it just wants to know where you are now.
Jesus Rosales-Ruiz: "Broken Clicks: How Reinforcer Delivery Impacts Learning"

Jesus
Rosales-Ruiz and his graduate students are studying various aspects of operant
conditioning and clicker training. He did a series of experiments to see
how a delay in the presentation of the treat after the click affected the
animal's behavior. Before he described the studies, he went over some
basic information about the use and selection of the click (or other marker
signal) as a conditioned reinforcer. He
believes that the clicker has two functions. It serves as a marker signal to
strengthen and select behavior. It also serves as a cue to tell the animal to go
collect its reinforcement. To illustrate this, he went back to some basic
information about how to condition the clicker from BF Skinner. He showed how
the click ends up being the first step in a chain which is click, approach (or
turn head), eat. The conventional thought has been that you pair the clicker
with the food using classical conditioning, but he thinks that it is more useful
to think that we are teaching that the clicker is a cue to approach the trainer
with the expectation of getting reinforcement. With this in mind, he explains
why it is so important to maintain the one to one ratio of clicks to treats (or
other reinforcement). If the
trainer is not consistent about reinforcing after each click, then the use of
the clicker as a cue that reinforcement is available is no longer reliable.
This matters because if the click is no longer a reliable predictor of food, the
animal is going to spend time trying to identify other predictors of food and
you can end up with an animal that is easily distracted and spends a lot of time
looking at the handler for information he should be getting from the click. This
is inefficient and adds a level of confusion to the training process. He had some movie clips of dogs that were not treated after every click, and in both cases the dogs got confused, showed frustration behavior, and their level of performance decreased significantly. The use of treatless clicks also poisoned the whole training experience: the dogs' general attitude and enthusiasm decreased so much that they no longer enjoyed the training sessions. When the student was first
training them using a one click:one treat ratio, the dogs were excited to get
started. When she switched to two clicks: one treat, they would avoid the
student when she came and the owner had to go get them for the sessions. He then
did another experiment with adding in extra clicks during a training session.
His student trained a dog to touch two foot targets by using a one-click:one-treat
ratio. The dog was focused on the task and very prompt and clear
about what behavior it needed to do. The student then added in a click for
the first target touch, but still clicked and treated after the second target
touch. The dog did keep working but she seemed to get confused. She looked to
the handler more often than before and there were hesitations as she went from
one target to another. They did
another experiment that was similar, but instead of using a treatless click to
mark the first target touch, they used the word "bien." This worked out ok.
In this scenario, the word "bien" was being used as a keep going signal and it
did not interfere with the one-click:one-treat ratio. The dog seemed to
interpret it as additional information that she was working correctly although I
did not see any big difference between the behavior as trained with only a click
for completion of both target touches vs. trained with "bien" as a keep going
signal. The
remainder of his talk focused on what happens when there is a delay between the
click and treat. According to his earlier statement, the click acts as
both a marker signal and the start of a chain. What happens if other
behaviors creep into the chain when there is a delay between click and treat?
He showed results of a targeting experiment with sheep. They asked the sheep to
target and either reinforced immediately or delayed for 5, 10, and 20 seconds
and measured how much other behavior occurred between click and treat. As
the delay increased, more frustration behavior (hoofing, nibbling the target
stick, walking away) happened between the click and treat. He also
shared how Virginia Broitman and Sherri Lipman did the same experiment with
their very clickerwise dogs and also found that the dogs repeated behaviors that
occurred between the click and treat. Not only that, but after the experiment
was over, they were unable to completely eliminate the frustration behaviors
that those dogs had inadvertently gotten reinforced for doing. The same
types of changes in behavior were shown on movie clips studying the effect of a
5 second delay on targeting behavior in monkeys. In all cases, the
monkey's behavior deteriorated. Not only did they add a lot of superstitious
behavior between click and treat, but after receiving the treat, they did not
immediately go back to the target. Some of the behaviors they showed were
submissive or frustration behaviors. From
these studies, he has concluded that while it is common to say that the clicker
allows us to have a delay between click and treat, this is not as
straightforward as it seems. A delay in delivering the treat can allow the
animal to include superstitious behaviors and also lead to frustration and
confusion over what was being clicked. Someone asked what happens if we are training a behavior where we cannot immediately deliver the reinforcement, such as working at a distance or at speed. In
that case, he thinks the delay is not a problem as long as the process of
getting and delivering the treat starts as soon as the click is given, and the
animal can see that the trainer has started what he calls the "reinforcement
cycle." What he stressed was that it was important to recognize that the
click starts a behavior chain that ends with the delivery of the treat. If
the trainer keeps this chain intact and always does food delivery in the same
way, it is not an issue. I think what good trainers really need to do is
ensure that the animal is not throwing a lot of behaviors in between the click
and treat. This can be accomplished by keeping the time between click and
treat short or having a set routine for presentation of the treat so that the
animal patiently waits for the treat.

Ken Ramirez: LAB: "Next! Finding and Using New Reinforcers"

This lab
was a chance for people who attended Ken's lecture to practice with their dogs.
I attended as an observer and got to watch several people start conditioning new
reinforcement substitutes with their dogs. He started by reviewing how to select
a good reinforcement substitute and suggested that people practice it without
their dogs if the substitute was a hand signal or some kind of physical gesture.
I suppose you might practice a verbal reinforcer too to make sure you were
saying it consistently. A few
different people demonstrated with their dogs. The important thing to
watch for is the dog orienting toward the handler when it hears or sees the
reinforcement substitute. We had one woman who wanted to use a thumbs-up,
another who wanted to use a verbal and one who used clapping. The dog that
belonged to the woman who chose clapping took a while to figure out that
clapping was not a new cue for begging, which he offered a lot. Ken had her
shorten the duration of clapping and present the food before the dog had a
chance to offer to beg. He stressed that it is important to keep the time
between the reinforcement substitute and primary reinforcer short so that the
dog does not throw in other behaviors, but there does need to be a delay in
order for classical conditioning to take place. I
enjoyed watching the dogs work and it was interesting to see how some of them
were fine with food just appearing and others seemed to want to figure out why
the food was appearing. It was good to see some conditioning of reinforcement
substitutes to see how it worked out and how the dogs handled it.

Morten Egtvedt and Cecilie Koeste: "Reliability: Thy Name is Backchaining"

Morten
and Cecilie had a number of presentations and I only attended the one on
backchaining. They are top level obedience competitors who also run a
chain of dog training schools and publish dog books and videos in Norway.
They have a different approach to training in that they do not put behaviors on
cue until they are completely finished. Rather than put early versions on a
temporary cue, they do not cue the dog at all. The dogs figure out which behaviors might be wanted from the context (handler position, presence of equipment) and run through their repertoire until they get clicked. They covered all
this material in the lecture and lab I did not attend, but a friend of mine
attended and gave me the basic outline. The
lecture I did attend was on how they combine behaviors together by backchaining.
Most of the behaviors they teach are used in obedience trials and the dogs
perform them in a set order. To achieve precision and speed, they backchain
finished behaviors to prepare for competition. They stated that they
never forward chain if they can backchain. They have found that
backchaining produces faster and more reliable performance because the dog knows
what is coming next and is anticipating the cue. The success of
backchaining is due partly to Premack where a less probable behavior is
reinforced by a more probable behavior. In a backchain, the last behavior
has the strongest reinforcement history and is therefore more likely to occur.
Therefore you can use that behavior to reinforce the previous behavior and so
on. When you create a chain in this way, the animal is working from less
probable behaviors toward more probable behaviors and therefore it builds
enthusiasm.
They showed some examples of how they backchain. If the chain is short, they just start by reinforcing the last behavior (4) and then asking for the last two behaviors (3 & 4). They reinforce for 3 & 4 until they are really fluent. They explained that they do not backchain until each individual behavior is very fluent. A lot of errors in chaining come about when trainers try to chain together behaviors that are not fluent enough. They make sure that the dog really knows each individual piece before they create the chain. Once 3 & 4 are fluent, they add in 2 and practice 2 & 3 & 4 until fluent before adding in 1.

If they are building a longer chain, they might make two mini chains separately and then combine them using overlap, if they can. So they might train a chain with behaviors 1, 2 & 3 and another chain with 3, 4 & 5 and then combine them. The overlap helps to make the connection strong.

There are other ways to keep the chain strong. One way is to reinforce the last behavior with a really premium reward. Another is not to break the chain unless absolutely necessary. One of the problems with forward chaining is that you have to occasionally reinforce each component of the chain to keep it strong, which reinforces individual behaviors but breaks the chain. By making sure each behavior is perfect before backchaining them together, they can avoid having to break out individual components of the chain, which might strengthen an individual component, but breaks the chain.
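Just to illustrate the order of operations, here is a small sketch of the practice order backchaining produces, and of joining two fluent mini chains that overlap on a shared behavior. The helper functions are my own illustration, not anything from their presentation.

def backchain_steps(behaviors):
    """Yield the progressively longer sequences practiced when backchaining."""
    for start in range(len(behaviors) - 1, -1, -1):
        yield behaviors[start:]

def combine_with_overlap(chain_a, chain_b):
    """Join two fluent mini chains that share their boundary behavior."""
    if chain_a[-1] != chain_b[0]:
        raise ValueError("the chains do not overlap")
    return chain_a + chain_b[1:]

routine = ["1", "2", "3", "4"]
for step in backchain_steps(routine):
    print(" -> ".join(step))
# 4
# 3 -> 4
# 2 -> 3 -> 4
# 1 -> 2 -> 3 -> 4

print(combine_with_overlap(["1", "2", "3"], ["3", "4", "5"]))
# ['1', '2', '3', '4', '5']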
They had a few other guidelines. They test each chain 5 times to see if it is reliable.
They emphasized that the strength of chains comes from using positively trained
behaviors. Behaviors that have a "do it or else" component will actually weaken
a chain. If they have taught a backchain and the dog makes an error, they
abort and try again. If the dog makes an error again, then they go back to
looking at individual components of the chain. If the dog makes an error,
it could be that they need to go back and work on one component, but they try to
correct the problem without breaking the chain if possible. The last
thing they talked about was how dogs will go through a "testing phase" to see if
they can skip steps of the chain and still get their reward. They stressed how
important it is that the animal goes through this phase and Cecilie said she
would never compete a dog that had not tested her. In most cases, if the
dog skips a behavior, they can just prevent it from being reinforced and ask it
to start again. They had a movie clip of a dog checking blinds for a
person. The dog has to run a figure 8 pattern checking every blind until it gets
to the last one where the person is hiding with a sleeve. They backchain
this search sequence so the dog always knows that the person is in the last
blind. When they start backchaining, if the dog skips and goes directly to the
last blind, the person just steps out and does not give the opportunity to bite
the sleeve which is the reinforcement for this exercise. Usually just
removing the reinforcement is enough to convince the dog to complete the entire
chain. Their
presentation had a nice mix of explanations and movie clips. They showed some
very good footage of dogs being backchained. They also had the audience
participate in some activities by pairing up and backchaining each other. I
found this interesting, as I could feel how the backchaining did build
anticipation. This also showed how important it was to be very clear about each
individual behavior. Sometimes I was cued to do the next behavior, and I started
to do it without stopping the previous behavior. One chain was turn, clap, sit
and I found myself sitting while clapping until we got more specific about how
many times I was supposed to clap. I could also see how keeping the cues in the
chain let the trainer use anticipation to their advantage. Prior to this, I had
thought that part of the point of creating a chain was to give the final chain
one cue instead of retaining the cues for each individual behavior. But Morten
and Cecilie keep the cues in place, which means they can control the timing of
when the dog goes from one part of the chain to the next. I am
very intrigued by the idea of using more backchaining and I am trying to think
of ways to implement this in horse training. So much of riding is about using
one behavior to set the horse up for the next behavior that forward chaining
seems more obvious, but I have sometimes taken advantage of anticipation by
careful selection of a behavior that improves the one before it and I think
this is part of what backchaining is all about. For example, if I have a horse
that is slow at the walk, I will do a lot of walk trot transitions, clicking for
the trot. As the horse starts to anticipate the trot, the walk will
improve. In the past I have been clicking the improved walk, but if I
think of this as a backchaining exercise, clicking the walk might not be
necessary because I can reinforce the improved walk by asking for and rewarding
the trot as well. I think this is definitely worth pursuing and I am
putting together some ideas for things to try with my horses.

Kathy Sdao: LAB: "What a Cue can do in Action: cue control"

This was the second of two labs that accompanied her lecture. I did not
attend her lecture or the first lab, but I wanted to see her teaching so I
signed up for this lab. I spent the winter watching her DVDs so I was hoping to
be prepared. Kathy is a dog trainer from Washington State with extensive
experience with marine mammals, and she is a very dynamic and enthusiastic teacher.
She kept us busy with several exercises on cue discrimination and control.
She
started by reviewing the qualities of a good cue. Cues are distinctive,
consistent, salient, simple (make sure you know exactly what the cue is - harder
than you think), and precious. By precious she meant that we should be
careful with our cues. Don't present a cue unless you are willing to bet money
that the dog will do it. Then we
started off with a cue discrimination exercise. She had each dog owner
write down 5 behaviors their dog had on cue, one per card. The helper shuffled the cards and
made a random list of 10 behaviors. Then they tested to see how the dog was
doing in the new environment. This lab was held in an outdoor tent with blowers
going for heat and a lot of distractions. I noticed that it took some of
the dogs a while to get focused on their owners and respond reliably to the cues
they knew. She gave each team a data collection sheet and they recorded
how many times the dog was correct. The owner was to only ask once and
this exercise was not about getting the behaviors, but about seeing how the dogs
responded to the cues. Every correct behavior has to be clicked. You cannot use
variable reinforcement for this exercise as the dogs need the information that
they have responded to each cue correctly.
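As a rough sketch of the kind of record keeping involved, here is how the randomized test list and score could be generated. The cue names and the scoring format are my own hypothetical illustration of the data collection sheet, not Kathy's actual form.

import random

known_cues = ["sit", "down", "touch", "spin", "back"]   # the 5 behaviors the dog has on cue

# Shuffle the cues into a random list of 10 trials, like the helper did with the cards.
test_list = [random.choice(known_cues) for _ in range(10)]

results = []
for cue in test_list:
    answer = input("Cue '" + cue + "' given once -- did the dog respond correctly? (y/n) ")
    results.append((cue, answer.strip().lower() == "y"))

correct = sum(1 for _, ok in results if ok)
print(str(correct) + "/" + str(len(results)) + " correct in this environment")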
She recommends that dog trainers do discrimination exercises like these on a regular
basis to keep the dogs sharp. I think it would also help to make sure that your
cues are not morphing over time or getting sloppy. When she asked for
feedback on how the sessions went, a number of handlers reported that they were
more careful about their cues than usual. For the
second cue discrimination exercise, she explained how to test to see if you and
your dog agreed on the correct cue for any given behavior. She calls this the
"prove it" game. The idea is to vary the cue slightly and see if the dog
still responds. So if you wanted to test the cue "sit," you might ask the dog to
"hit," "sat," "upsit," or some other variation on "sit." Is the dog sitting for
any one syllable word that starts with s? How about any word that ends in -it?
If you are using body language cues, you could test your hand gestures by
changing some details. Ideas for this included standing on a chair, using the
other hand, holding your hand at a different height, making the motion bigger or
smaller, wearing sunglasses, wearing a glove, kneeling and so on. She gave
us a handout with suggestions for ways to test cues and said that some people
get very creative about doing this. Before
she let people try this out, she reminded them that they had to decide ahead of
time how they wanted the dog to respond to any given cue. Do you want your dog
to sit when you say "sat?" Do you want your dog to respond to hand gestures with
both hands or just one? It is important that the dog has the possibility
to earn a click each time a cue is presented, but it could earn a click by doing
the behavior, or by not doing the behavior, depending upon what the owner has
decided. Again, she had people ask their dogs for behaviors in random order so
that the dog was not influenced by recent reinforcement for any one behavior.
People had a lot of fun with this. This was
a fun lab with lots of good training to watch and people were very creative with
coming up with variations on cues for the prove it game. She also handed
out a paper that had a list of reasons people might not go at a green light. I
am familiar with this list from her DVDs and I think it is a great way to help
people understand about cues. In her DVD she makes the distinction between cues
and commands. In her view, commands have a component of "do it or..." whereas
cues are just indicators that an opportunity for reinforcement is available for
a particular behavior. To help
people understand this, she compares cues to a driver waiting at a red light.
The handout lists reasons why you might not go even if the light turned green.
Some reasons are: you can't see the signal, signal was brief and you
sneezed or blinked, didn't recognize signal because it was different somehow
(flashing green), distracted by another sight or sound, another overriding
signal prevented you (e.g. siren), another car is in your way (inhibition),
unsafe (someone ran a red light), ran out of gas/broke down, and new criterion
(standard car stalled on hill). Her point is that there are lots of
reasons a dog might not respond when presented with a known cue and we need to
recognize and troubleshoot why instead of assuming the dog chose to ignore it.

Michelle Pouliot: "Training Guide Dogs for the Blind"

Michelle
Pouliot presented information on the training program for Guide Dogs for the
Blind which has been converting over to clicker training. She presented some
historical information about guide dog training and then showed how they are now
adding in clicker training. I didn't know much about guide dog training so I was
interested to see how they taught some of the behaviors. They use treadmills to
teach the dogs to lead and the movie clips showed that the dogs really love the
treadmill training. She had movie clips that showed them teaching the dogs
to respond to the collar, ignore food on the ground and back up in a straight
line. She also
showed how they teach them to go around obstacles and about intelligent
disobedience. They used to teach intelligent disobedience by asking the dog to
proceed and then mimicking a fall or bad event. This was stressful on the dogs
and with the use of the clicker, they came up with a better way by asking the
dog to go forward and clicking before the dog could respond. The dog
learned to evaluate the safety of the situation before responding to the command
"forward." I
thought it was interesting to see how they had added clicker work into their
training and I loved seeing how happy the dogs were when they were working.
So even though I am not doing guide dog work, I was happy to have attended this
session.

Jesus Rosales-Ruiz: "The Poisoned Cue Anew"

The poisoned cue lecture is one that Jesus has presented at several Clicker
Expos and he is constantly adding new information to it. He started the session
by distinguishing between cues, commands and poisoned cues. Cues are
discriminative stimuli that indicate the possibility of reinforcement as
established through positive reinforcement. Commands are discriminative stimuli
established through negative reinforcement. In training a command, the
command is presented and if the animal does not comply, negative reinforcement
is applied until the animal performs the target behavior at which point the
aversive stimulus is terminated. I think it is important to note his terminology
here because he does specify that the animal must find the use of negative
reinforcement aversive in his example. Through this scenario, a command
becomes a conditioned aversive stimulus. The presentation of the command
can be used to decrease behavior, or removal of the command can be used to shape
and capture behavior. Because of the way we use negative
reinforcement with horses, I think we have to look carefully at how we are using
negative reinforcement and how our horses interpret our use of negative
reinforcement. In most of his examples, negative reinforcement was used as
more of a correction than as a teaching tool. He
argues that a poisoned cue is a cue that is ambiguous because it has been
trained with both positive reinforcement and the use of aversive stimuli. I want
to note here that the first time I attended this talk, he said that a poisoned
cue was one that was trained with both positive and negative reinforcement. So
he has changed his terminology somewhat here. He now refers to aversive stimuli
instead of negative reinforcement. I think this is important for horse
trainers who use so many pressure and release cues. It is the ambiguity of
the cue that causes the problem. Possible effects on behavior from
poisoned cues are reluctance in the trainee with signs of stress, behavior
breaking down both before and after the cue, longer latencies, freezing and
avoidance behavior. There
are three ways to create a poisoned cue. The first is to add aversive
stimulation to a positive reinforcement program. The second is to teach with
aversive stimulation where the good behaviors were positively reinforced and the
incorrect responses were "corrected" with an aversive stimulus. The third way is
to elicit behavior with an aversive stimulus and capture it with positive
reinforcement. He
showed a movie clip of how they created a poisoned cue with a miniature poodle.
The poodle had prior experience with clicker training and no experience with
aversives. There were two parts to the experiment. In the first part, they
taught the cues. The cue "ven" was used to teach the poodle to come using only
positive reinforcement. The cue "punir" was trained by presenting the cue
(saying "punir") and giving the dog 2 seconds to respond. If the dog did not
respond, it was pulled to the handler and then clicked and treated in position.
They repeated this over a hundred times. The leash correction was
discontinued somewhere around the 60th trial. They measured the tail
height (high vs. low), presence of whining, and snorting. He showed a movie clip
of some of the training sessions and the dog's body language for "ven" was
bright and enthusiastic. For "punir," the dog looked depressed, with a low tail
and low energy. He also pointed out that in order to present the cue, the dog
had to be a certain distance away. In "ven," the dog willingly waited at a
distance for the cue. In "punir," the dog stayed so close to the handler that
she had a hard time getting the required distance. It seemed as though the
dog hoped to avoid the leash correction by staying right next to the handler.
In the
second part of the experiment, they used the cues "ven" and "punir" to capture a
new behavior. They chose several behaviors such as the dog's position in
the room, back stepping, and touching an object. For example, when they were
working on capturing stepping back, the handler said "ven" when the dog stepped
back. The dog returned to the handler and got his reinforcement. In "ven,"
the dog figured out the game quickly and offered the new behavior with a lot of
enthusiasm and the handler was able to maintain a high rate of reinforcement. In
"punir," the dog seemed hesitant and slightly aimless. Even if the handler said
"punir" and reinforced the dog for the new behavior, it did not immediately
return to the new behavior that caused the handler to say "punir." In some
cases, the dog created chains where it consistently performed the behavior that
had been marked by "ven" even though that was never reinforced during the "punir"
sessions. Once
they were done capturing behaviors with "ven" and "punir," the handler did some
additional experiments to rule out the presence of the leash and harness used in
"punir" as being the source of the dog's attitude. The dog was fine with the
leash and harness in other settings. In the original experiment, the dog
was loose for "ven," but on a leash and harness for "punir," so they switched
and used the leash and harness for "ven" and had the dog loose for "punir."
There were some small changes in the dog's attitude, but "ven" was still
significantly different than "punir." From
this study, Jesus and his student concluded that combining positive and negative reinforcement during training can have detrimental effects, that emotional behaviors produced by this procedure do not disappear over time despite the use of positive reinforcement, and that while a poisoned cue can be used to mark and select new behavior, the results will be different than with a cue that was trained with positive reinforcement only. He said that he had gotten
a lot of requests to repeat this study but he was reluctant to put another dog
through the correction process. Then he realized that there are a lot of cues
out there that are already poisoned and they didn't need to poison a new cue,
they just needed to find one that was already poisoned. As an
example, he showed some work they did with Caesar, a German Shepherd, who
reacted to the presence of the leash as a poisoned cue. If the leash was
in the house, the dog happily engaged with his student and responded correctly
to behaviors such as "sit." As soon as the leash came out, he started ignoring
and avoiding the student. This was with the leash on the ground, not attached.
They put the leash in various locations and the dog did not respond
enthusiastically until the leash was back in the house. I have
gone into this in such detail because I think poisoned cues are a real problem
for horse trainers, especially if we start with crossover horses. And no matter
how careful we are with our rein handling, there are going to be times when the
reins are used in a manner that the horse finds aversive. I am still
trying to work out in my brain if the fact that we educate the horses about
negative reinforcement means that we can use negative reinforcement without
having poisoned cues. With my own horses, I have used negative
reinforcement to shape a lot of behaviors and they don't seem to view those as
poisoned cues. But I do have a few issues that have come about where I think
that the cue has gotten poisoned and I would like to avoid that in the future.
I am going to take a closer look at how I use negative reinforcement to both
shape and maintain behavior. One of
his early statements was that cues or commands trained only with -R (negative
reinforcement) are better than poisoned cues because it is the ambiguity of the
poisoned cue that creates the problem. This makes it seem as if it would
be better to just use -R alone, instead of using -R and +R (positive
reinforcement) combined, which is something a lot of horse clicker trainers do
on a regular basis. But that is not what I see in my own training or with other people's horses. On the contrary, horses trained with -R and +R
combined are much happier and more willing than if -R alone is used. I do
think that if you use escalating pressure and the -R becomes too aversive, it negates the effect of the positive reinforcement, so my guess is that there is
a balance here that works. In any case, I think that horse trainers need
to be very aware of poisoned cues because we are somewhat limited in what we can
use for cues under saddle and we need to make sure that we don't poison them.
Theresa McKeon: LAB: "TAGteach in Action"

I
attended the TAGteach lecture at Clicker Expo 2006, but had not attended a lab
so I signed up for the lab to get more information on what TAGteaching really
looks like. Theresa McKeon led the discussion and activities and started us off
by making a TAGulator out of beads. She had us string 10 beads and
then gave us ideas for how to use them. I had thought of TAGteach as being used
in a formal teaching situation, but she described how she uses it to keep track
of things in her daily life. She might TAG herself each time she drank a bottle
of water or when she made a decision about whether or not to have a cookie. I
have to say that this was a big light bulb moment for me as I suddenly realized
that there were lots of places in my everyday life where I could TAG myself for
doing things. She
talked us through a few exercises on how to implement TAGteach. The first
exercise was just practice tagging other people. We divided up into groups of 3
and one person tagged while the other two people passed a balloon back and
forth. The TAG point might be touching the balloon with two hands, or calling
the balloon. I learned the importance of picking a good TAG point. If the
people were too close together, it was important to pick a really obvious TAG
point as there was not much time to decide if the person should get tagged. It
is important to pick a TAG point that is easy for the trainee at the start, and
easy for the tagger to identify. She had
us do another exercise where we set up "runways" by laying out two strings and
putting objects between them to create an obstacle course. One member of the
team was blindfolded and another team member guided her/him through the runway
using instructions of 5 words or less. This exercise demonstrated that it is
hard to explain something in 5 words or less unless you have previously decided
upon some definitions, and part of the ability to set good TAG points was
knowing your trainee. A further demonstration on picking TAG points showed
how the trainee could define her own tag points by saying what she was feeling
in her body when it was correct. There
were some demonstrations by a TAGteach team from Quebec who use TAGteach in
their dog training. They start by training the handlers, not the dogs and they
use tagging to teach basic positions such as a hand position for heeling, how to
handle treat delivery and clicking the handler for recognizing when the dog is
giving them attention. Theresa also showed with a helper how it was
important to be unemotional and specific about TAGpoints. She went over some
basics about good tagging including having a short phrase to describe the
TAG point (5 words or less), keeping emotion out by saying "The TAG point is...," and not correcting or saying that she is changing anything, but just stating "The new TAG point is..." She also stressed the importance of separating out
information from praise. Praise should come at the end so that the trainee just
pays attention to the TAG which is the most important information. Some
people find praise distracting and she warned of the danger of creating praise
junkies. This was
a very entertaining session. She had all the attendees involved and she
presented enough information to give me some good ideas about how to use
TAGteach. Later in the final wrap-up with Karen Pryor, they played a movie clip
showing different applications for TAGteach and I am amazed at the breadth and
scope of where TAGteach is being used. The movie showed TAGteach being
used by commercial fishermen, painters, teachers and in various sports
applications. In
addition to these sessions, I had a lot of great discussions with other clicker
trainers. There are so many interesting people who attend these expos and I
learned something from each one of them. Here are a few other odds
and ends that I pulled out specifically to share with my horse friends.

Jackpots: There was talk recently on one of the
lists about using jackpots and whether or not they were useful. I do use
jackpots, but I have to say that I have never really studied whether they work
for me or not. At Clicker Expo, there were a variety of opinions about
jackpots and from listening to the trainers, the idea I got was that if you are
going to use jackpots, make sure they work for you. For example, Kay Laurence does not use
jackpots (as defined by multiple goodies, not different goodies). She points out
that if your animal has just performed a really good effort, you want to have it
repeat the behavior right away before it forgets what it did. You don't want to
have a big time lapse (due to chewing) between a successful effort and the next
effort. I think this is a good point. I think the key here is to think about
all aspects of the treat from choice of treat to delivery and make sure they all
work to your advantage. Here are some of my common strategies.

Vary the treat. Give better treats for efforts that are better because the animal added enthusiasm and energy.
When I taught Willy Spanish Walk, I found out he would lift his leg higher for
an apple than for a carrot. He likes apples better so it made him more
enthusiastic and generated a better Spanish Walk. If I am training a
behavior where I am trying to get the horse to relax or calm down, I am more
likely to jackpot by offering more of a lesser quality treat. Chewing is calming
and I don't mind the time lag between offered behaviors. Sometimes I can even
have the horse eat while performing the behavior. This works if you are going
for duration on the mat. You can feed many treats while the horse is standing on
the mat and this will strengthen the behavior.

No Reward Markers: This was discussed at a panel
discussion. Most of the trainers do not use NRMs (no reward markers). If they do, it is for very specific circumstances. They do not use them for shaping. In one example, an NRM was used to show a dog the difference between an incorrect
behavior and one that was correct but that the trainer was not going to click at
that moment. In this case, no information (no click, no NRM) means the dog is
correct.

Cues: I was introduced to the idea of cues as
reinforcing behaviors. I think that with horses, we do a lot of redirecting. I
often ask my horse for something because I want him to do something else, not
because he has done one thing correctly and I am ready to do the next thing. If
you want to chain together many behaviors, the horse needs to know that being
asked for the next cue means he was successful, not that you are redirecting
him. One way to do this would be with a keep going signal or verbal praise.
My thoughts now are that while I might not use words in shaping, I will use them
when I start to connect together several behaviors.

Differences between horses and dogs: It is tempting to try and apply all the
same ideas from clicker training for dogs to horses. In many cases, such as
husbandry, and simple discrete behaviors, this works. But with riding, I think
we have to be a bit more creative. If you want the horse to catch on
quickly to the idea that the next cue means success, then you need to start
chaining behaviors together early. It is tempting to work on the walk and then
the trot and then the canter and reinforce each effort, going for refinement. I
think the horse needs the trainer to start chaining behaviors together early,
even if it is just asking for a few behaviors in a row. The horse needs to
really understand that each new cue is a new opportunity for reinforcement.
Katie Bartlett, 2008 - please do not copy or distribute without my permission