Filly and Billy: Positive Reinforcement

I have spoken in the past about negative reinforcement; now it is the turn of positive reinforcement. Clicker training is one branch of this, but it is only a part of it.
In negative reinforcement an aversive stimulus is presented to the horse (heel pressure in their sides) and the stimulus is removed (this is the "negative" bit, as in subtract) when the desired behaviour is achieved. Thus the horse gets the reward of the discomfort being removed the moment the behaviour occurs. Now many may say "but I only use very light pressure". That may be true, but the promise of higher phases is still there, so the horse may feel only light physical discomfort but still have the mental discomfort of knowing that a higher phase may follow if they do not respond. Thus removing the very light pressure is really removing the mental anxiety of something worse happening if they don't respond.
Positive reinforcement operates on the other side of neutral. One waits for the behaviour to occur and then instantly rewards it with something the horse finds desirable. This could be just a word, a scratch, or a food titbit. What is important is the relationship in time between the desired behaviour and the reward. They need to be as close together as possible, which creates a problem if the horse is out on a circle. Suppose they perform that perfect transition and you wish to give them a food treat. By the time you get to them they have probably already stopped, turned and faced you, so which behaviour are you rewarding?
The answer is to have a bridging cue: a click, a word, a gesture, whatever. This has to be trained beforehand as a cue that a reward is coming, but once established it can be used to tell the horse that a food treat is on its way. Technically this bridge cue releases the feel-good chemical, dopamine, into the brain, which gives an instant high. The actual reward can then be delivered a little later, once it is possible to make physical contact with the horse again.
Once a new behaviour is put in place using positive reinforcement, it is not necessary to reward that behaviour every time it is displayed. In fact, by putting it on a variable reward schedule the behaviour can be strengthened. The horse will put in more and more effort to try to get that reward.
We could wait around for the behaviour we wish to reinforce to occur by chance, then positively reinforce it. Personally I don't have that much time. I see the use of positive reinforcement as a continuum with negative reinforcement. The behaviour is asked for by invoking an aversive cue of some sort, say heel pressure in the flank. The behaviour is negatively reinforced by removing the pressure at the instant of the response, and we return to neutral. If we want to reinforce the behaviour even more strongly we can then go past neutral to positive reinforcement by clicking and giving a food or other treat. Thus I don't see clicker training as being contrary to the Parelli methods, just an extension of them, another arrow in our quiver.
Now you may say that rewarding certain horsenalities with food treats is a bad idea. I find this not to be so, but the way you do it varies. For example, with Filly I was wary of introducing food as she then tends to mug you! My answer was that once the click cue (I click with my tongue) was in place, I first improved the back up part of the YoYo game with positive reinforcement. Thus she was rewarded for leaving my space, not invading it. It seems to have worked well, and her back up is now amazing.
As a topic this is huge and I can only give a brief introduction here. For more information I suggest the book "Don't Shoot the Dog", the first half of which is brilliant, though it then tends to wander a little. http://www.amazon.com/Dont-Shoot-Dog-Teaching-Training/dp/0553380397

Sunday, 23 September 2012
