# No-Reward markers?



## hamandeggs (Aug 11, 2011)

So can you all tell me more about the concept of a no-reward marker? I mean, I know what it is - just a marker that means "that's not what I want so you don't get a treat," right? But...isn't that really just a very mild aversive? Isn't that a very fine line? Maybe this has been discussed here before, but I'm curious!

I've been thinking about this because we did a free-shaping session last night at agility class (so much fun!) and in our "homework" instructions it says not to say "no" or "eh eh" (i.e. don't use a NRM) because it can be discouraging to the dog. Maybe this is only true for dogs who are just learning how free-shaping works? 

Don't get me wrong - I'm not saying there is anything wrong with an NRM even if it actually is an aversive. I see the usefulness, just like I see the usefulness of having a way to tell my dog to stop doing [name her chosen undesirable activity]. Just wondering what y'all think about this. Thanks!


----------



## cookieface (Jul 6, 2011)

Not really an answer to your question, but Emily Larlham just wrote about errorless learning vs no reward markers. If you're the one who originally posted about this, I apologize.

I am interested to read what others have to say in response to your question.


----------



## Wicket (Aug 21, 2011)

I think in free-shaping, trick training, and all the good stuff, NRMs should be avoided. You're not trying to manage behaviors or anything, so why use them? It's all about having fun and learning new things. I personally don't think NRMs are bad; they just have situations where they would be most appropriate to use. I use "no" and "eh eh" as a reminder for my dog to not do what she's about to do. The NRM is quick and short, and it gets the message across clearly. I like using an NRM on a walk, for example: if she's wandering too far away from me I use "hey" or "eh eh" to let her know that's as far as she can go. I prefer this over pulling on her leash or restricting her in a physical way. When I'm trying to train a new behavior or work on a current one, I never use NRMs.


----------



## sassafras (Jun 22, 2010)

I think it depends on how much training your dog has had, too. By that I don't mean how "well-trained" they are but how experienced they are at learning new things and how much the two of you have worked together, if that makes sense. 

I have done the most and the most varied training with Squash out of all my dogs; he's been in various classes pretty much continuously since his puppy class and he's a year and a half old now. He knows what NRMs mean so well that I don't think he finds them discouraging or sees them as an aversive at all at this point. Would I have used them in his very first tricks class? No. Do I use them with him when training new tricks or skills now? Sometimes yes, sometimes no. But it's not something I automatically don't use, because he knows the game so well.


ETA: I didn't really properly address totally free shaping or just capturing behaviors... I don't see tons of use to NRMs in that context for a novice dog and trainer, at least in the early stages. Part of the point is teaching the dog to offer behaviors, as much or more as teaching any specific behavior. Once they kind of get the basics of what you want I think they're useful to refine something specific, though. And again, I think for an experienced dog who is used to NRMs and comfortable with offering behaviors, it can be kind of a shorthand or shortcut to wade through some of the chaff they offer.


----------



## Pawzk9 (Jan 3, 2011)

I think it is a very fine line between a NRM and a mild aversive. I would be more likely to use one for something I really don't want the dog doing (counter surfing, or getting ready for a squat on the carpet for instance). But then it becomes an interrupter instead of just information. I might still use it in such a situation, though I do think by then it has become a mild punisher. I would not use one in free-shaping, as I think the more latitude you give the dog in that situation, the better. And I don't want to do anything that discourages the dog from trying new things. So, when free shaping, I just try to keep my mouth shut.


----------



## dagwall (Mar 17, 2011)

I wouldn't use a NRM when trying to do free-shaping because the whole idea is to get the dog to offer new behaviors and I wouldn't want to discourage that. The only training I've used NRM in was when I had to redefine what sit, down, stand meant to Jubel. He's still not 100% on stand, either. I didn't teach him sit or down; he knew them when I adopted him. In his mind all those commands meant come close to me, usually in front of me, and sit/down/stand. I wanted him to sit/down/stand wherever he was when I gave the command. To teach that, I needed the NRM as well as a small barrier (I used a small cardboard box) placed in front of him.

I can't really think of any other training I'd do that would need a NRM. For me the NRM is used mostly as a verbal correction/interrupter for unwanted behaviors often followed by sit/drop it/leave it/off.


----------



## petpeeve (Jun 10, 2010)

Going against the grain here ... I think the use of a no-reward marker IS most useful when free shaping. In fact, it's probably about the only time when I would even consider using one. 

ie: if I'm patiently waiting out a slight movement of the RIGHT paw, and the dog keeps offering LEFT paw after LEFT paw after LEFT paw ... I *might* use an NRM in an attempt to provide useable information and prevent the dog from shutting down completely. In other words, in the above scenario at least, it might actually be more detrimental if one were to idly stand by, merely withholding the click while the dog went through multiple 'incorrect' repetitions. 
The fundamental premise of an NRM is to ultimately "keep" the dog in the game, and lessen the likelihood of him bailing out in the face of no forthcoming reinforcement.


You can find a bit more info on their use / non-use, here .. 
http://www.clickertraining.com/node/179

and here ...
http://www.clickertraining.com/node/2848


----------



## hamandeggs (Aug 11, 2011)

dagwall said:


> I wouldn't use a NRM when trying to do free-shaping because the whole idea is to get the dog to offer new behaviors and I wouldn't want to discourage that. The only training I've used NRM in was when I had to redefine what sit, down, stand meant to Jubel. He's still not 100% on stand, either. I didn't teach him sit or down; he knew them when I adopted him. In his mind all those commands meant come close to me, usually in front of me, and sit/down/stand. I wanted him to sit/down/stand wherever he was when I gave the command. To teach that, I needed the NRM as well as a small barrier (I used a small cardboard box) placed in front of him. I can't really think of any other training I'd do that would need a NRM. For me the NRM is used mostly as a verbal correction/interrupter for unwanted behaviors often followed by sit/drop it/leave it/off.


Funny you mention sitting/downing at a distance -- that's our homework for agility! Biscuit thinks "down" means "come to me and lie down." We've been teaching it with one person holding the leash and other person giving the command from a distance and clicking/tossing treats, but the NRM has definitely crossed my mind.


----------



## hamandeggs (Aug 11, 2011)

petpeeve said:


> Going against the grain here ... I think the use of a no-reward marker IS most useful when free shaping. In fact, it's probably about the only time when I would even consider using one.
> 
> ie: if I'm patiently waiting out a slight movement of the RIGHT paw, and the dog keeps offering LEFT paw after LEFT paw after LEFT paw ... I *might* use an NRM in an attempt to provide useable information and prevent the dog from shutting down completely. In other words, in the above scenario at least, it might actually be more detrimental if one were to idly stand by, merely withholding the click while the dog went through multiple 'incorrect' repetitions.
> The fundamental premise of an NRM is to ultimately "keep" the dog in the game, and lessen the likelihood of him bailing out in the face of no forthcoming reinforcement.
> ...


I see what you're saying, but I think that would be very situation-dependent. If Biscuit is offering behaviors while free-shaping, but not doing what I want, I feel like that would mean I was asking too much from her and should take it back a step and make it easier. If she was shutting down and wandering away, I might give her a hint to keep her in the game, but that's a separate thing from NRM. But, she doesn't know the game that well yet. 

I use an NRM in daily life, as some of you have described, to interrupt and derail undesirable activities - I see Biscuit starting to counter surf or some such, I say "Eh Eh," she cuts it out and goes to do something else, hopefully something more productive. It doesn't freak her out. But I can see how if I was trying to get her to offer behaviors as she thinks them up, the NRM would be discouraging. 

I'm glad I'm thinking about this in a sensible way. Thanks guys!


----------



## Pawzk9 (Jan 3, 2011)

petpeeve said:


> Going against the grain here ... I think the use of a no-reward marker IS most useful when free shaping. In fact, it's probably about the only time when I would even consider using one.
> 
> ie: if I'm patiently waiting out a slight movement of the RIGHT paw, and the dog keeps offering LEFT paw after LEFT paw after LEFT paw ... I *might* use an NRM in an attempt to provide useable information and prevent the dog from shutting down completely. In other words, in the above scenario at least, it might actually be more detrimental if one were to idly stand by, merely withholding the click while the dog went through multiple 'incorrect' repetitions.
> The fundamental premise of an NRM is to ultimately "keep" the dog in the game, and lessen the likelihood of him bailing out in the face of no forthcoming reinforcement.


If I'm truly "free shaping" I am allowing the dog to be creative, and often what I get is better (and different) than what I may have originally had in mind. So, unless he's peeing on my leg, the dog can do no wrong. This is subtly different from just shaping an intended behavior. If I am doing directed shaping, I may make things easier for the dog to make the step I want by positioning myself differently, splitting the behavior into tinier pieces, etc. If my dog is really clicker savvy, I don't need to use my voice. Absence of a click means "try something else".


----------



## petpeeve (Jun 10, 2010)

hamandeggs said:


> I use an NRM in daily life, as some of you have described, to interrupt and derail undesirable activities - I see Biscuit starting to counter surf or some such, I say "Eh Eh," she cuts it out and goes to do something else


The way these words (eh eh, no! etc) are being described / used in this thread sounds more like they're _verbal corrections_, ... by my own definition anyways.

Subtle differences ... one being, NRM's "keep the dog IN the game". ("oops" and "TRY AGAIN" are typical marker words that are employed.)


----------



## petpeeve (Jun 10, 2010)

Pawzk9 said:


> If I'm truly "free shaping" I am allowing the dog to be creative, and often what I get is better (and different) than what I may have originally had in mind. So, unless he's peeing on my leg, the dog can do no wrong. This is subtly different from just shaping an intended behavior. If I am doing directed shaping, I may make things easier for the dog to make the step I want by positioning myself differently, splitting the behavior into tinier pieces, etc. If my dog is really clicker savvy, I don't need to use my voice. Absence of a click means "try something else".


Ah, I think I see what you're saying now. 

(*brain fart, on my part re: terminology*)

My bad. Oops. Try again. LOL


----------



## cookieface (Jul 6, 2011)

petpeeve said:


> The way these words (eh eh, no! etc) are being described / used in this thread sounds more like they're _verbal corrections_, ... by my own definition anyways.
> 
> Subtle differences ... one being, NRM's "keep the dog IN the game". ("oops" and "TRY AGAIN" are typical marker words that are employed.)


I would classify those words (no! hey! etc) in a similar way: interrupters or verbal corrections. My understanding of NRM is I ask for a "sit," dog does something else (stares at me, lies down, etc), I use a NRM (e.g., "sorry" or "nope") to say, "that's not what I asked for; no reinforcement for you." The NRM would be used only after the dog has learned the cue fairly solidly, not while first teaching the cue. Perhaps I'm misunderstanding the terminology.


----------



## Elliebell (Mar 13, 2011)

Occasionally while shaping my dogs will get into a rut and do the same behavior repeatedly. I usually use a NRM to get them out of that. I think this probably stems from my use of extinction bursts to get more of certain behaviors out of them. They think that maybe if they do something for long enough then it'll be enough and I'll click for it. I say "oops" in a happy tone and if they're interacting with something, I remove it for a second and then replace it. The NRM is just a way of communicating that whatever they're doing is a waste of their time and isn't what I'm looking for. Basically, telling them to try something else.

I also use it for weaves, because SiSi does them independently and gets no marker for doing them correctly. She just gets no treat for doing them incorrectly, so I started saying "oops!" when she does it wrong so she knows there will be no treat and doesn't spend ages looking for one. 

Neither of my dogs have seemed discouraged by a NRM. On the contrary, I often use it to stop them from getting frustrated during shaping. I wouldn't call it aversive, although I guess it is a punisher because it's decreasing the frequency of a behaviour.


----------



## hanksimon (Mar 18, 2009)

I think we're talking about secondary subtleties.
1. Consider the game Hot/Cold - you can use a NRM as Cold, and a Keep Going Signal (KGS) as Hot ... You might use this for searching, nosework, fetching a ball with an inadequate retrieve and drop.
2. If you're trying to perfect a behavior, changing a sloppy Sit into an obedience Sit, you may need a NRM to help fine tune, if you haven't been incrementally increasing the requirement for reward.
3. Although not all NRM are aversive - some are instructive; Not all aversives are bad. If you drop food on the ground and tell the dog Leave It!, that is an aversive for most dogs. When you are training your dog and the time to stop has come, getting no more treats is an aversive. We do this all the time.
4. My dog paws me. When he was a pup, I taught him to shake and gave him a treat. I was a little slow one day, so he pawed me with the other paw.... I recognized that and was able to teach him Right and Left.... I glance away or shake my head a bit as an NRM, and he offers an alternate behavior (as opposed to a completely different behavior). I also do the same when asking him to count by barking. 

I believe that if you give your dog lots of training and appropriate freedom, he'll learn independence, especially Retrievers and Herders. You have to put limits on the freedom, and I think it's good to have a signal ... to reduce freedom in some situations.

I imagine that everyone re-directs and corrects their dog, most without punishment.... dogs want more food, more walks, to play all day and/or sleep all day in the bed. So they learn limits. ... My dog follows me into the kitchen, hoping for a treat. I accidentally drop something on the floor. He doesn't move, but he watches it intently. I pick it up and toss it in the trash... a kind of silent correction. However, later, I'll drop a piece of kibble, intentionally, and he'll snatch it in mid-air. If he miscalculates, he'll pick it up off the floor. He knows the difference with no additional prompting from me... based on previous training of Leave It!

In early days of computer assisted instruction (CAI) (now called computer based training CBT), researchers found that purely positive methods, praise or no feedback, would frustrate advanced learners. The advanced learner would get into a loop where they believed that an incorrect behavior or rule was correct. No positive approach could pull them back. So they needed a correction for the incorrect behavior... which relieved the frustration of going around in circles. ....it works with dogs, too...


----------



## CatintheHat (Jun 7, 2009)

cookieface said:


> Not really an answer to your question, but Emily Larlham just wrote about errorless learning vs no reward markers.


The article defines an NRM as a punisher, which is not correct. Melissa Alexander's definition (NRM = extinction marker) is also not correct. 

A no-reinforcement marker is a neutral signal that is a secondary (i.e., conditioned) no-consequence. Neutral means not aversive and not appetitive. It only becomes a punisher if it reduces the frequency of a specific behavior. 

NRMs are most useful to keep things moving in shaping with a dog that has experience with shaping and also has built up some duration with some behaviors (e.g., long down). It is definitely not necessary to use an NRM, and probably not beneficial for most training programs. As many of the comments here suggest, what people are calling NRM is really a secondary punisher, or what starts out as NRM becomes a punisher. 

"Errorless" training is setting up training scenarios to maximize the frequency of reinforcement, and thus minimizing the number of trials required to attain fluency. It is the most efficient way to acquire behavior. However it does not exercise the problem-solving skills of the dog. 




> Although not all NRM are aversive - some are instructive; Not all aversives are bad. If you drop food on the ground and tell the dog Leave It!, that is an aversive for most dogs.


 No proper NRM is ever aversive, by definition. If it has become aversive then it has become a secondary punisher and has ceased to be an NRM. Whether "leave it" is aversive or not depends on how the behavior was trained. If it was trained using positive reinforcement, then the cue predicts an opportunity for reinforcement and is not aversive. This is also an example of a non-aversive punisher (cue for incompatible behavior). Of course if "leave it" was trained through aversive methods then the cue is a secondary punisher, and thus aversive. 



> When you are training your dog and the time to stop has come, getting no more treats is an aversive.


 That is not correct. Dog wanting more treats = appetitive. The risk here is inadvertent negative punishment (treats are removed from the environment), which is the reason for the training 101 rules: (a) always end on a good note and (b) finish up with a game.


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> The article defines an NRM as a punisher, which is not correct. Melissa Alexander's definition (NRM = extinction marker) is also not correct.
> 
> A no-reinforcement marker is a neutral signal that is a secondary (i.e., conditioned) no-consequence. Neutral means not aversive and not appetitive. It only becomes a punisher if it reduces the frequency of a specific behavior.




If it is a neutral signal, then what purpose does it serve? If it's neutral, it doesn't do anything.


----------



## NRB (Sep 19, 2009)

Whether a NRM like "Oops" issued in a bright, cheery tone of voice is a punishment or not depends on your dog and the condition that you issue it in. My dog is absolutely crazed for certain agility equipment: the teeter, the A-frame and the dog walk. So in that instance, I would not worry about a NRM being a punishment, because it will not diminish her drive for the agility equipment at all. And usually that's what we are always afraid of in agility: decreasing drive, because most people spend so much time trying to build drive and positive association with the equipment. Me, I need to decrease drive and build focus. So a NRM in agility in theory would just tell her "wrong answer, try again," but in reality she's totally blowing me off and gunning for the equipment and so she'd never hear the NRM in the first place. 

Now in free shaping, in my house with no agility equipment nearby so she's totally focused and calm... I'd probably not use a NRM.... unless, like previous posters mentioned, the dog was getting frustrated after offering the same behavior over and over again with no feedback. If I "oops-ed" her then she might realize that she needs to try a new tactic. OR she might just get frustrated with the whole game and want to leave. I am still a newbie to free shaping and don't do it often.


----------



## hamandeggs (Aug 11, 2011)

Thanks all. This is a really interesting discussion! It seems like the NRM sits right on the line between punishment and not, and it depends on the dog, the activity, the tone of voice used, the specific word used, and which way the wind is blowing that day!

Last night Bisc and I were doing some free shaping and for kicks I said "Oops!" in a happy voice when she tried to chew on the plastic bin we were using as a shaping object. It did set things back - I had to go back to rewarding her for any interaction with the bin, three or four times, before we got back to where we had been before I said it. On the plus side, my dog now knows the very useful skill of putting her front feet in a plastic bin.


----------



## CatintheHat (Jun 7, 2009)

qingcong said:


> If it is a neutral signal, then what purpose does it serve? If it's neutral, it doesn't do anything.


It signals the end of a trial that has no consequence.



> It seems like the NRM sits right on the line between punishment and not


 If it is +P, it is not NRM. If it is NRM, it is not +P.

NRM has to be conditioned, just like any other marker. A dog that has not been conditioned for "oops" as no consequence, hearing it for the first time, can interpret it in any number of ways. 

Here is an example of NRM that most are probably familiar with. 
Context: quiz show where contestants attempt to be first to press a button in order to answer a question
Antecedent: question presented
Behavior: quickly press button and offer answer 
Consequences: 
if correct answer, bell rings and contestant point total incremented (+R) 
if incorrect answer, buzzer sounds 

The buzzer sounding signals no consequence and the trial is over. This will not by itself reduce the frequency of the contestant pressing the button to answer future questions. The buzzer is a NRM. 

If instead the incorrect answer gets a buzzer and point total decremented, this is punishment. It reduces the frequency of the behavior. The buzzer is a conditioned punisher.


----------



## lil_fuzzy (Aug 16, 2010)

I don't use NRM's much, there's just no need, except if the dog is in the middle of a behaviour chain. Being allowed to continue through the chain would be a reward in itself, so when the dog makes a mistake the chain needs to be interrupted, and a NRM tells the dog exactly where it went wrong.

But during shaping and asking for behaviours etc, it's superfluous. 

Susan Garrett teaches her NRM using R+, in a separate session, so if you've seen kikopup's video of teaching "leave it", that's pretty much how SG teaches it, except she uses the word "oops". When the dog doesn't go for the toy they throw, it gets a treat. She has a good video of her using the NRM somewhere, and the dog doesn't deflate or go weird when she gives it, he just runs right back to try again.

But I think for reward based training the focus should be on "what do I want the dog to do?" and teaching the dog what to do and building value for that behaviour, not about what you don't want him to do and thus you'd just never use a NRM, because you'd always have an alternate behaviour or cue you can use.


----------



## qingcong (Oct 26, 2009)

It sounds like a lot of work just to condition something that by definition doesn't do anything to change behavior. I think I will stick with rewards and punishments.


----------



## hanksimon (Mar 18, 2009)

And, you're right. If you can make adequate learning increments, anticipating errors, NRM doesn't provide much value. But as you get into increasingly complex, challenging chains of behavior, or desire for better proofing of behaviors, an NRM approach can be more straightforward to fix minor tweaks in a mostly correct action.

I imagine that some people use them without thinking... saying oops or closing your reward hand and twisting your wrist to indicate no reward this time.


----------



## qingcong (Oct 26, 2009)

hanksimon said:


> NRM approach can be more straightforward to fix minor tweaks in a mostly correct action.



But according to Catinthehat's definition, the NRM _does not do anything_. It has no effect on behavior, and thus cannot fix or tweak anything. If your feedback had an effect on behavior, then it was not a NRM. That's my understanding. If this definition of NRM is accurate, then the effect of the NRM is not observable. 

Using catinthehat's definition, I think that the NRM exists only in concept. Is it a concept that dogs can grasp? I don't know.


----------



## CatintheHat (Jun 7, 2009)

qingcong said:


> But according to Catinthehat's definition, the NRM _does not do anything_. It has no effect on behavior, and thus cannot fix or tweak anything. If your feedback had an effect on behavior, then it was not a NRM. That's my understanding. If this definition of NRM is accurate, then the effect of the NRM is not observable.
> 
> Using catinthehat's definition, I think that the NRM exists only in concept. Is it a concept that dogs can grasp? I don't know.


It's not my definition, it is standard OC. Each category of consequence can have a conditioned marker. For example, you could have a click for +R, a Bronx cheer for +P, "too bad" for -P, "whew" for -R, and "nope" for no consequence. 

I also didn't say it doesn't do anything. I did state that it "signals end of trial with no consequence," which is doing something (communicating). 

I also didn't say it has no effect on behavior. It is neither punishment nor reinforcement, so it does not reduce or increase the frequency of the behavior. But it can modify aspects of the behavior without altering the frequency. 

A simple example is a dog that is fluent in "sit" but has variable response time, generally < 3 seconds. Ultimate goal is response latency < 1 second. Method is differential reinforcement (a.k.a., shaping). Intermediate goals are 3 seconds, 2 seconds, and 1 second. One approach is to start the stopwatch simultaneously with providing the cue. If the sit occurs within the target time, click and treat. If the dog does not sit within the target time, NRM. This does not reduce the frequency of the sit, but does reduce latency of response, because it more clearly communicates the success criteria to the dog. 
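For anyone who thinks in code, here is a rough Python sketch of the marking rule just described. The function names, the made-up latencies, and the 80% rule for moving on to the next target are assumptions for illustration only; the description above does not say when to tighten the criterion.

```python
# Hypothetical sketch: all names and the 80% advancement rule are
# illustrative assumptions, not part of any standard protocol.
TARGETS_S = (3.0, 2.0, 1.0)  # intermediate latency goals, in seconds


def marker_for(latency_s: float, target_s: float) -> str:
    """Pick the marker for one trial, timed from the cue to the sit."""
    return "click+treat" if latency_s <= target_s else "NRM"


def next_target(markers: list[str], target_s: float, pass_rate: float = 0.8) -> float:
    """Tighten the criterion once enough trials earn a click (assumed rule)."""
    if not markers or target_s not in TARGETS_S:
        return target_s
    clicked = sum(m == "click+treat" for m in markers) / len(markers)
    if clicked >= pass_rate:
        idx = TARGETS_S.index(target_s)
        # Stay at the final goal once it has been reached.
        return TARGETS_S[min(idx + 1, len(TARGETS_S) - 1)]
    return target_s


session = [2.4, 3.5, 1.9, 2.8, 2.2]  # made-up latencies, in seconds
markers = [marker_for(t, 3.0) for t in session]
print(markers)
print(next_target(markers, 3.0))
```

Note that the NRM branch does nothing except label the trial: no treat is added and nothing is taken away, which is the "no consequence" idea in a nutshell.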

An example of what might commonly be confused with an NRM but is actually a punisher is the fairly common case of a dog that responds to the "down" cue by first coming to the handler and then getting into the down position. The desired behavior is to go straight down without moving. In this case one approach would be to cue for down and then signal if the dog starts to move toward handler. This is actually positive punishment because the signal reduces the frequency of the dog moving toward the handler. If a previously conditioned NRM is given as the incorrect response signal, this will recondition that signal from an NRM to a punisher, which pretty much permanently ruins it as a NRM. In this case, since the goal is to reduce the frequency of the first behavior in a chain, it is not appropriate to use an NRM. If this approach is taken, an attempt to later use the same signal to reduce the latency of response to the "down" cue is likely to reduce the frequency of the down in response to the cue. 

Very similar to this latter example is what people are calling an "interrupter." If it reduces the frequency of a behavior, it is a positive punisher. 

Because NRMs are difficult to condition and maintain, and because it can be difficult to set up training scenarios employing them, most of the popular +R authors recommend against them.


----------



## sassafras (Jun 22, 2010)

hanksimon said:


> I imagine that some people use them without thinking... saying oops or closing your reward hand and twisting your wrist to indicate no reward this time.


I do, too. I think dogs probably pick up on NRMs we do without even realizing it or calling it that.


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> It's not my definition, it is standard OC. Each category of consequence can have a conditioned marker. For example, you could have a click for +R, a Bronx cheer for +P, "too bad" for -P, "whew" for -R, and "nope" for no consequence.


It seems the definition is not so straightforward as you present it; at least, it seems there is some debate. Quote from this article on clickertraining.com -
"There is far more to the NRM debate than this, however. Stand back, as this is where I'll step on some toes…

By the time an NRM has real meaning for the learner, it has become positive punishment.

An NRM may cue extinction, but in doing so it also signals a loss of opportunity. The chance of earning reinforcement has closed. If the subject changes his behavior to avoid the NRM—and that is the whole point of its use—then the NRM is by definition an aversive. It may be a mild aversive or it may be severe, depending upon the learner’s mindset, but it is a stimulus the learner is actively working to avoid. Because the trainer introduces the NRM upon the learner’s mistake (adds an aversive stimulus that modifies behavior), the NRM is positive punishment."








> I also didn't say it has no effect on behavior. It is neither punishment nor reinforcement, so it does not reduce or increase the frequency of the behavior. But it can modify aspects of the behavior without altering the frequency.
> 
> A simple example is a dog that is fluent in "sit" but has variable response time, generally < 3 seconds. Ultimate goal is response latency < 1 second. Method is differential reinforcement (a.k.a., shaping). Intermediate goals are 3 seconds, 2 seconds, and 1 second. One approach is to start the stopwatch simultaneously with providing the cue. If the sit occurs within the target time, click and treat. If the dog does not sit within the target time, NRM. This does not reduce the frequency of the sit, but does reduce latency of response, because it more clearly communicates the success criteria to the dog.


If something has an effect on behavior, it is by definition not neutral. In your sit-latency scenario, the NRM would qualify as a negative reinforcement, as it is removing a stimulus (the NRM) to make behavior happen. If the dog sits faster, it means the frequency will also increase, as the dog can do more sits in a given period of time.


----------



## volito (Oct 14, 2010)

Like to say awesome subject! Agree with a lot of all the above and one thing I have noticed in training a lot of subjects become opinions, debates, and experience! 

I'll admit I only have a few years under my belt but I have a few great mentors that I respect and who taught me the science-based positive reinforcement ways and following APDT and CCPDT ways....


Ok all I can add is that I see no reason why NRM can't be used and believe it comes down to the trainer's style and the individual dog.... I believe it all depends on how the trainer associates the NRM to the dog and how the dog perceives it .... As in if the dog is about to eat something on a walk, "use HEY," which will be aversive, or you say sit and dog gives you a down, "use wrong," and dog offers something else... 

I try and associate "condition" a NRM with dog trying something else ....sort of like when first training a dog and dog goes thru all his tricks trying to get a reward.... I would condition wrong wrong wrong YES

Hope this made sense


----------



## CatintheHat (Jun 7, 2009)

qingcong said:


> It seems the definition is not so straight forward as you present it, at least, it seems there is some debate. Quote from this article on clickertraining.com -
> "There is far more to the NRM debate than this, however. Stand back, as this is where I'll step on some toes…
> 
> By the time an NRM has real meaning for the learner, it has become positive punishment.
> ...



Ah, where to even begin....

There is no loss of opportunity, every trial is an opportunity. Every trial ends, either successfully (+R) or not (no consequence). In a training session of n trials, the same n opportunities for reinforcement occur, regardless of outcome and regardless of whether NRM is given for unsuccessful trials. 

Furthermore, the "loss of opportunity to earn a reward" is not aversive. Dog wants more rewards = appetitive. Dog wants to earn more rewards, dog does not want to avoid earning less rewards. 

"Loss of opportunity" is not a consequence in operant conditioning, because "opportunity" is not a real thing in the real world, it is a concept. In OC, consequences are real things in the real world, stuff that can be seen, smelled, touched, tasted, or heard. 

Even if "loss of opportunity" was a valid OC consequence, and if it did in fact reduce the frequency of the behavior, it would still not be positive punishment, it would be negative punishment (removal of something="opportunity" from the environment). Conditioning a previously neutral signal to negative punishment makes it a secondary negative punisher, not a positive punisher. 

"it is a stimulus the learner is actively working to avoid"
a) that would make it an antecedent, but NRM is a consequence
b) "actively working" = increased behavior = reinforcement 
c) "working to avoid" = frequency of behavior increases when consequence of behavior is removal of stimulus from environment = negative reinforcement. 

But the author also describes NRM as "a cue for extinction." A cue is a discriminative stimulus for a behavior, and an antecedent rather than a consequence. Extinction is a process whereby behavior that was previously reinforced is no longer reinforced (i.e., which has no consequence), and therefore reduces in frequency. It does not make sense to say that anything is a discriminative stimulus for a pattern of reduced frequency of behavior that earns no consequence. 


So in the same short quoted passage, the original author has confused antecedents with consequences, incorrectly identified appetitive as aversive, conflated positive punishment with negative punishment with negative reinforcement (among other errors). 

This is very similar to the (apparently common) misconception that failure to reinforce = -P. Failure to reinforce does not remove anything from the environment, and is therefore not negative punishment. It is simply no consequence. 



qingcong said:


> If something has an effect on behavior, it is by definition not neutral.


You seem to have gotten hung up on "neutral signal." I was specifically referring to a signal that has no inherent meaning to the dog and has not been previously conditioned as a marker. When conditioning a marker, it is good to start with a neutral signal. In clicker training, the click starts out for most dogs as a neutral signal that is then conditioned as +R. An NRM starts out as neutral signal that is then conditioned as a no consequence marker. 




qingcong said:


> If something has an effect on behavior, it is by definition not neutral. In your sit-latency scenario, the NRM would qualify as a negative reinforcement, as it is removing a stimulus (the NRM) to make behavior happen. If the dog sits faster, it means the frequency will also increase, as the dog can do more sits in a given period of time.


I am not sure where you get that something is being removed from the environment? NRM is a consequence, not an antecedent. In the latency example, if the dog sits in less than target time, the consequence is +R. If the dog does not sit within the target time, NRM. Nothing is removed from the environment. A marker is added to the environment, but since the dog has acquired the discriminative stimulus (cue) and is fluent in the behavior (offers the correct behavior 100% of the time in response to the discriminative stimulus), adding the marker to the environment does not reduce the frequency of the behavior. There is no outcome that involves removing anything that was previously in the environment from the environment, so -R is not possible. -P is also not possible. It doesn't change the frequency of the behavior, so it is neither +R nor +P. 

Lack of consequence has an effect on behavior. Extinction is what happens when a previously reinforced behavior is no longer reinforced. This reduces the frequency of the behavior. "No longer reinforced" means that when the dog offers the behavior, there is no consequence. Nothing is added to or removed from the environment. It is the absence of a consequence. 

Extinction is the basis for all +R shaping, where there is no consequence for any behavior other than the "correct" behavior. As criteria are increased, "versions" of a behavior that were previously reinforced now have no consequence and thus reduce in frequency, while behavior that meets the new criteria is +R and thus increases in frequency. As criteria are increased, successive versions of "correct" behavior replace previously "correct" versions of behavior. 

The entire philosophy of "reinforce what you like, ignore what you don't" relies on the fact that absence of consequence does modify behavior. Any practical application of differential reinforcement relies on the absence of consequence. The minimum set of tools in the OC toolbox are reinforcement and no consequence. NRM is simply a conditioned marker for no consequence.


----------



## Deaf Dogs (May 28, 2012)

I will use NRM with Oliver. It seems to work well with him. I never use them with new dogs, but Ollie has a bad habit of trying old tricks over and over if I don't use them, and if he doesn't get some sort of feedback, he just lays down and looks at me. But with him NRMs are just feedback, and he knows he needs to try something else. He seems to need a lot of feedback (either R+ or NRM) or he just stares. I never use them in free shaping, though, as I am just clicking him for his own ideas, and he's really good at that, but he does tend to not do so hot at shaping unless he has encouragement and feedback. If we're refining a trick, or doing one he already knows, and he messes up, I will for sure use an NRM with him.

Like the last trick in this video. You can see it doesn't affect him other than to get him to stop doing the wrong thing.

http://www.youtube.com/watch?v=q0KglBLwEM0


----------



## petpeeve (Jun 10, 2010)

I believe CatintheHat has given a thorough explanation of NRMs. Thanks for that.

Also, in Deaf Dogs' video I believe Oliver shares my own sentiment for BP, lol. Thanks as well, for that.


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> Furthermore, the "loss of opportunity to earn a reward" is not aversive. Dog wants more rewards = appetitive. Dog wants to earn more rewards, dog does not want to avoid earning less rewards.


I can see what you mean. The problem is, we don't get to choose what the dog perceives as aversive. It sounds like a lot of people will say they are using a NRM, when in fact they are using +P. 





> "Loss of opportunity" is not a consequence in operant conditioning, because "opportunity" is not a real thing in the real world, it is a concept. In OC, consequences are real things in the real world, stuff that can be seen, smelled, touched, tasted, or heard.


Lack of consequence is also a concept, as no consequence is not a real thing, it's a lack of a real thing. 





> Even if "loss of opportunity" was a valid OC consequence, and if it did in fact reduce the frequency of the behavior, it would still not be positive punishment, it would be negative punishment (removal of something="opportunity" from the environment). Conditioning a previously neutral signal to negative punishment makes it a secondary negative punisher, not a positive punisher.


We agree on this. I think the author got her definitions of -P and +P mixed up, or she failed to explain her thoughts very clearly.





> Lack of consequence has an effect on behavior. Extinction is what happens when a previously reinforced behavior is no longer reinforced. This reduces the frequency of the behavior. "No longer reinforced" means that when the dog offers the behavior, there is no consequence. Nothing is added to or removed from the environment. It is the absence of a consequence.


Right - so a well-conditioned NRM = lack of consequence. Lack of consequence leads to extinction. Now, going back to the sit latency example, how can extinction make the dog sit faster? 

I'm not arguing with you on the definition of a NRM. I think that part is more or less clear. What's up for debate, is whether it actually exists in the real world or if it is just a concept. I think that's where the controversy is. Here's another honest question for you because I'm not sure of the answer - if a dog has been properly conditioned to a NRM and you give it a NRM while it is doing a behavior, what should happen?


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> I am not sure where you get that something is being removed from the environment? NRM is a consequence, not an antecedent.



According to the definition here, NRM IS an antecedent. Quoted "This signal is called a "No Reward Marker," or NRM. NRMs are intended to be a verbal cue for extinction" It's a cue, not a consequence, unless this definition of NRM is wrong.


----------



## CatintheHat (Jun 7, 2009)

> I can see what you mean. The problem is, we don't get to choose what the dog perceives as aversive.


 Of course not. The dog decides what is appetitive and what is aversive, what the relative hierarchies are of reinforcers and punishers in any given context. Our job is to recognize what's appetitive and what's aversive, and to try and figure out the hierarchies. Saying that "Dog avoids fewer treats" and then claiming that not delivering reinforcement is aversive (based on that double-negative construction) is equivalent to saying "dog wants more non-shocks" and then claiming that shock collar training is appetitive. That does not help anyone figure out what their dog finds appetitive and what their dog finds aversive. 




http://www.clickertraining.com/node/179 said:


> NRMs are intended to be a verbal cue for extinction, not a punisher, so people attempt to say them in the most neutral tone of voice possible. "Uh-uh," said quietly and calmly, is a common NRM. In a training session, the trainer would either click or use the NRM after each repetition to let the dog know whether his behavior was correct or not.


 This was explained previously. The author is using the word "cue" (which is an antecedent) but describing a consequence. Extinction is not a behavior, it is a pattern of reduced frequency of previously reinforced behavior. The definition is not correct. 



> Lack of consequence is also a concept, as no consequence is not a real thing, it's a lack of a real thing.


 If you perform a behavior, and nothing happens (nothing is added to or removed from the environment), that is a real outcome that can be perceived. The question is, how long do you wait for something to happen before deciding that nothing is going to happen? Nothing says "nothing is going to happen" like NRM. As others have mentioned, most of us probably have some NRM body language that we use subconsciously and that the dogs pick up on. 



> Right - so a well-conditioned NRM = lack of consequence. Lack of consequence leads to extinction. Now, going back to the sit latency example, how can extinction make the dog sit faster?


 Lack of consequence does not always lead to extinction. Example: after fluency is attained with the perfected behavior, it is standard practice to put the behavior on a variable schedule of reinforcement, reducing the % of successful trials that result in reinforcement. The remaining successful trials are not reinforced (have no consequence), but the behavior does not reduce in frequency. In fact, intermittent reinforcement schedules increase resistance to extinction. 

In the latency example, modification of the behavior does not rely on extinction, unless one wants to describe "sit within 1 second" as a different behavior than "sit within 3 seconds" and say that "3s sits go extinct." I do not: sit is sit, response time is an aspect of the behavior and not a behavior itself; the delay before the sit is a lack of behavior, not a behavior. Duration is another modifiable aspect of the behavior. When training the long down, the trial ends if the dog breaks the down. In that case there is always some kind of end of trial signal, but it is not NRM and is almost always +P. But this does not reduce the frequency of the down. 



> It sounds like a lot of people will say they are using a NRM, when in fact they are using +P.


 That is very true and one of the things I have been saying all along. It is difficult to condition a real NRM in the first place, and then if you use it in the wrong situation, it gets reconditioned as a secondary positive punisher. 



> What's up for debate, is whether it actually exists in the real world or if it is just a concept. I think that's where the controversy is.


 I understood that to be the crux all along. Yes it really exists in the real world. I've given one example of practical application, but the truth is that it is pretty hard to find practical applications of pure NRM outside a Skinner box. 




> Here's another honest question for you because I'm not sure of the answer - if a dog has been properly conditioned to a NRM and you give it a NRM while it is doing a behavior, what should happen?


 Most likely to become positive punisher or cue for another behavior, depending on a lot of factors. 

Consider: you are doing door-to-door surveys, paid for each completed survey. You are given a list of addresses. Behavior is to knock on doors and solicit completion of survey. In each trial you knock on a door. Possible consequences are (a) resident answers door and agrees to participate (+R) (b) no answer (c) resident answers door and says "Not interested in participating" (NRM). What do you do in case (b)? In response to case (c) you are very likely (after a number of trials) to respond with a sales pitch involving "paying my way through college" and what started out as a NRM evolves into a cue to deliver the sales pitch. But, depending on your temperament, (b) or (c) might result in your tossing all the survey forms in the bin and hitting the local pub.


----------



## Deaf Dogs (May 28, 2012)

CatintheHat said:


> Consider: you are doing door-to-door surveys, paid for each completed survey. You are given a list of addresses. Behavior is to knock on doors and solicit completion of survey. In each trial you knock on a door. Possible consequences are (a) resident answers door and agrees to participate (+R) (b) no answer (c) resident answers door and says "Not interested in participating" (NRM). What do you do in case (b)? In response to case (c) you are very likely (after a number of trials) to respond with a sales pitch involving "paying my way through college" and what started out as a NRM evolves into a cue to deliver the sales pitch. But, depending on your temperament, (b) or (c) might result in your tossing all the survey forms in the bin and hitting the local pub.


I just had to say, I really like this analogy!


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> This was explained previously. The author is using the word "cue" (which is an antecedent) but describing a consequence. Extinction is not a behavior, it is a pattern of reduced frequency of previously reinforced behavior. The definition is not correct.


Two separate authors on that website have defined the NRM as an antecedent, or a cue for extinction. Here's another one that defines it as a cue. They do not describe the NRM as a consequence, as you define it. Based on these definitions of the NRM, the textbook way it works is - extinction is put on cue, just as sit or down is put on cue. 





> Lack of consequence does not always lead to extinction. Example: after fluency is attained with the perfected behavior, it is standard practice to put the behavior on a variable schedule of reinforcement, reducing the % of successful trials that result in reinforcement. The remaining successful trials are not reinforced (have no consequence), but the behavior does not reduce in frequency. In fact, intermittent reinforcement schedules increase resistance to extinction.


That's intermittent reinforcement, not lack of consequence. Lack of consequence, in other words, zero consequences, should lead to behavioral extinction. 






> Most likely to become positive punisher or cue for another behavior, depending on a lot of factors.
> 
> Consider: you are doing door-to-door surveys, paid for each completed survey. You are given a list of addresses. Behavior is to knock on doors and solicit completion of survey. In each trial you knock on a door. Possible consequences are (a) resident answers door and agrees to participate (+R) (b) no answer (c) resident answers door and says "Not interested in participating" (NRM). What do you do in case (b)? In response to case (c) you are very likely (after a number of trials) to respond with a sales pitch involving "paying my way through college" and what started out as a NRM evolves into a cue to deliver the sales pitch. But, depending on your temperament, (b) or (c) might result in your tossing all the survey forms in the bin and hitting the local pub.


But that's not the textbook definition of NRM, which is something that is conditioned. Something doesn't start out as a NRM, it's conditioned, and if it evolves into something else, then it's not an NRM anymore.

So if the NRM does not do what it was supposed to do, which is to extinct the behavior of knocking on doors and instead leads to you giving a sales pitch, then it doesn't work, and if it doesn't work, then it doesn't truly exist in real practice...?


----------



## CatintheHat (Jun 7, 2009)

OK, back to square one. You seem to be having difficulty with some basic OC concepts, so a review would perhaps be helpful. You also seem to be working quite hard to avoid understanding the topic of this conversation. 



> Two separate authors on that website have defined the NRM as an antecedent, or a cue for extinction. Here's another one that defines it as a cue. They do not describe the NRM as a consequence, as you define it. Based on these definitions of the NRM, the textbook way it works is - extinction is put on cue, just as sit or down is put on cue.


If you are going to post links to every dog trainer site that gives incorrect definitions of OC terms, we are going to be here for a very long while. This latest link is clearly describing a negative punishment marker. That is a -P marker, not a NRM. The clickertraining.com definitions are also incorrect, for reasons previously explained in detail. Personal essays posted on a website (even Karen Pryor's website) are not "textbooks," and the definitions they present are not "textbook definitions." In this case, they are in fact self-contradictory nonsense. 

A cue is a discriminative stimulus that is conditioned via operant conditioning to elicit a specific behavior. In a successfully conditioned operant behavior, when the discriminative stimulus is introduced, the behavior occurs. The discriminative stimulus is thus antecedent to the behavior. We'll ignore behavior chains for now, for the sake of simplicity. 

A marker is a signal that has been classically conditioned by association with an appetitive, aversive, or neutral stimulus, and is used to precisely identify behavior that earns a consequence. A marker is therefore a classically conditioned consequence. 

The construction "cue for extinction" makes no sense for several reasons, chief among them that "extinction" is not a behavior, it is a pattern of diminishing frequency of behavior that earns no consequence. There is no such thing as a discriminative stimulus for a pattern of diminishing frequency of behavior. It's nonsense. 

A no-reward (or no-reinforcement) marker is a marker. You can tell, because it says so right there in the name: no-reward *marker*. The name also states precisely what it does: as a positive reinforcement marker signals behavior that earns positive reinforcement, a no-reinforcement marker signals behavior that earns no reinforcement. 




> That's intermittent reinforcement, not lack of consequence. Lack of consequence, in other words, zero consequences, should lead to behavioral extinction.


 To implement a variable schedule of reinforcement (VSR), it is necessary that some successful trials earn +R, and some successful trials do not. The trials that do not earn reinforcement have no consequence. For behaviors with a strong reinforcement history, it is well known that VSRs strengthen the resistance to extinction. It is also well known that VSRs increase strength of behavior, but also increase variability of behavior. This is commonly described as "dog is working harder to earn rewards". "Increasing variability of behavior" is very important in shaping. 



> But that's not the textbook definition of NRM, which is something that is conditioned. Something doesn't start out as a NRM, it's conditioned, and if it evolves into something else, then it's not an NRM anymore.


 Sorry, what you are calling "textbook definition of NRM" is just nonsense. The example is in fact an example of NRM. Signals can be and are conditioned _in situ_ all the time, and seldom require more than a few repetitions for conditioning. Punishment signals (the tone from the shock collar) are conditioned within just one or two repetitions. Cues and markers are reconditioned all the time, and inadvertent reconditioning is one of the hazards of OC. 

In outcome (b) of the survey example, the survey-taker may repeat the behavior of knocking on the door, may increase the intensity of the behavior (knock louder). The survey taker will also start to look for environmental cues (lights on, car in driveway, etc.) that increase the frequency of door-answered. 

After a small number of repetitions of outcome (c), the survey-taker is most likely to incorporate the sales pitch into the response to door-answered. In other words, the survey-taker will be shaped into modifying the "solicit participation" behavior. 

No discussion of NRM or VSR or extinction is complete without a good look at the mother of all VSR applications: the slot machine. Slot machines make very rich use of NRM. For simplicity, consider a classic one-armed bandit with 3 wheels. Each wheel has 6 fruit images: cherry, pineapple, banana, apple, orange, lemon. Machine pays out only on 3-of-a-kind, with 10-1 cherries, 5-1 pineapples, 4-1 bananas, 3-1 apples, 2-1 oranges, 1-1 on 3 lemons. No payout on 2-of-a-kind or 1-of-a-kind. 

Behavior is to drop a coin in the slot, pull the lever, and wait for the wheels to stop spinning. +R marker is 3-of-a-kind. NRM is any other wheel configuration. There are actually 6 different +R markers and 210 distinct NRMs (6 × 6 × 6 = 216 wheel configurations, minus the 6 winners). Since 3 cherries have the highest payout, 2 cherries with any other fruit can be interpreted as "I came _this close_ to hitting the jackpot!" and is more likely to be followed by another trial than 1-of-a-kind. 
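
For anyone who wants to check the counting, here is a minimal Python sketch of the three-wheel machine described above. It assumes each wheel stops on exactly one of the six fruits and that wheel order matters (cherry-cherry-lemon counts as a different configuration from lemon-cherry-cherry). The names `FRUITS`, `classify`, and `near_misses` are made up for illustration. Under those assumptions it gives 216 configurations, 6 that pay and 210 that do not; a different counting convention would give a different total.

```python
from itertools import product

# The six fruit images on each wheel, as listed in the example.
FRUITS = ["cherry", "pineapple", "banana", "apple", "orange", "lemon"]


def classify(spin):
    """Return "+R" for 3-of-a-kind, "NRM" for any other configuration."""
    first, second, third = spin
    return "+R" if first == second == third else "NRM"


# Every ordered result of three wheels: 6 * 6 * 6 configurations.
outcomes = list(product(FRUITS, repeat=3))

winners = [spin for spin in outcomes if classify(spin) == "+R"]
non_winners = [spin for spin in outcomes if classify(spin) == "NRM"]

# "I came this close" results: exactly two cherries plus one other fruit.
near_misses = [spin for spin in non_winners if spin.count("cherry") == 2]

print(len(outcomes))     # 216
print(len(winners))      # 6
print(len(non_winners))  # 210
print(len(near_misses))  # 15
```

So, with order counted, only 15 of the 210 non-paying configurations are the two-cherry "almost" results.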

Modern slot machines are much more sophisticated, with 5 "wheels" and multiple non-linear +R patterns, and a lot more "almost hit it" NRMs. 

Back to dog training, a lot of the popular authors like to advocate "time out" for incorrect or unwanted behavior. How this is implemented is crucial to how it works. If "time out" comprises putting the dog into the crate, that is a somewhat complicated +P scenario. If "time out" is simply turning one's back on the dog for a 3 count, that is very clearly NRM. 



> and if it doesn't work, then it doesn't truly exist in real practice


 Faulty logic. A hammer does not work to drive screws, but hammers do in fact exist.


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> OK, back to square one. You seem to be having difficulty with some basic OC concepts, so a review would perhaps be helpful. You also seem to be working quite hard to avoid understanding the topic of this conversation.


I don't know what basic OC concepts you think I'm having difficulty with. Instead of a chapter, you can simply point out a specific part of my post that is faulty and explain where the misunderstanding is. As far as I saw, nothing I said above contradicts your "basic OC concepts" lesson, aside from the discussion on NRMs. 






> If you are going to post links to every dog trainer site that gives incorrect definitions of OC terms, we are going to be here for a very long while. This latest link is clearly describing a negative punishment marker. That is a -P marker, not a NRM. The clickertraining.com definitions are also incorrect, for reasons previously explained in detail. Personal essays posted on a website (even Karen Pryor's website) are not "textbooks," and the definitions they present are not "textbook definitions." In this case, they are in fact self-contradictory nonsense.


I'm not claiming to be an expert on NRM, that's why I look for references on the web. There aren't many true references out there, not in the same way there are for the standard OC quadrants. The most reliable definitions seem to be from Karen Pryor's site. What makes the definition of NRM that you present more correct than what 3 separate trainers have presented? I will cease questioning your definition of an NRM if you can adequately cite your sources.






> Faulty logic. A hammer does not work to drive screws, but hammers do in fact exist.


A hammer does work to drive nails, which is what it was designed to do... so if a NRM exists, then it should do what it is designed to do. And here's what you still haven't explained in all of this time, despite your comprehensive lessons, and it's getting frustrating. If these definitions presented by Karen Pryor faculty are incorrect, then what is an NRM supposed to do?


----------



## hanksimon (Mar 18, 2009)

@CatintheHat - Very clean, clear description. I have two questions:
1. My own semantic ignorance: "A marker is a signal that has been classically conditioned" ... Why classical rather than operant?
2. Argument? : "If "time out" is simply turning one's back on the dog for a 3 count, that is very clearly NRM. " ... I consider that turning my back for 15 - 30 sec. (longer than a 3 count) to be an aversive withdrawal of attention -P (?)


----------



## CatintheHat (Jun 7, 2009)

qingcong said:


> What makes the definition of NRM that you present more correct than what 3 separate trainers have presented?


 It's logically cohesive and not self-contradictory. It does not rely on redefining the words or the entire phrase. It relies on "no-reward marker" or "non-reinforcement marker" being exactly what it says it is, rather than a "cue for extinction" or "pink polka-dotted unicorn" or other fantastic construction. 




qingcong said:


> I will cease questioning your definition of an NRM if you can adequately cite your sources.


 That's certainly a valid request but not one that I am able to fulfill. I have been studying OC for many years and don't have a reliable source at hand. It is certainly true that there is not a lot of discussion of any aspect of OC in the recent academic literature, largely because radical behaviorism has been pretty fringe for the last few decades. The most recent textbook (of which I am aware) was published in 1991. That is probably why none of the web pages you linked cite any sources. That said, appeal to authority is a logical fallacy; appeal to logic is not. 




qingcong said:


> And here's what you still haven't explained in all of this time, despite your comprehensive lessons, and it's getting frustrating. If these definitions presented by Karen Pryor faculty are incorrect, then what is an NRM supposed to do?


 I have repeated several times what NRM does: it signals a trial that ends with no consequence. The impact on behavior of a trial that ends with no consequence depends on the learner and the reinforcement history of the behavior. How you may wish to apply this (or not) in your own training program is up to you. I have described a real-world example for dog training, a couple of hypotheticals for human behavior, and a real application for human behavior. 



hanksimon said:


> @CatintheHat - Very clean, clear description. I have two questions:
> 1. My own semantic ignorance: "A marker is a signal that has been classically conditioned" ... Why classical rather than operant?


Markers are classically conditioned because they are paired with an unconditioned stimulus and elicit a reflexive response. Charging a clicker is Pavlov straight, no chaser. 

In operant conditioning, the discriminative stimulus for a learned behavior also becomes classically conditioned. If you train "leave it" with +R, the "leave it" cue itself becomes appetitive after a number of repetitions. If you train "leave it" with +P, the "leave it" cue becomes aversive. 

This is why it is easier to build behavior chains with behaviors that have been conditioned through +R. The cue for the next behavior in the chain becomes a secondary reinforcer for the previous behavior in the chain. 




hanksimon said:


> 2. Argument? : "If "time out" is simply turning one's back on the dog for a 3 count, that is very clearly NRM. " ... I consider that turning my back for 15 - 30 sec. (longer than a 3 count) to be an aversive withdrawal of attention -P (?)


 This hinges on what "attention" comprises. I generally interpret that to mean interaction of some sort, in which case the cessation of that interaction is removing something. Example is playing tug, dog tooth touches human skin, human drops tug toy and turns away. Dog teeth touch human skin less often, so -P. 

If "attention" is just standing there looking at your dog without saying or doing anything (e.g., while you are waiting for your dog to sit in response to a sit cue), I don't think that's removing anything from the environment. How many dogs find "being looked at" rewarding? Some find it aversive, in which case removing attention would be -R. 

In any case, if the dog wants attention (or anything else) and you remove it, that is not aversive, it is appetitive.


----------



## hanksimon (Mar 18, 2009)

@ NRM - Aversive: Understood, and agreed that -R vs. -P is situational. However...
>>> "How many dogs find "being looked at" rewarding?" I imagine you haven't experienced the enthusiasm of a Lab, Pit, Boxer, or Rott .... who won't tail-wag, but full body-wag when you look at them!  But agreed that most dogs have to be taught to accept extended eye contact...

I disagree that charging a clicker is Pavlov, reflexive, b/c the test that a dog is learning to understand the clicker is the apparent conscious change from looking at the clicker to looking at the food hand during charging, implying an understanding of the implied [sorry] promise... or antecedent-consequent chain.


----------



## qingcong (Oct 26, 2009)

CatintheHat said:


> It's logically cohesive and not self-contradictory. It does not rely on redefining the words or the entire phrase. It relies on "no-reward marker" or "non-reinforcement marker" being exactly what it says it is, rather than a "cue for extinction" or "pink polka-dotted unicorn" or other fantastic construction.
> 
> That's certainly a valid request but not one that I am able to fulfill. I have been studying OC for many years and don't have a reliable source at hand. It is certainly true that there is not a lot of discussion of any aspect of OC in the recent academic literature, largely because radical behaviorism has been pretty fringe for the last few decades. The most recent textbook (of which I am aware) was published in 1991. That is probably why none of the web pages you linked cite any sources. That said, appeal to authority is a logical fallacy; appeal to logic is not.


Fair enough. Logically, I don't know what the accepted definition of NRM is and quite frankly, neither your definition nor the Karen Pryor definition make much sense to me. I'm not being difficult with you, I just find it worthwhile to challenge what I'm learning to understand it better. 

The "cue for extinction" definition makes clear what the NRM is supposed to do, but it ignores the "marker" part as you point out. Also, can the absence of behavior be put on cue? I'm not clear on that. "leave it" is kind of like an absence of behavior, though it's usually more like a turn of the head than actually "leaving it". 

The definition you present makes sense in that you provide examples of real-life NRM. What doesn't make sense is that, if the NRM creates a change in behavior, it should, by definition, fall into one of the 4 OC quadrants. To me, this definition of NRM is also contradictory. The latency example you provided does not work because it is an example of -R.


----------



## KBLover (Sep 9, 2008)

qingcong said:


> Using catinthehat's definition, I think that the NRM exists only in concept. Is it a concept that dogs can grasp? I don't know.


WAAAY late to this, but interesting discussion.

What I consider an "NRM" is this: Wally and I are shaping (towards a specific goal, or playing "guess the behavior I want," where he just offers stuff and I decide what I want that round; if I'm just seeing what he might come up with to capture, I don't do it) and he offers the wrong behavior. I'll look away, or sometimes say "nope" if he's doing something and can't see me (and, thus, can't see me turn my head).

What he's supposed to do is...anything else but what he just did.

I don't consider it -P because nothing is being withdrawn. We're not stopping shaping. There's no reward being taken away. He just did something and...nothing happened. 

The buzzer analogy hits home for me because that's pretty much how I use it. I don't want to diminish him offering that behavior "globally", just...not NOW. If I wanted to punish him for something, it's something I want him to delete - in all situations. I can give no-reward markers whenever, it doesn't stop him from pawing things or picking things up (i.e. he'll offer it in a new situation just as readily as he did before) in general, just to this object, task, whatever. Wally is still 'pushing the button' to offer another 'answer'.

As for it being a lot of work - I don't know. All I know is it's how I communicate with him and he and I got in sync and he picks up what I mean and we run with it. 

Now whether or not it's "technically" a NRM really doesn't matter at the end of the day for me (not gonna make me throw it out or such). But I thought I'd just throw it out there and see what, if anything, sticks.


----------



## qingcong (Oct 26, 2009)

Yeah, I think that's about how most people understand and use the NRM. It makes sense in training that aside from markers & treats, we would want some other way to clue the dog when it's off track.





KBLover said:


> What he's supposed to do is...anything else but what he just did.


If you ask me, this would be +P. However minimally aversive it may be, if the feedback reduces what he just did, then it qualifies as +P. 

This whole NRM discussion is kind of reminding me of the semantics argument between a "correction" and a "punishment/aversive". A correction is an aversive, but people use the term correction as a euphemism. The subject on the receiving end doesn't care if it's called correction, aversive, punishment, or NRM. NRM kind of seems like one of those semantics things to me.


----------



## KBLover (Sep 9, 2008)

qingcong said:


> If you ask me, this would be +P. However minimally aversive it may be, if the feedback reduces what he just did, then it qualifies as +P.


The thing is, it doesn't.

I might not want it then, but maybe I want it later. 

For example, we're playing "guess the behavior" where I pick out a behavior in my mind and c/t him only when he offers it. I'd give no-reward markers, but that doesn't necessarily mean I don't want him to ever do that ever again. 

So let's say I just put him in the middle of a room. I might want him to sit-pretty. He barks. I give a no-reward marker. He does a regular sit - same thing. This time he does a sit-pretty. C/T. Now I want something different - say, lying down (the very thing I would have no-rewarded him on before). He offers it after a few "guesses" and he gets a reward.

I'm not trying to remove behaviors, but I'm also not rewarding every behavior he does. He's "guessing", and I'll either sound the bell (reward marker) or the buzzer (no-reward marker). That doesn't mean I never want him to lie down in this room. It means it wasn't what I was thinking of.

Another example. Let's say I want him to push a ball instead of throw it. I don't want him to never throw the ball, but I don't want to reward throwing when I want pushing. So I give a no-reward marker on the throws and reward the pushes. Tomorrow, I might want the throws and not the pushes because that's what we're working on that day. Again, I don't want to eliminate an action on the ball completely, we're just not practicing that at the moment and I use the no-reward marker to tell him that. 

Would you consider that I'm punishing one or the other, considering he'll offer either one if given a chance? I consider it focusing his thought process. He wants to do one of the 100 things I've rewarded him for on that ball, I want a certain one to have him practice it without removing his eagerness to do the others (because we'll be doing those later on).


----------



## SassyCat (Aug 29, 2011)

You can debate how "aversive" no-reward (or +P) is to a dog but it's highly subjective, like most other things. There's really no universal rule - some dogs go nuts while some couldn't care less about food or withdrawal of it.

This study has shown that:


> Cortisol level was significantly higher when using the quitting signal than when using the pinch collar or e-collar.


For those who don't know, cortisol is a stress hormone. The test was carried out on Belgian Shepherd police dogs, which are all highly motivated dogs. The experiment wasn't done to fit anyone's agenda; it simply shows facts which trainers can use to improve their methods. There will be more done with e-collars and no-reward markers on a more diverse group of dogs.

My point is that dogs tend to be vastly different when it comes to effectiveness of a certain reward or punishment.


----------



## qingcong (Oct 26, 2009)

KBLover said:


> The thing is, it doesn't.
> 
> I might not want it then, but maybe I want it later.


I understand what you're saying, and it makes sense. Keep in mind, I'm not arguing about _your_ definition of +P; I'm talking about whether what you do fits the accepted universal definition of +P, and it does, based on my understanding. Typically, +P requires more than one repetition to kill the behavior in all circumstances. A single, mild +P can work to discourage the current behavior while still being mild enough to allow the dog to go about the rest of its business.







SassyCat said:


> You can debate how "aversive" no-reward (or +P) is to a dog but it's highly subjective, like most other things. There's really no universal rule - some dogs go nuts while some couldn't care less about food or withdrawal of it.
> 
> This study has shown that:
> 
> ...


That's very interesting. I think the next step would be to study the long term effects of the e-collar and prong collar. It's been my experience that all aversive techniques will appear to work extremely well in the short term, but it's the long term effects that have turned me away from using them.


----------



## Pawzk9 (Jan 3, 2011)

qingcong said:


> Yeah, I think that's about how most people understand and use the NRM. It makes sense in training that aside from markers & treats, we would want some other way to clue the dog when it's off track.
> If you ask me, this would be +P. However minimally aversive it may be, if the feedback reduces what he just did, then it qualifies as +P.
> 
> This whole NRM discussion is kind of reminding me of the semantics argument between a "correction" and a "punishment/aversive". A correction is an aversive, but people use the term correction as a euphemism. The subject on the receiving end doesn't care if it's called correction, aversive, punishment, or NRM. NRM kind of seems like one of those semantics things to me.


Sue Ailsby (who doesn't use euphemisms) talks about the possibility of using a NRM on a self-rewarding behavior. After all, when we are using positive reinforcement, the behaviors themselves should become rewarding. For instance, a dog loves doing weave poles. Pops out. The dog is still reinforced by being allowed to continue. The NRM tells the dog exactly what point was the mistake and to try again. I wouldn't call that anything but information. I don't use them much, but that use does make sense to me.


----------



## Pawzk9 (Jan 3, 2011)

SassyCat said:


> You can debate how "aversive" no-reward (or +P) is to a dog but it's highly subjective, like most other things. There's really no universal rule - some dogs go nuts while some couldn't care less about food or withdrawal of it.
> 
> This study has shown that:
> 
> ...


The study is one of aversive training methods. Assuming none of the dogs were actually trained with positive reinforcement, the interrupter would of course indicate that something "bad" may be about to happen. And anticipating that could be more stressful than just being fried with the ecollar or gigged with the prong collar.


----------



## SassyCat (Aug 29, 2011)

Pawzk9 said:


> The study is one of aversive training methods. Assuming none of the dogs were actually trained with positive reinforcement, the interrupter would of course indicate that something "bad" may be about to happen. And anticipating that could be more stressful than just being fried with the ecollar or gigged with the prong collar.


The dogs being tested had no chronic stress and one group (Hannover) was not previously familiar with e-collars or prongs. The test would be pointless if all dogs had been subjected to the same treatment.


----------



## Pawzk9 (Jan 3, 2011)

SassyCat said:


> The dogs being tested had no chronic stress and one group (Hannover) was not previously familiar with e-collars or prongs. The test would be pointless if all dogs had been subjected to the same treatment.


Which has little to do with my questions about the test.


----------



## qingcong (Oct 26, 2009)

SassyCat said:


> The dogs being tested had no chronic stress and one group (Hannover) was not previously familiar with e-collars or prongs. The test would be pointless if all dogs had been subjected to the same treatment.


What Pawz is saying, and it's a good point, is that perhaps a conditioned marker for an aversive is more powerful than the aversive itself. It's just like how a clicker is more powerful than simply giving a treat.


----------



## SassyCat (Aug 29, 2011)

qingcong said:


> What Pawz is saying, and it's a good point, is that perhaps a conditioned marker for an aversive is more powerful than the aversive itself. It's just like how a clicker is more powerful than simply giving a treat.


Thanks for the clarification. I agree that the marker word is actually stronger than the stim itself. But how would that influence the test?

Now to be clear:


Pawzk9 said:


> Assuming none of the dogs were actually trained with positive reinforcement


Most dogs were trained with +R while the Hannover group had *no* aversive methods used on them (they are illegal in Germany). I thought that's what you meant to ask.

You can get the full report here. It's pretty big though and takes some time.


----------



## qingcong (Oct 26, 2009)

SassyCat said:


> Thanks for the clarification. I agree that the marker word is actually stronger than the stim itself. But how would that influence the test?


I haven't had time to go through the details of the test yet, but if they simply gave a pinch or zap without a marker, then it's easier to explain why the pinch or zap produced less stress than the conditioned quitting signal.


----------



## Pawzk9 (Jan 3, 2011)

SassyCat said:


> Most dogs were trained with +R while the Hannover group had *no* aversive methods used on them (they are illegal in Germany). I thought that's what you meant to ask.
> 
> You can get the full report here. It's pretty big though and takes some time.


The dogs were trained police dogs. Nowhere in the report did I see any indication that they were trained with R+ nor that the Hannover group experienced no aversive methods - simply no shock collars prior to the study. The study was about various forms of punishment. Also, since the quit signal was only trained enough to be effective on a couple of dogs, measurement on that would be pretty insufficient.


----------



## SassyCat (Aug 29, 2011)

Pawzk9 said:


> Nowhere in the report did I see any indication that they were trained with R+ nor that the Hannover group experienced no aversive methods - simply no shock collars prior to the study.


I was under the impression that all "tools that cause pain" were illegal in Germany (according to the new law) but that doesn't _seem_ to be the case for prongs (just checked) so you're right, prongs could have been used in the past, which would of course interfere with the test. That is why I assumed that +R was used, besides the fact that negative punishment is kinda pointless without positive reinforcement. Still though, I doubt that the training had *no* +R, but since the report isn't clear on that it's best to assume that there was a lot of yank & crank applied...



Pawzk9 said:


> Also, since the quit signal was only trained enough to be effective on a couple of dogs, measurement on that would be pretty insufficient.


I agree, but the interesting finding about cortisol levels remains. I wouldn't have been at all surprised if the dogs were not stressed by the quitting signal, but they were very stressed, which was not expected.



qingcong said:


> What Pawz is saying, and it's a good point, is that perhaps a conditioned marker for an aversive is more powerful than the aversive itself. It's just like how a clicker is more powerful than simply giving a treat.


Ok I finally get what you mean and yes it's a good point - I was stupid not to question that myself. Here's what they did (from the report):


> During the sessions in which the electronic training collar was tested, they held the receiver of the collar and _gave the electric impulse whenever the dog made the mistake_.


Meaning that, at least with e-collars, there was no marker. I definitely agree that the marker would cause more stress than just hitting the dog when it messes up.

For those interested, the test process starts at page 60.


----------



## Pawzk9 (Jan 3, 2011)

SassyCat said:


> I was under the impression that all "tools that cause pain" were illegal in Germany (according to the new law) but that doesn't _seem_ to be the case for prongs (just checked) so you're right, prongs could have been used in the past, which would of course interfere with the test. That is why I assumed that +R was used, besides the fact that negative punishment is kinda pointless without positive reinforcement. Still though, I doubt that the training had *no* +R, but since the report isn't clear on that it's best to assume that there was a lot of yank & crank applied...


Most tools can cause pain if that is the goal. The human body can cause pain if that is the goal. The "quit signal" sounded like a rather insufficiently trained food zen exercise, which would take a LOT to transfer over to getting the dog to look away from the decoy. And it sounds like the way it was trained was way too big a lump to get overall success.


----------

