In operant conditioning, schedules of reinforcement are an important component of the learning process. When and how often we reinforce a behavior can have a dramatic impact on the strength and rate of the response.
A schedule of reinforcement is a rule stating which instances of a behavior will be reinforced. In some cases, a behavior might be reinforced every time it occurs. Sometimes, a behavior might not be reinforced at all. Either positive reinforcement or negative reinforcement might be used, depending on the situation. In either case, the goal of reinforcement is always to strengthen the behavior and increase the likelihood that it will occur again in the future.
In real-world settings, behaviors are probably not going to be reinforced each and every time they occur. For situations where you are purposely trying to train and reinforce an action, such as in the classroom, in sports or in animal training, you might opt to follow a specific reinforcement schedule. As you'll see below, some schedules are best suited to certain types of training situations. In some cases, training might call for starting out with one schedule and switching to another once the desired behavior has been taught.
There are two broad types of reinforcement schedules:
1. Continuous Reinforcement
In continuous reinforcement, the desired behavior is reinforced every single time it occurs. Generally, this schedule is best used during the initial stages of learning in order to create a strong association between the behavior and its consequence. Once the response is firmly established, reinforcement is usually switched to a partial reinforcement schedule.
2. Partial Reinforcement
In partial reinforcement, the response is reinforced only part of the time. Learned behaviors are acquired more slowly with partial reinforcement, but the response is more resistant to extinction.
There are four schedules of partial reinforcement:
- Fixed-ratio schedules are those where a response is reinforced only after a specified number of responses. This schedule produces a high, steady rate of responding with only a brief pause after the delivery of the reinforcer. An example of a fixed-ratio schedule would be delivering a food pellet to a rat after it presses a bar five times.
- Variable-ratio schedules occur when a response is reinforced after an unpredictable number of responses. This schedule creates a high, steady rate of responding. Gambling and lottery games are good examples of a reward based on a variable-ratio schedule. In a lab setting, this might involve delivering a food pellet to a rat after one bar press, again after four bar presses, and a third pellet after two bar presses.
- Fixed-interval schedules are those where the first response is rewarded only after a specified amount of time has elapsed. This schedule causes high amounts of responding near the end of the interval, but much slower responding immediately after the delivery of the reinforcer. An example of this in a lab setting would be reinforcing a rat with a food pellet for the first bar press after a 30-second interval has elapsed.
- Variable-interval schedules occur when a response is rewarded after an unpredictable amount of time has passed. This schedule produces a slow, steady rate of response. An example of this would be delivering a food pellet to a rat after the first bar press following a one-minute interval, another pellet for the first response following a five-minute interval, and a third food pellet for the first response following a three-minute interval.
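The rules behind the four partial schedules can be sketched in a few lines of Python. This is only an illustrative sketch, not a standard implementation: the class names and the simulate helper are hypothetical, and the way the "unpredictable" counts and intervals are drawn (uniformly around a chosen mean) is an assumption, since the schedules themselves do not prescribe a distribution. Continuous reinforcement corresponds to a fixed-ratio schedule with a ratio of 1.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (n = 1 is continuous reinforcement)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses.

    Assumption: the required count is drawn uniformly from
    1 .. 2 * mean_n - 1, so it averages mean_n.
    """

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response once `interval` seconds have elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.start = 0.0  # time of the last reinforcer (or session start)

    def respond(self, t):
        if t - self.start >= self.interval:
            self.start = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable wait.

    Assumption: each wait is drawn uniformly from 0 .. 2 * mean_interval.
    """

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.start = 0.0
        self.wait = self._draw()

    def _draw(self):
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.start >= self.wait:
            self.start = t
            self.wait = self._draw()
            return True
        return False


def simulate(schedule, press_times):
    """Return the times at which a bar press was reinforced."""
    return [t for t in press_times if schedule.respond(t)]


# A rat pressing the bar once per second:
fr_times = simulate(FixedRatio(5), range(1, 31))      # [5, 10, 15, 20, 25, 30]
fi_times = simulate(FixedInterval(30), range(1, 61))  # [30, 60]
vr_times = simulate(VariableRatio(3, random.Random(0)), range(1, 31))
vi_times = simulate(VariableInterval(3, random.Random(0)), range(1, 31))
```

Note that this sketch models only when a reinforcer is delivered. The response patterns described above (such as the pause after reinforcement on a fixed-ratio schedule) are properties of the learner's behavior, not of the schedule rule, so they do not appear in the simulation.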
Choosing a Schedule
Deciding when to reinforce a behavior can depend upon a number of factors. In cases where you are specifically trying to teach a new behavior, a continuous schedule is often a good choice. Once the behavior has been learned, switching to a partial schedule is often preferable.
Realistically, reinforcing a behavior every single time it occurs can be difficult and requires a great deal of attention and resources. Partial schedules not only tend to produce behaviors that are more resistant to extinction, but they also reduce the risk that the subject will become satiated. If the reinforcer being used is no longer desired or rewarding, the subject may stop performing the desired behavior. For example, imagine that you are trying to teach a dog to sit. If you are using food as a reward, the dog might stop performing the action once he is full. In such instances, something like praise or attention might be a more effective reinforcer.
Ferster, C.B., & Skinner, B.F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.