In the laboratory study of self-control, subjects are given choices between a small reinforcer available after a short delay and a larger reinforcer available after a longer delay. Preference for the smaller, more immediate reinforcer is said to reflect sensitivity to reinforcement immediacy, sometimes labeled "impulsivity," whereas preference for the larger, more delayed reinforcer is said to reflect sensitivity to reinforcer amount, sometimes labeled "self-control."
Over the past two decades, some notable human-nonhuman differences have been reported on these procedures. The choices of nonhuman subjects (mainly pigeons) tend to show far greater sensitivity to delay than do the choices of human subjects, prompting some to conclude that self-control choices in humans and nonhumans are governed by different variables (Lowe & Horne, 1988). Such a view, however, overlooks critical differences in the procedures typically used to study self-control in humans and nonhumans. In experiments with nonhuman subjects, reinforcers typically consist of food, which is consumed soon after it is presented. In experiments with adult humans, on the other hand, reinforcers typically consist of token reinforcers--points later exchangeable for other reinforcers, usually money. Unlike food, token reinforcers such as points cannot be consumed immediately; rather, they derive their reinforcing value from their relationship to post-session reinforcers. Until these procedural discrepancies are rectified, it is not possible to determine whether--or to what extent--genuine species differences in self-control exist.
Recent work in our laboratory has attempted to minimize the species-typical procedural differences that have hampered past human-nonhuman comparisons in self-control. We have taken two general approaches to the problem. In the first, we have used procedures with pigeons that are more analogous to those typically used with humans. For example, Jackson and Hackenberg (1996) studied pigeons' choices in a self-control arrangement with token-like reinforcers. The objective was to bring the procedures into greater alignment with the token-based point-money reinforcers used with human subjects. Pigeons' choices produced either 1 or 3 light-emitting diodes (LEDs) as a form of token reinforcement, with each LED exchangeable for 2-s food during signaled exchange periods. Preference for the large-reinforcer (3 LED) option was consistently found when delays to exchange periods were equal for both alternatives, as they typically are in experiments with humans using point-money reinforcers (i.e., at session's end). Preference reversed, however, in favor of the small-reinforcer (1 LED) option when the exchange delays were shorter for the small reinforcer. Thus, the degree to which self-control or impulsivity was observed was shown to be a function of specific procedural variables, lending support to the view that human-nonhuman differences in self-control are at least partly due to differences in procedure.
A second approach to the problem involves using with human subjects procedures more analogous to those typically used with nonhuman subjects. The initial step involves identifying a reinforcing event for humans that can somehow be functionally calibrated with the reinforcers used with other animals. In finding such a reinforcing event, it is important to demonstrate, at minimum, that choices are sensitive to reinforcer amount and reinforcer delay separately before these variables are pitted against each other in a self-control task. If the larger reinforcer is not preferred over the smaller reinforcer with equal delays, or if the less delayed reinforcer is not preferred over the more delayed reinforcer with equal amounts, then that reinforcer does not exert characteristic effects on behavior, and may not be suitable for a self-control choice procedure (Navarick, 1988). This may be seen as a way of calibrating different procedures in terms of their ability to produce characteristic effects of reinforcer amount and delay. This kind of calibration is especially important given the often considerable variability in the methods employed across different human operant laboratories.
Promising in this regard is a method developed by Navarick (1996, 1998) based on access to prerecorded TV segments as a reinforcer for humans in a self-control context. Subjects were given choices between briefly presented video segments that differed in their duration (reinforcer amount) and in the delays to their onset (reinforcer delay). In the more recent study, Navarick (1998) found that approximately 40% of subjects preferred a shorter (15 s) segment presented immediately to a longer (25 s) segment after a 55-s delay. Although considerable between-subject variability and brief exposure of subjects to the procedures limit stronger conclusions, this procedure at least partially succeeded where others have failed, that is, in producing impulsive choices in human subjects under laboratory conditions.
The feasibility of such a method is further illustrated by some recent preliminary data collected in our laboratory. To facilitate comparisons to our pigeon data, we used token-based self-control procedures, in which subjects chose between immediate and delayed presentation of lights as a form of token reinforcement. Each light, when illuminated, was exchangeable for 15-s access to a prerecorded TV program. In self-control conditions, subjects chose between 1 light (exchangeable for 15-s access to video) presented immediately and 3 lights (exchangeable for 45-s access to video) after a delay. In some conditions, exchange periods occurred soon after token delivery, such that small-reinforcer choices resulted in quicker access to the exchange period and to the video. In other conditions, delays to the exchange period were equal for both options. If the choices of humans are comparable to those of pigeons in our earlier work, then we would expect preference for the small reinforcer under the former conditions (with unequal delays to the exchange period) and preference for the large reinforcer under the latter conditions (with equal delays to the exchange period). In addition, sensitivity to reinforcer delay and sensitivity to reinforcer amount were evaluated separately for each subject prior to the self-control conditions.
Subjects and Apparatus
Two adult subjects (designated 1040 and 1042) participated in exchange for money (approximately $5/hr, paid at the end of the experiment). Subjects worked, seated, in a small enclosure (2.21 m high by 1.21 m across by 1.25 m deep) before a response console, consisting of (from top to bottom) a 10 x 3 matrix of small red lights (which served as tokens), 3 response keys, and a video monitor. The tokens were lit from left to right and, when exchanged, were extinguished from right to left. During reinforcement periods, the audio and visual signals to the monitor were activated, permitting access to a prerecorded program playing on an interconnected videocassette recorder (VCR, GE model VG4064). The videotape played at other times as well (see below), but with the audio and visual signals to the monitor interrupted. Six white lights (3 on each side of the token matrix) were illuminated at all times during the session, except when the videotape was in the PLAY mode, whether the monitor was activated or not. Subjects were permitted to select a video prior to each session from a small group of documentary-type (e.g., science and nature) programs.
Procedure
The following instructions were posted in the workspace:
When the white lights on the front panel are off, the video is playing. Thus, whenever the white lights are off and there is no picture on the video monitor you are missing the film. You may turn on the video monitor by pressing the response keys when lit. Press only one key at a time. The key you press may affect the length of time that the video plays before it is interrupted. It may also affect how long you will wait before the video starts. Occasionally, the video monitor may turn on when you have not pressed the keys. This is a preview period. Please remain seated. You will be informed when the session is over.
During the choice phase of each trial, a single press on the yellow side key produced the small reinforcer (1 token), whereas a single press on the green side key produced the large reinforcer (3 tokens). (The left-right position of the alternatives was determined randomly on each trial.) Tokens were exchangeable for video access on each trial during exchange periods, signaled by the illumination of the center (red) key. A single response on the red key turned off one light and produced 15-s access to the videotape. On large-reinforcer exchanges, a center-key response was required to produce each successive 15-s segment. When all tokens earned had been exchanged, the center key went off and an intertrial interval (ITI) period began. The duration of the ITI varied in such a way that trials were evenly spaced in time, either 150 s or 210 s apart (depending on whether the large-reinforcer delay was 60 s or 120 s, respectively; see below). This equal trial spacing was critical in holding constant overall reinforcement rate (thereby ensuring that the delay to the upcoming trial was unaffected by which alternative had been selected).
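The equal-spacing constraint amounts to simple arithmetic. The following is a minimal sketch, not the control program used in the study: it assumes that the ITI simply fills whatever time remains in the fixed trial spacing once the other trial components have elapsed. The function name, the 2-s choice latency, and the omission of center-key response latencies are all hypothetical simplifications.

```python
def iti_duration(trial_spacing_s, elapsed_s):
    """Return the ITI (in s) that keeps trial onsets trial_spacing_s apart.

    elapsed_s is the time already used by the current trial (preview,
    choice latency, delay, token presentation, and exchange period).
    Assumption: the ITI simply absorbs the remaining time.
    """
    if elapsed_s > trial_spacing_s:
        raise ValueError("trial components exceed the fixed trial spacing")
    return trial_spacing_s - elapsed_s


# Hypothetical trial with a 60-s large-reinforcer delay (150-s spacing).
PREVIEW_S = 15.0          # free video access at trial onset (from the text)
CHOICE_LATENCY_S = 2.0    # hypothetical time taken to press a side key

# Small reinforcer: 1 token delivered immediately, then 15 s of video.
small_elapsed = PREVIEW_S + CHOICE_LATENCY_S + 15.0
# Large reinforcer: 60-s delay, 1.2 s to present 3 tokens, then 45 s of video.
large_elapsed = PREVIEW_S + CHOICE_LATENCY_S + 60.0 + 1.2 + 45.0

small_iti = iti_duration(150.0, small_elapsed)
large_iti = iti_duration(150.0, large_elapsed)

# Either choice leads to the next trial at the same point in time.
print(round(small_elapsed + small_iti, 1), round(large_elapsed + large_iti, 1))
```

Under these assumptions, a small-reinforcer choice is followed by a longer ITI than a large-reinforcer choice, so the next trial begins at the same moment regardless of which alternative was selected.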
Due to the nature of the video reinforcer, whose effectiveness depended in part on continuity with prior segments, it seemed important to also hold constant the time between each choice and the previous video segment. Therefore, each trial began with a 15-s period of free video access, designed as a kind of "preview" of upcoming choice outcomes. The audio and visual signals were then interrupted (while the tape continued to run in the VCR) during the choice and delay periods of the trial, but were reinstated during exchange periods. The tape stopped during the ITI. A session consisted of 10 choice trials preceded by 4 forced-choice trials (during which only 1 of the 2 side keys was lit and operative).
The main independent variable was the delay to the exchange period. Under Unequal Delay (UD) conditions, the exchange period was scheduled just following token presentation, and therefore occurred sooner following small-reinforcer choices. Under Equal Delay (ED) conditions, the exchange period was scheduled after an equal delay following either choice. Because it took 1.2 s to present 3 tokens in succession, the delay to the exchange period under ED conditions was set equal to x + 1.2 s, where x equals the pre-token delay associated with the large reinforcer. For Subject 1040, the large-reinforcer pre-token delay was 60 s across the initial block of conditions and 120 s across a second block of conditions. For Subject 1042, the large-reinforcer delay was 120 s throughout. Exchange-delay conditions alternated in A-B-A fashion, with UD conditions constituting A phases and ED conditions B phases. With one exception (Subject 1042's initial exposure to the UD condition), conditions were in effect for at least 2 sessions and until choice proportions were judged stable by visual inspection. Subjects 1040 and 1042 completed a total of 23 and 9 self-control sessions, respectively.
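The difference between the two exchange-delay conditions can be sketched as a small function. This is an illustrative sketch under stated assumptions rather than the scheduling code used in the experiment: the function and parameter names are hypothetical, and the time needed to present a single token is not reported, so it appears as a placeholder parameter set to 0 s.

```python
TOKEN_PRESENTATION_S = 1.2  # time to present 3 tokens in succession (from the text)


def exchange_delay(condition, choice, large_pretoken_delay_s,
                   small_token_presentation_s=0.0):
    """Return the delay (in s) from a choice to the exchange period.

    condition: "UD" (unequal delay) or "ED" (equal delay).
    choice: "small" (1 token) or "large" (3 tokens).
    small_token_presentation_s is a placeholder; the text does not report
    how long it took to present a single token.
    """
    if condition not in ("UD", "ED"):
        raise ValueError("condition must be 'UD' or 'ED'")
    if choice not in ("small", "large"):
        raise ValueError("choice must be 'small' or 'large'")

    large_total = large_pretoken_delay_s + TOKEN_PRESENTATION_S
    if condition == "ED" or choice == "large":
        # ED: both options wait x + 1.2 s; UD large: delay plus presentation.
        return large_total
    # UD small: the token is immediate and the exchange follows directly.
    return small_token_presentation_s


for condition in ("UD", "ED"):
    for choice in ("small", "large"):
        print(condition, choice, exchange_delay(condition, choice, 120.0))
```

In this sketch the two options differ in exchange delay only under UD conditions; under ED conditions the returned delay is identical for both choices, which is the defining feature of that condition.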
Prior to these self-control conditions, in which reinforcer delay and reinforcer amount were placed in opposition, two sessions were conducted to assess sensitivity to reinforcer amount and reinforcer delay separately. In the first of these (Amount Sensitivity), subjects chose between a larger reinforcer (3 tokens, 45-s video access) and a smaller reinforcer (1 token, 15-s video access) with equal delays to token onset and exchange. Selecting the green key produced 3 tokens immediately, whereas selecting the yellow key produced 1 token immediately. In the second (Delay Sensitivity), subjects chose between unequal delays to an equal-duration reinforcer (1 token, 15-s video access). Selecting the yellow key produced 1 token immediately, whereas selecting the green key produced 1 token after a 60-s delay. The exchange period was scheduled just after token presentation.
The main results are depicted in Figure 1 for Subject 1040 (bottom panels) and Subject 1042 (top panels). The left-most graphs for each subject show the mean number of green-key choices per session in the first two conditions, in which reinforcer amount and reinforcer delay were separately manipulated. Both subjects strongly preferred the larger reinforcer (3 tokens, 45-s video access) to the smaller reinforcer (1 token, 15-s access) with equal token and exchange delays, and immediate over delayed access to equal reinforcer amounts (1 token, 15-s access). These results are important in showing that video access shows the characteristic delay- and amount-sensitivity of other reinforcers, and can therefore be used meaningfully in a self-control context.
The right-most graphs for each subject show the mean number of large-reinforcer choices per session across the final two sessions in each of the self-control conditions. Because a session consisted of 10 choice trials, values above and below 5 (indicated by the reference line) represent preference for the large and small reinforcer, respectively. The pre-token delay for large-reinforcer choices is shown below each group of bars--120 s for Subject 1042 and 60 and 120 s for Subject 1040.
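The preference measure described above can be written out as a short calculation. The sketch below is only illustrative: the function name is hypothetical, the example counts are invented rather than taken from the study, and the label for a mean of exactly 5 is an assumption, since the text classifies only values above and below the reference line.

```python
def classify_preference(large_choices_per_session, trials_per_session=10):
    """Return (mean large-reinforcer choices, preference label).

    large_choices_per_session holds one count per session, e.g. the
    final two sessions of a condition.
    """
    if not large_choices_per_session:
        raise ValueError("at least one session is required")
    mean_large = sum(large_choices_per_session) / len(large_choices_per_session)
    midpoint = trials_per_session / 2  # the reference line (5 of 10 trials)
    if mean_large > midpoint:
        label = "large"
    elif mean_large < midpoint:
        label = "small"
    else:
        label = "indifferent"  # assumption: the text does not label this case
    return mean_large, label


# Hypothetical counts, not data from the study.
print(classify_preference([9, 10]))
print(classify_preference([1, 2]))
```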
For both subjects, the number of large-reinforcer choices depended on the delay to the exchange period. For Subject 1040 (bottom panel), preference for the small reinforcer was seen in the initial exposure to conditions with a 60-s pre-token delay to the large reinforcer. Preference reversed in favor of the large reinforcer under ED conditions, but did not reverse back upon a return to the UD conditions. The large-reinforcer delay was then increased to 120 s for the final 3 conditions. This produced a strong preference for the small reinforcer under UD conditions and for the large reinforcer under ED conditions, an effect which was reversible upon a return to UD conditions. Similarly, Subject 1042 (top panel) showed a clear preference for the smaller reinforcer under both UD conditions and for the larger reinforcer under ED conditions.
The research reported here is still quite preliminary, so the results should be viewed with caution. Nevertheless, the within- and between-subject replicability obtained thus far suggests that video access can serve as an effective reinforcer for human subjects in a laboratory self-control context. To be sure, video access has features which distinguish it from primary reinforcers such as food. First, unlike primary reinforcers, the reinforcing effectiveness of video depends at least partly on its continuity through time. We capitalized on this feature by interrupting the signal (while letting the tape run) during the delay to the larger reinforcer. The goal was to generate preference for the smaller reinforcer by discounting (by whatever means) the larger reinforcer. In so doing, however, we may have altered the relative reinforcing efficacy of the alternatives through changes in reinforcer quality (continuity of the signal) instead of, or in addition to, reinforcer delay. Isolating the effects of these variables is an important topic for future research.
Second, unlike primary reinforcers, video is established as an effective reinforcer through a long social history prior to one's participation in an experiment. For the present purposes, however, the origins of such a reinforcer are less important than its current functions. The strategy here is practical--to maximize the likelihood of identifying effective reinforcers for a given species, while, at the same time, respecting anatomical and ecological differences between species.
Thus, while one might question the functional comparability of video access as a reinforcer for a non-deprived human and food as a reinforcer for a food-deprived pigeon, the fact is that both appear to yield similar effects on behavior. Together with Navarick's (1996, 1998) results, our results permit the following conclusions: (1) Like food, a greater amount of video access (a longer segment) is preferred to a lesser amount with reinforcer delays held constant. (2) Like food, quicker access to an equal-duration video segment is preferred to delayed access to it. (3) Like food, with sufficiently long delays, preference for a longer video segment reverses in favor of a smaller more immediate video segment. (4) With equal delays to an exchange period, preference reverses back in favor of the larger reinforcer. That is, choices appear to be governed by the same exchange-delay manipulations identified in our prior work with pigeons (Jackson & Hackenberg, 1996). That the same general relations appear to hold across such seemingly different reinforcers bodes well for attempts to examine cross-species generality in choice and self-control.
Jackson, K., & Hackenberg, T. D. (1996). Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior, 66, 29-49.
Lowe, C. F., & Horne, P. J. (1988). On the origins of selves and self-control. Behavioral and Brain Sciences, 11, 689-690.
Navarick, D. J. (1988). Spurious self-control: Potential outcome in research with humans. Behavioral and Brain Sciences, 11, 691-692.
Navarick, D. J. (1996). Choice in humans: Techniques for enhancing sensitivity to reinforcement immediacy. The Psychological Record, 46, 539-554.
Navarick, D. J. (1998). Impulsive choice in adults: How consistent are individual differences? The Psychological Record, 48, 665-674.