In studies of risk-taking, a subject is typically presented with two choices: (a) a small reinforcer with a relatively high probability of reinforcement, versus (b) a larger reinforcer with a lower probability of reinforcement. Thus, at least one response option has a reinforcement probability < 1.0, and risk taking involves responding to an option (O1) when 1.0 > pO2 > pO1. A simple example would be: (a) $0.25 and p = 1.0 versus (b) $1.00 and p = 0.25. Note that the mathematically expected value (amount x probability) is the same for the two options. Since both options provide the same overall payoff, a preference for option (b) (e.g., choosing it on > 50% of trials) is considered risk taking. The converse demonstrates a preference for the non-risk option (sometimes referred to as being risk averse). In preparation for the programmatic study of risk-taking by human subjects with various forms of psychopathology (e.g., substance dependence, conduct disorder), we are investigating how variations in reinforcer amounts and probabilities affect response patterns under two-choice conditions.
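The equal-payoff example can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names and the 100-trial session are illustrative assumptions, not part of the procedure.

```python
def expected_value(amount, probability):
    """Mathematically expected value: amount x probability of reinforcement."""
    return amount * probability


def prefers_risk(risk_choices, total_trials):
    """Illustrative criterion from the text: risk option chosen on > 50% of trials."""
    return risk_choices / total_trials > 0.5


# Option (a): $0.25 at p = 1.0; option (b): $1.00 at p = 0.25.
ev_non_risk = expected_value(0.25, 1.0)
ev_risk = expected_value(1.00, 0.25)

print(ev_non_risk, ev_risk)   # both options have the same expected value
print(prefers_risk(30, 100))  # hypothetical session: 30 of 100 risk choices
```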
Historically, in fields such as psychology and behavioral ecology, risk-taking experiments with non-human subjects have employed highly controlled experimental contexts and food reinforcers (Hamm & Shettleworth, 1987; Mazur, 1996; Real, 1991). Experiments with human subjects, on the other hand, have traditionally presented a range of hypothetical reinforcer amounts and probabilities and recorded subjects' verbal reports of which alternative they would prefer (e.g., Kahneman & Tversky, 1984). While some risk-taking experiments with human subjects have used real monetary contingencies (Rachlin & Frankel, 1969; Slovic & Lichtenstein, 1968), this experimental approach is rare. Thus, we have revisited the use of a controlled context (i.e., monetary deprivation) and real reinforcement contingencies in the analysis of risk taking. For both humans and non-humans, data suggest that unless amounts and/or probabilities are extreme, organisms will prefer the non-risk option.
Subjects. Twelve subjects (four females, eight males) participated. All subjects were first screened and found to be free of: (a) current medical problems; (b) pregnancy; (c) use of medications with effects on the central nervous system; (d) current drug use (verified by daily urinalysis); and (e) any current or past DSM-IV (APA, 1994) Axis-I disorder, except past drug dependence. All subjects were unemployed at the time of the study, and worked for actual monetary reinforcers.
Procedure. The experimental task was essentially a version of the stochastic two-armed bandit problem (see Krebs, Kacelnik, & Taylor, 1978). A discrete-trial concurrent choice procedure was used, with a single response required to complete each trial. Since reinforcement was stochastic, actual reinforcement frequencies could vary across subjects and experimental sessions. Ten different combinations of reinforcer amounts and probabilities were used. These conditions, and the subjects exposed to each, are shown in Table 1. In most conditions, the mathematically expected values of the two options were approximately equal, which provides a useful index against which to compare response preferences. Because subjects generally chose the non-risk option, three subjects (2132, 2196, and 2184) were also exposed to conditions expected to promote risk-taking. These conditions were biased towards risk-taking because the risk option provided greater overall monetary payoff (see Figure 1, bottom panel, right-hand data points). When subjects were exposed to multiple conditions, we used an "ABACAD" design, and subjects moved across conditions only after responding met formal stability criteria (SD / mean, or coefficient of variation, ≤ 0.20).
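Two aspects of the procedure lend themselves to a short sketch: stochastic reinforcement, which lets obtained frequencies drift from the programmed probability, and the coefficient-of-variation stability criterion. The code below is illustrative only. The session length of 100 trials, the use of the sample SD, and the choice of per-session proportions as the measure being checked are all assumptions, since the text does not specify them.

```python
import random
import statistics


def reinforced_trials(probability, n_trials, rng):
    """Count reinforced trials when each trial pays off independently with `probability`."""
    return sum(rng.random() < probability for _ in range(n_trials))


def is_stable(values, criterion=0.20):
    """Stability check: coefficient of variation (SD / mean) <= criterion.

    Assumption: sample SD (n - 1 denominator); the text does not say which SD was used.
    """
    mean = statistics.mean(values)
    if mean == 0:
        return False  # CV is undefined when the mean is zero
    return statistics.stdev(values) / mean <= criterion


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Same programmed probability, but obtained frequencies can differ across sessions.
print([reinforced_trials(0.25, 100, rng) for _ in range(3)])

print(is_stable([0.20, 0.25, 0.22]))  # CV is about 0.11
print(is_stable([0.10, 0.30, 0.50]))  # CV is about 0.67
```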
Data analyses. Another aim of our investigations was the exploration of quantitative descriptions of response proportions. Quantitative analyses have long been employed in the study of decision-making (Houston, 1991; Machina, 1987; von Neumann & Morgenstern, 1947). Human subjects' response patterns consistently indicate that under conditions in which 1.0 > pO2 > pO1, response options are valued subjectively rather than by their mathematically expected value. These subjective values purportedly determine response preferences. Kahneman and Tversky's (1979, 1984) prospect theory provides quantitative predictions of risk taking under these conditions. Relevant to the present report, Rachlin, Logue, Gibbon, and Frankel (1986) noted the compatibility of prospect theory with the behavioral theories of matching and delay discounting, in which subjective value is measured as non-linear sensitivity to parameters of reinforcer amount, delay, and rate. Rachlin et al. (1986) suggested a matching-law based equation compatible with both prospect theory and behavioral accounts of choice:
[1] $B_1 / B_2 = (A_1 / A_2)^{s_a} \times (R_1 / R_2)^{s_r} \times (D_2 / D_1)^{s_d}$
where A refers to monetary amounts; R to reinforcement rate; D to the delay between response and reinforcer; and subscripts 1 and 2 to the respective response options. The parameters sa, sr, and sd refer to individual sensitivities to amount, rate, and delay, respectively. In discrete-trial designs with probabilistic outcomes, probability can be treated as equivalent to reinforcer rate and/or delay (Rachlin et al., 1986; Silberberg, Murray, Christensen, & Asano, 1988). (Note, however, that Green, Myerson, and Ostaszewski (1999) reported that under some conditions probability and delay may not be equivalent.) When the intertrial interval (ITI) is constant in discrete-trial procedures, it can be factored out if ITI values are short (5 sec in this procedure). Thus, when there is no delay between response and reinforcer, the equation can be simplified to account for only amounts and probabilities. Accordingly, we derived predictions based on a simplified equation adapted from Rachlin et al. (1986):
[2] $B_1 / B_2 = (A_1 / A_2)^{s_a} \times (P_1 / P_2)^{s_r}$
where A refers to monetary amounts, P to the probability of reinforcement, and subscripts 1 and 2 to the respective response options (Rachlin et al., 1986, provide a detailed account of this model). The parameters sa and sr designate individual sensitivities to amounts and probabilities, respectively. We set the exponents sa and sr to 0.50 and 1.50, respectively, based on previous data suggesting that subjects are undersensitive to amounts and oversensitive to rates / probabilities (e.g., de Villiers, 1977; Goodie & Fantino, 1995; Herrnstein, 1997; Kollins, Newland, & Critchfield, 1997; Slovic & Lichtenstein, 1968).
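As a worked illustration of equation 2 with the fixed exponents, the sketch below computes the predicted response ratio and converts it to a proportion. The function names are hypothetical, and the conversion B1 / (B1 + B2) = ratio / (1 + ratio) is the standard algebraic identity, assumed here to be how ratios were expressed as proportions.

```python
def equation2_ratio(a1, a2, p1, p2, sa=0.50, sr=1.50):
    """Predicted response ratio B1 / B2 = (A1 / A2)**sa * (P1 / P2)**sr."""
    return (a1 / a2) ** sa * (p1 / p2) ** sr


def ratio_to_proportion(ratio):
    """Convert B1 / B2 into the proportion B1 / (B1 + B2)."""
    return ratio / (1.0 + ratio)


# Option 1 = risk option ($1.00, p = 0.25); option 2 = non-risk option ($0.25, p = 1.0).
ratio = equation2_ratio(a1=1.00, a2=0.25, p1=0.25, p2=1.0)
proportion = ratio_to_proportion(ratio)
print(ratio, proportion)  # ratio of about 0.25, i.e. about 20% of choices on option 1
```

With both exponents set to 1.0 the same pair of options yields a ratio of 1.0 (a proportion of 0.50), which is the expected-value prediction; the departure from 0.50 comes entirely from the unequal sensitivities.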
Pilot data and post-experimental interviews suggested that subjects were oversensitive to probability, and more specifically that the aversive properties of non-reinforced trials (no gain) exerted substantial control over future choices, an observation consistent with previous findings (Slovic & Lichtenstein, 1968). We therefore developed a simple equation to provide a quantitative description of these avoidance-based response patterns. We will refer to equation 3 as the lost opportunities (LO) prediction:
[3] B1 / (B1 + B2) = LO1 / (LO1 + LO2)
where LO1 = [EV2 x (PN1 x Nt)] and LO2 = [EV1 x (PN2 x Nt)]. EV refers to the simple mathematical expected value (amount x probability of reinforcement); PN refers to the probability of a non-reinforced trial on each of the respective response options; and Nt is the number of trials per session. The rationale behind this equation is that subjects' choices are strongly influenced by the frequency of non-reinforced trials, which are discriminated as lost opportunities in which money could have been earned on the other response option.
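The arithmetic of the LO terms can be sketched as follows. The option values and the 100-trial session are hypothetical, and PN is taken to be 1 minus the reinforcement probability. Because Nt multiplies both terms, it cancels in the ratio. The text does not state which option carries subscript 1, so the sketch reports the printed ratio LO1 / (LO1 + LO2) together with its complement rather than assuming a reading.

```python
def lost_opportunities(amount1, p1, amount2, p2, n_trials):
    """Return (LO1, LO2) as defined for equation 3.

    LO1 = EV2 * (PN1 * Nt) and LO2 = EV1 * (PN2 * Nt), where PN = 1 - p.
    """
    ev1, ev2 = amount1 * p1, amount2 * p2
    pn1, pn2 = 1.0 - p1, 1.0 - p2
    return ev2 * pn1 * n_trials, ev1 * pn2 * n_trials


def lo_share(lo1, lo2):
    """LO1 / (LO1 + LO2), the right-hand side of equation 3 as printed.

    Undefined when both options are always reinforced (both LO terms are zero).
    """
    return lo1 / (lo1 + lo2)


# Hypothetical values: option 1 = $1.00 at p = 0.25, option 2 = $0.25 at p = 0.90,
# and an assumed 100 trials per session.
lo1, lo2 = lost_opportunities(1.00, 0.25, 0.25, 0.90, n_trials=100)
print(lo1, lo2)            # about 16.88 and 2.50
print(lo_share(lo1, lo2))  # about 0.87
print(lo_share(lo2, lo1))  # complement, about 0.13
```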
When faced with repeated trials in which reinforcement probabilities are not explicitly known, subjects may sample both options, at least initially. Subjects making choices based only on EV should eventually come to choose the option with the higher EV on every trial. Most data indicate that EV is unlikely to be a good predictor of choice under these conditions; nonetheless, its calculation provides a referent for deviations (e.g., differential control by amount and probability). Thus we also calculated predictions based on EV. Note that equation 2 and EV do not necessarily yield values between 0 and 1. For comparative purposes, all predicted and obtained data are expressed as proportions so that they can be directly compared (for example, expected value for the risk option (R) and non-risk option (NR) is expressed as R / (R + NR)).
Figure 1 shows hypothetical functions (filled symbols) for equations 2 (upper panel) and 3 (middle panel), and the expected value (lower panel) across a range of amounts and probabilities that approximate those used in the study. The predicted proportions of choices on the risk option are shown on the Y-axis. The hypothetical values were (i) non-risk option reinforcement probability = 0.90; (ii) risk option reinforcement probability = (left to right) 0.02 to 0.40 in increments of 0.02; (iii) risk option reinforcer amount = $1.00; (iv) non-risk option reinforcer amounts = $0.08 (filled circles), $0.15 (triangles), and $0.25 (squares). Also plotted in Figure 1 are the individual predictions (open circles) for all ten actual reinforcer amounts and probabilities used in the study (marked * on the X-axis label; see Table 1 for details). Open circles located within the range of the hypothetical distribution are initial conditions completed by each of four subjects; those to the right are the extended conditions designed to promote a preference for the risk option. Importantly, expected value predicts that subjects will choose the risk option on at least 50% of the trials (marked by the dotted line) in all ten conditions. In contrast, both equations 2 and 3, which account for subjective rather than expected values, predict that subjects will choose the risk option on < 50% of trials in all but the extended conditions.
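The expected-value curves in the lower panel can be regenerated from the hypothetical values listed in the text. This is a sketch under the assumption that each point is the risk option's expected value expressed as R / (R + NR); the function and variable names are illustrative, and no plotting is attempted.

```python
def ev_proportion(risk_amount, risk_p, non_risk_amount, non_risk_p):
    """Expected value of the risk option as a proportion: R / (R + NR)."""
    ev_risk = risk_amount * risk_p
    ev_non_risk = non_risk_amount * non_risk_p
    return ev_risk / (ev_risk + ev_non_risk)


# Hypothetical values listed in the text for Figure 1.
risk_probabilities = [round(0.02 * k, 2) for k in range(1, 21)]  # 0.02 ... 0.40
non_risk_amounts = [0.08, 0.15, 0.25]  # dollars
NON_RISK_P = 0.90
RISK_AMOUNT = 1.00

# One curve per non-risk amount, evaluated across the risk probabilities.
curves = {
    amount: [ev_proportion(RISK_AMOUNT, p, amount, NON_RISK_P) for p in risk_probabilities]
    for amount in non_risk_amounts
}

for amount, values in curves.items():
    print(f"non-risk ${amount:.2f}: first {values[0]:.3f}, last {values[-1]:.3f}")
```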
Figure 2 presents the results of least-squares linear regression analyses, showing the proportion of obtained choices (Y-axis) on the risk option as a function of the proportion of predicted choices (X-axis) on the risk option, for equations 2 and 3 (upper and lower panels, respectively). The filled circles represent data for a group of four subjects who were presented with the same conditions (see Table 1). The open circles represent individual data for three subjects who also completed the extended conditions (footnoted b in Table 1), with parameters that encouraged responding on the risk option. Regression coefficients are shown in the upper left of each panel. Under most conditions, subjects chose the non-risk option (e.g., the filled circles are all well below 0.50). Only under the extended conditions, in which the risk option clearly provided a greater overall monetary gain, did behavior shift to > 50% for the risk option. Equation 2 produced an r² value of 0.58, and equation 3 an r² value of 0.80. This difference appears mostly due to equation 2 accounting less well for the extreme risk aversion we observed under conditions with large discrepancies in reinforcement probability (filled circles).
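For readers who wish to run the same kind of analysis on their own data, a least-squares fit of obtained on predicted proportions, with r² computed from the residuals, can be sketched as follows. The data points below are made up for illustration and are not the values plotted in Figure 2; the function name is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up proportions for illustration only; these are NOT the study's data.
predicted = [0.10, 0.20, 0.30, 0.60]
obtained = [0.05, 0.12, 0.30, 0.55]
slope, intercept, r2 = linear_fit(predicted, obtained)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r2:.2f}")
```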
Figure 2 suggests that equations 2 and 3 differ modestly in predicting response proportions. However, for equation 2 we chose a priori to set the exponents sa and sr at 0.5 and 1.5. These values were selected as general markers in order to demonstrate that response options would be valued subjectively (in contrast with mathematically expected values). Had we selected different estimates, or fitted the data with these as free parameters, the r² value might well have been higher. Equation 3 described the response proportions well. This is likely because earlier observations with pilot subjects indicated that non-reinforced trials heavily influenced responding; that is, we developed the equation from data and subject reports already obtained under similar conditions. Certainly, the utility of equations 2 and 3 cannot be extended beyond the limited range of conditions used here. However, it appears that when human subjects respond for monetary reinforcers, and reinforcer amounts and probabilities are substantially discrepant, non-reinforced trials will serve as aversive stimuli. When subjects are working in a context of apparent deprivation (i.e., they are unemployed), non-reinforced trials may exert substantial control over responding. More extensive discussion of monetary deprivation, preference for the non-risk option, and individual subject data patterns is provided elsewhere (Lane & Cherek, in press-a).
Importantly, the present data served as a catalyst for a subsequent study of risk taking in individuals with various forms of psychopathology associated with high-risk behavior (e.g., drug dependence, conduct disorder, and repeated criminal activity). Understanding the aversive function of non-reinforced trials in the present study occasioned the observation that, compared to controls, the probability of risk taking in high-risk individuals was not significantly altered following non-reinforced trials or trials resulting in monetary loss (Lane & Cherek, in press-b). While it is beyond the scope of the present report, reanalysis of other data sets using equation 3 would provide useful information regarding its utility in future experiments.
American Psychiatric Association (1994). Diagnostic and Statistical Manual of Mental Disorders, 4th ed. (DSM-IV). Washington, DC.
de Villiers, P. (1977). Choice in concurrent schedules and a quantitative formulation of the law of effect. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior. Englewood Cliffs, NJ: Prentice-Hall.
Goodie, A. S. & Fantino, E. (1995). An experimentally derived base-rate error in humans. Psychological Science, 6, 101-106.
Green, L., Myerson, J., & Ostaszewski, P. (1999). Amount of reward has opposite effects on the discounting of delayed and probabilistic outcomes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 418-427.
Hamm, S. L. & Shettleworth, S. J. (1987). Risk aversion in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 13, 376-383.
Herrnstein, R. J. (1997). The matching law. H. Rachlin & D. I. Laibson (Eds.). Cambridge, MA: The Harvard University Press.
Houston, A. I. (1991). Risk-sensitive foraging theory and operant psychology. Journal of the Experimental Analysis of Behavior, 56, 585-590.
Kahneman, D. & Tversky, A. (1979). Prospect theory: An analysis of decisions under risk. Econometrica, 47, 263-291.
Kahneman, D. & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341-350.
Kollins, S. H., Newland, M. C., & Critchfield, T. S. (1997). Human sensitivity to reinforcement in operant choice: How much do consequences matter? Psychonomic Bulletin & Review, 4, 208-220.
Krebs, J. R., Kacelnik, A., & Taylor, P. (1978). Test of optimal sampling by foraging great tits. Nature, 275, 27-31.
Lane, S. D. & Cherek, D. R. (in press-a). Risk aversion in human subjects under conditions of probabilistic reward. The Psychological Record.
Lane, S. D. & Cherek, D. R. (in press-b). Analysis of risk-taking in adults with a history of high-risk behavior. Drug and Alcohol Dependence.
Machina, M. J. (1987). Decision-making in the presence of risk. Science, 236, 537-542.
Mazur, J. E. (1996). Choice with certain and uncertain reinforcers in an adjusting-delay procedure. Journal of the Experimental Analysis of Behavior, 66, 63-74.
Rachlin, H. & Frankel, M. (1969). Choice, rate of response, and rate of gambling. Journal of Experimental Psychology, 80, 444-449.
Rachlin, H., Logue, A. W., Gibbon, J., & Frankel, M. (1986). Cognition and behavior in studies of choice. Psychological Review, 93, 33-45.
Real, L. A. (1991). Animal choice behavior and the evolution of cognitive architecture. Science, 253, 980-986.
Silberberg, A., Murray, P., Christensen, J., & Asano, T. (1988). Choice in the repeated-gambles experiment. Journal of the Experimental Analysis of Behavior, 50, 187-195.
Slovic, P. & Lichtenstein, S. (1968). Relative importance of probabilities and payoffs in risk taking. Journal of Experimental Psychology, 78 (Monograph 3, Part 2), 1-17.
von Neumann, J. & Morgenstern, O. (1947). Theory of games and economic behavior (2nd ed.). Princeton, NJ: Princeton University Press.