Psychology 458/558
Judgment and Decision Making
Prof. Bertram Malle
Fall 1995
Lecture 9: Oct 31
Signal detection theory (SDT)

Whenever you perceive a symptom that imperfectly indicates the presence of a disease, a behavior that imperfectly indicates the presence of a personality trait, or a test score that imperfectly indicates the presence of a talent, you face a signal detection problem. The disease, trait, or talent is the signal, and you have to infer its presence against a background of noise (i.e., the healthy population, people without the trait, people without the talent). This situation can be depicted by two partially overlapping distributions (one for the noise, one for the signal) along a dimension x (the indicator, such as the symptom), which can be anything from your intuition to a test score. The left distribution shows the probability of observing x given noise, p(x|n), while the right distribution shows the probability of observing x given a signal, p(x|s).

At the extremes of x there is no decision problem: if x is very low (-1 or 0), you infer there is noise; if x is very high (4 or 5), you infer the presence of a signal. The problem is the overlapping area in the middle. Between 1 and 3, for example, it is about as likely that the indicator comes from the noise distribution as it is that it comes from the signal distribution. You have to decide at what level of x you infer a signal (say "Yes"). In other words, you have to decide on a cut-off criterion below which you assume there is noise and above which you infer there is a signal.

If the two distributions were farther apart, you wouldn't make many mistakes. (In fact, how far the distributions are apart is a measure of your "discriminative ability," sometimes called sensitivity.) Because the distributions overlap, however, you will make two types of errors: false alarms (saying "Yes" to noise) and misses (saying "No" to signals). Of course, you will also have two successes: hits (saying "Yes" to a signal) and correct rejections (saying "No" to noise).

The four possible outcomes are related to each other. The hits and misses sum to the number of presented signals; that is, the hit rate, p(Yes|signal), and the miss rate, p(No|signal), sum to 1.0. Similarly, the false alarms and correct rejections sum to the number of presented noise instances; that is, the false alarm rate, p(Yes|noise), and the correct rejection rate, p(No|noise), sum to 1.0. Because of these constraints, it suffices to examine hits and false alarms; the other two proportions follow logically.

The decision problem is now this: How can you optimize your detection behavior? The easy way out is to increase the discriminative ability (then the two distributions are farther apart and you make few mistakes, if any). But that is usually impossible. So, for any given discriminative ability, you have a strategic choice: you can either minimize false alarms or maximize hits (but not both). For example, you can say "Yes" very rarely and thus minimize false alarms, but that way you will also keep the hit rate relatively low. Alternatively, you could say "Yes" very often and thus maximize hits, but that would also unduly increase false alarms. These two situations, and an intermediate response strategy, are depicted in the following graph, along with the corresponding tables.
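The trade-off can be sketched numerically. The following is a minimal illustration, assuming equal-variance normal distributions for noise and signal; the means (1 and 3) and the standard deviation (2) are hypothetical values chosen for illustration, not values from the course graphs. It shows that moving the cut-off changes the hit rate and the false alarm rate in the same direction.

```python
from statistics import NormalDist

# Assumed (hypothetical) model: two equal-variance normal distributions.
noise = NormalDist(mu=1.0, sigma=2.0)
signal = NormalDist(mu=3.0, sigma=2.0)

def rates(cutoff):
    """Return (hit rate, false alarm rate) when saying "Yes" for x > cutoff."""
    hit_rate = 1.0 - signal.cdf(cutoff)         # p(Yes | signal)
    false_alarm_rate = 1.0 - noise.cdf(cutoff)  # p(Yes | noise)
    return hit_rate, false_alarm_rate

# A lenient, an intermediate, and a strict cut-off.
for cutoff in (0.0, 2.0, 4.0):
    hit, fa = rates(cutoff)
    print(f"cutoff={cutoff:.1f}  hit rate={hit:.2f}  false alarm rate={fa:.2f}")
```

Lowering the cut-off raises both rates, and raising it lowers both, so no cut-off can maximize hits and minimize false alarms at once.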

A "bold" criterion (I) yields a high hit rate but also many false alarms (because you say "Yes" very often). This results in a lower diagnosticity of any given "Yes": 95 hits out of 95 + 73 = 168 "Yes" responses, or 95/168 = .57. A "conservative" criterion (III) yields few false alarms but also few hits (because you say "Yes" only rarely). Hence, you increase the diagnosticity of any given "Yes": 38/48 = .79. Finally, the balanced criterion (II) maximizes the difference between hit rate and false alarm rate (.69 - .31 = .38) rather than either maximizing hits or minimizing false alarms. Its diagnosticity lies between the other two.
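The figures quoted above can be recomputed from the outcome counts. The sketch below takes the counts from the expected-value calculations later in these notes (100 signal trials and 100 noise trials per criterion); the dictionary layout and the function name summarize are only illustrative choices.

```python
# Outcome counts per criterion (100 signal and 100 noise trials each),
# taken from the expected-value calculations in these notes.
counts = {
    "I (bold)": {"hits": 95, "misses": 5, "false_alarms": 73, "correct_rejections": 27},
    "II (balanced)": {"hits": 69, "misses": 31, "false_alarms": 31, "correct_rejections": 69},
    "III (conservative)": {"hits": 38, "misses": 62, "false_alarms": 10, "correct_rejections": 90},
}

def summarize(c):
    """Return (hit rate, false alarm rate, diagnosticity of a "Yes")."""
    hit_rate = c["hits"] / (c["hits"] + c["misses"])
    fa_rate = c["false_alarms"] / (c["false_alarms"] + c["correct_rejections"])
    # Diagnosticity: proportion of all "Yes" responses that were hits.
    diagnosticity = c["hits"] / (c["hits"] + c["false_alarms"])
    return hit_rate, fa_rate, diagnosticity

for name, c in counts.items():
    hit_rate, fa_rate, diag = summarize(c)
    print(f"{name}: hit rate={hit_rate:.2f}, false alarm rate={fa_rate:.2f}, "
          f"difference={hit_rate - fa_rate:.2f}, diagnosticity={diag:.2f}")
```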

In general, how do we find the optimal cut-off criterion? It depends on two factors:

  1. The base rates (i.e., prior probabilities) of signal and noise
  2. The relative costs of errors (or benefits of successes)

If the base rates of signal and noise are equal and the relative costs of errors are equal too, then the optimal criterion lies at the balance point of x where the two distributions intersect, that is, where p(x|n) = p(x|s), which was criterion II above. However, the less frequent the signal becomes (relative to the noise), the stricter our criterion must become. Why? Because we get a relatively large number of false alarms from the huge distribution of noise. In other words, with rare diseases, we need a lot of evidence in order to get good diagnosticity. (Remember from Lecture 8 that rare diseases require tests with low false alarm rates, which means tests with strict cut-off criteria.) The following two graphs show the influence of different base rates on the cut-off criterion.
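The effect of the base rate on diagnosticity can be checked with Bayes' rule. This is a small sketch that uses the hit and false alarm rates of criteria I, II, and III from above; the rare-signal base rate of 0.1 is a hypothetical value chosen for illustration.

```python
def diagnosticity(hit_rate, fa_rate, p_signal):
    """Return p(signal | "Yes") by Bayes' rule."""
    yes_from_signal = p_signal * hit_rate
    yes_from_noise = (1.0 - p_signal) * fa_rate
    return yes_from_signal / (yes_from_signal + yes_from_noise)

# (hit rate, false alarm rate) for the three criteria discussed above.
criteria = {"I": (0.95, 0.73), "II": (0.69, 0.31), "III": (0.38, 0.10)}

# 0.5 = equal base rates; 0.1 = hypothetical rare signal.
for p_signal in (0.5, 0.1):
    for name, (hit, fa) in criteria.items():
        d = diagnosticity(hit, fa, p_signal)
        print(f"p(s)={p_signal:.1f}  criterion {name}: diagnosticity={d:.2f}")
```

With the rarer signal, every criterion loses diagnosticity, and the strict criterion III retains the most, which is why a rare signal calls for a stricter cut-off.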

The second factor in choosing an optimal cut-off point is the relative costs and benefits of the four outcomes (hits, misses, false alarms, and correct rejections).

If the benefits of hits are high (e.g., during an initial screening for a dangerous disease), you should use a lenient criterion, which means saying "Yes" quite often. By contrast, if the costs of false alarms are high (e.g., painful therapy), you should use a stricter criterion, which means saying "Yes" less often.

Now consider how different costs and rewards interact with different cut-off criteria. Consider the following three payoff matrices A, B, and C (values are in $, but they could be in "joy units" as well):

In situation A you win $3 for every hit (and lose $3 for every miss); and you lose $1 for every false alarm (and gain $1 for every correct rejection). The payoffs are constrained row-wise just like the hit rate and false alarm rate were: the payoffs for a hit and a miss add up to 0, and so do the payoffs for a false alarm and a correct rejection.

For each of these matrices, A through C, we can now calculate the expected values under the three different empirical cut-off criteria (I, II, III) discussed before. That way we find out which cut-off criterion is best for which payoff situation. We simply multiply the payoffs by the corresponding frequencies (hits, misses, etc.) and sum the products to get the expected payoff (in $) over the 200 trials. For payoff matrix A we find:

EV(I) = 3(95) - 3(5) - 1(73) + 1(27) = $224
EV(II) = 3(69) - 3(31) - 1(31) + 1(69) = $152
EV(III) = 3(38) - 3(62) - 1(10) + 1(90) = $8

Clearly, for situation A, a bold criterion (I) fares best because it maximizes hits, which are well paid for ($3). What about situations B and C? For a voluntary assignment, calculate the expected payoffs and explain which criterion is the best in each situation.
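The calculation for matrix A can be reproduced in a few lines. The sketch below uses the payoffs of situation A and the outcome counts from the three equations above; the function name expected_payoff and the dictionary layout are illustrative choices. The payoffs for B and C are not filled in here, but any other payoff matrix can be passed to the same function.

```python
# Payoff matrix A: $ gained (+) or lost (-) per outcome.
payoff_a = {"hit": 3, "miss": -3, "false_alarm": -1, "correct_rejection": 1}

# Outcome counts over 200 trials (100 signal, 100 noise) for each criterion.
counts = {
    "I": {"hit": 95, "miss": 5, "false_alarm": 73, "correct_rejection": 27},
    "II": {"hit": 69, "miss": 31, "false_alarm": 31, "correct_rejection": 69},
    "III": {"hit": 38, "miss": 62, "false_alarm": 10, "correct_rejection": 90},
}

def expected_payoff(payoffs, outcome_counts):
    """Sum of payoff times frequency over the four outcomes."""
    return sum(payoffs[outcome] * outcome_counts[outcome] for outcome in payoffs)

for name, outcome_counts in counts.items():
    print(f"EV({name}) = ${expected_payoff(payoff_a, outcome_counts)}")
```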

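There is a formula that allows you to find the best cut-off criterion for a given problem. Without deriving it, here is what this formula says, in its standard form: We maximize expected utility if we say "Yes" whenever

p(x|s) / p(x|n)  >  [p(n) / p(s)] x [(V(CR) + C(FA)) / (V(H) + C(M))]

Here V(H) and V(CR) stand for the benefits of a hit and a correct rejection, and C(FA) and C(M) for the costs of a false alarm and a miss, all taken as positive amounts.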

The leftmost term is the likelihood ratio L(x) (sometimes called beta). This ratio is the height of the signal distribution, p(x|s), over the height of the noise distribution, p(x|n), at the particular point x at which we place the cut-off. The term p(n)/p(s) is the ratio of the base rate of noise to the base rate of signal (the prior odds against a signal). Finally, the rightmost expression takes the costs and benefits into account. (If we did an exact derivation of the formula, you would see why the costs and benefits are added in this particular way.)

If the prior probabilities of signal and noise are equal, then p(n)/p(s) = 1; and if the payoffs are balanced, too, the rightmost expression is also 1. Hence, the optimal criterion for this case is L(x) = 1. This is the point at which the two distributions intersect. If either the base rates differ or the costs of errors are unbalanced, the criterion shifts below 1 (i.e., downward, becoming more lenient) or above 1 (i.e., upward, becoming stricter). For example, if p(s) gets smaller (rare signal), the optimal criterion L(x) will exceed 1, thus being stricter. If the costs of false alarms increase (e.g., painful therapy), the criterion also gets bigger than 1. If, however, the costs of misses increase (e.g., initial screening), the criterion falls below 1, thus being more lenient.
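These shifts can be illustrated with a short sketch. It assumes the standard form of the decision rule, in which the optimal likelihood ratio equals p(n)/p(s) times (V(CR) + C(FA)) / (V(H) + C(M)), with values and costs entered as positive amounts. To turn the likelihood ratio into a cut-off on x, it further assumes equal-variance normal distributions with hypothetical means (1 and 3) and standard deviation (2). The function names are illustrative.

```python
import math

def optimal_beta(p_signal, v_hit, c_miss, v_cr, c_fa):
    """Optimal likelihood-ratio criterion (standard form, assumed here)."""
    p_noise = 1.0 - p_signal
    return (p_noise / p_signal) * ((v_cr + c_fa) / (v_hit + c_miss))

def cutoff_for_beta(beta, mu_noise, mu_signal, sigma):
    """Point x where p(x|s)/p(x|n) equals beta, for equal-variance normals.

    For this model ln L(x) = (mu_signal - mu_noise) * (x - midpoint) / sigma**2,
    which is solved for x below.
    """
    midpoint = (mu_noise + mu_signal) / 2.0
    return midpoint + sigma ** 2 * math.log(beta) / (mu_signal - mu_noise)

# Hypothetical distribution parameters, for illustration only.
MU_NOISE, MU_SIGNAL, SIGMA = 1.0, 3.0, 2.0

cases = {
    "equal base rates, balanced payoffs": optimal_beta(0.5, 1, 1, 1, 1),
    "rare signal, p(s) = 0.1": optimal_beta(0.1, 1, 1, 1, 1),
    "payoff matrix A ($3 hits/misses, $1 otherwise)": optimal_beta(0.5, 3, 3, 1, 1),
}
for label, beta in cases.items():
    x_c = cutoff_for_beta(beta, MU_NOISE, MU_SIGNAL, SIGMA)
    print(f"{label}: beta={beta:.2f}, cut-off x={x_c:.2f}")
```

Under these assumptions, the balanced case puts the cut-off at the intersection of the two distributions, the rare signal pushes it upward (stricter), and payoff matrix A pulls it downward (more lenient), in line with the bold criterion faring best in situation A.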

In sum, SDT highlights the importance of choices in any problem of detection (e.g., admitting the "good" students on the basis of GREs; detecting the people who are "interested" from their degree of friendliness; finding shipwrecks in the deep sea). Given a certain discriminative ability (the distance between signal and noise distributions), you decide on a criterion that results in a particular trade-off between hits and false alarms. The prevalence of the signal and the costs/benefits of the different outcomes are crucial factors influencing your choice of that criterion.