B. Bayes Rule and Conditional Probability: At-risk Students example

To augment the material on Bayes rule in Stat190, the following (highly artificial) example was used in 191X. Let Event A1 be that the student drops out of school and Event A2 that the student does not drop out of school. Let Event B be that the student has a bad (deficient) home environment. The retrospective probability P{B|A1} and also the marginal probabilities P{A1}, P{B} can be used via Bayes rule to give us the prospective probability P{A1|B}. This latter probability could tell us about the value of home environment as a predictor of student drop-out. The examples illustrate the large effect the marginal incidence of drop-outs P{A1} has on the value of B as a predictor--i.e., P{A1|B} increases in the examples and Table below as P{A1} increases from .1 to .4.

(* Bayes Rule *)
P{A1|B} := (P{B|A1}*P{A1})/(P{B|A1}*P{A1} + P{B|A2}*P{A2})
P{A2} := 1 - P{A1}
P{B} := P{B|A1}*P{A1} + P{B|A2}*P{A2}

Example 1. P{B|A1} = .5; P{B|A2} = .2; P{A1} = .1;
[P{A1|B}, P{B}] = [0.217391, 0.23]

Example 2. P{B|A1} = .8; P{B|A2} = .2; P{A1} = .35;
[P{A1|B}, P{B}] = [0.682927, 0.41]

Create Table--Entries are P{A1|B} (top) and P{B} (below), with P{B|A2} = .2 throughout
Table[{P{A1|B}, P{B}}, {P{A1}, .1, .4, .1}, {P{B|A1}, .5, .9, .2}]

                      P{B|A1}
P{A1}      .5           .7           .9

 .1     0.217391     0.28         0.333333
        0.23         0.25         0.27

 .2     0.384615     0.466667     0.529412
        0.26         0.3          0.34

 .3     0.517241     0.6          0.658537
        0.29         0.35         0.41

 .4     0.625        0.7          0.75
        0.32         0.4          0.48

-----------------------------------------------------------------------------

Problem 1: Child Abuse?

Rather than a yes/no vote on the issue, this header leads us to a Bayes Thm. calculation along the same lines as the "At-Risk Students" Example presented in class 10/5 (cf. Course Files listing). This example is taken from the Larson and Marx text used in prior Stat190.

A government task force is considering the feasibility of setting up a national screening program to detect child abuse.
Consultants for the group estimate that:

1. One child in 90 is abused
2. A physician can detect an abused child 90 percent of the time
3. A screening program would incorrectly label 3 percent of all nonabused children as abused

Using this information, calculate the probability that a child is actually abused given that the screening program diagnoses him/her as such. Comment on the usefulness of such a program. Repeat your calculation and comment for the case in which one child in 9 is abused rather than the stated one child in 90.

------------------------

Problem 1: Child Abuse? (Solution)

First, let's define two events and their complements:

A - child abused; notA - child not abused.
B - child labeled as abused; notB - child labeled not abused.

We can now write the information given in the question in formal probability terms:

1. One child in 90 is abused: P(A) = 1/90 ==> P(notA) = 89/90
2. A physician can detect an abused child 90 percent of the time: P(B|A) = .9
3. A screening program would incorrectly label 3 percent of all nonabused children as abused: P(B|notA) = .03

Now we are ready to use Bayes Theorem to get the desired probability, P(A|B).

                    P(B|A)P(A)                     .9 x (1/90)
P(A|B) = ----------------------------- = ---------------------------- =
         P(B|A)P(A) + P(B|notA)P(notA)   .9 x (1/90) + .03 x (89/90)

              .01
       = ------------ = .252
         .01 + .02967

That is, only about one child out of four diagnosed by the screening program as abused is actually abused. The screening program will produce many "false positive" cases.

When we change the prevalence of abuse P(A) from 1/90 to 1/9 (so that P(notA) changes from 89/90 to 8/9) and repeat the same calculation, we get

              .1
P(A|B) = ------------ = .789
         .1 + .02667

i.e., roughly 4 out of every 5 diagnosed children are actually abused. Here the screening program does a much better job of detecting the actually abused children without falsely labeling many others.
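
The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the function name `posterior` and its parameter names are hypothetical, chosen only for illustration. It applies the same Bayes rule formula and recomputes P(notA) as 1 - P(A) each time the prevalence changes, which is easy to forget when repeating the calculation by hand.

```python
def posterior(p_b_given_a, p_b_given_not_a, p_a):
    """Return (P(A|B), P(B)) via Bayes rule.

    Hypothetical helper for illustration only.
    p_b_given_a     -- P(B|A), e.g. the detection rate
    p_b_given_not_a -- P(B|notA), e.g. the false-positive rate
    p_a             -- P(A), the prevalence
    """
    p_not_a = 1.0 - p_a  # the complement must track the prevalence
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a  # total probability
    return p_b_given_a * p_a / p_b, p_b


# Screening problem: P(B|A) = .9, P(B|notA) = .03, two prevalences.
for prevalence in (1 / 90, 1 / 9):
    p_a_given_b, p_b = posterior(0.9, 0.03, prevalence)
    print(f"P(A) = {prevalence:.4f}: P(A|B) = {p_a_given_b:.3f}, P(B) = {p_b:.4f}")
```

The same function applies unchanged to the At-risk Students example by passing P{B|A1}, P{B|A2} and P{A1} in that order.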