1
Inference about a population proportion
  • BPS chapter 20
2
Objectives (BPS chapter 20)
  • Inference for a population proportion


  • The sample proportion
  • The sampling distribution of p̂
  • Large sample confidence interval for p
  • Accurate confidence intervals for p
  • Choosing the sample size
  • Significance tests for a proportion
3
The two types of data — reminder
  • Quantitative
    • Something that can be counted or measured and then added, subtracted, averaged, etc., across individuals in the population.
    • Example: How tall you are, your age, your blood cholesterol level


  • Categorical
    • Something that falls into one of several categories. What can be counted is the proportion of individuals in each category.
    • Example: Your blood type (A, B, AB, O), your hair color, your family health history for genetic diseases, whether you will develop lung cancer
4
The sample proportion
  • We now study categorical data and draw inference on the proportion, or percentage, of the population with a specific characteristic.


  • If we call a given categorical characteristic in the population “success,” then the sample proportion of successes, p̂, is: p̂ = (count of successes in the sample) / (count of observations in the sample) = X / n
5
Sampling distribution of p̂
  • The sampling distribution of p̂ is never exactly normal. But as the sample size increases, the sampling distribution of p̂ becomes approximately normal.
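The approach to normality can be checked by simulation. The sketch below is only an illustration: the values p = 0.3, n = 100 and 2000 repetitions are hypothetical choices, and the helper name `simulate_sample_proportions` is not from the slides. It draws many samples, computes p̂ for each, and compares the share of p̂ values within two standard deviations of their mean with the roughly 95% expected for a normal distribution.

```python
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(num_samples):
        # Each individual is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Hypothetical settings, chosen only for illustration.
p_hats = simulate_sample_proportions(p=0.3, n=100, num_samples=2000)
center = statistics.mean(p_hats)
spread = statistics.pstdev(p_hats)

# For a normal distribution, about 95% of values fall within 2 sd of the mean.
within_two_sd = sum(abs(x - center) <= 2 * spread for x in p_hats) / len(p_hats)
print(f"mean={center:.3f}  sd={spread:.3f}  share within 2 sd={within_two_sd:.3f}")
```

Rerunning with a small n and p close to 0 or 1 should show a visibly skewed distribution, which is why the inference procedures put conditions on the sample size.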
6
Implication for estimating proportions
  • The mean and standard deviation (width) of the sampling distribution are both completely determined by p and n.
  • Thus, we have only one population parameter to estimate, p.
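For reference, the standard result is that p̂ has mean p and standard deviation sqrt(p(1 − p)/n). A minimal sketch, with a hypothetical function name and made-up inputs:

```python
import math


def sampling_distribution_params(p, n):
    """Return (mean, standard deviation) of the sampling distribution of p-hat."""
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    return mean, sd


# Hypothetical example: p = 0.5 and a sample of n = 100.
mean, sd = sampling_distribution_params(0.5, 100)
print(mean, sd)
```

Because n appears under the square root, quadrupling the sample size only halves the standard deviation.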
7
Conditions for inference on p
  • Assumptions:
  • We regard our data as a simple random sample (SRS) from the population.  That is, as usual, the most important condition.
  • The sample size n is large enough that the sampling distribution is approximately normal.
  • How large a sample size is enough?  Different inference procedures require different answers (we’ll see what to do practically).
8
Large-sample confidence interval for p
  • Use this method when the number of successes and the number of failures are both at least 15.
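The large-sample interval has the usual form p̂ ± z* · sqrt(p̂(1 − p̂)/n), where z* is the standard normal critical value for the chosen confidence level. The sketch below assumes that form. The function name and the counts in the example are hypothetical and are not taken from the slides.

```python
import math
from statistics import NormalDist


def large_sample_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion p."""
    failures = n - successes
    # Condition stated on the slide: at least 15 successes and 15 failures.
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    # Critical value z* leaving (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Made-up data: 40 successes in a sample of 100.
low, high = large_sample_ci(40, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```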
9
Medication side effects
12
“Plus four” confidence interval for p
  • A simple adjustment produces more accurate confidence intervals. We act as if we had four additional observations, two being successes and two being failures. Thus, the new sample size is n + 4 and the count of successes is X + 2.
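As a sketch of the adjustment just described, the interval is computed as in the large-sample case but with X + 2 successes out of n + 4 observations. The function name and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def plus_four_ci(successes, n, confidence=0.95):
    """Plus-four confidence interval: add 2 successes and 2 failures first."""
    n_adj = n + 4
    p_tilde = (successes + 2) / n_adj  # plus-four estimate of p
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / n_adj)
    return p_tilde - margin, p_tilde + margin


# Made-up data: 3 successes in a small sample of 16.
low, high = plus_four_ci(3, 16)
print(f"95% plus-four CI: ({low:.3f}, {high:.3f})")
```

With these counts the large-sample method would not apply (fewer than 15 successes), which is the situation the plus-four interval is meant to handle.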
14
Choosing the sample size
  • You may need to choose a sample size large enough to achieve a specified margin of error. However, because the sampling distribution of p̂ depends on the population proportion p, this process requires that you guess a likely value for p, written p*.
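Solving the margin-of-error expression m = z* · sqrt(p*(1 − p*)/n) for n gives n = (z*/m)² · p*(1 − p*), rounded up to the next whole number. The sketch below assumes that formula. The function name and the target margin are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, p_guess=0.5, confidence=0.95):
    """Smallest n whose margin of error is at most `margin`, given a guess p*."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)  # always round up so the margin is not exceeded


# Hypothetical target: margin of error 0.03 at 95% confidence.
n_conservative = required_sample_size(0.03)            # p* = 0.5
n_informed = required_sample_size(0.03, p_guess=0.2)   # a guess further from 0.5
print(n_conservative, n_informed)
```

Using p* = 0.5 is the conservative choice, since p*(1 − p*) is largest there. Any other guess yields a smaller required n.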
16
Significance test for p
  • The sampling distribution of p̂ is approximately normal for large sample sizes, and its shape depends solely on p and n.
  • Thus, we can easily test the null hypothesis:
  • H0: p = p0 (a given value we are testing)
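A minimal sketch of the test, assuming the standard one-sample z statistic z = (p̂ − p0) / sqrt(p0(1 − p0)/n), in which the standard deviation is computed under H0. The function name, the `alternative` labels and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """z statistic and P-value for H0: p = p0."""
    p_hat = successes / n
    # Standard deviation of p-hat if H0 is true (uses p0, not p-hat).
    sd_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd_null
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value


# Made-up data: 60 successes in 100 trials, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.4f}")
```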
17
P-values and one- or two-sided hypotheses — reminder
20
Interpretation: magnitude versus reliability of effects
  • The reliability of an interpretation is related to the strength of the evidence. The smaller the P-value, the stronger the evidence against the null hypothesis and the more confident you can be about your interpretation.
  • The magnitude or size of an effect relates to the real-life relevance of the phenomenon uncovered. The P-value does NOT assess the relevance of the effect, nor its magnitude.
  • A confidence interval will assess the magnitude of the effect. However, magnitude is not necessarily equivalent to how theoretically or practically relevant an effect is.