|
2
|
- Inference for a population proportion
- The sample proportion
- The sampling distribution of p̂
- Large sample confidence interval for p
- Accurate confidence intervals for p
- Choosing the sample size
- Significance tests for a proportion
|
|
3
|
- Quantitative
- Something that can be counted or measured and then added, subtracted,
averaged, etc., across individuals in the population.
- Example: How tall you are, your age, your blood cholesterol level
- Categorical
- Something that falls into one of several categories. What can be
counted is the proportion of individuals in each category.
- Example: Your blood type (A, B, AB, O), your hair color, your family
health history for genetic diseases, whether you will develop lung
cancer
|
|
4
|
- We now study categorical data and draw inference on the proportion, or
percentage, of the population with a specific characteristic.
- If we call a given categorical characteristic in the population “success,”
then the sample proportion of successes, p̂, is:
p̂ = (number of successes in the sample) / (number of observations in the sample) = X / n
|
|
5
|
- The sampling distribution of p̂ is never exactly normal. But as the
sample size increases, the sampling distribution of p̂ becomes
approximately normal.
|
|
6
|
- The mean and standard deviation (width) of the sampling distribution of p̂
are both completely determined by p and n: the mean is p and the standard
deviation is sqrt(p(1 − p)/n).
- Thus, we have only one population parameter to estimate, p.
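The dependence on p and n can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the values p = 0.3 and n = 200, the number of repetitions, and the function name `simulate_sample_proportions` are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` simple random samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Illustrative (assumed) population proportion and sample size.
p, n = 0.3, 200
p_hats = simulate_sample_proportions(p, n, reps=5000)

# Theory: mean of p-hat is p, standard deviation is sqrt(p(1 - p)/n).
theoretical_sd = math.sqrt(p * (1 - p) / n)
simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.stdev(p_hats)

print(f"mean of p-hat: simulated {simulated_mean:.4f}, theory {p:.4f}")
print(f"sd of p-hat:   simulated {simulated_sd:.4f}, theory {theoretical_sd:.4f}")
```

With this many repetitions the simulated mean and standard deviation should land close to the theoretical values, and a histogram of `p_hats` would look roughly bell-shaped.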
|
|
7
|
- Assumptions:
- We regard our data as a simple random sample (SRS) from the
population. This is, as usual, the most important condition.
- The sample size n is large enough that the sampling distribution is
approximately normal.
- How large a sample size is enough?
Different inference procedures require different answers (we’ll
see what to do practically).
|
|
8
|
- Use the large-sample confidence interval for p only when the number of
successes and the number of failures in the sample are both at least 15.
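The slide's own formula did not survive extraction, so the sketch below assumes the standard textbook form of the large-sample interval, p̂ ± z* sqrt(p̂(1 − p̂)/n). The function name `large_sample_ci` and the example counts (120 successes in 400 observations) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def large_sample_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for p (assumed standard form)."""
    failures = n - successes
    # Condition from the slide: at least 15 successes and 15 failures.
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    # z* is the standard normal critical value for the chosen confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 120 successes in a sample of 400.
low, high = large_sample_ci(successes=120, n=400)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

Raising an error when the count condition fails is a design choice of this sketch; it keeps the interval from being reported in cases where the normal approximation is doubtful.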
|
|
12
|
- A simple adjustment produces more accurate confidence intervals. We act
as if we had four additional observations, two being successes and two
being failures. Thus, the new sample size is n + 4 and the count of
successes is X + 2.
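The adjustment above can be sketched as follows. The interval formula applied to the adjusted counts is assumed to be the same normal-based form as the large-sample interval; the function name `plus_four_ci` and the example data (3 successes in 20 observations) are hypothetical.

```python
import math
from statistics import NormalDist

def plus_four_ci(successes, n, confidence=0.95):
    """Plus-four interval: add two successes and two failures, then proceed as usual."""
    n_tilde = n + 4                      # new sample size, n + 4
    p_tilde = (successes + 2) / n_tilde  # adjusted proportion, (X + 2) / (n + 4)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - margin, p_tilde + margin

# Hypothetical small sample: 3 successes out of 20, too few for the large-sample method.
low, high = plus_four_ci(successes=3, n=20)
print(f"plus-four 95% CI for p: ({low:.4f}, {high:.4f})")
```

Note that the interval is centered on (X + 2)/(n + 4) = 5/24, not on the raw sample proportion 3/20.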
|
|
14
|
- You may need to choose a sample size large enough to achieve a specified
margin of error. However, because the sampling distribution of p̂ is a function of the population
proportion p, this process requires that you guess a likely value for p: p*.
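As a rough sketch of this calculation, solving the margin of error m = z* sqrt(p*(1 − p*)/n) for n gives n = (z*/m)² p*(1 − p*), rounded up. This formula is the standard textbook result and is assumed here because the slide's own formula was lost. The function name `required_sample_size` and the example margins are illustrative; the default guess p* = 0.5 is a common conservative choice, since it maximizes p*(1 − p*).

```python
import math
from statistics import NormalDist

def required_sample_size(margin, p_guess=0.5, confidence=0.95):
    """Smallest n whose margin of error is at most `margin`, given a guess p* for p."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)  # always round up to a whole observation

# Hypothetical target: margin of error 0.03 at 95% confidence.
n_conservative = required_sample_size(margin=0.03)            # guess p* = 0.5
n_informed = required_sample_size(margin=0.03, p_guess=0.3)   # guess p* = 0.3
print(n_conservative, n_informed)
```

A guess further from 0.5 yields a smaller required sample, but only if the guess turns out to be close to the true p.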
|
|
16
|
- The sampling distribution of p̂ is approximately normal for large
sample sizes, and its shape depends solely on p and n.
- Thus, we can easily test the null hypothesis:
- H0: p = p0 (a given value we are testing)
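A minimal sketch of such a test, assuming the standard z statistic z = (p̂ − p0) / sqrt(p0(1 − p0)/n), which uses p0 rather than p̂ in the standard deviation because the calculation is done under H0. The function name `one_proportion_z_test`, the `alternative` labels, and the example data are hypothetical.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """z test of H0: p = p0; returns the z statistic and the P-value."""
    p_hat = successes / n
    # Standard deviation of p-hat computed under the null hypothesis.
    sd_under_h0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd_under_h0
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical data: 60 successes in 100 observations, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.4f}")
```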
|
|
20
|
- The reliability of an interpretation is related to the strength of the
evidence. The smaller the P-value, the stronger the evidence against the
null hypothesis and the more confident you can be about your
interpretation.
- The magnitude or size of an effect relates to the real-life relevance of
the phenomenon uncovered. The P-value does NOT assess the relevance of
the effect, nor its magnitude.
- A confidence interval will assess the magnitude of the effect. However,
magnitude is not necessarily equivalent to how theoretically or
practically relevant an effect is.
|