Inferences About Means
Chapter 23:
Inferences About Means
Confidence Intervals & Hypotheses
About Means
To create confidence intervals and test
hypotheses about means:
Base both on the sampling model
CLT tells us that the sampling model is
Normal
Standard Error is just the estimated
standard deviation of the sampling model
Gosset’s t
Gosset worked as a quality control engineer.
He noticed that with small sample sizes, his tests for quality weren’t quite right.
When he used the estimated standard error, the shape of the sampling model changed; he called the new model a t-distribution.
Student’s t-models form a whole family of related distributions that depend on a parameter known as degrees of freedom (df).
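To make the role of degrees of freedom concrete, here is a minimal sketch that evaluates the Student’s t density directly from its standard formula and compares it with the Normal density in the tail. The helper name t_pdf is an assumption for illustration; only the Python standard library is used.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

normal_at_3 = NormalDist().pdf(3.0)
for df in (2, 5, 22, 1000):
    # Small df puts more density far from the center than the Normal does.
    print(f"df={df:>4}: t density at 3 = {t_pdf(3.0, df):.5f} (Normal: {normal_at_3:.5f})")
```

With few degrees of freedom the tails are noticeably heavier than the Normal model; as df grows the two densities become nearly indistinguishable.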
A Sampling Distribution for Means
When the conditions are met, the standardized sample mean,
$t = \dfrac{\bar{y} - \mu}{SE(\bar{y})}$,
follows a Student’s t-model with n – 1 degrees of freedom.
We estimate the standard error with
$SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$.
Gosset’s Model
When Gosset corrected the model for
the extra uncertainty, the margin of error
got bigger.
When you use Gosset’s model instead
of the Normal model, your confidence
interval will be slightly wider and your
P-values slightly larger.
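As a rough illustration of how much wider the interval gets, the sketch below compares the Normal and t critical values for 90% confidence. The value 1.717 for 22 degrees of freedom is the one used in this deck’s Triphammer Road example; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

# 90% two-sided confidence leaves 5% in each tail.
z_star = NormalDist().inv_cdf(0.95)

# t critical value for 22 degrees of freedom, as quoted in the slides.
t_star_22 = 1.717

print(f"z* = {z_star:.3f}, t*_22 = {t_star_22:.3f}")
print(f"the t-interval is wider by a factor of {t_star_22 / z_star:.3f}")
```

Because the margin of error is the critical value times the standard error, the ratio of critical values is also the ratio of interval widths.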
“To t or not to t?”
If you know $\sigma$, use z (very rare!).
Whenever you use s to estimate $\sigma$, use t.
Student’s t-models are unimodal, symmetric and bell shaped.
Assumptions and Conditions
Independence Assumption
Randomization condition
10% condition
Normal Population Assumption
Nearly Normal condition
The data come from a distribution that is unimodal and symmetric.
Check by making a histogram or Normal probability plot.
One-sample t-interval
When the conditions are met, find the confidence interval for the population mean, $\mu$.
Since the standard error of the mean is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$, the interval is
$\bar{y} \pm t^*_{n-1} \times SE(\bar{y})$.
The critical value $t^*_{n-1}$ depends on the particular confidence level, C, that you specify and on the number of degrees of freedom, n – 1, which we get from the sample size.
A One-Sample t-Interval for the Mean
Identify the parameter:
Find a 90% confidence interval for the mean
speed of vehicles driving on Triphammer Road.
Look at the data:
Enter data into L1.
A One-Sample t-Interval for the Mean
Check the conditions:
Randomization: we have a convenience sample,
but we have reason to believe it is
representative.
10%: the cars observed were fewer than 10% of
all cars traveling on Triphammer Road.
Nearly Normal Condition: The histogram is
unimodal and symmetric.
A One-Sample t-Interval for the Mean
State the sampling distribution model for the statistic.
Under these conditions the sampling distribution of the mean can be modeled by Student’s t-model with 22 degrees of freedom:
$n - 1 = 23 - 1 = 22$
Choose your method.
We will use a one-sample t-interval for the mean.
A One-Sample t-Interval for the Mean
Construct the confidence interval
We know: $n = 23$ cars, $\bar{y} = 31.0$ mph, $s = 4.25$ mph
Under STAT TESTS choose TInterval
Margin of Error:
$ME = t^*_{22} \times SE(\bar{y}) = 1.717 \times 0.886 = 1.521$ mph
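The margin-of-error arithmetic can be checked with a few lines of Python. This is a minimal sketch that uses the summary statistics from the slide and takes the critical value 1.717 as given rather than computing it; the variable names are assumptions for illustration.

```python
import math

# Summary statistics from the Triphammer Road sample.
n, ybar, s = 23, 31.0, 4.25

# Critical value t*_22 for 90% confidence, taken from the slide.
t_star = 1.717

se = s / math.sqrt(n)          # standard error of the mean
me = t_star * se               # margin of error
lower, upper = ybar - me, ybar + me

print(f"SE = {se:.3f} mph, ME = {me:.3f} mph")
print(f"90% interval: ({lower:.1f}, {upper:.1f}) mph")
```

The result agrees with the interval of 29.5 to 32.5 mph quoted in the interpretation.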
A One-Sample t-Interval for the Mean
Interpretation:
We are 90% confident that the true mean speed of all vehicles on Triphammer Road is between 29.5 and 32.5 miles per hour.
Caution: this was not a random sample of vehicles. It was a convenience sample taken at one time of the day. The drivers could possibly have seen the police device and may have slowed down. We are reluctant to extend our inference to other situations.
A One-Sample t-Test for the Mean
State the hypotheses:
We want to know whether the mean speed of
vehicles on Triphammer Road exceeds the
posted speed limit of 30 mph.
State the null and alternative hypotheses:
$H_0$: Mean speed, $\mu = 30$ mph
$H_A$: Mean speed, $\mu > 30$ mph
A One-Sample t-Test for the Mean
[Figure: histogram of the observed speeds]
Check the conditions:
Randomization: we have a
convenience sample, but we
have reason to believe it is
representative.
10%: the cars observed were
fewer than 10% of all cars
traveling on Triphammer Road.
Nearly Normal Condition: The
histogram is unimodal and
symmetric.
A One-Sample t-Test for the Mean
State the sampling distribution model for the
statistic.
Under these conditions the sampling distribution
of the mean can be modeled by Student’s t-model
with 22 degrees of freedom:
$n - 1 = 23 - 1 = 22$
Choose your method.
We will use a one-sample t-test for the mean.
A One-Sample t-Test for the Mean
STAT TESTS T-Test
Calculate:
STAT TESTS T-Test
Draw:
A One-Sample t-Test for the Mean
Conclusion:
Link the P-value to your decision about the null
hypothesis and state your conclusion in context.
The P-value of 0.126 says that if the true mean speed of
vehicles on Triphammer Road were 30 mph, samples of
23 vehicles could be expected to have an observed mean of
at least 31.0 mph 12.6% of the time. That P-value is not
small enough for us to reject the hypothesis that the true
mean is 30 mph at any reasonable alpha level. We conclude that
there is not enough evidence to say that the average
speed is too high.
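For readers without the calculator, the test statistic and one-sided P-value can be sketched in plain Python by integrating the t density numerically. The helpers t_pdf and t_upper_tail are assumptions for illustration, not a library API. The sketch uses the rounded summary statistics from the slides, so its P-value comes out somewhat above the 0.126 reported by the calculator, which presumably worked from the raw data.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom, evaluated at x."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half the probability lies above 0; subtract the area on [0, t].
    return 0.5 - total * h / 3

# Rounded summary statistics from the slides and the null value of the mean.
n, ybar, s, mu0 = 23, 31.0, 4.25, 30.0

t_stat = (ybar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)

print(f"t = {t_stat:.3f} with {n - 1} df, one-sided P-value = {p_value:.3f}")
```

Either way the P-value is well above common alpha levels such as 0.05, so the conclusion on the slide is unchanged.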
Intervals and Tests
Confidence intervals and significance tests are built from the same calculations.
The confidence interval contains all the null hypothesis values you can’t reject.
A level C confidence interval contains all of the possible null hypothesis values that would be retained by a two-sided hypothesis test at level 1 – C.
When the hypothesis is one-sided, the corresponding level is (1 – C)/2.
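The link between the two procedures can be checked on the Triphammer Road numbers. In this minimal sketch (variable names are assumptions), the 90% interval corresponds to a one-sided test at level (1 – 0.90)/2 = 0.05, and both use the same critical value 1.717 from the slides.

```python
import math

n, ybar, s = 23, 31.0, 4.25
mu0 = 30.0          # null hypothesis value (posted speed limit)
t_star = 1.717      # t*_22: 90% two-sided interval, 5% one-sided test

se = s / math.sqrt(n)
lower = ybar - t_star * se

# One-sided test of H0: mu = 30 against HA: mu > 30 at alpha = 0.05.
t_stat = (ybar - mu0) / se
reject = t_stat > t_star

# The test rejects exactly when mu0 falls below the interval's lower bound.
mu0_below_interval = mu0 < lower

print(f"lower bound = {lower:.2f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Here 30 mph lies inside the interval, so the test retains the null hypothesis, which matches the conclusion reached from the P-value.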
Sample Size
Before collecting data, it is a good idea to know whether the
sample size is large enough to give you a good chance of
being able to tell you what you want to know.
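An example: the movie download (p. 456)
$ME = 8$ min, $SD = 10$ min, 95% confidence interval
$8 = 1.96 \times \dfrac{10}{\sqrt{n}}$, so $\sqrt{n} = \dfrac{1.96 \times 10}{8} = 2.45$ and $n = 6.0025$
Use 6 – 1 = 5 degrees of freedom to substitute an appropriate $t^*$ value.
$8 = 2.571 \times \dfrac{10}{\sqrt{n}}$, so $\sqrt{n} = \dfrac{2.571 \times 10}{8} = 3.214$ and $n = 10.33$
Round up, so $n = 11$ movies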
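The two-step sample size calculation can be sketched as follows. The function name required_n is an assumption for illustration; the critical values 1.96 and 2.571 are the ones used on the slide.

```python
import math

def required_n(me, sd, crit):
    """Solve ME = crit * SD / sqrt(n) for n (not yet rounded)."""
    return (crit * sd / me) ** 2

me, sd = 8.0, 10.0                     # target margin of error and SD, in minutes

# Step 1: first guess using the Normal critical value for 95% confidence.
n_first = required_n(me, sd, 1.96)

# Step 2: redo the calculation with t* for the degrees of freedom implied by step 1.
df = round(n_first) - 1                # 6 - 1 = 5
t_star_5 = 2.571                       # t*_5 for 95% confidence, from the slide
n_second = required_n(me, sd, t_star_5)

n_final = math.ceil(n_second)          # always round up
print(f"first guess n = {n_first:.4f}, df = {df}, revised n = {n_second:.2f}, use n = {n_final}")
```

The first pass only supplies a plausible degrees-of-freedom value; the larger t critical value then pushes the required sample size up from about 6 to 11.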
CAUTION!!
Beware multimodality.
Look for the possibility that the data come from two groups.
If so, separate the groups and analyze each group separately.
Beware skewed data.
Set outliers aside.
Report on these values separately.
Conduct an analysis of non-outlying points, along with a separate
discussion of outliers.
Watch out for bias.
Think about possible sources of bias in your measurements.
Make sure data are independent.
Make sure that data are from an appropriately randomized sample.












