Sent to the AP Statistics ListServe.
This is offered in the spirit of sharing.
It represents a summary of an AP Statistics project at Cate School.
It might provide some ideas for AP Statistics teachers.
For those who read through this project, I would appreciate any comments on ways to improve a project like this.
(I would especially appreciate comments on the approach we took in analyzing the data from question #3.)
(Since I planned to make this Word document an e-mail report, the formatting is minimal.)
Sanderson M. Smith
Report follows --->
STATISTICS SURVEY PROJECT, February, 1999
Carried out by 31 students in Advanced Placement Statistics
at Cate School, Carpinteria, California
POPULATION SURVEYED:
Students at Cate School, grades 9-12, exclusive of the 31 students in Advanced Placement Statistics. This population consists
of 218 students. The AP students were able to get responses from 193 of them. (A few did not respond to some of the
questions.) This, we decided, was a good representation of the Cate School student population for the academic year 1998-99,
although the senior class was underrepresented, since 29 of the 31 AP Statistics students are seniors.
PROJECT DESIGN:
By a random process, the population of 193 was split into two roughly equal-sized groups, A and B. Fifteen AP Statistics
students provided four questions to group A, and sixteen AP Statistics students provided four questions to group B. The four
questions can be categorized in the following ways.
1. A realistic question about a contemporary topic.
2. A question about something that doesn't exist.
3. A question to test the "anchoring effect."
4. A question that was intentionally biased.
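The report does not say which random process was used to form groups A and B. The following is only a minimal sketch of one possibility (shuffle the roster, then cut it in half), written in Python. The function name, the ID numbers 1-193, and the seed are illustrative assumptions, not part of the project.

```python
import random

def split_into_two_groups(respondent_ids, seed=None):
    """Randomly split a collection of IDs into two roughly equal groups.

    Hypothetical helper: the report does not describe the actual process.
    """
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)          # random order, every ordering equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Assumed roster: ID numbers 1..193 standing in for the 193 respondents.
group_a, group_b = split_into_two_groups(range(1, 194), seed=1999)
print(len(group_a), len(group_b))
```

A shuffle-and-cut split of 193 people always gives groups of 96 and 97. The groups in this project had 99 and 94 members, so the process actually used evidently differed from this sketch.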
--------------------
Question 1
(This question was identical for both groups).
1. PRESIDENT CLINTON WAS RECENTLY IMPEACHED BY THE U. S. HOUSE OF
REPRESENTATIVES. DO YOU THINK HE SHOULD HAVE RESIGNED BEFORE HIS
U.S. SENATE TRIAL?
YES NO NO OPINION
--------------------
Question 2
(This question was identical for both groups. Note: There is no such thing as the McDonald Brownson Act.)
2. DO YOU SUPPORT THE WELL-PUBLICIZED MCDONALD BROWNSON ENVIRONMENTAL
ACT?
YES NO NOT FAMILIAR WITH IT
--------------------
Question 3
Groups A and B were both asked to estimate the population of Pierre, South Dakota. The AP Students attempted to see if they
could influence the estimate by providing different "anchors." (Anchors were 5,000 and 100,000.)
Group A had this version.
(a) PIERRE, SOUTH DAKOTA, IS ONE OF OUR NATION'S SMALLEST STATE
CAPITAL CITIES. DO YOU THINK THAT THE POPULATION OF PIERRE IS
MORE OR LESS THAN 5,000?
MORE LESS
(b) WHAT IS YOUR ESTIMATE FOR THE POPULATION OF PIERRE,
SOUTH DAKOTA?
Group B had this version.
(a) PIERRE, SOUTH DAKOTA, IS ONE OF OUR NATION'S SMALLEST STATE
CAPITAL CITIES. DO YOU THINK THAT THE POPULATION OF PIERRE IS
MORE OR LESS THAN 100,000?
MORE LESS
(b) WHAT IS YOUR ESTIMATE FOR THE POPULATION OF PIERRE,
SOUTH DAKOTA?
----------------------
Question 4
Groups A and B were both asked the same question about Saddam Hussein. However, each group had a different "introductory"
statement.
Group A had this version. (Mild introductory statement).
IN A RECENT VISIT TO THE UNITED STATES, POPE JOHN PAUL II
STATED CLEARLY THAT HE DID NOT BELIEVE IN THE CONCEPT OF
CAPITAL PUNISHMENT.
DO YOU BELIEVE THAT THE UNITED STATES SHOULD SUPPORT
MEASURES TO ASSASSINATE SADDAM HUSSEIN, THE DICTATOR
OF IRAQ?
Group B had this version. (Strong introductory statement).
IT IS WELL DOCUMENTED THAT SADDAM HUSSEIN IS A BRUTAL DICTATOR WHO
HAS CAUSED CONSIDERABLE SUFFERING FOR THE PEOPLE OF IRAQ.
DO YOU BELIEVE THAT THE UNITED STATES SHOULD SUPPORT
MEASURES TO ASSASSINATE SADDAM HUSSEIN, THE DICTATOR
OF IRAQ?
=========================================
THE RESULTS...
---------------
Question 1
(Should President Clinton have resigned?)
Since both A and B groups were asked identical questions, the results were combined.
                Number    Percent
YES             66        35.11%
NO              93        49.46%
NO OPINION      29        15.43%
Totals          188       100.00%
Discussion: While one student suggested a 95% confidence interval could be constructed for the percent responses, it was
quickly noted that we were actually working with a population (described above), and not a random sample. Confidence
intervals, while they could be mathematically constructed, would have no statistical meaning in this situation. The Cate
student population does not represent a random sample of secondary school students. It was concluded that it would be
reasonable to say that approximately 50% of the Cate School student body thought that President Clinton should not have
resigned, while about 35% thought he should, and about 15% had no opinion on the matter.
---------------
Question 2
(Support the well-publicized McDonald Brownson Environmental Act?) Since both A and B groups responded to identical questions,
the results were combined. Remember, the McDonald Brownson Environmental Act is purely fictional.
                        Number    Percent
YES                     22        11.64%
NO                      16        8.47%
NOT FAMILIAR WITH IT    151       79.89%
Totals                  189       100.00%
Discussion: As with question 1, the construction of confidence intervals would make no statistical sense in this situation
since we were working with a population, and not a random sample. Since Cate is an independent school with academically-able
students, we wondered if students in this population would be hesitant to admit they were unfamiliar with this Act, especially since the question
contained the phrase "well-publicized." We found that approximately 80% admitted to not being familiar with it, and only 12%
said they supported it. It was acknowledged that a NO response (approximately 8%) could include people who said they didn't
support it simply because they didn't know what it was.
---------------
Question 3
This was, by far, the most interesting of the four questions, since it presented some real statistical problems in terms of
analysis. In this sense, it probably had the most educational value for AP students.
Please review the question asked, and realize that the population was randomly divided into two roughly equal groups, group A
and group B. Each group responded to the same question after they were provided an "anchor" number. The results pretty much
speak for themselves.
                        Group A (5,000 anchor)    Group B (100,000 anchor)
Group size              99                        94
Minimum                 490                       1
Q1 (25th percentile)    3,500                     30,000
Median                  7,900                     70,000
Q3 (75th percentile)    15,000                    110,000
Maximum                 1,000,000                 2,000,000
Mean                    36,199                    104,453
St. Deviation (s)       141,374                   219,129
Largest Estimates       100,000                   300,000
                        100,000                   500,000
                        100,000                   2,000,000
                        100,000
                        1,000,000
Discussion: The class knew that the two samples came from the same population. Our null hypothesis and alternate hypothesis
were defined as follows:
Ho: Group A and Group B came from the same population.
[mean(A) = mean(B)]
Ha: Group A and Group B did not come from the same population.
[mean(A) ≠ mean(B)]
Simply looking at the data suggests that Ho would be rejected at all "meaningful" levels of significance. We constructed
box-whisker plots to display the differences between A and B. Due to large estimates in both groups, it was difficult to
produce a scaled plot that clearly displayed the interquartile ranges (the "boxes"), but simply looking at the bottom 75% for
each group tells the story. Among other things, the 75th percentile for Group A (15,000) is 15,000 below the 25th percentile for
Group B (30,000). Construct the plots, and you will see the "obvious." It was speculated these plots would convince statisticians at all
levels that the "anchors" clearly influenced the responses.
Discussion led to the conclusion that the single most useful statistic would be the median, since the median is not influenced
by extreme scores. In this respect, the interquartile ranges certainly suggest a vast difference in the responses for the two
groups.
                            25th percentile     Median      75th percentile
Group A (5,000 anchor)      3,500               7,900       15,000
Group B (100,000 anchor)    30,000              70,000      110,000
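As a small illustration of the quartile comparison above, the sketch below (Python, standard library only) computes each group's interquartile range from the table and checks whether the two "boxes" overlap. The variable and function names are illustrative; only the quartile figures come from the survey.

```python
# Quartiles (Q1, median, Q3) taken from the table above.
quartiles = {
    "Group A (5,000 anchor)": (3500, 7900, 15000),
    "Group B (100,000 anchor)": (30000, 70000, 110000),
}

def iqr(q1, q3):
    """Interquartile range: the width of the box in a box-whisker plot."""
    return q3 - q1

def boxes_overlap(box1, box2):
    """True if the Q1..Q3 intervals of the two groups share any value."""
    (lo1, _, hi1), (lo2, _, hi2) = box1, box2
    return lo1 <= hi2 and lo2 <= hi1

for name, (q1, med, q3) in quartiles.items():
    print(f"{name}: IQR = {iqr(q1, q3):,}, median = {med:,}")

box_a = quartiles["Group A (5,000 anchor)"]
box_b = quartiles["Group B (100,000 anchor)"]
gap = box_b[0] - box_a[2]   # Q1 of B minus Q3 of A
print("Boxes overlap:", boxes_overlap(box_a, box_b))
print(f"Gap between the boxes: {gap:,}")
```

The boxes do not overlap: the middle 50% of Group A's estimates lies entirely below the middle 50% of Group B's, with a gap of 15,000 between them.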
We reasoned that the population we surveyed would, for the most part, not be familiar with population figures for South Dakota.
Had we asked the question about a California city, the results would probably not be as varied, even with the anchors. Our
overall conclusion was that it is probably relatively easy to influence an estimate if an individual is not familiar with what
is being asked about, and, if an anchor is provided.
We wanted to run a test using techniques learned in AP Statistics. We decided to run a test on the difference of the means.
OK, we could blindly apply the appropriate formulas, but, if one reads the requirements for this test, the data presents
problems. Among other things, the means are greatly influenced by extremely large estimates in both groups, A and B. These
few large values greatly distort the means. Both data sets are heavily skewed to the right, making a difference of means test
relatively meaningless.
So, we decided to remove 5 extreme values (shown above) from A, and 3 extreme values from B. Note how these values affect the
statistics, particularly the means.
[Before= Original Data; After = Data with extreme values removed.]
Even after removing the large values, we still had data sets that were skewed right, but we decided to run a test on the
difference of means.
                    Group A (5,000 anchor)      Group B (100,000 anchor)
                    Before        After         Before        After
Group Size          99            94            94            91
Minimum             490           490           1             1
25th Percentile     3,500         3,500         30,000        25,000
Median              7,900         7,321         70,000        69,500
75th Percentile     15,000        11,500        110,000       107,500
Maximum             1,000,000     86,000        2,000,000     200,000
Mean                36,199        11,031        104,453       74,934
St. Dev. (s)        141,374       14,232        219,129       56,878
For the After data, we have mean(A) = 11,031, st.dev(A) = 14,232, mean(B) = 74,934, and st.dev(B) = 56,878.
Standard error (SE) = sqrt[14232^2/94 + 56878^2/91] = 6140.47.
t = (74934-11031)/6140.47 = 10.41.
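Degrees of freedom = 91-1 = 90 (the conservative choice: the smaller group size minus 1).
Using the TI-83, the one-tail p-value is tcdf(10.41,1E99,90) = 2.017E-17. This, not surprisingly, is very close to 0%, and doubling it for the two-sided Ha does not change that.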
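For readers without a TI-83, the calculation above can be checked with the rough Python sketch below, using only the standard library. It works from the "After" summary statistics, not the raw data. The helper names are illustrative, and the tail probability is approximated by Simpson's rule over a finite interval. That is an assumption that works for a case like this one (90 degrees of freedom) and is not a general-purpose replacement for tcdf.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, width=100.0, steps=20000):
    """Approximate P(T > t) by Simpson's rule on [t, t + width].

    steps must be even; the area beyond t + width is ignored (assumed tiny).
    """
    h = width / steps
    total = t_pdf(t, df) + t_pdf(t + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

# "After" summary statistics from the table above.
mean_a, sd_a, n_a = 11031, 14232, 94
mean_b, sd_b, n_b = 74934, 56878, 91

se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)   # unpooled standard error
t_stat = (mean_b - mean_a) / se
df = min(n_a, n_b) - 1                          # conservative df, as in the text
p_upper = t_upper_tail(t_stat, df)

print(f"SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
print(f"one-tail p-value = {p_upper:.3e}")
```

The sketch reproduces SE = 6140.47 and t = 10.41, and its one-tail p-value is on the order of 1E-17, in line with the TI-83 result.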
The null hypothesis, Ho, would be rejected at all meaningful levels of significance. Statistically, we concluded that Group A
and Group B came from different populations. The results certainly suggest that the provided "anchors" influenced responses to
our request for an estimate of the population of Pierre, South Dakota.
While the difference of means test is statistically sophisticated, the class agreed that a look at the box-whisker displays
probably represents the most powerful down-to-earth "proof" that the anchors influenced the responses.
---------------
Question 4
(Do you support measures to assassinate Saddam Hussein?)
                Group A (mild intro.)     Group B (strong intro.)
YES             49                        56
NO              24                        24
NO OPINION      26                        14
Totals          99                        94
Discussion: We wanted to determine if the introductory statement prior to the actual question influenced the YES responses.
The class decided to run a statistical test on the difference of proportions. The YES proportion calculations are as follows:
p-hat(A) = 49/99 = 0.4949, p-hat(B) = 56/94 = .5957.
Our null and alternate hypotheses were formulated:
Ho: Group A and Group B came from the same population. [p(A) = p(B)]
Ha: Group A and Group B did not come from the same population.
[p(A) ≠ p(B)]
Since we know the two groups came from the same population, we pooled the data to obtain
p-hat = (49+56)/(99+94) = 105/193 = .5440.
Calculating the standard error, we obtained
SE = sqrt[(.5440)(1-.5440)(1/99 + 1/94)] = .071726.
Hence, the z-statistic is z = (.5957 - .4949)/.071726 = 1.405.
Using the TI-83, the p-value is normalcdf(1.405,1E99,0,1) = .08.
Common levels of significance are 5% and 1%. Ho would not be rejected at these levels (both 1-tail and 2-tail tests). At
these levels of significance, the results are not significant. There is not strong evidence to suggest that responses were
affected by the different introductory statements.
It was mentioned that if we took a 1-tail approach and defined
Ho: p(B) = p(A)
Ha: p(B) > p(A)
we would reject Ho at the 10% level of significance.
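The difference-of-proportions calculation above can be re-run with the short Python sketch below (standard library only). The variable names are illustrative; the counts are the YES responses and group sizes from the table. The normal tail area is computed with math.erfc, which stands in for the TI-83's normalcdf.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# YES counts and group sizes from the Question 4 table.
yes_a, n_a = 49, 99   # Group A, mild introductory statement
yes_b, n_b = 56, 94   # Group B, strong introductory statement

p_a = yes_a / n_a
p_b = yes_b / n_b
p_pool = (yes_a + yes_b) / (n_a + n_b)          # pooled proportion under Ho

se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

p_one_tail = normal_upper_tail(z)   # Ha: p(B) > p(A)
p_two_tail = 2 * p_one_tail         # Ha: p(A) ≠ p(B)

print(f"p-hat(A) = {p_a:.4f}, p-hat(B) = {p_b:.4f}, pooled = {p_pool:.4f}")
print(f"SE = {se:.6f}, z = {z:.3f}")
print(f"one-tail p = {p_one_tail:.3f}, two-tail p = {p_two_tail:.3f}")
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: reject Ho (1-tail)? {p_one_tail < alpha}; "
          f"(2-tail)? {p_two_tail < alpha}")
```

Only the 1-tail test at the 10% level rejects Ho, which matches the discussion above.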
Summary: There is perhaps some evidence to suggest that the strong introductory statement caused more students to say YES, but
the statistical evidence is not overwhelming. Among other things, this project demonstrated that the answer to the question
"Are the results significant?" clearly depends upon how "significance" is defined.
==============================================
To those who have read this far...
Comments, suggestions, criticisms, etc. are welcome.
Individual students in my AP Statistics classes are writing individual reports for this project.
Sanderson M. Smith