Activity 1-6: A Nurse Accused a. The observational units are the eight-hour shifts. b. One variable is whether Gilbert worked on the shift. This variable is categorical and binary. The other variable is whether a patient died on the shift. This variable is also categorical and binary.
•••
Homework Activities Activity 1-7: Miscellany a. Variable: binary categorical; observational units: pennies being spun b. Variable: binary categorical; observational units: people leaving the washroom c. Variable: quantitative; observational units: fast-food sandwiches d. Variable: quantitative; observational units: residents of that country e. Variable: binary categorical; observational units: American households f. Variable: quantitative (though age might make more sense to interpret); observational units: colleges g. Variable: quantitative; observational units: colleges h. Variable: categorical; observational units: American voters in 2004 i. Variable: binary categorical; observational units: newborn babies j. Variable: quantitative; observational units: Alfred Hitchcock movies k. Variable: quantitative; observational units: American pennies l. Variable: quantitative; observational units: automobiles m. Variable: binary categorical; observational units: automobiles
n. Variable: categorical; observational units: automobiles o. Variable: binary categorical; observational units: applicants for graduate school p. Variable: quantitative; observational units: college students q. Variable: categorical; observational units: people r. Variable: binary categorical; observational units: college students s. Variable: binary categorical; observational units: participants in sport t. Variable: quantitative; observational units: sport participants u. Variable: quantitative; observational units: states v. Variable: quantitative; observational units: bartenders (or glasses if just one bartender) w. Variable: quantitative; observational units: people x. Variable: quantitative; observational units: brides y. Variable: categorical; observational units: brides z. Variable: quantitative; observational units: couples getting married
Topic 1: Data and Variables
Activity 1-8: Top 100 Films a. Quantitative b. Quantitative (though age might be easier to interpret) c. Categorical d. Binary categorical e. Binary categorical f. Binary categorical g. Quantitative (notice how this quantity will vary from movie to movie)
Activity 1-9: Credit Card Usage a.
• Year in school: categorical
• Whether the student has a credit card: binary categorical
• Outstanding balance on the credit card: quantitative
• Whether the outstanding balance exceeds $1000: binary categorical
• Source for selecting a credit card: categorical
• Region of the country: categorical
b. Answers will vary. Examples include these: 1. Which class (freshman, sophomore, . . .) tends to have the largest outstanding credit card balance? 2. Do all regions of the country tend to obtain their credit cards from the same source?
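The credit card variables in part a show how a binary categorical variable can be derived from a quantitative one. The following is a minimal Python sketch of that idea; the balances and the names `balances` and `exceeds_1000` are made up for illustration and are not data from the activity.

```python
# Hypothetical outstanding balances in dollars (illustrative values only).
balances = [0.0, 250.0, 1000.0, 1450.75, 3200.0]

# Quantitative variable -> binary categorical variable:
# does the outstanding balance exceed $1000?
exceeds_1000 = [balance > 1000 for balance in balances]

for balance, flag in zip(balances, exceeds_1000):
    print(f"${balance:>8.2f}  exceeds $1000: {flag}")
```

Note that a balance of exactly $1000 does not exceed $1000, so it falls in the "no" category.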
Activity 1-10: Got a Tip? a. Answers will vary. Examples include these:
• The number of customers at each table
• The amount spent on food and drink
• Whether there were children at the table
• Whether a man or woman paid the bill
b. Answers will vary. Examples include these: • Which tends to have more influence on the tip—the size of the bill or the number of people in the party? • Do males tend to be better tippers than females?
Activity 1-11: Proximity to the Teacher a. The observational units are the students. b. One variable is the distance the student is sitting from the teacher. This variable is categorical and binary. The other variable is the quiz scores. This variable is quantitative.
Activity 1-12: Emergency Rooms a. Categorical variable b. Quantitative variable c. Categorical variable d. This is not a variable; it doesn’t vary from patient to patient. e. This is not a variable; you cannot ask an individual patient to tell you this information. f. Binary categorical variable g. Quantitative variable h. Binary categorical variable i. This is not a variable; it needs to be worded as in part h in order to be a variable. j. This is not a variable; this is summary information about the emergency room.
Activity 1-13: Candy Colors a. The observational units are the pieces of candy. b. The variable is the color of the candy. This variable is categorical (non-binary). c. Now the observational units are the samples of 25 pieces of candy. d. The variable is the proportion of the sample that is colored orange. This variable is quantitative.
Activity 1-14: Natural Light and Achievement a. The observational units are the students. b. One variable is whether the student learned in natural light. The other variable is the score on the standardized test. c. The first variable in part b is categorical and binary. The second variable in part b is quantitative.
Activity 1-15: Children’s Television Viewing a. The observational units are the third- and fourth-grade students in San Jose. b. The quantitative variables are body mass index, triceps skinfold thickness, waist circumference, waist-to-hip ratio, weekly time spent watching television, and weekly time spent playing video games. The categorical variable is which school the student attends.
Activity 1-16: Nicotine Lozenge a. The observational units are smokers. b. The categorical variables are whether they received the nicotine or placebo, gender, whether the person made a previous attempt to quit smoking, and whether the subject successfully refrained from smoking during the study. c. The quantitative variables are weight and number of cigarettes smoked per day. d. Type of lozenge assigned is a binary categorical variable.
Activity 1-17: Oscar Winners and Super Bowls a. Answers will vary from student to student. Examples include these:
Categorical variables:
• What is the movie’s genre?
• Did the picture also win an Academy Award for best director?
Quantitative variables:
• What was the total length (in minutes) of the movie?
• What was the production cost of the movie?
• How much did the movie gross during its first weekend of release?
b. Answers will vary from student to student. Examples include these:
Categorical variables:
• In what city was the game played?
• Was the game played indoors or outdoors?
• Which league was the winning team a member of?
• Was either team a wild card?
• Did the winner of the coin toss win the game?
Quantitative variables:
• What was the season percentage of wins for the winning team?
• What was the total payroll for the winning team?
• How many people attended the game?
• What was the point spread?
Assessment Sample Quiz 1A
•••
Suppose for every email message that you receive in the next week, you keep track of
• Whether the message is spam
• Whether the sender is a family member, a friend, or someone else
• Whether the message contains an emoticon (such as a smiley face)
• How many words are in the message
• What day of the week the message was sent
1. Which of these variables is quantitative? 2. How many of these variables are categorical? How many are binary? 3. What are the observational units in this study? 4. State a research question that you could address with these data. 5. Is “people who send you a message with an emoticon” a legitimate variable in this study? Explain why or why not.
Topic 2: Data and Distributions
regular section, or perhaps students were sleepier in the sports section because it met earlier in the day.
•••
Homework Activities Activity 2-7: Student Data
• How many hours you slept in the past 24 hours: dotplot
• Whether you have slept for at least 7 hours in the past 24 hours: bar graph
• How many states you have visited: dotplot
• Handedness: bar graph
• Day of the week on which you were born: bar graph
• Gender: bar graph
• Average study time per week: dotplot
• Score on the first exam in this course: dotplot
Activity 2-8: Student Data a. In general, do most female students study more than most male students? This does not mean that you would expect to find that all female students study more than all male students. You can also think in terms of the “typical female student” studying more than the “typical male student.” b. In general, do most students who study more score higher on exams than students who don’t study as much? Again, this does not mean that you would expect to find that all those who study earn higher grades than all those who do not study.
Activity 2-9: Value of Statistics a. Answers will vary from student to student. b. Answers will vary from class to class. Here are some sample answers:
Rating          1   2   3   4   5   6   7   8   9
Tally (Count)   0   0   1   0   5   6  11   6   6
c. Yes, 7 was chosen more often than any other value. d. 29/35 or .829 gave a response greater than 5; 1/35 or .029 gave a response less than 5. e. The vast majority of this class (more than 80%) feel that statistics is important to society. In fact, more than 65% of the class feel that statistics is very important to society. About 14% of the class are neutral about the importance of statistics, and only 1 of the 35 students in this group believes that statistics is not very important to study.
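The figures quoted in parts c through e can be checked directly from the sample tally in part b. The sketch below is one way to do that in Python; it assumes that "very important" corresponds to a rating of 7 or higher, which is an interpretation rather than something the solution states.

```python
# Tally of ratings (1-9) from the sample answers in part b.
tally = {1: 0, 2: 0, 3: 1, 4: 0, 5: 5, 6: 6, 7: 11, 8: 6, 9: 6}

n = sum(tally.values())                  # number of students
mode = max(tally, key=tally.get)         # most frequently chosen rating
above_5 = sum(count for rating, count in tally.items() if rating > 5)
below_5 = sum(count for rating, count in tally.items() if rating < 5)
neutral = tally[5]
# Assumption: "very important" is read here as a rating of 7 or higher.
at_least_7 = sum(count for rating, count in tally.items() if rating >= 7)

print(f"n = {n}, most common rating = {mode}")
print(f"greater than 5: {above_5}/{n} = {above_5 / n:.3f}")
print(f"less than 5:    {below_5}/{n} = {below_5 / n:.3f}")
print(f"exactly 5:      {neutral}/{n} = {neutral / n:.3f}")
print(f"7 or higher:    {at_least_7}/{n} = {at_least_7 / n:.3f}")
```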
Activity 2-10: Value of Statistics i. Class C ii. Class D iii. Class E iv. Class A v. Class B
Activity 2-11: Quiz Scores Many answers are possible. Here are some examples: a. Quiz 1: 0, 1, 1, 2, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10
b. Quiz 2: 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6 c. Quiz 3: 0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 10, 10 d. Quiz 4: 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10
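To see the shape of one of these example data sets without drawing a graph, the scores can be tallied and printed as a rough text dotplot turned on its side. This is only a sketch using the Quiz 1 example from part a; the same code works for the other quizzes by swapping in their lists.

```python
from collections import Counter

# Example scores for Quiz 1 from part a.
quiz1 = [0, 1, 1, 2, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10]

counts = Counter(quiz1)

# One row per possible score, one dot per student with that score.
for score in range(11):
    print(f"{score:>2} | " + "." * counts[score])
```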
[Dotplots of quiz score (horizontal axis, 0 to 10) by quiz number, one panel each for Quiz 1 through Quiz 4]
Activity 2-12: Responding to Katrina More than 85% of the whites who were surveyed did not think race was a factor in the government’s slow response to Hurricane Katrina, whereas only about a third of the blacks who were surveyed gave the same response. Equivalently, 60% of the blacks surveyed did think that race was a factor in the government’s response rate, compared to just over 10% of the whites surveyed. These results show a very strong difference of opinion between the two groups.
Activity 2-13: Backpack Weights a. The distributions of backpack weights for both the male and female students are roughly bell-shaped and range from a minimum of 2 lbs to a maximum of about 25 lbs. The females’ weights are centered between 10–11 lbs, whereas the males’ weights are centered slightly higher at about 11–12 lbs. The males appear to have one unusually heavy backpack weighing in at 35 lbs. b. Yes, it appears that males tend to carry slightly more weight in their backpacks than females. This is shown primarily by the centers in the dotplots. The graph for males appears shifted to the right of the graph for females. c. The ratios of backpack weights to body weights for these students range from about .015 to .18 and are roughly the same for both males and females. There is a cluster of ratios from .025 to .075 and another smaller cluster from about .08 to .125. There are five females with high ratios (.146 and above) but only one male with such a high ratio.
d. No, it does not appear that one sex tends to carry a higher ratio of their body weight in their backpacks than the other sex. Both dotplots look quite similar in terms of shape, center, and spread. e. Males tend to weigh more and so tend to carry more weight in their backpacks. But this factor is accounted for when you compute the ratio of backpack weight to body weight, as the ratio carried by each gender tends to be about the same.
Activity 2-14: Broadway Shows For the Broadway plays, the number of seats was fairly evenly distributed in 7 theaters, from just less than 600 to about 1100 seats. The number of seats available for the Broadway musicals in the remaining 22 theaters spread over a much wider range— from 650 seats to more than 1800 seats. Most of the musicals seemed to have either 1000–1200 seats or 1400–1700 seats. Attendance at the musicals was clustered primarily from 85% to more than 100% of the theatre’s capacity at each show. There were two low musical outliers near 40%. Attendance at the plays was more evenly distributed, with attendance for two plays near 60% of capacity, two near 80% of capacity, and three near or greater than 100% of capacity. The average price of a musical tended to be $70, although prices ranged from a low of about $55 to a high of $105. Ticket prices for a Broadway play were generally less than for a musical, with all but two of the play prices less than $60. The remaining two plays had ticket prices of about $83.
Activity 2-15: Highest Peaks The highest peaks in the east tend to be much lower than those in the west. The elevations in the east are all below 7,000 feet, whereas more than half of those in the west are above 9,000 feet. The west has a high outlier near 21,000 feet and a large cluster of elevations between 12,000 and 15,000 feet.
Activity 2-16: Pursuit of Happiness The following bar graph displays the results:
[Bar graph titled “Level of Happiness”: proportion (vertical axis, 0 to 1) by response (Very Happy, Pretty Happy, Not Too Happy, Did Not Know)]
Activity 2-17: Roller Coasters a. The observational units are the roller coasters. b. The quantitative variables are height, length, speed, and number of inversions. The categorical variables are type of coaster (wooden or steel; binary) and design (sit down, stand up, inverted).
d. A typical height for a steel coaster is 148 feet. A typical height for a wooden coaster is 100 feet. e. The steel coasters tend to be taller than the wooden coasters. Most of the steel coasters are taller than most of the wooden coasters. f. No, one type of coaster is not always taller. There are some very short steel coasters and some relatively tall wooden coasters.
Activity 2-18: Nicotine Lozenge After six weeks, 45% of smokers using the nicotine lozenge had successfully quit smoking, whereas only 30% of those using the placebo had quit smoking. Thus, smokers using the lozenge are 1.5 times more likely to quit smoking. However, after 52 weeks (a year later), only about 18% of those using the nicotine lozenge were still not smoking, compared to 10% of those using the placebo. This result still means the nicotine lozenge users are more likely to quit smoking (1.8 times)—but the overall chance that a member of either group will successfully refrain from smoking has dropped significantly after a year.
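The "1.5 times" and "1.8 times" figures in Activity 2-18 are ratios of the two groups' success percentages. A short Python sketch of that arithmetic, using only the percentages quoted in the solution (the variable names are illustrative):

```python
# Percentages of smokers not smoking, as reported in the solution.
quit_rates = {
    "6 weeks": {"lozenge": 45, "placebo": 30},
    "52 weeks": {"lozenge": 18, "placebo": 10},
}

# Ratio of the lozenge group's success rate to the placebo group's.
ratios = {
    period: rates["lozenge"] / rates["placebo"]
    for period, rates in quit_rates.items()
}

for period, ratio in ratios.items():
    print(f"{period}: lozenge users are {ratio:.1f} times as likely to have quit")
```

The ratio rises from 1.5 to 1.8 even though both success percentages fall, which is why the solution reports the ratio and the overall drop separately.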
Activity 2-19: Candy Colors a. The observational units are the Reese’s pieces candies. The variable is the color of the candy. This variable is categorical.
[Bar graph titled “Reese’s Pieces Candy”: proportion (vertical axis, 0 to 1) by color (Orange, Brown, Yellow)]
b. More than half of the candies were colored orange, just over a fourth (28%) were brown, and only 21% were yellow. This suggests that Hershey does not make equal proportions of each color. c. Answers will vary by class.
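The "more than half" claim for orange follows from the other two percentages. A minimal check in Python, assuming that orange, brown, and yellow are the only colors and that the stated percentages are rounded to whole numbers:

```python
# Percentages stated in part b for the two less common colors.
brown_pct = 28
yellow_pct = 21

# Assumption: every candy is orange, brown, or yellow,
# so the three percentages sum to 100.
orange_pct = 100 - brown_pct - yellow_pct

print(f"orange: {orange_pct}%")
print("more than half orange:", orange_pct > 50)
```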
Assessment Sample Quiz 2A
•••
You want to compare prices of textbooks, so you ask six friends who are science majors and six friends who are humanities majors to report how much they spent on textbooks this term.
Activity 2-17: Roller Coasters (continued) c. The heights of the steel coasters appear to have both a larger center and much greater variability than the wooden coasters. The steel coasters also seem to have a couple of high outliers at 420 feet.
Activity 3-5: Childhood Obesity and Sleep a. The explanatory variable is the amount of sleep that a child gets per night. This is a quantitative variable, although it would be categorical if the sleep data were reported only in intervals (more sleep vs. less sleep). The response variable is whether the child is obese, which is a binary categorical variable.
b. This is an observational study because the researchers passively recorded information about the children’s sleeping habits. They did not impose a certain amount of sleep on children. Therefore, it is not appropriate to draw a cause-and-effect conclusion that less sleep causes a higher rate of obesity. Children who get less sleep might differ in some other way that could account for the increased rate of obesity. For example, amount of exercise could be a confounding variable. Perhaps children who exercise less have more trouble sleeping, in which case exercise would be confounded with sleep. You have no way of knowing whether the higher rate of obesity is due to less sleep or less exercise, or both, or due to some other variable that is also related to both sleep and obesity.
c. The population from which these children were selected is apparently all children aged 5–10 in primary schools in the city of Trois-Rivières. These Quebec children might not be representative of all children in this age group worldwide, so you should be cautious about generalizing that a relationship between sleep and obesity exists for children around the world.
•••
Homework Activities Activity 3-6: Elvis Presley and Alf Landon a. This is a very biased sampling method. You would expect this method to overestimate the proportion of adults who believe that Elvis faked his death because people who feel strongly about this are likely to be the ones responding to such an Internet poll. b. This number is a statistic. c. Although this number may feel large, you really have no way of knowing, based on the statistic alone, whether a sampling method is biased. It is better to consider the sampling method when assessing whether you believe bias is present. d. The sample size is 2032. Taking a larger sample would not reduce bias; if the sampling method is flawed, increasing the sample size will not correct the problem.
Activity 3-7: Student Data a. Answers will vary by school and class. b. Answers will vary by school and class. c. Answers will vary by school and class.
Activity 3-8: Generation M a. Parameter b. Statistic c. Statistic d. Statistic e. Parameter f. Statistic g. Parameter h. Statistic
Topic 3: Drawing Conclusions from Studies
Activity 3-9: Community Ages a. This number is a parameter; you would view your community as the population. b. Yes, this sampling method would be biased. It would probably overestimate the average age of residents as younger residents do not attend church as frequently as older residents do. c. Yes, this would be a biased sampling method. This method would underestimate the average age of residents as most drivers at the daycare facility tend to be young adults, not middle-aged or elderly. This method would also exclude all residents who are not yet old enough to drive.
Activity 3-10: Penny Thoughts a. The number 2136 is the sample size, not the population. The population is all American adults. b. The sample is the 2136 people contacted by the Harris Poll; 59% is a statistic. c. The variable is whether the person opposes abolishing the penny; 59% is a statistic, not a variable. d. The observational units are people, not pennies. e. The parameter is a number (of unknown value). The population is all American adults. f. The statistic is a percentage of the sample of 2136 people who oppose abolishing the penny, 59%—not an average (whether they oppose abolishing the penny is a categorical variable).
Activity 3-11: Class Engagement a. No; this is an observational study, and there are at least two potential confounding variables that could explain the higher level of engagement in the statistics class. You cannot attribute the difference to the subject matter. b. Two confounding variables are time of class (8:00 am or 11 am) and instructor (Newton or Fisher).
Activity 3-12: Web Addiction a. The population is all visitors to the abcnews.com Web site (or Internet users). The sample is the 17,251 users of abcnews.com who responded to the survey. b. The corresponding parameter of interest in this study is the proportion of the population who have some sort of addiction to the Internet. c. The 6% is probably not a reasonable estimate of the parameter because the survey was voluntary. Those who use the Internet more and are more addicted to it are more likely to respond to an online survey. This makes the 6% higher than the percentage for all visitors to the site or for all Internet users in general. Alternatively, you could argue that many addicts might not be willing to admit to a “problem” and the 6% is less than the true proportion in the population (but this is more of a nonsampling error [people lying] rather than a sample selection issue). See Topic 4 (Activity 4-20 in particular) for more discussion of nonsampling errors.
Activity 3-13: Alternative Medicine The sample result is probably not representative of the truth concerning the population of all adult Americans because the sampling method is biased. Only readers of Self magazine were part of the poll, and the readers of this health magazine were probably the type of people who try alternative medicines more than nonreaders (bad sampling frame). Furthermore, strong advocates of alternative medicines would probably be more likely to reply to a mail-in poll (voluntary response bias). Therefore, this result is very likely to be an overestimate of the proportion of all adult Americans who have used alternative medicines.
Activity 3-14: Courtroom Cameras a. The proportion is 800/812 or .985. This number is a statistic.
b. This sample probably is not representative of the population of all adult Americans. Only those people familiar with the trial and with the fact that they could write letters to the judge about their opinion and who felt very strongly about the issue would take the time to write. Those who didn’t mind the use of cameras probably wouldn’t feel the need to write in. This sample was voluntary and not random at all.
Activity 3-15: Junior Golfer Survey a. No, this is not a representative sample of all American teenagers because most teenagers do not play golf. b. Yes, this sampling procedure is likely to be biased with respect to voting preference. Golfing is an expensive sport, and the wealthy tend to vote Republican, so these teenagers have probably grown up in Republican households. c. The following graph displays the responses:
[Bar graph titled “Junior Golfer Survey”: proportion (vertical axis, 0 to 1) by voting preference (Democrat, Republican, Neither, Don’t Know)]
This graph shows that the majority of respondents indicated they were more likely to vote for a Republican. If you don’t believe most teenagers are Republicans, this gives you evidence that the sampling method is overrepresenting the Republicans in the population. d. Yes, this sampling procedure is likely to be biased with regard to both of these variables. If junior golfers tend to come from more affluent families, they almost certainly have a cell phone and computer in their homes, making online access readily available and probably giving them more free time to spend on the computer. Of course, if they are more physically active and training for tournaments, they might tend to spend less time online than a typical teen.
Activity 3-16: Accumulating Frequent Flyer Miles a. The observational units are the visitors of msnbc.com. The variable is whether they use a credit card to accumulate airline miles. The variable is categorical and binary. b. This number is a statistic because it is a number computed from a sample (from 1935 online responses). c. This sampling method is most likely biased (because it is voluntary) and will provide an overestimate of the proportion of all American adults who use a credit card to accumulate airline miles. People who are willing to respond to an online survey are more likely to be comfortable using their credit cards over the Internet and to take advantage of Internet offers. d. The sample size is 1935. No, it does not affect the answer to part c. This is a large sample size, and even if it weren’t, a large sample size will not compensate for bias caused by a poor sampling method.
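Part d's point, that a larger sample does not repair a biased sampling method, can be illustrated with a small simulation. The sketch below is not based on the actual poll: the true proportion (0.30) and the two response rates (0.50 for people who use a card for miles, 0.10 for everyone else) are hypothetical values chosen only to mimic voluntary response, and `voluntary_poll` is an illustrative helper.

```python
import random

def voluntary_poll(n, true_p, respond_if_yes, respond_if_no, rng):
    """Proportion of 'yes' among the first n people who choose to respond."""
    yes = 0
    responses = 0
    while responses < n:
        is_yes = rng.random() < true_p            # person's true status
        p_respond = respond_if_yes if is_yes else respond_if_no
        if rng.random() < p_respond:              # person decides to respond
            responses += 1
            yes += is_yes
    return yes / n

rng = random.Random(0)
TRUE_P = 0.30           # hypothetical population proportion
RESPOND_IF_YES = 0.50   # hypothetical response rate among card users
RESPOND_IF_NO = 0.10    # hypothetical response rate among everyone else

estimates = {
    n: voluntary_poll(n, TRUE_P, RESPOND_IF_YES, RESPOND_IF_NO, rng)
    for n in (100, 1935, 20000)
}
for n, estimate in estimates.items():
    print(f"n = {n:>5}: sample proportion = {estimate:.3f} (true value {TRUE_P})")
```

Under these assumed rates the sample proportion settles near 0.15/0.22, about 0.68, rather than 0.30. Increasing n makes the estimate more stable around the wrong value; it does not move it toward the true one.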
Activity 3-17: Foreign Language Study a. Yes, these are observational studies. Researchers could only have passively observed the association between foreign language study and verbal SAT scores rather than determining for students whether they took a foreign language in high school. b. No, it is not legitimate to conclude that foreign language study causes an improvement in students’ verbal abilities. You can never draw cause-and-effect conclusions between variables from an observational study. One possible confounding variable is verbal aptitude. Perhaps students with strong verbal aptitudes choose to enroll in foreign language courses and also perform well on the verbal portion of the SAT exam. Students with weaker verbal skills may avoid foreign language courses and may also perform less well on the verbal portion of the SAT.
Activity 3-18: Smoking and Lung Cancer The student needs to explain how diet could be connected to both the explanatory (smoking) and response (lung cancer) variables. How could diet explain the apparent strong connection between smoking and lung cancer? For example, smokers may also tend to have poorer overall diets, and it could be the poor diet that leads to higher rates of cancer.
Activity 3-19: Smoking and Lung Cancer a. The explanatory variable is smoking habits. The response variable is whether the men died of lung cancer. b. Yes, this is an observational study. The researchers passively observed the smoking habits and lifespans of their subjects rather than actively imposing smoking habits on the individuals. c. Yes, you should have qualms about generalizing these results to a larger population. The subjects were all males and were haphazardly selected by volunteers, so the results definitely should not be extended to women. The results might also be unrepresentative of the general population, depending on how the volunteers selected the individuals.
Activity 3-20: A Nurse Accused a. The observational units are the eight-hour shifts. The explanatory variable is whether Gilbert worked on the shift. The response variable is whether a patient died on the shift.
b. Yes, this is an observational study because the researchers did not randomly determine which shifts Gilbert would work.
c. No, because this is an observational study, you cannot draw any cause-and-effect conclusions between the variables. You cannot conclude that Gilbert caused the higher death rate on her shift. d. Perhaps Gilbert is a senior-level intensive care nurse whose patients are generally in more critical condition than those seen by nurses on other shifts. If she works primarily with patients who are less likely to survive, then it would not be surprising that the death rate on her shift is higher than that of the hospital average. Or, perhaps Gilbert works night or weekend shifts, which tend to have higher death rates than daytime or weekday shifts.
Activity 3-21: Buckle Up! a. Yes, this is an observational study because you collected existing data about the states. b. No, you cannot conclude that the tougher seatbelt laws cause a higher proportion of residents to comply because this is an observational study. c. Yes, the data suggest that tougher seatbelt laws may result in lower death rates because the tougher seatbelt laws are associated with higher seatbelt compliance.
Activity 3-22: Yoga and Middle-Aged Weight Gain a. The explanatory variable is whether middle-aged adults practiced yoga. This variable is categorical and binary. The response variable is amount of weight gained/lost between the ages of 45 and 55. This variable is quantitative. b. Yes, this is an observational study because the researchers passively collected data through surveys rather than randomly determining who would practice yoga. c. No, this study does not allow you to draw a cause-and-effect conclusion between practicing yoga and gaining less weight because it is an observational study and you can never draw such conclusions based on observational studies. d. A potential confounding variable is the amount of weekly exercise performed by each adult. Perhaps adults who practice yoga also tend to engage in other forms of exercise on a regular basis, and this is what caused their weight loss. Adults who showed more weight gain may have participated in less overall exercise between the ages of 45 and 55.
Activity 3-23: Pet Therapy a. Yes, this is an observational study because you are passively observing and recording information about the patients instead of randomly determining which individuals own a pet. b. The explanatory variable is whether a recovering heart attack patient has a pet. This variable is categorical and binary. The response variable is whether the patient survives for five years. This variable is categorical and binary.
c. No, you cannot conclude that pet ownership leads to therapeutic benefits for heart attack patients based on this study because it is an observational study and you can never conclude cause-and-effect from an observational study. There are many potential confounding variables that could explain the association.
Activity 3-24: Winter Heart Attacks a. A possible confounding variable could be weather. An alternative explanation could be that during the months of December and January, the weather is colder, the days are shorter, people tend to get less exercise (or more straining exercise such as shoveling snow), and these factors in turn increase the number of heart attacks. b. The Los Angeles study reduces the viability of the change in weather explanation. c. A remaining confounding variable might be the length of the days. As the days shorten in the winter (and less sunlight is available), people become depressed, and this may increase the number of heart attacks that occur.
Activity 3-25: Pursuit of Happiness No, these study results do not establish a causal connection between income and happiness because this is an observational study and you can never conclude cause-and-effect from an observational study. There are many potential confounding variables that could explain the association.
Activity 3-26: Televisions, Computers, and Achievement a. Two explanatory variables are whether there was a television in the bedroom (binary categorical) and whether there was a computer in the home (binary categorical). Two response variables are score on mathematics portion of the achievement test (quantitative) and score on language arts portion of the achievement test (quantitative). b. Yes, this is an observational study. The researchers passively observed/collected the achievement scores and television/computer information about these children and did not impose any treatments. c. No, you cannot make either conclusion because this is an observational study. d. There are many possible answers. One confounding variable might be the financial status of the family. Families who are better-off financially are more likely to have computers but are also more likely to expose their children to various forms of literature and language arts, such as books, magazines, and theatre. This exposure, rather than the home computer, could be responsible for the higher scores on the language arts portion of the test. e. The sample in this study is the 348 Chicago third-graders. f. If you assume the sample was randomly selected, then you could generalize to all third-graders in the Chicago area. As they may not be typical of third-graders in other areas, you probably would not want to generalize beyond this population.
Activity 3-27: Parking Meter Reliability If the meters were randomly selected from Berkeley, you would be willing to generalize to Berkeley. However, because they were not randomly selected from all California parking meters, you wouldn’t be willing to generalize the results to this population.
Quizzes
Activity 3-28: Night Lights and Nearsightedness
a. No, assuming that these are observational studies, there are potential confounding variables that prevent you from legitimately concluding that sleeping with a night light causes a higher rate of nearsightedness. b. This argument is incomplete because the student has not explained how “genetics” is connected to sleeping with a night light (the explanatory variable) as well as to the rate of nearsightedness (the response variable). The student should have said something such as “Parents’ eyesight, because nearsighted parents tend to have nearsighted children (genetics), and it could be that parents who are themselves nearsighted are more likely to need a night light in their children’s rooms.”
Assessment Sample Quiz 3A
•••
You want to investigate whether teenagers in England tend to read more Harry Potter books than teenagers in the United States. 1. Identify the populations in this study. 2. Identify the explanatory variable, and classify it as categorical or quantitative. 3. Identify the response variable, and classify it as categorical or quantitative. If you read a report that Hospital A has a higher mortality (death) rate than Hospital B when treating heart attack patients, it’s possible that the severity of the patient’s condition is a confounding variable. 4 and 5. Describe what it means for patient’s condition to be a confounding variable in this context. Be sure to indicate how this potential confounding variable could be related both to the explanatory and the response variable.
Solution to Sample Quiz 3A
•••
1. The populations are teenagers in England (1) and teenagers in the United States (2). 2. The explanatory variable is whether the teenager is from England or the United States. This is a binary categorical variable. 3. The response variable is the number of Harry Potter books the teenager has read. This is a quantitative variable. 4 and 5. A confounding variable is an undefined/unrecorded variable whose effects on the response variable are indistinguishable from the explanatory variable. It is possible that most of the patients who go to Hospital A are in critical condition when they arrive, whereas most of the patients who go to Hospital B are in fair to good condition when they arrive. This would necessarily mean that more of Hospital A’s heart attack patients would die (because of their prior condition, not because of their treatment), and more of Hospital B’s patients would be likely to survive.
participate might differ systematically in some ways from those who were included. Nevertheless, the researchers did use randomness to select their sample, and they probably obtained as representative a sample as reasonably possible. d. Perhaps mothers in those groups were in a lower economic class and therefore less likely to have phones in the first place, or perhaps they had to work so their children were in daycare. e. These comparisons address the issue of bias, not precision. The sampling method was slightly biased with regard to the mother’s race and age and the infant’s birth weight.
f. These percentages are statistics because they are based on the sample. g. The large sample size produces high precision. This means that the sample statistics are likely to be close to their population counterparts. For example, the population proportion of infants who sleep on their backs should be close to the sample proportion who sleep on their backs. h. The sample size for subgroups is smaller than for the whole group, so the sample results would be less precise.
•••
Homework Activities Activity 4-6: Rating Chain Restaurants a. It seems unlikely that this sample was randomly chosen as it would be extremely difficult to give each Consumer Reports reader an equally likely chance of being selected for the sample and to ensure that everyone selected responded. It is much more likely that the responders self-selected by returning a survey. b. The authors probably make the disclaimer because the sample was not randomly selected from the entire population, but only from their readers who may have different habits and attitudes from nonreaders and therefore cannot reasonably be extended to the general population. c. Answers will vary, but you probably should generalize these results only to Consumer Reports readers who tend to visit full-service restaurant chains and like to complete surveys.
Activity 4-7: Sampling Words a. Categorical (binary) b. 99/268 ≈ .369 c. The answer to part b is a parameter; .369 is the proportion of all 268 words (the population) in the Gettysburg address that are over 5 letters long. d. No; because of sampling variability you would not expect the sample proportion to equal .369, but you would expect it to be reasonably close most of the time. (In fact, with a sample of size 5, the sample proportion could not equal .369; it could only be 0, .2, .4, .6, .8, or 1.)
Activity 4-8: Sampling Words Answers will vary. These are based on one particular running of the applet. a. Yes, this distribution should be centered at about .369 (it is .38 in this case).
Topic 4: Random Sampling
b. This distribution should still be centered at .369 (the mean is .37), but with much less variability. c. Because you are taking random samples, you expect your sample proportions to center around the parameter (.369), regardless of the sample size. However, as you increase the sample size, you expect your samples to become more precise; that is, you expect the variability between samples to decrease.
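The pattern described in parts b and c can be checked with a short simulation. The sketch below is not the applet used in the activity; it only assumes the population from Activity 4-7 (268 words, 99 of them longer than 5 letters). The sample sizes 5 and 20, the number of repetitions, and the function name are illustrative assumptions.

```python
import random
import statistics

# Population from Activity 4-7: 268 words, 99 of them longer than 5 letters.
# True marks a "long" word; the words themselves are not needed here.
POPULATION = [True] * 99 + [False] * 169


def simulate_sample_proportions(sample_size, repetitions=1000, seed=0):
    """Draw repeated simple random samples (without replacement) and
    return the list of sample proportions of long words."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repetitions):
        sample = rng.sample(POPULATION, sample_size)
        proportions.append(sum(sample) / sample_size)
    return proportions


if __name__ == "__main__":
    for n in (5, 20):  # hypothetical sample sizes, chosen for illustration
        props = simulate_sample_proportions(n)
        print(f"n={n}: mean={statistics.mean(props):.3f}, "
              f"sd={statistics.stdev(props):.3f}")
```

Both sets of sample proportions should center near .369, while the standard deviation for the larger sample size should be noticeably smaller. That is the distinction between bias (where the distribution is centered) and precision (how spread out it is).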
Activity 4-9: Sampling Senators a. The observational units are the U.S. senators. The variable is years of service in the senate. The population is the current 100 U.S. senators. The sample is the 5 selected current U.S. senators. The parameter is the average years of service of all 100 U.S. senators. The statistic is the average years of service of the 5 selected senators. b. This sampling method would most likely overestimate the average years of service because your classmates would most likely select names of well-known senators who have been serving in the senate for a long time. (You also need to worry about a tendency for students to mention the senators from their own states more than those from other states.) c. No, increasing the sample size will not correct for a biased sampling method. Students would still tend to overrepresent the senators who have served longer. d. Obtain a list of the current senators. Number each senator in the list from 00–99. Select any row of the Random Digits Table and read the row as a sequence of two-digit numbers. These two-digit numbers tell you which senators from your list will make up your sample. Continue selecting senators until you have five senators in your sample. Skip any repeated two-digit numbers. e. Obtain a list of the current representatives. Number each representative in the list from 000–434. Select any row of the Random Digits Table and read the row as a sequence of three-digit numbers. These three-digit numbers tell you which representatives from your list will make up your sample. Skip any repeated three-digit numbers or numbers greater than 434. Continue selecting representatives until you have five representatives in your sample. If necessary, continue to another row of the Random Digits Table.
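The selection procedure in parts d and e can be sketched in code. The digit string below is a made-up stand-in for a row of the Random Digits Table, not an actual row from the book, and the function name `select_labels` is hypothetical.

```python
def select_labels(digits, label_width, max_label, sample_size):
    """Read a string of random digits in chunks of `label_width`,
    skipping repeats and labels above `max_label`, until
    `sample_size` distinct labels have been chosen."""
    chosen = []
    for start in range(0, len(digits) - label_width + 1, label_width):
        label = int(digits[start:start + label_width])
        if label > max_label or label in chosen:
            continue  # out of range or already selected: skip it
        chosen.append(label)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of digits; continue with another row")


# Made-up digits standing in for one row of the table (an assumption).
row = "172231704098172259803451660120"

# Senators are labeled 00-99, so every two-digit chunk is a valid label.
print(select_labels(row, label_width=2, max_label=99, sample_size=5))
# prints [17, 22, 31, 70, 40]

# Representatives are labeled 000-434, so chunks above 434 are skipped.
print(select_labels(row, label_width=3, max_label=434, sample_size=5))
# prints [172, 231, 98, 259, 120]
```

In the three-digit reading, 704, 803, 451, and 660 are skipped because they exceed 434, and the second 172 is skipped as a repeat.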
Activity 4-10: Responding to Katrina Based on sample sizes, the non-Hispanic white adults’ responses probably come closer to reflecting the group’s population value than the black adults’ responses do because there were so many more white adults sampled. If both samples were selected randomly, the larger sample is more likely to produce a sample result similar to the population parameter.
Activity 4-11: Rose-y Opinions a. The observational units are the 1000 individuals. The variable is whether they have a favorable or unfavorable opinion of Pete Rose. This is a categorical variable. b. The population is American sports fans. The sample is the first 1000 people leaving an LA Lakers’ basketball game.
c. This was not a randomly selected sample. People attending this basketball game are not necessarily sports fans in general or may be extreme LA Lakers fans or simply basketball fans. This is an example of convenience sampling and is unlikely to result in a representative sample. d. No, the individuals in the sample may still be only interested in basketball and not sports in general.
e. If you have a list of subscribers to Sports Illustrated, you could number the list and use a table of random digits or a computer to select a random sample of subscribers. The population who would be represented by this sample would be all readers of Sports Illustrated, which would certainly be more representative of the general sports fan than the previous methods. f. The parameter is the percentage of American sports fans who have an unfavorable opinion of Pete Rose. Its value is unknown. The statistic is the 49% of the 1000 people interviewed by the Gallup pollsters who said they had an unfavorable opinion of Pete Rose. g. The value of the statistic would most likely change if Gallup had selected another random sample of 1000 people to interview. But the value of the parameter would remain the same.
Activity 4-12: Sampling on Campus a. The observational units are college freshmen. The variable is weight gained during the first term at college. The population is all U.S. college freshmen. The sample is a random sample of college freshmen. The parameter is the average weight gained by all college freshmen during their first term. Because it would be impossible to obtain a random sample of all U.S. college freshmen, work with freshmen at a particular college. Obtain a list of all freshmen from the registrar. Number the list and use a table of random digits to obtain a random sample of freshmen. b. The observational units are college students. The variable is price paid for textbooks. The population is all U.S. college students. The sample is a random sample of college students. The parameter is average price paid for textbooks by all college students. Because it would be impossible to obtain a random sample of all U.S. college students, work with students at a particular college. Obtain a list of all students from the registrar. Number the list and use a table of random digits to obtain a random sample of students. c. The observational units are pages of your history book. The variable is number of words on each page. The population is all pages in your history book. The sample is a random sample of pages from your history book. The parameter is average number of words per page in your history book. Number all the pages in your history book consecutively. Use a table of random digits to select a sample of pages from your book and count all the words on these pages. d. The observational units are college faculty. The variable is political party registration. The population is all U.S. college faculty. The sample is a random sample of U.S. college faculty. The parameter is percentages of U.S. college faculty who are registered in each political party.
Because it would be impossible to obtain a random sample of all U.S. college faculty, work with faculty at a particular college. Obtain a list of all faculty and number the list. Then use a table of random digits to obtain a random sample of faculty.
Activity 4-13: Sport Utility Vehicles a. The observational units are the vehicles. The variable is whether the vehicle is an SUV. The population is all vehicles on the road in your hometown. The sample is the vehicles that pass by the intersection between 7 and 8 am that morning. The parameter is the proportion of all vehicles on the road in your hometown that are SUVs. The statistic is the proportion of all vehicles that pass by that morning that are SUVs. b. The vehicles that you observed between 7 and 8 am may not be representative of all vehicles on the road. For example, the vehicles may be used to carpool children to school and therefore overrepresent larger families with children and larger cars, or they may be predominantly commuter vehicles rather than weekend recreational vehicles and therefore underrepresent the proportion of SUVs. c. The sampling frame is the list of cars sold by that dealer. d. The recently purchased vehicles will probably not represent the vehicles on the road in your town. For example, there may have been a backlash against SUVs recently because of high gas prices so that fewer SUVs were purchased in the last year, yet many people would still own them from purchases made several years ago.
Activity 4-14: Generation M a. Your classmates form a sample as they are only a subset of all students at your school. b. Answers will vary. This number is a statistic because it is collected from your class (a sample). c. Answers will vary from class to class, but the numbers calculated will all be statistics. d. No, you and your classmates do not constitute a random sample of the students at your school because every student did not have an equal chance of being selected for the sample. e. Answers will vary by school and class. f. Answers will vary by school and class.
Activity 4-15: Emotional Support a. Hite’s sampling method is likely to be biased in the direction of women who think they give more support than they receive. She sampled women in women’s groups who usually join because they aren’t getting the kind of companionship they want from their husbands or boyfriends. b. Hite’s poll surveyed the larger number of women. c. The ABC News/Washington Post poll was probably more representative of the truth about the population of all American women because they used random sampling that was presumably unbiased.
Activity 4-16: College Football Players a. Position: categorical Weight: quantitative Class: categorical
b. Example answer, using line 13 of the table:
Note: An early printing of the student book has an error that gives 82 as the number of players rather than 99. There are actually 99 players as some jersey numbers are missing and some are duplicated.
If the players are renumbered from 01 to 99, an example answer would be to use line 13 to select players: 54 Danny Rohr (220 lbs), 40 Brandon Williamson (180 lbs), 02 Courtney Brown (205 lbs), 21 Anthony Randolph (220 lbs), 50 David Fullerton (195 lbs), 56 James Chen (240 lbs), 55 Alex Bynum (230 lbs), 87 Kyle Maddux (210 lbs), 52 Kevin Spach (220 lbs), 86 Louis Shepherd (250 lbs), 07 Pat Johnston (195 lbs), 30 Drew Robinson (195 lbs), 34 David Elmerick (185 lbs), 05 Mike Anderson (180 lbs), and 60 Bobby Best (245 lbs). The average weight in this sample is 211.3 lbs. This weight should be fairly close to the average weight of all 99 players because you took a random sample, but you don’t expect it to match exactly. In particular, although this value will vary from sample to sample, you don’t expect a tendency to consistently overestimate or underestimate the population mean weight.
A population of 82 players can be considered by deleting the red-shirted freshmen from the list and renumbering the remaining players from 01 to 82. Then, using line 13, you select players 54 Brock Daniels (275 lbs), 40 Aris Borjas (200 lbs), 02 Courtney Brown (205 lbs), 55 Kenny Calderone (285 lbs), 52 Bobby Best (245 lbs), 07 Pat Johnston (195 lbs), 30 Drew Robinson (195 lbs), 34 Martin Mates (185 lbs), 05 Mike Anderson (180 lbs), 60 Lucas Trily (235 lbs), 57 Patrick Koligian (250 lbs), and 62 Julai Tuua (275 lbs). The average weight in this sample is 230 lbs. This weight should be fairly close to the average weight of all 82 players because you took a random sample, but you don’t expect it to match exactly. In particular, although this value will vary from sample to sample, you don’t expect a tendency to consistently overestimate or underestimate the population mean weight.
Activity 4-17: Phone Book Gender a. The parameter is the proportion of women living in San Luis Obispo County. The statistic is the proportion of women listed on the randomly selected phone book page. b. This sampling technique will give a biased estimate for the proportion of women living in San Luis Obispo County because the phone listings of many married women are often only under their husbands’ names. In addition, many single women choose not to list their phone numbers to avoid harassing phone calls. Therefore, you expect the statistic will be an underestimate of the population parameter.
Activity 4-18: Sampling Senators From most variability to least variability: a, c, d, b. As the sample size increases, regardless of the size of the population, the variability in the sample values decreases.
Activity 4-19: Voter Turnout a. 1783/2613 ≈ .682 b. This is a statistic because it is a number calculated from a sample (of 2613 adults). c. The following bar graph displays the proportions who claimed to have voted and not:
[Bar graph: Voter Turnout in 1996 Presidential Election; vertical axis: Proportion (0 to 1); horizontal axis: Response (Voted, Did Not Vote)]
d. This number (49%) is a parameter because the Federal Election Commission has the records of all registered voters. Everyone who was eligible to vote was included in this number. e. No, the sample grossly overestimated the proportion of eligible voters who actually voted. f. Although the sample result is unlikely to match the population value exactly, this difference is probably too large to be attributed to sampling variability. g. People may be reluctant to tell the truth (and seem unpatriotic) and so may overstate whether they voted. They might not remember that they didn’t vote in this particular election. Even with random samples, you have to worry about the honesty of the respondents in surveys.
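The claim in part f can be made more concrete with a rough calculation. The sketch below assumes the 2613 adults behave like a simple random sample and uses the usual standard-deviation formula for a sample proportion, sqrt(p(1 - p)/n), which this topic has not yet introduced. Treat it as an illustration, not as part of the expected answer.

```python
import math

n = 2613                 # adults in the sample
sample_prop = 1783 / n   # proportion who claimed to have voted
population_prop = 0.49   # reported turnout among eligible voters

# Typical sample-to-sample variation in a sample proportion,
# assuming simple random sampling (an assumption, see the note above).
sd = math.sqrt(population_prop * (1 - population_prop) / n)

# How many standard deviations the sample result lies from the parameter.
z = (sample_prop - population_prop) / sd
print(f"sample proportion = {sample_prop:.3f}")
print(f"sd of sample proportion = {sd:.4f}")
print(f"distance from parameter = {z:.1f} standard deviations")
```

Under these assumptions the sample proportion lies roughly 20 standard deviations above the parameter, far more than random sampling alone would plausibly produce. That points toward a nonsampling explanation such as the overreporting described in part g.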
Activity 4-20: Nonsampling Sources of Bias a. The proportions of “yes” responses would most likely differ between these two groups. The question that includes the words “horrific murder” is obviously putting a negative idea into the minds of those surveyed, whereas the other question seems neutral. b. The proportions declaring agreement with the policy might differ between these two groups. Those interviewed by the smoker might feel pressured into disagreeing. c. The proportion of “yes” responses would probably be lower than the actual proportion of married people in the community who have engaged in extramarital sex. This manner of survey is not very confidential, and the surveyor would be hard-pressed to get honest answers to such a personal and potentially harmful question. d. You should not be surprised that the proportions would differ between these two groups. The president’s views on foreign policy would be fresh in the minds of one group, whereas the other group would have to recall past speeches or actions
of the president in order to form an opinion. Approval ratings tend to rise shortly after rousing speeches but then come back down again over time. e. How the question is worded, appearance of the interviewer, lack of confidentiality, knowledge of the topic, and timing of the question are sources of bias.
Activity 4-21: Prison Terms and Car Trips a. Prisoners with longer terms have a higher probability of ending up in the sample (similar to how longer words are more likely to be selected when you point your finger at one spot on the page).
b. Cars engaged in longer trips have a higher chance of being observed at a particular time point than cars on shorter trips. c. Many answers are possible, but one example is estimating the average length of time that people have been employed by a particular company. If you take a random sample of employees, employees who have been around longer have a better chance of ending up in the sample.
Assessment Sample Quiz 4A
•••
The National Retail Federation sponsors surveys of consumer behavior. One such survey, conducted on a yearly basis, asks American adults whether they plan to celebrate Mother’s Day, what kind of gift(s) they plan to buy, and how much they plan to spend. The 2007 survey was conducted on April 4–11 with 7859 consumers participating. Of these respondents, 84.5% said that they were planning to celebrate Mother’s Day, expecting to spend an average of $139.14. 1. Identify the population of interest in this survey. 2. Identify the sample and the sample size. 3. Are the values listed (84.5%, $139.14) parameters or statistics? Explain. 4. Identify (in words) the parameters of interest in this study. 5. The press release describing this survey did not say how the 7859 consumers were selected. Explain why knowing this missing information is important.
Solution to Sample Quiz 4A
•••
1. The population is all adult Americans. 2. The sample is a group of adult American consumers, surveyed during the week of April 4–11, 2007. The sample size is 7859. 3. These values are statistics because they describe a sample. 4. The parameters are the proportion of all American adults who plan to celebrate Mother’s Day in 2007 and the mean/average amount that those who are planning to celebrate Mother’s Day expect to spend. 5. In order to consider the sample to be representative of the population, you need to know whether it was randomly selected.
Homework Activities Activity 5-7: An Apple a Day a. Anecdote b. Observational study c. Experiment
Activity 5-8: Treating Parkinson’s Disease a. Sham surgery is a surgery that has no medical purpose. It is “placebo” surgery performed so that patients do not know which of them are receiving the Spheramine and which are not. b. If an experiment is double-blind, then neither the subjects nor the evaluator knows whether each subject is in the treatment group or in the control group. This is important in this study because it prevents the evaluator from being biased in his/her judgment of the effectiveness of the implant and also prevents the patients from feeling psychologically different based on whether they believe they are receiving the treatment (see part d also).
c. Randomized means that subjects are randomly assigned to either the treatment or control (placebo) group. This is important because it should mean that the only difference between the two groups should be the treatment, and so if there is a substantial difference observed between the groups later, you can conclude the effect was caused by the treatment.
d. Placebo-controlled means the subjects in the control group are given a placebo (in this case, the sham surgery) so they cannot tell they are in the control group, and so that if subjects are going to improve because of the surgery itself (rather than the implant), this will happen at the same rate in both the treatment and control groups.
Activity 5-9: Ice Cream Servings a. The explanatory variables are the large or small bowl (binary categorical) and large or small scoop (binary categorical). The response variable is amount of ice cream eaten (quantitative). b. This is an experiment because the researchers actively imposed the treatments on the subjects by randomly assigning the size of the scoops and bowls. c. The random assignment was important because it controlled for the potential confounding variable of self-selection. If the nutrition experts were allowed to choose for themselves, those who tended to have small appetites might have chosen the smaller bowls and/or scoops and consequently eaten less ice cream. Then appetite would be confounded with bowl/scoop size. d. The nutrition experts did not know that there were two different sizes of bowls and scoops being distributed, so they would not be conscious of the size of the bowl and perhaps adjust the amount of ice cream they ate in order to be more in line with one of the other groups. e. Because this study was a well-designed, randomized controlled experiment, it is valid to draw a cause-and-effect conclusion between size of bowl or scoop and size of the ice cream serving.
Topic 5: Designing Experiments
f. You have controlled for this potentially confounding variable by randomly assigning the subjects to the treatment groups. The only difference between the two groups should be the bowl and scoop sizes.
Activity 5-10: Spelling Errors Randomly divide college students into two groups (number a group of student participants and use a random digits table to split them into two groups). Have one group use a computer’s spell-checker to proofread a research paper, and have the other group proofread the same research paper without using the spell-checker. Compare the performance of both groups to see whether one group catches more errors than the other.
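The random division described above can also be done by computer instead of a random digits table. This is a minimal sketch: the participant IDs, the group size of 20, and the function name are hypothetical, and Python's `random` module stands in for the table.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups
    of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs; real names or ID numbers would go here.
students = [f"student_{i:02d}" for i in range(1, 21)]
spell_checker_group, no_spell_checker_group = randomly_assign(students, seed=1)
print("Spell-checker:", spell_checker_group)
print("No spell-checker:", no_spell_checker_group)
```

Because chance alone decides each student's group, proofreading ability and other lurking variables should balance out between the groups on average.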
Activity 5-11: Foreign Language Study a. No, you cannot conclude that foreign language study improves your verbal skills. Because this was an observational study, there are many confounding variables that could explain the association. b. A controlled experiment would need to randomly assign students to different treatment groups (i.e., foreign language study and no foreign language study) and then later compare the verbal SAT scores of the two groups. This would ensure that hidden confounding variables such as verbal aptitude would balance out between the groups. c. It might not be feasible to carry out such an experiment because you cannot generally control which courses students do or do not take.
Activity 5-12: AZT and HIV a. The explanatory variable is whether the pregnant woman received AZT (categorical). The response variable is whether the resulting baby was HIV-infected at birth (categorical). b. This is an experiment. The researchers actively randomly assigned the mothers to the control and treatment groups. c. This study makes use of comparison by having a group who received AZT (treatment group) and a group who received a placebo (control group). This allowed the researchers to compare the babies’ infection rates between the two groups. In particular, any changes over time would occur for both groups. d. The study used random assignment to decide which mothers would receive AZT and which would receive the placebo. This should even out all variables, so the only difference between the groups of mothers is the AZT. e. The study used blindness by giving the mothers in the control group a placebo so that neither group of mothers could tell whether they were actually receiving the AZT. This would control for the placebo effect in both groups.
Activity 5-13: Pet Therapy a. The explanatory variable is whether the heart attack patient owns a pet (categorical). The response variable is whether the patient survived for five years (categorical). b. This is an observational study. The researcher passively observed and recorded information on pet ownership and the patient’s recovery rather than assigning some people to own pets and others to not own pets.
c. Yes, there is a group of patients who do not own pets for comparison. d. No, this study does not make use of randomization. Patients were not randomly selected or randomly assigned to treatment groups. e. No, you cannot conclude that owning a pet has a therapeutic effect for heart attack survivors because there may be confounding variables that explain the association. You cannot conclude causation with an observational study. f. This study could be a controlled experiment if the researcher used randomization to determine whether the patient owned a pet. In this case, the researcher would actively impose the treatment on the subjects. The experimenter would then hope to see the direct effect of pet ownership on the recovery rate of heart attack patients. g. This is debatable. Is it feasible to tell someone to own a pet? Probably not.
Activity 5-14: Studies from Blink a. Parts a and b only have one variable so you cannot distinguish between explanatory and response variables in these cases (you could consider the one variable a response variable in each case). For part c, the explanatory variable is whether their version of the exam asks them to indicate race; the response variable is score on SAT-like exam. For part d, the explanatory variables are gender and race of “customer”; the response variable is price negotiated for the car. b. The observational studies are part a (height of American CEOs) and part b (marriage counselors). The experiments are part c (SAT-like exam given to African American students) and part d (best prices at car dealerships). c. Because parts a and b are observational studies, you cannot draw any cause-and-effect conclusions from either of them. In part a, because the economist took a random sample of American CEOs, you are probably safe in generalizing the results to the population of American CEOs. In part b, you are not told that the psychologist interviewed a random sample of marriage counselors, so you might hesitate to generalize these results to any larger population of counselors. In part c, you should be able to draw a cause-and-effect conclusion if an effect is found, because a randomized, controlled experiment was performed. You should be cautious in generalizing your results to African American college students at similar colleges, however, because you were not told how the 200 students were selected for the study. In part d, if a significant difference is found in average price among the four types of customers, you should be able to attribute the difference to race, gender, or both because you used a comparative, randomized experiment. You should be cautious about generalizing these results because only 10 dealerships were used, and they were all apparently in the same city.
Activity 5-15: Reducing Cold Durations a. The experimental units are the 104 subjects reporting to the lab within 24 hours of getting a cold. b. The explanatory variable is amount of zinc nasal spray (full, low, or no dosage). The response variable is duration of cold symptoms. c. This is an experiment because the researchers randomly assigned the subjects to the treatment groups (amount of zinc spray) and actively imposed the treatments on the patients.
d. The researchers used a placebo to ensure that if the subjects’ colds improved simply because of receiving any treatment, this effect would be seen equally in each of the groups.
Activity 5-16: Religious Lifetimes a. The explanatory variable is whether the subject attends religious services at least once a month. The response variable is lifespan. b. This is an observational study because the researchers did not randomly assign the subjects to attend religious services or not. c. You cannot conclude that attending religious services will lengthen one’s life because this is an observational study. A possible confounding variable is the subject’s health and lifestyle. Perhaps people who attend religious ceremonies take better care of their bodies, which may affect their lifespan. d. Yes, if the sample is selected randomly, it should represent the population, regardless of the population size. The important consideration here is how the sample is selected, not the relative size of the sample compared to the size of the population.
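The point in part d, that precision depends on the sample size and not on the population size, can be illustrated with a simulation. Everything in this sketch is made up for illustration: the lifespans are generated from a normal distribution with mean 78 and standard deviation 10, and the population sizes, sample size of 100, and function name are assumptions.

```python
import random
import statistics


def sd_of_sample_means(population_size, sample_size=100,
                       repetitions=300, seed=0):
    """Build a made-up population of lifespans, then measure how much
    the sample mean varies across repeated random samples."""
    rng = random.Random(seed)
    population = [rng.gauss(78, 10) for _ in range(population_size)]
    means = [statistics.mean(rng.sample(population, sample_size))
             for _ in range(repetitions)]
    return statistics.stdev(means)


# Same sample size, populations that differ in size by a factor of 100.
for size in (5_000, 500_000):
    print(f"population of {size:>7,}: "
          f"sd of sample means = {sd_of_sample_means(size):.2f}")
```

Both runs should report a standard deviation near 1 year, even though one population is 100 times larger than the other. A random sample of 100 is about equally precise in either case.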
Activity 5-17: Natural Light and Achievement a. Researchers would randomly assign the students to two different treatment groups— one with high natural light and one with low natural light. Then the researchers would compare the standardized test scores of the students in these two groups. b. It would be difficult to carry out this experiment because there are ethical considerations that could prevent you from depriving students of natural light and also from possibly detrimentally affecting their education. c. John B. Lyons could say, “There is a causal relationship between daylight and achievement” if this was a well-designed, randomized comparative experiment.
Activity 5-18: SAT Coaching a. The explanatory variable is before or after attending the coaching program. The response variable is SAT score (improvement). b. This is an observational study. The researcher passively observed and recorded information on the students’ SAT scores. He/she did not randomly decide who would or would not enroll in the coaching program. c. You cannot conclude the SAT coaching program caused the improvements in scores because this was not a randomized, comparative experiment. Perhaps most students would generally improve the second time they take the test regardless of the coaching program (you had no comparison group here). Or there may have been other changes in their study habits in addition to the coaching program.
Activity 5-19: Capital Punishment a. No, this is not an experiment because the researcher did not impose the death penalty statute on the states that have it or prevent other states from having it. b. No, you cannot conclude that the death penalty caused the difference in homicide rates because this is an observational study. There may be confounding variables
(such as the state’s overall crime rate or legal system) that could also affect the response variable. c. No, you cannot conclude a lack of causation either because this is an observational study. There could be other variables that are masking the effect of the death penalty.
Activity 5-20: Literature for Parolees a. Committed a crime: 6/32 = .1875; did not commit a crime: 26/32 = .8125 b. Committed a crime: 18/40 = .45; did not commit a crime: 22/40 = .55 c. This study did not randomly assign the parolees to the control and treatment groups. Instead, qualifications had to be met in order to get into the literature (treatment) program. Perhaps literacy or motivation was a confounding variable that affected the likelihood of committing a new crime.
Activity 5-21: Therapeutic Touch a. This was an experiment. Emily imposed the treatment (her hand) on the subjects. b. Emily flipped a coin to decide which of the subject’s hands she would hold hers over. c. This study was not double-blind. Emily was aware of which subjects received which treatments. d. No; Emily’s sample consisted of volunteers. It was not randomly selected from all practitioners. e. No, you should not attribute this tendency to detection of Emily’s energy field. Emily used only practitioners of therapeutic touch in her study; she did not have a control group of people who did not claim to participate in this practice with which to compare.
Activity 5-22: Prayers, Cell Phones, School Uniforms Answers will vary. These are example answers. a. Randomly divide the subjects into two groups. Have one group talk on the cell phone while driving and prohibit the other group from using a cell phone while driving. Compare the performance of both groups on an obstacle course to see whether they behave differently. An observational study would not allow you to conclude causation because you would be unable to control for confounding variables. Drivers who choose to use a cell phone may be less careful in general and therefore more prone to accidents, regardless of cell phone use. b. Locate a group of patients with a common type of pain (cancer, back pain, etc.). Randomly divide the patients into two groups. Have a prayer group pray for one group for a specified period of time and not pray for the other group. Record any decrease in pain in both groups. If you simply passively observe which patients use prayer to try to reduce suffering and which do not, you will not be able to control for confounding variables such as sociability. Perhaps patients who believe in prayer are more sociable than those who do not, and this sociability raises their spirits, which provides pain relief.
c. Randomly divide the students into two groups. Have one group wear uniforms to school and allow the other group to wear anything they like. Compare the performance of both groups on a standardized test. An observational study would not allow you to conclude causation because you would be unable to control for confounding variables. Perhaps students who choose to wear uniforms (or whose parents require that they wear uniforms) are more studious than those who do not and would perform better on the standardized test, regardless of what they wore.
Activity 5-23: Proximity to the Teacher a. The observational units are the students. The explanatory variable is whether the student sits close to/far away from the teacher. The response variable is performance on quizzes. b. Researchers would randomly assign the students to two different treatment groups—one that sits close to the teacher and one that sits far from the teacher. Then the researchers would compare the quiz scores of the students in these two groups. c. You will be able to conclude that sitting closer to the teacher does (or does not) cause students to perform better on quizzes if you use a well-designed, controlled experiment. d. You would not have to worry about the ethics of assigning seats to students or possibly detrimentally affecting the students’ education through their seat assignments.
Activity 5-24: Smoking While Pregnant a. These are almost certainly observational studies. It would be difficult, if not ethically impossible, for researchers to assign the subjects to control and treatment groups—they would have to simply passively observe which women smoke and which do not. b. No, it would not be ethical (or feasible) for researchers to randomly assign the pregnant women to control (nonsmoking) and treatment (smoking) groups. You could perhaps use mice or other animals, but then you would have trouble generalizing the results to human subjects.
Activity 5-25: Dolphin Therapy a. This is an experiment because the researcher actively imposed the assignment to the two groups (with or without dolphins) on the subjects. b. The explanatory variable is swimming with dolphins or swimming/snorkeling without dolphins. This variable is categorical. The response variable is change in depression symptoms. This variable is presumably categorical. c. Assuming that the patients were randomly assigned to the two groups, yes, you can conclude that swimming with dolphins improves depression symptoms because this was a well-designed, controlled experiment. d. No, the subjects were not blind as to which treatment they received. It would be impossible to achieve blindness in this experiment because you cannot make people unaware that they are swimming with dolphins.
Activity 5-26: Cold Attitudes a. The explanatory variable is emotional state score (numerical score is quantitative; top third vs. bottom third is categorical). The response variable is whether the subjects catch a cold. b. This is an observational study because the researchers passively observed the emotional states of the subjects. They did not randomly assign the subjects to treatment groups (positive vs. negative attitudes). c. No, you cannot draw causal conclusions from an observational study. There could be confounding variables, such as lifestyle, that explain the association. Perhaps people with positive emotions tend to exercise more and eat healthier foods, which would help them ward off colds.
Activity 5-27: Friendly Observers a. This study is an experiment because the researcher randomly assigned the subjects to the two groups (only participants would win $3, and participants plus observers would win $3). b. The observational units are the subjects playing a video game. c. The explanatory variable is whether the observer was to share in the prize. The response variable is whether the threshold was beaten. d. This study makes use of blindness because the subjects were not told that there were two different groups or which group they were placed in.
Activity 5-28: Got a Tip? a. These are explanatory variables. b. Record the percentage tip per check (rather than looking at how the amounts vary across the different sizes of bills). c. Yes, she can conduct an experiment; she can randomly determine whether she introduces herself by name (or stands throughout). d. If she conducts this as an experiment she can control for confounding variables and draw cause-and-effect conclusions. e. On a customer-to-customer level, she could flip a coin to decide whether to introduce herself by name or to decide whether she stands or squats at the table. She could also flip a coin to decide whether she wears a flower in her hair, but she would probably want to do this on a shift-by-shift basis.
Assessment Sample Quiz 5A
•••
In a study published in the July 4, 2007, issue of the Journal of the American Medical Association, researchers investigated whether small doses of dark chocolate can reduce blood pressure for people who suffer from mild cases of high blood pressure. They recruited 44 German adults who were otherwise healthy except for mild cases of high blood pressure. These subjects were randomly assigned to either a dark chocolate group or a white chocolate group, and all subjects were instructed to eat one square portion of a chocolate bar (containing about 30 calories) every day for 18 weeks. They were
Topic 6: Two-Way Tables
•••
Homework Activities Activity 6-6: Lifetime Achievements a. The conditional distribution of preferred achievement for each gender follows:
                 Male     Female
Olympic Medal    .4583    .2963
Nobel Prize      .5000    .4444
Academy Award    .0417    .2593
b. The following segmented bar graph displays these conditional distributions:
[Segmented bar graph: Classmate Preferences for Lifetime Achievement. Vertical axis: Percentage (0 to 100); horizontal axis: Gender (Male, Female); segments: Academy Award, Nobel Prize, Olympic Medal.]
c. These data indicate that in this class males are much more likely than females to prefer an Olympic medal to an Academy Award. Only 4% of the males would like to win an Academy Award, whereas 26% of the females would. The males indicated a slight preference for Nobel prizes, but 46% of them would like to win an Olympic medal. Only 30% of the females would prefer an Olympic medal, but 44% of them would like a Nobel prize.
Activity 6-7: “Hella” Project a. This is an observational study. b. The explanatory variable is whether the student is from northern or southern California. The response variable is whether the student used “hella” in their everyday vocabulary. c. A two-way table of the responses is shown here:
                                   Northern Californians   Southern Californians   Total
Uses “Hella” Regularly             10                      3                       13
Does Not Use “Hella” Regularly     5                       22                      27
Total                              15                      25                      40
d. For southern Californians, 3/25 or .12. For northern Californians, 10/15 or .667. e. The following segmented bar graph displays these data:
[Segmented bar graph: “Hella” Project. Vertical axis: Percentage (0 to 100); horizontal axis: Region (Southern California, Northern California); segments: Yes, No.]
f. Yes, the data seem to support the students’ conjecture. Students in this sample from northern California were more than five times as likely (relative risk .667/.12 = 5.56) to use “hella” in their everyday vocabulary as the students from southern California.
Activity 6-8: Suitability for Politics In this sample, about 19% of the liberals (40/208), 23% of the moderates (68/293), and 31% of the conservatives (96/311) polled agreed with the statement on suitability for politics. The increase in percentage as the amount of conservatism increases makes sense from what is known about the general beliefs of liberals, moderates, and conservatives. Therefore, believing the sample is representative of adult Americans, you could say that the more conservative a person is, the more likely he or she is to agree with the statement. The following segmented bar graph displays these results:
[Segmented bar graph: 2004 General Social Survey: Suitability for Politics. Vertical axis: Percentage (0 to 100); horizontal axis: Political Affiliation (Liberal, Moderate, Conservative); segments: Disagree, Agree.]
In this sample, about 28% (109/385) of the males and 23% (103/449) of the females polled agreed with this statement on suitability for politics. Therefore, these data suggest that gender does not play a major role in influencing adult Americans’ decision regarding the statement. The following segmented bar graph displays these results:
[Segmented bar graph: 2004 General Social Survey: Suitability for Politics. Vertical axis: Percentage (0 to 100); horizontal axis: Gender (Male, Female); segments: Disagree, Agree.]
Activity 6-9: Suitability for Politics According to the data, in the 1970s about 47% (2398/5049) of those polled agreed with this statement on suitability for politics, and this percentage declined to about 36% (2563/7160) in the 1980s, about 23% in the 1990s (1909/8336), and stayed at just over 23% in the 2000s (802/3411). Therefore, based on these randomly selected samples, you have evidence that over time the population has tended to disagree more and more with this statement until the turn of the century, when opinion may have leveled off slightly. The following segmented bar graph displays these results:
[Segmented bar graph: 2004 General Social Survey: Suitability for Politics. Vertical axis: Percentage (0 to 100); horizontal axis: Decade (1970s, 1980s, 1990s, 2000s); segments: Disagree, Agree.]
Activity 6-10: A Nurse Accused a. Here is the 2 × 2 table:
                                   Shifts Gilbert Worked   Shifts Gilbert Didn’t Work
Number of Patients Who Died        40                      34
Number of Patients Who Survived    217                     1350
b. For the shifts that Gilbert worked, 40/257 = .156. For the shifts that Gilbert didn’t work, 34/1384 = .025.
c. The relative risk of a patient dying is .156/.025 or 6.24 (6.34 if using more than three decimal places for the proportions). d. The risk of dying was over six times greater during shifts on which Gilbert worked than it was during those shifts on which she didn’t work.
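If you want to check parts b through d with software, the following is a minimal Python sketch (not part of the original solutions). The helper name relative_risk is hypothetical; the counts come from the 2 × 2 table in part a.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Return (risk_a, risk_b, risk_a / risk_b) for two groups."""
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    return risk_a, risk_b, risk_a / risk_b

# Counts from the 2 x 2 table in part a.
worked_died, worked_survived = 40, 217
not_worked_died, not_worked_survived = 34, 1350

risk_worked, risk_not_worked, rr = relative_risk(
    worked_died, worked_died + worked_survived,
    not_worked_died, not_worked_died + not_worked_survived,
)

# Proportions rounded to three places, relative risk from unrounded proportions.
print(round(risk_worked, 3), round(risk_not_worked, 3), round(rr, 2))
```

Because the ratio here is taken before any rounding, it corresponds to the "more than three decimal places" value quoted in part c rather than to .156/.025.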
Activity 6-11: Children’s Television Advertisements a. This is a 5 × 3 table. b. The proportion of food advertisements on BET that were for fast food is 61/162 or .377. c. The proportion of fast-food advertisements that were on BET is 61/93 or .656. d. Here is the conditional distribution of the types of food commercials shown:
             BET     WB      Disney
Fast Food    .377    .386    .000
Drinks       .407    .108    .455
Snacks       .019    .000    .182
Cereal       .093    .193    .364
Candy        .105    .313    .000
e. The following segmented bar graph displays these conditional distributions:
[Segmented bar graph: Children’s TV Ads. Vertical axis: Percentage (0 to 100); horizontal axis: Network (BET, WB, Disney); segments: Candy, Cereal, Drinks, Fast Food, Snacks.]
f. The Disney channel showed no commercials for fast food or candy during this time period (in fact, they showed very few food advertisements at all). About 37% of the BET and WB food advertisements were for fast food and almost none of them were for snacks. The percentage of food advertisements for cereal on the WB network was double that of the BET network (19% vs. 9%) and for candy the percentage was tripled (31% vs. 10%). Assuming these data were randomly selected, it appears that the Disney channel shows fewer advertisements than the other networks and generally healthier ones.
Activity 6-12: Female Senators a. The proportion of senators who are women is 16/100 or .16. b. The proportion of senators who are Democrats is 49/100 or .49. c. No, it is not fair to say that most Democratic senators are women. Only 22% (11/49) of the Democratic senators are women—this is less than a quarter of the Democratic senators. d. Yes, it is fair to say that most of the female senators are Democrats because 11/16 or .688 of the females are Democrats.
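Parts c and d condition on different variables, which is why the two answers differ. As a rough illustration (a sketch added here, not part of the original solutions), the following Python snippet computes both conditional proportions from the counts quoted above; the variable names are hypothetical.

```python
# Counts stated in the answers above.
total_senators = 100
women = 16
democrats = 49
democratic_women = 11

# Marginal proportion of senators who are women (part a).
p_woman = women / total_senators

# Condition on party: share of Democratic senators who are women (part c).
p_woman_given_democrat = democratic_women / democrats

# Condition on gender: share of female senators who are Democrats (part d).
p_democrat_given_woman = democratic_women / women

print(f"P(woman)            = {p_woman:.3f}")
print(f"P(woman | Democrat) = {p_woman_given_democrat:.3f}")
print(f"P(Democrat | woman) = {p_democrat_given_woman:.3f}")
```

The same numerator (11) is divided by a different total in each case, so "most female senators are Democrats" can hold while "most Democratic senators are women" does not.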
Activity 6-13: Weighty Feelings a. The explanatory variable is gender. The response variable is feeling about one’s weight. b. Here is the marginal distribution for the variable feeling about one’s weight (the variable gender should be ignored):
Underweight    .066
About Right    .450
Overweight     .484
Total          1.000
The following bar graph displays the marginal distribution:
[Bar graph: NHANES Survey. Vertical axis: Proportion (0 to 1); horizontal axis: Feelings about Current Weight (Underweight, About Right, Overweight).]
c. Here is the conditional distribution of weight feelings for each gender:
               Female   Male
Underweight    .038     .096
About Right    .389     .515
Overweight     .573     .389
The following segmented bar graph displays the conditional distributions:
[Segmented bar graph: Feelings About Current Weight. Vertical axis: Percentage (0 to 100); horizontal axis: Gender (Female, Male); segments: Overweight, About Right, Underweight.]
d. The distributions of the two genders do appear to differ here. The men in the sample were 2.5 times more likely than the women to feel they are underweight and 1.3 times more likely to feel that their weight is about right. In contrast, the women were almost 1.5 times more likely than the men to feel that they are overweight.
Activity 6-14: Preventing Breast Cancer a. This is an experiment. You know because you are told that the researchers randomly assigned the subjects to the treatments (tamoxifen or raloxifene). b. The explanatory variable is drug assigned (tamoxifen or raloxifene). The response variable is whether the woman developed invasive breast cancer. c. Here is the 2 × 2 table:
                                 Tamoxifen   Raloxifene
Developed Breast Cancer          163         167
Did Not Develop Breast Cancer    9563        9578
d. For the tamoxifen group, 163/9726 = .0168. For the raloxifene group, 167/9745 = .0171. e. The following segmented bar graph displays these conditional proportions:
[Segmented bar graph: Treatment for Postmenopausal Women at Risk for Breast Cancer. Vertical axis: Percentage (0 to 100); horizontal axis: Drug (Tamoxifen, Raloxifene); segments: No cancer, Breast cancer.]
f. The relative risk of developing invasive breast cancer is .0171/.0168 or about 1.02 (1.0225 if using unrounded proportions). g. Because the relative risk is so close to 1.00, these two drugs are almost equally effective in preventing invasive breast cancer. Because this is an experiment, you can conclude that the drugs are roughly equally effective at preventing breast cancer in postmenopausal women (though you would like more information about how the women were selected for the study before you generalize this conclusion to the larger population).
Activity 6-15: Preventing Breast Cancer The analyses for the risk of developing blood clots in a major vein follow: a. Here is the two-way table:
                 Tamoxifen   Raloxifene   Total
Blood Clot       53          65           118
No Blood Clot    9673        9680         19353
Total            9726        9745         19471
b. Here are the conditional proportions of developing a blood clot for each drug:
                 Tamoxifen   Raloxifene
Blood Clot       .005        .007
No Blood Clot    .995        .993
c. The following segmented bar graph displays the conditional proportions:
[Segmented bar graph: Treatment for Postmenopausal Women at Risk for Breast Cancer. Vertical axis: Percentage (0 to 100); horizontal axis: Drug (Tamoxifen, Raloxifene); segments: Blood clot, No blood clot.]
d. The relative risk of developing a blood clot is .007/.005 or 1.4 (or 1.224 if using unrounded proportions). e. You can conclude that for postmenopausal women the risk of developing blood clots in a major vein is about 1.22 times greater for those on raloxifene compared to those on tamoxifen. You can conclude that this difference in risk can be
attributed to the drugs because this was a comparative, randomized experiment. However, you would need more information about how the women were selected for this study before generalizing to the larger population. The analyses for the risk of developing a blood clot in a lung follow: a. Here is the two-way table:
                 Tamoxifen   Raloxifene   Total
Blood Clot       54          35           89
No Blood Clot    9672        9710         19382
Total            9726        9745         19471
b. Here are the conditional proportions for the risk of developing a blood clot in a lung for each drug:
                 Tamoxifen   Raloxifene
Blood Clot       .006        .004
No Blood Clot    .994        .996
c. The following segmented bar graph displays the conditional proportions:
[Segmented bar graph: Treatment for Postmenopausal Women at Risk for Breast Cancer. Vertical axis: Percentage (0 to 100); horizontal axis: Drug (Tamoxifen, Raloxifene); segments: No blood clot in lung, Blood clot in lung.]
d. The relative risk of developing a blood clot in a lung is .006/.004 or 1.5 (or 1.546 if using unrounded proportions). e. You can conclude that for postmenopausal women the risk of developing blood clots in a lung is about 1.546 times greater for those taking tamoxifen. You can also conclude that this difference in risk can be attributed to the drugs because this was a comparative, randomized experiment. However, you would need more information about how the women were selected for this study before generalizing to the larger population.
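Both blood-clot analyses report two relative risks, one from proportions rounded to three decimal places and one from unrounded proportions. The sketch below (an illustration added here, with a hypothetical helper name) shows how much the rounding matters when the proportions are this small.

```python
def rr_rounded_and_exact(events_1, n_1, events_2, n_2, places=3):
    """Relative risk of group 1 versus group 2, from rounded and unrounded proportions."""
    p1, p2 = events_1 / n_1, events_2 / n_2
    rounded = round(p1, places) / round(p2, places)
    exact = p1 / p2
    return rounded, exact

# Group sizes from the two-way tables above.
n_tamoxifen, n_raloxifene = 9726, 9745

# Clots in a major vein: raloxifene (65) relative to tamoxifen (53).
vein_rounded, vein_exact = rr_rounded_and_exact(65, n_raloxifene, 53, n_tamoxifen)

# Clots in a lung: tamoxifen (54) relative to raloxifene (35).
lung_rounded, lung_exact = rr_rounded_and_exact(54, n_tamoxifen, 35, n_raloxifene)

print("vein:", round(vein_rounded, 2), round(vein_exact, 3))
print("lung:", round(lung_rounded, 2), round(lung_exact, 3))
```

With proportions near .005, rounding to three places keeps only one significant digit, so the ratio of rounded values can drift noticeably from the ratio of the raw proportions. This is why the interpretations in part e use the unrounded figures.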
Activity 6-16: Flu Vaccine a. The observational units are the workers at Children’s Hospital in Denver. The explanatory variable is whether they received the flu vaccine (binary categorical). The response variable is whether they developed flu-like symptoms (binary categorical). b. Here is the two-way table:
                     Flu Vaccine   No Vaccine   Total
Flu-like Symptoms    149           68           217
No Symptoms          851           334          1185
Total                1000          402          1402
c. Numerical and graphical results of this study follow:
                     Flu Vaccine   No Vaccine
Flu-like Symptoms    .149          .169
No Symptoms          .851          .831
[Segmented bar graph. Vertical axis: Percentage (0 to 100); horizontal axis: Flu Vaccine (Vaccine, No Vaccine); segments: No symptoms, Flu-like symptoms.]
About 15% of those receiving the flu vaccine developed flu-like symptoms compared to 17% of those not receiving the vaccine. The relative risk of developing the flu was only 1.135; this means that if one did not get a flu vaccine, her risk of developing flu-like symptoms was only 1.135 times greater than if she had received a vaccine. This indicates that vaccine and flu-like symptoms were almost independent variables in this sample. d. Even if you had observed a substantial increase in the risk, you would not have been able to attribute it to the vaccine because this was not a randomized experiment: The workers chose whether to receive a flu vaccine. You would have to randomly assign the vaccine and a placebo to a treatment and control group in order to draw cause-and-effect conclusions about the vaccine.
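The numerical summary in part c can be reproduced directly from the two-way table in part b. The following Python sketch is one way to do so (an illustration added here, not part of the original solutions); the dictionary layout and the function name conditional_distribution are assumptions made for this example.

```python
# Two-way table from part b: one inner dict of outcome counts per group.
counts = {
    "Flu Vaccine": {"Flu-like Symptoms": 149, "No Symptoms": 851},
    "No Vaccine": {"Flu-like Symptoms": 68, "No Symptoms": 334},
}

def conditional_distribution(group_counts):
    """Divide each outcome count by the group total."""
    total = sum(group_counts.values())
    return {outcome: n / total for outcome, n in group_counts.items()}

conditional = {group: conditional_distribution(c) for group, c in counts.items()}

# Relative risk of flu-like symptoms: no vaccine relative to vaccine.
rr = (conditional["No Vaccine"]["Flu-like Symptoms"]
      / conditional["Flu Vaccine"]["Flu-like Symptoms"])

for group, dist in conditional.items():
    print(group, {outcome: round(p, 3) for outcome, p in dist.items()})
print("relative risk:", round(rr, 3))
```

A relative risk this close to 1 is what near-identical conditional distributions look like numerically, which matches the "almost independent" reading in part c.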
Activity 6-17: Watching Films Answers will vary. The following is a representative example: a. Here is the 2 × 2 table:
                            I Saw Movie   I Did Not See Movie   Total
Friend Saw Movie            13            2                     15
Friend Did Not See Movie    4             11                    15
Total                       17            13                    30
b. The following segmented bar graph displays the results:
[Segmented bar graph. Vertical axis: Percentage (0 to 100); horizontal axis: I saw movie, I did not see movie; segments: Friend saw movie, Friend did not see movie.]
In this case, my friend and I seem to have very similar movie-watching habits. If I saw a movie, then there is about a 76% (13/17) chance that my friend also saw it, and if I did not see the movie, there is about an 85% (11/13) chance my friend did not see it.
Activity 6-18: Watching Films a. Here is a hypothetical 2 × 2 table (completely independent movie-watching habits):
                            I Saw Movie   I Did Not See Movie   Total
Friend Saw Movie            5             3                     8
Friend Did Not See Movie    15            9                     24
Total                       20            12                    32
My friend has a 25% (8/32) chance of seeing the movie regardless of whether I have seen it. b. Here is a hypothetical 2 × 2 table (very similar movie-watching habits):
                            I Saw Movie   I Did Not See Movie   Total
Friend Saw Movie            18            1                     19
Friend Did Not See Movie    2             11                    13
Total                       20            12                    32
For movies I have seen, my friend has a 90% (18/20) chance of seeing it, but only an 8% (1/12) chance of seeing a movie that I haven’t.
c. Here is a hypothetical 2 × 2 table (very dissimilar movie-watching habits):
                            I Saw Movie   I Did Not See Movie   Total
Friend Saw Movie            6             11                    17
Friend Did Not See Movie    12            1                     13
Total                       18            12                    30
My friend has seen 33% (6/18) of the movies I have seen, but 92% (11/12) of the ones I have not seen.
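The three hypothetical tables differ in how far apart the two conditional proportions are. As a rough check (a sketch added here, not part of the original solutions), the Python snippet below computes the proportion of movies my friend saw among those I saw and among those I did not see; the function name and the tuple layout are hypothetical.

```python
def friend_saw_rates(both, friend_only, me_only, neither):
    """Return (P(friend saw | I saw), P(friend saw | I did not see))."""
    given_i_saw = both / (both + me_only)
    given_i_did_not = friend_only / (friend_only + neither)
    return given_i_saw, given_i_did_not

# Cell counts in the order: both saw, only friend saw, only I saw, neither saw.
tables = {
    "independent": (5, 3, 15, 9),
    "similar": (18, 1, 2, 11),
    "dissimilar": (6, 11, 12, 1),
}

rates = {name: friend_saw_rates(*cells) for name, cells in tables.items()}

for name, (saw, did_not) in rates.items():
    print(f"{name}: {saw:.2f} if I saw it, {did_not:.2f} if I did not")
```

Equal proportions indicate independence (table a); a large gap in either direction indicates association, positive for table b and negative for table c.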
Activity 6-19: Botox for Back Pain a. Randomized means that the subjects were randomly divided into two groups: the group that received the botox injection and the group that received the placebo (saline injection). Double-blind means that neither the subjects nor the person evaluating their degree of pain relief knew which subjects were in which groups. b. Here is the two-way table:
                  Botox   Saline   Total
Pain Relief       9       2        11
No Pain Relief    6       14       20
Total             15      16       31
c. Here are the conditional proportions for patients:
                  Botox   Saline
Pain Relief       .6      .125
No Pain Relief    .4      .875
The following segmented bar graph displays the conditional proportions:
[Segmented bar graph: Treatment for Chronic Low-Back Pain. Vertical axis: Percentage (0 to 100); horizontal axis: Treatment Group (Botox, Saline); segments: No pain relief, Pain relief.]
d. The relative risk of pain relief is .6/.125 or 4.8; those receiving botox were 4.8 times more likely to experience substantial back-pain relief than those receiving saline. e. Because this was a comparative, randomized experiment, you can conclude that botox is responsible for increasing the rate of pain relief in patients with chronic low-back pain at the end of eight weeks. However, you should be cautious in generalizing the results from this study to the larger population of back-pain sufferers because you do not know how representative the sample is.
Activity 6-20: Botox for Back Pain a. Multiplying every entry by 10 gives these results:
                  Botox   Saline   Total
Pain Relief       90      20       110
No Pain Relief    60      140      200
Total             150     160      310
b. The segmented bar graph would not change. c. The relative risk of pain relief is .6/.125 or 4.8. The relative risk has not changed. d. For the botox group, 15/15 = 1.00. For the saline group, 8/16 = .500. The difference in proportions is 1.00 − .500 = .500. e. This (hypothetical) difference in proportions is similar to the actual difference in the study (.475). f. The (hypothetical) relative risk of pain relief is 1/.5 = 2. The relative risk is less than half what it was in the actual study, even though the difference in proportions is similar. g. The difference in proportions does not take into account the size of the proportions. Even a change of .5 seems more dramatic if one of the proportions is small, but less so if they are both large. The relative risk reflects this; it was greater in the actual study even though the difference in proportions was slightly smaller.
Activity 6-21: Gender-Stereotypical Toy Advertising a. Here are the marginal totals:
Column totals: Boy Shown 97; Girl Shown 86
Row totals: Traditional “Male” Toy 74; Traditional “Female” Toy 26; Neutral Gender Toy 83
b. The proportion of ads that depict boys with traditional male toys is 59/97 or .608. The proportion of ads that depict boys with traditional female toys is 2/97 or .021. The proportion of ads that depict boys with neutral toys is 36/97 or .371.
c. Here is the conditional distribution of toy types for ads showing girls:
                            Girl Shown
Traditional “Male” Toy      .174
Traditional “Female” Toy    .279
Neutral Gender Toy          .547
d. The following segmented bar graph displays the conditional distributions:
[Segmented bar graph: Toy Advertisements. Vertical axis: Percentage (0 to 100); horizontal axis: Boy Shown, Girl Shown; segments (Type of Toy): Neutral gender toy, Traditional “female” toy, Traditional “male” toy.]
e. The bar graph indicates that toy advertisers do seem to present pictures of boys with traditional male toys more than 60% of the time, but the same cannot be said for girls and traditional female toys. Girls tend to be shown with neutral gender toys (about 55% of the time).
Activity 6-22: Gender-Stereotypical Toy Advertising a. The proportion of ads that are crossover ads is (2 + 15)/(97 + 86) or .093. b. The proportion of crossover ads that depict girls with traditionally male toys is 15/17 or .882. c. The proportion of crossover ads that depict boys with traditionally female toys is 2/17 or .118. d. When toy advertisers defy gender stereotypes, they tend to do so by showing girls with traditional male toys. Among the crossover ads, 88% of them are this type.
Activity 6-23: Baldness and Heart Disease a. The proportion of men who identified themselves as having little or no baldness is (251 + 165 + 331 + 221)/(663 + 772) or .675. b. Of those who had heart disease, the proportion of men who claimed to have some, much, or extreme baldness is (195 + 50 + 2)/663 or .373. c. The proportion of men who were in the control group is (331 + 221)/968 or .570.
d. The following segmented bar graph compares the distributions of baldness ratings between subjects with heart disease and those from the control group:
[Segmented bar graph: Baldness and Heart Disease. Vertical axis: Percentage (0 to 100); horizontal axis: Baldness Rating (None, Little, Some, Much, Extreme); segments: Control, Heart disease.]
e. In this sample, the more baldness a man has, the more likely he is to suffer from heart disease. Although you cannot conclude a cause-and-effect relationship because this is an observational study, there does appear to be a moderately strong association. You also do not know what population this sample is representative of. f. The risk of heart disease (little or no baldness) is 416/968 or .430. The risk of heart disease (other group) is 247/467 or .529. The relative risk of heart disease is .529/.430 or 1.23.
Activity 6-24: Gender and Lung Cancer a. The explanatory variable is gender. The response variable is whether they were suffering from lung cancer. b. Here is a 2 × 2 table with marginal totals:
                  Men    Women   Total
Lung Cancer       10     19      29
No Lung Cancer    531    440     971
Total             541    459     1000
c. Here are the conditional distributions of lung cancer for each gender:
                  Men     Women
Lung Cancer       .018    .041
No Lung Cancer    .982    .959
d. The following segmented bar graph displays the conditional distributions:
[Segmented bar graph: Gender and Lung Cancer. Vertical axis: Percentage (0 to 100); horizontal axis: Gender (Men, Women); segments: No lung cancer, Lung cancer.]
e. The relative risk of having lung cancer is .041/.018 or 2.24 (using unrounded proportions; the rounded values give about 2.28). f. The women smokers (age 60 and older) in this study were 2.24 times more likely than the male smokers of the same age to develop lung cancer.
Activity 6-25: Hypothetical Hospital Recovery Rates a. The relative risk is .9/.8 or 1.125. The overall risk of dying is 1.125 times greater in Hospital A than it is in Hospital B. b. The relative risk is .983/.967 or 1.017. The risk of patients in fair condition dying is 1.017 times greater in Hospital B than it is in Hospital A. c. The relative risk is .525/.3 or 1.75. The risk of patients in poor condition dying is 1.75 times greater in Hospital B than it is in Hospital A.
Activity 6-26: Graduate Admissions Discrimination a. Here is the two-way table:
         Admitted   Denied   Total
Men      1195       1486     2681
Women    559        1276     1835
Total    1754       2762     4516
b. The proportion of men admitted is 1195/2681 = .446. The proportion of women admitted is 559/1835 = .305. Yes, these proportions seem to support the claim that men were given preferential treatment in admissions decisions. Men seem to have been admitted at a rate almost 15 percentage points higher.
c. The proportion of men and the proportion of women within each program who were admitted is shown in this table:
             Proportion of Men Admitted   Proportion of Women Admitted
Program A    .61                          .82
Program B    .62                          .68
Program C    .36                          .34
Program D    .33                          .35
Program E    .27                          .24
Program F    .05                          .07
d. No, it does not appear that any particular program is responsible for the large discrepancy between men and women in the overall proportions admitted. In each of these six programs, the proportion of women admitted was either greater than the proportion of men admitted or was so close that it could not have created the .15 overall difference in proportions. e. The men tended to apply more often to programs A and B, which had high acceptance rates, whereas the women tended to apply to programs C–F, which had generally much lower acceptance rates. Thus, although the women were generally being accepted at equal or higher rates in each program, they were not tending to apply to programs with high acceptance rates, so their overall acceptance rate was comparatively low.
Activity 6-27: Softball Batting Averages Many answers are possible. Here is one example:
                             June     July     Combined
Amy’s Hits                   80       121      201
Amy’s At-bats                100      400      500
Amy’s Proportion of Hits     .8000    .3025    .4020
Barb’s Hits                  319      30       349
Barb’s At-bats               400      100      500
Barb’s Proportion of Hits    .7975    .3000    .6980
The goal is to have higher batting averages in June (when Barb has most of her at-bats) and lower batting averages in July (when Amy has most of her at-bats).
Activity 6-28: Hypothetical Employee Retention Predictions a. Of those employees predicted to stay, the proportion who actually left is 12/75 or .16.
b. Of those employees predicted to leave, the proportion who actually left is 4/25 or .16. c. No, an employee predicted to stay is no less likely to leave than an employee predicted to leave. d. No, this standardized test does not provide much helpful information about whether an employee will leave or stay. e. Yes, in these data an employee’s action was independent of the test’s prediction because the conditional distributions were identical. f. The following segmented bar graph displays the conditional distributions:
[Segmented bar graph: Hypothetical Employee Retention Predictions. Vertical axis: Percentage (0 to 100); bars: Predicted to Stay, Predicted to Leave; segments: Actually left, Actually stayed]
Answers will vary for parts g and h. g. The following segmented bar graph shows one example, if the test were perfect in its predictions:
[Segmented bar graph: Hypothetical Employee Retention Predictions. Vertical axis: Percentage (0 to 100); bars: Predicted to Stay, Predicted to Leave; segments: Actually left, Actually stayed]
h. The following segmented bar graph shows one example, if the test were very useful but not quite perfect in its predictions:
[Segmented bar graph: Hypothetical Employee Retention Predictions. Vertical axis: Percentage (0 to 100); bars: Predicted to Stay, Predicted to Leave; segments: Actually left, Actually stayed]
Activity 6-29: Politics and Ice Cream
               Chocolate   Vanilla   Strawberry   Total
Democrat       108         96        36           240
Republican     81          72        27           180
Independent    36          32        12           80
Total          225         200       75           500
Notice that the conditional percentages are the same (48%, 36%, and 16%) for each flavor.
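As a quick check on this table, the conditional distribution of party within each flavor can be computed and compared across flavors. This is a minimal Python sketch using the counts above; the names are illustrative.

```python
from fractions import Fraction

flavors = ["Chocolate", "Vanilla", "Strawberry"]
# Rows of the two-way table: counts per flavor for each party
counts = {
    "Democrat": [108, 96, 36],
    "Republican": [81, 72, 27],
    "Independent": [36, 32, 12],
}

def conditional_distribution(flavor_index):
    """Proportion of each party among people who chose the given flavor."""
    column = {party: row[flavor_index] for party, row in counts.items()}
    total = sum(column.values())
    return {party: Fraction(n, total) for party, n in column.items()}

distributions = [conditional_distribution(i) for i in range(len(flavors))]

# The variables are independent in these data when every column has
# the same conditional distribution.
independent = all(d == distributions[0] for d in distributions)

for flavor, dist in zip(flavors, distributions):
    print(flavor, {party: float(p) for party, p in dist.items()})
print("Independent:", independent)
```

Exact fractions are used so that the comparison between columns is not affected by rounding.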
Activity 6-30: Your Choice Answers will vary.
Assessment Sample Quiz 6A
•••
In a study reported in the July 2007 issue of the Journal of Epidemiology and Community Health, researchers investigated whether veterans are more likely to commit suicide than nonveterans. They spent 12 years following 104,000 veterans who had served in the armed forces between 1917 and 1994, and compared them with 216,000 nonveterans. They found that 197 veterans and 311 nonveterans committed suicide. 1. Identify the explanatory and response variables in this study. 2. Is this an experiment or an observational study? Explain briefly. 3. Notice that more nonveterans than veterans committed suicide (311 vs. 197). Would you conclude that veterans are less likely to commit suicide than nonveterans? Explain.
Here is the completed table:
g. The first graph appears to have a uniform shape; the second is slightly skewed to the right. But neither is a legitimate histogram of the Olympic rowers’ weights because the variable (weight) is not on the horizontal axis; it is on the vertical axis! For example, it would be very misleading to say the “center” of the weights was around Dan Berry’s weight, because he actually has one of the greatest weight values. The rowers’ names are of less interest than the pattern in the weight values themselves.
Activity 7-6: Go Take a Hike! a. The stemplot is shown here.
0 | 68
1 | 00000555555588
2 | 00000000122555556668
3 | 000002224458
4 | 00000566
5 | 00055668
6 | 0000
7 | 004
8 |
9 | 5
leaf unit = .1 mile
b. The distribution of hike distances is sharply skewed to the right, indicating there are many hikes on the short side and only a few longer hikes. A typical hike is between 2 and 3 miles. Most hikes are between 1 and 6 miles, but two hikes are less than a mile and a few are more than 6 miles. The longest hike is 9.5 miles, which is a bit unusual and could be considered an outlier because this hike is more than two miles longer than the next longest hike (7.4 miles). Many hikes have a reported distance that is a whole number of miles or a multiple of half a mile.
Homework Activities Activity 7-7: Newspaper Data a. You would expect the shape of this distribution to be skewed to the left. Most of the deceased people would be rather old, but some of them would be unexpectedly young.
[Histogram: Age at Death (in years), 0 to 100, vs. Frequency]
Topic 7: Displaying and Describing Distributions
b. You would expect the shape of this distribution to be skewed to the right. Some house prices would be unusually high, and these prices would skew the graph to the right.
[Histogram: House Asking Price (in dollars), 200,000 to 1,000,000, vs. Frequency]
c. You would expect the shape of this distribution to be skewed to the right. Most newspapers will not print stock prices below $1, so they will be truncated on the left. There could also be some stocks with much greater values than the rest.
[Histogram: Final Stock Price (in dollars), 0 to 700, vs. Number of Stocks]
d. You would expect the shape of this distribution to be symmetric. There should be a central cluster, with temperatures tailing off equally to the left and to the right of center. Another possibility is that you would expect the shape of this distribution to be skewed to the right: Most major cities in the United States would have low temperatures, but there would be a few (L.A., Honolulu, and Miami, for example) that would have warmer temperatures.
[Histogram: High Temperatures in January (in °F), 20 to 70, vs. Number of Cities]
e. You would expect the shape of this distribution to be symmetric. There should be a central cluster, with temperatures tailing off equally to the left and to the right of center. Another possibility is that you would expect the shape of this distribution to be skewed to the left: Most major cities in the United States would have high temperatures, but there would be a few (Juneau, AK; Portland, OR; and Bangor, ME) that would have cooler temperatures.
[Histogram: High Temperatures in July (in °F), 70 to 105, vs. Number of Cities]
Activity 7-8: Student Data a. You would expect the shape of this distribution to be (probably) skewed to the right. Most students will purchase only a few items on any given visit to the grocery store, but there will be a few students with larger totals. b. You would expect the shape of this distribution to be uniform from 0 to 9 and therefore symmetric. c. You would expect the shape of this distribution to be skewed to the right. Most families have 0–2 children, and the frequency of families decreases as the number of children increases. d. You would expect the shape of this distribution to be skewed to the right. There may be students who have friends or family who cut their hair for free, so there will be a lot of $0 haircuts. There will probably also be many haircuts between $10–$20, but then the frequency will decrease with more expensive haircuts. e. Answers will vary. You would expect the shape of this distribution to be (probably) skewed to the right, as it must be truncated on the left at one or two pairs, with most people in the 10–20 pairs range, but then there would be a few people with a very large number of pairs of shoes. f. Answers will vary. You would expect the shape of this distribution to be (probably) skewed to the right; people will not have a value less than zero, but a few may have a very large number of shirts. g. Answers will vary by type of school. h. Answers will vary by type of school. i. Answers will vary by type of school and class.
Activity 7-9: British Monarchs’ Reigns a. No, this is not a legitimate histogram of the distributions of the lengths of reign for British monarchs. The variable (lengths of reign) should be on the horizontal axis (not the vertical axis), and the vertical axis should display the frequencies. b. No, the distribution is not symmetric. Recall the (correct) stemplot from Activity 7-3, which showed that the distribution is skewed to the right, with a minimum of 0 and a maximum of 63 years.
Activity 7-10: Honda Prices a. This dotplot represents the mileage variable. You would expect most of the Hondas to have low mileage, as people tend to try to sell their cars when they accumulate a certain amount of mileage. There will be a few cars with very high mileage (i.e., a few high outliers). b. This dotplot represents the price variable. Of these three variables, this one should be the closest to being symmetric, with some very low and some very high values. c. This dotplot represents the year variable. Most of the cars for sale will be simply a few years old, and the number of cars for sale should decrease with age. Thus, you would expect this distribution to be skewed to the left.
Activity 7-11: College Football Scores a. The following stemplot displays the data:
–1 | 7
–0 | 3
 0 | 347
 1 | 078
 2 | 0367
 3 | 1224566889
 4 | 289
leaf unit = one point
b. The distribution of the margins of victory for the top 25 teams for the first game of the 2006 college football season is skewed to the left, with a peak in the 30s. The center of the distribution is about 23 points and the scores vary from –17 points to 49 points. There were two games in which the margins were negative, indicating that top 25 teams were playing each other. Without these games, the margins range from 4 to 49 points. c. You would expect the margins to be much smaller, with many more negative values as the top 25 teams play each other and tougher opponents later in the season.
Activity 7-12: Hypothetical Exam Scores Many answers are possible. Here is an example:
Professor Cobb’s Class |    | Professor Moore’s Class
                  9772 |  5 | 8899
                   862 |  6 | 0111458
                   940 |  7 | 14456777899
              98755332 |  8 | 02
             998766410 |  9 |
                   000 | 10 |
leaf unit = one point
Activity 7-13: Hypothetical Commuting Times Many answers are possible. Here are example histograms that compare commuting times for two different routes (A and B):
[Histograms: Route A Times (in minutes) and Route B Times (in minutes), each on an axis from 0 to 32]
Activity 7-14: Go Take a Hike! Here is a graphical display for the variable elevation gain:
[Graphical display: Elevation Gain (in feet), 0 to 1500; vertical axis: Frequency]
The distribution of elevation gain of these 72 hikes is strongly skewed to the right, with most hikes gaining from 0 ft to about 1150 ft. The typical hike seems to rise about 300 ft, but there is a small cluster of hikes with gains as high as 850–1150 ft. There are two hikes that seem to be high outliers—one at 1300 ft and the other at 1650 ft. Here is a graphical display for the variable expected time to complete hike:
[Graphical display: Expected Time to Complete Hike (in minutes), 0 to 420]
The times expected to complete these 72 hikes display a striking granularity—almost all the times are in increments of 30 minutes. The vast majority of the hikes are expected to take 60 or 90 minutes to complete, and there are an equal number of each of these two lengths of hikes. There are roughly half as many hikes that are expected to take 30, 120, 150, or 180 minutes to complete, and there are two high outliers—one at 300 minutes and one at 420 minutes.
Activity 7-15: Memorizing Letters Answers will vary by class. Here is a representative example. a. The following stemplot displays the data:
JFK Group |   | JFKC Group
          | 0 | 2333
  9998666 | 0 | 56666899
        2 | 1 | 0344444
 98885555 | 1 | 55578
 41111111 | 2 | 04
       77 | 2 |
leaf unit = one letter
b. This stemplot indicates that the JFK group was generally able to memorize more letters than the JFKC group. There is relatively little overlap between the two distributions. The center of the JFK group is substantially higher than that of the JFKC group (about 17 letters versus 11 letters).
Activity 7-16: Placement Exam Scores a. The following dotplot displays the distribution of exam scores:
[Dotplot: Placement Exam Scores, axis 3 to 18]
The following histogram displays the distribution of exam scores:
[Histogram: Placement Exam Scores (0 to 18) vs. Frequency (0 to 35)]
b. The distribution of these exam scores is roughly symmetric, ranging from 1 to 19. The center and peak are both at 10, and this score was obtained with perhaps surprising frequency. (The histogram was created using 20 subintervals.)
Activity 7-17: Hypothetical Manufacturing Processes The center of the distribution of process A is about 11.5 cm, and this distribution does not have a great deal of variability in the rod diameters. In contrast, the rod diameters in process B display a great deal of variability, although their center is right on target at 12 cm. Process C is centered at about 11.7 cm and displays a moderate amount of variability (more than process A). Process D is best as is because its center is right on target (12 cm), and it displays the least variability in rod diameters (so it is also the most stable). Process B is the least stable because it displays the most variability, and process A produces rods with diameters that are generally farthest from the target value.
Activity 7-18: Hitchcock Films a. The following stemplot displays the distribution of the films’ running times:
 8 | 1
 9 |
10 | 1335888
11 | 136679
12 | 00068
13 | 026
leaf unit = one minute
b. The distribution of the running times of Hitchcock’s films is skewed to the right, with most of his films running from about 100 minutes to 135 minutes. The typical Hitchcock film runs for about 116 minutes. There is one outlier (Rope), which has a running time of only 81 minutes. c. This unusual film (Rope) is bound to be the outlier because of the short running time. All of the other films have similar running times, and any film that was unedited would necessarily be short.
Activity 7-19: Jurassic Park Dinosaur Heights a. Distribution 1 is roughly symmetric, with a single peak. Distribution 2 has greater variability, with three distinct peaks. b. Distribution 2, with its three peaks, is what you would expect from a controlled population introduced in three separate batches, whereas Distribution 1 is more consistent with the unimodal shape you would expect from a normal biological population. c. There are no outliers, and it is almost perfectly symmetric. A distribution occurring in nature would most likely not be so “regular” or perfectly shaped.
Activity 7-20: Turnpike Distances a. Here are the values that fall into each subinterval:
Miles    0.1–5    5.1–10    10.1–15    15.1–20    20.1–25    25.1–30    30.1–35    35.1–40
Tally    3        12        6          7          1          0          0          1
b. The following histogram displays the distribution of distances:
[Histogram: Distance (in miles), 0 to 40, vs. Number of Exits]
c. The distribution of distances between exits on the Pennsylvania Turnpike appears somewhat skewed to the right, with a peak between 5–10 miles and a center just slightly greater. The distances range from 0 to 25 miles, with an outlier between 35 and 40 miles. d. A value that splits the exit distances into two equal pieces is 10.25 miles (halfway between the 15th and 16th ordered exits). This value is not unique because any value between 9.8 and 10.7 miles would also split the distances into equal halves. e. She is very likely to make it because 28 of the 30 exits are no more than 20 miles apart. This is about .93 of the exits. f. Exactly half (15/30 = .5) of the exits are no more than 10 miles apart, so she is as likely to make it as not.
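The proportions in parts e and f can be recovered from the tallies in part a. Here is a small Python sketch of that arithmetic; the function name is illustrative, and the tallies are taken from the subinterval counts above.

```python
# Upper endpoint of each subinterval (in miles) and its tally from part a
upper_bounds = [5, 10, 15, 20, 25, 30, 35, 40]
tallies = [3, 12, 6, 7, 1, 0, 0, 1]

def proportion_at_most(limit):
    """Proportion of exit distances that are no more than `limit` miles.

    Only valid when `limit` is one of the subinterval endpoints.
    """
    total = sum(tallies)
    within = sum(t for ub, t in zip(upper_bounds, tallies) if ub <= limit)
    return within / total

p20 = proportion_at_most(20)  # part e: 28 of 30 exits
p10 = proportion_at_most(10)  # part f: 15 of 30 exits
print(round(p20, 2), p10)
```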
Activity 7-21: Exam Scores a. Three students received a score of 77 on the exam. b. Fifteen students scored 90 or greater. This proportion is 15/62 or .242. c. Ten students scored less than 70. This proportion is 10/62 or .161. d. The score that appeared most often is 90. e. The two values that no one obtained are 78 and 86.
Activity 7-22: Tennis Simulations a. The following stacked dotplot displays the distribution of points for all three scoring systems (each dot represents up to three observations):
[Stacked dotplot: Points (axis 3 to 18) for the Standard, No-ad, and Handicap scoring systems]
The distribution of points in the game is strongly skewed to the right, with a center around 6.5 points.
b. All games, except the 21 games that were finished in 5 points or less, were won with an even number of points. This outcome could be explained by the fact that most of these games were the result of a tie after 4 points. Because a player must win the game by 2 points, an even number of total points would then be required by the end of the game.
c. See the dotplot in part a. The distribution for the no-ad games is much less skewed than for the standard games. Its center is about 6 and its spread is much less—no game in this system ever exceeded 7 points. d. See the dotplot in part a. The distribution for the handicap system is skewed a little to the left and its center is clearly less than that of the other two scoring systems (about 4.9 points per game). Many of these games ended with a total of 3 points or less.
Activity 7-23: Blood Pressures
The following histogram displays systolic blood pressure measurements:
[Histogram: Systolic Blood Pressure (in mmHg), 100 to 200, vs. Frequency]
These 63 systolic blood pressures are strongly skewed to the right, varying from about 90 mmHg to about 185 mmHg, with a high outlier at 208 mmHg. There is a peak in the distribution near the typical systolic pressure of about 120 mmHg. The reading of 208 mmHg could be an outlier, as the next highest systolic pressure is 186 mmHg. The following histogram displays the diastolic blood pressure measurements:
[Histogram: Diastolic Blood Pressure (in mmHg), 40 to 100, vs. Frequency]
The following stacked dotplot compares systolic and diastolic blood pressure measurements:
[Stacked dotplot: Blood Pressure (in mmHg), axis 50 to 200, for Systolic and Diastolic]
The diastolic blood pressures are more symmetrically distributed, from a low of about 38 mmHg to a high of 100 mmHg. These pressures appear to be particularly heavily concentrated between 55 and 80 mmHg, with a typical pressure being about 67 mmHg. There are no obvious outliers in this distribution. Both distributions show an interesting granularity in that the pressures are all even values (although this is not clear from the histograms).
Assessment Sample Quiz 7A
•••
The following side-by-side stemplot displays the total number of points scored per Super Bowl football game for the first 41 Super Bowls (from 1967–2007), separated according to the first 20 games (1967–1986) and the next 21 games (1987–2007):
Topic 8: Measures of Center
customers when there are more than two choices. It could be the case that 100 customers chose chocolate, 90 chose vanilla, and 80 chose strawberry. The mode is chocolate, but 170/270 ≈ .63, so 63% prefer a flavor other than chocolate.
Activity 8-7: Readability of Cancer Pamphlets a. To calculate the mean, you need to know all of the actual data values, and you do not know the values for those patients with a reading level below grade 3 or above grade 12. b. There are 63 patients, so the median is the ordered value in position (63 + 1)/2, or the 32nd value. If you start counting from the low end, you find 6 patients read below grade 3, 10 patients at grade 3 or below, 14 patients at grade 4 or below, 17 patients at grade 5 or below, 20 patients at grade 6 or below, 22 patients at grade 7 or below, 28 patients at grade 8 or below, and 33 patients at grade 9 or below. The 32nd value is therefore at grade level 9, which is the median patient reading level. c. There are 30 pamphlets, so the median readability level is the average of the 15th and 16th pamphlets. Counting in a similar way, the 15th and 16th readability values are both located at grade level 9, so a pamphlet’s median readability level is grade 9. d. These medians are identical. e. No. The centers of the distributions (as measured by the medians) are well matched, but you need to look at both distributions in their entirety and consider all values. The problem is that many patients read at a level below that of the simplest pamphlet’s readability level. Seventeen patients read at a level below grade 6, which is the lowest readability level of a pamphlet. f. 17/63 ≈ .27, so 27% of the patients have a reading level below that of the simplest pamphlet.
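The counting argument in part b can be sketched in Python. The cumulative counts below are the ones listed in part b; the helper name is illustrative, and only the categories up to grade 9 are included because those are the only cumulative counts given.

```python
# (reading level, number of patients at or below that level), from part b
cumulative = [
    ("below 3", 6), ("3", 10), ("4", 14), ("5", 17),
    ("6", 20), ("7", 22), ("8", 28), ("9", 33),
]
n = 63
position = (n + 1) // 2  # location of the median in the ordered list

def category_at(pos):
    """Return the first level whose cumulative count reaches `pos`."""
    for level, count in cumulative:
        if pos <= count:
            return level
    return None  # pos lies beyond the cumulative counts listed here

median_level = category_at(position)
proportion_below_grade_6 = 17 / n  # part f
print(position, median_level, round(proportion_below_grade_6, 2))
```

Notice that the median can be located this way even though the mean cannot be calculated, because the median depends only on the ordering of the values.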
•••
Homework Activities Activity 8-8: Human Ages a. Answers will vary. The mean is most likely greater than the median, because there are probably more humans alive in the world who are young. Thus, the distribution of ages is skewed to the right and the mean will be greater than the median. b. Answers will vary. Because of the relatively good healthcare in the United States compared to many other countries, the mean age of Americans is likely to be greater than the mean age of all human beings. c. Answers will vary. The 2000 U.S. census gives the mean U.S. age as 36.6 years and the median as 35.9 years. The median age for all human beings (2007) is 28 years. Note: See http://www.un.org/esa/population/publications/WPA2007/ES-English.pdf or https://www.cia.gov/library/publications/the-world-factbook/fields/2177.html, which includes a country-by-country list.
Activity 8-9: Sampling Words a. Answers will vary by student expectation. b. The mean is 4.29. You calculate (7 + 2 × 29 + 3 × 29 + 4 × 59 + 5 × 34 + . . . + 9 × 10 + 10 × 4 + 11 × 3)/268. The median is 4 letters. You determine the median by finding the 134th word length in the list. c. You are calculating parameters because this is the entire population of words in the Gettysburg Address. d. The mean cannot possibly be 24.36 because no word has more than 11 letters. e. This hypothetical student added the frequencies (instead of the lengths) and divided by 11 (268/11 or 24.36).
Activity 8-10: House Prices Answers will vary by student expectation. The mean house price should be greater than the median because house prices are generally strongly skewed to the right.
Activity 8-11: Supreme Court Justices a. The mean is 14 years of service and the median is 15 years. b. 5/9 or .556 of the justices have served longer than 14 years. c. The mean would become 16.22 years. The median would not change. d. The mean and median would each increase by 5 years (to 19 and 20 years, respectively).
Activity 8-12: Planetary Measurements a. The median value of the variable distance is 484 million miles. The median value of the variable diameter is 7926 miles. The median value of the variable period is 4332 days. b. The most likely cause of her mistake is failing to order the list of diameters before she finds the “middle” number. c. Answers will vary by expectation. The mean should be greater than the median because the data are so strongly skewed to the right. d. Answers will vary by student prediction. Students should predict relatively little change in the median because its calculation does not directly involve the value of the most extreme planet. e. The median value of the variable distance is 313 million miles. The median value of the variable diameter is 19,350 miles. The median value of the variable period is 2509.5 days. f. Answers will vary by student expectation. The means should be more affected than the medians were because their values do involve Pluto directly. g. With Pluto: The mean value of the variable distance is 1102 million miles. The mean value of the variable diameter is 27,821 miles. The mean value of the variable period is 21,977 days. Without Pluto: The mean value of the variable distance is 783 million miles. The mean value of the variable diameter is 31,120 miles. The mean value of the variable period is 13,416 days.
Activity 8-13: Memorizing Letters Answers will vary. Here is one representative set of answers. a. For the JFK treatment group, the mean is 16 letters and the median is 18 letters. For the JFKC treatment group, the mean is 10.88 letters and the median is 11.5 letters. b. These are statistics because they come from a sample. c. The centers indicate that the JFK group was able to memorize more letters successfully, about six more letters on average, than the JFKC group. Yes, the difference appears to be substantial. d. The grouping of letters does appear to affect memory performance. Because this was a randomized comparative experiment, you can conclude that it was the grouping that caused the increase in performance.
Activity 8-14: Sporting Examples a. For the regular section, the mean is 338.07 and the median is 341.0. For the sports-themed section, the mean is 307.25 and the median is 309.0. b. Students in the regular section out-performed students in the sports-themed section by about 30 points on average. c. No, you could not legitimately conclude that the sports examples caused the lower performance because this is an observational study, not a comparative, randomized experiment. There may be confounding variables such as the time of the class that explain the difference observed between the two sections.
Activity 8-15: Population Growth a. Using all digits (Minitab): For the western states, the median is 13.3%. For the eastern states, the median is 9.65%. Truncating digits: For the western states, the median is 12.5%. For the eastern states, the median is 9%. These medians indicate that the population in the western states is growing, on average, about three percentage points faster than in the eastern states. b. Because both distributions are somewhat skewed to the right, you should expect the means to be greater than the medians. c. You should expect the mean to decrease significantly if you remove the very high outlier, Nevada, from the analysis because the mean is not resistant to outliers.
Activity 8-16: Zero Examples a. Yes, if the sum of the values in a dataset equals zero, then the mean must equal zero because the mean is the sum divided by the number of observations.
b. No, it is not necessarily the case that the median will be 0 when the sum of the values in a dataset is 0. For example: {–18, 2, 2, 2, 2, 2, 2, 2, 2, 2}. The sum of these numbers is 0, but the median is 2.
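You can confirm this counterexample with Python’s standard `statistics` module. The dataset is the one given in part b, with one value of –18 and nine values of 2.

```python
from statistics import mean, median

# One value of -18 and nine values of 2, as in the example for part b
values = [-18] + [2] * 9

total = sum(values)
print(total, mean(values), median(values))
```

The mean is pulled all the way to 0 by the single extreme value, whereas the median stays at 2.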
Activity 8-17: Marriage Ages a. The following side-by-side stemplot displays the distribution of ages between husbands and wives:
      Wives |   | Husbands
          6 | 1 | 9
87765443322 | 2 | 3355556699
     966320 | 3 | 0111458
        754 | 4 |
          0 | 5 | 144
          0 | 6 | 02
          3 | 7 | 1
leaf unit = one year
b. Both distributions are skewed to the right with most wives and husbands getting married young (in their 20s and 30s). There are two distinct clusters of ages for the husbands, one from 19 to 38 years and the other from 51 to 71 years, with an overall center in the mid-30s. The wives varied in age from 16 to 73 and were centered a bit younger in their early to mid-30s. c. For husbands, the mean is 35.71 years and the median is 30.5 years. For wives, the mean is 33.83 years and the median is 29 years. d. The median of age differences is 1.0 years. e. No, these two values are not equal (1.0 ≠ 1.5). f. Yes, the mean of the differences is equal to the difference in mean ages (both are 1.875 years).
Activity 8-18: Quiz Percentages a. Answers will vary (estimations). b. The mean is 88.91% and the median is 95%. c. Thirty-five of the 47 students scored more than the mean. This proportion is .745. This proportion is large because the data are so strongly skewed by the one low outlier, which pulls down the value of the mean. d. Without the outlier, the mean is 90.48% and the median is 95.25%. The median is virtually unchanged, but the mean has increased and is much closer to the median now.
Activity 8-19: February Temperatures a. For Lincoln, the mean is 43.96°F and the median is 42.5°F. For San Luis Obispo, the mean is 67.75°F and the median is 67°F. For Sedona, the mean is 59.61°F and the median is 62°F. b. For Lincoln, the mean is 6.64°C and the median is 5.83°C. For San Luis Obispo, the mean is 19.86°C and the median is 19.44°C. For Sedona, the mean is 15.34°C and the median is 16.67°C. c. You calculate Celsius mean/median = (Fahrenheit mean/median – 32) × 5/9.
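The conversion in part c works because the mean and median both follow a linear change of units. The Python sketch below applies the conversion to the Lincoln summaries reported above and then checks the same property on a small set of temperatures. Note that `sample_f` is a hypothetical list made up for illustration; it is not the activity’s data.

```python
from statistics import mean, median

def to_celsius(fahrenheit):
    """Convert a Fahrenheit temperature to Celsius."""
    return (fahrenheit - 32) * 5 / 9

# Converting the reported Lincoln summaries from part a
lincoln_mean_c = round(to_celsius(43.96), 2)
lincoln_median_c = round(to_celsius(42.5), 2)
print(lincoln_mean_c, lincoln_median_c)

# Hypothetical temperatures (assumed values, for illustration only)
sample_f = [38, 41, 42, 43, 47, 52]
sample_c = [to_celsius(t) for t in sample_f]

# Converting each value and then summarizing gives the same result
# as summarizing first and then converting.
mean_gap = abs(mean(sample_c) - to_celsius(mean(sample_f)))
median_gap = abs(median(sample_c) - to_celsius(median(sample_f)))
print(mean_gap < 1e-9, median_gap < 1e-9)
```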
Activity 8-20: Age Guesses a. Answers will vary. Here is one representative set of answers.
[Dotplot: Age Guesses, axis 32 to 56]
b. The distribution of guesses of the statistics instructor’s age is roughly symmetric, save for one high outlier of 55 years. The (non-outlying) guesses range from a low of about 30 years to a high of 45 years with a typical guess of about 37 years. c. Fifteen (15/71 ≈ .211) of the guesses are greater than the correct age, 1 guess (1/71 ≈ .014) is correct, and 55 (55/71 ≈ .775) of the guesses are less than the correct value. This indicates that most of the students tended to underestimate the correct age of their instructor. (A wise decision on the part of these students!) d. Student expectation. In this example, the students should expect the mean to be greater than the median because of the high outlier. e. The mean is 37.521 years and the median is 37 years. As predicted, the mean is a little greater than the median. f. There was one obvious outlier (55). When removed, the mean becomes 37.271 years and the median remains 37 years. The mean changes only a little because the sample size is large (71). g. The mean and median both underestimate the actual age of the instructor (41) by about 5 years.
Activity 8-21: Tennis Simulations a. For the standard scoring system, the mean is 6.81 points and the median is 6 points. For the no-ad scoring system, the mean is 5.84 points and the median is 6 points. For the handicap scoring system, the mean is 4.79 points and the median is 5 points. b. The standard system is skewed to the right, so its mean is greater than its median and is almost a full point greater than the mean of the symmetric no-ad system. Both of these systems have centers more than a full point greater than the handicap system.
Activity 8-22: Hypothetical Exam Scores Many answers are possible. Some hints: a. You need one extremely low outlier. For example, with the dataset {10, 95, 95, 95, 95, 95, 95, 95, 95, 95}, the mean is 86.5.
[Dotplot (a): Hypothetical Exam Scores, axis 0 to 100]
b. You need some large values (to increase the mean), but then at least 60% of the values occurring at much smaller values. For example, with the dataset {0, 1, 2, 3, 4, 5, 90, 90, 100, 100}, the mean is 39.5 and the median is 4.5.
[Dotplot (b): Hypothetical Exam Scores, axis 0 to 100]
c. You need to focus on extreme values and include some skewness. For example, with the dataset {0, 5, 10, 10, 50, 50, 50, 80, 80, 80}, the mean is 41.5 and the median is 50.
[Dotplot (c): Hypothetical Exam Scores, axis 0 to 100]
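The means and medians quoted for the three example datasets in Activity 8-22 can be checked with a few lines of Python. The datasets are copied from parts a, b, and c; the dictionary and variable names are illustrative.

```python
from statistics import mean, median

# Example datasets from parts a, b, and c of Activity 8-22
datasets = {
    "a": [10] + [95] * 9,
    "b": [0, 1, 2, 3, 4, 5, 90, 90, 100, 100],
    "c": [0, 5, 10, 10, 50, 50, 50, 80, 80, 80],
}

# Map each part to its (mean, median) pair
summary = {part: (mean(v), median(v)) for part, v in datasets.items()}

for part, (m, med) in summary.items():
    print(part, "mean:", m, "median:", med)
```

Trying your own dataset in place of one of these is a quick way to see whether it meets the conditions of each part.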
Activity 8-23: Hypothetical Exam Scores a. No, you do not have enough information to determine the mean exam score for the two sections combined. There may be more students in one section than the other. b. The overall mean must be between 60 and 90. c. You would need to know how many students are in each section. d. The overall mean exam score is [(20 × 60) + (30 × 90)]/50 or 78. The overall mean is closer to 90 because there are more students in Section 2 than in Section 1. e. Suppose there were 30 students in Section 1 and 2 students in Section 2. Then the overall mean exam score is [(30 × 60) + (2 × 90)]/32 or 61.875. f. Yes, you can determine the overall mean if you know the same number of students are in each section. If there are n students in each section, then the overall mean is [(n × 60) + (n × 90)]/2n or 75. g. If a student’s score was greater than the mean of Section 1 (60) but less than the mean of Section 2 (90), then when that student transferred from Section 1 to Section 2, the mean would be lower in Section 1 and also lower in Section 2.
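The overall mean in parts d through f is a weighted mean of the section means, with the section sizes as the weights. A minimal Python sketch follows. The function name is illustrative, and the choice of 25 students per section for part f is just one assumed value of n.

```python
def combined_mean(sections):
    """Overall mean from a list of (number_of_students, section_mean) pairs."""
    total_students = sum(n for n, _ in sections)
    total_points = sum(n * m for n, m in sections)
    return total_points / total_students

part_d = combined_mean([(20, 60), (30, 90)])
part_e = combined_mean([(30, 60), (2, 90)])
part_f = combined_mean([(25, 60), (25, 90)])  # any equal n gives the same answer
print(part_d, part_e, part_f)
```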
Activity 8-24: Class Sizes a. The observational units are the sections of an introductory statistics course. The variable is the number of students enrolled in the section. b. The mean enrollment size per section is (200 + 35 + 35 + 20 + 10)/5 or 60 students per section. c. The mean of the variable number of students in the student’s class is (200(200) + 35(35) + 35(35) + 20(20) + 10(10))/300 or 143.17 students.
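The two means in parts b and c differ only in the observational units. The Python sketch below computes both from the five section sizes; the variable names are illustrative.

```python
# Enrollment in each of the five sections
sizes = [200, 35, 35, 20, 10]

# Part b: sections are the observational units
mean_per_section = sum(sizes) / len(sizes)

# Part c: students are the observational units, so each section's size
# is counted once for every student enrolled in that section
mean_per_student = sum(s * s for s in sizes) / sum(sizes)

print(mean_per_section, round(mean_per_student, 2))
```

The per-student mean is much larger because 200 of the 300 students are in the largest section.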
Activity 8-25: Sports Averages a. The observational units are football players, baseball games for a specific team, golfer, hockey team, basketball player, tennis player. b. Average yards per carry in football: {3, 4, 2, 0, 14, 7, 2, 5, 4, 3}. The average is 3.2 yards per carry. Average runs per game in baseball (Atlanta Braves games 5/17/07–5/27/07): {3, 3, 14, 3, 8, 0, 2, 3, 4, 6}. The average is 4.6 runs per game. Average driving distance in golf: {325, 301, 294, 276, 311, 330, 295, 288, 277, 297}. The average is 299.4 yards per drive. Average goals scored by opponent in hockey (Carolina Hurricanes games 3/17/07–4/7/07): {2, 3, 4, 6, 5, 4, 3, 3, 4, 4}. The average is 3.8 goals per game. Average points per game in basketball: {21, 7, 12, 15, 8, 19, 24, 10, 14, 15}. The average is 14.5 points per game. Average speed of a serve in tennis: {120, 118, 115, 95, 121, 119, 105, 120, 107, 115}. The average is 113.5 mph. c. Answers and justifications will vary but should focus on the expected shape and/or presence of outliers in the distribution.
• Average yards per carry in football: You would expect the median to be less than the mean.
• Average runs per game in baseball: You would expect the median to be less than the mean.
• Average driving distance in golf: You would expect the median to be less than the mean.
• Average goals scored by opponent in hockey: You would expect the median to be greater than the mean.
• Average points per game in basketball: You would expect the median to be the same as the mean.
• Average speed of a serve in tennis: You would expect the median to be greater than the mean.
Activity 8-26: Salary Expectations a. Answers will vary by class. b. Answers will vary by class. c. The mean should be much more affected than the median because this value would be an outlier; the median is resistant to outliers, whereas the mean is not. d. This would raise the mean and median values by $5000. e. Answers will vary by class.
Activity 8-27: Readability of Cancer Pamphlets Here are graphical displays of both distributions:
[Dotplots: Reading Level (axis from 2 to 16) for the Patient and Pamphlet groups]
The dotplots are much more useful than only looking at the medians for comparing the two distributions. From the dotplots, you can easily see that the readability levels of the pamphlets are not well matched to the patients’ reading levels. Many patients are not able to read any of the pamphlets. Most patients won’t be able to read a large fraction of the pamphlets.
Assessment Sample Quiz 8A
•••
Number of Close Friends            0     1     2     3     4     5     6   Total
Number of Respondents (male)     196   135   108   100    42    40    33     654
Number of Respondents (female)   201   146   155   132    86    56    37     813
1. Is number of close friends a quantitative or categorical variable? 2. Are these distributions roughly symmetric, skewed to the left, or skewed to the right? Explain briefly. (You do not need to construct any graphs.) 3. Calculate the median number of close friends for each gender. 4. Based on the shape of the distributions, do you expect the means to be greater than the medians, less than the medians, or very close to the medians? (Do not calculate either mean.) 5. For each gender, calculate the proportion who say that they have no close friends. Comment on how these proportions compare between men and women.
Solution to Sample Quiz 8A 1. This is a quantitative variable.
•••
2. These distributions are skewed to the right. The frequencies are greatest at 0 and then roughly decrease as the number of close friends increases.
The following table reports counts of the number of “close friends” reported by a sample of men and a sample of women:
Topic 9: Measures of Spread
f. To be within one standard deviation of the mean is to be within 1.9 ± 4.8 years, which means between –2.9 and 6.7 years. Seventeen of the age differences fall within this interval, which is a proportion of 17/24 or .708, or 70.8%. This percentage is quite close to 68%, which is what the empirical rule predicts. Because the distribution of the age differences does look fairly symmetric and mound-shaped, this outcome is not surprising. g. The mean and median indicate that, on average, people marry someone within a couple years of their own age. More importantly, the measures of spread are fairly small for the differences, much smaller than for individual ages. This result suggests that there is not much variability in the differences, which suggests that people do tend to marry people of similar ages. h. The differences have less variability because even though people get married from their teens to seventies (and beyond), they tend to marry people within a few years of their own age.
•••
Homework Activities Activity 9-7: February Temperatures a. Answers will vary by student predictions. b. For Lincoln, the standard deviation is 15.9°F. For San Luis Obispo, the standard deviation is 9.8°F. For Sedona, the standard deviation is 6.7°F.
Activity 9-8: Social Acquaintances a. Answers will vary by class. Here are some example answers.
 1 | 1134579
 2 | 1345568
 3 | 56778
 4 | 00248
 5 | 012468
 6 | 59
 7 | 6
 8 |
 9 |
10 | 5
11 |
12 | 4
leaf unit = one person
b. The median is 37 people; the upper quartile is 52; the lower quartile is 23; and the interquartile range is 29 people. c. The mean is 40.83 people and the standard deviation is 25.25 people. d. You calculate 40.83 ± 25.25 = [15.58, 66.08], so 26/35 or 0.743 of the students’ results fall within one standard deviation of the mean.
e. This proportion is more than what the empirical rule predicts, but is reasonably close to it (68%). f. Yes, these class results are consistent with Gladwell’s findings demonstrating considerable variability. There are several values less than 20 and one as high as 124.
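The summary values in parts b through d can be checked with a short Python sketch. The data list below is read from the example stemplot in part a (35 students); the variable names are illustrative. Note that quartile conventions differ between software packages; for these data, Python's default "exclusive" method happens to agree with the values reported above.

```python
import statistics

# Example class data read from the stemplot in part a (35 students).
acquaintances = [
    11, 11, 13, 14, 15, 17, 19,
    21, 23, 24, 25, 25, 26, 28,
    35, 36, 37, 37, 38,
    40, 40, 42, 44, 48,
    50, 51, 52, 54, 56, 58,
    65, 69, 76, 105, 124,
]

# Quartiles; for n = 35 the default "exclusive" method lands exactly on
# the 9th, 18th, and 27th ordered values.
q_lower, median, q_upper = statistics.quantiles(acquaintances, n=4)
iqr = q_upper - q_lower

mean = statistics.mean(acquaintances)
sd = statistics.stdev(acquaintances)  # sample SD (divides by n - 1)

# Part d: values within one standard deviation of the mean.
within_one_sd = [x for x in acquaintances if mean - sd <= x <= mean + sd]
proportion = len(within_one_sd) / len(acquaintances)

print(q_lower, median, q_upper, iqr)
print(round(mean, 2), round(sd, 2))
print(len(within_one_sd), round(proportion, 3))
```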
Activity 9-9: Social Acquaintances Answers will vary by class. Here are some example answers: The distributions of data collected from both of these classes are very similar. The mean number of acquaintances for the Cal Poly class is 36.1 people, whereas for this class it is 40.8 people. Both classes have minimums below 20 people (6 and 11, respectively) and high outliers above 100 people. The standard deviation for both classes is 25.25 people and the IQR for the Cal Poly class is 27 people, whereas for this class it is 29 people. The following boxplots display these results:
[Figure: Number of Acquaintances (axis from 0 to 140) for the Cal Poly class and our class, with outliers marked by asterisks]
Both distributions appear roughly symmetric apart from the outliers (histograms or dotplots would be better graphs to use to examine shape).
Activity 9-10: Hypothetical Quiz Scores
a. From smallest standard deviation to largest standard deviation: Student A, Student C, Student D, Student B. Student A has all her values equal to the mean, and Student C has a tight cluster around 8 points. Student D has a similar range, but not much consistency in responses right at the mean. Finally, Student B’s values are as far from the mean as possible.
b. The standard deviation of the quiz scores for student A is zero. This is because all her scores were the same value (8): the mean. There was no deviation from the mean, so the average deviation from the mean is zero.
c. Student C’s mean is 5; each deviation is 5 and each squared deviation is 25. There are 16 squared deviations, so the sum of the squared deviations is 16 × 25 = 400. The variance is, therefore, 400/15 or 26.67, and the standard deviation is √26.67 ≈ 5.164.
Activity 9-11: Baby Weights a. The z-score for Benjamin’s weight is z = (13.9 − 12.5)/1.5 ≈ 0.93. At age three months, Baby Ben was not quite one standard deviation above the average weight. b. You calculate 0.93 = (x − 17.25)/2, so x = 19.11 lbs. If Ben weighs 19.11 lbs at 6 months, he would again be 0.93 standard deviations above the mean weight at that age.
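The two calculations in Activity 9-11 are the z-score formula and its inverse. Here is a minimal Python sketch using the means and standard deviations quoted in the solution; the function names are hypothetical and chosen only for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


def value_from_z(z, mean, sd):
    """Invert the z-score formula: the value that sits z SDs above the mean."""
    return mean + z * sd


# Part a: Ben's weight at three months (mean 12.5 lbs, SD 1.5 lbs).
z_three_months = round(z_score(13.9, 12.5, 1.5), 2)

# Part b: the six-month weight with the same z-score (mean 17.25 lbs, SD 2 lbs).
weight_six_months = value_from_z(z_three_months, 17.25, 2)

print(z_three_months)               # 0.93
print(round(weight_six_months, 2))  # 19.11
```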
Activity 9-12: Student and Faculty Ages a. Answers will vary by school, but most likely the teachers’ ages are more variable than the students’ ages. The students’ ages probably range (generally) from 14–19 or 18–25 years, whereas the teachers’ ages could range from 24–70 years. b. Answers will vary by school, but at many schools a reasonable guess would be between 1 and 2 years. If the ages range from 18–25 years, then it makes sense that roughly 2/3 of the observations would be between 20–22 years.
Activity 9-13: Baseball Lineups a. Answers will vary by student expectation. Students should consider issues of resistance. b. The Yankee mean and median ages have increased by 2 years to 31.7 and 33.5 years, respectively. c. Answers will vary by student expectation. d. The IQR and standard deviation did not change. e. Answers will vary by student expectation. f. The mean is now 60 years and the median is now 59 years; the IQR is 8.5 years and the standard deviation is 5.50 years. All of these values have doubled.
Activity 9-14: Pregnancy Durations a. Approximately 68% of human pregnancies will last between 250 and 282 days. b. Approximately 95% of human pregnancies will last between 234 and 298 days. This is roughly 7.8–9.93 months. c. A horse is more likely to have a pregnancy that lasts within 6 days of its mean. In fact, 95% of all horse pregnancies will last within 6 days of the mean (366 days) because the standard deviation for horse pregnancies is 3 days.
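The intervals in Activity 9-14 follow directly from the empirical rule. The sketch below assumes a human mean of 266 days and standard deviation of 16 days; these values are not stated in the solution but are inferred from the intervals it reports (250 to 282 and 234 to 298). The horse values (mean 366 days, SD 3 days) are taken from part c. The function name is hypothetical.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals expected to contain about 68%, 95%, and 99.7% of a
    symmetric, mound-shaped distribution."""
    return {
        "68%": (mean - sd, mean + sd),
        "95%": (mean - 2 * sd, mean + 2 * sd),
        "99.7%": (mean - 3 * sd, mean + 3 * sd),
    }


# Assumed values: inferred from the intervals reported in parts a and b.
human = empirical_rule_intervals(266, 16)
# Values quoted in part c.
horse = empirical_rule_intervals(366, 3)

print(human["68%"])  # (250, 282)
print(human["95%"])  # (234, 298)
print(horse["95%"])  # (360, 372)
```

Because the horse standard deviation is so much smaller, an interval of only 6 days on either side of the mean already covers about 95% of horse pregnancies.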
Activity 9-15: Sampling Words a. Answers will vary. The following data are from one particular running of the applet:
The mean is 4.29 words and the standard deviation is 0.95 words.
b. The mean is 4.31 words and the standard deviation is 0.46 words. c. The standard deviation was roughly cut in half. d. Yes, for the samples based on a sample of size 20, the empirical rule should hold fairly closely because the sampling distribution is approximately symmetric and mound-shaped.
Activity 9-16: Tennis Simulations a. Based on the frequency tables, the no-ad system appears to have the least variability in game lengths and the standard system appears to have the most variability in game lengths. b. For the standard scoring system, the IQR is 8 − 5 = 3 points. For the no-ad scoring system, the IQR is 7 − 5 = 2 points, and for the handicap scoring system, the IQR is 6 − 4 = 2 points. c. For the standard scoring system, the SD is 2.74 points. For the no-ad scoring system, the SD is 1.022 points. For the handicap scoring system, the SD is 1.458 points. d. Yes, both the IQR and standard deviation suggest that the standard system has the most variability and that the no-ad system has the least variability.
Activity 9-17: Baseball Lineups The stacked dotplot displays these results:
[Stacked dotplots: 2006 Salaries (in millions of dollars), axis from 0.0 to 24.5, for the Yankees and the Tigers]
The Yankees’ salaries are generally much higher than those of the Tigers, and they also exhibit much more variability. The Yankees have a mean salary of $10.96 million and an even greater median salary of $12.5 million! Their salaries range from a low of $.3 million to a high of $25.6 million and have a standard deviation of $9.45 million and an IQR of more than $20 million. In contrast, the mean Tiger salary is only $4.14 million and the median salary is a lowly $2.90 million, with salaries ranging from $.3 million to a high of only $10.6 million (half of the Yankees make more). The standard deviation of the Tigers’ salaries is $3.74 million and their IQR is $7.76 million, reflecting the smaller variability in this distribution.
Activity 9-18: Population Growth a. The western states have more variability in population growth percentages as the values are not as tightly clustered, and there is also the extreme outlier (Nevada). b. For the eastern states, the IQR is 14.4 − 5.5 = 8.9%. For the western states, the IQR is 21.95 − 8.7 = 13.25%. The western states have a substantially greater IQR than the eastern states, confirming the greater variability among the population growth percentages in the west.
c. The standard deviation of the western states should decrease considerably if Nevada were removed from the analysis. Nevada is a tremendous outlier and makes a substantial contribution to the standard deviation when its large deviation is averaged in. (With Nevada, the standard deviation is 14.07%; without Nevada, the standard deviation is 9.7%.)
Activity 9-19: Memorizing Letters Answers will vary by class. Those given here are examples.
[Dotplots: Number of Letters Correct (axis from 4 to 28) for the JFK and JFKC groups]
The JFK group showed greater variability in their scores than the JFKC group. The standard deviation for the JFK group is 6.45 letters, with an IQR of 12 letters; the JFKC group has a standard deviation of only 5.86 letters and an IQR of only 9 letters.
Activity 9-20: Monthly Temperatures a. Here are dotplots for the average monthly temperature for San Francisco, California, and Raleigh, North Carolina.
[Dotplots: Average Monthly Temperatures (in °F), axis from 42 to 78, for Raleigh and San Francisco]
b. For Raleigh, the median value is 59.5 degrees. For San Francisco, the median value is 57 degrees. Yes, these medians are fairly close. (Note that they are marked by vertical lines in the dotplot in part a.) c. No, you cannot conclude there is not much difference between these two cities with regard to monthly temperatures. Although their centers are close, their spreads are very different. d. Raleigh appears to have more variability in its monthly temperatures. e. For Raleigh, the range is 39 degrees. For San Francisco, the range is 16 degrees. f. For Raleigh, the IQR is 26 degrees. For San Francisco, the IQR is 10 degrees. g. For Raleigh, the mean absolute deviation is 11.83 degrees. For San Francisco, the mean absolute deviation is 4.92 degrees. h. For Raleigh, the standard deviation is 14.17 degrees. For San Francisco, the standard deviation is 5.75 degrees.
Activity 9-21: Nicotine Lozenge a. The mean number of cigarettes smoked per day has more variability than the mean age of smoking initiation. You can tell because the standard deviations are two to three times as large as those for the age of initiation variable. b. The researchers provide the means and standard deviations so that readers can compare the distributions of the two treatment groups on these baseline characteristics. Showing that these summary statistics are similar adds evidence to the lack of confounding variables between the two treatment groups, which strengthens the causal conclusions from the study if a difference is observed later for the response variable. c. Yes, the empirical rule probably holds for some of these variables, in particular, for the variables age and weight. These variables are likely to have mound-shaped distributions. It is likely that roughly 68% of these smokers were between the ages of 29 and 53, that 95% of them were between the ages of 17 and 65, and that virtually all of them were between the ages of 5 and 77. Similarly, it is likely that roughly 68% of these smokers weighed between 58.4 and 92.8 kg (129–205 lbs), 95% of them weighed between 41 and 110 kg (91–242.5 lbs), and virtually all of them weighed between 24 and 127 kg (53–280 lbs). It is less likely that the age of initiation is symmetric because mean − 2 SD gives an age of 8.3 years (one hopes too young to be realistic). Similarly, the number of cigarettes smoked per day must be truncated at zero and would not match the empirical rule because mean − 3 SD < 0. It also makes sense that extreme chain smokers would skew the distribution to the right.
Activity 9-22: Hypothetical Exam Scores Many answers are possible. Here is one set of answers. a. One possible dataset is {1, 1, 2, 4, 5, 6, 7, 9, 9, 10}. b. All the values cannot be the same, but the third through eighth values must be the same. c. One possible dataset is {4, 4, 4, 4, 4, 6, 6, 6, 6, 6}. d. The dataset is {0, 0, 0, 0, 0, 100, 100, 100, 100, 100}. (This is the only possible answer.) e. The first three ordered values must be 0s, and the last three must be 100s. The remaining values can be any numbers between 1 and 99. f. One possible dataset is {0, 0, 0, 0, 0, 100, 100, 100, 100, 100}. For this dataset, the range is 100 and the mean absolute deviation is 50.
Activity 9-23: More Measures a. The midhinge and midrange are both measures of center because they give the midpoints of the upper and lower quartiles and minimum and maximum values, respectively. This “averaging” should place the result roughly in the middle of the distribution. You would need to look at differences between values (e.g., max − min) to have a measure of spread. b. Yes, adding a constant value to all the values in a dataset will change the midhinge and the midrange by that amount. This is further confirmation that these are measures of center because their values change to reflect a shift in the distribution.
c. The midhinge is resistant to outliers because it uses only the upper and lower quartiles in its calculation, and these values are not usually outliers. The midrange is not resistant to outliers because it uses the maximum and minimum values in its calculation, and these are the values that could be outliers. d. For the Yankees, the midrange is (35 + 22)/2 = 28.5 years and the midhinge is (32 + 26)/2 = 29 years. For the Tigers, the midrange is (34 + 25)/2 = 29.5 years and the midhinge is (32 + 28)/2 = 30 years.
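The midrange and midhinge in part d are each the average of two summary values. A minimal Python sketch, using the extremes and quartiles quoted in the solution (the function names are illustrative):

```python
def midrange(minimum, maximum):
    """Midpoint of the minimum and maximum (not resistant to outliers)."""
    return (minimum + maximum) / 2


def midhinge(q_lower, q_upper):
    """Midpoint of the lower and upper quartiles (resistant to outliers)."""
    return (q_lower + q_upper) / 2


# Ages in years, taken from part d.
yankees = {"midrange": midrange(22, 35), "midhinge": midhinge(26, 32)}
tigers = {"midrange": midrange(25, 34), "midhinge": midhinge(28, 32)}

print(yankees)  # {'midrange': 28.5, 'midhinge': 29.0}
print(tigers)   # {'midrange': 29.5, 'midhinge': 30.0}
```

Adding a constant c to every age shifts both inputs of each function by c, so both measures shift by c as well, which is the behavior described in part b.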
Activity 9-24: Hypothetical ATM Withdrawals a. The following stacked dotplot displays the distribution of cash amounts at each machine (each dot represents up to two observations):
[Stacked dotplots: Amount Withdrawn (in dollars), axis from 20 to 120, for Machine 1, Machine 2, and Machine 3]
Yes, each distribution is perfectly symmetric. b. Yes, the mean and standard deviation are identical for each machine. The mean for each machine is $70, and the standard deviation for each machine is $30.3. c. No, the distributions for each machine are not identical—they are quite different. This difference indicates that the mean and standard deviation do not provide a complete summary of a distribution of data.
Activity 9-25: Guessing Standard Deviations a. Answers will vary depending on student expectation. b. For Data A, the mean is 64.454 and the standard deviation is 9.598. For Data B, the mean is 202.52 and the standard deviation is 51.88. For Data C, the mean is 0.99947 and the standard deviation is 0.04952. For Data D, the mean is 5.405 and the standard deviation is 4.714.
d. The boxplots of camera ratings are shown following the table. The five-number summaries, as reported by the software package Minitab, are as shown here:
Camera Type        Minimum   Lower Quartile   Median   Upper Quartile   Maximum
Advanced Compact   63        69               70       73               78
Compact            62        65               71       73.5             76
Subcompact         53        61.75            65.5     69.25            75
Super-Zoom         66        72.25            75.5     79               81
[Boxplots: Rating Score (axis from 50 to 80) by camera type: Advanced Compact, Compact, Subcompact, Super-Zoom]
The super-zoom cameras tend to have the highest ratings, and the subcompact ones tend to rate the lowest. The subcompact cameras also have the most variability in ratings. e. Even though the advanced compact cameras have the highest median price by far, their median rating is surpassed by both super-zoom and compact cameras. In fact, compact cameras have the second-highest median rating despite having the second-lowest median price.
Homework Activities Activity 10-6: Natural Selection a. Yes, the data suggest that the sparrows that died weighed a little more than the sparrows that survived. The medians are very close, but every quartile for the sparrows that died is a little larger than for the sparrows that survived. This difference does not appear to be very substantial. b. For the sparrows that died, the IQR is 26.95 − 25.2 = 1.75 grams. You calculate 1.5 × 1.75 = 2.625, which gives [25.2 − 2.625, 26.95 + 2.625] = [22.575, 29.575]. Any outliers had weights outside this range. For the sparrows that survived, the IQR is 26.5 − 24.3 = 2.2 grams. You calculate 1.5 × 2.2 = 3.3, which gives [24.3 − 3.3, 26.5 + 3.3] = [21, 29.8]. Any outliers had weights outside this range. c. There were no outliers among the sparrows that survived. There was at least one high outlier among the sparrows that died (because the maximum weight is greater than 29.575 grams).
•••
Topic 10: More Summary Measures and Graphs
d. You cannot draw complete modified boxplots because you cannot tell how many high outliers there may be among the sparrows that died. There is at least one (that weighed 31 grams), but there could be more if any sparrows weighed between 29.575 and 31 grams.
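The outlier cutoffs in part b of Activity 10-6 come from the 1.5 × IQR rule. A minimal Python sketch of that rule, applied to the quartiles quoted in the solution (the function name and the parameter k are illustrative):

```python
def outlier_fences(q_lower, q_upper, k=1.5):
    """Return (low, high); values outside this interval are flagged as
    outliers under the k x IQR rule."""
    iqr = q_upper - q_lower
    return q_lower - k * iqr, q_upper + k * iqr


# Quartiles of sparrow weights (in grams) from part b.
died_low, died_high = outlier_fences(25.2, 26.95)
survived_low, survived_high = outlier_fences(24.3, 26.5)

print(round(died_low, 3), round(died_high, 3))          # 22.575 29.575
print(round(survived_low, 3), round(survived_high, 3))  # 21.0 29.8

# The 31-gram sparrow mentioned in part d lies above the upper fence.
print(31 > died_high)  # True
```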
Activity 10-7: Natural Selection a. The following boxplots compare the distributions of surviving and perished sparrows for all of these variables:
[Boxplots comparing sparrows that died and sparrows that survived, one panel per variable: Alar Extent (in mm), Length of Beak and Head (in mm), Humerus (in inches), Femur (in inches), Tibiotarsus (in inches), Skull Width (in inches), Keel of Sternum (in inches); outliers are marked by asterisks]
b. The variables femur bone length, keel of sternum, humerus bone, and skull width vary considerably. The variables alar extent and length of head and beak do not vary much. c. It appears that the alar extent and length of head and beak are virtually the same in the sparrows that did and did not survive the storm. However, the keel of the sternum, the humerus, and the femur and tibiotarsus bones are all somewhat larger in the sparrows that survived than in the sparrows that died. Only the skull seems to be a little larger in the sparrows that died, although this is hard to say definitively because the spread for the sparrows that died is so much smaller for this variable than the spread for the sparrows that survived.
Activity 10-8: Welfare Reform a. The states are the observational units. b. To create this new variable, subtract the December 05 assistance values from the August 96 assistance values. c. The three states with the greatest percentage reductions are Wyoming (93.18%), Virginia (84.47%), and Illinois (82.69%). The three states with the lowest percentage reductions are Indiana (6.27%), Kansas (26.86%), and Tennessee (28.63%). d. The following dotplot displays the variable percentage reduction:
[Dotplot: Percentage Reduction (axis from 12 to 96)]
The dotplot of percentage reductions in families receiving welfare is fairly symmetric, ranging from about 14% to 84%, with apparent outliers at 6.3% (Indiana) and 93% (Wyoming). The typical percentage reduction seems to be about 55%, and the standard deviation of these percentages is 16.9%.
e. The following boxplots display the percentage reductions for the Northeast, South, Midwest, and West:
[Boxplots: Percentage Reduction (axis from 0 to 90) by region: Midwest, Northeast, South, West; one outlier is marked by an asterisk]
In general, the states in the South saw greater percentage reductions in the number of families on welfare compared to other regions of the country, with over half of the southern states seeing a reduction of 70% or more. The Midwest seems to have the greatest variability in percentage reduction, ranging from a minimum of 6.3% to a maximum of 82.7%, with an IQR of almost 25%. The West also had a large IQR (26%), but their typical percentage reduction was greater than that of the Midwest by about five percentage points (50% vs. 55% or so). The Northeast had fairly small variability in its percentage reductions (IQR 17%) and its typical percentage reduction was very similar to that of the Midwest.
Activity 10-9: Memorizing Letters Answers will vary by class. One example is given here.
[Boxplots: Number of Letters Correct (axis from 0 to 30) for the JFK and JFKC groups]
The boxplots indicate that the JFK group was generally more successful at memorizing the letters than the JFKC group, but this group also had a greater variability in their scores. A quarter of the JFK group got 20 letters correct, whereas this same percentage got only 15 letters correct in the JFKC group. More than half of the JFK group did this well.
Activity 10-10: Population Growth
The following boxplots display the distributions of percentage changes in population between eastern and western states:
[Boxplots: Population Growth Percentages (axis from 0 to 70) for the East and West regions; one outlier is marked by an asterisk]
These boxplots reveal that the western states tended to have a higher percentage of population growth than the eastern states, although there was also greater variability among the percentages in the western states. Both regions have a minimum of about 0.5% growth, whereas the eastern states have a median of 9.64% and a maximum of 26.4% growth. The western states have a median of 13.3% growth and a high outlier of 66.3% growth, but their highest non-outlier value is still 40%. About 50% of the western states showed higher values than the bottom 75% of the eastern states.
Activity 10-11: Sporting Examples
The following boxplots display the distributions of total points between the two sections:
[Boxplots: Total Points (axis from 200 to 400) for the Regular and Sports sections]
These boxplots show that the students in the regular section tended to earn more points than those in the section that used only sports examples. For the regular section, the minimum is 265 points, with a median of 341 points, a maximum of 262.5 points, and an IQR of 43 points. In contrast, the minimum for the sports-example section was more than 30 points less: 307.25 points, with a median of 309 points and a maximum of 397 points. You see that at least 75% of student scores in the regular section exceeded 50% of the student scores in the sports section. The point spread in the sports section was greater than that of the regular section because their IQR was 60.25 points (versus 43 points).
Activity 10-12: Backpack Weights a. The following boxplots display backpack weights carried by male and female students:
[Boxplots: Backpack Weight (in pounds), axis from 0 to 35, for female and male students; outliers are marked by asterisks]
These boxplots show that backpack weights for females are generally lighter than backpack weights for males (median 10 vs. median 11) and that the backpack weights for males have a bit less spread (IQR 6 vs. IQR 8). You also see that the upper quartile for backpack weights for males (about 16) is greater than the upper quartile for the backpack weights for females (about 13). b. The following boxplots display the ratio of backpack weight to body weight for male and female students:
[Boxplots: Ratio of Backpack Weight to Body Weight (axis from 0.00 to 0.20) for female and male students; outliers are marked by asterisks]
These boxplots show that the ratios of backpack weights to body weights are very similar for the two genders. Both genders have a minimum of about .016 and a lower quartile near .05 (females .056, males .048). For males, the median is .065, which is slightly smaller than for the females (.077). However, the upper quartile is again very similar for both genders: .096 for the males and .093 for the females. There is a cluster of three outliers for the females, with a maximum at .181, and just one outlier for the males at .179. The spread for the females is just a little less than for the males (IQR .04 vs. IQR .045).
Activity 10-13: Social Acquaintances a. Answers will vary by class. These are representative answers; here is a five-number summary of acquaintances: min = 11, Q_L = 23, median = 37, Q_U = 52, max = 124. b. You calculate 1.5 × IQR = 1.5 × (52 − 23) = 1.5 × 29 = 43.5. Outliers are outside [23 − 43.5, 52 + 43.5], which is effectively [0, 95.5] because a count cannot be negative, so there are two high outliers: 105 and 124.
c. Here is a modified boxplot of the data:
[Modified boxplot: Acquaintances (axis from 0 to 140), with two high outliers marked by asterisks]
d. The distribution of the number of acquaintances known by students in this class is reasonably symmetric, ranging from a minimum of 11 acquaintances to a maximum of about 75 acquaintances, save for two high outliers at 105 and 124 acquaintances. The center of the distribution is about 37 acquaintances, and the IQR is 29 acquaintances.
Activity 10-14: Social Acquaintances a. Here are the results using each of the rules:
• You calculate 2 × IQR = 2 × (52 − 23) = 2 × 29 = 58. By this rule, outliers are outside [23 − 58, 52 + 58], or [0, 110] after truncating the lower bound at zero, so there is one high outlier: 124.
• You calculate 3 × IQR = 3 × (52 − 23) = 3 × 29 = 87. By this rule, outliers are outside [23 − 87, 52 + 87], or [0, 139], so in this case there are no outliers.
• You calculate mean ± 2 SD = 40.83 ± 2(25.25) = 40.83 ± 50.5, or [0, 91.33]. By this rule, there are two high outliers: 105 and 124.
• You calculate mean ± 3 SD = 40.83 ± 3(25.25) = 40.83 ± 75.75, or [0, 116.58]. By this rule, there is one high outlier: 124.
b. These analyses reveal that no one outlier rule will always identify what appear to be obvious outliers.
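The comparison of outlier rules in Activity 10-14 can be reproduced with a short Python sketch. It reuses the example class data from the stemplot in Activity 9-8; the helper name `flagged` and the rule labels are illustrative. Lower bounds are left negative rather than truncated at zero, which does not change which values are flagged.

```python
import statistics

# Example class data from the stemplot in Activity 9-8 (35 students).
acquaintances = [
    11, 11, 13, 14, 15, 17, 19,
    21, 23, 24, 25, 25, 26, 28,
    35, 36, 37, 37, 38,
    40, 40, 42, 44, 48,
    50, 51, 52, 54, 56, 58,
    65, 69, 76, 105, 124,
]

q_lower, _, q_upper = statistics.quantiles(acquaintances, n=4)
iqr = q_upper - q_lower
mean = statistics.mean(acquaintances)
sd = statistics.stdev(acquaintances)  # sample SD (divides by n - 1)


def flagged(low, high):
    """Values falling outside the interval [low, high]."""
    return [x for x in acquaintances if x < low or x > high]


rules = {
    "1.5 x IQR": flagged(q_lower - 1.5 * iqr, q_upper + 1.5 * iqr),
    "2 x IQR": flagged(q_lower - 2 * iqr, q_upper + 2 * iqr),
    "3 x IQR": flagged(q_lower - 3 * iqr, q_upper + 3 * iqr),
    "mean +/- 2 SD": flagged(mean - 2 * sd, mean + 2 * sd),
    "mean +/- 3 SD": flagged(mean - 3 * sd, mean + 3 * sd),
}

for name, outliers in rules.items():
    print(name, outliers)
```

Depending on the rule, the same dataset yields two, one, or zero outliers, which is the point made in part b.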
Activity 10-15: Diabetes Diagnoses a. Here is the five-number summary of these ages: min = 1, Q_L = 39, median = 51, Q_U = 63, max = 88. b. The IQR is 24 years. You calculate 1.5 × 24 = 36, so outliers are any observations outside [3, 99]. Thus, there are four low outliers: 1, 2, 2, and 2. There are no high outliers. c. The following boxplot displays the distribution of ages:
[Modified boxplot: Age of Diabetes Diagnosis (axis from 0 to 90), with four low outliers marked by asterisks]
d. The modified boxplot indicates the low outliers that are not obvious on the histogram in Activity 7-5 on page 127. However, the boxplot does not show the second, small cluster of ages from 2–15 years.
Activity 10-16: Hazardousness of Sports a. Bicycle riding (544,561) has more injuries than football (334,420); soccer (148,913) has more injuries than ice hockey (77,491); and swimming (83,772) has more injuries than skateboarding (48,186). b. You divide the number of injuries by the number of participants and multiply that amount by 1000. c. The injury rate of football (16.6378) is greater than that of bicycle riding (12.0745); the injury rate of ice hockey (40.7847) is greater than that of soccer (10.8696); and the injury rate of skateboarding (7.64857) is greater than that of swimming (1.40793). d. The answers in parts a and c are all reversed. e. The three most hazardous sports are ice hockey, basketball, and football. The least hazardous sports are pool, bowling, and archery. f. Many answers are possible. One factor to consider is the seriousness of the injury.
Activity 10-17: Gender of Physicians a. The three specialties with the most female physicians are internal medicine (41,658), pediatrics (33,351), and family practice (23,317). The three disciplines with the fewest female physicians are transplantation surgery (8), aerospace medicine (32), and colon/rectal surgery (113). b. The number of women variable does not take into account the total number of physicians in the field. c. The variable percentage of that specialty’s practitioners who are women is useful because there are many fields in which only a small number of women practice, but in which only a small number of men practice too. The relative number of women in the field gives us a much more informative picture than the absolute number. d. Divide number of women in that specialty by the total number of practitioners (number of women + number of men). Then multiply that amount by 100 to convert the proportion into a percentage. e. The three specialties with the greatest percentages of female physicians are pediatrics (50.25%), medical genetics (46.88%), and child psychiatry (42.32%). The three disciplines with the lowest percentages of female physicians are thoracic surgery (3.01%), urological surgery (3.67%), and orthopedic surgery (3.80%). f. No, the lists in part e do not agree exactly. This is because the first lists (part a) do not take the total number of physicians in each discipline into consideration. g. Many answers are possible, but medical genetics is an obvious choice. Only 203 women practice in this field, yet this number is 46.9% of the specialty’s practitioners. This is because so few male doctors (only 230) choose to go into this field; although there are not many female doctors on an absolute scale in this discipline, they make up almost half of the total doctors who specialize in this area. h. The following dotplot displays the distribution of the percentage of women in these medical specialties:
[Dotplot: Percent of Women Physicians in Specialty (axis from 0 to 49)]
The distribution of female physicians’ specialties is somewhat “mound-shaped,” ranging from about 3% (thoracic surgery) to more than 50% (pediatrics), with a center around 20%.
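The percentage calculation described in part d of Activity 10-17 can be sketched in Python as follows. The counts for medical genetics (203 women, 230 men) are the ones quoted in part g; the function name is illustrative.

```python
def percent_women(num_women, num_men):
    """Women as a percentage of all practitioners in a specialty."""
    return 100 * num_women / (num_women + num_men)


# Medical genetics, using the counts quoted in part g.
genetics = percent_women(203, 230)
print(round(genetics, 2))  # 46.88
```

A specialty with few women in absolute terms can still rank near the top on this variable when the number of men is also small.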
Activity 10-18: Draft Lottery a. Answers will vary. b. Here are the medians of the draft numbers for each month: Jan. 211, Feb. 210, Mar. 256, Apr. 225, May 226, June 207.5, July 188, Aug. 145, Sept. 168, Oct. 211, Nov. 131.5, and Dec. 100. The medians early in the year tend to be greater than those late in the year. This means that a greater proportion of lower numbers (and therefore men drafted) were taken from those born in the later months of the year. c. The following boxplots display the distributions of draft numbers across the twelve months:
[Boxplots: Draft Numbers (axis from 0 to 400) for each month, Jan through Dec]
Activity 10-19: Honda Prices a. The dotplot for the variable mileage corresponds to boxplot I. This plot is strongly skewed to the right with a couple of high outliers. b. The dotplot for the variable year of manufacture corresponds to boxplot III. This plot is skewed to the left. c. The dotplot for the variable price corresponds to boxplot II. This plot has long tails on both sides, although the left tail is much longer than the right.
Activity 10-20: Planetary Measurements a. Here is the five-number summary of planet diameters (in miles): min = 1428, Q_L = 3623.5, median = 7926, Q_U = 53,329, max = 88,838. b. The following boxplot displays the distribution of planet diameters:
[Boxplot: Planet Diameters (in miles), axis from 0 to 90,000]
c. This distribution is strongly skewed to the right.
Activity 10-21: Rowers’ Weights a. Here is the five-number summary of rowers’ weights (in pounds): min = 120, Q_L = 160, median = 195, Q_U = 210, max = 229. b. You calculate 1.5 × IQR = 1.5 × 50 = 75, so outliers would be weights outside [160 − 75, 210 + 75] = [85, 285]. Therefore, there are no outliers. c. The following boxplot displays the distribution of rowers’ weights:
[Boxplot: Weight (in pounds), axis from 120 to 240]
d. The dotplot reveals the two clusters of weights (lightweight and non-lightweight) as well as the outlier (Cipollone). These data are not shown by the boxplot.
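The outlier check in part b of Activity 10-21 can be sketched in a few lines. This is a minimal illustration, not part of the original solution; the function name `outlier_fences` is hypothetical, and the inputs are the quartiles and extremes reported in part a.

```python
def outlier_fences(q_lower, q_upper, k=1.5):
    """Return the (low, high) fences of the 1.5 x IQR outlier rule."""
    iqr = q_upper - q_lower
    return q_lower - k * iqr, q_upper + k * iqr

# Quartiles and extremes from the five-number summary in part a.
low, high = outlier_fences(160, 210)
minimum, maximum = 120, 229

# An observation is flagged only if it falls outside [low, high].
has_outliers = minimum < low or maximum > high
print(low, high, has_outliers)
```

Because the minimum and maximum both lie inside the fences, no other observation can fall outside them, which is why checking the two extremes is enough.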
Activity 10-22: Hypothetical Exam Scores a. Yes, the dotplots do reveal great differences among the three distributions of exam scores. Class B is very evenly distributed from 50 to 92, whereas most of the exam scores in class C are clustered from 61 to 66 or from 77 to 80. In class A, there are three very low scores (50s) and three very high scores (91, 91, and 92), whereas most of the remaining scores are clustered around 70. b. For all three classes, the minimum is 50, the lower quartile is 63, the median is 70, the upper quartile is 79, and the maximum is 92.
c. Here are boxplots of the exam scores:
[Boxplots: Exam Scores (axis from 50 to 90) for Class A, Class B, and Class C]
d. No, you would not be able to detect the differences in these distributions from the boxplots because the boxplots are all identical.
Assessment Sample Quiz 10A
•••
The following data are monthly rents (in dollars) of studio and one-bedroom apartments in Harrisburg and Philadelphia, Pennsylvania. Harrisburg (n = 10): 500, 549, 569, 575, 585, 600, 630, 680, 705, 790 Philadelphia (n = 15): 475, 525, 540, 575, 600, 600, 645, 700, 725, 755, 885, 930, 965, 1180, 1300 1 and 2. For each city, determine and report the five-number summary of these monthly rents. 3. Construct boxplots of the distributions of rent amounts in these two cities, using the same axis and scale. (Do not bother to check for outliers; there are no outliers in either distribution.)
Solution to Sample Quiz 10A
•••
1 and 2. Harrisburg: min = $500, Q_L = $569, median = $592.50, Q_U = $680, max = $790. Philadelphia: min = $475, Q_L = $575, median = $700, Q_U = $930, max = $1300. 3.
[Boxplots: Monthly Apartment Rent (in dollars), axis from 400 to 1,300, for Harrisburg and Philadelphia]
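The five-number summaries reported for Sample Quiz 10A can be reproduced with a short sketch. It assumes the quartile convention that matches the reported values: each quartile is the median of the lower or upper half of the sorted data, with the middle observation left out when n is odd. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Min, lower quartile, median, upper quartile, max.

    Quartiles are medians of the lower and upper halves; the middle
    observation is excluded from both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

harrisburg = [500, 549, 569, 575, 585, 600, 630, 680, 705, 790]
philadelphia = [475, 525, 540, 575, 600, 600, 645, 700, 725, 755,
                885, 930, 965, 1180, 1300]

print(five_number_summary(harrisburg))
print(five_number_summary(philadelphia))
```

Statistical software often uses a different quartile convention, so library functions may give slightly different values of Q_L and Q_U for the same data.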
4 and 5. Compare and contrast the distributions of monthly apartment rents in these two cities. Refer to appropriate calculations and displays to support your comments.
k. This expected value makes sense because with three women randomly assigned among two groups, you would expect half of them to be assigned to each group in the long run, and half of 3 is 1.5.
•••
Homework Activities Activity 11-6: Random Cell Phones a. Write the names of the three executives on three index cards of the same size. Shuffle the cards well and deal them onto a sheet of paper on which you have drawn three boxes that are labeled with the names of the executives. Check how many matches you have between names on the index cards and names in the boxes. Record this value. Then reshuffle the cards and deal again. Repeat many times. b. Let 123 mean that the first cell phone went to the first executive, the second cell phone to the second executive, and the third cell phone to the third executive. All of the possibilities are listed here: 123, 213, 321, 132, 231, 312.
c. Pr(nobody gets the correct phone) = 2/6 = 1/3; Pr(exactly one person gets the correct phone) = 3/6; Pr(exactly two people get the correct phone) = 0; Pr(all three people get the correct phone) = 1/6; Pr(at least one person gets the correct phone) = 1 − 1/3 = 2/3. d. If you repeated this exact “experiment” over and over again, in the long run you would expect to observe nobody getting the correct phone in about 33.3% of the trials.
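The sample space in part b is small enough to enumerate by machine. The sketch below is an illustration rather than part of the original solution: it lists every way to hand three phones back to three owners and counts the matches in each. The variable names are hypothetical.

```python
from collections import Counter
from itertools import permutations

owners = (1, 2, 3)

# For each possible deal, count how many phones went to their own owner.
match_counts = Counter(
    sum(phone == owner for phone, owner in zip(deal, owners))
    for deal in permutations(owners)
)

total = sum(match_counts.values())
prob_no_match = match_counts[0] / total
prob_at_least_one = 1 - prob_no_match
print(dict(match_counts), prob_no_match, prob_at_least_one)
```

Exactly two matches never occurs: if two people hold their own phones, the third phone must also be with its owner.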
Activity 11-7: Equally Likely Events a. These outcomes are equally likely. Because the die is fair, each side should be just as likely to come up as any other side. b. No, these outcomes are not equally likely. A 2 can only be summed from a roll of snake eyes (1 + 1), but a 7 can be summed from rolls of 1 + 6, 2 + 5, 3 + 4, 4 + 3, 5 + 2, and 6 + 1. So Pr(sum of 2) = 1/36, whereas Pr(sum of 7) = 6/36. c. You would assume that a coin flip would have equally likely outcomes. d. You would assume that spinning a coin would result in equally likely outcomes. e. You would assume this spinning would result in equally likely outcomes. f. No, these outcomes are not equally likely. Among other reasons, an A is typically 90–100, whereas an F is generally any average less than 60. These do not cover equivalent ranges. You might not expect to see a similar proportion of Fs as As. g. No, these outcomes are not equally likely. (We hope the probability that California experiences a catastrophic earthquake within the next year is less than 1/2.) h. No, these outcomes are not equally likely, unless you frequent bad restaurants. The probability that your waiter or waitress brings you the meal you ordered is greater than 1/2.
Topic 11: Probability
i. No, these outcomes are not equally likely. It seems much more likely that there is not intelligent life on Mars. j. Answers will vary by personal opinion. k. Answers will vary by personal opinion. l. These outcomes are not equally likely. Orange is more likely than the other two colors.
Activity 11-8: Interpreting Probabilities a. If climatic conditions identical to those expected for tomorrow were to occur over and over again, rain would occur on 30% of all such days in the long run. b. If you were to play this particular lottery game over and over again, you would win .1% of the time you play in the long run. c. If you were to repeatedly deal out five-card poker hands, in the long run a “four-of-a-kind” would happen in about .024% of the hands. d. If you repeatedly sample M&M candies, in the long run 20% of the candies will be red. e. If every year you examine the proportion of Christmases for which Minneapolis has snow, you would find that on average, over many, many years, that proportion would be about .70. f. If you examine the performance of restaurants, you will find that in the long run about 60% of them fail.
Activity 11-9: Racquet Spinning a. The “probability of the racquet landing up” refers to the long-run proportion of times that the racquet would land up if it were spun repeatedly under identical conditions. The graph reveals that this proportion seems to be settling in around .45 as the number of spins increases. Many more spins would be needed in order to estimate the probability more accurately. b. After 10 spins, probability .3; after 20 spins, probability .47; after 40 spins, probability .37; after 100 spins, probability .45. c. No, this graph seems to indicate that landing “down” is a little more likely than landing “up.” d. Answers will vary.
Activity 11-10: Committee Assignments a. Answers will vary. Here is one representative set of answers. Assign each person a number: Alice (1), Bonnie (2), Carlos (3), Danny (4), Evan (5), and Frank (6). Roll the die twice to decide the committee members. If you roll the same number, ignore the second roll, and roll again until you get a unique second committee member.
b. Answers will vary. In an example simulation, both women ended up being the officers in 5 of 50 trials. Thus, the empirical estimate of the probability for this example is 5/50 or .1.
Number of Women     0    1    2
Number of Trials   18   27    5
c. The mean number of men is 63/50 = 1.26.
Activity 11-11: Committee Assignments a. All possible pairs of officers are listed here: (Alice, Bonnie), (Alice, Carlos), (Alice, Danny), (Alice, Evan), (Alice, Frank), (Bonnie, Carlos), (Bonnie, Danny), (Bonnie, Evan), (Bonnie, Frank), (Carlos, Danny), (Carlos, Evan), (Carlos, Frank), (Danny, Evan), (Danny, Frank), (Evan, Frank). b. Fifteen pairs are possible. One of these pairs consists of two women. c. The theoretical probability is 1/15 = .06667. This outcome is uncommon, but not rare. d. In the long run, two men would be selected as officers 6/15 or 40% of the time. Such an outcome would not be a surprising result using random selection. e. In the long run, one man and one woman would be selected as officers 8/15 or 53.3% of the time. Such an outcome would not be a surprising result using random selection. f. The most likely outcome would be one officer of each gender. g. The expected number of men is 0(.0667) + 1(.5333) + 2(.4) = 1.3333. This should be reasonably close to the simulated average.
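The counts in Activity 11-11 can be checked by listing the pairs programmatically. This sketch assumes, as the solution does, that Alice and Bonnie are the two women; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

members = ["Alice", "Bonnie", "Carlos", "Danny", "Evan", "Frank"]
women = {"Alice", "Bonnie"}

# Every unordered pair of officers is equally likely under random selection.
pairs = list(combinations(members, 2))
men_per_pair = [sum(name not in women for name in pair) for pair in pairs]

# Exact probability of 0, 1, or 2 men among the two officers.
prob = {k: Fraction(men_per_pair.count(k), len(pairs)) for k in (0, 1, 2)}
expected_men = sum(k * p for k, p in prob.items())
print(prob, expected_men)
```

Working with `Fraction` keeps the results exact, so the expected number of men comes out as 4/3 rather than a rounded decimal.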
Activity 11-12: Family Births Answers will vary. Here is one representative set of answers: a. Use the Random Digits Table as you did in Activity 11-3: An even digit represents a girl child and an odd digit represents a boy child. Start at any line and “create” families of four children. Repeat until you have 20 four-children families.
Number of Girls   0            1            2           3           4
Probability       2/20 = .10   5/20 = .25   6/20 = .3   6/20 = .3   1/20 = .05
b. Based on the simulation, a 3/1 split in either direction (probability .55) is more likely than a 2/2 split of boys and girls (probability .3). c. The sample space is {BBBB, BBBG, BBGB, BGBB, GBBB, BBGG, BGBG, GBGB, BGGB, GBBG, GGBB, BGGG, GBGG, GGBG, GGGB, GGGG}.
Number of Girls   0              1            2             3            4
Probability       1/16 = .0625   4/16 = .25   6/16 = .375   4/16 = .25   1/16 = .0625
d. A 3/1 split in either direction is more likely (probability .25 + .25 = .5) than a 2/2 split of boys and girls (probability .375).
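The theoretical distribution in parts c and d can be confirmed by generating the sample space. The sketch below assumes, as the solution does, that boys and girls are equally likely and births are independent, so all 16 sequences are equally likely. The variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# All birth orders for four children, e.g. "BGGB".
families = ["".join(kids) for kids in product("BG", repeat=4)]

# Number of sequences with 0, 1, 2, 3, and 4 girls.
girls = Counter(family.count("G") for family in families)

prob_3_1 = (girls[1] + girls[3]) / len(families)
prob_2_2 = girls[2] / len(families)
print(sorted(girls.items()), prob_3_1, prob_2_2)
```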
Activity 11-13: Treatment Groups Answers will vary. Here is one representative set of answers: a. Pick any row of the Random Digits Table. Let even digits represent the women and odd digits represent the men. Read two digits; this will make up the first group, and the second group will be the two remaining people. Repeat 100 times. b. Suppose the men are Steve and Greg, and the women are Cathy and Laura. Because these are treatment groups, group 1 is different from group 2, thus (Cathy, Laura)/(Steve, Greg) ≠ (Steve, Greg)/(Cathy, Laura). The sample space is (Cathy, Laura)/(Steve, Greg), (Cathy, Steve)/(Laura, Greg), (Cathy, Greg)/(Laura, Steve), (Steve, Greg)/(Cathy, Laura), (Laura, Greg)/(Cathy, Steve), (Laura, Steve)/(Cathy, Greg). Pr(2 women in one group, 2 men in the other) = 2/6 = 1/3 (because Pr(2W,2M) = 1/6 and Pr(2M,2W) = 1/6). Pr(one of each gender in each group) = 4/6 = 2/3.
Activity 11-14: Die Rolling a. Here are the possible values and their probabilities:
Value          1     2     3     4     5     6
Probability   1/6   1/6   1/6   1/6   1/6   1/6
The expected value is 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 21/6 = 3.5. b. In the long run, the average value rolled would be 3.5. In other words, if you were to roll a die many hundreds of times and record the value rolled each time, and then add all these values and divide by the number of rolls (to calculate the average roll), that long-run average should be approximately 3.5.
Activity 11-15: Simulating the World Series a. Answers will vary by student expectation. b. Answers will vary. Here is one representative set of answers: Beginning with line 38, the Shorthairs win in 3 games. c. In 50 simulated series, the Shorthairs won 37 times. This is .74, which is greater than .7. d. Answers will vary by student expectation. e. In 50 best-of-seven series, the Shorthairs won 43 times; this is .86. f. The longer series gives the greater advantage to the better team. With the larger sample size, the proportion of games won by a team is likely to be closer to
its probability of winning. Thus, the team with a winning probability greater than 1/2 is more likely to win more than half of the games in a longer series. In other words, unusual events such as upsets are more likely with smaller sample sizes.
Activity 11-16: Dating Game Show a. Let BBAA be the outcome that Allyson chooses Bart, Elsa chooses Bart, Bart chooses Allyson, and Dwayne chooses Allyson. Similarly, BDEA would be the outcome where Allyson chooses Bart, Elsa chooses Dwayne, Bart chooses Elsa, and Dwayne chooses Allyson. Then the possible outcomes are (with number of dates in parentheses):
BBAA (1)   BDAA (1)   DDAA (1)   DBAA (1)
BBAE (1)   BDAE (2)   DDAE (1)   DBAE (0)
BBEE (1)   BDEE (1)   DDEE (1)   DBEE (1)
BBEA (1)   BDEA (0)   DDEA (1)   DBEA (2)
b. There are 16 possible outcomes. c. Pr(all four go on a date) = 2/16 = .125 d. Pr(at least one couple goes on a date) = 14/16 = .875 e. The expected value of the number of people who go on a date is 0(2/16) + 2(12/16) + 4(2/16) = 2.00.
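The date counts listed in part a can be generated rather than tallied by hand. This is a sketch under the solution's own setup: each woman picks one of the two men, each man picks one of the two women, and a date happens when a choice is mutual. The dictionary names are hypothetical.

```python
from collections import Counter
from itertools import product

date_counts = Counter()
# Order matches the four-letter codes: Allyson, Elsa, Bart, Dwayne.
for allyson, elsa, bart, dwayne in product("BD", "BD", "AE", "AE"):
    woman_picks = {"A": allyson, "E": elsa}
    man_picks = {"B": bart, "D": dwayne}
    # A date occurs when the man a woman picked also picked her.
    dates = sum(man_picks[man] == woman for woman, man in woman_picks.items())
    date_counts[dates] += 1

outcomes = sum(date_counts.values())
# Each date involves two people.
expected_people = sum(2 * d * c for d, c in date_counts.items()) / outcomes
print(dict(date_counts), expected_people)
```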
Activity 11-17: Dice-Generated Ice Cream Prices a. Pr(price = 32¢) = 2/36 = .056; a roll of 3,2 or 2,3 will give you a price of 32¢. b. Pr(price = 33¢) = 1/36 = .0278; only a roll of 3,3 will give you a price of 33¢. c. Pr(price = 34¢) = 0; if a 3 and a 4 are rolled, the cost would be 43¢. d. Pr(price < 40¢) = Pr(rolls of 11, 12, 21, 22, 23, 13, 31, 32, or 33) = 9/36 = .25 e. Pr(price > 50¢) = Pr(rolls of 15, 16, 25, 26, 35, 36, 45, 46, 51, 52, 53, 54, 55, 56, 61, 62, 63, 64, 65, or 66) = 20/36 = .556. Note: Pr(40 < price < 50) = Pr(rolls of 14, 24, 34, 41, 42, 43, 44) = 7/36 f. The sample space is 11¢, 21¢ (including 12), 31¢ (including 13), 41¢ (and 14), 51¢ (and 15), 61¢ (16), 22¢, 32¢ (23), 42¢ (24), 52¢ (25), 62¢ (26), 33¢, 43¢ (34), 53¢ (35), 63¢ (36), 44¢, 54¢ (45), 64¢ (46), 55¢, 65¢ (56), 66¢. The expected value is 11(1/36) + 21(2/36) + . . . + 66(1/36) = 47.25¢. g. If you were to repeat this “experiment” over and over again, in the long run, the average price of an ice-cream cone computed in this manner would be 47.25¢.
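The probabilities and expected value in Activity 11-17 follow from the pricing rule implied by part c: the larger die gives the tens digit and the smaller die gives the ones digit. The sketch below enumerates all 36 equally likely rolls under that assumption; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Price in cents for each of the 36 ordered rolls: larger die first.
prices = [10 * max(a, b) + min(a, b) for a, b in product(range(1, 7), repeat=2)]

expected_price = Fraction(sum(prices), len(prices))
prob_under_40 = Fraction(sum(p < 40 for p in prices), len(prices))
prob_over_50 = Fraction(sum(p > 50 for p in prices), len(prices))
print(float(expected_price), prob_under_40, prob_over_50)
```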
Activity 11-18: Dice-Generated Ice Cream Prices Answers will vary. Here is one representative set of answers: a. The following histogram displays the distribution.
[Histogram: Ice Cream Prices (in cents), horizontal axis from 16 to 64, frequency axis from 0 to 140]
b. The average of these 1000 simulated prices is 47.42¢. Yes, this is very close to the expected value, as it should be.
Activity 11-19: Hospital Births Answers will vary. Here is one representative set of answers: a. The following histograms display the distributions.
[Histograms of frequency: Proportion of Hispanics (Hospital A), axis from .0 to .7; Proportion of Hispanics (Hospital B), axis from .12 to .42]
b. Hospital A has more days on which more than 40% of the babies born are Hispanic. c. Hospital B has more days on which between 15% and 35% of the babies born are Hispanic. d. Hospital B has more days on which less than 40% of the babies born are Hispanic. e. Because the sample size (50) for Hospital B is greater than for Hospital A (10), you expect the distribution for Hospital B to be more concentrated around its mean (0.25). So, Hospital B should have a greater percentage of Hispanic births between 15% and 35% and a smaller percentage away from the mean—greater than 40%.
Activity 11-20: Collecting Prizes a. Answers will vary by student guess. b. Number the prizes with digits 1, 2, 3, and 4. Then choose a row from the Random Digits Table. Ignore all other digits (0, 5, 6, 7, 8, and 9). Read the digits in the table as boxes of cereal until you have obtained a complete set of 1, 2, 3, and 4. c. Answers will vary. The following were obtained using row 74 of the Random Digits Table: 4, 4, 1, 3, 4, 2. You would have to buy six boxes of cereal to obtain all four prizes. d. The following dotplot gives values for this set of results:
[Dotplot: Number of Cereal Boxes Needed for Complete Set of Prizes, axis from 0 to 20]
e. The approximate expected value is mean 8.4 boxes of cereal. f. Pr(num boxes 10) 22/25 .88 g. You could obtain a more accurate approximation of this probability and expected value by performing a larger simulation. (Repeat the simulation, say, 100 rather than 25 times).
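For comparison with the simulated mean in part e, the exact expected number of boxes can be computed. This sketch is an addition, not part of the original solution, and it assumes what the simulation in part b assumes: each box contains one of the four prizes, all equally likely and independent of other boxes. Once k prizes are still missing, the wait for the next new prize averages 4/k boxes. The function name is hypothetical.

```python
from fractions import Fraction

def expected_boxes(num_prizes):
    """Exact expected number of boxes needed to collect every prize."""
    # With `remaining` prizes still missing, each box yields a new prize
    # with probability remaining/num_prizes, so the expected wait is
    # num_prizes/remaining boxes.
    return sum(Fraction(num_prizes, remaining)
               for remaining in range(1, num_prizes + 1))

exact = expected_boxes(4)
print(exact, float(exact))
```

The exact value, 25/3 or about 8.33 boxes, is close to the approximation obtained from 25 simulated trials.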
Activity 11-21: Runs and “Hot” Streaks a. Answers will vary by student guess. b. Answers will vary. Here is one representative set of answers:
Run Length    2    3    4    5    6    7    8    9
Tally        13   39   25   14    5    2    1    1
c. The following histogram displays distributions for this set of simulations:
[Histogram: Run Length (2 to 9) versus Frequency (0 to 40)]
d. A streak of 5 or more occurred in 23/100 of the simulations. This event is not very surprising as it happened almost 1/4 of the time. e. The most common hot streak length appears to be 3 heads. f. The median run length is 3 heads. The mean is (2(13) + 3(39) + 4(25) + 5(14) + 6(5) + 7(2) + 8(1) + 9(1))/100 = 3.74 heads. g. Yes, it would be very surprising to flip a coin 10 times and find that the longest streak is 1. In this simulation, a streak of 1 never occurred. The only way it could happen is for the 10 flips to alternate back and forth between heads and tails.
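The summaries in parts d and f can be recomputed directly from the tally in part b. This is a small sketch using the representative tally given in the solution; the variable names are hypothetical.

```python
from statistics import median

# Longest-run tally from the 100 simulated sets of flips (part b).
tally = {2: 13, 3: 39, 4: 25, 5: 14, 6: 5, 7: 2, 8: 1, 9: 1}

# Expand the tally into one value per simulation.
runs = [length for length, count in tally.items() for _ in range(count)]

mean_run = sum(runs) / len(runs)
median_run = median(runs)
share_5_plus = sum(length >= 5 for length in runs) / len(runs)
print(mean_run, median_run, share_5_plus)
```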
Activity 11-22: Solitaire a. The proportion is 25/(192 + 25) = 25/217 = .115. b. Pick a row of the table. Ignore the zeros. If a digit is a 1, then consider that a success: The player won the game. If the digit is a 2, 3, 4, . . . 8, or 9, then the player lost the game. c. Answers will vary. Using row 6 of the Random Digits Table, the author needed to play 5 games for the first win. d. The following dotplot gives values for this set of results:
[Dotplot: Number of Games Needed for Win, axis from 0 to 28]
e. The mean is 9.56 games. The median is 6 games. f. Based on this simulation, this author can expect to play between 9 and 10 games before winning for the first time. g. Based on this simulation, this author will have to play 6 games in order to have a 50% chance of winning at least once.
Activity 11-23: Solitaire a. The proportion is 74/444 or .167. b. Answers will vary. Here is one representative set of answers. c. The following dotplot gives values for this set of results:
[Dotplot: Number of Games Needed for Win, axis from 0 to 18]
d. The mean is 5.56 games. The median is 3.0 games. e. Based on this simulation, this author can expect to play between 5 and 6 games before winning for the first time. f. Based on this simulation, this author will have to play 3 games in order to have a 50% chance of winning at least once. g. This author will have to play about 3 games fewer on average before her first win.
Activity 11-24: AIDS Testing a. Answers will vary by student guess. b. For carriers in the population, 1,000,000 × .005 or 5,000. For noncarriers in the population, 1,000,000 − 5,000 or 995,000. c. For carriers who test positive, 5,000 × .977 or 4,885. For carriers who test negative, 5,000 − 4,885 or 115. d. For noncarriers who test negative, 995,000 × .926 or 921,370. For noncarriers who test positive, 995,000 − 921,370 or 73,630.
                            Positive Test   Negative Test   Total
Carries AIDS Virus          (c) 4,885       (c) 115         (b) 5,000
Does Not Carry AIDS Virus   (d) 73,630      (d) 921,370     (b) 995,000
Total                       (e) 78,515      (e) 921,485     1,000,000
e. The total number of positive test results is 78,515. The total number of negative test results is 921,485. f. Of those who test positive, 4,885/78,515 or .0622 actually carry the virus. This is probably a surprisingly small proportion. g. Because the population is so large (1,000,000), even though the specificity of the test is very high, there will still be a large number of people who do not carry the virus yet test positive. Thus, because such a small proportion of the population actually carries the virus, the proportion of those who test positive who do have the virus is fairly small.
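The table in Activity 11-24 can be rebuilt from the three rates used in parts b through d. The sketch below is illustrative; the variable names (including `sensitivity` for the .977 rate and `ppv` for the proportion in part f) are labels chosen here, not terms taken from the solution.

```python
population = 1_000_000
prevalence = 0.005    # proportion of the population carrying the virus
sensitivity = 0.977   # proportion of carriers who test positive
specificity = 0.926   # proportion of noncarriers who test negative

# round() guards against floating-point noise in the products.
carriers = round(population * prevalence)
noncarriers = population - carriers
true_positives = round(carriers * sensitivity)
false_negatives = carriers - true_positives
true_negatives = round(noncarriers * specificity)
false_positives = noncarriers - true_negatives

total_positives = true_positives + false_positives
total_negatives = false_negatives + true_negatives

# Proportion of positive tests that belong to actual carriers (part f).
ppv = true_positives / total_positives
print(total_positives, total_negatives, round(ppv, 4))
```

Changing `prevalence` while holding the other two rates fixed shows how strongly the answer to part f depends on how rare the condition is.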
•••
Homework Activities Activity 12-5: Normal Curves a. Mean 50; standard deviation 5 b. Mean 1100; standard deviation 300 c. Mean 20; standard deviation 40 d. Mean 225; standard deviation 75
Activity 12-6: Pregnancy Durations a. Pr(duration < 244) = Pr(Z < (244 − 270)/17) = Pr(Z < −1.53) = .0630 (Table II) or .0631 (applet) b. Pr(duration > 275) = Pr(Z > .29) = .3859 (Table II) or .3843 (applet) c. Pr(duration > 300) = Pr(Z > 1.76) = .0392 (Table II) or .0388 (applet) d. Pr(260 < duration < 280) = Pr(−.59 < Z < .59) = .7224 − .2776 = .4448 (Table II) or .4436 (applet) e. 508356/4112052 = .124; Pr(duration < 252) = Pr(Z < −1.06) = .1446. The proportion predicted by the model (.1446) is very close to the actual proportion of preterm deliveries (.124).
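The "applet" values in Activity 12-6 can be reproduced with Python's standard library, which evaluates the normal curve without rounding z to two decimals first. This is a sketch, not the applet itself; the mean of 270 and standard deviation of 17 are the values used in part a, and the variable names are hypothetical.

```python
from statistics import NormalDist

durations = NormalDist(mu=270, sigma=17)

below_244 = durations.cdf(244)                            # part a
above_275 = 1 - durations.cdf(275)                        # part b
above_300 = 1 - durations.cdf(300)                        # part c
between_260_280 = durations.cdf(280) - durations.cdf(260) # part d
below_252 = durations.cdf(252)                            # part e

for value in (below_244, above_275, above_300, between_260_280, below_252):
    print(round(value, 4))
```

The small gaps between the Table II and applet answers come entirely from rounding the z-score before looking it up.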
Activity 12-7: Professors’ Grades a. The following sketch shows both teachers’ grade distributions:
[Sketch: normal curves for Fisher and Savage; horizontal axis Grade, from 20 to 140]
b. Savage gives the higher proportion of As, as more than 25% of his grades are As. For Fisher, you calculate Pr(grade > 90) = Pr(Z > 2.29) = .0111. For Savage, you calculate Pr(grade > 90) = Pr(Z > 0.67) = .2525. c. Savage also gives a higher proportion of Fs, as almost 16% of his grades are Fs. For Fisher, you calculate Pr(grade < 60) = Pr(Z < −2.00) = .0228. For Savage, you calculate Pr(grade < 60) = Pr(Z < −1.00) = .1587.
Topic 12: Normal Distributions
d. You have z = 1.28. So you want to solve (grade − 69)/9 = 1.28; grade = 80.52. Therefore, you would need to score above 80.52 in order to earn an A in Professor DeGroot’s class.
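Part d works backward from a probability to a score. A brief sketch of the same step: z = 1.28 is the rounded table value that cuts off the top 10% of a standard normal distribution, and the mean of 69 and standard deviation of 9 are the values appearing in the equation above. The variable names are hypothetical.

```python
from statistics import NormalDist

# z-score with 90% of the standard normal distribution below it.
z_star = NormalDist().inv_cdf(0.90)

# Cutoff using the rounded table value, as in the solution.
cutoff_table = 69 + 9 * 1.28

# Cutoff using the unrounded z-score.
cutoff_exact = NormalDist(mu=69, sigma=9).inv_cdf(0.90)

print(round(z_star, 4), round(cutoff_table, 2), round(cutoff_exact, 2))
```

The two cutoffs differ by only about a hundredth of a point, so rounding z to 1.28 is harmless here.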
Activity 12-8: Professors’ Grades a. A score of 90 is more impressive with Professor Zeddes because it is rarer in his class.
[Sketch: normal curves for Zeddes and Wells; horizontal axis Final Exam Score, from 40 to 110]
For Zeddes, you calculate Pr(grade > 90) = Pr(Z > 3.00) = .0013. For Wells, you calculate Pr(grade > 90) = Pr(Z > 1.50) = .0668. b. A score of 60 is also more discouraging with Professor Zeddes because it is rarer in his class. For Zeddes, you calculate Pr(grade < 60) = Pr(Z < −3.00) = .0013. For Wells, you calculate Pr(grade < 60) = Pr(Z < −1.50) = .0668.
Activity 12-9: IQ Scores a. Here is a sketch of the distribution:
[Sketch: normal curve; horizontal axis IQ Score, from 70 to 160]
b. Answers will vary by student guess. c. Pr(score < 100) = Pr(Z < −1.25) = .1057 d. Pr(score > 130) = Pr(Z > 1.25) = .1057 e. Pr(110 < score < 130) = Pr(−.42 < Z < 1.25) = .8944 − .3372 = .5572 f. Pr(score < 75) = Pr(Z < −3.33) = .0004 g. The top 1% (or bottom 99%) corresponds to z = 2.33. You calculate (score − 115)/12 = 2.33, and solving for score you find score = 142.96. A student needs an IQ score greater than 142.96 to be in the top 1%.
Activity 12-10: Candy Bar Weights a. Pr(weight < 2.13) = Pr(Z < −1.75) = .0401 b. Pr(weight > 2.25) = Pr(Z > 1.25) = .1057 c. Pr(2.2 < weight < 2.3) = Pr(0 < Z < 2.5) = .9938 − .5 = .4938 d. You want Pr(weight < 2.13) = .001, so Z = (2.13 − mean)/.04 = −3.085. Thus, the mean = 2.2534 oz. e. You want Pr(weight < 2.13) = .001, so Z = (2.13 − 2.2)/σ = −3.085. Thus, the standard deviation σ = 0.02269 oz. f. You want Pr(weight < 2.13) = .001, so Z = (2.13 − 2.15)/σ = −3.085. Thus, the standard deviation σ = 0.006483 oz.
Activity 12-11: SATs and ACTs a. Pr(score > 1740) = Pr(Z > 1) = .1587 b. Pr(score > 30) = Pr(Z > 1.5) = .0668 c. Kathy seems to be stronger in terms of standardized test performance because a smaller proportion of her peers outperformed her.
Activity 12-12: Heights a. Pr(height < 66) = Pr(Z < −1.33) = .0918 (using Table II) or .0912 (applet) b. Pr(height > 72) = Pr(Z > .67) = .2514 (Table II) or .2525 (applet) c. z = 1.28; 1.28 = (height − 70)/3; height = 73.84 inches d. Pr(height < 66) = Pr(Z < .33) = .6293 (Table II) or .6306 (applet); Pr(height > 72) = Pr(Z > 2.33) = .0099 (Table II) or .0098 (applet); z = 1.28; 1.28 = (height − 65)/3; height = 68.84 inches e. Yes, these are generally consistent with your calculations. The calculations are much closer for men (9% vs. 11.7%) than for women (63% vs. 74%). f. Yes, these are generally consistent with your calculations, even more so than the previous calculations. For men, you have 25% vs. 29.9%, and for women, you have 1% vs. .5%.
Activity 12-13: Weights a. Pr(weight < 150) = Pr(Z < −0.71) = 23.89% (Table II) or 23.75% (applet); Pr(weight < 200) = Pr(Z < 0.71) = 76.11% (Table II) or 76.25% (applet); Pr(weight < 250) = Pr(Z < 2.14) = 98.38% (Table II) or 98.39% (applet) b. Pr(weight < 150) = Pr(Z < 0.33) = 62.93% (Table II) or 63.06% (applet); Pr(weight < 200) = Pr(Z < 2.00) = 97.73% (Table II) or 97.72% (applet); Pr(weight < 250) = Pr(Z < 3.67) = 99.99% (Table II) or 99.99% (applet) c. The normal model does a reasonable job of predicting these percentages. It tends to underestimate somewhat the percentage of both men and women who weigh less than 150 and 200 lbs, but it is very close with the percentages of men and women who weigh less than 250 lbs.
Activity 12-14: Dog Heights a. Pr(22.5 < height < 27.5) = Pr(−1 < Z < 1) = .8413 − .1587 = .6826 b. By the empirical rule, about 95% of all male Shelties’ heights fall between 12 and 18 inches (within two standard deviations). c. The top 10% (or bottom 90%) corresponds to z = 1.28 (Table II or technology), so then solving for height, you find: 1.28 = (height − 15)/1.5, so height = 16.92 inches d. For the Sheltie, Pr(height > 18) = Pr(Z > 2.00) = .0228. For the German Shepherd, Pr(height > 28) = Pr(Z > 1.2) = .1151. So, a Sheltie with a height of 18 inches is more unusual than a German Shepherd with a height of 28 inches.
Activity 12-15: Baby Weights a. You calculate z = (13.9 − 12.5)/1.5 = .93. At three months, Benjamin Chance’s weight was about 9/10 of a standard deviation above the average. b. Pr(weight > 13.9) = Pr(Z > .93) = .1762 (Table II) or .1753 (applet). You must assume that three-month-old American babies’ weights are normally distributed. c. By the empirical rule, to be in the middle 68% of weights, his weight would need to be within one standard deviation of the mean, so between 17.25 − 2 = 15.25 lbs and 17.25 + 2 = 19.25 lbs.
Activity 12-16: Coin Ages This distribution is not likely to be normally distributed; it is likely to be strongly skewed to the right because no coin can have an age less than 0 years (but there will be a few very old coins still in circulation). If the mean is 12.3 years and the average deviation from the mean is 9 years, the distribution cannot be symmetrically distributed about the mean and follow the empirical rule and therefore cannot follow a normal distribution.
Activity 12-17: Empirical Rule a. Pr(−1 < Z < 1) = .8413 − .1587 = .6826 b. Pr(−2 < Z < 2) = .9773 − .0228 = .9545 c. Pr(−3 < Z < 3) = .9987 − .0014 = .9973 d. To find the middle 50%, you need 25% of each side. Looking up .25 and .75 in the Standard Normal Probabilities Table (or using technology), you find z = ±.675. You calculate IQR = .675 − (−.675) = 1.35.
e. The z-scores for outliers are −.675 − 1.5(1.35) = −2.7 and .675 + 1.5(1.35) = 2.7. Using Table II, Pr(−2.7 < Z < 2.7) = .9965 − .0035 = .9930. So, the probability that an observation from the normal distribution will be classified as an outlier using the 1.5 × IQR rule is 1 − .993 or .007.
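Parts d and e can be checked without a table. The sketch below, an illustration with hypothetical variable names, finds the quartiles of the standard normal distribution, applies the 1.5 × IQR rule, and computes the chance that an observation lands beyond either fence.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Upper quartile of the standard normal; the lower quartile is its negative.
q_upper = std_normal.inv_cdf(0.75)
iqr = 2 * q_upper

# Upper fence of the 1.5 x IQR rule; the lower fence is symmetric.
upper_fence = q_upper + 1.5 * iqr

# Probability of falling beyond either fence.
prob_outlier = 2 * (1 - std_normal.cdf(upper_fence))

print(round(q_upper, 3), round(upper_fence, 2), round(prob_outlier, 3))
```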
Activity 12-18: Critical Values a. z* = 1.28 (probability less than .95) b. z* = 1.645 c. z* = 1.96 d. z* = 2.33 e. z* = 2.575
Activity 12-19: Body Temperatures a. The following histogram and normal probability plot display the data on body temperatures:
[Histogram: Body Temperature (in °F) versus Frequency; Mean 98.25, SD 0.7332, N 130]
[Normal probability plot: Body Temperature (in °F) versus Percentage]
These data seem fairly normally distributed.
b. The following histograms and normal probability plots display body temperature data for men and women:
[Histograms of frequency: Female Body Temperature (in °F) and Male Body Temperature (in °F)]
[Normal probability plots of percentage: Male Body Temperature (in °F) and Female Body Temperature (in °F)]
Based on these plots, the female body temperatures seem to more closely follow a normal model.
Activity 12-20: Natural Selection a. The normal model seems appropriate for these data, both for the sparrows that survived and for those that died. The following normal probability plots display the total sparrow length:
[Normal probability plots: Total Length (in mm), panels Died and Survived]
b. The variables keel of sternum, tibiotarsus, humerus, and femur seem to be well modeled by a normal distribution for both sparrows that died and survived. The following normal probability plots display these data:
[Normal probability plots, each with panels Died and Survived: Keel of Sternum (in inches), Tibiotarsus (in inches), Humerus (in inches), Femur (in inches)]
The variables length of beak and head and alar extent are well modeled by a normal distribution for the sparrows that died, but not well modeled for the sparrows that survived; the variables weight and skull width are in the reverse situation: well modeled by a normal distribution for the sparrows that survived, but not for the sparrows that died. The following normal probability plots display these data:
[Normal probability plots, each with panels Died and Survived: Length of Head and Beak (in mm), Alar Extent (in mm), Skull Width (in inches), Weight (in grams)]
Activity 12-21: Honda Prices The normal probability plots reveal that prices are approximately normally distributed, but the mileage and year variables are not (skewed to the right and to the left, respectively):
[Normal probability plots of percentage: Year, Price, and Mileage]
Activity 12-22: Your Choice a. Answers will vary, but some possibilities include these: Dickinson placement exam scores, body temperatures, birth weights, pregnancy durations, IQ scores, professor’s grades, SAT scores, ACT scores, candy bar weights, weights (of men and women), and dog heights. b. Answers will vary but some possibilities include: lengths of reigns of British monarchs, Olympic rowers’ weights, college football scores (Activity 7-11), running times of Hitchcock films, distances between exits on the Pennsylvania Turnpike, and coin ages.
The z-score for the observed sample proportion of .645 is therefore z = (.645 − .667)/.042 = −0.52, so the observed sample proportion .645 lies only about half of a standard deviation from the population proportion when π = .667.
c. The observed sample proportion is barely one-half of a standard deviation away from what you would expect if the population proportion were equal to 2/3, not a surprising result at all. Therefore, the sample data provide no reason to doubt that the population proportion of kissing couples who lean their heads to the right equals 2/3. d. The value .645 is pretty far along the lower tail of the second histogram. This indicates that the observed sample proportion would rarely occur if the population proportion were equal to 3/4. Further evidence of this result is provided by the rather large negative z-score: z = (.645 − .750)/√((.750)(.250)/124) = (.645 − .750)/.039 = −2.69.
Therefore, the sample data provide fairly strong evidence that the population proportion of kissing couples who lean their heads to the right is not 3/4 (because it would be rather surprising to find a sample proportion so far from this population proportion by chance alone). e. A reasonable estimate of the population proportion is the sample proportion .645. An estimate of the standard deviation of p̂ would then be √((.645)(.355)/124) = .043.
Doubling this standard deviation gives .086. The interval is, therefore, .645 ± .086, which runs from .559 to .731. Notice that 1/2 and 3/4 are not in this interval, but 2/3 is. The interval is consistent with the earlier analysis of the plausibility of the values 1/2, 2/3, and 3/4 for the population proportion of kissing couples who lean their heads to the right.
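The z-scores and the interval in this solution all rest on the same standard deviation formula for a sample proportion. The sketch below recomputes them with unrounded intermediate values, using the sample size of 124 and sample proportion of .645 from the text. The function name `z_score` and the parameter name `pi` (for the hypothesized population proportion) are labels chosen here.

```python
from math import sqrt

n, p_hat = 124, 0.645

def z_score(p_hat, pi, n):
    """Standardize p_hat, assuming the population proportion is pi."""
    return (p_hat - pi) / sqrt(pi * (1 - pi) / n)

z_two_thirds = z_score(p_hat, 2 / 3, n)
z_three_quarters = z_score(p_hat, 0.75, n)

# Estimated standard deviation of p_hat and the interval from part e.
se = sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - 2 * se, p_hat + 2 * se)

print(round(z_two_thirds, 2), round(z_three_quarters, 2))
print(round(interval[0], 3), round(interval[1], 3))
```

Because nothing is rounded along the way, the z-scores can differ from the hand calculations by about a hundredth; the conclusions are unaffected.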
•••
Homework Activities Activity 13-5: Parameters vs. Statistics a. Statistic (p̂) b. Parameter c. Statistic (x̄; population is all American households) d. Parameter e. Parameter f. Statistic (s) g. Parameter (population is all voters) h. Statistic (p̂) i. Statistic (p̂) j. Parameter (μ) k. Statistic (p̂) l. Statistic (x̄) m. Parameter n. Statistic (p̂) o. Statistic (x̄)
Topic 13: Sampling Distributions: Proportions
Activity 13-6: Generation M a. Parameter b. Statistic (p̂) c. Statistic (p̂) d. Statistic (x̄) e. Parameter
Activity 13-7: Presidential Approval a. The standard deviations are given here:
Population Proportion (π)   0   .2       .4       .5       .6       .8       1
Standard Deviation          0   .01265   .01549   .01581   .01549   .01265   0
b. The value .5 produces the most variability. c. The value 0 or 1 produces the least variability. d. If none (or all) of a population has a particular characteristic, then none (or all) of a sample must have this characteristic as well, leaving no variability in the sample proportion. Similarly, if the population proportion is close to 0 or 1, there is not much “room” for the sample proportion to vary away from the population value. But if exactly half of a population has the characteristic, this should produce the most varied sample proportions. e. Using a different sample size (500 rather than 1000) would not change the answers to parts a–c (the amount of variability would change, but not the fact that the variability is greatest at .5) because the sample size is a constant in the denominator for the standard deviation for each of these values.
Activity 13-8: Presidential Approval a. The standard deviations are given here:
n                    100        200       400        800        1600
Standard Deviation   .0489898   .034641   .0244949   .0173205   .0122474
b. As the sample size increases, the standard deviation decreases. c. The sample size must increase by a factor of 4 in order to cut the standard deviation in half. d. No, the answer to part c would not change if you used a proportion other than .4 to calculate the standard deviations. (This time the numerator, π(1 − π), is a constant in the calculations.)
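The standard deviations tabulated in Activities 13-7 and 13-8 can be checked with a short script. This is a minimal sketch, assuming the usual formula √(π(1 − π)/n) for the standard deviation of a sample proportion; the function name `sd_sample_proportion` is an illustrative choice, not something from the text.

```python
import math

def sd_sample_proportion(pi: float, n: int) -> float:
    """Standard deviation of a sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

# Activity 13-7: n is fixed at 1000 and the population proportion varies.
sd_by_pi = {pi: sd_sample_proportion(pi, 1000) for pi in (0, .2, .4, .5, .6, .8, 1)}

# Activity 13-8: the population proportion is fixed at .4 and n varies.
sd_by_n = {n: sd_sample_proportion(.4, n) for n in (100, 200, 400, 800, 1600)}

for pi, sd in sd_by_pi.items():
    print(f"pi = {pi:<4} SD = {sd:.5f}")
for n, sd in sd_by_n.items():
    print(f"n = {n:<5} SD = {sd:.7f}")
```

Comparing `sd_by_n[400]` with `sd_by_n[100]` shows the point of part c: quadrupling n halves the standard deviation.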
Activity 13-9: Pet Ownership a. No, you cannot be certain that the sample proportion of cat households in your sample will be closer to π than your competitor’s because of sampling variability, but it is much more likely. b. Yes, you have a better chance than your competitor of obtaining a sample proportion of cat households that falls within .05 of π because you are using a larger sample size. c. The standard deviations for sample size 50 and sample size 200 are given here:
n                    50     200
Standard Deviation   .061   .031
The sample size 200 produces the smaller standard deviation: half the size of the standard deviation when the sample size is 50. d. The applet reports that 431/500 or 86.2% of the sample proportions are within .05 of .25. e. The applet reports that 250/500 or 50% of the sample proportions are within .05 of .25. f. Both distributions are, as expected, approximately normal and centered at .25, but the distribution using samples of size 200 has a much smaller spread than the distribution using samples of size 50. With samples of size 200, the distribution extends from a minimum of only about .18 to a maximum of about .33, and more than 85% of the sample proportions fall within .05 of the mean (.25). In contrast, when the sample size is 50, the sampling distribution extends from 0 to above .4 and only about 50% of sample proportions are within .05 of the mean.
Activity 13-10: Calling Heads or Tails Answers will vary. Here is one representative set of answers. a. p̂ = 16/20 = .8 said they would call heads. This is a statistic. b. The following graph shows the distribution of the 1000 sample proportions:
[Histogram: Sample Proportion Calling Heads, horizontal axis from .1 to .9]
This distribution is approximately normal, centered at about .5, with a standard deviation of .117.
c. Based on this simulation, it would be extremely surprising to obtain this class result if, in fact, 50% of the population of students call heads. A value of .8 is in the far right tail of this distribution. (Values at least as extreme as .8 happened in only about .2% of samples in the simulation.) d. The following graph shows the distribution of the 1000 sample proportions:
[Histogram: p̂, horizontal axis from .3 to 1.1; Mean = .701, SD = .104]
This distribution is also approximately normal, but it is centered at about .7, with a standard deviation of .104. This time the class result is not uncommon, falling very close to the center of the distribution (within one standard deviation). (According to the applet, a result of at least 16/20 students calling heads happened 117/1000 times or 11.7% of the time.)
Activity 13-11: Racquet Spinning a. The value .50 is a parameter because it describes the long-run result of the spinning process. b. The value .46 is a statistic because it is the result of a sample. c. Answers will vary. The answers given here are from one particular running of the applet. d. The following graph shows the distribution of 1000 sample proportions:
[Histogram: Sample Proportion “Up”, horizontal axis from .35 to .70]
This distribution is approximately normal, with mean .502 and standard deviation .049.
e. The Central Limit Theorem predicts this distribution will be approximately normal, with mean .5 and standard deviation .05. The sampling distribution displayed by the applet simulation is very close to this. f. The applet reports that 190/1000 or 19% of the samples had a sample proportion of at least .54 and 183/1000 or 18.3% of the samples had a sample proportion of .46 or less. Together this is 37.3% of the samples. g. This answer suggests that .46 is not very unlikely to occur by chance alone if the results are 50/50 in the long run. Such an outcome will happen about 37% of the time by chance alone, so it is certainly not rare.
Activity 13-12: Halloween Practices a. The value .69 is a statistic because it is a number that summarizes a sample.
b. No, this finding does not prove that π = .69. If you were to take another random sample of 1005 adults, you would most likely find a different (although similar) proportion of adults who planned to give out Halloween treats. c. If π = .7, then the standard deviation is .0144, so .7 − 2(.0144) = .671. Yes, .69 would fall within two standard deviations of .7 in the sampling distribution. d. If π = .6, then the standard deviation is .01545, so .6 + 2(.01545) = .6309. No, .69 would not fall within two standard deviations of .6 in the sampling distribution. e. Using a common standard deviation of .015:
[Figure: three number lines, spanning .61 to .73, .58 to .70, and .64 to .76, each marked with a 2SD interval]
The potential values of π are .66 through .72. f. Based on your findings in part e, the plausible values for the percentage of the population who planned to give out Halloween treats from the doors of their homes in 1999 are between 66% and 72% inclusive.
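The search in part e can be sketched as a scan over candidate values of π, keeping those for which the observed .69 lies within two standard deviations. This is an illustrative sketch only: the candidate grid (.60 to .80 in steps of .01) and the names below are assumptions, and the common standard deviation of .015 is the one stated in part e.

```python
# Work in thousandths (integers) so that boundary cases such as .66 and .72
# are not affected by floating-point rounding.
OBSERVED = 690   # sample proportion .69
COMMON_SD = 15   # common standard deviation .015

def plausible_values(candidates):
    """Return the candidates (in thousandths) within 2 SD of the observed value."""
    return [c for c in candidates if abs(OBSERVED - c) <= 2 * COMMON_SD]

candidates = range(600, 801, 10)  # hypothetical grid: .60, .61, ..., .80
plausible = [c / 1000 for c in plausible_values(candidates)]
print(plausible)
```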
Activity 13-13: Distinguishing Between Colas a. 1/3 b. Roll the die 30 times to represent the 30 trials. If you roll a 5 or 6, consider this a success (i.e., you successfully identified the odd cola). Otherwise (if you roll a 1, 2, 3, or 4), you failed to identify the odd cola. c. See part d for example results from the applet.
d. Using the Reese’s Pieces applet, the following histogram displays the distribution of the 1000 sample proportions of correct answers in this experiment:
[Histogram: p̂, horizontal axis from 0 to .6; Mean = .336, SD = .086]
Yes, the shape of this sampling distribution is approximately normal. e. The empirical sampling distribution mean is .336 and the standard deviation is .086. The mean predicted by the CLT is .333, and the standard deviation predicted by the CLT is .086. The simulated sampling distribution and CLT values are very close. f. The applet reports that in 169/1000 or 16.9% of the samples, the subject guessed correctly 40% or more of the time. g. If a subject were correct 40% of the time in this experiment, you would not be convinced he/she was doing better than guessing would allow. If the subject were simply guessing, he/she would get 40% or more correct about 17% of the time, so this outcome is not all that surprising for a guesser. h. If a subject were correct 60% of the time in this experiment, you would be convinced he/she was doing better than guessing would allow because in this simulation the subjects never got 60% or more correct by simply guessing. i. Using the Reese’s Pieces applet, here are the results:
[Histogram: p̂, horizontal axis from .4 to 1.0; Mean = .666, SD = .083]
j. The rough shapes of the histograms should look like:
[Sketch: two histograms of Proportion Correct on a common horizontal axis from .0 to .9]
Both distributions are approximately normal and have the same spread. However, they have different centers. There is little overlap in the distributions. k. The applet reports that in 746/1000 or 74.6% of the samples, the subject guessed correctly 60% or more of the time.
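The comparison in part e between the simulated sampling distribution and the CLT prediction can be reproduced without the applet. The following is a rough sketch, assuming a subject who guesses with success probability 1/3 on each of 30 trials; the function name, the seed, and the use of Python's `random` module in place of the Reese's Pieces applet are all assumptions, so the simulated values will differ slightly from the .336 and .086 reported above.

```python
import math
import random
import statistics

def simulate_proportions(pi, n, reps, seed=0):
    """Simulate `reps` sample proportions, each from n independent trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() < pi for _ in range(n)) / n for _ in range(reps)]

PI_GUESS = 1 / 3   # chance of identifying the odd cola by guessing
N_TRIALS = 30

# CLT prediction for the sampling distribution of the sample proportion.
clt_mean = PI_GUESS
clt_sd = math.sqrt(PI_GUESS * (1 - PI_GUESS) / N_TRIALS)

props = simulate_proportions(PI_GUESS, N_TRIALS, reps=1000)
sim_mean = statistics.mean(props)
sim_sd = statistics.stdev(props)

print(f"CLT:       mean = {clt_mean:.3f}, SD = {clt_sd:.3f}")
print(f"Simulated: mean = {sim_mean:.3f}, SD = {sim_sd:.3f}")
```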
Activity 13-14: Pet Ownership a. 1/3 is a parameter because it describes all American households. b. Here is a sketch of the sampling distribution:
[Sketch: approximately normal curve over Sample Proportion of Cat Owners, horizontal axis from .328 to .338]
The CLT says this sampling distribution will be approximately normal, centered at 1/3, with a standard deviation of .001667.
c. By the empirical rule, 95% of all sample proportions should fall between .3297 and .3363 (within two standard deviations). d. This interval is so narrow because the sample size (80,000) is extremely large. e. p̂ = .316 is a statistic because it is a number obtained from a sample. f. z = (.316 − .333)/.001667 ≈ −10.2. g. This is an extremely unusual z-score. Pr(Z < −10.2) ≈ 0, so the sample data do provide evidence that the population proportion who own a pet cat is not one-third (you observed a sample result that pretty much never happens when π = 1/3, so you are convinced π ≠ 1/3).
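The z-score in part f and the tail probability in part g can be checked numerically. This is a minimal sketch using the rounded values from the text (.333 and .001667); the helper names are illustrative, and the tail is computed with `math.erfc` so that a very small probability is not rounded to exactly zero.

```python
import math

def z_score(p_hat, pi0, sd):
    """Number of standard deviations between the sample and hypothesized proportions."""
    return (p_hat - pi0) / sd

def lower_tail(z):
    """Pr(Z < z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z = z_score(0.316, 0.333, 0.001667)
tail = lower_tail(z)
print(f"z = {z:.1f}, Pr(Z < z) = {tail:.2e}")
```

The probability is positive but far too small to appear in a standard normal table, which is why the text reports it as approximately 0.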
Activity 13-15: Pet Ownership a. Here is a sketch of this sampling distribution:
[Sketch: approximately normal curve over Sample Proportion of Bird Owners, horizontal axis from .047 to .053]
The CLT says this sampling distribution will be approximately normal, centered at .05, with a standard deviation of .000771. b. This standard deviation is much smaller because you are assuming π is smaller (.05 rather than .333 and farther away from .5, see Activity 13-7). c. You calculate z = (.046 − .05)/.000771 ≈ −5.19. This is a very unusual z-score. Pr(Z < −5.19) ≈ 0, so the survey does provide evidence that the population proportion who own a pet bird is not 5%. (You observed a sample result that pretty much never happens when π = .05, so you are convinced π ≠ .05.)
Activity 13-16: Volunteerism a. The value 28.2% is a statistic; p̂ = .282. b. Here is a sketch of this sampling distribution:
[Sketch: approximately normal curve over Sample Proportion of Volunteers, horizontal axis from .2450 to .2550]
The CLT says this sampling distribution will be approximately normal, centered at .25, with a standard deviation of .001768. c. You calculate z = (.282 − .25)/.001768 ≈ 18.1. d. Yes, this z-score is extreme enough to cast doubt on the assertion that 25% of the population participated in a volunteer activity. Pr(Z > 18.1) ≈ 0, so if π really is .25, you would never expect to see such a sample result. Yet you did see this sample result, so you have very strong evidence that π is not .25. e. Here is a sketch of this sampling distribution:
[Sketch: approximately normal curve over Sample Proportion of Volunteers, horizontal axis from .20 to .32]
The CLT says this sampling distribution will be approximately normal, centered at .25, with a standard deviation of .0194. You calculate z = (.282 − .25)/.0194 ≈ 1.65. This is not a particularly extreme z-score. Pr(Z > 1.65) = .0495; this means you have some evidence to make you doubt that π really is .25, but the evidence is not overwhelming. f. It makes sense that your answers differ so much because the sample sizes are so different. A small difference in sample and population proportion would be very surprising with a sample of size 80,000, but would not be as surprising with only 500 people.
Activity 13-17: Pursuit of Happiness a. You calculate z = (.84 − .8)/.007286 ≈ 5.49. b. Yes, this z-score is extreme enough to cast doubt on the assertion that 80% of the population felt happy. Pr(Z > 5.49) ≈ 0, so if π really is .80, you would never expect to see such a sample result. Yet you did see this sample result, so you have strong evidence that π is not .80. c. The z-scores for candidate values of π are:
π = .82: z = (.84 − .82)/√((.82)(.18)/3014) ≈ 2.858
π = .83: z = (.84 − .83)/√((.83)(.17)/3014) ≈ 1.462
π = .84: z = (.84 − .84)/√((.84)(.16)/3014) = 0.00
π = .85: z = (.84 − .85)/√((.85)(.15)/3014) ≈ −1.538
π = .86: z = (.84 − .86)/√((.86)(.14)/3014) ≈ −3.164
π = .87: z = (.84 − .87)/√((.87)(.13)/3014) ≈ −4.897
π = .88: z = (.84 − .88)/√((.88)(.12)/3014) ≈ −6.758
Plausible values of the population proportion include .83 through .85 because these values lie within two standard deviations of the observed sample proportion.
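The z-scores in part c all follow the same recipe, so they can be generated in a loop. The sketch below assumes the sample size 3014 and sample proportion .84 used in the calculations above; the function and variable names are illustrative.

```python
import math

N = 3014       # sample size used in the calculations above
P_HAT = 0.84   # observed sample proportion

def z_for_candidate(pi):
    """z-score of the observed proportion if the population proportion were pi."""
    sd = math.sqrt(pi * (1 - pi) / N)
    return (P_HAT - pi) / sd

candidates = [0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88]
z_scores = {pi: z_for_candidate(pi) for pi in candidates}

# Candidates for which the observed proportion is within two standard deviations.
plausible = [pi for pi, z in z_scores.items() if abs(z) <= 2]

for pi, z in z_scores.items():
    print(f"pi = {pi:.2f}: z = {z:+.3f}")
print("plausible:", plausible)
```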
Activity 13-18: Cursive Writing a. The standard deviation of the sample proportion is √((.15)(.85)/n). b. You need 2√((.15)(.85)/n) ≤ .05; thus n ≥ (.15)(.85)/(.025)² = 204. c. You calculate 2√((.15)(.85)/n) ≤ .02; thus n ≥ (.15)(.85)/(.01)² = 1275. Thus, you need a much larger sample size to be as “confident” that your sample proportion will fall within this smaller range of the population proportion.
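The sample-size calculations in parts b and c amount to solving 2√(π(1 − π)/n) ≤ margin for n. A small sketch of that rearrangement follows; it uses exact fractions so that boundary values such as 204 are not pushed up by floating-point error. The function name and the choice of `fractions.Fraction` are assumptions made for illustration.

```python
import math
from fractions import Fraction

def min_sample_size(pi: Fraction, margin: Fraction) -> int:
    """Smallest n for which 2 * sqrt(pi * (1 - pi) / n) <= margin."""
    # Squaring and rearranging gives n >= 4 * pi * (1 - pi) / margin**2.
    return math.ceil(4 * pi * (1 - pi) / margin ** 2)

PI = Fraction(15, 100)
n_for_05 = min_sample_size(PI, Fraction(5, 100))  # two SDs within .05
n_for_02 = min_sample_size(PI, Fraction(2, 100))  # two SDs within .02
print(n_for_05, n_for_02)
```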
Assessment Sample Quiz 13A
•••
Suppose that 80% of all the incoming email messages for a college’s computer system are spam. 1. Is the 80% value a parameter or a statistic for the population of incoming email messages at this college? Explain.
Topic 14: Sampling Distributions: Means
[Sketch: sampling distributions of the Average Height in Sample (in inches) for n = 30 and n = 100, horizontal axis from 67 to 71; the value 0.548 is marked]
•••
Homework Activities Activity 14-5: Heart Rates a. Yes, this distribution appears approximately normal, although it is a little skewed to the right. b. The probability plot indicates that this data is not entirely normally distributed; there is too much data in the right tail and not enough in the left, but it is fairly close to normal. c. The following graph displays the distribution of the 1000 sample means with n = 3:
[Histogram: Frequency versus Sample Mean (in bpm), horizontal axis from 80 to 150]
This distribution appears normally distributed with mean 108.19 bpm and standard deviation 11.58 bpm. d. Yes, the distribution appears roughly normal in spite of the small sample size. This is because the population itself was so close to being normally distributed.
e. The following graph displays the distribution of the 1000 sample means, with n = 10:
[Histogram: Frequency versus Sample Mean (in bpm), horizontal axis from 90 to 126]
This distribution is approximately normally distributed with mean 108.4 bpm and standard deviation 6.414 bpm. With the larger sample size, the distribution is more normal (less skewed to the right), and of course the spread is substantially smaller.
Activity 14-6: Sampling Words a. No, the population of word lengths is not normal; it is skewed to the right because the shortest a word can be is one letter and many words had 1–3 letters, but there were several longer words. b. Answers to b–d will vary. These are examples based on one particular running of the applet. With a sample size of 1, the distribution of sample means looks almost exactly like the population distribution; it has the same shape, center, and spread.
c. With a sample of size 5, the sampling distribution has a center roughly equal to the population center (4.38), a smaller standard deviation (0.93), and looks more normal, though it still has a slight skew to the right. d. With a sample of size 10, the sampling distribution has a center roughly equal to the population center (4.30), an even smaller standard deviation (0.66), and its shape is much closer to normal.
Activity 14-7: Christmas Shopping a. A sample mean of $857 or greater would be more likely if the standard deviation were $1250, because the greater the variation in the values, the more likely you are to have values in your sample that are far from the population mean. Therefore, the more likely you are to get a sample mean away from the population mean. b. A sample mean of $857 or greater would be more likely if the standard deviation were $1250 for the same reason as in part a: The population mean does not affect the amount of sampling variability.
c. The CLT says that the sampling distribution of x̄ would be approximately normal, with mean $850 and standard deviation $1250/√922 ≈ $41.47.
Because the sample size is large, your answer does not depend on the shape of the distribution of expected expenditures in the population. Here is a sketch of the distribution:
[Sketch: approximately normal curve over Sample Mean Expected Christmas Expenditures (in dollars), horizontal axis from 700 to 1000]
No, a sample mean of $857 would not be at all surprising as the value is close to the center of the distribution and there is a great deal of area to the right of this value in the plot.
The CLT says that the sampling distribution of x̄ would be approximately normal, with mean $800 and standard deviation $1250/√922 ≈ $41.47. Because the sample size is large, your answer does not depend on the shape of the distribution of expected expenditures in the population:
[Sketch: approximately normal curve over Sample Mean Expected Christmas Expenditures (in dollars), horizontal axis from 700 to 950]
A sample mean of $857 would not be terribly surprising in this situation, either. The standard deviation should be $1250/√922 ≈ $41.47. You calculate 2 × $41.47 = $82.94, so $857 − $82.94 = $774.06, and $857 + $82.94 = $939.94. The interval of values is [$774.06, $939.94].
d. A sample mean of $857 is not surprising with either population mean when the population standard deviation is as large as $1250, and the interval of plausible values of μ becomes very wide in this case (it has a width of more than $165). It is much more difficult to narrow down the value of μ when there is more variability in the population distribution.
Activity 14-8: Random Babies a. No, this population distribution is not normal or close to normal. It is strongly skewed to the right (not symmetric). b. No, the conditions needed by the CLT are not met. The sample size is too small (less than 30) and the population distribution is not normal. c. Yes, the CLT says that the resulting sample means will closely follow a normal distribution because the sample size is large (greater than 30). d. The CLT says the mean will be approximately 1 and the standard deviation will be about 1/10 or .1. e. Here is a sketch of the (approximate) sampling distribution of the sample mean number of matches when the sample size is 100:
[Sketch: approximately normal curve over Sample Mean Number of Baby Matches, horizontal axis from 0.7 to 1.3]
f. By the empirical rule (because you are assuming normality), Pr(.8 < sample mean < 1.2) ≈ .95, because this is two standard deviations on either side of the mean. g. About 95% of the intervals should capture the population mean, by the same reasoning as used in part f.
Activity 14-9: Birth Weights a. The first histogram is from samples of size 10 because it has less variation (a smaller standard deviation) than the second histogram. b. Sample size 5 is more likely to produce a sample mean birth weight less than 2500 grams. c. Sample size 5 is more likely to produce a sample mean birth weight less than 3000 grams. d. Sample size 5 is more likely to produce a sample mean birth weight greater than 3500 grams. e. Sample size 10 is more likely to produce a sample mean birth weight between 3000 and 3500 grams. f. The larger the sample size, the smaller the standard deviation, which means the smaller the spread of the sampling distribution of the sample mean. This means that with a large sample size, the sample means are more likely to be near the population mean, and with a small sample size, the sample means are more likely to be far from the population mean.
Activity 14-10: Candy Bar Weights a. The sampling distribution of the mean weights will be normally distributed with mean 2.2 ounces and standard deviation .04/√5 ≈ 0.0179 ounces. b. Here is a sketch of the sampling distribution:
[Sketch: normal curve over Mean Candy Bar Weights (in ounces), horizontal axis from 2.150 to 2.250]
c. It is possible to get a mean weight this low, but it is not very likely. d. Yes, it is pretty unlikely to get a mean weight as low as 2.15, judging from the graph. e. Yes, finding the sample mean weight to be 2.15 ounces would provide strong evidence against the manufacturer’s claim that μ = 2.2 ounces because this (sample means as small as 2.15 oz) happens rarely when μ actually is 2.2. f. No, finding the sample mean weight to be 2.18 ounces would not provide any evidence against the manufacturer’s claim that μ = 2.2 ounces because, according to the graph, this (sample means as small as 2.18) happens frequently by chance alone. g. You calculate 2.2 ± 2(.0179) = 2.2 ± .0358, which gives [2.164, 2.236]. So, sample mean weights less than 2.164 ounces or greater than 2.236 ounces would provide strong evidence against the manufacturer’s claim as they would happen less than 5% of the time by the random chance of the sampling process alone.
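The two-standard-deviation cutoffs in part g can be recomputed directly. This is a minimal sketch using the values from this activity (mean 2.2 ounces, standard deviation .04 ounces, samples of 5 bars); the helper `is_surprising` is a hypothetical name for the rule "outside two standard deviations of the claimed mean".

```python
import math

MU = 2.2       # manufacturer's claimed mean weight (ounces)
SIGMA = 0.04   # population standard deviation (ounces)
N = 5          # candy bars per sample

sd_mean = SIGMA / math.sqrt(N)                      # SD of the sample mean
lower, upper = MU - 2 * sd_mean, MU + 2 * sd_mean   # two-SD interval around MU

def is_surprising(sample_mean):
    """True when the sample mean lies outside the two-SD interval."""
    return not (lower <= sample_mean <= upper)

print(f"SD of sample mean = {sd_mean:.4f}")
print(f"interval = [{lower:.3f}, {upper:.3f}]")
print(is_surprising(2.15), is_surprising(2.18))
```

Under this rule a sample mean of 2.15 falls outside the interval and 2.18 falls inside it, in line with parts e and f.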
Activity 14-11: Cars’ Fuel Efficiency a. No, it would not be surprising to obtain 30.4 mpg for one tankful. This value is well within one standard deviation of the mean. b. The CLT predicts a sampling distribution that would be approximately normal with mean 31 mpg and standard deviation 3/√30 ≈ 0.5477 mpg. According to the sketch, 30.4 mpg would not be an unusual mean value for a sample of 30 cars.
[Sketch: Average 1999 Passat Fuel Efficiency (in mpg) (n = 30)]
c. The CLT predicts a sampling distribution that would be (approximately) normal, with mean 31 mpg and standard deviation 3/√60 ≈ 0.3873 mpg. According to the sketch, 30.4 mpg would not be a very unusual mean value for a sample of 60 cars to obtain (it is not in the extreme tail of the distribution).
[Sketch: Average 1999 Passat Fuel Efficiency (in mpg) (n = 60)]
d. The CLT predicts a sampling distribution that would be approximately normal, with mean 31 mpg and standard deviation 3/√150 ≈ 0.2449 mpg. According to the sketch, 30.4 mpg would be a somewhat unusual mean value for a sample of 150 cars to obtain.
[Sketch: Average 1999 Passat Fuel Efficiency (in mpg) (n = 150)]
e. No, because all of these sample sizes are at least 30, none of these responses depend on knowing the shape of the population distribution.
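One way to see why 30.4 mpg becomes more unusual as the sample grows is to compute its z-score for each sample size in parts a through d. The sketch below assumes the mean of 31 mpg and standard deviation of 3 mpg used in this activity; the names are illustrative.

```python
import math

MU, SIGMA, OBSERVED = 31.0, 3.0, 30.4

def z_of_sample_mean(n):
    """z-score of the observed mean when averaging n cars."""
    return (OBSERVED - MU) / (SIGMA / math.sqrt(n))

z_by_n = {n: z_of_sample_mean(n) for n in (1, 30, 60, 150)}
for n, z in z_by_n.items():
    print(f"n = {n:<3} z = {z:.3f}")
```

The same 0.6 mpg shortfall is a fraction of a standard deviation for a single tankful but roughly two and a half standard deviations for a sample of 150 cars.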
Activity 14-12: Tip Percentages a. The observational units are the tables in the restaurant. b. The variable is the percentage tip; it is quantitative. c. The sampling distribution will be approximately normal, with mean 15% and standard deviation 4/√50 ≈ .566%. Here is a sketch of the sampling distribution of her sample mean tip percentage:
[Sketch: approximately normal curve over Sample Mean Tip Percentage, horizontal axis from 13 to 17]
d. By the empirical rule, there is a 68% chance that her sample mean tip percentage will fall between 15 − 0.566 and 15 + 0.566, or about 14.43% and 15.57%.
Activity 14-13: IQ Scores It is more likely that a randomly selected resident will have an IQ greater than 120 than that a sample of 10 residents will have an average IQ greater than 120. It is much more likely that a randomly selected individual will have an IQ far from the center than that the average IQ of a group will be far from the center. It is really hard to select a group that is “different” from average, but it is not so hard to select a single individual who is different from “average.”
Sample Quiz 14A
•••
Suppose that a phone company reports that the average duration of a cell phone call is 1.7 minutes, with a standard deviation of 1.4 minutes. 1. Would it be reasonable to use a normal distribution to model the duration of cell phone calls? Explain, based primarily on the values reported above. 2. Suppose you want to examine a random sample of 60 cell phone calls. Do you think it would be reasonable to use the Central Limit Theorem to describe the sampling distribution of the sample mean call duration? Explain. 3. What does the CLT say about the sampling distribution of the sample mean call duration in a random sample of 60 calls? 4. Draw a well-labeled sketch to accompany your answer to question 3. 5. Describe how the sketch would change if the sample size were 160 calls rather than 60 calls.
Solution to Sample Quiz 14A
•••
1. No, it would not be reasonable to model the duration of cell phone calls with a normal distribution because the distribution could not be symmetric with these mean and standard deviation values. Two standard deviations below the mean would indicate negative lengths of cell phone calls, whereas two standard deviations above the mean would indicate calls lasting 4.5 minutes, which is a very reasonable (not very extreme) cell phone call length. It seems more plausible to use a distribution that is skewed to the right to model cell phone call lengths. 2. Yes, it would be reasonable to use the Central Limit Theorem to describe the distribution of the sample mean call duration because the sample size used is large (≥ 30). 3. The CLT says the sampling distribution of the sample mean call duration will be approximately normal, with mean 1.7 minutes and standard deviation 1.4/√60 ≈ .181 minutes.
Assessment
Activity 15-5: Capsized Tour Boat
First, weight is a quantitative variable, so the relevant statistic is the sample mean weight of the 47 passengers. Because the question is phrased in terms of the total weight in a sample of 47 adults, you must rephrase it in terms of the sample mean weight. If total weight exceeds 7500 pounds, then the sample mean weight must exceed 7500/47 or 159.574 pounds. So, you want to find the probability that x̄ > 159.574 (with n = 47 and σ = 35). The CLT applies because the sample size (n = 47) is fairly large, greater than 30. The sampling distribution of x̄ is, therefore, approximately normal with mean 167 pounds and standard deviation equal to σ/√n = 35/√47 ≈ 5.105 pounds. A sketch of this sampling distribution is shown here:
[Sketch: approximately normal curve over Average Weight in Sample (in pounds), horizontal axis from 150 to 185, centered at 167 with standard deviation 5.105]
Now you can use the Normal Probability Calculator applet or the Standard Normal Probabilities Table to find the probability of interest. The z-score corresponding to a sample mean weight of 159.574 pounds is (159.574 − 167)/5.105 ≈ −1.45. The probability of the weight being less than 159.574 pounds is found from the table to be .0736, so the probability of exceeding this weight is 1 − .0736 = .9264. It’s not surprising the boat capsized with 47 passengers!
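As a cross-check on the table lookup, the same probability can be computed in software. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the applet or table; that substitution is an assumption of this example, and because it does not round the z-score to two decimals, the result differs from .9264 in the third decimal place.

```python
import math
from statistics import NormalDist

# Sampling distribution of the mean weight of 47 passengers.
mean_dist = NormalDist(mu=167, sigma=35 / math.sqrt(47))

threshold = 7500 / 47                        # mean weight at the 7500-pound limit
z = (threshold - 167) / mean_dist.stdev      # z-score of the threshold
p_exceed = 1 - mean_dist.cdf(threshold)      # Pr(sample mean > threshold)

print(f"SD = {mean_dist.stdev:.3f}, z = {z:.2f}, Pr(exceed) = {p_exceed:.4f}")
```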
•••
Homework Activities Activity 15-6: Means or Proportions a. Sample mean b. Sample proportion c. Sample mean d. Sample mean e. Sample proportion
Activity 15-7: Candy Colors a. The CLT says the sample proportions will be approximately normally distributed, with a mean of .45 and a standard deviation of √((.45)(.55)/75) ≈ .0568.
Topic 15: Central Limit Theorem
b. Here is a sketch of this sampling distribution:
[Sketch: Sample Proportion of Orange Reese’s Pieces (n = 75), horizontal axis from .30 to .65]
Answers will vary by student guesses. c. Pr(p̂ < .4) = Pr(Z < (.4 − .45)/.0568) = Pr(Z < −.88) = .1894 d. Pr(.35 < p̂ < .55) = Pr((.35 − .45)/.0568 < Z < (.55 − .45)/.0568) = Pr(−1.76 < Z < 1.76) = .9608 − .0392 = .9216 (Table II) or .9217 (applet) e. Yes, these probabilities are very close; they are virtually identical. The simulated probability from the applet was 92%, and the normal model predicts a probability of 92.1%.
Activity 15-8: Candy Colors a. The CLT says the sample proportions will be approximately normally distributed with a mean of .45 and a standard deviation of √((.45)(.55)/175) ≈ .0372. The only change from when the sample size was 75 is the standard deviation, which is now smaller.
b. Here is a sketch of this sampling distribution:
[Sketch: Sample Proportion of Orange Reese’s Pieces (n = 175), horizontal axis from .35 to .55, with a shaded area of .0895]
Answers will vary by student guesses.
c. Pr(p̂ < .4) = Pr(Z < (.4 − .45)/.0372) = Pr(Z < −1.34) = .0901 (Table II) or .0895 (applet) d. This probability is less than when the sample size is 75. This result makes sense because the standard deviation (spread) has decreased and thus there are fewer sample proportions as far from the center of .45. e. Pr(.35 < p̂ < .55) = Pr((.35 − .45)/.0372 < Z < (.55 − .45)/.0372) = Pr(−2.69 < Z < 2.69) = .9964 − .0036 = .9928. f. This probability is greater than when the sample size is 75. This makes sense because the standard deviation has decreased, which concentrates more of the area under the curve near .45 (the mean).
Activity 15-9: Smoking Rates a. The CLT predicts this distribution will be approximately N(.276, √(.276(1 − .276)/400) ≈ .02235). Here is a sketch of the sampling distribution of the sample proportion of Kentucky smokers:
[Sketch: Sample Proportion of Smokers in Kentucky (n = 400), horizontal axis from .20 to .36]
b. You calculate Pr(p̂ < .251) = Pr(Z < (.251 − .276)/.02235) = Pr(Z < −1.12) = .1314. Because the normal distribution is symmetric, .301 will have a z-score of 1.12 and the area to the right of z = 1.12 will also be .1314. Thus, you can double this probability to find that the probability of obtaining a sample proportion of Kentucky smokers more than .025 away from .276 is 2 × .1314 or .2628.
c. Pr(p̂ < .226) = Pr(Z < (.226 − .276)/.02235) = Pr(Z < −2.24) = .0125. Thus, the probability of obtaining a sample proportion of Kentucky smokers more than .05 away from .276 is 2 × .0125 or .025.
d. You would have no reason to doubt that the state is Kentucky because part b shows that if the state is Kentucky, you have a greater than 26% chance of finding a sample result at least as extreme as 25% smokers. e. Now you would have reason to doubt that the state is Kentucky because part c shows that if the state is Kentucky, you have less than a 2.5% chance of finding a sample result at least as extreme as 22.5% smokers.
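The two-sided probabilities behind parts b through e can be sketched with the standard library's `statistics.NormalDist`. The function name `prob_farther_than` is illustrative, and because the z-scores are not rounded to two decimals here, the results agree with the table-based values (.2628 and .025) only to about three decimal places.

```python
import math
from statistics import NormalDist

PI, N = 0.276, 400
sampling_dist = NormalDist(mu=PI, sigma=math.sqrt(PI * (1 - PI) / N))

def prob_farther_than(distance):
    """Two-sided probability that the sample proportion is more than `distance` from PI."""
    # By symmetry, double the lower-tail area.
    return 2 * sampling_dist.cdf(PI - distance)

print(f"more than .025 away: {prob_farther_than(0.025):.4f}")
print(f"more than .05 away:  {prob_farther_than(0.05):.4f}")
```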
Activity 15-10: Candy Bar Weights a. Pr(2.18 < weight < 2.22) = Pr((2.18 − 2.20)/0.04 < Z < (2.22 − 2.20)/0.04) = Pr(−.50 < Z < .50) = .6915 − .3085 = .3830 (Table II) or .3829 (applet)
b. Yes, the CLT applies because the population has a normal distribution as long as your sampling method behaves like a simple random sample. c. The CLT says the sample means will be normally distributed with mean 2.20 ounces and standard deviation .04/√5 ≈ .01789. d. Here is a sketch of the sampling distribution:
[Sketch: Sample Average Candy Bar Weight (n = 5), horizontal axis from 2.150 to 2.250, with 2.18 and 2.22 marked]
Answers will vary by student guess. This value should be greater than the answer to part a.
e. You calculate Pr(2.18 < X̄ < 2.22) = Pr((2.18 − 2.20)/.01789 < Z < (2.22 − 2.20)/.01789) = Pr(−1.12 < Z < 1.12) = .8686 − .1314 = .7372 (Table II) or .7364 (applet). This probability is indeed greater than the probability you found in part a.
f. Answers will vary by student guess. You should guess that the probability will increase if the sample size is increased to 40 because this will decrease the standard deviation, which will concentrate more area under the curve near the middle (mean).
g. Now the standard deviation of the sample means is .04/√40 ≈ .006325, so the distribution is N(2.2, .006325). You calculate Pr(2.18 < X̄ < 2.22) = Pr((2.18 − 2.20)/.006325 < Z < (2.22 − 2.20)/.006325) = Pr(−3.16 < Z < 3.16) = .9992 − .0008 = .9984 (Table II) or .9984 (applet)
This is greater than the answer in part e. h. The calculations in part g remain approximately correct regardless of the distribution of candy bar weights because the sample size was large (40 ≥ 30).
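Parts a, e, and g repeat one calculation with different sample sizes, which a short function makes explicit. This sketch assumes the normal model with mean 2.20 and standard deviation .04 from this activity and uses `statistics.NormalDist` rather than Table II, so its results match the applet values more closely than the table values.

```python
import math
from statistics import NormalDist

MU, SIGMA = 2.20, 0.04

def prob_within(n, half_width=0.02):
    """Pr(MU - half_width < sample mean < MU + half_width) for samples of size n."""
    dist = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(n))
    return dist.cdf(MU + half_width) - dist.cdf(MU - half_width)

probs = {n: prob_within(n) for n in (1, 5, 40)}
for n, p in probs.items():
    print(f"n = {n:<2} Pr(2.18 < mean < 2.22) = {p:.4f}")
```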
Activity 15-11: Christmas Shopping a. No, it is not valid to use the CLT because the sample size is too small and you do not know that the population is normally distributed.
b. Yes, with a sample size of 500, the CLT tells you about the sampling distribution of the sample means, which would be $N(\$850, \$250/\sqrt{500} = \$11.18)$. Here is a sketch of the sampling distribution:
[Figure: normal curve; horizontal axis Sample Average Expected Christmas Expenditures (in dollars), from 810 to 890]
c. $\Pr(\$831.61 < \bar{X} < \$868.39) = \Pr\left(\frac{831.61 - 850}{11.18} < Z < \frac{868.39 - 850}{11.18}\right) = \Pr(-1.64 < Z < 1.64) = .9495 - .0505 = .8990$ (Table II) or .8989 (applet)
d. $\Pr(\$828.09 < \bar{X} < \$871.91) = \Pr\left(\frac{828.09 - 850}{11.18} < Z < \frac{871.91 - 850}{11.18}\right) = \Pr(-1.96 < Z < 1.96) = .9750 - .0250 = .9500$ (Table II) or .9499 (applet)
e. $\Pr(\$821.20 < \bar{X} < \$878.80) = \Pr\left(\frac{821.20 - 850}{11.18} < Z < \frac{878.80 - 850}{11.18}\right) = \Pr(-2.58 < Z < 2.58) = .9951 - .0049 = .9902$ (Table II) or .9900 (applet)
f. First, find the z-scores that mark 80% of the area in the middle of the standard normal curve: $\Pr(-z^* < Z < z^*) = .8000 \rightarrow \Pr(-1.28 < Z < 1.28) \approx .8000$
As $z = (\bar{x} - 850)/11.18$ and $z = 1.28$, $\bar{x} = (1.28)(11.18) + 850 = 864.31$. Therefore, $k = 864.31 - 850 = 14.31$.
g. $\Pr(\$981.61 < \bar{X} < \$1018.39) = \Pr\left(\frac{981.61 - 1000}{11.18} < Z < \frac{1018.39 - 1000}{11.18}\right) = \Pr(-1.64 < Z < 1.64) = .9495 - .0505 = .8990$ (Table II) or .8989 (applet)
This is exactly what you found in part c. The probability of falling within $18.39 of $\mu$ is the same, regardless of what value you use for $\mu$.
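The half-widths in parts c through f all come from multiplying a critical value by the standard deviation of the sample mean. The sketch below assumes Python's standard `statistics.NormalDist`; the helper name `half_width` is hypothetical. It shows how the dollar amounts $18.39, $21.91, and $28.80 arise from middle areas of 90%, 95%, and 99%.

```python
from math import sqrt
from statistics import NormalDist

def half_width(middle_area, sd):
    """Distance k such that Pr(mean - k < X-bar < mean + k) = middle_area."""
    # The upper cutoff leaves (1 - middle_area) / 2 in the right tail.
    z_star = NormalDist().inv_cdf(0.5 + middle_area / 2)
    return z_star * sd

sd = 250 / sqrt(500)  # about 11.18, the SD of the sample mean in part b
for area in (0.80, 0.90, 0.95, 0.99):
    print(area, round(half_width(area, sd), 2))
```

For the 80% case, the software uses an unrounded critical value (about 1.2816 rather than 1.28), so it returns roughly 14.33 rather than the 14.31 found in part f.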
Activity 15-12: Jury Selection a. The CLT applies to the jury pool with a sample size of 75 because it is large ($75 \times .20 = 15 \geq 10$). The CLT would not apply for the jury (sample size 12).
b. The CLT predicts the sampling distribution would be approximately $N(.2, \sqrt{.2(.8)/75} = .046188)$. Here is a sketch of the sampling distribution of the sample proportion of senior citizens:
[Figure: normal curve; horizontal axis Sample Proportion of Senior Citizens (n = 75), from .05 to .35]
c. $\Pr(\hat{p} \geq .333) = \Pr\left(Z \geq \frac{.333 - .2}{.046188}\right) = \Pr(Z \geq 2.88) = .0020$
d. This is the same as the empirical probability found in Activity 11-4, part e. (Answers will vary).
Activity 15-13: Non-English Speakers a. The CLT says this sampling distribution will be approximately $N(.315, \sqrt{.315(1 - .315)/100} = .046452)$. Here is a sketch of the sampling distribution:
[Figure: normal curve; horizontal axis Sample Proportion of Non-English Speakers, from .1 to .5]
b. $\Pr(\hat{p} > .5) = \Pr\left(Z > \frac{.5 - .315}{.046452}\right) = \Pr(Z > 3.98) \approx 0.00$
c. $\Pr(\hat{p} < .25) = \Pr\left(Z < \frac{.25 - .315}{.046452}\right) = \Pr(Z < -1.40) = .0808$ (Table II) or .0809 (applet)
d. $\Pr(.2 < \hat{p} < .5) = \Pr\left(\frac{.2 - .315}{.046452} < Z < \frac{.5 - .315}{.046452}\right) = \Pr(-2.48 < Z < 3.98) = 1.000 - .0066 = .9934$
e. Here is a sketch of the sampling distribution for Ohio:
[Figure: normal curves labeled Ohio and California; horizontal axis Sample Proportion of Non-English Speakers, from .0 to .5]
f. Judging from the plot, in Ohio, $\Pr(\hat{p} > .5)$ is 0, $\Pr(\hat{p} < .25)$ should be near 1, and $\Pr(.2 < \hat{p} < .5)$ should be near 0.
Activity 15-14: Solitaire a. Author A would need to play at least 90 games in order for $n\pi = n(1/9) \geq 10$, which would let you use the CLT. b. Author B would need to play at least 60 games in order for $n\pi = n(1/6) \geq 10$, which would let you use the CLT.
c. If $\pi = .8$, then the authors would need to play at least $10/.8 = 12.5$, or 13, games in order to use the CLT to approximate the sampling distribution of the sample proportion of wins for Author B.
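The game counts above come from solving $n\pi \geq 10$ for $n$ and rounding up. A small Python sketch of that arithmetic follows; the function name `min_games` is hypothetical, and exact fractions are used so that values such as 1/9 do not pick up floating-point rounding. It checks only the $n\pi \geq 10$ condition used in this solution, not the companion condition on $n(1 - \pi)$.

```python
from fractions import Fraction
from math import ceil

def min_games(pi, threshold=10):
    """Smallest whole n with n * pi >= threshold."""
    return ceil(Fraction(threshold) / Fraction(pi))

print(min_games(Fraction(1, 9)))   # Author A
print(min_games(Fraction(1, 6)))   # Author B
print(min_games(Fraction(4, 5)))   # part c, pi = .8
```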
Activity 15-15: Birth Weights a. The z-score is $(2500 - 3300)/570 = -1.40$, so $\Pr(\text{weight} < 2500) = \Pr(Z < -1.40) = .0808$ (Table II) or .0802 (applet).
b. Now the standard deviation will be $570/\sqrt{2} = 403.05$ grams. The z-score is $(2500 - 3300)/403.05 = -1.99$, so $\Pr(\bar{X} < 2500) = \Pr(Z < -1.99) = .0233$ (Table II) or .0236 (applet). c. This probability is less than the probability in part a. This makes sense because you are looking at an average; it is harder for any pair of babies to have an average birth weight less than 2500 grams than it is for a single baby to weigh less than this amount.
d. Now the standard deviation will be $570/\sqrt{4} = 285$ grams. The z-score is $(2500 - 3300)/285 = -2.81$, so $\Pr(\bar{X} < 2500) = \Pr(Z < -2.81) = .0025$ (Table II) or .0025 (applet). This probability is much less than the probability in part a. This makes sense because you are looking at an average of four babies; it is harder for four babies to have an average birth weight less than 2500 grams than it is for a single baby to weigh less than this amount.
e. $\Pr(3000 < \text{weight} < 3600) = \Pr\left(\frac{3000 - 3300}{570} < Z < \frac{3600 - 3300}{570}\right) = \Pr(-.53 < Z < .53) = .4038$ (Table II) or .4013 (applet)
f. Answers will vary by student expectation.
g. Now the standard deviation will be $570/\sqrt{20} = 127.46$ grams. You calculate $\Pr(3000 < \bar{X} < 3600) = \Pr\left(\frac{3000 - 3300}{127.46} < Z < \frac{3600 - 3300}{127.46}\right) = \Pr(-2.35 < Z < 2.35) = .9812$ (Table II) or .9814 (applet).
Activity 15-16: Volunteerism a. $\Pr(\hat{p} \geq .282) = \Pr\left(Z \geq \frac{.282 - .25}{\sqrt{(.25)(.75)/60000}}\right) = \Pr(Z \geq 18.1) \approx 0$
b. $\Pr(\hat{p} \geq .282) = \Pr\left(Z \geq \frac{.282 - .25}{\sqrt{(.25)(.75)/500}}\right) = \Pr(Z \geq 1.65) = .0495$
c. The first scenario (with a sample of 60,000) provides stronger evidence against the claim that 25% of the population served as volunteers. If 25% of the population had indeed served as volunteers, you would essentially never expect to see a sample result as extreme as this with a sample of size 60,000.
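The contrast between the two scenarios comes entirely from the sample size in the denominator of the z-score. The sketch below uses the sample sizes that appear in the calculations for parts a and b (60,000 and 500); the function name `z_for_proportion` is hypothetical.

```python
from math import sqrt

def z_for_proportion(p_hat, pi_0, n):
    """z-score of a sample proportion when the CLT applies with population proportion pi_0."""
    return (p_hat - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)

# Same sample proportion (.282) and claimed value (.25), two sample sizes.
for n in (60000, 500):
    print(n, round(z_for_proportion(0.282, 0.25, n), 2))
```

The same 3.2-percentage-point gap is about 18 standard deviations from the claim with the large sample but under 2 with the small one.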
Activity 15-17: Tip Percentages
a. The CLT says the sampling distribution will be approximately $N(15, 4/\sqrt{50} = .566)$.
$\Pr(\bar{X} \geq 16.4) = \Pr\left(Z \geq \frac{16.4 - 15}{.566}\right) = \Pr(Z \geq 2.47) = .0068$ (Table II) or .0067 (applet)
b. Yes, this provides strong evidence that the mean tip percentage is actually greater than 15%. If it were 15% or less, the chance that you would find a random sample of 50 tables with an average tip percentage of at least 16.4% is less than 1%, so it is extremely unlikely.
c. You calculate $\Pr(\bar{X} \leq 14.4) = \Pr\left(Z \leq \frac{14.4 - 15}{.566}\right) = \Pr(Z \leq -1.06) = .1446$.
This does not provide strong evidence that the population mean tip percentage is less than 15%. If the population mean percentage is 15% (or more), you would expect to see a random sample of 50 tables with an average percentage tip of 14.4% or less almost 15% of the time, which is not rare.
Activity 15-18: Body Temperatures The CLT would still apply with a sample of size 40 because the sample size is still large ($\geq 30$). The standard deviation of the sampling distribution of the sample mean would increase because of the smaller sample size. This would decrease the probability that the sample mean body temperature would fall between 98.5 and 98.7 (or between 98.2 and 98.4 if the population mean were 98.3), and the probability that a random sample of size 130 results in a sample mean body temperature within 0.1 degrees of the actual population mean $\mu$. This makes sense because the increased standard deviation means the average body temperatures are more spread out, that is, less concentrated around the population mean $\mu$.
Activity 15-19: Body Temperatures a. The following graph displays the distribution of body temperatures:
[Figure: histogram; vertical axis Frequency, from 0 to 25; horizontal axis Body Temperatures (in °F), from 96.75 to 100.50]
These body temperatures are fairly normally distributed with a couple of high outliers greater than 100°F. b. The sample mean is 98.249°F and the standard deviation is 0.733°F.
c. The CLT says the sampling distribution would be $N(98.6, 0.7/\sqrt{130} = .061394)$, so $\Pr(\bar{X} \leq 98.249) = \Pr\left(Z \leq \frac{98.249 - 98.6}{.061394}\right) = \Pr(Z \leq -5.72) \approx 0.00$.
d. Yes, the probability found in part c is low enough to provide compelling evidence that the population mean body temperature is not 98.6 degrees. If it were, you would never (probability zero) find a sample of 130 healthy adults with an average body temperature as low as 98.249°F. Because you did find such a sample, you do not believe the population mean temperature is as high as 98.6°F.
Activity 15-20: IQ Scores a. $\Pr(\text{score} > 110) = \Pr\left(Z > \frac{110 - 105}{12}\right) = \Pr(Z > .42) = .3372$ (Table II) or .3385 (applet)
b. $SD = 12/\sqrt{10} = 3.795$
$\Pr(\bar{X} > 110) = \Pr\left(Z > \frac{110 - 105}{3.795}\right) = \Pr(Z > 1.32) = .0934$ (Table II) or .0935 (applet)
c. $SD = 12/\sqrt{40} = 1.897$
$\Pr(\bar{X} > 110) = \Pr\left(Z > \frac{110 - 105}{1.897}\right) = \Pr(Z > 2.63) = .0043$ (Table II) or .0042 (applet)
d. Yes, the calculation in part c would be valid even if the distribution of IQs in the population were skewed, because the sample size is large (greater than 30).
Assessment Sample Quiz 15A
•••
Suppose 80% of the incoming email messages for a college’s computer system are spam. 1. Use the CLT to approximate the probability that in a random sample of 200 incoming email messages at this college, the sample proportion of these messages that are spam would exceed .75.
places while the researchers were watching. Technically, this is not a random sample, and so you should be cautious about generalizing the results of the confidence intervals to a larger population.
Homework Activities Activity 16-7: Generation M
a. TV margin-of-error: $(1.96)\sqrt{\frac{(.68)(.32)}{2032}} = .0203$
CD/tape player margin-of-error: $(1.96)\sqrt{\frac{(.86)(.14)}{2032}} = .0151$
Video game player margin-of-error: $(1.96)\sqrt{\frac{(.49)(.51)}{2032}} = .0217$
Computer margin-of-error: $(1.96)\sqrt{\frac{(.31)(.69)}{2032}} = .0201$
b. Yes, the margin-of-error also depends on the confidence level and on the sample proportion. You can tell this from the above results because all four devices have the same sample size and confidence level, but the sample proportions differ, as do the margins-of-error. The more similar the sample proportions (similar distance from .5), the more similar the margins-of-error, all other factors being equal. c. The video game player produces the greatest margin-of-error and the CD/tape player produces the smallest. d. Answers will vary by student conjecture. The sample proportion that produces the greatest margin-of-error is $\hat{p} = .5$.
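The four margins-of-error in part a, and the claim in part d that $\hat{p} = .5$ gives the largest one, can be checked with a few lines of Python. This is a minimal sketch; the function name `margin_of_error` and the dictionary of device proportions are illustrative names, with the proportions taken from part a.

```python
from math import sqrt

def margin_of_error(p_hat, n, z_star=1.96):
    """Half-width of the confidence interval p_hat +/- z* sqrt(p_hat(1 - p_hat)/n)."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

# Sample proportions from part a; every device has n = 2032.
devices = {"TV": 0.68, "CD/tape player": 0.86, "Video game player": 0.49, "Computer": 0.31}
for name, p_hat in devices.items():
    print(name, round(margin_of_error(p_hat, 2032), 4))

# The product p_hat * (1 - p_hat) peaks at p_hat = .5, so the margin does too.
largest = max((margin_of_error(p / 100, 2032), p / 100) for p in range(1, 100))
print(largest)
```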
Activity 16-8: Cursive Writing a. The number of essays needed for a 99% CI is found by solving $.01 = 2.576\sqrt{\frac{(.15)(.85)}{n}}$ for $n$: $n = (.15)(.85)\left(\frac{2.576}{.01}\right)^2 = 8460.614$, so $n = 8461$.
b. You could use a lower confidence level (95% or 90% confidence, for example), or you could use a wider margin-of-error, say .02. Either of these choices would allow you to select a smaller (random) sample.
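Solving the margin-of-error formula for $n$ gives $n = \hat{p}(1 - \hat{p})(z^*/m)^2$, rounded up to the next whole number. The sketch below is one way to package that calculation; the function name `sample_size` is hypothetical, and the inputs are the values used in Activities 16-8 and 16-10.

```python
from math import ceil

def sample_size(p_guess, margin, z_star):
    """Smallest n whose margin-of-error is at most `margin`, given a guess for the proportion."""
    return ceil(p_guess * (1 - p_guess) * (z_star / margin) ** 2)

print(sample_size(0.15, 0.01, 2.576))  # Activity 16-8, part a
print(sample_size(0.52, 0.02, 1.96))   # Activity 16-10, part a
print(sample_size(0.52, 0.02, 2.576))  # Activity 16-10, part c
print(sample_size(0.52, 0.01, 2.576))  # Activity 16-10, part e
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the margin-of-error slightly larger than requested.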
Activity 16-9: Penny Activities Results will vary. The following are meant to be representative results: a. A 95% confidence interval is $26/50 \pm (1.96)\sqrt{.52(.48)/50} = .52 \pm (1.96)(.07065) = .52 \pm .1385 = (.382, .658)$. You are 95% confident that the probability a flipped penny lands heads is between .382 and .658. b. You calculate $27/50 \pm (1.96)(.07048) = .54 \pm .1381 = (.402, .678)$. You are 95% confident that the probability a spun penny lands heads is between .402 and .678. c. You calculate $32/50 \pm (1.96)(.0679) = .64 \pm .1330 = (.507, .773)$. You are 95% confident that the probability a tilted penny lands heads is between .507 and .773.
Topic 16: Confidence Intervals: Proportions
d. Based on these results, flipping or spinning a penny could plausibly be 50/50 for landing heads or tails because .50 is contained in both of these confidence intervals, but it is not contained in the confidence interval for a tilted penny. e. A tilted penny appears to result in the highest probability of landing heads.
Activity 16-10: Penny Activities Results will vary. The following are representative results. a. Setting the margin-of-error to .02 and solving for $n$: $.02 = 1.96\sqrt{\frac{(.52)(.48)}{n}}$, so $n = (.52)(.48)\left(\frac{1.96}{.02}\right)^2 = 2397.158$; $n = 2{,}398$
b. Answers will vary by student expectation, but the students should expect the necessary sample size to increase.
c. $.02 = 2.576\sqrt{\frac{(.52)(.48)}{n}}$, so $n = (.52)(.48)\left(\frac{2.576}{.02}\right)^2 = 4140.724$; $n = 4{,}141$
Yes, the necessary sample size increased (to be more confident using the same margin-of-error, you need a larger sample). d. Answers will vary by student expectation, but students should expect the necessary sample size to increase.
e. To determine sample size, you calculate $.01 = 2.576\sqrt{\frac{(.52)(.48)}{n}}$, so $n = (.52)(.48)\left(\frac{2.576}{.01}\right)^2 = 16562.9$; $n = 16{,}563$
Yes, the necessary sample size increased. To be more precise with the same confidence level, you need a larger sample size.
Activity 16-11: Penny Activities Results will vary. The following are representative results. a. Setting the margin-of-error to .02 and solving for $n$: $.02 = 1.96\sqrt{\frac{(.54)(.46)}{n}}$, so $n = (.54)(.46)\left(\frac{1.96}{.02}\right)^2 = 2385.634$; $n = 2{,}386$
b. Setting the margin-of-error to .02 and solving for $n$: $.02 = 1.96\sqrt{\frac{(.64)(.36)}{n}}$, so $n = (.64)(.36)\left(\frac{1.96}{.02}\right)^2 = 2212.762$; $n = 2{,}213$
c. The necessary sample size is the largest with the penny flipping. In this particular result, the sample proportion of heads is closest to 50% with the penny flipping.
Activity 16-12: Credit Card Usage
a. Margin-of-error: $(1.96)\sqrt{\frac{(.76)(.24)}{1413}} = .0222$
b. A 95% confidence interval is $.76 \pm .0222 = (.738, .782)$. You are 95% confident that between 74% and 78% of all undergraduate students between the ages of 18 and 24 hold a credit card.
c. The sample was a random sample, and the sample size was large ($1413 \times .76 = 1073.88 \geq 10$ and $1413 \times .24 = 339.12 \geq 10$), so the technical conditions required for the validity of this confidence interval procedure were satisfied in this case.
Activity 16-13: Responding to Katrina
a. Margin-of-error: $(1.96)\sqrt{\frac{(.12)(.88)}{848}} = .02187$
A 95% confidence interval is $.12 \pm .0219 = (.098, .142)$. You are 95% confident that between 9.8% and 14.2% of all white adults would have answered yes if asked.
b. Margin-of-error: $(1.96)\sqrt{\frac{(.60)(.40)}{262}} = .0593$
A 95% confidence interval is $.6 \pm .0593 = (.541, .659)$. You are 95% confident that between 54.1% and 65.9% of all black adults would have answered yes if asked.
c. The interval for the black adults is more than twice as wide as the interval for the non-Hispanic white adults, but that interval also indicates that the percentage of blacks who believe that race was a factor in the government’s slow response is much higher (somewhere between 54% and 66%) than is the percentage of whites who believe this (only 9.8% to 14.2%). d. The black adults have the greater margin-of-error. This is because their group was smaller (the sample size was only about 1/4 the size of the non-Hispanic white adults). e. You are not told how the sample was selected; in particular, was any attempt made to randomly select these individuals? This is a critical technical condition that you must check in order to determine whether these intervals are valid.
Activity 16-14: West Wing Debate a. A 95% confidence interval for the proportion of all viewers who favored Santos is $.54 \pm (1.96)\sqrt{.54(.46)/1208} = .54 \pm (1.96)(.01434) = .54 \pm .0281 = (.512, .568)$. b. Yes, because all the values in this interval are greater than .50, this suggests that more than 1/2 of the population favored Santos.
Activity 16-15: Magazine Advertisements a. The observational units are the pages of Sports Illustrated and Soap Opera Digest magazines.
b. Sports Illustrated: $54/116 \pm (1.96)\sqrt{.466(.534)/116} = .466 \pm (1.96)(.0463) = .466 \pm .0908 = (.375, .556)$. Soap Opera Digest: $28/130 \pm (1.96)(.0361) = .215 \pm .0707 = (.145, .286)$. c. You are 95% confident the proportion of all Sports Illustrated pages that contain ads is between 37.5% and 55.6%. Similarly, you are 95% confident the proportion of all Soap Opera Digest pages that contain ads is only between 14.5% and 28.6%.
d. If you were to repeat this procedure many, many times, always using random samples of 116 pages of Sports Illustrated (and 130 pages of Soap Opera Digest), 95% of the time you would create intervals that would contain the population proportion of the magazine’s pages that contain ads. e. Yes, each interval contains the sample proportion of pages with ads. f. This question was silly because this value ($\hat{p}$) is the center of the interval. You cannot create the interval without it. g. Answers will vary by choice of magazine. Here is a representative answer: Newsweek, April 30, 2007: Seventeen of 65 pages contained ads. A 95% CI for the population proportion is $17/65 \pm (1.96)\sqrt{.2615(.7385)/65} = .262 \pm .107 = (.155, .369)$. You are 95% confident that between 15.5% and 36.9% of all Newsweek pages contain ads.
Activity 16-16: Phone Book Gender a. $77 + 14 + 2 \times 36 = 163$ first names are listed. b. $(14 + 36)/163 = 50/163 = .307$ of those names are female.
c. For a 90% CI, you calculate $.307 \pm (1.645)\sqrt{.307(.693)/163} = .307 \pm (1.645)(.03612) = .307 \pm .0594 = (.247, .366)$. d. Yes, you should have concerns about this sampling method as you did not select a simple random sample. As indicated in Activity 4-17, this sampling method is likely to underestimate the proportion of women living in San Luis Obispo County because the phone listings of married women are often only under their husband’s name and because single women sometimes choose not to list their phone number in order to avoid harassing phone calls. e. Answers will vary. Some possibilities include: sample the obituaries in the San Luis Obispo county paper for a week; obtain a list of registered voters in the county, and take a random sample from this list.
Activity 16-17: Random Babies Answers will vary. The following is a representative response: a. The proportion of simulated repetitions that resulted in no mother getting the correct baby was $359/1000 = .359$. (See Activity 11.1, part k.)
b. For a 95% CI, you calculate $.359 \pm (1.96)\sqrt{.359(1 - .359)/1000} = .359 \pm (1.96)(.01517) = .359 \pm .0297 = (.329, .389)$. c. You are 95% confident the long-term proportion of times that no mother would get the correct baby is between .33 and .39. d. .375 e. Yes, the 95% confidence interval succeeds in capturing the population parameter.
f. If 1000 different statistics classes carried out this simulation, you would expect roughly 95% or 950 of their intervals to succeed in capturing .375, whereas you would expect about 50 of these intervals not to contain .375. g. For an 80% CI, you calculate $.359 \pm (1.282)(.01517) = .359 \pm .0194 = (.34, .38)$. Yes, this interval just succeeds in capturing .375. If 1000 different statistics classes carried out this simulation, you would expect roughly 80% or 800 of their intervals to succeed in capturing .375, whereas you would expect about 200 of these intervals not to contain .375.
Activity 16-18: Charitable Contributions a. The parameter ($\pi$) is the proportion of all adult Americans who gave a financial contribution to charity in the previous 12 months.
b. For 250 households, $.789 \pm (2.576)\sqrt{.789(1 - .789)/250} = .789 \pm (2.576)(.0258) = .789 \pm .0665 = (.723, .855)$.
c. For 500 households, $.789 \pm (2.576)\sqrt{.789(1 - .789)/500} = .789 \pm (2.576)(.0182) = .789 \pm .047 = (.742, .836)$.
d. For 1000 households, $.789 \pm (2.576)(.0129) = .789 \pm .033 = (.756, .822)$. e. For 2000 households, $.789 \pm (2.576)(.0091) = .789 \pm .0235 = (.765, .813)$. f. As the sample sizes increase, the margins-of-error decrease. Specifically, if the sample size is increased (multiplied) by a factor of 4, the margin-of-error will be cut in half ($2 = \sqrt{4}$). g. No; doubling the sample size will not cut the margin-of-error in half. The sample size needs to be increased by a factor of $4 = 2^2$ in order to cut the margin-of-error in half. h. The confidence interval is $.789 \pm (2.576)(.0112) = .789 \pm .0288 = (.760, .818)$. You are 99% confident that between 76.0% and 81.8% of all adult Americans gave a financial contribution to charity in the previous 12 months.
Activity 16-19: Marriage Ages a. In 94 of these marriages, you can determine which partner was younger. b. In 67 of these 94 marriages, the bride is younger than the groom (71.3%).
c. A 90% confidence interval is $67/94 \pm (1.645)\sqrt{.713(1 - .713)/94} = .713 \pm (1.645)(.0467) = .713 \pm .0768 = (.636, .790)$. d. A 95% confidence interval is $.713 \pm (1.96)(.0467) = .713 \pm .0915 = (.621, .804)$. e. A 99% confidence interval is $.713 \pm (2.576)(.0467) = .713 \pm .120 = (.592, .833)$. f. None of these intervals contains the value .5. g. These data suggest the bride is younger than the groom in more than half of all the marriages in this country because all the values in all these intervals are greater than .5 (assuming this sample is representative).
Activity 16-20: Critical Values a. 85% confidence: $z^* = 1.44$ b. 97.5% confidence: $z^* = 2.24$ c. 51.6% confidence: $z^* = 0.70$ d. The largest of these values is the one associated with 97.5% confidence. The smallest $z^*$ value is the one associated with 51.6% confidence. This result makes sense because the confidence level indicates the proportion of z-scores in the center of the z-curve, between $-z^*$ and $z^*$. The larger this confidence level, the larger $z^*$ will need to be in order to include more area in the center of the curve.
[Figure: standard normal curve with central area .8500 between $-z^* = -1.44$ and $z^* = 1.44$ and area .075 in each tail; horizontal axis z-values]
To be more confident of capturing the population proportion, you need to include a larger distance on either side of the sample proportion.
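The critical values in parts a through c can also be obtained from the inverse of the standard normal cumulative distribution rather than from Table II. The sketch below assumes Python's standard `statistics.NormalDist`; the function name `z_star` is hypothetical.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value with `confidence` of the standard normal area between -z* and z*."""
    # Half of the leftover area sits in each tail, so the upper cutoff
    # has cumulative probability (1 + confidence) / 2.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.85, 0.975, 0.516):
    print(level, round(z_star(level), 2))
```

For 85% confidence, for example, each tail holds .075, which is the area shown in the sketch of the z-curve.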
Activity 16-21: Wrongful Conclusions a. Andrew’s interval (.346, .474) must be incorrect because it is not centered at the sample proportion $\hat{p} = .4$. (It is centered at .41.) b. Andrew’s sample proportion: $\hat{p} = (.682 + .558)/2 = .62$. Becky’s sample proportion: $\hat{p} = (.611 + .779)/2 = .695$. c. Andrew’s margin-of-error: $(.682 - .558)/2 = .062$. Becky’s margin-of-error: $(.779 - .611)/2 = .084$. d. In order to solve for the sample size, you would need to know the $z^*$ value used in the margin-of-error formula. (You found the margin-of-error in part c by recognizing it as the half-width of the intervals.) Because you do not know the confidence level used, you do not know the critical value ($z^*$) used, so you cannot determine which sample size was used by which person.
e. Andrew’s critical value: $.062 = z^*\sqrt{\frac{(.62)(.38)}{100}}$; $z^* = \frac{.062 \times 10}{\sqrt{(.62)(.38)}} = 1.28$
Then using Table II, the Standard Normal Probabilities Table, you find the confidence level used by Andrew is 80%.
Becky’s critical value: $.084 = z^*\sqrt{\frac{(.695)(.305)}{200}}$; $z^* = \frac{.084\sqrt{200}}{\sqrt{(.695)(.305)}} = 2.58$
Then using Table II, the Standard Normal Probabilities Table, you find the confidence level used by Becky is 99%. f. Andrew: $\hat{p} = (.635 + .533)/2 = .584$; margin-of-error $= (.635 - .533)/2 = .051$. Becky: $\hat{p} = (.602 + .55)/2 = .576$; margin-of-error $= (.602 - .550)/2 = .026$. g. Because both researchers are using the same confidence level, you know that the smaller margin-of-error is associated with the larger sample size, so Becky used a sample of size 1000, whereas Andrew used a sample of size 250. h. Both researchers selected random samples and used a 90% confidence level, so both procedures had a 90% chance of capturing the population proportion.
Activity 16-22: Candy Colors Answers will vary, but here is one representative set:
a. The proportion of intervals that succeed in capturing $\pi$ (.15) is $814/1000 = 81.4\%$. This is not very close to 95%.
b. It is not surprising that the success rate was so far below 95% in this case because the technical conditions about our sample size were not met. In particular, $n\pi = 10(.15) = 1.5 < 10$, so you do not expect the 95% confidence statement to be accurate.
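The shortfall can also be quantified without simulation. The sketch below is an exact binomial calculation, not the applet simulation the activity uses: it adds up the probabilities of every possible sample count whose interval would capture $\pi$. It assumes $n = 10$ and $\pi = .15$, as the sample size check in part b suggests; the function name `coverage_rate` is hypothetical.

```python
from math import comb, sqrt

def coverage_rate(n, pi, z_star=1.96):
    """Exact chance that p_hat +/- z* sqrt(p_hat(1 - p_hat)/n) captures pi."""
    total = 0.0
    for x in range(n + 1):
        p_hat = x / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= pi <= p_hat + margin:
            # Binomial probability of seeing exactly x successes in n trials.
            total += comb(n, x) * pi**x * (1 - pi) ** (n - x)
    return total

print(round(coverage_rate(10, 0.15), 3))   # well below the nominal .95
print(round(coverage_rate(100, 0.15), 3))  # closer to .95 once n * pi >= 10
```

Under these assumptions the long-run success rate is a little under 80%, in line with the representative simulation result of 81.4%.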
Activity 16-23: Penny Thoughts a. The population parameter is the proportion of all American adults who favor abolishing the penny. b. For a 95% CI, you calculate $.59 \pm (1.96)\sqrt{.59(.41)/2316} = .59 \pm (1.96)(.01022) = .59 \pm .02 = (.57, .61)$. You are 95% confident that between 57% and 61% of all American adults favor abolishing the penny. c. The sample size is certainly large enough ($2316 \times .59 = 1366 \geq 10$ and $2316 \times .41 = 950 \geq 10$), but you have no indication of how this sample of adults was selected. If you assume that the Harris Poll selected randomly, then the technical conditions are satisfied. Otherwise, you should interpret this interval with caution.
Assessment Sample Quiz 16A
•••
Students enrolled in an introductory statistics course at a university were asked to take a survey that indicated whether the student has a visual or verbal learning style. Of the 39 students who took the survey, 25 were judged to have a visual learning style and 14 were considered verbal learners. 1. Determine a 90% confidence interval for the population proportion who are visual learners at this university. 2. Write a sentence interpreting what this interval says. 3. How would a 99% confidence interval compare to this one in terms of its midpoint and half-width? (Do not bother to determine this interval.) 4. Check whether the technical condition concerning sample size is satisfied here. 5. Explain why you might feel wary about applying this confidence interval to the population of all students at this university.
Solution to Sample Quiz 16A
•••
1. A 90% confidence interval for the population proportion who are visual learners at this university is $.641 \pm 1.645\sqrt{\frac{(.641)(.359)}{39}} = .641 \pm (1.645)(.07681) = .641 \pm .1264 = (.515, .767)$
2. You are 90% confident the population proportion of visual learners at this university is between 51.5% and 76.7%.
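The interval in question 1 follows the same recipe as the other confidence intervals in this topic: sample proportion, plus or minus a critical value times the standard error. A minimal Python sketch is shown below; the function name `proportion_ci` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Interval p_hat +/- z* sqrt(p_hat(1 - p_hat)/n) for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(25, 39, confidence=0.90)  # 25 visual learners out of 39
print(round(low, 3), round(high, 3))
```

Calling the function with `confidence=0.99` keeps the same midpoint and widens the interval, which is the comparison question 3 asks about.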
Topic 17: Tests of Significance: Proportions
i. The test statistic is $z = \frac{.495 - .5}{\sqrt{(.5)(.5)/95}} = \frac{.495 - .5}{.0513} = -0.10$
The p-value is $2(.4602) = .9204$.
[Figure: normal curve; horizontal axis Sample Proportion of Games with Big Bang, from .3 to .7]
j. This p-value is not small at all, suggesting that the sample data are quite consistent with Marilyn’s hypothesis that half of all games contain a big bang. The sample data provide no reason to doubt Marilyn’s hypothesized value for $\pi$. k. A 95% confidence interval for $\pi$ (the population proportion of games that contain a big bang) is given by $\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ with $z^* = 1.96$, which is $.495 \pm 1.96\sqrt{\frac{(.495)(.505)}{95}}$, which is $.495 \pm .101$, which is the interval from .394 to .596. Therefore, you are 95% confident that between 39.4% and 59.6% of all major-league baseball games contain a big bang. The grandfather’s claim (75%) is not within this interval or even close to it, which explains why it was so soundly rejected. Marilyn’s conjecture (50%) is well within this interval of plausible values, which is consistent with it not being rejected.
•••
Homework Activities Activity 17-6: Interpreting p-values a. You would reject the null hypothesis at the .10 level because the p-value is less than .05 and therefore less than .10. b. You do not have enough information to know whether you would reject the null hypothesis at the .01 level. Although you know the p-value is less than .05, you do not know whether it is less than .01 (e.g., it could equal .03).
c. You know that you would fail to reject the null hypothesis at the .03 level because you know the p-value is greater than .05 and therefore greater than .03. d. You do not have enough information to know whether you would reject the null hypothesis at the .07 level. Although you know the p-value is greater than .05, you do not know whether it is greater than .07 (e.g., it could equal .06).
Activity 17-7: Interpreting p-values a. Yes, it is possible for a p-value to be greater than .5. A p-value is the probability of obtaining a sample result as or more extreme than (as defined by the alternative hypothesis) the given result by random chance alone if the null hypothesis is true. If the sample result is in the opposite direction from the hypothesized value than conjectured by $H_a$, then the one-sided p-value will be greater than .5. For example, if $H_0: \pi = .25$ vs. $H_a: \pi > .25$, but you observe $\hat{p} = .2$, then p-value $= \Pr(\hat{p} \geq .2 \text{ when } \pi = .25) > .5$. Also, if you have a two-sided alternative and the one-tail probability already exceeds .25, then the two-sided p-value $> .5$. b. No, it is not possible for a p-value to be greater than 1 because a p-value is a probability and is, therefore, a value between 0 and 1 inclusive.
Activity 17-8: Wrongful Conclusions b. A proportion cannot be greater than 1.0. c. The values specified by the two hypotheses cannot overlap. In this case, the alternative hypothesis should be Ha: .5. d. The hypotheses must cover all the possibilities—you cannot use two different 0 values. The hypotheses should be H0: .5 vs. Ha: .5, or H0: .6 vs. Ha: .6. e. The hypotheses are reversed. In order to use the CLT, you have to assume that is some value, so this must be the null hypothesis: H0: .5 vs. Ha: .5.
Activity 17-9: Penny Activities You should not be convinced that this is not a fair coin until you know how many times the coin has been flipped in order to obtain this sample value of 75% heads. Suppose the coin has been flipped only four times! This would not be at all convincing. But obtaining heads 75% of the time in 1000 flips would be very convincing.
Activity 17-10: Flat Tires Example results will vary. Here is one representative set of answers: Define parameter of interest: Let $\pi$ represent the proportion of all the students at your school who would choose the right-front tire when asked the “flat tire” question. The null hypothesis is that one-fourth of all the students at your school would choose the right-front tire when asked the “flat tire” question. In symbols, the null hypothesis is $H_0: \pi = .25$.
(Activity 17-8, part a: The null hypothesis should reference the parameter ($\pi$), not the statistic ($\hat{p}$).)
The alternative hypothesis is that more than one-fourth of all the students at your school would choose the right-front tire. In symbols, the alternative hypothesis is $H_a: \pi > .25$. The sample proportion is $41/90 = .456$. Check technical conditions: To see whether the CLT applies here, first check $90(.25) = 22.5 \geq 10$, $90(.75) = 67.5 \geq 10$, so the sample size condition is met. Your class is not a random sample of students at your school, but it is likely to be a representative sample for this issue, so you will proceed as if it were a random sample. Test statistic: $z = \frac{.456 - .25}{\sqrt{(.25)(.75)/90}} = 4.50$
Using Table II, p-value $= \Pr(Z \geq 4.50) < .0002$. Test decision: Because the p-value is small (p-value $< .0002$), reject $H_0$. Conclusion in context: You have very strong statistical evidence that more than one-fourth of the students at your school would select the right-front tire when asked the “flat tire” question (as long as this sample is representative of students at your school on this question).
Activity 17-11: Credit Card Usage Define parameter of interest: Let $\pi$ represent the proportion of all U.S. undergraduate students who have a credit card. The null hypothesis is that the proportion of all U.S. undergraduate students who have a credit card is .75. In symbols, the null hypothesis is $H_0: \pi = .75$. The alternative hypothesis is that more than 75% of all U.S. undergraduate students have a credit card. In symbols, the alternative hypothesis is $H_a: \pi > .75$. Check technical conditions: The CLT applies here because $1413(.75) = 1059.75 \geq 10$ and $1413(.25) = 353.25 \geq 10$, and you have a random sample from the population of interest. Test statistic: $z = \frac{.76 - .75}{\sqrt{(.75)(.25)/1413}} = 0.87$
Using Table II, p-value $= \Pr(Z \geq .87) = .1922$. Test decision: Because the p-value is not small ($.1922 > .05$), do not reject $H_0$. Conclusion in context: You do not have sufficient statistical evidence to conclude that more than 75% of U.S. undergraduate students have a credit card.
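The test statistic and p-value steps used throughout these activities can be collected into one small function. This is a sketch under stated assumptions: the function name `one_proportion_z_test` and the `alternative` labels are hypothetical, the normal probabilities come from Python's standard `statistics.NormalDist` rather than Table II, and the technical conditions still have to be checked separately.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(p_hat, pi_0, n, alternative="greater"):
    """Return (z, p_value) for H0: pi = pi_0 using the normal approximation."""
    # The standard error uses the hypothesized value pi_0, not p_hat.
    z = (p_hat - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)
    lower_tail = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - lower_tail
    elif alternative == "less":
        p_value = lower_tail
    elif alternative == "two-sided":
        p_value = 2 * min(lower_tail, 1 - lower_tail)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Activity 17-11: H0: pi = .75 vs. Ha: pi > .75, with p_hat = .76 and n = 1413.
z, p_value = one_proportion_z_test(0.76, 0.75, 1413, alternative="greater")
print(round(z, 2), round(p_value, 4))
```

Because the software does not round z to two decimal places before finding the tail area, the p-value can differ from the Table II answer in the third or fourth decimal place.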
Activity 17-12: Kissing Couples a. The null hypothesis is that half the population of all kissing couples lean their heads to the right. In symbols, the null hypothesis is $H_0: \pi = .5$. The alternative hypothesis is that more than half of the population of all kissing couples lean their heads to the right. In symbols, the alternative hypothesis is $H_a: \pi > .5$. Technical conditions: The CLT applies here because $124(.5) = 124(1 - .5) = 62 \geq 10$, and you are considering these data to be from a random sample. The sample proportion is $80/124 = .645$.
The test statistic is $z = \frac{.645 - .5}{\sqrt{(.5)(.5)/124}} = 3.23$.
Using Table II, p-value $= \Pr(Z \geq 3.23) = .0006$. Because the p-value is small ($.0006 < .05$), reject $H_0$. You have strong statistical evidence that more than half of the population of all kissing couples lean their heads to the right.
b. The null hypothesis is that three-fourths of the population of all kissing couples lean their heads to the right. In symbols, the null hypothesis is $H_0: \pi = .75$. The alternative hypothesis is that less than three-fourths of the population of all kissing couples lean their heads to the right. In symbols, the alternative hypothesis is $H_a: \pi < .75$. The test statistic is $z = \frac{.645 - .75}{\sqrt{(.75)(.25)/124}} = -2.70$.
Using Table II, p-value $= \Pr(Z \leq -2.70) = .0035$.
Because the p-value is small ($.0035 < .05$), reject $H_0$. You have strong statistical evidence that less than three-fourths of the population of all kissing couples lean their heads to the right.
c. The null hypothesis is that two-thirds of the population of all kissing couples lean their heads to the right. In symbols, the null hypothesis is $H_0: \pi = .667$. The alternative hypothesis is that the population proportion of all kissing couples who lean their heads to the right is not two-thirds. In symbols, the alternative hypothesis is $H_a: \pi \neq .667$. The test statistic is $z = \frac{.645 - .667}{\sqrt{(.667)(.333)/124}} = -0.516$.
Using Table II, p-value $= 2 \times \Pr(Z \leq -0.52) = 2 \times .3015 = .603$. The following graph displays these results:
[Figure: normal curve with mean .667 and SD .0423; horizontal axis Sample Proportion Leaning Right, with .645, .667, and .689 marked and area .3015 shaded in each tail]
350
Topic 17: Tests of Significance: Proportions
Because the p-value is not small, do not reject H0. You do not have statistical evidence that the population proportion of kissing couples who lean their heads to the right differs from two-thirds.
Activity 17-13: Political Viewpoints
a. In the sample, 497/1309 = .38 consider themselves to be political moderates. This result is clearly greater than 1/3.
b. The null hypothesis is that one-third of all American adults consider themselves to be political moderates. In symbols, H0: π = .333. The alternative hypothesis is that more than one-third of all American adults consider themselves to be political moderates. In symbols, Ha: π > .333. Technical conditions: The CLT applies here because 1309(.333) = 435.9 ≥ 10, 1309(.667) = 873.1 ≥ 10, and you have a random sample from the population of interest. The test statistic is $z = \frac{.38 - .333}{\sqrt{(.333)(.667)/1309}} = 3.58$. Using Table II, p-value = Pr(Z > 3.58) = .0000.
c. Answers will vary by student expectation, but students should expect the p-value to increase because the observed sample proportion (124/327 = .38) is the same but the sample size is smaller. With a smaller sample, the observed sample result is less surprising, which corresponds to a larger p-value.
d. To determine the p-value, you calculate $z = \frac{.38 - .333}{\sqrt{(.333)(.667)/327}} = 1.77$. Using Table II, p-value = Pr(Z > 1.77) = .0384. The p-value did indeed increase.
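The effect of sample size in parts c and d can be illustrated numerically. The sketch below holds the sample proportion at the rounded value .38 and changes only n; because it uses rounded inputs (.38 and .333), its z values will not match the solution's 3.58 and 1.77 exactly. The helper names are illustrative, not from the text.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability Pr(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def upper_tail_test(p_hat, pi0, n):
    """z statistic and one-sided (greater-than) p-value for a proportion."""
    z = (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
    return z, 1 - normal_cdf(z)

# Same sample proportion, two different sample sizes
z_large, p_large = upper_tail_test(0.38, 0.333, 1309)
z_small, p_small = upper_tail_test(0.38, 0.333, 327)

print("n = 1309:", round(z_large, 2), round(p_large, 4))
print("n =  327:", round(z_small, 2), round(p_small, 4))
```

Because the standard deviation is proportional to 1/√n, cutting the sample size to about one-fourth roughly halves the z statistic, which is why the p-value grows.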
Activity 17-14: Calling Heads or Tails
Answers will vary. Here is one representative set.
a. 16/20 = .80; 80% of the responses were heads.
b. This proportion is a statistic because it is collected from a sample, not from the entire population.
c. The parameter (π) is the proportion of all students who would answer “heads” when asked to predict the result of a coin flip.
d. The null hypothesis is that half of all students would answer “heads” when asked to predict the result of a coin flip. In symbols, H0: π = .5. The alternative hypothesis is that the proportion of students who would answer “heads” when asked to predict the result of a coin flip is not one-half. In symbols, Ha: π ≠ .5. Technical conditions: 20 × .5 = 20 × (1 − .5) = 10 ≥ 10. You do not have a random sample, but it might be reasonable to consider the sample to be representative of how students would respond to the “predict a coin flip” question.
e. Here is the sketch:
[Sketch: normal sampling distribution of the sample proportion calling heads, mean = .5, SD = .1118, with an area of .00364 shaded below .2 and an area of .00364 shaded above .8.]
f. The test statistic is $z = \frac{.8 - .5}{\sqrt{(.5)(.5)/20}} = 2.68$. Using Table II, p-value = 2 × Pr(Z > 2.68) = 2 × .0037 = .0074.
g. Yes, with the small p-value, this result is statistically significant at the .10, .05, and .01 levels.
h. You have very strong statistical evidence that the proportion of students who would answer “heads” when asked to predict the result of a coin flip is not one-half.
i. If you had been working with “tails” instead of “heads,” you would have interchanged these words in your hypotheses, your sample proportion would have been 4/20 = .20, and so your test statistic would have been −2.68 (the opposite of the value you found in part f). The p-value and conclusion would have been unchanged.
Activity 17-15: Calling Heads or Tails
Answers will vary. Here is one representative set:
Define parameter of interest: Let π represent the proportion of all the students who would answer “heads” when asked to predict the result of a coin flip. The null hypothesis is that 70% of all students would answer “heads” when asked to predict the result of a coin flip. In symbols, the null hypothesis is H0: π = .7. The alternative hypothesis is that the proportion of all students who would answer “heads” when asked to predict the result of a coin flip is not 70%. In symbols, the alternative hypothesis is Ha: π ≠ .7. Here is the sketch:
[Sketch: normal sampling distribution of the sample proportion calling heads, mean = .7, SD = .10247, with an area of .1635 shaded below .6 and an area of .1635 shaded above .8.]
Check technical conditions: 20 × .5 = 20 × (1 − .5) = 10 ≥ 10. You can consider the sample to be representative of how students would respond to the “predict a coin flip” question.
Test statistic: $z = \frac{.8 - .7}{\sqrt{(.7)(.3)/20}} = 0.98$
Using Table II, p-value = 2 × Pr(Z > 0.98) = 2 × .1635 = .327.
Test decision: Because the p-value is not small, do not reject H0.
Conclusion in context: Assuming you have a representative sample, you have no statistical evidence that the population proportion of students who will answer “heads” when asked to predict the result of a coin flip differs from .70.
Activity 17-16: Flat Tires
Using Table II, for the p-value to be below .10, you find that you need a z-score of at least 1.28:
[Sketch: standard normal curve of z-values with an area of .1 shaded to the right of 1.28.]
The test statistic is $z = \frac{.3 - .25}{\sqrt{.25(.75)/n}} \geq 1.28$.
So $n \geq \left(\frac{1.28\sqrt{(.25)(.75)}}{.05}\right)^2 = 122.88$.
Thus, the smallest sample size for which a sample result of .30 answering right-front would be significant at the 10% level is 123.
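The algebra above can be verified by solving for n directly and then confirming that n = 123 is the first sample size whose z statistic reaches 1.28. This is a minimal sketch; the variable names and the helper `z_statistic` are illustrative.

```python
import math

z_star = 1.28   # critical value from Table II for a .10 upper-tail area
p_hat = 0.30    # sample proportion answering right-front
pi0 = 0.25      # hypothesized proportion

# Solve (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n) >= z_star for n
n_exact = (z_star * math.sqrt(pi0 * (1 - pi0)) / (p_hat - pi0)) ** 2
n_min = math.ceil(n_exact)

def z_statistic(n):
    """z statistic for the fixed sample proportion at sample size n."""
    return (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

print(round(n_exact, 2), n_min)
print(round(z_statistic(n_min - 1), 3), round(z_statistic(n_min), 3))
```

Rounding up is the right choice here because any n below 122.88 leaves the z statistic short of 1.28.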
Activity 17-17: Baseball “Big Bang”
a. The null hypothesis is that the proportion of all major-league baseball games that contain a big bang is .50. In symbols, the null hypothesis is H0: π = .5. The alternative hypothesis is that the proportion of all major-league baseball games that contain a big bang is not .50. In symbols, the alternative hypothesis is Ha: π ≠ .5. Technical conditions: 968 × .5 = 968 × (1 − .5) = 484 ≥ 10, so this condition is met. You must believe that the games in 1986 are representative of the overall process. The sample proportion is 419/968 or .433. Here is a sketch of the sampling distribution:
[Sketch: normal sampling distribution of the sample proportion of games with a big bang, mean = .5, SD = .016, with an area of .0000141 shaded below .433 and an area of .0000141 shaded above .567.]
The test statistic is $z = \frac{.433 - .5}{\sqrt{.5(.5)/968}} = -4.18$. Using Table II, p-value = 2 × Pr(Z < −4.18) < 2 × .0002 = .0004. Because the p-value < .02, reject the null hypothesis at the .02 level. There is very strong statistical evidence that the proportion of all major-league games that contain a big bang is not 1/2.
b. The null hypothesis is that the proportion of all major-league baseball games that contain a big bang is three-fourths. In symbols, the null hypothesis is H0: π = .75. The alternative hypothesis is that the proportion of all major-league baseball games that contain a big bang is not three-fourths. In symbols, the alternative hypothesis is Ha: π ≠ .75. Technical conditions: 968 × (.75) = 726 ≥ 10, 968 × (.25) = 242 ≥ 10, so this condition is met. You must believe that the games in 1986 are representative of the overall process. The sample proportion is 651/968 or .673. Here is a sketch of the sampling distribution:
[Sketch: normal sampling distribution of the sample proportion of games with a big bang, mean = .75, SD = .0139, with .673 and .827 marked on either side of the mean.]
The test statistic is $z = \frac{.673 - .75}{\sqrt{(.75)(.25)/968}} = -5.56$. Using Table II, p-value = 2 × Pr(Z < −5.56) < 2 × .0002 = .0004. Because the p-value < .08, reject the null hypothesis at the .08 level. There is very strong statistical evidence that the proportion of all major-league games that contain a big bang is not three-fourths.
Activity 17-18: Racquet Spinning
a. p-value = 2 × Pr(Z < −0.8) = 2 × .2119 = .4238. Based on this p-value, you would not reject the null hypothesis at the .05 level.
b. The test does not indicate that the racquet would definitely land “up” 50% of the time in the long run, as no test can do this. The test does indicate that 50% is a plausible value for the long-run proportion of times that the racquet will land “up.” But there may be many other plausible values as well.
c. The smallest significance level at which you would reject the null hypothesis is the p-value, .4238.
d. You would substitute the word “down” for “up” and use 54/100 rather than 46/100 for the sample proportion p̂. This would make your test statistic positive (.8), but it would not change your p-value or conclusion.
e. A 95% confidence interval for the long-run proportion of “up” results: $.46 \pm 1.96\sqrt{.46(1 - .46)/100} = .46 \pm .098 = (.362, .558)$
f. Yes, this interval includes the value .5.
g. The confidence interval is an interval of plausible values for π (the proportion of times a spun tennis racquet would land “up” in the long run). The interval shows that .5 is a plausible value (part f). The significance test (part a) also said that .5 was a plausible value for π because you failed to reject the null hypothesis that π equals .5.
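The agreement between the interval in part e and the test in part a can be checked side by side. The sketch below is illustrative only (the variable names are not from the text); it computes the 95% interval with the sample proportion in the standard error and the two-sided p-value with the hypothesized value .5 in the standard deviation, as the solutions do.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability Pr(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

p_hat, n, pi0 = 0.46, 100, 0.5

# 95% confidence interval (standard error uses the sample proportion)
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

# Two-sided test of H0: pi = .5 (standard deviation uses pi0)
z = (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
p_value = 2 * (1 - normal_cdf(abs(z)))

print("95% CI:", (round(lower, 3), round(upper, 3)))
print("z:", round(z, 2), "two-sided p-value:", round(p_value, 4))
print(".5 inside interval:", lower < pi0 < upper)
```

Both procedures point the same way: .5 lies inside the 95% interval, and the two-sided p-value is well above .05.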
Activity 17-19: Therapeutic Touch
a. The parameter is π, the proportion of times subjects (all therapeutic touch practitioners in such an experimental setup) can correctly identify over which hand the experimenter’s hand is held. The null hypothesis is that the proportion of times subjects can correctly identify over which hand the experimenter’s hand is held is .50. In symbols, H0: π = .5. The alternative hypothesis is that the proportion of times subjects can correctly identify over which hand the experimenter’s hand is held is greater than .50. In symbols, Ha: π > .5.
b. Note the sample size is large enough; consider these subjects as representative of the performance of all therapeutic touch practitioners. The sample proportion is 123/280 or .439. The test statistic is $z = \frac{.439 - .5}{\sqrt{(.5)(.5)/280}} = -2.03$. Using Table II, p-value = Pr(Z > −2.03) = 1 − .0212 = .9788.
c. It makes sense that the p-value is greater than .5 in this situation because the sample proportion is less than 1/2, but you conjectured the population proportion was greater than .5. The subjects were actually less successful than you would expect if they guessed each time! Such a sample does not provide evidence that subjects can distinguish between the hands more than half the time.
d. No, you never “accept” the null hypothesis. You simply have no evidence against the null hypothesis in favor of the alternative, so you continue to assume that .50 is a plausible value of π.
e. You can only conclude that these practitioners showed no evidence of being able to correctly identify more often than not over which hand the experimenter’s hand was held.
Activity 17-20: Magazine Advertisements
a. The parameter is π, the proportion of all Sports Illustrated pages that contain an advertisement.
b. H0: π = .3. This is the null hypothesis.
c. Ha: π ≠ .3. This is the alternative hypothesis. Technical conditions: 116 × .3 = 34.8 ≥ 10, 116 × .70 = 81.2 ≥ 10, so this condition is met. Consider the data as a random sample from the population. The sample proportion is p̂ = 54/116 = .466. The test statistic is $z = \frac{.466 - .3}{\sqrt{(.3)(.7)/116}} = 3.89$. Using Table II, p-value = 2 × Pr(Z > 3.89) < 2 × .0002 = .0004. With such a small p-value, reject H0 at any common significance level.
d. You have very strong statistical evidence that the proportion of pages in Sports Illustrated that contain an ad is not .30.
e. The sample size is large enough (see Technical conditions in part c), but this is not a random sample. It may not even be representative, as there may have been some special sporting event covered in the September 13th issue of the magazine that affected the types and number of advertisements, so you should be cautious in generalizing these results to all issues.
Activity 17-21: Hiring Discrimination
a. The parameter of interest is the long-run probability of an African American teacher being hired by the City of Hazelwood.
b. The null hypothesis is that the probability of African American teachers being hired by the City of Hazelwood is .154 (the same as the county). In symbols, the null hypothesis is H0: π = .154. The alternative hypothesis is that the probability of African American teachers being hired by the City of Hazelwood is less than .154. In symbols, the alternative hypothesis is Ha: π < .154. Here is a sketch of the sampling distribution:
[Sketch: normal sampling distribution of the sample proportion of African American teachers hired, mean = .154, SD = .0179, horizontal axis running from .10 to .22.]
Technical conditions: 405 × .15 = 60.75 ≥ 10, 405 × .85 = 344.25 ≥ 10, so this condition is met. This is hardly a random sample, but you can proceed with caution if you consider this sample to be representative of the overall hiring process. The sample proportion is p̂ = 15/405 = .037. The test statistic is $z = \frac{.037 - .154}{\sqrt{(.154)(.846)/405}} = -6.52$. Using Table II, p-value = Pr(Z < −6.52) < .0002. With such a small p-value, reject H0 at any significance level. Based on these data, you have overwhelming statistical evidence that the probability of African Americans being hired by the school district is less than 15.4%.
c. The null hypothesis is that the probability of African American teachers being hired by the City of Hazelwood is .057. In symbols, the null hypothesis is H0: π = .057. The alternative hypothesis is that the probability of African American teachers being hired by the City of Hazelwood is less than .057. In symbols, the alternative hypothesis is Ha: π < .057. Here is a sketch of the sampling distribution:
[Sketch: normal sampling distribution of the sample proportion of African American teachers hired, mean = .057, SD = .0115, with an area of .0418 shaded below .037.]
Technical conditions: 405 × .057 = 23.085 ≥ 10, 405 × .943 = 381.915 ≥ 10, so this condition is met. This is hardly a random sample, but you can proceed with caution if you consider this sample to be representative of the overall hiring process. The sample proportion is p̂ = 15/405 = .037. The test statistic is $z = \frac{.037 - .057}{\sqrt{(.057)(.943)/405}} = -1.73$. Using Table II, p-value = Pr(Z < −1.73) = .0418. Because the p-value = .0418 > .01, do not reject H0 at the .01 significance level. You do not have sufficiently strong statistical evidence (at the .01 level) to conclude that the proportion of African Americans hired by the Hazelwood School District is less than .057.
d. There is very strong evidence that the hiring rate of African Americans is below that of the entire county, but it may be similar to the proportion of African Americans in the county if the city of St. Louis is excluded.
Activity 17-22: Marriage Ages
Define parameter of interest: Let π represent the proportion of all marriages in Cumberland County in which the bride is younger than the groom. The null hypothesis is that the proportion of all marriages in which the bride is younger than the groom is .50. In symbols, the null hypothesis is H0: π = .5. The alternative hypothesis is that the proportion of all marriages in which the bride is younger than the groom is more than .50. In symbols, the alternative hypothesis is Ha: π > .5.
Check technical conditions: 94 × .5 = 94 × (1 − .5) = 47 ≥ 10, but you must assume the data to be a representative sample of all marriages in this county. The sample proportion is p̂ = 67/94 = .713.
Test statistic: $z = \frac{.713 - .5}{\sqrt{(.5)(.5)/94}} = 4.13$
Using Table II, p-value = Pr(Z > 4.13) < .0002.
Test decision: With such a small p-value, reject H0 at any significance level.
Conclusion in context: You have very strong statistical evidence that the bride is younger than the groom in more than half of all marriages in Cumberland County, Pennsylvania. Note that your sample was selected entirely from the year 1993, so you would not be wise to extend this conclusion much beyond the early 90s.
Activity 17-23: Veterans’ Marital Problems
Define parameter of interest: Let π represent the proportion of all Vietnam veterans who are divorced. The null hypothesis is that the proportion of all Vietnam veterans who are divorced is .27. In symbols, the null hypothesis is H0: π = .27. The alternative hypothesis is that the proportion of all Vietnam veterans who are divorced is more than .27. In symbols, the alternative hypothesis is Ha: π > .27.
Check technical conditions: 2101 × .27 = 567.27 ≥ 10, 2101 × .73 = 1533.73 ≥ 10. You are not told how the sample of 2101 veterans was selected. The sample proportion is p̂ = 777/2101 = .37.
Test statistic: $z = \frac{.37 - .27}{\sqrt{(.27)(.73)/2101}} = 10.31$
Using Table II, p-value = Pr(Z > 10.31) < .0002.
Test decision: With such a small p-value, reject H0 at any significance level.
Conclusion in context: You have overwhelming statistical evidence that the proportion of all Vietnam veterans who are divorced is more than 27%. If the divorce rate for Vietnam veterans were really 27%, you could expect to see a sample result like this (777/2101 divorced veterans) in less than .02% of random samples. Because this sample result would almost never happen if the null hypothesis were true, you have strong evidence that your null hypothesis is false and 27% is not a plausible value for the divorce rate of Vietnam veterans. (You conclude that this rate is greater than 27%.)
Activity 17-24: Distinguishing Between Colas
a. The null hypothesis is that the subject would correctly identify the different brand of cola 1/3 of the time in the long run. In symbols, H0: π = .333. The alternative hypothesis is that the subject would correctly identify the different brand of cola more than 1/3 of the time in the long run. In symbols, Ha: π > .333.
b. The sample proportion is p̂ = 21/60 = .35. The test statistic is $z = \frac{.35 - .333}{\sqrt{(.333)(.667)/60}} = 0.28$. Using Table II, p-value = Pr(Z > 0.28) = .3897.
c. The sample proportion is p̂ = 30/60 = .5. The test statistic is $z = \frac{.5 - .333}{\sqrt{(.333)(.667)/60}} = 2.74$. Using Table II, p-value = Pr(Z > 2.74) = .0031.
d. Here is a sketch of the sampling distribution:
[Sketch: normal sampling distribution of the sample proportion of correct identifications, mean = .333, SD = .0608, with an area of .05 shaded above .433.]
You need the (p-value) area in the right tail of the curve to be at most .05, which requires a test statistic value of at least z = 1.645. Note the standard deviation is $\sqrt{(.333)(.667)/60} = .0608$. Solving for the sample proportion, you find
$\frac{\hat{p} - .333}{.0608} \geq 1.645$
$\hat{p} \geq 1.645(.0608) + .333 = .433$
and calculate .433 × 60 = 25.98, so the smallest number of correct identifications that would lead to rejecting the null hypothesis at the .05 level is 26.
e. Here is the sketch:
[Sketch: normal sampling distribution of the sample proportion of correct identifications, mean = .333, SD = .0608, with an area of .01 shaded above .475.]
You need the (p-value) area in the right tail of the curve to be at most .01, which requires a test statistic value of at least z = 2.33. You find
$\frac{\hat{p} - .333}{.0608} \geq 2.33$
$\hat{p} \geq 2.33(.0608) + .333 = .475$
and calculate .475 × 60 = 28.5, so the smallest number of correct identifications that would lead to rejecting the null hypothesis at the .01 level is 29.
f. The answer to part e is greater than the answer to part d. This makes sense because in part e you are using a smaller significance level, therefore requiring stronger statistical evidence that the subject is able to identify the different cola more than 1/3 of the time. This means you will need more correct identifications in order to be convinced that the null hypothesis is false.
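Parts d and e follow the same recipe, which can be sketched as a small function. This is an illustration under the same normal approximation the solutions use, with the critical values 1.645 and 2.33 taken from the text; the function name `smallest_rejecting_count` is hypothetical.

```python
import math

def smallest_rejecting_count(pi0, n, z_star):
    """Smallest number of successes whose sample proportion reaches the
    cutoff pi0 + z_star * SD, using the normal approximation."""
    sd = math.sqrt(pi0 * (1 - pi0) / n)
    cutoff = pi0 + z_star * sd          # smallest sample proportion that rejects H0
    return cutoff, math.ceil(cutoff * n)

# Activity 17-24: H0: pi = .333 with n = 60 trials
cutoff_05, count_05 = smallest_rejecting_count(0.333, 60, 1.645)
cutoff_01, count_01 = smallest_rejecting_count(0.333, 60, 2.33)

print(round(cutoff_05, 3), count_05)   # part d, .05 level
print(round(cutoff_01, 3), count_01)   # part e, .01 level
```

The count is rounded up because the number of correct identifications must be a whole number at or above the cutoff.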
Assessment Sample Quiz 17A
•••
Students enrolled in an introductory statistics course at a university were asked to take a survey that indicated whether the student has a visual or verbal learning style. Of the 39 students who took the survey, 25 were judged to have a visual learning style, and 14 were considered verbal learners. Consider these 39 students to be a random sample of students at this university. Conduct a test of significance of whether these sample data provide strong evidence that more than half of all students at the university have a visual learning style. Be sure
Topic 18: More Inference Considerations
The 95% CI for π is .54 ± .028, which is (.512, .568). The 99% CI for π is .54 ± .037, which is (.503, .577).
c. The midpoints are all the same, namely .54, the sample proportion of Santos supporters. The 99% CI is wider than the 95% CI, and the 90% CI is the narrowest.
d. Yes. All three intervals contain only values greater than .5, so they do suggest, even with 99% confidence, that more than half of the population would have favored Santos.
e. H0: π = .5 (half of the population favored Santos) Ha: π > .5 (more than half of the population favored Santos)
f. Because all three intervals fail to include the value .5, you know that the p-value for a two-sided alternative would be less than .10, .05, and .01. Because you have a one-sided alternative in this case, you know that the p-value will be less than .01 divided by 2, or .005 (because the observed sample proportion is in the conjectured direction).
g. A Type I error occurs when the null hypothesis is really true but is rejected. In this case, a Type I error would mean that you conclude that Santos was favored by more than half of the population when in truth he was not favored by more than half. In other words, committing a Type I error means concluding that Santos was ahead (favored by more than half) when he wasn’t really. A Type II error occurs when the null hypothesis is not really true but is not rejected (you continue to believe a false null hypothesis). In this case, a Type II error means that you conclude Santos was not favored by more than half of the population when in truth he was favored by more than half of the population. In other words, committing a Type II error means concluding that Santos was not ahead when he really was.
h. The test would be more powerful if Santos really were favored by 55% rather than 52%. The higher population proportion would make it more likely to reject the null hypothesis that only half of the population favored Santos because the distribution of sample proportions would center around .55 rather than .52 (further from .5).
i. The larger sample (10,000) would produce stronger evidence that more than half of the population favored Santos. With less variability in the sampling distribution, the p-value would be much smaller.
•••
Homework Activities Activity 18-7: Charitable Contributions
a. For a 90% CI, you calculate $.788 \pm (1.645)\sqrt{.788(1 - .788)/1334} = (.769, .806)$.
b. For a 99% CI, you calculate .788 ± (2.576)(.0112) = (.759, .817).
c. Yes, the sample proportion does differ from .75 at the .01 significance level because .75 is not in the 99% confidence interval. Thus, you would reject the null hypothesis H0: π = .75 vs. the two-sided alternative Ha: π ≠ .75, and would decide that .75 is not a plausible value for π.
d. No, the sample proportion does not differ from .80 at the .10 significance level because .80 is in the 90% confidence interval. Thus, you would not reject the null hypothesis H0: π = .80 vs. the two-sided alternative Ha: π ≠ .80. With 90% confidence, .80 seems to be a plausible value for π.
Activity 18-8: Female Senators
a. The test statistic is $z = \frac{.16 - .5}{\sqrt{(.5)(.5)/100}} = -6.8$.
b. The p-value = 2 × Pr(Z < −6.8) = .000.
c. No, this significance test has no meaning because you know that women constitute less than half of the entire 2007 U.S. Senate because you have taken a census of the population of interest. You know women make up exactly 16% of the 2007 Senate.
Activity 18-9: Distinguishing Between Colas
a. The null hypothesis is that the subject is just guessing and would correctly identify the different brand of soda 1/3 of the time in the long run. In symbols, the null hypothesis is H0: π = .3333. The alternative hypothesis is that the subject would correctly identify the different brand of soda more than 1/3 of the time in the long run. In symbols, the alternative hypothesis is Ha: π > .3333.
b. You can’t tell whether Randy’s sample data would necessarily lead to rejecting the null hypothesis because you don’t know how many times he ran the experiment. If he only tried twice and succeeded 50% of the time, this would not be convincing evidence that he is doing better than guessing, but if he tried 500 times and succeeded 50% of the time, this would be very convincing evidence.
c. If n = 50 and p̂ = .46, the test statistic is $z = \frac{.46 - .3333}{\sqrt{(.3333)(.6667)/50}} = 1.90$ and the p-value = Pr(Z > 1.90) = .0287 < .05. Yes, at the .05 significance level, reject H0 and conclude that Randy is doing better than guessing.
d. Assuming Randy’s success rate is 50%, the CLT says Pr(p̂ > .46) = .7142 with samples of size 50. So, you expect about 71% of random samples to yield a sample proportion greater than .46 if Randy’s success rate is 50% (therefore more than a 50/50 chance for Randy to do so).
e. The probability will increase because with the larger sample size the standard deviation will decrease from .0707 to .05. With the larger sample size, you expect more sample proportions to be close to .5 and therefore greater than .46.
f. The probability will increase because the standard deviation will decrease to .06665 and (more importantly) because .46 will be so much further below 2/3 than it is below 1/2. If the population proportion is even further above .46, the probability of obtaining a sample proportion of .46 or more quickly increases.
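The probabilities discussed in parts d through f can be approximated with the CLT in a few lines. This is a sketch only: the function name is illustrative, the sample size of 100 in part e is inferred from the stated standard deviation of .05, and the normal probabilities come from the error function rather than Table II.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability Pr(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def prob_p_hat_exceeds(cutoff, pi, n):
    """CLT approximation to Pr(p-hat > cutoff) when the true proportion is pi."""
    sd = math.sqrt(pi * (1 - pi) / n)
    return 1 - normal_cdf((cutoff - pi) / sd)

part_d = prob_p_hat_exceeds(0.46, 0.5, 50)       # success rate 50%, n = 50
part_e = prob_p_hat_exceeds(0.46, 0.5, 100)      # larger sample (SD = .05)
part_f = prob_p_hat_exceeds(0.46, 2 / 3, 50)     # success rate 2/3, n = 50

print(round(part_d, 4), round(part_e, 4), round(part_f, 4))
```

Each change (a larger sample, or a true success rate further above .46) raises the probability of clearing the .46 cutoff, which is the pattern parts e and f describe.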
Activity 18-10: Voter Turnout
a. For a 99.9% CI, you calculate $.682 \pm (3.291)\sqrt{.682(1 - .682)/1783} = (.652, .712)$.
b. No, this was a self-reported answer about whether people voted. Those surveyed may have lied to the interviewer or have forgotten whether they voted. It is unlikely that this confidence interval is reliable.
c. No, 49% is not within the confidence interval reported in part a.
d. Yes, you are 99.9% confident that this confidence interval succeeds in capturing the proportion of all adult Americans who would claim to have voted in 1996. This would be a very different proportion from those who actually did vote.
e. You do not need to conduct the significance test because .49 is not contained in the 99.9% confidence interval. You know that you would reject H0: π = .49 vs. Ha: π ≠ .49 at the .1% level of significance.
Activity 18-11: Phone Book Gender
a. Ignoring the initials, 77 + 14 + 2(36) or 163 first names are listed in the phone book; (14 + 36)/163 or 50/163 or .307 of those names are female.
b. The observational units in this study are listings in the San Luis Obispo County telephone book.
c. The population is all first names in the San Luis Obispo County phone book, and the parameter is the proportion of those names that are female.
d. The null hypothesis is that half the first names in the San Luis Obispo County phone book will be female. In symbols, the null hypothesis is H0: π = .5. The alternative is that less than half the first names in the San Luis Obispo County phone book will be female. In symbols, the alternative hypothesis is Ha: π < .5. The test statistic is $z = \frac{.307 - .5}{\sqrt{(.5)(.5)/163}} = -4.93$. The p-value = Pr(Z < −4.93) = .0000. With this very small p-value, reject H0 and conclude that you have very strong statistical evidence that less than half the first names in the San Luis Obispo County phone book are female.
e. For a 95% CI, you calculate $.307 \pm (1.96)\sqrt{.307(1 - .307)/163} = (.236, .377)$. You are 95% confident the proportion of female first names in the San Luis Obispo County phone book is between .236 and .377.
f. Because the data used to create this confidence interval and run this significance test were not collected from a simple random sample and are not likely to be representative of the proportion of women in San Luis Obispo County, you should not use these data to try to determine whether half of the residents of San Luis Obispo County are female. Note that living in the county is very different from being listed as a female in the phone book.
Activity 18-12: Racquet Spinning
a. Let π represent the proportion of times a spun tennis racquet would land “up” in the long run.
b. Because you have no prior suspicion that either “up” or “down” is more likely than the other, the alternative hypothesis is two-sided.
c. The null hypothesis is that a spun tennis racquet would land “up” half the time. In symbols, the null hypothesis is H0: π = .5. The alternative is that a spun tennis racquet would not land “up” half the time. In symbols, the alternative hypothesis is Ha: π ≠ .5.
d. The test statistic is $z = \frac{.46 - .5}{\sqrt{(.5)(.5)/100}} = -0.80$. p-value = 2 × Pr(Z < −0.8) = 2 × .2119 = .4238
e. The test statistic is $z = \frac{.54 - .5}{\sqrt{(.5)(.5)/100}} = 0.80$. p-value = 2 × Pr(Z > 0.8) = 2 × .2119 = .4238
f. In both of these cases, the p-values are greater than .05, so do not reject the null hypothesis at the .05 significance level. You have not found statistical evidence that would cause you to doubt that a spun tennis racquet will land “up” half the time.
Activity 18-13: Racquet Spinning
a. H0: π = .5; Ha: π < .5
b. The test statistic is $z = \frac{.46 - .5}{\sqrt{(.5)(.5)/100}} = -0.80$. p-value = Pr(Z < −0.8) = .2119. This is the same test statistic that you found for the two-sided test, but the p-value is half as large (see Activities 18-12 and 17-3).
c. The test statistic is $z = \frac{.54 - .5}{\sqrt{(.5)(.5)/100}} = 0.80$. p-value = Pr(Z < 0.8) = .7881. This test statistic has the same value, regardless of whether the test is one- or two-sided. When the test is two-sided, the p-value is twice as large.
d. The formal test of Ha: π < .5 is not necessary when the sample proportion p̂ = .54 is greater than .5. You know this sample proportion will not be able to provide evidence that π < .5. Because the observed result is in the wrong direction, the p-value will exceed .5.
Activity 18-14: Penny Activities
a. Sample data on flipping pennies: H0: π = .5; Ha: π ≠ .5; p̂ = 26/50. The test statistic is $z = \frac{.52 - .5}{\sqrt{(.5)(.5)/50}} = 0.28$. p-value = 2 × Pr(Z > 0.28) = 2 × .3886 = .7772. With this large p-value, do not reject H0. You have no reason to doubt that a flipped penny will land heads up half the time.
Sample data on spinning pennies: H0: π = .5; Ha: π ≠ .5; p̂ = 27/50. The test statistic is $z = \frac{.54 - .5}{\sqrt{(.5)(.5)/50}} = 0.566$. p-value = 2 × Pr(Z > 0.566) = 2 × .2858 = .5716. With the large p-value, do not reject H0. You have no reason to doubt that a spun penny will land heads up half the time.
Sample data on tilting pennies: H0: π = .5; Ha: π ≠ .5; p̂ = 32/50. The test statistic is $z = \frac{.64 - .5}{\sqrt{(.5)(.5)/50}} = 1.98$. p-value = 2 × Pr(Z > 1.98) = 2 × .0239 = .0478. Because the p-value = .0478 < .05, reject H0. You have moderate statistical evidence that a tilted penny will not land heads up half the time.
b. Yes, the significance test results are consistent with the 95% confidence intervals found in Activity 16-9. The proportion .5 is contained in the 95% confidence intervals for both the flipped and spun penny, indicating that .5 is a plausible value for π, but this value (.5) is not in the 95% confidence interval for the tilted penny.
Activity 18-15: Penny Activities
a. A 95% CI for flips: $14709/29015 \pm (1.96)(.002931) = (.5012, .5127)$
A 95% CI for spins: $9197/20422 \pm (1.96)(.003482) = (.4435, .4572)$
A 95% CI for tilts: $10087/14611 \pm (1.96)(.003825) = (.683, .698)$
b. These intervals are so narrow because the sample sizes are so large (which makes the margins-of-error very small).
c. You would reject $H_0\colon \pi = .5$ vs. $H_a\colon \pi \neq .5$ at the .05 significance level for all three situations (flips, spins, and tilts) because .5 is not contained in any of the 95% confidence intervals.
d. $H_0\colon \pi = .5$ vs. $H_a\colon \pi \neq .5$
For flips, the test statistic is $z = \dfrac{.5069 - .5}{\sqrt{(.5)(.5)/29015}} = 2.37$.
p-value $= 2\Pr(Z > 2.37) = 2(.0089) = .0178 < .05$. With the p-value less than .05, reject $H_0$; the statistical evidence suggests that the probability of obtaining a head with a flipped penny is not .5 (at the .05 level).
For spins, the test statistic is $z = \dfrac{.450 - .5}{\sqrt{(.5)(.5)/20422}} = -14.19$.
p-value $= 2\Pr(Z < -14.19) \approx 2(.0000) = .0000 < .05$. Reject $H_0$; there is very strong statistical evidence that the probability of obtaining a head with a spun penny is not .5 (at the .05 level).
For tilts, the test statistic is $z = \dfrac{.690 - .5}{\sqrt{(.5)(.5)/14611}} = 46.02$.
p-value $= 2\Pr(Z > 46.02) \approx 2(.0000) = .0000 < .05$. Reject $H_0$; with the very large test statistic, there is overwhelming statistical evidence that the probability of obtaining a head with a tilted penny is not .5 (at the .05 level).
Yes, these results agree with your answer to part c.
e. No; although you are confident that the probability of obtaining a head with a flipped penny is not .50, the probability is quite close to .50. The confidence interval tells you this probability is somewhere between .5012 and .5127, which for all practical purposes is .50.
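The two-sided one-proportion z-tests used for the penny data can be checked with a short script. This is only a sketch: the function name `one_proportion_z_test` and the dictionary layout are illustrative choices, not part of the text, and the p-value comes from Python's standard normal distribution rather than from a printed table, so the last digits may differ slightly from the table-based answers.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, pi0=0.5):
    """Return (p_hat, z, two-sided p-value) for H0: pi = pi0 vs. Ha: pi != pi0."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis value pi0.
    se0 = sqrt(pi0 * (1 - pi0) / n)
    z = (p_hat - pi0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


# Head counts and sample sizes as reported in the solutions above.
penny_data = {
    "spins (n = 50)": (27, 50),
    "tilts (n = 50)": (32, 50),
    "flips (large sample)": (14709, 29015),
    "spins (large sample)": (9197, 20422),
    "tilts (large sample)": (10087, 14611),
}

for label, (heads, n) in penny_data.items():
    p_hat, z, p_value = one_proportion_z_test(heads, n)
    print(f"{label}: p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Running the same function on the small and large samples makes the point of part e concrete: the large-sample flips have a sample proportion very close to .5, yet the test statistic is still large enough to reject.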
Activity 18-16: Hypothetical Baseball Improvements
Answers will vary. The following is based on one representative use of the applet: From the simulation, the approximate power is 108/200 or .54 (using 30 at-bats).
If the player improves so he’s a .400 hitter (rather than a .333 hitter), the power of the test will increase quite a bit. When the player improves to being a .333 hitter, the power is approximately 14%, which means that it is unlikely his improvement will be noticed in 30 at-bats. However, if the player improves to being a .400 hitter, the power becomes approximately 54%, which means that his chances of displaying his improvement in 30 at-bats are better than not.
Activity 18-17: Hypothetical Baseball Improvements
Answers will vary. The following is based on one representative use of the applet: From the simulation, the approximate power is 52/200 or .26 (using 30 at-bats, alternative value of .333).
If you use a greater significance level, the power of the test will increase. When the significance level is .05, the player needs 13 hits in 30 at-bats in order to demonstrate his improvement, and the power of the test is approximately 14%, which means that it is unlikely his improvement will be noticed. However, if you use a significance level of .10, the player needs fewer hits (12 in this case) in order to demonstrate his improvement, and the power of the test becomes approximately 26%; his chances of displaying his improvement in 30 at-bats have almost doubled.
Activity 18-18: Flat Tires
a. The null hypothesis is that one-fourth of the students at your school would choose the right-front tire when asked the “flat tire” question. In symbols, the null hypothesis is $H_0\colon \pi = .25$. The alternative hypothesis is that more than one-fourth of the students at your school would choose the right-front tire. In symbols, the alternative hypothesis is $H_a\colon \pi > .25$.
b. A Type I error would be concluding that more than one-fourth of the students will choose the right-front tire when, in fact, the proportion is not more than one-fourth. A Type II error would be failing to realize that more than one-fourth of the students will choose the right-front tire when asked the “flat tire” question.
c. Answers will vary. Here is one representative running of the applet:
The simulation indicates that you are virtually certain to recognize that the right-front tire is being chosen more than one-fourth of the time. The approximate power is 100%. d. If the sample size were increased from 100 to 200 and all else remained the same, the power of the test would increase. Of course, in this case, it is impossible for the power to increase beyond 100%.
e. If the significance level were increased from .05 to .10 and all else remained the same, the power of the test would increase. Of course, in this particular case, the power cannot increase because it is already 100%.
f. If the right-front tire were actually chosen 40% of the time instead of 50% of the time, and all else remained the same, the power of the test would decrease. This result is verified by the simulation, which shows that the power of the test decreased to about 93.5% in this case.
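The power statements in parts c through f can be checked without the applet. The sketch below is an assumption-laden substitute for the simulation: it uses an exact binomial rejection rule (reject when the count of right-front choices is at least the smallest k whose upper-tail probability under $\pi = .25$ is at most the significance level), which may not be the rule the applet uses, so the numbers should be close to the simulated ones but need not match them. The function names are illustrative.

```python
from math import comb


def upper_tail(k, n, p):
    """Pr(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def critical_count(n, pi0, alpha):
    """Smallest count k such that Pr(X >= k | pi0) <= alpha."""
    for k in range(n + 2):
        # At k = n + 1 the tail is an empty sum (0), so the loop always returns.
        if upper_tail(k, n, pi0) <= alpha:
            return k


def power(n, pi0, pi_alt, alpha=0.05):
    """Probability of rejecting H0: pi = pi0 when the true proportion is pi_alt."""
    return upper_tail(critical_count(n, pi0, alpha), n, pi_alt)


print("n=100, alt=.5, alpha=.05:", round(power(100, 0.25, 0.5), 4))
print("n=200, alt=.5, alpha=.05:", round(power(200, 0.25, 0.5), 4))
print("n=100, alt=.5, alpha=.10:", round(power(100, 0.25, 0.5, alpha=0.10), 4))
print("n=100, alt=.4, alpha=.05:", round(power(100, 0.25, 0.4), 4))
```

The same three levers discussed in the answers (sample size, significance level, and distance of the alternative from the null value) are the three arguments that change between the calls.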
Activity 18-19: Emotional Support
a. Hite margin-of-error: $(1.96)\sqrt{\dfrac{(.96)(.04)}{4500}} = .005726$
ABC News/Washington Post margin-of-error: $(1.96)\sqrt{\dfrac{(.44)(.56)}{767}} = .03513$
Hite 95% confidence interval: $.96 \pm .005726 = (.954, .966)$
ABC News/Washington Post 95% confidence interval: $.44 \pm .03513 = (.405, .475)$
b. No, these two intervals are not similar. They do not overlap at all and their widths are quite different.
c. The Hite survey has the smaller margin-of-error (and the narrower confidence interval). d. You have much more confidence (95%) in the ABC News/Washington Post confidence interval because their sample proportion was obtained from a random sample, unlike Hite’s. Because Hite did not use a random sample, you really can’t make any confidence level statements for that interval.
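The margins-of-error in part a follow directly from the one-proportion interval formula, and a few lines of code reproduce them. This is a minimal sketch; the helper name `proportion_ci` is hypothetical. Note that the arithmetic works equally well for both surveys, which is exactly why part d matters: the formula cannot tell a random sample from a self-selected one.

```python
from math import sqrt


def proportion_ci(p_hat, n, z_star=1.96):
    """Return (margin_of_error, (lower, upper)) for a one-proportion z-interval."""
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return margin, (p_hat - margin, p_hat + margin)


# Sample proportions and sample sizes as reported in part a.
surveys = {"Hite": (0.96, 4500), "ABC News/Washington Post": (0.44, 767)}

for name, (p_hat, n) in surveys.items():
    margin, (low, high) = proportion_ci(p_hat, n)
    print(f"{name}: margin = {margin:.6f}, 95% CI = ({low:.3f}, {high:.3f})")
```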
Activity 18-20: Veterans’ Marital Problems
a. Let $\pi$ represent the population proportion of all Vietnam veterans who are divorced. The null hypothesis is that the percentage of all Vietnam veterans who are divorced is 27%. In symbols, the null hypothesis is $H_0\colon \pi = .27$. The alternative is that the percentage of all Vietnam veterans who are divorced is more than 27%. In symbols, the alternative hypothesis is $H_a\colon \pi > .27$.
b. A Type I error would be deciding that the Vietnam veteran divorce rate is more than 27% when it is actually no more than 27%.
c. A Type II error would be failing to realize that the Vietnam veteran divorce rate is more than 27%.
Activity 18-21: Hiring Discrimination A Type I error would be deciding that there is discrimination (concluding that not enough African Americans are hired by the Hazelwood School District) when, in fact, there is no discrimination. A Type II error would be failing to realize that the Hazelwood School District was discriminating (not hiring enough African Americans). In this case, which type of error is more serious is a personal opinion (false accusations vs. allowing discrimination to continue).
Assessment Sample Quiz 18A
•••
In the August 12, 2007, issue of Parade magazine (which comes with the Sunday newspaper for millions of Americans), readers were asked to go online and vote on this question: Should the drinking age be lowered? The results were published in the October 7 issue; more than 14,000 readers voted, and 48% said “yes.” 1. Use these sample data to determine a 99% confidence interval for the population proportion who favor lowering the drinking age. 2. Report the margin-of-error for this confidence interval. 3. Explain why this interval is so narrow. 4. Without conducting a test of significance, what can you say about the p-value if you were to test whether the population proportion differs from one-half? Explain. 5. Does Parade’s sampling method give you any reason to doubt that the confidence interval in question 1 is valid? Explain.
Topic 19: Confidence Intervals: Means
e. The first condition is that the sample be randomly selected from the population. This is not literally true in this case because the student researchers did not obtain a list of all students at the university and select randomly from that list, but they did try to obtain a representative sample. The second condition is either that the population of weight ratios is normal or that the sample size is large. In this case, the sample size is large ($n = 100$, which is greater than 30), so this condition is satisfied even though the distribution of ratios in the sample is somewhat skewed (and so presumably is the population). f. You do not expect 99% of the sample, nor 99% of the population, to have a weight ratio between .0673 and .0869. You are 99% confident that the population mean weight ratio is between these two endpoints. In fact, only 18 of the 100 students in the sample have a weight ratio in this interval.
•••
Homework Activities
Activity 19-7: Body Temperatures
a. For males, $98.394 \pm (2.00)(.743/\sqrt{65}) = (98.2097, 98.5783)$°F using Table III with 60 degrees of freedom, or (98.2096, 98.5781)°F using technology.
For females, $98.105 \pm (2.00)(.699/\sqrt{65}) = (97.932, 98.278)$°F using Table III with 60 degrees of freedom, or (97.9315, 98.2778)°F using technology.
b. You are 95% confident the average body temperature of healthy adult males is between 98.2°F and 98.6°F. You are 95% confident the average body temperature of healthy adult females is between 97.9°F and 98.3°F.
c. These intervals suggest that, on average, women may have a slightly lower body temperature than men.
d. Half-width males: .184; half-width females: .173. The confidence interval for the males is wider. This is because the sample standard deviation is slightly larger for the males than it is for the females (.743 vs. .699).
e. The half-width of the interval based on the entire sample is $.2545/2 = .12725$. This is significantly smaller than the half-width of either the male or female interval. You expect this, however, because the sample size is twice as large for this interval, so the margin-of-error will be smaller by at least a factor of $\sqrt{2}$.
f. For both genders, the sample sizes are large, so you do not need to check that the populations are normally distributed. You do need to assume that you have a simple random sample of healthy adult males and a simple random sample of healthy adult females in order for the t-intervals to be valid.
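The t-intervals in part a all have the form "sample mean plus or minus t* times s over the square root of n", so they are easy to reproduce. The sketch below takes the critical value as an input (here the Table III value 2.00 for 60 degrees of freedom) because Python's standard library has no t-distribution; the function name `t_interval` is an illustrative choice.

```python
from math import sqrt


def t_interval(mean, sd, n, t_star):
    """Return (lower, upper, half_width) for mean +/- t_star * sd / sqrt(n)."""
    half_width = t_star * sd / sqrt(n)
    return mean - half_width, mean + half_width, half_width


# Summary statistics from the solution; t* = 2.00 is the Table III value for df = 60.
groups = {"males": (98.394, 0.743, 65), "females": (98.105, 0.699, 65)}

for name, (mean, sd, n) in groups.items():
    low, high, half = t_interval(mean, sd, n, t_star=2.00)
    print(f"{name}: ({low:.4f}, {high:.4f}), half-width = {half:.3f}")
```

Because the two groups share the same n and t*, the only thing that changes the half-width is the standard deviation, which is the comparison part d asks about.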
Activity 19-8: Credit Card Usage
a. You need to know the sample standard deviation.
b. For a 95% CI, you calculate $\$2169 \pm (1.965)(\$1000/\sqrt{1074}) = (\$2109.04, \$2228.96)$ using Table III with 500 degrees of freedom. With technology: ($2109.10, $2228.90). You are 95% confident the average credit card balance in the population of all undergraduate students who held a credit card in 2004 was between $2109 and $2229.
c. No, you should not expect 95% of the population to have a credit card balance in this interval. This interval predicts the average balance of credit card holders; it says nothing about the balance of individual credit card holders.
d. If the standard deviation were $2000, you calculate $\$2169 \pm (1.965)(\$2000/\sqrt{1074}) = (\$2049.08, \$2288.92)$ using Table III with 500 degrees of freedom. With technology: ($2049.30, $2288.70). You would be 95% confident the average credit card balance in the population of all undergraduate students who held a credit card in 2004 was between $2049 and $2289.
e. The second interval (with standard deviation of $2000) is wider. This makes sense because a larger standard deviation will cause a larger margin-of-error. With more variation in the credit card amounts, you will not obtain as precise an estimate of the population mean.
f. A standard deviation of $1000 is much more plausible for these data than a standard deviation of $2000. If the mean account balance is $2169, then the “typical” deviation from this mean is not likely to be $2000, as this would imply that a significant proportion of balances are negative (though you must also be cautious about the probable strong skewness in the data inflating the value of the sample standard deviation).
Activity 19-9: Social Acquaintances
Answers will vary. The following are one representative set of answers:
a. $\bar{x} = 40.83$, $s = 25.25$, $n = 35$, df $= 34$
b. For a 90% CI, $40.83 \pm (1.691)(25.25/\sqrt{35}) = (33.6128, 48.0472)$ people. You are 90% confident the population mean number of acquaintances of all students at your school is between 33.6 and 48 people. In this context, 90% confident means that if you were to create similar intervals over and over again, using random samples of 35 students from your school, in the long run 90% of the intervals you created would contain the unknown population mean number of acquaintances of all students at your school, and 10% of the intervals would not contain this value $\mu$. You will never know for certain whether a particular interval does, or does not, contain $\mu$.
c. For this class, $10/35 = .28571$ of the sampled students have an acquaintance number that falls within this confidence interval. (See Activity 9-8.)
d. This proportion is not at all close to 90%, and there is no reason that it should be because the confidence interval only tells you roughly what the population mean should be. It does not tell you what the individual sample values should be.
Activity 19-10: Social Acquaintances
a. You calculate $36.1 \pm (1.664)(25.25/\sqrt{99}) = (31.88, 40.32)$ using Table III with 80 degrees of freedom, and (31.89, 40.32) using Minitab.
b. The following numerical summary and graph display the data:

Group       Min   QL   Median   QU   Max   Mean    SD
Our Class   11    23   37       52   124   40.83   25.25
Cal Poly    6     19   31       46   139   36.1    25.25

[Boxplots of Number of Acquaintances for Cal Poly and Our Class, on a scale from 0 to 140, with high outliers marked.]
The data collected from both of these classes is very similar. The mean number of acquaintances for the Cal Poly class is 36.10 people, whereas for this class it is 40.8 people. Both classes have minimums below 20 people (6 and 11, respectively) and high outliers above 100 people. The standard deviation for both classes is 25.25 people and the IQR for the Cal Poly class is 27 people whereas for this class it is 29 people. The confidence intervals are similar too. This class’s interval is wider (because of the smaller sample size) and has a slightly higher center. You are 90% confident the Cal Poly students have a mean number of acquaintances between 32 and 40 people, and students at this school have a mean number of acquaintances between 33.6 and 48 people.
Activity 19-11: Nicotine Lozenge
a. The population is all smokers who wish to quit smoking. The parameter of interest is the mean number of cigarettes smoked per day by the population of smokers.
b. For a 99% CI, you calculate $22 \pm (2.576)(10.8/\sqrt{1818}) = (21.3475, 22.6525)$ cigarettes per day with degrees of freedom equal to infinity, or $22 \pm (2.586)(10.8/\sqrt{1818}) = (21.34, 22.66)$ with 500 degrees of freedom.
c. No; based on this interval, it does not seem plausible to assert that the population mean is 20 cigarettes per day because 20 is not contained in this confidence interval. It appears that the average number of cigarettes smoked per day is slightly higher: more than 21.3 cigarettes per day.
Activity 19-12: Sleeping Times a. A larger standard deviation would result in a larger margin-of-error and so a wider confidence interval. The midpoint of the confidence interval would not change.
b. A smaller sample size would result in a larger margin-of-error (dividing by a smaller number) and so the confidence interval would be wider. The midpoint of the confidence interval would not change. c. If the sample mean had been larger by .5 hours, the midpoint of the confidence interval would be larger by .5 hours, but the width of the confidence interval would not change. d. If each person’s sleep time had been 15 minutes longer, the sample mean sleep time would be .25 hours longer, so the midpoint of the confidence interval would be shifted .25 hours to the right (larger), but the standard deviation would not change, so the width of the confidence interval would not change.
Activity 19-13: Critical Values
a. Here is the completed table:

                      Confidence Levels
Degrees of Freedom    80%     90%     95%     99%
4                     1.533   2.132   2.776   4.604
11                    1.363   1.796   2.201   3.106
23                    1.319   1.714   2.069   2.807
80                    1.292   1.664   1.990   2.639
Infinity              1.282   1.645   1.960   2.576
b. The critical value t* gets larger as the confidence level increases. c. The critical value t* gets smaller as the number of degrees of freedom increases.
d. Yes, the critical values from the t-distribution corresponding to infinitely many degrees of freedom are the z* critical values.
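Part d can be verified numerically: the "Infinity" row of the table should match the z* critical values of the standard normal distribution. The sketch below computes z* with `statistics.NormalDist` from Python's standard library; the helper name `z_star` is illustrative.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical value z* leaving (1 - confidence) / 2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


# "Infinity" row of the completed table, keyed by confidence level.
infinity_row = {0.80: 1.282, 0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

for level, table_value in infinity_row.items():
    print(f"{level:.0%}: z* = {z_star(level):.3f}, table value = {table_value:.3f}")
```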
Activity 19-14: Sentence Lengths
a. The following graph displays the data on sentence lengths:
[Dotplot of Sentence Length, horizontal axis from 8 to 64 words.]
$\bar{x} = 20.5$, $n = 28$, $s = 12$, five-number summary (5, 13, 17.5, 26.75, 64). These sentence lengths are slightly skewed to the right, but have one extreme outlier, a sentence with 64 words! The remaining sentences have between 5 and 34 words each. A typical sentence length is about 17 words. About 70% of the sentences in this sample have fewer than 24 words.
b. For a 95% CI with 27 degrees of freedom, you calculate $20.5 \pm (2.052)(12/\sqrt{28}) = (15.8465, 25.1535)$.
c. Technical conditions: The technical conditions necessary for the validity of this procedure do not seem to be satisfied. The sample size is a bit less than 30 and the data do not appear to come from a normal distribution, as evidenced by the extreme outlier at 64 and by the following probability plot:
[Normal probability plot: Percentage versus Sentence Length.]
Therefore, you should be cautious in interpreting the 95% confidence interval.
d. You calculate $18.89 \pm (2.056)(8.6/\sqrt{27}) = (15.4872, 22.2928)$ with 26 degrees of freedom. Removing the outlier had only a small effect on the interval. The width of the interval has decreased by about three words.
Activity 19-15: Coin Ages
Answers will vary. The following is one representative set of answers:
a. Using line 7 of the Random Digits Table, you select coins numbered 835, 944, 872, 096, 632, 397, 245, 031, 891, and 370, which have ages 22, 29, 23, 1, 15, 7, 4, 0, 19, and 6 years, respectively.
b. For a 90% CI with 9 degrees of freedom, you calculate $12.6 \pm (1.833)(10.3/\sqrt{10}) = (6.63, 18.57)$.
c. Technical conditions: No, the technical conditions for this procedure have not been met. You did take a random sample, but it was small ($n < 30$), and the population definitely is not normally distributed. It has a very strong skew to the right. Therefore, your confidence level statement could be inaccurate.
d. Yes, the interval succeeds in capturing the population mean, $\mu = 12.26$ years. (This may not always be the case, however.)
e. Yes, if you planned to construct a 95% confidence interval, this method would be more likely to capture the population mean than the 90% confidence interval method because the resulting interval would have been wider.
f. If you planned to take a random sample of 40 pennies instead of 10 pennies, the 90% confidence interval method would be more appropriate (the technical conditions for the procedure to be valid would be met as the sample size would now be large enough). However, assuming both methods are valid, it would still be a 90% confidence interval so, in the long run, this interval method has the same chance of capturing the parameter as the method used in part b. But the interval would probably be narrower (the sample mean of 40 pennies is more likely to be close to the population mean than the sample mean of 10 pennies).
Activity 19-16: Children’s Television Viewing
a. These values are statistics because they describe a sample.
b. Technical conditions: The sample sizes are large enough, but the samples were not randomly selected. Apparently the researchers used all third- and fourth-graders at two elementary schools and allowed them to self-report the number of hours that they watched television. Thus, these data may not be representative of all third- and fourth-graders in the San Jose area and certainly should not be extended beyond that region. No cause-and-effect conclusions should be drawn from any group differences.
c. 90% CI: $15.41 \pm (1.660)(14.16/\sqrt{198}) = (13.74, 17.08)$ hours, with df $= 100$
95% CI: $15.41 \pm (1.984)(14.16/\sqrt{198}) = (13.41, 17.41)$ hours, with df $= 100$
99% CI: $15.41 \pm (2.626)(14.16/\sqrt{198}) = (12.77, 18.05)$ hours, with df $= 100$
d. Answers will vary according to guesses in the Preliminaries.
e. With the z-critical values, the intervals become
90% CI: $15.41 \pm (1.645)(14.16/\sqrt{198}) = (13.75, 17.07)$ hours
95% CI: $15.41 \pm (1.960)(14.16/\sqrt{198}) = (13.44, 17.38)$ hours
99% CI: $15.41 \pm (2.576)(14.16/\sqrt{198}) = (12.82, 18.00)$ hours
There is very little difference in the intervals.
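The comparison in part e can be laid out side by side. In this sketch the t* values are the Table III entries for 100 degrees of freedom quoted in part c, while z* is computed from the standard normal distribution with `statistics.NormalDist`; the function name `compare_intervals` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

MEAN, SD, N = 15.41, 14.16, 198  # summary statistics used in part c
T_TABLE = {0.90: 1.660, 0.95: 1.984, 0.99: 2.626}  # Table III, df = 100


def compare_intervals(mean=MEAN, sd=SD, n=N, t_table=T_TABLE):
    """Return {level: (t_interval, z_interval)} for each confidence level."""
    se = sd / sqrt(n)
    results = {}
    for level, t_star in t_table.items():
        z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
        t_int = (mean - t_star * se, mean + t_star * se)
        z_int = (mean - z_star * se, mean + z_star * se)
        results[level] = (t_int, z_int)
    return results


for level, (t_int, z_int) in compare_intervals().items():
    print(f"{level:.0%}: t-interval ({t_int[0]:.2f}, {t_int[1]:.2f}), "
          f"z-interval ({z_int[0]:.2f}, {z_int[1]:.2f})")
```

With a sample this large the t* and z* multipliers differ only in the second decimal place, so each z-interval is slightly narrower than its t counterpart but practically the same.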
Activity 19-17: Close Friends
a. The observational units are the adult Americans who were interviewed by the GSS. The variable is the number of close friends that an adult American has. This variable is quantitative.
b. A t-interval is valid in spite of the strong right skew because the sample size is very large (1467).
c. For a 90% CI, you calculate $1.987 \pm (1.645)(1.7708/\sqrt{1467}) = (1.911, 2.0631)$ friends, with degrees of freedom equal to infinity.
d. The reasonable interpretations of this interval are:
“You can be 90% confident that the mean number of close friends in the population is between the endpoints of this interval.”
“If you repeatedly took random samples of 1467 people and constructed t-intervals in this same manner, 90% of the intervals in the long run would include the population mean number of close friends.”
e. “Ninety percent of all people in this sample reported a number of close friends within this interval” is incorrect because a confidence interval claims to capture the population mean, not individual members of the sample.
“If you took another sample of 1467 people, there is a 90% chance that its sample mean would fall within this interval” is incorrect because you are not trying to capture sample means; you are trying to capture the population mean. This statement would be correct if the interval had been constructed around the population mean rather than the sample mean.
“If you repeatedly took random samples of 1467 people, this interval would contain 90% of your sample means in the long run” is not a correct interpretation because you cannot predict how many of the other sample means it would contain; the interval procedure is estimating the population mean. You are not saying other sample means should be within two standard deviations of the one you observed, but that sample means in general should fall within two standard deviations of the actual population mean.
It is incorrect to say, “This interval captures the number of close friends for 90% of the people in the population” because this interval estimates the mean number of friends, not the number of individual friends for any person.
f. If the sample size were larger, the interval would have the same midpoint, but would be narrower. If the sample mean were larger, the interval would have a larger midpoint, but would have the same width. If the sample values were less spread out, the standard deviation would be smaller, so the margin-of-error would be smaller, so the interval would be narrower (but would have the same midpoint). If every person in the sample reported one more close friend, the sample mean would be greater (by 1), so the midpoint of the interval would increase by 1, but the width would be unchanged.
Activity 19-18: Close Friends
a. For a 90% CI, you calculate $397/1467 \pm 1.645(.0116) = .271 \pm .0191 = (.252, .290)$. You are 90% confident that between 25.2% and 29.0% of the population would report having 0 close friends.
b. You calculate $166/1467 \pm 1.645(.008271) = .113 \pm .0136 = (.0996, .1268)$. You are 90% confident that between 9.96% and 12.68% of the population would report having five or more close friends.
c. These intervals give you an estimate of the population proportion of people who have no friends or many friends, but they give you no information about how many friends the typical person has. Reporting a t-interval for the population mean would give you this information.
Activity 19-19: Sleeping Times
a. This is a legitimate interpretation of the interval.
b. This is a legitimate interpretation of the interval.
c. This is not a legitimate interpretation of the interval. The interpretation is not technically correct because $\mu$ is not random. It has some fixed (but unknown) value, so $\mu$ is either inside the interval you produced or it is not; it is not sometimes between the two numbers you calculated and sometimes not between those two numbers. (See the Watch Out comments at the end of Activity 16-14.)
d. This is not a legitimate interpretation of the interval; the confidence interval is attempting to estimate the average student sleep time. It says nothing about any individual student’s sleep time.
e. This is not a legitimate interpretation of the interval; the confidence interval is attempting to estimate the average student sleep time in the population. It says nothing about any individual student’s sleep time, so it cannot predict what percentage of the students’ sleep times in the population will fall in this interval.
Activity 19-20: Planetary Measurements
a. For a 95% CI, you calculate $1102 \pm (2.306)(1341/3) = (71.208, 2132.78)$ million miles, with 8 degrees of freedom.
b. No, this interval does not make any sense at all. It is attempting to estimate the average distance from the sun of all planets (including Pluto) in this solar system. But you do not need to estimate this value (the population mean) because you know this value is 1102 million miles.
Activity 19-21: Hypothetical ATM Withdrawals a. Although all three distributions have the same number of withdrawals (50), the same midpoint ($70), and the same standard deviation ($30.30), their dotplots are very dissimilar. There were only two distinct amounts withdrawn from the first ATM, $40 and $100, and these amounts were withdrawn in equal proportions. There were only three amounts withdrawn from the third ATM: $20 and $120 were withdrawn about nine times each, whereas $70 was withdrawn about three times as often from this ATM. There were many different amounts withdrawn from the second ATM. Every other amount ($20, $40, $60, $80, $100, and $120) was withdrawn infrequently (only twice), whereas the remaining amounts ($30, $50, $70, $90, and $110) tended to be withdrawn about four times as often.
b. Here is the completed table:

            Sample Size   Sample Mean   Sample SD   95% CI for $\mu$
Machine 1   50            70            30.3        (61.3875, 78.6125)
Machine 2   50            70            30.3        (61.3875, 78.6125)
Machine 3   50            70            30.3        (61.3875, 78.6125)
c. This activity clearly reveals that a confidence interval for a mean does not describe all aspects of a distribution. In this case, you had three distributions that looked very different, but had the same sample sizes, centers, and standard deviations, and thus the same confidence intervals.
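The point of part c can be illustrated with two reconstructed data sets. The lists below are assumptions built from the verbal description in part a (Machine 1: $40 and $100 in equal numbers; Machine 3: $20 and $120 nine times each and $70 for the remaining 32 withdrawals); they are not the original data file. The critical value 2.010 is likewise an assumption, back-calculated from the reported interval rather than quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical reconstructions of the withdrawal amounts described in part a.
machine_1 = [40] * 25 + [100] * 25
machine_3 = [20] * 9 + [70] * 32 + [120] * 9

T_STAR = 2.010  # assumed 95% critical value for roughly 49 degrees of freedom


def summarize(data, t_star=T_STAR):
    """Return (n, sample mean, sample SD, (lower, upper)) for a t-interval."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / sqrt(n)
    return n, m, s, (m - margin, m + margin)


for name, data in [("Machine 1", machine_1), ("Machine 3", machine_3)]:
    n, m, s, (low, high) = summarize(data)
    print(f"{name}: n = {n}, mean = {m}, SD = {s:.1f}, "
          f"95% CI = ({low:.2f}, {high:.2f}), distinct amounts = {sorted(set(data))}")
```

The two lists look nothing alike, yet the interval only sees n, the mean, and the standard deviation, so it cannot distinguish them.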
Activity 19-22: Your Choice a. Answers will vary. b. Answers will vary.
•••
Homework Activities
Activity 20-5: Exploring the t-Distribution
a. Here is the completed table:

df         Pr(T > 1.415)        Pr(T > 1.960)          Pr(T > 2.517)           Pr(T > 3.168)
4          .1 < p-value < .2    .05 < p-value < .1     .025 < p-value < .05    .01 < p-value < .025
11         .05 < p-value < .1   .025 < p-value < .05   .01 < p-value < .025    .001 < p-value < .005
23         .05 < p-value < .1   .025 < p-value < .05   .005 < p-value < .01    .001 < p-value < .005
80         .05 < p-value < .1   .025 < p-value < .05   .005 < p-value < .01    .001 < p-value < .005
Infinity   .05 < p-value < .1   p-value = .025         .005 < p-value < .01    .0005 < p-value < .001

b. As the value of the test statistic increases, the p-value gets smaller.
c. As the number of degrees of freedom increases, the p-value gets smaller.
Activity 20-6: Exploring the t-Distribution
a. Here is the completed table:

df         Pr(T < -1.415)       Pr(T < -1.960)         2 Pr(T > |2.517|)      2 Pr(T > |3.168|)
4          .1 < p-value < .2    .05 < p-value < .1     .05 < p-value < .1     .02 < p-value < .05
11         .05 < p-value < .1   .025 < p-value < .05   .02 < p-value < .05    .002 < p-value < .01
23         .05 < p-value < .1   .025 < p-value < .05   .01 < p-value < .02    .002 < p-value < .01
80         .05 < p-value < .1   .025 < p-value < .05   .01 < p-value < .02    .002 < p-value < .01
Infinity   .05 < p-value < .1   p-value = .025         .01 < p-value < .02    .001 < p-value < .002

b. The p-values in the first two columns of this table are the same as the p-values in the first two columns of the table in Activity 20-5. This makes sense because the t-distribution is perfectly symmetric, e.g., $\Pr(T < -1.415) = \Pr(T > 1.415)$.
c. The p-values in the last two columns of this table are twice the p-values in the last two columns of the table in Activity 20-5. This makes sense because the t-distribution is perfectly symmetric, so the upper tail probability equals the lower tail probability, but you have multiplied by 2 in the last two columns.
Activity 20-7: Sleeping Times
$H_0\colon \mu = 7.0$, $H_a\colon \mu < 7.0$
Sample 1: test statistic $t = -1.53$; p-value $= .08$
Sample 2: test statistic $t = -.79$; p-value $= .224$
Sample 3: test statistic $t = -2.66$; p-value $= .006$
Sample 4: test statistic $t = -1.37$; p-value $= .090$
At the 5% significance level, you would reject $H_0$ and conclude that the population mean sleep time is less than 7 hours only for Sample 3 (p-value $= .006$).
This is the same result you had with the two-sided test. For each sample, the test statistic value was the same with the two-sided test, but the p-value was twice as large.
Activity 20-8: UFO Sighters’ Personalities
a. In this context, $\mu$ represents the average IQ of all people who claim to have had an intense experience with a UFO.
b. This is a one-sided test because you wish to test whether the average IQ of this group is greater than 100.
c. Technical conditions: In order for this procedure to be valid, you need to know that the population of IQ scores for this group is approximately normally distributed, because the sample size is small ($n = 25 < 30$). You also need to be willing to believe that this sample is representative of all individuals who claim to have had such an experience.
d. The test statistic is $t = \dfrac{101.6 - 100}{8.9/\sqrt{25}} = 0.90$. Here is the sketch:
[Sketch: t-distribution with df = 24; the area to the right of $t = .9$ is shaded and labeled .189.]
e. Using Table III with 24 degrees of freedom, $.1 <$ p-value $< .2$. Using Minitab, the p-value is .189.
f. If the average IQ of this group is really 100, then you could expect to see a random sample of 25 people from this group with an average IQ of at least 101.6 in about 19% of samples by random chance alone. Because this would be a fairly common occurrence, you have no reason to doubt that the mean IQ of the population (those who claim to have had an intense experience with a UFO) is 100.
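The table lookup and the software p-value in part e can be cross-checked by integrating the t density directly. The sketch below is not how Minitab computes the value; it is a plain Simpson's-rule integration written with only the standard library, and the function names `t_pdf` and `t_upper_tail` are illustrative.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (
        (df * pi) and __import__("math").log(df * pi)
    )
    return exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Pr(T > t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from part d: sample mean 101.6, s = 8.9, n = 25, null value 100.
t_stat = (101.6 - 100) / (8.9 / sqrt(25))
p_value = t_upper_tail(t_stat, df=24)
print(f"t = {t_stat:.2f}, one-sided p-value = {p_value:.3f}")
```

The result should fall inside the Table III bracket (.1, .2) and agree with the software value to about three decimal places.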
Activity 20-9: Basketball Scoring
a. The null hypothesis is that the mean number of points scored per game for the entire 1999–2000 season is 183.2. In symbols, $H_0\colon \mu = 183.2$. The alternative hypothesis is that the mean number of points scored per game for the entire 1999–2000 season is greater than 183.2. In symbols, $H_a\colon \mu > 183.2$.
b. No, you do not have enough information to calculate the test statistic. You need to know the standard deviation of the first 149 games.
c. Using Table III and 100 degrees of freedom, the test statistic would need to be at least $t = 2.364$ in order to reject the null hypothesis. Using Minitab, the test statistic would need to be $t \geq 2.35181$.
d. If you assume the standard deviation were about $s = 20.27$, then $t = \dfrac{196.2 - 183.2}{20.27/\sqrt{149}} = 7.83$, which does exceed the rejection value in part c by a great deal.
e. Yes. Even though the magazine did not provide the standard deviation, using any reasonable value for the standard deviation would still result in a test statistic that is significantly greater than the value needed to reject the null hypothesis at the .01 significance level. So, you can reasonably predict that the results (reject $H_0$ and conclude that the mean number of points per game is greater than 183.2) would be significant at the .01 level.
f. No, the validity of this test procedure does not depend on the scores being normally distributed because the sample size is large ($149 > 30$).
Activity 20-10: Credit Card Usage
a. The null hypothesis is that the average credit card balance among all undergraduate students who own a credit card is $2000. In symbols, $H_0\colon \mu = \$2000$. The alternative hypothesis is that the average credit card balance among all undergraduate students who own a credit card is more than $2000. In symbols, $H_a\colon \mu > \$2000$.
Technical conditions: The sample was randomly selected and the sample size is large ($n = 1074$), so the technical conditions are satisfied.
The test statistic is $t = \dfrac{2169 - 2000}{1000/\sqrt{1074}} = 5.54$.
Using Table III with degrees of freedom equal to infinity, p-value $< .0005$. Using Minitab, the p-value is .0000. Because the p-value is small, reject $H_0$ at the .05 significance level. You have very strong statistical evidence that the average credit card balance among all undergraduate students who own a credit card is more than $2000.
b. The new standard deviation value affects the test statistic and p-value:
The test statistic is $t = \dfrac{2169 - 2000}{2000/\sqrt{1074}} = 2.77$.
Using Table III with degrees of freedom equal to infinity, p-value $< .005$. Using Minitab, the p-value is .00251. Because the p-value is small ($.00251 < .05$), reject $H_0$ at the .05 significance level. You have strong statistical evidence that the average credit card balance among all undergraduate students who own a credit card is more than $2000.
c. The second scenario (standard deviation $2000) produces the greater p-value. This makes sense because a larger standard deviation will result in a smaller test statistic (s is in the denominator and more sample variability makes your sample result less surprising) and therefore a larger p-value.
Topic 20: Tests of Significance: Means
Activity 20-11: Body Temperatures
The following output pertains to the data for female and male body temperatures.
[Histograms, one panel each for Female and Male; horizontal axis: Body Temperature (in °F), 97 to 101; vertical axis: Frequency, 0 to 20]
Variable                  N   Mean    StDev  Minimum  Q1      Median  Q3      Maximum
Female Body Temperatures  65  98.105  0.699  96.300   97.600  98.100  98.600  99.500
Male Body Temperatures    65  98.394  0.743  96.400   98.000  98.400  98.000  100.800
Females: Define parameter of interest: Let μ represent the average adult female body temperature. The null hypothesis is that the average adult female body temperature is 98.6°F. In symbols, the null hypothesis is \( H_0: \mu = 98.6 \)°F. The alternative hypothesis is that the average adult female body temperature is not 98.6°F. In symbols, \( H_a: \mu \neq 98.6 \)°F. Check technical conditions: The sample size is large (n = 65 > 30), but you do not know whether this sample was randomly selected (in fact, they were volunteers in another clinical trial, so this sample may not be representative of healthy adults). Test statistic: \( t = \frac{98.105 - 98.6}{0.699/\sqrt{65}} \approx -5.71 \). Using Table III with 60 degrees of freedom, p-value < 2 × .0005 = .001. Using Minitab, p-value = 2 × .0000002 = .0000004. Test decision: With such a small p-value, reject H0 at any significance level. Conclusion in context: You have very strong statistical evidence that the average adult female body temperature is not 98.6°F.
Males: Define parameter of interest: Let μ represent the average adult male body temperature. The null hypothesis is that the average adult male body temperature is 98.6°F. In symbols, the null hypothesis is \( H_0: \mu = 98.6 \)°F.
The alternative hypothesis is that the average adult male body temperature is not 98.6°F. In symbols, \( H_a: \mu \neq 98.6 \)°F. Check technical conditions: The sample size is large (n = 65 > 30), but you do not know if this sample was randomly selected. Test statistic: \( t = \frac{98.394 - 98.6}{0.743/\sqrt{65}} \approx -2.24 \). Using Table III with 60 degrees of freedom, 2 × .01 < p-value < 2 × .025, so .02 < p-value < .05. Using Minitab, p-value = 2 × .0142856 = .02857. Test decision: With the p-value ≈ .029 < .05, reject H0 at the .05 significance level. Conclusion in context: You have moderate statistical evidence that the average adult male body temperature is not 98.6°F. Although you reach the same conclusion for both males and females, the evidence is much stronger in the case of the females (as evidenced by the smaller p-value).
Activity 20-12: Social Acquaintances
Answers will vary by class. The following is one representative set of answers.
a. The null hypothesis is that the mean number of acquaintances of all students at your school is 30. In symbols, \( H_0: \mu = 30 \). The alternative hypothesis is that the mean number of acquaintances of all students at your school is not 30. In symbols, \( H_a: \mu \neq 30 \). The test statistic is \( t = \frac{40.83 - 30}{25.25/\sqrt{35}} \approx 2.54 \). Using Table III with 34 degrees of freedom, 2 × .005 < p-value < 2 × .01, so .01 < p-value < .02. Using Minitab, p-value = 2 × .0079117 = .01582. With the small p-value (.01582 < .05), reject H0 at the .05 significance level. You have strong statistical evidence that the mean number of acquaintances for all students at your school is not 30.
b. You should probably only generalize these results to students at your school, as the sample was not randomly selected and is probably only representative of students at your school (if that), not students in general.
Activity 20-13: Age Guesses
Answers will vary. Here is one representative set of answers. The following output pertains to the data for age guesses:
[Dotplot of Age Guess (in years); axis from 32 to 56]
Variable     N   Mean    StDev  Minimum  Q1      Median  Q3      Maximum
Age Guesses  71  37.521  4.157  30.000   35.000  37.000  40.000  55.000
Define parameter of interest: Let μ represent the mean guess of this instructor’s age by all students at this school. The null hypothesis is that the population mean guess of this instructor’s age was correct (40 years). In symbols, the null hypothesis is \( H_0: \mu = 40 \) years. The alternative hypothesis is that the population mean guess of this instructor’s age was not correct. In symbols, the alternative hypothesis is \( H_a: \mu \neq 40 \) years. Check technical conditions: This was not a random sample, but it may be representative of the guesses for this instructor’s age that students would typically make at this school. You do need to worry about the bias of under-guessing by students in order not to offend the instructor. The sample size is large (71 > 30), so this condition is met for this sample. Test statistic: \( t = \frac{37.521 - 40}{4.157/\sqrt{71}} \approx -5.02 \). Using Table III with 60 degrees of freedom, p-value < 2 × .0005 = .001. Using Minitab, p-value = 2 × .0000019 = .0000038. Test decision: Because of the small p-value, reject H0 at any reasonable significance level. Conclusion in context: You have very strong statistical evidence that the population mean guess of this instructor’s age was not correct (was not 40 years), as long as the guesses made by this sample are representative.
Activity 20-14: Age Guesses
a. In general, students tended to underestimate this instructor’s age. Only 4 of the 44 students guessed the correct age, and only another 9 of the 44 overestimated his age. There were two main peaks of the underestimates, one around 37 years and one at about 43 years. The youngest guess was 33 years and then there were guesses for every age between 33 and 46. Some individuals also guessed ages 48 and 50. Most of the guesses tended to fall between 41 and 46 years of age. The median age guess was 42 years. There were no outliers.
b. These values are statistics: n = 44; x̄ = 41.182; s = 3.996.
c. The population would be all students at this school. The parameter would be μ, the mean guess of this instructor’s age for this population.
d. \( H_0: \mu = 44 \)
\( H_a: \mu \neq 44 \)
Technical conditions: The sample size is large (44 > 30), but the sample was not randomly selected. It is perhaps not representative of the guesses of other students at the school, because these students might feel the need to err on the side of underestimating the instructor’s age (rather than risk offending their instructor). The test statistic is \( t = \frac{41.182 - 44}{3.996/\sqrt{44}} \approx -4.68 \). Using Table III with 40 degrees of freedom, p-value < .0005 × 2 = .001. Using Minitab, p-value = .0000143 × 2 = .0000286. Because of the small p-value, reject H0 at the .01 significance level. You have very strong statistical evidence that the population mean age guess is not 44 years.
Activity 20-15: Nicotine Lozenge
a. The null hypothesis is that the population mean number of cigarettes smoked per day is 20. In symbols, \( H_0: \mu = 20 \). The alternative hypothesis is that the population mean number of cigarettes smoked per day is not 20. In symbols, \( H_a: \mu \neq 20 \). Technical conditions: The sample size is large, but the subjects were not randomly selected. The test statistic is \( t = \frac{22 - 20}{10.8/\sqrt{1818}} \approx 7.90 \). Using Table III with degrees of freedom equal to infinity, p-value < 2 × .0005. Using Minitab, the p-value is .000000. Because of the small p-value, reject H0 at any significance level. If this sample is representative of the general population, you have very strong statistical evidence that the population mean number of cigarettes smoked per day is not 20 (one pack).
b. The null hypothesis is that the population mean number of cigarettes smoked per day is 20. In symbols, \( H_0: \mu = 20 \). The alternative hypothesis is that the population mean number of cigarettes smoked per day is not 20. In symbols, \( H_a: \mu \neq 20 \). The test statistic is \( t = \frac{22 - 20}{10.8/\sqrt{100}} \approx 1.85 \). Using Table III with 80 degrees of freedom, 2 × .025 < p-value < 2 × .05, so .05 < p-value < .1. Using Minitab, p-value = 2 × .0336481 = .0673. Barely fail to reject H0 at the .05 significance level (.0673 > .05). You do not have sufficient statistical evidence (at the 5% level) to conclude that the population mean number of cigarettes smoked per day differs from 20.
You reached different conclusions in parts a and b: rejecting the null hypothesis when the sample size was very large, but failing to reject it when the sample size was only 100. This result makes sense because with larger samples there is less random sampling variability, so the same sample mean is more surprising (less likely to happen by chance alone).
Activity 20-16: Random Babies
Answers will vary by class. The following is one representative set of answers.
a. From Activity 11-1, part g, the sample size is n = 100, the sample mean is x̄ = 1.090 matches, and the sample standard deviation is s = 1.036 matches.
b. The null hypothesis is that the population mean number of matches is equal to 1.0. In symbols, the null hypothesis is \( H_0: \mu = 1.0 \). The alternative hypothesis is that the population mean number of matches is not equal to 1.0. In symbols, the alternative hypothesis is \( H_a: \mu \neq 1.0 \).
Technical conditions: The sample size is large (n = 100), and the samples were randomly selected, so the technical conditions necessary for this test procedure to be valid are satisfied. Under the assumption of the null hypothesis, the CLT says the sample mean number of matches will be approximately normally distributed, with center 1.0 match and standard deviation estimated by 1.036/√100 ≈ 0.104 match. So the test statistic is \( t = \frac{1.09 - 1}{1.036/\sqrt{100}} \approx 0.87 \). Using Table III with 80 degrees of freedom, 2 × .10 < p-value < 2 × .20, so .20 < p-value < .40. Using Minitab, p-value = 2 × .193202 = .3864. Do not reject H0 because the p-value is not small. You do not have any statistical evidence that the population mean number of matches differs from 1.0.
Activity 20-17: Backpack Weights
a. This is a quantitative variable.
b. The null hypothesis is that the population mean ratio of backpack weight to body weight is .10. In symbols, \( H_0: \mu = .10 \). The alternative hypothesis is that the population mean ratio of backpack weight to body weight is not .10. In symbols, \( H_a: \mu \neq .10 \). Technical conditions: The sample was not randomly selected, but the researchers did try to select a representative sample. The sample size is large (n = 100 > 30). You may consider the technical conditions to have been met. The test statistic is \( t = \frac{.0771 - .10}{.0366/\sqrt{100}} \approx -6.26 \). Using Table III with 80 degrees of freedom, p-value < 2 × .0005 = .0010. Using technology, p-value ≈ .0000. Because the p-value is small, reject H0 at any reasonable significance level. You have very strong statistical evidence that the population mean ratio is not .10.
c. The 99% confidence interval given in Activity 19-6 is (.0674, .0868). Note that .10 is not in this interval, which implies that you would reject this as a plausible value for μ at the .01 significance level. This is consistent with your test results.
Activity 20-18: Looking Up to CEOs
a. The null hypothesis is that the average height for male CEOs of American companies is 69 inches. In symbols, \( H_0: \mu = 69 \) inches. The alternative hypothesis is that the average height for male CEOs of American companies is more than 69 inches. In symbols, \( H_a: \mu > 69 \) inches.
b. A Type I error would be concluding that the average height for male CEOs is greater than 69 inches when it really isn’t.
c. A Type II error would be failing to conclude that the average height for male CEOs of American companies is greater than 69 inches when it really is.
Activity 20-19: Nicotine Lozenge
a. From Activity 19-11, a 99% confidence interval for μ is (21.3475, 22.6525). Thus, any value in this interval is a plausible value for μ, and any value not in this interval would be rejected at the .01 significance level.
b. The null hypothesis is that the population mean number of cigarettes smoked per day is 22. In symbols, \( H_0: \mu = 22 \). The alternative hypothesis is that the population mean number of cigarettes smoked per day is not 22. In symbols, \( H_a: \mu \neq 22 \). Technical conditions: The sample size is large (1818 > 30), but the subjects were not randomly selected, so proceed with caution. The test statistic is \( t = \frac{22 - 22}{10.8/\sqrt{1818}} = 0 \). Using Table III with degrees of freedom equal to infinity, p-value > 2 × .2 = .4. Using Minitab, p-value = 2 × .5 = 1.0. With the large p-value, do not reject H0 at the .05 significance level. You have no statistical evidence that the population mean number of cigarettes smoked per day differs from 22.
c. Based on this p-value, 22 would be in the 95% confidence interval for μ (in fact, because it equals the sample mean, it will be at the center of the interval).
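The link between the confidence interval in part a and the two-sided tests can be checked numerically. The following is a minimal sketch, not part of the original solutions: it assumes the summary statistics quoted above (sample mean 22, s = 10.8, n = 1818) and the Table III critical value 2.576 for infinite degrees of freedom; the function names are illustrative.

```python
import math

Z_STAR_99 = 2.576  # Table III critical value for 99% confidence, infinite df


def standard_error(s, n):
    """Standard error of the sample mean, s / sqrt(n)."""
    return s / math.sqrt(n)


def confidence_interval(xbar, s, n, critical=Z_STAR_99):
    """Interval xbar +/- critical * s / sqrt(n)."""
    half_width = critical * standard_error(s, n)
    return xbar - half_width, xbar + half_width


def t_statistic(xbar, mu0, s, n):
    """One-sample test statistic (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / standard_error(s, n)


lower, upper = confidence_interval(22, 10.8, 1818)
print(f"99% CI: ({lower:.4f}, {upper:.4f})")

# A hypothesized value is rejected at the .01 level exactly when it
# falls outside the 99% interval.
for mu0 in (20, 22):
    t = t_statistic(22, mu0, 10.8, 1818)
    inside = lower <= mu0 <= upper
    rejected = abs(t) > Z_STAR_99
    print(f"mu0 = {mu0}: t = {t:.2f}, inside CI = {inside}, rejected = {rejected}")
```

The hypothesized value 20 (Activity 20-15) lies outside the interval and is rejected, whereas 22 sits at the center of the interval and is not.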
Activity 20-20: Basketball Scoring
a. The following dotplot displays the data:
[Dotplot of Points Scored; axis from 144 to 216]
Yes, it appears that these games average more than 183.2 points per game. The average points per game for these 10 games is 191, the standard deviation is 24.01 points, and the median number of points per game is 197.
b. The null hypothesis is that the mean number of points per game for the entire 1999–2000 season is 183.2. In symbols, \( H_0: \mu = 183.2 \). The alternative hypothesis is that the mean number of points per game for the entire 1999–2000 season is greater than 183.2. In symbols, \( H_a: \mu > 183.2 \). The test statistic is \( t = \frac{191 - 183.2}{24.01/\sqrt{10}} \approx 1.03 \). Using Table III with 9 degrees of freedom, .1 < p-value < .2. Using Minitab, the p-value is .164947.
c. No, you would not reject the null hypothesis at the .05 level. You do not have sufficient statistical evidence to conclude that the mean number of points per game for the entire 1999–2000 season has increased beyond the mean of the previous season of 183.2 points.
d. No, you never accept the null hypothesis. You simply have not found sufficient evidence to disprove it.
e. With the outlier (140) removed, the sample mean is 196.67 points and the sample standard deviation is 16.95 points. The hypotheses are the same as in part b. The test statistic is \( t = \frac{196.67 - 183.2}{16.95/\sqrt{9}} \approx 2.38 \). Using Table III with 8 degrees of freedom, .01 < p-value < .025. Using Minitab, the p-value is .0222728. Because p-value ≈ .022 < .05, reject H0 at the .05 level. You have moderate statistical evidence that the mean number of points scored per game in the entire 1999–2000 season is greater than the previous season’s mean of 183.2 points. Clearly, removing the outlier had a fairly drastic effect on your results. It more than doubled the value of the test statistic, making the p-value much smaller and the results more statistically significant (though you still need to worry about the technical conditions).
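The effect of the outlier on the test statistic can be reproduced from the summary statistics alone. This is a small sketch under the assumption that the values quoted in parts b and e are used as given (the raw game scores are not listed here); the helper name is illustrative.

```python
import math


def t_statistic(xbar, mu0, s, n):
    """One-sample test statistic (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


MU0 = 183.2  # previous season's mean points per game

t_all = t_statistic(191, MU0, 24.01, 10)        # all 10 games
t_trimmed = t_statistic(196.67, MU0, 16.95, 9)  # outlier (140) removed

print(f"with outlier:    t = {t_all:.2f}")
print(f"without outlier: t = {t_trimmed:.2f}")
print(f"ratio: {t_trimmed / t_all:.2f}")
```

Removing the low score raises the sample mean and shrinks the standard deviation at the same time, and both changes push the test statistic upward.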
Activity 20-21: Pet Ownership
a. The null hypothesis is that the mean number of cats in all American cat-owning households is 2.0. In symbols, the null hypothesis is \( H_0: \mu = 2.0 \). The alternative hypothesis is that the mean number of cats in all American cat-owning households is greater than 2.0. In symbols, the alternative hypothesis is \( H_a: \mu > 2.0 \).
b. You need to know the sample standard deviation.
c. Answers will vary according to the value of s. Assuming s is 1.0, the test statistic is \( t = \frac{2.1 - 2}{1/\sqrt{25{,}280}} \approx 15.9 \). Using Table III with degrees of freedom equal to infinity, p-value < .0005. Using technology, p-value ≈ .000000. With the small p-value, reject H0 at any significance level. You have very strong evidence that the mean number of cats per cat-owning household is greater than 2.0.
d. Now assuming s is 2.0, the test statistic is \( t = \frac{2.1 - 2}{2/\sqrt{25{,}280}} \approx 7.95 \). Using Table III with degrees of freedom equal to infinity, p-value < .0005. Using technology, p-value ≈ .000000. With the small p-value, reject H0 at any significance level. You still have very strong evidence that the mean number of cats per cat-owning household is greater than 2.0. Your conclusion did not change.
e. For a 99% CI for μ with degrees of freedom equal to infinity, 2.1 ± (2.576)(0.00629) = (2.0838, 2.1162) cats per household. Clearly, the population mean does not exceed 2.0 in any practical sense.
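The contrast between statistical and practical significance in parts c through e can be seen by computing the test statistics and the interval side by side. The sketch below assumes the same hypothetical standard deviations used above (s = 1.0 and s = 2.0) and the critical value 2.576; variable names are illustrative.

```python
import math

N = 25280      # sample size
XBAR = 2.1     # sample mean number of cats
MU0 = 2.0      # hypothesized mean
Z_STAR_99 = 2.576  # critical value for 99% confidence, infinite df

# Test statistic under each assumed standard deviation.
t_values = {}
for s in (1.0, 2.0):
    t_values[s] = (XBAR - MU0) / (s / math.sqrt(N))
    print(f"s = {s}: t = {t_values[s]:.2f}")

# 99% confidence interval assuming s = 1.0.
half_width = Z_STAR_99 * 1.0 / math.sqrt(N)
ci_lower, ci_upper = XBAR - half_width, XBAR + half_width
print(f"99% CI: ({ci_lower:.4f}, {ci_upper:.4f})")
```

Both test statistics are far beyond any table value, yet the whole interval lies within about 0.12 cats of 2.0, which is why the difference carries little practical weight.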
Activity 20-22: Random Babies
a. The null hypothesis is that the long-term proportion of repetitions that result in no matches is .40. In symbols, the null hypothesis is \( H_0: \pi = .40 \). The alternative hypothesis is that the long-term proportion of repetitions that result in no matches is not .40. In symbols, the alternative hypothesis is \( H_a: \pi \neq .40 \). The test statistic is \( z = \frac{.34 - .4}{\sqrt{\frac{(.4)(.6)}{100}}} \approx -1.22 \). Using Table II, p-value = 2 × Pr(Z < −1.22) = 2 × .1112 = .2224. Do not reject H0 at the .05 significance level (p-value = .2224 > .05). You do not have sufficient statistical evidence to conclude that the long-term proportion of repetitions that result in no matches differs from .40.
b. No, you do not accept that π is .40 exactly. It is simply that .40 remains one of the plausible values for π. But there are other plausible values as well.
c. From Activity 11-2, part d, the theoretical probability of no matches is 9/24 or .375. You hope that this is another one of the plausible values for π based on your class data.
d. The null hypothesis is that the long-term mean number of matches per repetition is 1.1. In symbols, \( H_0: \mu = 1.1 \). The alternative hypothesis is that the long-term mean number of matches per repetition is not 1.1. In symbols, \( H_a: \mu \neq 1.1 \). The test statistic is \( t = \frac{1.09 - 1.1}{1.036/\sqrt{100}} \approx -0.10 \). Using Table III with 80 degrees of freedom, p-value > 2 × .2 = .4. Using Minitab, p-value = 2 × .460273 = .92055. Do not reject H0 because the p-value is .92 > .05. You do not have any statistical evidence that the long-term mean number of matches per repetition differs from 1.1.
e. It does not follow that you accept that μ = 1.1 exactly. You simply find that 1.1 is one of the many plausible values for μ. You already saw in Activity 20-16 that 1.0 is another plausible value for μ.
f. From Activity 11-2, part g, the theoretical mean number of matches per repetition is 1.0. As you saw in Activity 20-16, based on your class simulation, this is also a plausible value for μ (as would be any value between 1.0 and 1.1).
Assessment Sample Quiz 20A
•••
Students enrolled in an introductory statistics course at a university were asked to take a survey that indicated whether the student’s learning style was more visual or verbal. Each student received a numerical score ranging from −11 to 11. Negative scores indicated a visual learner, and positive scores indicated a verbal learner. The
Homework Activities Activity 21-7: Botox for Back Pain
a. The explanatory variable is whether the subject received botox or the placebo (ordinary saline injection). The response variable is whether the subject experienced substantial pain relief.
b. Here is the 2 × 2 table:
                         Botox  Saline  Total
Substantial Pain Relief  9      2       11
No Pain Relief           6      14      20
Total                    15     16      31
c. Based on the histogram, the approximate p-value is 7/1000, or .007. This value was calculated by approximating the number of times nine or more successes were assigned to the botox group.
d. This p-value means that if botox and saline are equally effective in treating low-back pain, then you would expect to see such sample results (9 or more successes in a group of 15) about .7% of the time by random assignment alone. Because such an occurrence is very rare, you would conclude that there is very strong statistical evidence that botox is more effective than saline in treating low-back pain.
e. Technical conditions:
i. The data are from randomly assigning subjects to two treatment groups. This condition is met.
ii. Checking n1 p̂c ≥ 5 and n2 p̂c ≥ 5, etc., with p̂c = .3548, the sample sizes are just large enough.
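The simulated p-value in part c can be compared against an exact calculation. The sketch below swaps in a different technique from the applet simulation used in the activity: it counts, with the hypergeometric distribution, how often random assignment of the 31 subjects (11 of whom experienced relief) would place 9 or more of the successes in the 15-person botox group. The function name and argument names are illustrative.

```python
from math import comb


def upper_tail_probability(total, successes, group_size, observed):
    """Exact P(X >= observed), where X is the number of successes that land
    in a randomly chosen group of `group_size` out of `total` subjects."""
    all_assignments = comb(total, group_size)
    largest_possible = min(successes, group_size)
    favorable = sum(
        comb(successes, k) * comb(total - successes, group_size - k)
        for k in range(observed, largest_possible + 1)
    )
    return favorable / all_assignments


# Botox study: 31 subjects, 11 successes overall, 15 assigned to botox,
# 9 successes observed in the botox group.
p_exact = upper_tail_probability(total=31, successes=11, group_size=15, observed=9)
print(f"exact p-value: {p_exact:.4f}")
```

The exact value is close to the simulation-based estimate of .007, as expected when the simulation uses a reasonably large number of repetitions.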
Activity 21-8: “Hella” Project
a. The sample proportion of students in each group who use the word “hella” regularly is p̂N = .6667 and p̂S = .12.
b. The following segmented bar graph compares the sample proportions of students who use the word “hella” regularly:
[Segmented bar graph titled “Hella” Project; vertical axis: Percentage, 0 to 100; horizontal axis: Region (Northern California, Southern California); segments: Uses “hella” regularly, Does not use “hella” regularly]
Topic 21: Comparing Two Proportions
c. Answers will vary slightly. Here is one representative set:
The approximate p-value is Pr(10 or more successes in Group A), which is 1/500 or .002. Note that the applet models the p-value that arises from “random assignment” but also approximates what would happen with random sampling from two populations with the same proportion of successes.
d. This p-value means that if there were no difference in the proportion of times students from northern and southern California use the word “hella,” then you would see a sample difference as or more extreme than this (at least 10 of the 13 successes out of a group of 15 northern California students) by random sampling alone in about .2% of samples. Because this is such a rare occurrence but did happen in these samples, you conclude that your initial conjecture that there is no difference in the population proportion of times students from northern and southern California use the word “hella” is incorrect (in favor of your alternative hypothesis). In other words, you conclude that there is very strong statistical evidence that students from northern California are more likely to use the word “hella” than students from southern California.
e. Technical conditions:
i. Condition: The data are independent random samples from two populations. Assessment: The data are technically not independent random samples from two populations; this is one random sample taken from a population and then separated by region. However (see page 418), you might consider this a minor variation and say the technical condition has been met.
ii. Condition: With the overall success rate of 13/40 = .325, we have 15(.325) = 4.875 < 5, so the sample size condition is not met. Assessment: Because this condition is not met, the above simulation analysis is a more appropriate way to approximate the p-value than the two-sample z-test.
Activity 21-9: Perceptions of Self-Attractiveness
a. For n = 100: \( (.81 - .71) \pm (1.96)\sqrt{\frac{(.81)(.19)}{100} + \frac{(.71)(.29)}{100}} = (-.0176, .2176) \)
For n = 200: \( (.81 - .71) \pm (1.96)\sqrt{\frac{(.81)(.19)}{200} + \frac{(.71)(.29)}{200}} = (.0169, .1831) \)
For n = 500: \( (.81 - .71) \pm (1.96)\sqrt{\frac{(.81)(.19)}{500} + \frac{(.71)(.29)}{500}} = (.0474, .1526) \)
b. For n = 100, half-width: .117567
For n = 200, half-width: .083133
For n = 500, half-width: .052578
The half-width decreases as the sample size increases. Yes, this is consistent with what happens with confidence intervals for a single proportion.
c. With sample size n = 100, the 95% confidence interval includes the value zero. With sample sizes n = 200 and n = 500, the confidence intervals do not include zero. This is consistent with your test results in Activity 21-4, where you found that the test of \( H_0: \pi_m = \pi_f \) vs. \( H_a: \pi_m \neq \pi_f \) was statistically significant when n = 200 and n = 500, but not when n = 100.
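The three intervals and their half-widths can be reproduced with a few lines of code. This is a minimal sketch assuming equal group sizes n for men and women, as in the calculations above; the function name is illustrative.

```python
import math


def difference_interval(p1, p2, n1, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using the unpooled standard error.

    Returns (lower, upper, half_width).
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z_star * se
    diff = p1 - p2
    return diff - half_width, diff + half_width, half_width


results = {}
for n in (100, 200, 500):
    lower, upper, half_width = difference_interval(.81, .71, n, n)
    results[n] = (lower, upper, half_width)
    includes_zero = lower <= 0 <= upper
    print(f"n = {n}: ({lower:.4f}, {upper:.4f}), "
          f"half-width = {half_width:.6f}, includes zero = {includes_zero}")
```

Because the half-width is proportional to 1/√n, quadrupling the sample size would be needed to cut it in half.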
Activity 21-10: Perceptions of Self-Attractiveness
a. In order to be statistically significant at the .01 level, you need z ≥ 2.576, so
\( z = \frac{.76 - .75}{\sqrt{(.755)(.245)\left(\frac{2}{n}\right)}} \geq 2.576 \)
\( n \geq \frac{(.755)(.245)(2)}{\left(\frac{.01}{2.576}\right)^2} = 24{,}549.05 \); thus n = 24,550.
In this case, z ≈ 2.58 and p-value = 2 × Pr(Z > 2.576) = 2 × .0049 = .0098.
b. For a 99% CI, you calculate
\( .01 \pm (2.576)\sqrt{\frac{(.76)(.24)}{24550} + \frac{(.75)(.25)}{24550}} = .01 \pm (2.576)(.00388165) = (8.69 \times 10^{-7}, .01999) \)
c. If the sample proportions had been .756 and .755, respectively, you calculate
\( z = \frac{.756 - .755}{\sqrt{(.7555)(.2445)\left(\frac{2}{n}\right)}} \geq 2.576 \)
\( n \geq \frac{(.7555)(.2445)(2)}{\left(\frac{.001}{2.576}\right)^2} \)
Evaluated with the printed factor .2455 in place of .2445, this gives 2,461,544.425, so n = 2,461,545. In this case, z ≈ 2.58 and p-value = 2 × Pr(Z > 2.58) = 2 × .0049 = .0098. For a 99% confidence interval, .001 ± (2.576)(.0003874) = (.00000204, .001998).
d. No, this difference is not practically significant. With a sample size of more than 2 million, any difference will appear statistically significant, but there is no practical difference between 75.6% and 75.5%.
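The sample-size calculation in part a amounts to solving the test-statistic inequality for n. The sketch below assumes two groups of equal size n and a pooled proportion equal to the average of the two sample proportions, as in the formula above; the function name is illustrative.

```python
import math


def required_group_size(p1, p2, z_star=2.576):
    """Smallest equal group size n for which the pooled two-proportion
    z statistic for a difference p1 - p2 reaches z_star."""
    pooled = (p1 + p2) / 2  # equal group sizes, so the pooled value is the average
    n_exact = 2 * pooled * (1 - pooled) / ((p1 - p2) / z_star) ** 2
    return math.ceil(n_exact)


n_needed = required_group_size(.76, .75)
print(n_needed)
```

Shrinking the difference by a factor of 10 multiplies the required sample size by roughly 100, because the difference enters the formula squared.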
Activity 21-11: Generation M
a. \( \pi_g - \pi_b \) represents the difference between the population proportion of girls who have televisions in their bedrooms and the population proportion of boys who have televisions in their bedrooms.
b. You calculate
\( (.64 - .72) \pm (1.96)\sqrt{\frac{(.64)(.36)}{1036} + \frac{(.72)(.28)}{996}} = -.08 \pm (1.96)(.0206) = (-.1204, -.0396) \)
You are 95% confident that the proportion of all girls who have televisions in their bedrooms is somewhere between .0396 and .1204 less than the proportion of all boys who have televisions in their bedrooms. The fact that all the values in your confidence interval are negative indicates that the population proportion of girls is strictly less than the population proportion of boys.
c. For a 99% CI, you calculate (.64 − .72) ± (2.576)(.0206) = (−.1331, −.0269). The midpoint of both intervals is the same (.64 − .72 = −.08). The 99% confidence interval is wider than the 95% confidence interval.
Activity 21-12: Generation M
Radio: \( H_0: \pi_g = \pi_b \), \( H_a: \pi_g \neq \pi_b \)
Test statistic: \( z = \frac{.86 - .82}{\sqrt{(.84)(1 - .84)\left(\frac{1}{1036} + \frac{1}{996}\right)}} \approx 2.46 \)
p-value = 2 × Pr(Z > 2.46) = 2 × .0069 = .0138
Test decision: Reject H0 at the .05 level of significance because p-value < .05.
Conclusion in context: You have strong statistical evidence that the population proportion of boys who have a radio in their bedrooms is different from the population proportion of girls who have a radio in their bedrooms.
95% confidence interval for \( \pi_g - \pi_b \): \( (.86 - .82) \pm (1.96)\sqrt{\frac{(.86)(.14)}{1036} + \frac{(.82)(.18)}{996}} = .04 \pm (1.96)(.01626) = (.00813, .0719) \)
You are 95% confident that the percentage of girls owning a radio is between .8 and 7.2 percentage points higher than the percentage of boys owning a radio.
Video Game Player: \( H_0: \pi_g = \pi_b \), \( H_a: \pi_g \neq \pi_b \)
Test statistic: \( z = \frac{.33 - .63}{\sqrt{(.477)(1 - .477)\left(\frac{1}{1036} + \frac{1}{996}\right)}} \approx -13.53 \)
p-value = 2 × Pr(Z < −13.53) ≈ 2 × .0000 = .0000
Test decision: With such a small p-value, reject H0 at any common significance level.
Conclusion in context: You have extremely strong statistical evidence that the population proportion of boys who have a video game player in their bedrooms is different from the population proportion of girls who have a video game player in their bedrooms.
95% confidence interval for \( \pi_g - \pi_b \): (.33 − .63) ± (1.96)(.0212) = (−.3415, −.2585)
You are 95% confident that the percentage of all girls owning a video game player is between 25.9 and 34.2 percentage points lower than the percentage of all boys owning a video game player.
Internet: \( H_0: \pi_g = \pi_b \), \( H_a: \pi_g \neq \pi_b \)
Test statistic: \( z = \frac{.17 - .24}{\sqrt{(.2043)(1 - .2043)\left(\frac{1}{1036} + \frac{1}{996}\right)}} \approx -3.91 \)
p-value = 2 × Pr(Z < −3.91) < 2 × .0002 = .0004
Test decision: With such a small p-value, reject H0 at any common significance level.
Conclusion in context: You have very strong statistical evidence that the population proportion of boys who have an Internet connection in their bedrooms is different from the population proportion of girls who have an Internet connection in their bedrooms.
95% confidence interval for \( \pi_g - \pi_b \): (.17 − .24) ± (1.96)(.0179) = (−.1050, −.0350)
You are 95% confident that the percentage of all girls with an Internet connection in their bedrooms is between 3.5 and 10.5 percentage points lower than the percentage of all boys with an Internet connection in their bedrooms.
Telephone: \( H_0: \pi_g = \pi_b \), \( H_a: \pi_g \neq \pi_b \)
Test statistic: \( z = \frac{.42 - .39}{\sqrt{(.405)(1 - .405)\left(\frac{1}{1036} + \frac{1}{996}\right)}} \approx 1.38 \)
p-value = 2 × Pr(Z > 1.38) = 2 × .0838 = .1676
Test decision: Because the p-value is not small, do not reject H0 at any common significance level.
Conclusion in context: You do not have sufficient statistical evidence to conclude that the population proportion of boys who have a telephone in their bedrooms is different from the population proportion of girls who have a telephone in their bedrooms.
These data do not indicate a difference in the population proportions of boys and girls who have telephones in their bedrooms. You are 95% confident that between 25.9 and 34.2 percentage points more of the boys have a video game player in their bedrooms, and between 3.5 and 10.5 percentage points more of the boys have an Internet connection in their bedrooms (than do girls). In addition, you are 95% confident that between .8 and 7.2 percentage points more girls than boys have a radio in their bedrooms. So, it appears that, in general, boys have access to a little more technology in their bedrooms than girls do.
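The four two-sided tests above all follow the same pooled two-proportion recipe, so they can be run through a single helper. The sketch below is an illustration only: it assumes 1036 girls and 996 boys, computes the pooled proportion from the rounded sample proportions, and uses the standard normal distribution from Python's `statistics` module, so the p-values can differ slightly from the Table II values. The function and dictionary names are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


N_GIRLS, N_BOYS = 1036, 996

# (girls' sample proportion, boys' sample proportion) for each item
items = {
    "Radio": (.86, .82),
    "Video game player": (.33, .63),
    "Internet": (.17, .24),
    "Telephone": (.42, .39),
}

tests = {}
for name, (p_girls, p_boys) in items.items():
    tests[name] = two_proportion_z_test(p_girls, N_GIRLS, p_boys, N_BOYS)
    z, p_value = tests[name]
    print(f"{name}: z = {z:.2f}, p-value = {p_value:.4f}")
```

Only the telephone comparison yields a p-value above .05, matching the test decisions above.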
Activity 21-13: AZT and HIV
a. The following segmented bar graph displays the results:
[Segmented bar graph titled AZT Treatment for HIV-infected Mothers; vertical axis: Percentage, 0 to 100; horizontal axis: Treatment Group (AZT, Placebo); segments: HIV-infected, Not infected]
This bar graph indicates that mothers given the placebo were about three times more likely to have babies that were HIV positive than were the mothers given AZT.
b. Technical conditions:
i. The data are from randomly assigning subjects to two treatment groups, so this condition is met.
ii. The number of successes and failures in each group should be at least five. This condition is also met as the smallest count is 13.
c. The null hypothesis is that AZT and a placebo are equally effective in reducing mother-to-infant transmission of AIDS. Specifically, the proportion of HIV-positive babies born to mothers who could potentially take AZT is the same as the proportion of HIV-positive babies born to mothers who could potentially take a placebo. In symbols, the null hypothesis is \( H_0: \pi_{AZT} = \pi_{placebo} \). The alternative hypothesis is that AZT is more effective than a placebo for reducing mother-to-infant transmission of AIDS, or that the proportion of HIV-positive babies born to mothers who could potentially take AZT is smaller than the proportion of HIV-positive babies born to mothers who could potentially take a placebo. In symbols, the alternative hypothesis is \( H_a: \pi_{AZT} < \pi_{placebo} \).
The test statistic is \( z = \frac{.0722 - .2186}{\sqrt{(.146)(1 - .146)\left(\frac{1}{180} + \frac{1}{183}\right)}} \approx -3.95 \).
p-value = Pr(Z < −3.95) < .0002
With such a small p-value, reject H0 at the .01 significance level. You have very strong statistical evidence that AZT is more effective than a placebo for reducing mother-to-infant transmission of AIDS.
d. For a 99% CI, you calculate
\( (.0722 - .2186) \pm (2.576)\sqrt{\frac{(.0722)(.9278)}{180} + \frac{(.2186)(.7814)}{183}} = (-.2395, -.0533) \)
You are 99% confident the difference in HIV transmission rates is between 5.33 and 23.95 percentage points. Because the values in your interval are all negative, you know that the AZT transmission rate is lower than the placebo transmission rate by somewhere between 5.33 and 23.95 percentage points.
e. Because this is a well-designed experiment, you can conclude that AZT caused the observed difference in HIV transmission rates. If AZT and a placebo were equally effective in reducing mother-to-infant transmission of AIDS, you would virtually never see a difference in sample results as or more extreme than those seen in this experiment by random assignment alone. You are 99% confident in concluding that AZT lowers the HIV transmission rate somewhere between 5.33 and 23.95 percentage points over that of a placebo.
Activity 21-14: Flu Vaccine
a. p̂vaccine = .149; p̂no vaccine = .169
The following segmented bar graph displays the results:
[Segmented bar graph titled Flu Vaccine; vertical axis: Percentage, 0 to 100; horizontal axis: Vaccine, No Vaccine; segments: Flu-like symptoms, No symptoms]
b. The null hypothesis is that the flu vaccine and no vaccine are equally effective in reducing flu-like symptoms. More specifically, the proportion of those who could potentially take the vaccine who develop flu-like symptoms is the same as the proportion of those potentially not taking the vaccine who develop symptoms. In symbols, the null hypothesis is \( H_0: \pi_{vaccine} = \pi_{no\ vaccine} \). The alternative hypothesis is that the flu vaccine is more effective in reducing flu-like symptoms than no vaccine. Specifically, the proportion of those who could potentially take the vaccine who develop flu-like symptoms is less than the proportion of those potentially not taking the vaccine who develop symptoms. In symbols, the alternative hypothesis is \( H_a: \pi_{vaccine} < \pi_{no\ vaccine} \).
Technical conditions:
i. The subjects were not randomly assigned to the two groups, nor were they a random sample.
ii. With the overall success rate of .155, 1000(.155) ≥ 5, 1000(.845) ≥ 5, 402(.155) ≥ 5, and 402(.845) ≥ 5. So this condition is met.
The test statistic is \( z = \frac{.149 - .169}{\sqrt{.155(1 - .155)\left(\frac{1}{1000} + \frac{1}{402}\right)}} \approx -0.94 \).
Using Table II, p-value = Pr(Z < −0.94) = .1736.
Because the p-value is not small, do not reject H0 at the .05 significance level. You do not have sufficient statistical evidence to conclude that the flu vaccine is more effective than no vaccine in reducing flu-like symptoms. c. Because this is an observational study, you cannot conclude that there is a cause-and-effect relationship between the flu vaccine and the flu-like symptoms even if the result had been statistically significant. You also probably should not generalize these results beyond workers at Children’s Hospital in Denver, Colorado, as all of the subjects in this study were volunteers from this hospital.
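The pooled two-proportion z-test in part b can be checked numerically. The following is a minimal sketch, assuming the rounded sample proportions (.149 and .169) and that the vaccine group is the one with 1000 subjects; the function name `two_proportion_z` is made up for illustration. The normal probability comes from Python's standard library rather than Table II, so the p-value differs slightly from .1736.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for H0: pi1 = pi2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)  # combined success rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Rounded values from Activity 21-14 (assumption: vaccine group n = 1000).
z = two_proportion_z(0.149, 1000, 0.169, 402)
# One-sided alternative pi_vaccine < pi_no_vaccine: use the lower-tail area.
p_value = NormalDist().cdf(z)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The same function applies to the other activities in this topic; only the tail used for the p-value changes with the direction of the alternative hypothesis.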
Activity 21-15: Suitability for Politics a. The null hypothesis is that the proportion of all American adult men who agree with this statement is the same as the proportion of all American adult women who agree with this statement. In symbols, $H_0: \pi_m = \pi_w$. The alternative hypothesis is that the proportion of all men who agree with this statement is greater than the proportion of all women who agree with this statement. In symbols, $H_a: \pi_m > \pi_w$. b. Technical conditions: The number of successes and failures in each group is at least five, so that condition is met. However, the data are technically not independent random samples from two populations; this is one random sample taken from a population and then separated by gender. Still (see page 418), you might consider this a minor variation and say the technical condition has been met. c. The test statistic is $z = \dfrac{.283 - .229}{\sqrt{(.254)(1 - .254)\left(\frac{1}{385} + \frac{1}{449}\right)}} \approx 1.79$.
Using Table II, p-value $= \Pr(Z \ge 1.79) = .0367$. Because p-value $= .0367 < .05$, reject $H_0$ at the .05 significance level. You have moderate statistical evidence that the proportion of American adult men who agree with this statement is greater than the proportion of American adult women who agree with this statement. d. If the population proportions of men and women who agree with this statement were the same, you would see sample results as extreme as or more extreme than this (a difference of at least .054 with these sample sizes) in about 3.67% of samples by random sampling alone. Because this would not be a very common occurrence, these results provide strong evidence that the hypothesis that the proportions of men and women who agree with this statement are equal is false. Conclude that the population proportion of men who agree with this statement is actually greater than the population proportion of women who agree.
e. For a 95% CI, you calculate $(.283 - .229) \pm (1.96)\sqrt{\dfrac{(.283)(.717)}{385} + \dfrac{(.229)(.771)}{449}} = (-.0054, .1134)$.
f. You are 95% confident the percentage of all adult American men who agree with this statement is between .54 percentage points lower and 11.34 percentage points higher than the percentage of all adult American women who agree with this statement.
Note that the confidence interval and significance test appear to contradict each other (because the confidence interval includes zero), but that is because the significance test was one-sided.
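The interval in part e, and the reason it can include zero while the one-sided test rejects, can be reproduced with a short sketch. It assumes the rounded proportions and sample sizes quoted above (385 men, 449 women); `diff_proportion_ci` is a hypothetical helper name, and the critical value is computed from the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(p1, n1, p2, n2, confidence=0.95):
    """Unpooled (Wald) confidence interval for pi1 - pi2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


lower, upper = diff_proportion_ci(0.283, 385, 0.229, 449)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

A 95% interval corresponds to a two-sided test at the .05 level, whereas the test in part c puts the entire .05 in one tail, which is why the two procedures can disagree near the boundary.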
Activity 21-16: Hand Washing $\hat{p}_m = .746$; $\hat{p}_w = .895$. [Segmented bar graph: percentage (0 to 100) who washed and did not wash their hands, by gender (men, women).]
The null hypothesis is that the proportion of all adult men who wash their hands in public restrooms is the same as the proportion of all adult women who wash their hands in public restrooms. In symbols, $H_0: \pi_m = \pi_w$. The alternative hypothesis is that the proportion of all adult men who wash their hands in public restrooms is not the same as the proportion of all adult women who wash their hands in public restrooms. In symbols, $H_a: \pi_m \ne \pi_w$. Check technical conditions: The number of successes and failures in each group is at least five. However, the data are technically not independent random samples from two populations; this is one random sample taken from a population and then separated by gender. Still (see page 418), you might consider this a minor variation and say this technical condition has been met.
Test statistic: $z = \dfrac{.746 - .895}{\sqrt{(.82)(1 - .82)\left(\frac{1}{3206} + \frac{1}{3130}\right)}} \approx -15.41$
p-value $= 2 \times \Pr(Z \le -15.41) \approx 2 \times .0000 = .0000$. Test decision: With such a small p-value, reject $H_0$ at any common significance level. Conclusion in context: You have very, very strong statistical evidence that the proportion of all adult men who wash their hands is different from the proportion of all adult women who wash their hands in public restrooms. For a 95% confidence interval, you calculate
$(.746 - .895) \pm (1.96)\sqrt{\dfrac{(.746)(.254)}{3206} + \dfrac{(.895)(.105)}{3130}} = (-.1673, -.1303)$.
You are 95% confident the percentage of adult men who wash their hands in public restrooms is between 13.03 and 16.73 percentage points less than the percentage of adult women who wash their hands in public restrooms.
Activity 21-17: Volunteerism a. These values are statistics because they are taken from samples, not populations. b. You need to know how many men and how many women were sampled. c. For a 99% CI, you calculate $(.25 - .324) \pm (2.576)\sqrt{\dfrac{(.25)(.75)}{30000} + \dfrac{(.324)(.676)}{30000}} = (-.0835, -.0645)$.
You are 99% confident the percentage of American men who did volunteer work in 2005 was between 6.45 and 8.35 percentage points less than the percentage of American women who did volunteer work in 2005. d. This interval is so narrow because the sample sizes are so large (30,000 each). e. A 99.9% confidence interval should be wider because to create it you would need to use a larger value of z* (3.291 rather than 2.576).
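Parts c through e can be illustrated by computing the critical values and both intervals directly. This is a sketch under the assumption that each sample has 30,000 people, as stated in part d; the helper `z_star` is a name chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

p_men, p_women, n = 0.25, 0.324, 30000
se = sqrt(p_men * (1 - p_men) / n + p_women * (1 - p_women) / n)


def z_star(confidence):
    """Critical value leaving (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


z99, z999 = z_star(0.99), z_star(0.999)
diff = p_men - p_women
ci99 = (diff - z99 * se, diff + z99 * se)
ci999 = (diff - z999 * se, diff + z999 * se)
width99 = ci99[1] - ci99[0]
width999 = ci999[1] - ci999[0]
print(f"z* = {z99:.3f}: ({ci99[0]:.4f}, {ci99[1]:.4f})")
print(f"z* = {z999:.3f}: ({ci999[0]:.4f}, {ci999[1]:.4f})")
```

Because the standard error is the same in both intervals, the 99.9% interval is wider purely by the ratio of the two critical values.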
Activity 21-18: Preventing Breast Cancer Blood clot in major vein: The null hypothesis is that tamoxifen and raloxifene are equally effective in reducing blood clots in a major vein, so the proportion of the population who develop blood clots in major veins would be the same for those (potentially) on both drugs. In symbols, $H_0: \pi_T = \pi_R$. The alternative hypothesis is that raloxifene is more effective in reducing blood clots in a major vein, so the population proportion of women (potentially) taking raloxifene who develop blood clots in a major vein would be less than the population proportion of women (potentially) taking tamoxifen who develop blood clots in a major vein. In symbols, $H_a: \pi_T > \pi_R$. Check technical conditions: The number of successes and failures in each group is at least five, and the data were obtained by randomly assigning the subjects to one of two treatment groups, so the conditions are met. $\hat{p}_R = .0067$; $\hat{p}_T = .0055$.
Test statistic: $z = \dfrac{.0055 - .0067}{\sqrt{(.006)(1 - .006)\left(\frac{1}{9726} + \frac{1}{9745}\right)}} \approx -1.10$
p-value $= \Pr(Z \ge -1.10) = .8643$. Test decision: With such a large p-value, do not reject $H_0$ at any common significance level. Conclusion in context: You have no evidence that raloxifene is more effective than tamoxifen in reducing blood clots in a major vein. (In fact, the data suggest it is less effective.)
Blood clot in lung: The null hypothesis is that tamoxifen and raloxifene are equally effective in reducing blood clots in a lung, so the proportion of the population who develop blood clots in a lung would be the same for those on both drugs. In symbols, $H_0: \pi_T = \pi_R$. The alternative hypothesis is that raloxifene is more effective than tamoxifen in reducing blood clots in a lung, so the population proportion of women taking raloxifene who develop blood clots in a lung would be less than the population proportion of women taking tamoxifen who develop blood clots in a lung. In symbols, $H_a: \pi_T > \pi_R$. Check technical conditions: The number of successes and failures in each group is at least five, and the data were obtained by randomly assigning the subjects to one of two treatment groups, so the conditions are met. $\hat{p}_R = .0036$; $\hat{p}_T = .0056$.
Test statistic: $z = \dfrac{.0056 - .0036}{\sqrt{(.0046)(1 - .0046)\left(\frac{1}{9726} + \frac{1}{9745}\right)}} \approx 2.03$
p-value $= \Pr(Z \ge 2.03) = .0212$. Test decision: Because p-value $= .0212 < .05$, reject $H_0$ at the .05 significance level. Conclusion in context: You have moderate statistical evidence that raloxifene is more effective than tamoxifen in reducing blood clots in a lung.
Activity 21-19: Underhanded Free Throws a. The null hypothesis is that Reilly’s probability of success is the same with both methods of shooting free throws. In symbols, $H_0: \pi_{\text{underhand}} = \pi_{\text{conventional}}$. The alternative hypothesis is that Reilly’s probability of success is greater with the underhand method than with the conventional method. In symbols, $H_a: \pi_{\text{underhand}} > \pi_{\text{conventional}}$. b. You would need to know how many attempts he made with each method. c. Answers will vary by student, but students should expect the results to be more significant if he made 500 attempts using each method than if he made 100 attempts using each method.
d. The test statistic is $z = \dfrac{.78 - .63}{\sqrt{(.705)(1 - .705)\left(\frac{1}{100} + \frac{1}{100}\right)}} \approx 2.33$. p-value $= \Pr(Z \ge 2.33) = .0099$. Reject $H_0$ at the .01 significance level because p-value $= .0099$, which is less than .01.
e. The test statistic is $z = \dfrac{.78 - .63}{\sqrt{(.705)(1 - .705)\left(\frac{1}{500} + \frac{1}{500}\right)}} \approx 5.20$. p-value $= \Pr(Z \ge 5.20) \approx .0000$. Reject $H_0$ at the .01 significance level because p-value $< .01$.
f. In both cases, you would reject the null hypothesis and conclude that you have very strong statistical evidence that Reilly’s proportion of successes was greater with the underhand method than with the conventional method. Note that, as predicted, the results are more statistically significant with the larger sample size.
g. No, you would not feel comfortable generalizing these results to all players who could be taught by Rick Barry. These results were not obtained from a random sample of players who were taught by Rick Barry; they weren’t even a random sample of free throws made by Rick Reilly. Perhaps both Reilly and Barry are right-handed or have some other common feature that makes the underhand method work for them, but not for others.
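The effect of sample size in parts d and e can be sketched as follows, assuming the same success proportions (.78 and .63) with 100 and then 500 attempts per method; `pooled_z` is an illustrative name. Note that the unrounded statistic for 100 attempts is about 2.326, whose upper-tail area is almost exactly .01, so the table value .0099 reflects rounding z to 2.33.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z(p1, p2, n):
    """z statistic when both groups have n attempts."""
    pooled = (p1 + p2) / 2  # equal group sizes, so a simple average
    return (p1 - p2) / sqrt(pooled * (1 - pooled) * (2 / n))


results = {}
for n in (100, 500):
    z = pooled_z(0.78, 0.63, n)
    results[n] = (z, 1 - NormalDist().cdf(z))  # upper-tail p-value
    print(f"n = {n}: z = {z:.2f}, p-value = {results[n][1]:.4f}")
```

With the proportions held fixed, z grows in proportion to the square root of n, so multiplying the sample size by five multiplies z by about 2.24.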
Activity 21-20: Solitaire a. Yes, you can analyze these data even though the sample sizes are different. b. $\hat{p}_A = .115$; $\hat{p}_B = .167$. [Segmented bar graph: percentage of games (0 to 100) won and lost, for Author A and Author B.]
c. The null hypothesis is that the probability of a win for each author is the same. In symbols, the null hypothesis is $H_0: \pi_A = \pi_B$. The alternative hypothesis is that the probability of a win for each author is not the same. In symbols, $H_a: \pi_A \ne \pi_B$. The test statistic is $z = \dfrac{.115 - .167}{\sqrt{(.15)(1 - .15)\left(\frac{1}{217} + \frac{1}{444}\right)}} \approx -1.74$.
p-value $= 2 \times \Pr(Z \le -1.74) = 2 \times .0409 = .0818$. With $.05 <$ p-value $< .10$, reject $H_0$ at the .10 significance level, but do not reject $H_0$ at the .05 significance level. The data do suggest that the winning probabilities for these two people differ at the .10 significance level, but not at the .05 significance level.
d. For a 90% CI, you calculate $(.115 - .167) \pm (1.645)\sqrt{\dfrac{(.115)(.885)}{217} + \dfrac{(.167)(.833)}{444}} = (-.0975, -.0054)$.
e. Technical conditions: The number of successes and failures in each group is at least five. Although these are probably not random samples from two independent processes, they can probably be considered representative samples, so you might consider this technical condition to be satisfied. f. You calculate $(.167 - .115) \pm (1.645)(.0279) = (.0054, .0975)$. This is the “positive version” of the same interval.
g. Because the test is not significant at the .05 level, this tells you that zero is a plausible value for $\pi_A - \pi_B$ and therefore will be contained in the 95% (100% − 5%) confidence interval. h. The p-value would have been half as large if you had tested that author B’s success probability is higher than author A’s (because this would be a one-sided test and the sample results are in the conjectured direction).
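The link between the two-sided test and the 90% and 95% intervals (parts c, d, g, and h) can be sketched numerically. The code assumes the rounded proportions from part b, so its z and p-value differ slightly from the −1.74 and .0818 above; `wald_ci` is a hypothetical helper. The correspondence is also only approximate in general, because the test uses a pooled standard error and the interval an unpooled one.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
p_a, n_a, p_b, n_b = 0.115, 217, 0.167, 444  # rounded values from part b

# Two-sided test with the pooled standard error.
pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
z = (p_a - p_b) / sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
p_two_sided = 2 * std_normal.cdf(-abs(z))
p_one_sided = p_two_sided / 2  # valid because the data favor author B


def wald_ci(confidence):
    """Unpooled interval for pi_A - pi_B at the given confidence level."""
    z_star = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b - z_star * se, p_a - p_b + z_star * se)


ci90, ci95 = wald_ci(0.90), wald_ci(0.95)
print(f"two-sided p-value = {p_two_sided:.4f}")
print(f"90% CI = ({ci90[0]:.4f}, {ci90[1]:.4f})")
print(f"95% CI = ({ci95[0]:.4f}, {ci95[1]:.4f})")
```

The 90% interval lies entirely below zero while the 95% interval contains it, matching a p-value between .05 and .10.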
Activity 21-21: Magazine Advertisements a. $\hat{p}_{SI} = .466$; $\hat{p}_{SOD} = .215$ b. [Segmented bar graph: percentage of pages (0 to 100) with and without advertisements, for Sports Illustrated and Soap Opera Digest.]
c. The null hypothesis is that the proportion of all pages with ads is the same in both magazines. In symbols, the null hypothesis is $H_0: \pi_{\text{Sports Illustrated}} = \pi_{\text{Soap Opera Digest}}$. The alternative hypothesis is that the proportion of all pages with ads is not the same in both magazines. In symbols, the alternative hypothesis is $H_a: \pi_{\text{Sports Illustrated}} \ne \pi_{\text{Soap Opera Digest}}$. The test statistic is $z = \dfrac{.466 - .215}{\sqrt{(.333)(1 - .333)\left(\frac{1}{116} + \frac{1}{130}\right)}} \approx 4.15$.
p-value $= 2 \times \Pr(Z \ge 4.15) \approx 2 \times .0000 = .0000$ d. Because the p-value is less than .01, reject $H_0$ and conclude that there is strong statistical evidence that the proportion of pages with ads is not the same in the two magazines. (If the proportions were the same, you would virtually never see a sample difference this extreme or more extreme by random sampling alone.) e. In order for this test procedure to be valid, you must assume that these are independent random samples of pages from the magazines.
Activity 21-22: Wording of Surveys a. If the researchers’ claim about acquiescence is valid, $\pi_A$ should be greater than $\pi_B$. b. $\hat{p}_A = .60$; $\hat{p}_B = .43$.
c. $H_0: \pi_A = \pi_B$; $H_a: \pi_A > \pi_B$. The test statistic is $z = \dfrac{.60 - .43}{\sqrt{(.514)(1 - .514)\left(\frac{1}{473} + \frac{1}{472}\right)}} \approx 5.04$.
p-value $= \Pr(Z \ge 5.04) \approx .0000$. With such a small p-value, reject $H_0$ at any common significance level. You have very strong statistical evidence that the population proportion of all potential form A subjects who contend that individuals are more to blame is greater than the population proportion of all potential form B subjects who contend that individuals are more to blame. Thus the researchers’ claim is supported by the data.
Activity 21-23: Wording of Surveys a. $\hat{p}_{\text{forbid}} = 161/409 \approx .393$; $\hat{p}_{\text{allow}} = (432 - 189)/432 = .5625$. b. The null hypothesis is that the population proportion of potential “forbid” subjects who oppose communist speeches is the same as the population proportion of potential “allow” subjects who oppose communist speeches. In symbols, $H_0: \pi_{\text{forbid}} = \pi_{\text{allow}}$. The alternative hypothesis is that the population proportion of potential “forbid” subjects who oppose communist speeches is not the same as the population proportion of potential “allow” subjects. In symbols, $H_a: \pi_{\text{forbid}} \ne \pi_{\text{allow}}$. Technical conditions: The number of successes and failures in each group is at least five (161 is the smallest), and the subjects were presumably randomly assigned to the different question versions. The test statistic is $z = \dfrac{.393 - .5625}{\sqrt{(.480)(1 - .480)\left(\frac{1}{409} + \frac{1}{432}\right)}} \approx -4.90$.
p-value $= 2 \times \Pr(Z \le -4.90) \approx 2 \times .0000 = .0000$. Reject $H_0$ at the .10, .05, and .01 significance levels because p-value $< .0001$. You have sufficient statistical evidence to conclude that the population proportion of subjects who oppose communist speeches is affected by the use of the words “forbid” and “allow” in the question. c. The null hypothesis is that the population proportion of potential “forbid” subjects who oppose X-rated movies is the same as the population proportion of potential “allow” subjects. In symbols, $H_0: \pi_{\text{forbid}} = \pi_{\text{allow}}$. The alternative hypothesis is that the population proportion of potential “forbid” subjects who oppose X-rated movies is not the same as the population proportion of potential “allow” subjects. In symbols, $H_a: \pi_{\text{forbid}} \ne \pi_{\text{allow}}$. $\hat{p}_{\text{forbid}} = .4095$; $\hat{p}_{\text{allow}} = .4635$
The test statistic is $z = \dfrac{.4095 - .4635}{\sqrt{.4372(1 - .4372)\left(\frac{1}{547} + \frac{1}{576}\right)}} \approx -1.82$.
p-value $= 2 \times \Pr(Z \le -1.82) = 2 \times .0344 = .0688$. Reject $H_0$ at the .10 significance level, but do not reject $H_0$ at the .05 or .01 significance levels because $.05 <$ p-value $< .10$. d. The null hypothesis is that the population proportion of potential “forbid” subjects who oppose cigarette ads on television is the same as the population proportion of potential “allow” subjects. In symbols, $H_0: \pi_{\text{forbid}} = \pi_{\text{allow}}$. The alternative hypothesis is that the population proportion of potential “forbid” subjects who oppose cigarette ads on television is not the same as the population proportion of potential “allow” subjects. In symbols, $H_a: \pi_{\text{forbid}} \ne \pi_{\text{allow}}$. $\hat{p}_{\text{forbid}} = .506$; $\hat{p}_{\text{allow}} = .764$. The test statistic is $z = \dfrac{.506 - .764}{\sqrt{(.63)(1 - .63)\left(\frac{1}{607} + \frac{1}{576}\right)}} \approx -9.15$.
p-value $= 2 \times \Pr(Z \le -9.15) \approx 2 \times .0000 = .0000$. Reject $H_0$ at the .10, .05, and .01 significance levels because p-value $< .0001$. e. It appears that, in general, the words “forbid” and “allow” are not interchangeable in survey questions, particularly regarding social values issues. They made a very (statistically) significant difference in the answers to the question about cigarette ads on television and in the answers to the question about communist speeches, and a moderately statistically significant difference in the answers to the X-rated movies question.
Activity 21-24: Questioning Smoking Policies The null hypothesis is that the proportion of all Dickinson College students who would favor a ban on smoking is the same regardless of whether they are interviewed by a smoker or a nonsmoker. In symbols, $H_0: \pi_{\text{smoker}} = \pi_{\text{nonsmoker}}$. The alternative hypothesis is that the proportion of all Dickinson College students who favor a ban on smoking is not the same for those interviewed by a smoker and those interviewed by a nonsmoker. In symbols, $H_a: \pi_{\text{smoker}} \ne \pi_{\text{nonsmoker}}$. $\hat{p}_{\text{smoker}} = .43$; $\hat{p}_{\text{nonsmoker}} = .79$.
Test statistic: $z = \dfrac{.43 - .79}{\sqrt{(.61)(1 - .61)\left(\frac{1}{100} + \frac{1}{100}\right)}} \approx -5.21$
p-value $= 2 \times \Pr(Z \le -5.21) \approx 2 \times .0000 = .0000$. Test decision: With such a small p-value, reject $H_0$ at any common significance level. Conclusion in context: If the interviewer being a smoker or nonsmoker made no difference in the proportion of students who favored a ban on smoking, you
would expect to see a difference in sample proportions this extreme (or more extreme) virtually never by random chance alone (p-value ≈ 0). In other words, the probability that the observed difference in the sample proportions is a result of random assignment alone is essentially zero. Because the researcher did find such an extreme difference in the sample proportions, you can safely conclude that there is very strong statistical evidence that whether the interviewer is smoking has an effect on how students respond to the question of whether Dickinson students favor a ban on smoking.
Activity 21-25: Teen Smoking a. You need to know how many girls were interviewed. b. $H_0: \pi_{\text{mother smoked}} = \pi_{\text{mother didn't smoke}}$; $H_a: \pi_{\text{mother smoked}} > \pi_{\text{mother didn't smoke}}$ (considering daughters of women who smoked during pregnancy to be more likely to smoke themselves). The test statistic is $z = \dfrac{.26 - .04}{\sqrt{(.15)(1 - .15)\left(\frac{1}{50} + \frac{1}{50}\right)}} \approx 3.08$.
p-value $= \Pr(Z \ge 3.08) = .0010$. With p-value $= .001 < .05$, reject $H_0$ at the .05 significance level. (The difference is statistically significant, even with a two-sided alternative p-value of .002.) You conclude that there is strong statistical evidence that daughters of mothers who smoke during pregnancy are more likely to smoke themselves than daughters of mothers who do not smoke during pregnancy.
c. The test statistic is $z = \dfrac{.26 - .04}{\sqrt{(.084)(1 - .084)\left(\frac{1}{50} + \frac{1}{200}\right)}} \approx 5.02$.
p-value $= \Pr(Z \ge 5.02) \approx .0000$. With p-value $< .0002 < .05$, reject $H_0$ at the .05 significance level. (The difference is statistically significant.) You conclude that there is extremely strong statistical evidence that daughters of mothers who smoke during pregnancy are more likely to smoke themselves than daughters of mothers who do not smoke during pregnancy.
d. The test statistic is $z = \dfrac{.26 - .04}{\sqrt{(.15)(1 - .15)\left(\frac{1}{200} + \frac{1}{200}\right)}} \approx 6.16$.
p-value $= \Pr(Z \ge 6.16) \approx .0000$. With such a small p-value, reject $H_0$ at the .05 significance level. (The difference is statistically significant.) You conclude that there is very, very strong statistical evidence that daughters of mothers who smoke during pregnancy are more likely to smoke themselves than daughters of mothers who do not smoke during pregnancy.
e. The following segmented bar graph compares smoking habits: [Segmented bar graph: percentage of daughters (0 to 100) who smoked and did not smoke, for mothers who smoked and mothers who did not smoke.]
The appearance of this graph does not change as the sample size increases because the graph displays the sample proportions (.26 and .04) rather than the sample counts of daughters who smoke. f. This is an observational study because the researchers did not decide who would or would not smoke. This explanatory variable was determined by the mothers themselves. g. Because this is an observational study and not an experiment, although the results are statistically significant, you cannot conclude that the pregnant mothers’ smoking causes the daughters’ tendency to smoke. Possible confounding variables include whether the mothers continue to smoke through the daughters’ youth and/or whether their fathers or some other household member smoked during their childhood.
Activity 21-26: Teen Smoking a. $H_0: \pi_{\text{mother smoked}} = \pi_{\text{mother didn't smoke}}$; $H_a: \pi_{\text{mother smoked}} > \pi_{\text{mother didn't smoke}}$ (assuming the sons of mothers who smoked will be more likely to smoke). The test statistic is $z = \dfrac{.20 - .15}{\sqrt{(.175)(1 - .175)\left(\frac{1}{60} + \frac{1}{60}\right)}} \approx 0.72$.
p-value $= \Pr(Z \ge 0.72) = .2358$. As the p-value is not small, do not reject $H_0$ at the .05 significance level. (The difference is not statistically significant.) You conclude that there is not convincing statistical evidence that sons of mothers who smoke during pregnancy are more likely to smoke themselves than sons of mothers who do not smoke during pregnancy.
b. The test statistic is $z = \dfrac{.20 - .15}{\sqrt{(.175)(1 - .175)\left(\frac{1}{200} + \frac{1}{200}\right)}} \approx 1.32$.
p-value $= \Pr(Z \ge 1.32) = .0934$
As the p-value is not small, do not reject $H_0$ at the .05 significance level. (The difference is not statistically significant.)
c. The test statistic is $z = \dfrac{.20 - .15}{\sqrt{(.175)(1 - .175)\left(\frac{1}{500} + \frac{1}{500}\right)}} \approx 2.08$.
p-value $= \Pr(Z \ge 2.08) = .0188$. With p-value $= .0188 < .05$, reject $H_0$ at the .05 significance level. (The difference is statistically significant.) d. In order to be statistically significant at the .05 level, you need $z \ge 1.645$, so
$z = \dfrac{.20 - .15}{\sqrt{.175(.825)\left(\frac{2}{n}\right)}} \ge 1.645$, which gives $n \ge \dfrac{(.175)(.825)(2)(1.645)^2}{(.05)^2} \approx 312.5$.
The sample size would need to be at least $n = 313$ sons in each group for the difference in the sample proportions to be statistically significant at the .05 level.
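The algebra in part d can be cross-checked by a direct search over n. This sketch assumes equal group sizes and the one-sided critical value 1.645 used above; the variable and function names are illustrative only.

```python
from math import ceil, sqrt

p1, p2 = 0.20, 0.15
pooled = (p1 + p2) / 2   # .175 when both groups have n sons
z_needed = 1.645         # one-sided .05 critical value used in the text

# Closed form from solving z >= 1.645 for n.
n_formula = ceil(pooled * (1 - pooled) * 2 * z_needed**2 / (p1 - p2) ** 2)


def z_for(n):
    """z statistic when each group contains n sons."""
    return (p1 - p2) / sqrt(pooled * (1 - pooled) * (2 / n))


# Direct search as a cross-check on the algebra.
n_search = next(n for n in range(2, 10_000) if z_for(n) >= z_needed)
print(n_formula, n_search)
```

Both approaches agree, and evaluating `z_for(312)` confirms that one fewer son per group falls just short of the critical value.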
Activity 21-27: Candy and Longevity a. The null hypothesis is that the population proportion of all candy consumers who die (during this time period) is the same as the population proportion of all nonconsumers who die. In symbols, $H_0: \pi_{\text{candy}} = \pi_{\text{nonconsumer}}$. The alternative hypothesis is that the population proportion of all candy consumers who die is less than the population proportion of all nonconsumers who die. In symbols, $H_a: \pi_{\text{candy}} < \pi_{\text{nonconsumer}}$. $\hat{p}_{\text{candy}} = .059$; $\hat{p}_{\text{nonconsumer}} = .075$. The test statistic is $z = \dfrac{.059 - .075}{\sqrt{(.066)(1 - .066)\left(\frac{1}{4529} + \frac{1}{3312}\right)}} \approx -2.76$.
p-value $= \Pr(Z \le -2.76) = .0029$. With p-value $= .0029 < .05$, reject $H_0$ at the .05 significance level. From these results, you have strong statistical evidence that the population proportion of all candy consumers who die is less than the population proportion of all nonconsumers who die. b. No, you cannot conclude a cause-and-effect relationship between candy consumption and increased survival because this was an observational study and not a well-designed, randomized experiment. c. No, you cannot generalize your conclusions to all adults because all the participants in this observational study were males. You should not generalize your conclusions to all males either because all the males in this study were men who attended an Ivy League school more than 30 years before the study. Therefore, they are not representative of the “typical” American male.
Activity 21-28: Your Choice Answers will vary.
Topic 22: Comparing Two Means
her name. The confidence interval enables you to say more: that giving her name to customers increases the waitress’s tips by an average of about 1 to 3 dollars per dining party at Sunday brunch in this restaurant. But you must be cautious about generalizing this result to other waitresses because only one waitress participated in this study. Even for this particular waitress, you should be cautious about generalizing the results to customers beyond those who partake of Sunday brunch at that particular Charlie Brown’s restaurant in southern California. You should also remember that these p-value and confidence interval calculations are valid only if the tip amounts roughly follow a normal distribution.
•••
Homework Activities Activity 22-5: Close Friends a. If all of the women sampled had one more friend, the mean for the women would increase by one, and all other statistics would remain the same. This would increase the absolute value of the test statistic (the difference between the sample means would increase, but the denominator would not change) and thus would decrease the p-value. b. If all of the men sampled had one more friend, the mean for the men would increase by one, and all other statistics would remain the same. This would increase the absolute value of the test statistic (the difference between the sample means would increase, but the denominator would not change) and thus would decrease the p-value. c. If every man and every woman sampled had one more close friend than they originally reported, then the sample means for the men and for the women would increase by one. The sample standard deviations and sample sizes would not change. The difference between the sample means would not change, so the test statistic value would not change, and thus the p-value would not change. d. If both sample standard deviations were larger, the denominator of the test statistic would be larger, and thus the test statistic would be smaller. This would make the p-value larger. e. If both sample sizes were larger, the denominator of the test statistic would be smaller, and thus the test statistic would be larger (in absolute value). This would make the p-value smaller.
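Parts c, d, and e can be illustrated with the two-sample t statistic computed from summary statistics. The numbers below are hypothetical and are NOT the data from this activity; they only show the direction of each change. The function name `two_sample_t` is likewise an illustrative choice.

```python
from math import sqrt


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic from summary statistics."""
    return (mean1 - mean2) / sqrt(sd1**2 / n1 + sd2**2 / n2)


# Hypothetical summary statistics (assumed for illustration only).
base = dict(mean1=2.5, sd1=1.5, n1=100, mean2=2.0, sd2=1.5, n2=100)
t_base = two_sample_t(**base)

# Part c: everyone reports one more friend, so both means rise by one.
t_shift_both = two_sample_t(**{**base, "mean1": 3.5, "mean2": 3.0})
# Part d: larger standard deviations in both groups.
t_bigger_sd = two_sample_t(**{**base, "sd1": 3.0, "sd2": 3.0})
# Part e: larger sample sizes in both groups.
t_bigger_n = two_sample_t(**{**base, "n1": 400, "n2": 400})

print(t_base, t_shift_both, t_bigger_sd, t_bigger_n)
```

Shifting both groups by the same amount leaves the statistic unchanged, larger standard deviations shrink it, and larger samples enlarge it, in line with the reasoning above.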
Activity 22-6: Hypothetical Commuting Times Earl: not statistically significant, because the difference in his means is less than the difference in Alex’s means, and his sample sizes and sample standard deviations are the same as Alex’s. Fred: statistically significant, because his sample sizes are much larger than Donna’s, and his sample means and standard deviations are the same. Grace: statistically significant, because the difference in her sample means is greater than that of Barb, and her sample sizes and sample standard deviations are the same. Harry: not statistically significant, because his sample standard deviations are larger than those of Alex, whereas his sample sizes and sample means are the same. Ida: statistically significant, because her sample standard deviations are less than those of Carl, whereas her sample sizes and sample means are the same.
Activity 22-7: Hypothetical Commuting Times Note that the differences in the sample medians are similar in each case, but you don’t have very clear information on how the means compare, especially when there is skewness in the distributions. Manuel will have a large p-value because these sample sizes are small and the sample standard deviations are large. Notice the amount of overlap between the two sample distributions. Jacque’s p-value will be less than Manuel’s because the standard deviations are smaller. Katrina’s p-value will be less than Manuel’s because the sample sizes are larger (and the sample standard deviations are slightly smaller). (Note: It’s not obvious how the p-values for Jacque and Katrina will compare.) Liam will have the smallest p-value because his sample sizes are the largest and his standard deviations are the smallest. Notice the small amount of overlap between the two sample distributions.
Activity 22-8: Nicotine Lozenge a. Age: comparison of means; weight: comparison of means; gender: comparison of proportions; number of cigarettes smoked: comparison of means; whether the person made a previous attempt to quit smoking: comparison of proportions. b. The researchers would hope to fail to reject the null hypotheses in these tests because the null hypotheses would be that the two groups (those who use the nicotine lozenge and those who don’t) are identical with regard to each of these background variables. The researchers would be hoping not to find a statistically significant difference between the groups on any of these variables so that a difference between the groups on the response variable could be attributed to the nicotine lozenge.
Activity 22-9: Got a Tip? a. The explanatory variable is whether the party received a fancy piece of chocolate. The response variable is the amount of tip (as a percentage of the bill). b. Looking at the percentage of the bill is more useful than looking at the exact tip amount in this case because the checks would not all be for the same amount. Some parties would order expensive meals and others might order very inexpensive meals. Thus, the amount of the tip might vary considerably based on what the customers had for dinner rather than on whether they had received the chocolate. c. This is an experiment because the researchers randomly decided who would receive the chocolates (the treatment) and who would not. d. Technical conditions: Yes, you can check the technical conditions. The sample sizes are both greater than 30 ($n_1 = n_2 = 46$), and the data arise from random assignment of subjects to two treatment groups. e. The null hypothesis is that providing a fancy, foil-wrapped piece of chocolate makes no difference in the average tip (as a percentage of the bill) at this restaurant. In symbols, the null hypothesis is $H_0: \mu_{\text{chocolate}} = \mu_{\text{no chocolate}}$.
The alternative hypothesis is that providing a fancy, foil-wrapped piece of chocolate will increase the average tip (as a percentage of the bill) at this restaurant. In symbols, $H_a: \mu_{\text{chocolate}} > \mu_{\text{no chocolate}}$. The test statistic is $t = \dfrac{17.84 - 15.06}{\sqrt{\dfrac{1.89^2}{46} + \dfrac{3.06^2}{46}}} \approx 5.24$.
Using Table III with 40 degrees of freedom, you see that the p-value is off the chart, so p-value $< .0005$. Using Minitab and the applet, p-value $\approx .0000$. With such a small p-value, reject $H_0$ at any common significance level. You have very strong statistical evidence that providing a fancy, foil-wrapped piece of chocolate will increase the average tip (as a percentage of the bill) at this restaurant. f. The confidence interval formula is $(17.84 - 15.06) \pm t^*(0.5303)$. Using Table III with 40 degrees of freedom, $t^* = 2.021$, so the confidence interval is (1.708, 3.852). Using Minitab, the interval is (1.823, 3.837). Using the applet, the interval is (1.712, 3.848). You are 95% confident that providing a fancy, foil-wrapped piece of chocolate will increase the average tip as a percentage of the bill by 1.7 to 3.8 percentage points. g. Random assignment should have ensured that the only difference between the parties was whether they received the chocolate with their bill, so you can conclude a causal link between the chocolate and the increased tip. You would not want to generalize this to other restaurants in other cities, or to other restaurants in Ithaca, as this experiment was tried at only one restaurant. It is possible that all 92 parties came to the restaurant on the same evening, in which case you should not generalize beyond that particular night of the week.
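The t statistic in part e and the Table III interval in part f can be reproduced with a few lines. This is a sketch that assumes the standard deviations pair with the groups in the order they are printed in the test statistic (1.89 with chocolate, 3.06 without); because both samples have 46 parties, the pairing does not affect the result. The critical value 2.021 is taken from the text rather than computed.

```python
from math import sqrt

# Summary values as printed above; the SD-to-group pairing is an assumption.
mean_choc, sd_choc, n_choc = 17.84, 1.89, 46
mean_none, sd_none, n_none = 15.06, 3.06, 46
t_star = 2.021  # Table III value for 95% confidence, 40 degrees of freedom

se = sqrt(sd_choc**2 / n_choc + sd_none**2 / n_none)
t_stat = (mean_choc - mean_none) / se
margin = t_star * se
ci = (mean_choc - mean_none - margin, mean_choc - mean_none + margin)
print(f"SE = {se:.4f}, t = {t_stat:.2f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Software that uses a different degrees-of-freedom rule will report a slightly different critical value, which is why the Minitab and applet intervals quoted above are not identical to the table-based one.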
Activity 22-10: Body Temperatures
The following table and graph display the body temperatures data:

           Sample Size   Sample Mean   Sample SD
Females    65            98.105        0.699
Males      65            98.394        0.743

[Graph: body temperature (in °F) for females and males, on an axis running from 96 to 101.]
The null hypothesis is that healthy men and women have the same average body temperature. In symbols, $H_0: \mu_M = \mu_F$. The alternative hypothesis is that there is a difference in the average body temperature of healthy men and women. In symbols, $H_a: \mu_M \neq \mu_F$.
Check technical conditions: The sample sizes are both greater than 30, but you do not know whether these were random samples. You are probably safe in assuming they are at least representative samples of the temperatures of healthy adults.
Test statistic: $t = \dfrac{98.394 - 98.105}{\sqrt{\dfrac{.743^2}{65} + \dfrac{.699^2}{65}}} = 2.28$
Using Table III with 60 degrees of freedom, 2.000 < 2.28 < 2.390, so 2(.01) < p-value < 2(.025), which means .02 < p-value < .05. Using Minitab, the p-value is .024. Using the applet, the p-value is .0257.
Test decision: With a p-value less than .05, reject $H_0$ at the .05 significance level.
Conclusion in context: You have strong statistical evidence that the average body temperatures of healthy men and women are not the same.
95% confidence interval for $\mu_M - \mu_F$: $(98.394 - 98.105) \pm t^*(0.1265)$
Using Table III with 60 degrees of freedom, $t^* = 2.000$, so the confidence interval is (.036, .542). Using Minitab, the interval is (0.039, 0.540). Using the applet, the interval is (0.036, 0.542). You are 95% confident the average body temperature of healthy adult males is between .04°F and .54°F greater than the average body temperature of healthy adult females.
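The table, Minitab, and applet answers differ slightly because they rest on different degrees of freedom. The sketch below compares two common choices. It assumes the software uses the Welch-Satterthwaite approximation (the solution does not say which method Minitab or the applet applies) and that the table lookup starts from the conservative value min(n1, n2) − 1, rounded down to the nearest available row. The function names are illustrative.

```python
def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """Conservative degrees of freedom: smaller sample size minus one."""
    return min(n1, n2) - 1

# Body temperature summary statistics (males, then females).
df_welch = welch_df(0.743, 65, 0.699, 65)
df_cons = conservative_df(65, 65)

print(f"Welch df = {df_welch:.1f}, conservative df = {df_cons}")
```

The conservative value of 64 is not a row in Table III, which is why the solution drops down to the 60-degrees-of-freedom row.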
Activity 22-11: Ideal Age
a. The null hypothesis is that the mean age given by women is the same as the mean age given by men (in response to this question). In symbols, $H_0: \mu_W = \mu_M$. The alternative hypothesis is that the mean age given by women is not the same as the mean age given by men in response to this question. In symbols, $H_a: \mu_W \neq \mu_M$.
b. You need to know the standard deviations and how many in the sample were women and how many were men.
c. The test statistic is $t = \dfrac{43 - 39}{\sqrt{\dfrac{25^2}{1153} + \dfrac{25^2}{1153}}} = 3.84$.
Using Table III with 500 degrees of freedom, the p-value is off the chart, so p-value < .0005. Using Minitab, p-value ≈ .000. Using the applet, the p-value is .0001. With such a small p-value, reject $H_0$ at any commonly used significance level. You have very strong statistical evidence that the mean age given in response to this question is not the same for women and men.
d. The confidence interval formula is $(43 - 39) \pm t^*(1.041)$. Using Table III with 500 degrees of freedom, $t^* = 2.586$, so the confidence interval is (1.308, 6.692). Using Minitab, the interval is (1.32, 6.68). Using the applet, the interval is (1.314, 6.686). You are 99% confident the average age given by women in response to this question is between 1.3 and 6.7 years greater than the average age given by men in response to this question.
e. Yes, the confidence interval is consistent with the test result because zero is not in the confidence interval, indicating that it is not a plausible value for the difference in average ages.
f. Answers will vary. Twenty-five is probably a reasonable upper bound for how high the standard deviation might be. It is hard to imagine that the “typical deviation from the mean” would be greater than 25 years for either gender.
Activity 22-12: Editorial Styles
a. The following five-number summaries and boxplots compare the distribution of sentence lengths between the two newspapers:

Newspaper         Min   QL   Median   QU     Max   Mean    SD      n
USA Today         4     17   22       29     39    22.84   8.76    31
Washington Post   11    16   19       28.5   56    23.86   12.52   21

[Boxplots of sentence length for USA Today and the Washington Post, on an axis running from 0 to 60.]
The lower quartiles, upper quartiles, and means of these distributions of sentence lengths are roughly the same. However, there are two high outliers in the Washington Post distribution (50 and 56 words per sentence). USA Today had two very short sentences (4 and 7 words per sentence) and a slightly greater median than the Post. There were ten more sentences in the USA Today editorial than in the Post editorial.
b. No, the technical conditions for the two-sample t-test are not satisfied. The sample size for the Washington Post is less than 30 (n = 21 sentences), and these data do not appear to come from a normally distributed population as there are two clear outliers and a strong right skew. In addition, you do not have random, or even necessarily representative, samples from either paper.
c. The null hypothesis is that the average length of the sentences in both papers is the same. In symbols, $H_0: \mu_{\text{USA Today}} = \mu_{\text{W Post}}$. The alternative hypothesis is that the average length of all sentences in USA Today is less than the average length of all sentences in the Washington Post. In symbols, $H_a: \mu_{\text{USA Today}} < \mu_{\text{W Post}}$.
The test statistic is $t = \dfrac{22.84 - 23.86}{\sqrt{\dfrac{8.76^2}{31} + \dfrac{12.52^2}{21}}} = -0.32$.
Using Table III with 20 degrees of freedom, −0.860 < −0.32, so p-value > .20. Using Minitab, the p-value is .374. Using the applet, the p-value is .3784. Because the p-value is not small, do not reject $H_0$ at any commonly used significance level. You do not have any statistical evidence to suggest that the average length of (all) the sentences in USA Today is less than the average length of the sentences in the Washington Post.
Activity 22-13: Editorial Styles
The Washington Post IQR is 28.5 − 16 = 12.5, and 1.5 × 12.5 = 18.75, so [QL − 18.75, QU + 18.75] = [−2.75, 47.25], which is effectively [0, 47.25] because sentence lengths cannot be negative. Therefore, remove any Washington Post sentence lengths outside this range. There are two such outliers (sentences with 56 and 50 words). The following five-number summaries and boxplots compare the distribution of sentence lengths.

Newspaper         Min   QL   Median   QU   Max   Mean    SD     n
USA Today         4     17   22       29   39    22.84   8.76   31
Washington Post   11    16   19       23   41    20.79   8.29   19

[Boxplots of sentence length for USA Today and the Washington Post with the two outliers removed, on an axis running from 0 to 40.]
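The outlier rule used here (flag anything more than 1.5 IQRs beyond the quartiles) is easy to express in code. The following is an illustrative sketch using the Washington Post quartiles from Activity 22-12. The function name `outlier_fences` and the short list of candidate values are assumptions made for the example; the full data set is not reproduced in this solution.

```python
def outlier_fences(ql, qu, k=1.5):
    """Return the (lower, upper) fences for the k * IQR outlier rule."""
    iqr = qu - ql
    return ql - k * iqr, qu + k * iqr

# Washington Post quartiles before any values are removed.
low, high = outlier_fences(16, 28.5)

# The two longest sentences plus the largest value that remains afterward.
candidates = [50, 56, 41]
flagged = [x for x in candidates if x < low or x > high]

print(f"fences: [{low}, {high}]")
print(f"flagged as outliers: {flagged}")
```

The lower fence comes out negative, so in practice only the upper fence can flag a sentence length.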
The null hypothesis is that the average length of (all) the sentences in both papers is the same. In symbols, $H_0: \mu_{\text{USA Today}} = \mu_{\text{W Post}}$. The alternative hypothesis is that the average length of sentences in USA Today is less than the average length of the sentences in the Washington Post. In symbols, $H_a: \mu_{\text{USA Today}} < \mu_{\text{W Post}}$.
Test statistic: $t = \dfrac{22.84 - 20.79}{\sqrt{\dfrac{8.76^2}{31} + \dfrac{8.29^2}{19}}} = 0.83$
Using Table III with 18 degrees of freedom: the test statistic is positive (0.83) while the alternative is left-tailed, so the p-value exceeds .5 and therefore p-value > .20. Using Minitab, the p-value is .794. Using the applet, the p-value is .7914.
Test decision: With the large p-value, do not reject $H_0$ at any commonly used significance level.
Conclusion in context: You do not have any statistical evidence to suggest that the average length of the sentences in USA Today is less than the average length of the sentences in the Washington Post. This time there is no possibility of rejecting the null hypothesis, because without the outliers, the sample mean sentence length for the Washington Post is less than the sample mean sentence length for USA Today. So the sample provides no evidence that USA Today has shorter sentences than the Washington Post.
Activity 22-12 (continued): d. Answers will vary. Some possibilities include: average number of syllables per word, percentage of distinct words, and number of pronouns per passage.
Activity 22-14: Children’s Television Viewing
a. No; if the randomization achieved its goal, there would be no significant difference between the two groups other than the type of curriculum they used.
b. The null hypothesis is that the mean number of hours of television watched is the same in both the control and intervention populations prior to the intervention. In symbols, $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$. The alternative hypothesis is that the mean number of hours of television watched is not the same in both populations prior to the intervention. In symbols, $H_a: \mu_{\text{control}} \neq \mu_{\text{intervention}}$.
c. The test statistic is $t = \dfrac{15.46 - 15.35}{\sqrt{\dfrac{15.02^2}{103} + \dfrac{13.17^2}{95}}} = 0.05$.
Using Table III with 80 degrees of freedom, 0.05 < 0.846, so the p-value is off the chart: p-value > 2(.20) = .40. Using Minitab, the p-value is .956. Using the applet, the p-value is .9563.
d. No; do not reject the null hypothesis at the .05 significance level.
e. The null hypothesis is that the population mean number of hours of television watched is the same under both the control and intervention conditions after the intervention. In symbols, $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$. The alternative hypothesis is that the population mean number of hours of television watched is greater under the control condition than under the intervention condition after the intervention. In symbols, $H_a: \mu_{\text{control}} > \mu_{\text{intervention}}$.
The test statistic is $t = \dfrac{14.46 - 8.8}{\sqrt{\dfrac{13.82^2}{103} + \dfrac{10.41^2}{95}}} = 3.27$.
Using Table III with 80 degrees of freedom, 3.195 < 3.27 < 3.416, so .0005 < p-value < .001. Using Minitab, the p-value is .001. Using the applet, the p-value is .0008. With a p-value less than .05, reject the null hypothesis at the .05 significance level and conclude that you have strong statistical evidence the population mean number of hours of television watched under the control condition is greater than under the intervention condition after the time period used in this study.
f. In the intervention condition, because the standard deviation is larger than the mean, this distribution could not be normal. An interval that includes values one standard deviation below the mean would include negative numbers, which make no sense in this context (hours of television watched), so the distribution must be skewed to the right. Similarly, with the control group, because the mean and standard deviation are so close, an interval of values two standard deviations above and below the mean would include many negative hours of television watched. Therefore, this distribution could not be symmetrical either.
g. The nonnormality of these distributions does not hinder the validity of using this test procedure because the sample sizes are both well above 30.
h. Because the randomization should have evened out the groups before the intervention (and the t-test confirms no significant difference between the
groups in terms of hours of television watched), you are safe in concluding that the significant difference in the mean hours of television watched by the control and intervention groups was caused by the curriculum intervention. However, as noted in previous activities, you should be cautious in generalizing these results to all elementary schools in the San Jose area as the subjects were not randomly selected, but instead selected from two schools. You definitely would not extend these results beyond San Jose elementary schools, and you should keep in mind that the children in both groups self-reported the amount of television they were watching.
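The reasoning in part f of Activity 22-14 (a normal model would put too much of the distribution below zero hours) can be quantified. The sketch below is illustrative only: it uses the follow-up means and standard deviations from part e, and the helper name `share_below_zero` is an assumption.

```python
from statistics import NormalDist

def share_below_zero(mean, sd):
    """Proportion of a normal(mean, sd) distribution that falls below zero."""
    return NormalDist(mean, sd).cdf(0)

# Follow-up hours of television: (sample mean, sample SD) from part e.
groups = {"control": (14.46, 13.82), "intervention": (8.8, 10.41)}

shares = {name: share_below_zero(mean, sd) for name, (mean, sd) in groups.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of a normal model would be negative")
```

A normal model that assigns a sizable share of its probability to impossible negative values is a poor description of the data, which supports the conclusion that both distributions are skewed to the right.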
Activity 22-15: Children’s Television Viewing
a. Baseline Videotapes: $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$ vs. $H_a: \mu_{\text{control}} \neq \mu_{\text{intervention}}$
The test statistic is $t = \dfrac{5.52 - 4.74}{\sqrt{\dfrac{10.44^2}{103} + \dfrac{6.57^2}{95}}} = 0.63$.
Using Table III with 80 degrees of freedom, 0.63 < 0.846, so p-value > 2(.20) = .40. Using Minitab, the p-value is .527. Using the applet, the p-value is .5275. With the large p-value, do not reject $H_0$ at any common significance level. You do not have statistically significant evidence of a difference in the baseline videotape group means.
Baseline Video Games: $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$ vs. $H_a: \mu_{\text{control}} \neq \mu_{\text{intervention}}$
The test statistic is $t = \dfrac{3.85 - 2.57}{\sqrt{\dfrac{9.17^2}{103} + \dfrac{5.1^2}{95}}} = 1.23$.
Using Table III with 80 degrees of freedom, 0.846 < 1.23 < 1.292, so 2(.10) < p-value < 2(.20), which means .20 < p-value < .40. Using Minitab, the p-value is .222. Using the applet, the p-value is .2233. With the large p-value, do not reject $H_0$ at any common significance level. You do not have statistically significant evidence of a difference in the baseline video game group means.
b. Follow-up Videotapes: $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$ vs. $H_a: \mu_{\text{control}} > \mu_{\text{intervention}}$
The test statistic is $t = \dfrac{5.21 - 3.46}{\sqrt{\dfrac{8.41^2}{103} + \dfrac{4.86^2}{95}}} = 1.81$.
Using Table III with 80 degrees of freedom, 1.664 < 1.81 < 1.990, so .025 < p-value < .05. Using Minitab, the p-value is .036. Using the applet, the p-value is .0368. Because the p-value is less than .05, reject $H_0$ at the .05 significance level.
You have moderate statistical evidence that the population mean time spent watching videotapes is greater under the control condition than under the intervention condition.
Follow-up Video Games: $H_0: \mu_{\text{control}} = \mu_{\text{intervention}}$ vs. $H_a: \mu_{\text{control}} > \mu_{\text{intervention}}$
The test statistic is $t = \dfrac{4.24 - 1.32}{\sqrt{\dfrac{10^2}{103} + \dfrac{2.72^2}{95}}} = 2.85$.
Using Table III with 80 degrees of freedom, 2.639 < 2.85 < 3.195, so .001 < p-value < .005. Using Minitab, the p-value is .003. Using the applet, the p-value is .0027. With the small p-value, reject $H_0$ at any common significance level. You have strong statistical evidence that the population mean time spent playing video games under the control condition is greater than under the intervention condition.
Activity 22-16: Classroom Attention
a. Technical conditions: The sample sizes are greater than 30, but the data did not arise from independent random samples from two populations or a randomized experiment.
b. Reading: The null hypothesis is that the mean second-grade instruction time spent in reading is the same for both girls and boys. In symbols, $H_0: \mu_g = \mu_b$. The alternative hypothesis is that the second-grade mean instruction time spent in reading for girls is not the same as the second-grade mean instruction time spent in reading for boys. In symbols, $H_a: \mu_g \neq \mu_b$.
The test statistic is $t = \dfrac{37.81 - 35.9}{\sqrt{\dfrac{18.46^2}{372} + \dfrac{18.64^2}{354}}} = 1.39$.
Using Table III with 100 degrees of freedom, 1.290 < 1.39 < 1.660, so 2(.05) < p-value < 2(.1), which means .1 < p-value < .2. Using Minitab, the p-value is .166. Using the applet, the p-value is .1665. With a p-value greater than .05, you do not have convincing evidence that the mean instruction time spent in second-grade reading differs between girls and boys.
Mathematics: The null hypothesis is that the mean second-grade instruction time spent in mathematics is the same for both girls and boys. In symbols, $H_0: \mu_g = \mu_b$. The alternative hypothesis is that the second-grade mean instruction time spent in mathematics for girls is not the same as the second-grade mean instruction time spent in mathematics for boys. In symbols, $H_a: \mu_g \neq \mu_b$.
The test statistic is $t = \dfrac{29.55 - 38.77}{\sqrt{\dfrac{18.93^2}{372} + \dfrac{16.59^2}{354}}} = -6.99$.
Using Table III with 100 degrees of freedom, the test statistic's absolute value satisfies 6.99 > 3.391, which is off the chart. You can conclude the p-value < 2(.0005) = .001. Using Minitab and the applet, p-value ≈ .0000. With such a small p-value, reject $H_0$ at any common significance level. You have very strong evidence that the mean instruction time spent in second-grade mathematics is not the same for girls as it is for boys.
c. Reading: 95% confidence interval for $\mu_g - \mu_b$: $(37.81 - 35.9) \pm t^*(1.378)$
Using Table III with 100 degrees of freedom, $t^* = 1.984$, so the confidence interval is (−0.824, 4.644). Using Minitab, the confidence interval is (−0.79, 4.61). Using the applet, the interval is (−0.799, 4.619).
Mathematics: 95% confidence interval for $\mu_g - \mu_b$: $(29.55 - 38.77) \pm t^*(1.319)$
Using Table III with 100 degrees of freedom, $t^* = 1.984$, so the confidence interval is (−11.837, −6.603). Using Minitab, the interval is (−11.81, −6.63). Using the applet, the interval is (−11.815, −6.625).
d. Both the significance test and confidence interval indicate that there is no statistically significant difference in the mean reading instruction time given to second-grade girls and boys. You do not reject the null hypothesis, and zero is contained in the 95% confidence interval, so you conclude that you have no reason to worry that boys or girls are being treated unfairly with regard to reading instruction time in the second grade. However, there is a significant difference in the mean amount of mathematics instruction time given to girls and boys in the second grade. You are 95% confident that second-grade boys are receiving an average of between 6.6 and 11.8 seconds more individual instruction time in mathematics than are second-grade girls. These data were reported in 1979, so you can hope that they are out-of-date. In addition, the technical conditions for these procedures to be valid were not met, so you should be cautious in interpreting these results.
Because you do not know how the second-grade students in the sample were selected, you cannot generalize these results to a specific population; and because this was an observational study, you cannot conclude that gender alone was responsible for the differences in time observed.
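The agreement between the tests and the confidence intervals in Activity 22-16 can be illustrated with a small check: a two-sided test at the .05 level rejects exactly when the 95% interval built from the same degrees of freedom excludes zero. This is a hedged sketch using the standard errors and the Table III critical value quoted in the solution; the function names are illustrative.

```python
def confidence_interval(diff, se, t_star):
    """Return (low, high) for diff ± t_star * se."""
    margin = t_star * se
    return diff - margin, diff + margin

def contains_zero(interval):
    low, high = interval
    return low <= 0 <= high

t_star = 1.984  # Table III, 100 degrees of freedom, 95% confidence

reading_ci = confidence_interval(37.81 - 35.9, 1.378, t_star)
math_ci = confidence_interval(29.55 - 38.77, 1.319, t_star)

print("reading:", reading_ci, "contains zero:", contains_zero(reading_ci))
print("mathematics:", math_ci, "contains zero:", contains_zero(math_ci))
```

The reading interval contains zero (no significant difference), while the mathematics interval lies entirely below zero (girls receive less time than boys).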
Activity 22-17: UFO Sighters’ Personalities
a. This is an observational study. The researchers did not determine who would or would not be UFO sighters.
b. The explanatory variable is whether the subject had an intense experience with a UFO. This variable is binary categorical. The response variable is IQ score. This variable is quantitative.
c. The null hypothesis is that the population mean IQ score is the same for all community members and all UFO sighters. In symbols, $H_0: \mu_{\text{UFO}} = \mu_{\text{community}}$. The alternative hypothesis is that the population mean IQ score is not the same for all community members and all UFO sighters. In symbols, $H_a: \mu_{\text{UFO}} \neq \mu_{\text{community}}$.
The test statistic is $t = \dfrac{101.6 - 100.6}{\sqrt{\dfrac{8.9^2}{25} + \dfrac{12.3^2}{53}}} = 0.41$.
Using Table III with 24 degrees of freedom, 0.41 < 0.857, so p-value > 2(.2) = .4. Using Minitab, the p-value is .685. Using the applet, the p-value is .6873. The p-value is not small, so do not reject $H_0$. There is no statistical evidence to suggest that the population mean IQ score of community members and UFO sighters differs.
d. 95% confidence interval for $\mu_{\text{UFO}} - \mu_{\text{community}}$: $(101.6 - 100.6) \pm t^*(2.45)$
Using Table III with 24 degrees of freedom, $t^* = 2.064$, so the confidence interval is (−4.057, 6.057). Using Minitab, the interval is (−3.9, 5.9). Using the applet, the interval is (−4.065, 6.065). You are 95% confident that the difference in the mean IQ score between these two populations is between −4 and 6 points. Because this interval contains both positive and negative numbers and zero, it indicates that it is plausible there is no difference in the mean IQ scores of the two populations. Each of the values from −4 to 6 (including zero) is a plausible value for the difference in mean IQ scores between these two populations.
e. No; even if you had found a significantly higher mean IQ for one group, you would not be able to draw any causal conclusions about how seeing UFOs affects intelligence because this was an observational study. There could be many confounding variables that you were unable to control for.
Activity 22-18: Tennis Simulations
a. The null hypothesis is that the population mean game length is the same for standard scoring and no-ad scoring. In symbols, $H_0: \mu_{\text{standard}} = \mu_{\text{no-ad}}$. The alternative hypothesis is that the population mean game length is not the same for standard scoring and no-ad scoring. In symbols, $H_a: \mu_{\text{standard}} \neq \mu_{\text{no-ad}}$.
The test statistic is $t = \dfrac{6.81 - 5.84}{\sqrt{\dfrac{2.74^2}{100} + \dfrac{1.022^2}{100}}} = 3.32$.
Using Table III with 80 degrees of freedom, 3.195 < 3.32 < 3.416, so 2(.0005) < p-value < 2(.001), which means .001 < p-value < .002. Using Minitab, the p-value is .001. Using the applet, the p-value is .0013. With the small p-value, reject $H_0$ at any commonly used significance level. You have strong statistical evidence that the population mean game length with conventional scoring differs from the mean game length for no-ad scoring.
95% confidence interval for $\mu_{\text{standard}} - \mu_{\text{no-ad}}$: $(6.81 - 5.84) \pm t^*(0.292)$
Using Table III with 80 degrees of freedom, $t^* = 1.990$, so the confidence interval is (0.389, 1.551). Using Minitab, the interval is (0.391, 1.549). Using the applet, the interval is (0.39, 1.55). You are 95% confident that the population mean game length with conventional scoring is between .39 and 1.55 points greater than the population mean game length for no-ad scoring.
b. The null hypothesis is that the population mean game length is the same for handicap scoring and no-ad scoring. In symbols, $H_0: \mu_{\text{no-ad}} = \mu_{\text{handicap}}$. The alternative hypothesis is that the population mean game length is not the same for handicap scoring and no-ad scoring. In symbols, $H_a: \mu_{\text{no-ad}} \neq \mu_{\text{handicap}}$.
The test statistic is $t = \dfrac{5.84 - 4.79}{\sqrt{\dfrac{1.022^2}{100} + \dfrac{1.458^2}{100}}} = 5.90$.
Using Table III with 80 degrees of freedom, 5.90 > 3.416, so the p-value is off the chart; p-value < 2(.0005) = .001. Using Minitab and the applet, the p-value ≈ .0000. With the small p-value, reject $H_0$ at any commonly used significance level. You have strong statistical evidence that the population mean game length with handicap scoring differs from the mean game length for no-ad scoring.
95% confidence interval for $\mu_{\text{no-ad}} - \mu_{\text{handicap}}$: $(5.84 - 4.79) \pm t^*(.1781)$
Using Table III with 80 degrees of freedom, $t^* = 1.990$, so the confidence interval is (0.696, 1.404). Using Minitab, the interval is (0.699, 1.401). Using the applet, the interval is (0.697, 1.403). You are 95% confident that the population mean game length with no-ad scoring is between .7 and 1.4 points greater than the population mean game length for handicap scoring.
c. No, these inference techniques do not say anything about whether the variability of game lengths differs among the scoring methods or whether the shapes of the distributions differ.
Activity 22-19: Ice Cream Servings
a. This is an experiment because the researchers randomly assigned the subjects to the two bowl sizes.
b. Technical conditions: No, you do not have enough information to check whether the technical conditions of the two-sample t-test are satisfied. The sample sizes are small (less than 30), and you do not have the individual data, so you cannot make a judgment about whether the data appear to come from normally distributed populations.
c. The null hypothesis is that the population mean volume of ice cream taken by people is the same regardless of bowl size. In symbols, $H_0: \mu_{\text{34 oz bowl}} = \mu_{\text{17 oz bowl}}$. The alternative hypothesis is that the population mean volume of ice cream taken by people with a 34-ounce bowl is greater than the mean volume of ice cream taken by people with a 17-ounce bowl. In symbols, $H_a: \mu_{\text{34 oz bowl}} > \mu_{\text{17 oz bowl}}$.
d. The test statistic is $t = \dfrac{5.81 - 4.38}{\sqrt{\dfrac{2.26^2}{17} + \dfrac{2.05^2}{20}}} = 2.00$.
Using Table III with 16 degrees of freedom, 1.746 < 2.00 < 2.120, so .025 < p-value < .05. Using Minitab, the p-value is .027. Using the applet, the p-value is .0313.
e. Yes, the difference in volume of ice cream served between the two bowl sizes is statistically significant at the .05 level, but not at the .01 level (.01 < p-value < .05).
f. Yes, you can draw a cause-and-effect conclusion between bowl size and volume of ice cream taken because this is a well-designed experiment. If the randomization worked, the only difference between the two groups should have been the bowl sizes.
Activity 22-20: Ice Cream Servings
The null hypothesis is that the population mean volume of ice cream taken by people is the same regardless of spoon size. In symbols, $H_0: \mu_{\text{3 oz spoon}} = \mu_{\text{2 oz spoon}}$. The alternative hypothesis is that the population mean volume of ice cream taken by people with a 3-ounce spoon is greater than the mean volume of ice cream taken by people with a 2-ounce spoon. In symbols, $H_a: \mu_{\text{3 oz spoon}} > \mu_{\text{2 oz spoon}}$.
Test statistic: $t = \dfrac{5.07 - 4.38}{\sqrt{\dfrac{1.84^2}{26} + \dfrac{2.05^2}{20}}} = 1.18$
Using Table III with 19 degrees of freedom, 0.861 < 1.18 < 1.328, so .10 < p-value < .20. Using Minitab, the p-value is .122. Using the applet, the p-value is .1257.
Test decision: No, the difference in volume of ice cream served between the two spoon sizes is not statistically significant at the .05 level or at the .01 level (p-value > .05).
Conclusion in context: You cannot draw a cause-and-effect conclusion between spoon size and volume of ice cream served because the difference observed is not statistically significant.
Activity 22-21: Natural Selection
a. The null hypothesis is that the population mean total length of the sparrows that would survive such a storm is the same as the mean total length of the sparrows that would perish. In symbols, $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$. The alternative hypothesis is that the population mean total length of the sparrows that would survive is not the same as the mean total length of the sparrows that would perish. In symbols, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$.
Technical conditions: The sample sizes are not both greater than 30, but probability plots indicate that both populations are plausibly normally distributed. However, the data do not appear to be independent random samples from two populations, but you might be willing to consider them representative enough to proceed with the analysis.
The test statistic is $t = \dfrac{162 - 159.06}{\sqrt{\dfrac{2.81^2}{35} + \dfrac{2.41^2}{24}}} = 4.3$.
Using Table III with 23 degrees of freedom, 4.3 > 3.768, so p-value < 2(.0005), which means p-value < .001. Using Minitab and the applet, p-value ≈ .000. With such a small p-value, reject $H_0$ at any commonly used significance level and conclude there is a difference in the population mean total length between the sparrows that would die and the sparrows that would survive a severe winter storm (as long as these sparrows are representative of the population).
95% confidence interval for $\mu_{\text{die}} - \mu_{\text{survive}}$: $(162 - 159.06) \pm t^*(.684)$
Using Table III with 23 degrees of freedom, $t^* = 2.069$, so the confidence interval is (1.525, 4.355). Using Minitab, the interval is (1.572, 4.314). Using the applet, the interval is (1.525, 4.355). You are 95% confident the population average length of the sparrows that would die in such a storm is between 1.5 and 4.3 mm greater than the population average length of sparrows that would survive. Because this was an observational study and not an experiment, you cannot conclude a cause-and-effect relationship between total length and the death of the sparrows. In addition, because these data were not randomly selected and you cannot tell whether they are representative, you probably should not generalize these results beyond sparrows in Providence, Rhode Island, in the late 1800s.
b. The null hypothesis is that the population mean weight of the sparrows that would survive is the same as the population mean weight of the sparrows that would perish. In symbols, $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$. The alternative hypothesis is that the population mean weight of the sparrows that would survive is not the same as the population mean weight of the sparrows that would perish. In symbols, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$.
The test statistic is $t = \dfrac{26.27 - 25.47}{\sqrt{\dfrac{1.26^2}{35} + \dfrac{1.46^2}{24}}} = 2.19$.
Using Table III with 23 degrees of freedom, 2.069 < 2.19 < 2.500, so 2(.01) < p-value < 2(.025), which means .02 < p-value < .05. Using Minitab, the p-value is .034. Using the applet, the p-value is .0394. With a p-value less than .05, reject $H_0$ at the .05 significance level and conclude there is a difference in the population mean weight between the sparrows that would die and the sparrows that would survive a severe winter storm.
95% confidence interval for $\mu_{\text{die}} - \mu_{\text{survive}}$: $(26.27 - 25.47) \pm t^*(.366)$
Using Table III with 23 degrees of freedom, $t^* = 2.069$, so the confidence interval is (.0427, 1.557). Using Minitab, the interval is (0.064, 1.541). Using the applet, the interval is (0.042, 1.558). You are 95% confident the population average weight of the house sparrows that would die in such a storm is between 0.06 grams and 1.5 grams greater than the population average weight of sparrows that would survive.
c. Alar extent: $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$
The test statistic is $t = \dfrac{247 - 247.69}{\sqrt{\dfrac{3.86^2}{35} + \dfrac{3.79^2}{24}}} = -0.68$.
Using Table III with 23 degrees of freedom, 0.68 < 0.858 (comparing the absolute value of the test statistic), so the p-value is off the chart. This means the p-value > 2(.2) = .4. Using Minitab, the p-value is .501. Using the applet, the p-value is .5022. Because the p-value is not small, do not reject $H_0$. There is no evidence of a difference in mean alar extent in the sparrows that died and the sparrows that survived.
95% confidence interval for $\mu_{\text{die}} - \mu_{\text{survive}}$: $(247 - 247.69) \pm t^*(1.01)$
Using Table III with 23 degrees of freedom, $t^* = 2.069$, so the confidence interval is (−2.784, 1.404). Using Minitab, the interval is (−2.72, 1.35). Using the applet, the interval is (−2.784, 1.404).
Length of head and beak: $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$
The test statistic is $t = \dfrac{31.671 - 31.614}{\sqrt{\dfrac{.631^2}{35} + \dfrac{.611^2}{24}}} = 0.34$.
Using Table III with 23 degrees of freedom, 0.34 < 0.858, so the p-value is off the chart, which means p-value > 2(.2) = .4. Using Minitab, the p-value is .732. Using the applet, the p-value is .7315. With the large p-value, do not reject $H_0$. There is no evidence of a difference in mean length of head and beak between the sparrows that would die and the sparrows that would survive.
95% confidence interval for $\mu_{\text{die}} - \mu_{\text{survive}}$: $(31.671 - 31.614) \pm t^*(.164)$
Using Table III with 23 degrees of freedom, $t^* = 2.069$, so the confidence interval is (−.283, .397). Using Minitab, the interval is (−0.273, 0.386). Using the applet, the interval is (−0.282, 0.396).
Humerus bone length: $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$
The test statistic is $t = \dfrac{.7279 - .738}{\sqrt{\dfrac{.0198^2}{35} + \dfrac{.0235^2}{24}}} = -1.72$.
Using Table III with 23 degrees of freedom, 1.714 < 1.72 < 2.069 (comparing the absolute value of the test statistic), so 2(.025) < p-value < 2(.05), which means .05 < p-value < .1. Using Minitab, the p-value is .092. Using the applet, the p-value is .0976. With a p-value greater than .05, do not reject $H_0$ at the 5% level (but it would be rejected at the 10% level). There is not enough statistical evidence to conclude there is a difference in mean humerus bone length between the sparrows that would die and the sparrows that would survive.
95% confidence interval for $\mu_{\text{die}} - \mu_{\text{survive}}$: $(.7279 - .738) \pm t^*(.0058)$
Using Table III with 23 degrees of freedom, $t^* = 2.069$, so the confidence interval is (−0.022, 0.002). Using Minitab, the interval is (−0.0219, 0.00173). Using the applet, the interval is (−0.022, 0.002).
Femur bone length: $H_0: \mu_{\text{die}} = \mu_{\text{survive}}$, $H_a: \mu_{\text{die}} \neq \mu_{\text{survive}}$
The test statistic is $t = \dfrac{.7065 - .7168}{\sqrt{\dfrac{.0225^2}{35} + \dfrac{.0203^2}{24}}} = -1.84$.
Using Table III with 23 degrees of freedom, 1.714 < 1.84 < 2.069 (comparing the absolute value of the test statistic), so 2(.025) < p-value < 2(.05), which means .05 < p-value < .10. Using Minitab, the p-value is .092. Using the applet, the p-value is .0800.
With a p-value greater than .05, do not reject H0 at the 5% level (but would reject at the 10% level). There is not enough statistical evidence to conclude there is a difference between mean femur bone length between the sparrows that would die and the sparrows that would survive. 95% confidence interval for die survive: .7065 .7168 t*(.0056) Using Table III with 23 degrees of freedom, t* 2.069, so the confidence interval is (.022, .001). Using Minitab, the interval is (0.02166, 0.00092). Using the applet, the interval is (0.022, 0.001). Tibiotarsus bone length: H0: die survive, Ha: die survive 1.1202 1.1353 1.55. ______________ The test statistic is t _______________ 2 .0377 _____ .036 2 ______ 35 24
√
Using Table III with 23 degrees of freedom, 1.319 < 1.55 < 1.714, so 2 × .05 < p-value < 2 × .10, which means .10 < p-value < .20. Using Minitab, the p-value is .129. Using the applet, the p-value is .1325. With a p-value greater than .05, do not reject H0 at the 5% level. There is no evidence of a difference in mean tibiotarsus bone length between the sparrows that would die and the sparrows that would survive. 95% confidence interval for $\mu_{die} - \mu_{survive}$: $(1.1202 - 1.1353) \pm t^*(.0098)$. Using Table III with 23 degrees of freedom, t* = 2.069, so the confidence interval is (-.035, .005). Using Minitab, the interval is (-.03492, .00457). Using the applet, the interval is (-.036, .005).
Skull width: H0: $\mu_{die} = \mu_{survive}$, Ha: $\mu_{die} \neq \mu_{survive}$. The test statistic is $t = \dfrac{.6036 - .6025}{\sqrt{\frac{.0139^2}{35} + \frac{.0126^2}{24}}} = 0.33$.
Using Table III with 23 degrees of freedom, 0.33 < 0.858, so the p-value is off the chart, which means the p-value > 2 × .2 = .4. Using Minitab, the p-value is .745. Using the applet, the p-value is .755. With the large p-value, do not reject H0. There is no evidence of a difference in mean skull width between the sparrows that would die and the sparrows that would survive. 95% confidence interval for $\mu_{die} - \mu_{survive}$: Using Table III with 23 degrees of freedom, t* = 2.069, so the confidence interval is (-.006, .008). Using Minitab, the interval is (-0.00585, 0.00813). Using the applet, the interval is (-0.006, 0.008).
Keel of sternum: H0: $\mu_{die} = \mu_{survive}$, Ha: $\mu_{die} \neq \mu_{survive}$. The test statistic is $t = \dfrac{.8457 - .8576}{\sqrt{\frac{.0372^2}{35} + \frac{.0332^2}{24}}} = -1.28$.
Using Table III with 23 degrees of freedom, 1.319 > 1.28 > 0.858, so 2 × .1 < p-value < 2 × .2, which means .2 < p-value < .4. Using Minitab, the p-value is .207. Using the applet, the p-value is .2108. Because the p-value is not small, do not reject H0. There is no evidence of a difference in mean keel of sternum between the sparrows that would die and the sparrows that would survive. 95% confidence interval for $\mu_{die} - \mu_{survive}$: $(.8457 - .8576) \pm t^*(.0092)$. Using Table III with 23 degrees of freedom, t* = 2.069, so the confidence interval is (-.031, .007). Using Minitab, the interval is (-0.03038, 0.00673). Using the applet, the interval is (-0.031, 0.007).
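The two-sample calculations in this activity all follow the same pattern, which can be sketched in Python. This is a minimal illustration rather than part of the original solution: the function names are hypothetical, the inputs are the rounded keel-of-sternum summary statistics quoted above, and t* = 2.069 is the Table III value for 23 degrees of freedom. Because the inputs are rounded, results may differ from the printed answers in the last digit.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return the unpooled two-sample t statistic and its standard error."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se, se

def two_sample_ci(mean1, mean2, se, t_star):
    """Return (low, high) for mean1 - mean2 using a supplied critical value."""
    diff = mean1 - mean2
    return diff - t_star * se, diff + t_star * se

# Keel of sternum summary statistics, taken from the solution above.
t_stat, se = two_sample_t(0.8457, 0.0372, 35, 0.8576, 0.0332, 24)
low, high = two_sample_ci(0.8457, 0.8576, se, t_star=2.069)  # t* from Table III, 23 df
print(round(t_stat, 2), round(low, 3), round(high, 3))
```

Swapping in the summary statistics for any of the other bone measurements reproduces the corresponding test statistic and interval in the same way.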
Activity 22-22: Close Friends a. The null hypothesis is that the population proportion of men who would respond to the survey question with zero names is the same as the population proportion of women who would respond to the survey question with zero names. In symbols, H0: $\pi_M = \pi_W$. The alternative hypothesis is that the population proportion of men who would respond to the survey question with zero names is not the same as the population proportion of women who would respond to the survey question with zero names. In symbols, Ha: $\pi_M \neq \pi_W$. Technical conditions: The sample sizes are both greater than 30 (654 and 813), and the data are from a random sample from one population, divided into two independent groups. The technical conditions are met. $\hat{p}_M = .3$ and $\hat{p}_W = .247$. The test statistic is $z = \dfrac{.3 - .247}{\sqrt{(.271)(1 - .271)\left(\frac{1}{654} + \frac{1}{813}\right)}} = 2.25$.
Using Table II, p-value = 2 × P(Z > 2.25) = 2 × .0122 = .0244. Reject H0 at the .05 level of significance because the p-value is less than .05. You have moderate statistical evidence the population proportion of men who would respond to the survey question with zero names is not the same as the population proportion of women who would respond to the survey question with zero names. b. The null hypothesis is that the population proportion of men who would respond to the survey question with six names is the same as the population proportion of women who would respond to the survey question with six names. In symbols, H0: $\pi_M = \pi_W$. The alternative hypothesis is that the population proportion of men who would respond to the survey question with six names is not the same as the population proportion of women who would respond to the survey question with six names. In symbols, Ha: $\pi_M \neq \pi_W$. $\hat{p}_M = .05$ and $\hat{p}_W = .046$. The test statistic is $z = \dfrac{.05 - .046}{\sqrt{(.0477)(1 - .0477)\left(\frac{1}{654} + \frac{1}{813}\right)}} = 0.36$.
With the large p-value, do not reject H0 at any commonly used level of significance. You have no statistical evidence that the population proportion of men who would respond to the survey question with six names differs from the population proportion of women who would respond to the survey question with six names.
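The pooled two-proportion z-test used in both parts of Activity 22-22 can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the sample proportions are the rounded values quoted in the solution, and the two-sided p-value comes from the standard normal distribution via `math.erfc` instead of Table II, so the output can differ slightly from the printed values.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (pooled proportion, z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, z, p_value

# Part a: zero names (men .3 of 654, women .247 of 813).
pooled_a, z_a, p_a = two_proportion_z(0.3, 654, 0.247, 813)
# Part b: six names (men .05 of 654, women .046 of 813).
pooled_b, z_b, p_b = two_proportion_z(0.05, 654, 0.046, 813)
print(round(z_a, 2), round(p_a, 4))
print(round(z_b, 2), round(p_b, 4))
```

Part a gives a small p-value and part b a large one, matching the test decisions reached above.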
Activity 22-23: Hypothetical SAT Coaching a. The null hypothesis is that the mean improvement in SAT scores would be the same for both the coaching and control conditions. In symbols, H0: $\mu_{coaching} = \mu_{control}$. The alternative hypothesis is that the mean improvement in SAT scores would be greater for the coaching condition than for the control condition (no coaching). In symbols, Ha: $\mu_{coaching} > \mu_{control}$. The test statistic is $t = \dfrac{46.2 - 44.4}{\sqrt{\frac{15.3^2}{2500} + \frac{14.4^2}{2500}}} = 4.28$.
Using Table III with 500 degrees of freedom, 4.28 > 3.310, so p-value < .0005. Using Minitab and the applet, p-value ≈ .0000. With such a small p-value, reject H0 at any commonly used significance level. You have extremely strong statistical evidence that the mean improvement in SAT scores when coached is greater than the mean improvement in SAT scores when not coached. b. For a 99% CI, you calculate $(46.2 - 44.4) \pm t^*(.420) = (.717, 2.883)$. You are 99% confident the population mean improvement in SAT scores when coached is between .717 and 2.883 points greater than the population mean improvement when not coached. c. Yes, the data provide very strong evidence that SAT coaching is helpful. The p-value is what helps you answer this question. The p-value is very small (≈ 0), indicating that you would essentially never see a sample result like this by chance alone if the SAT coaching had no effect. d. No, the data indicate that SAT coaching will only improve SAT scores by at most 2.9 points on average, which means the coaching is not very helpful. You can tell this from the confidence interval.
Activity 22-24: Hypothetical SAT Coaching a. A Type I error would be deciding that the SAT coaching is helpful when it really is not helpful. b. A Type II error would be failing to realize the SAT coaching is helpful when it is.
Activity 22-25: Hypothetical ATM Withdrawals a. For a 90% CI, you calculate $(70 - 70) \pm t^*(6.06) = (-10.6, 10.6)$. b. Yes, this interval includes the value zero.
Activity 22-22, part b (continued): Using Table II, p-value = 2 × P(Z > 0.36) = 2 × .3594 = .7188.
c. No, this interval would not be any different if you had chosen a different pair of machines because the sample means, sample sizes, and sample standard deviations are the same for all three machines. d. No; in spite of the common sample sizes, means, and standard deviations, as you learned in Activities 9-24 and 19-21, these three machines are not identical in their distributions of withdrawal amounts. (See the description in Activity 19-21, part a.) The two-sample t-test does not allow you to say anything about whether the shapes of the distributions differ—only that the means do or do not differ.
Activity 22-26: Sporting Examples a. The null hypothesis is that the population mean total number of points earned in both types of statistics sections is the same. In symbols, H0: $\mu_{regular} = \mu_{sports}$. The alternative hypothesis is that the population mean total number of points earned in both types of statistics sections is not the same. In symbols, Ha: $\mu_{regular} \neq \mu_{sports}$. Technical conditions: The sample sizes are not quite large enough (n = 29, 28), but probability plots indicate the data plausibly come from normal distributions. However, the data were not from two independent random samples from two populations or from random assignment. Because of the times of the classes, the samples are probably not even representative.
[Normal probability plots of Total Points for the Regular and Sports sections; vertical axis: Percentage]
The test statistic is $t = \dfrac{338.07 - 307.25}{\sqrt{\frac{52.51^2}{29} + \frac{30.6^2}{28}}} = 2.69$.
Using Table III with 27 degrees of freedom, 2.473 < 2.699 < 2.771, so 2 × .005 < p-value < 2 × .01, which means .01 < p-value < .02. Using Minitab, the p-value is .010. Using the applet, the p-value is .012. Because the p-value is less than .10, reject H0 at the .10 significance level.
Using Table III with 27 degrees of freedom, t* = 1.703, so the confidence interval is (11.338, 50.302). Using Minitab, the interval is (11.6, 50.0). Using the applet, the interval is (11.343, 50.297). You are 90% confident the population mean total points earned in the regular section is between 11.5 and 50 points greater than the population mean total points earned in the sports-themed section. c. No; this was an observational study, not an experiment, so you cannot draw a cause-and-effect conclusion between sports-themed examples and worse performance in the course. The students self-selected which course they wanted to take, and the times that the courses were offered were very different and may have affected their decisions and performance more than the examples used in the course. d. No, you probably should not generalize the conclusions to all students who study statistics. Students at this university may not be representative of all university students.
Activity 22-27: Your Choice Answers will vary.
Assessment Sample Quiz 22A
•••
Random samples of monthly rent prices, in dollars, for studio and one-bedroom apartments, were obtained for the Pennsylvania cities of Harrisburg and Philadelphia in July 2007. Summary statistics are calculated here:

               Sample Size   Sample Mean   Sample SD
Harrisburg     10            $618.3        $85.3
Philadelphia   15            $760.0        $245.4

Conduct a test of whether the sample data provide sufficient evidence at the .10 significance level to conclude that the population mean rent price differs between these two cities. Be sure to report: 1. Null and alternative hypotheses 2. Test statistic 3. p-value 4. Test decision and conclusion in context 5. Check of technical conditions, mentioning any additional information that you would need to conduct this check
Solution to Sample Quiz 22A
•••
1. The null hypothesis is that the population mean rent price for studio and one-bedroom apartments in July 2007 is the same for both Harrisburg and Philadelphia. In symbols, H0: $\mu_H = \mu_P$.
Activity 22-26, part b: Confidence interval: $(338.07 - 307.25) \pm t^*(11.44)$
Topic 23: Analyzing Paired Data
•••
Homework Activities Activity 23-6: Cow Milking • Design A: completely randomized analysis • Design B: matched-pairs analysis • Design C: matched-pairs analysis
Activity 23-7: Car Ages a. This study calls for an independent-samples analysis. There is no link between the students and the faculty members in the sample. In this particular case, you could not do a matched-pairs analysis because the sample sizes are different. b. No, the answer would not change if you sampled 25 students and 25 faculty members. There would still be no link between the students and the faculty members. You could change the order of the students and/or the faculty members without affecting the analysis.
Activity 23-8: Freshman Fifteen Yes, a paired t-test would be appropriate for these data. The data would be linked as pairs to the same freshman, before and after the term. These data would not have been drawn independently from two distinct groups.
Activity 23-9: Sickle Cell Anemia and Child Development a. This is an observational study. The researchers did not determine which children would/would not have sickle cell anemia. b. This is a matched-pairs study. Each child with the sickle cell trait was matched with a child in the control group. c. The p-values must be large (greater than .10) if this study found no statistically significant differences. The researchers would not have rejected any of their null hypotheses in this study.
Activity 23-10: Running to Home a. Find a (random) sample of baseball or softball players, preferably 30 or more. Have each player run from second base to home twice: once by taking a wide angle around third base and once by taking a narrow angle around third base. Record their running times with a stopwatch. Calculate the difference in the times between the two runs (wide angle - narrow angle). b. Yes, randomization would play a useful role in a matched-pairs experiment on this issue. You would want to randomly decide the order in which each player runs the two angles around third base so fatigue would not be a confounding variable in the experiment. c. There could be a great deal of player-to-player variability in the times that it takes the players to run from second base to home. But there should be much less variability between their times running this distance by taking a wide angle or a narrow angle around third base. The matched-pairs design would reduce this variability.
Activity 23-11: Friendly Observers Have each subject play the video game twice: once when they know an observer with a vested interest is watching and once when they know an observer with no vested interest is watching. Be sure to randomly decide the order in which the type of observer is assigned to each playing of the game. Record the difference in the times it takes to navigate the obstacle course in the video game. The primary advantage to redoing this experiment with a matched-pairs design is that you could reduce the overall variability (eliminate the subject-to-subject variability to focus on the effect of the observer’s interest level).
Activity 23-12: Marriage Ages The null hypothesis is that the population mean difference in ages between husbands and their wives is zero. In symbols, H0: $\mu_d = 0$. The alternative hypothesis is that the population mean difference in ages between husbands and their wives is greater than zero. In symbols, Ha: $\mu_d > 0$.
Test statistic: $t = \dfrac{1.92}{5.047/\sqrt{100}} = 3.80$. Using Table III with 80 degrees of freedom, 3.80 > 3.416, so the p-value is off the chart. This means the p-value < 2 × 0.0005 = .001. Using Minitab and the applet, the p-value < .001. Test decision: With the very small p-value, reject H0 at the .05 significance level. Conclusion in context: You have very strong statistical evidence that the mean difference in ages between husbands and their wives is greater than zero for this population (you don’t have much information about the population from which this sample was selected, but you are assuming it is representative). 95% confidence interval for $\mu_d$: (0.919, 2.921). You can be 95% confident husbands are between .92 years and 2.92 years older than their wives, on average. With the larger sample size in this activity, you have much stronger statistical evidence against the null hypothesis than you did in Activity 23-1. The confidence interval is also narrower.
Activity 23-13: Catnip Aggression a. You know these data came from a matched-pairs design because there are two observations (before catnip and after catnip) recorded for each cat. b. Technical conditions: The required technical conditions are that the sample size be large (it is not) or that the population of differences follow a normal distribution (which, according to a probability plot, is plausible). This was not a random sample of cats but may be representative of cats at an animal shelter. c. Let $\mu_d$ represent the population mean of the differences in number of negative cat interactions before and after catnip exposure. H0: $\mu_d = 0$ vs. Ha: $\mu_d \neq 0$. The test statistic is $t = \dfrac{-1}{1.195/\sqrt{15}} = -3.24$.
Using Table III with 14 degrees of freedom, 3.787 > 3.24 > 2.977, so 2 × .001 < p-value < 2 × .005. This means .002 < p-value < .01. Using Minitab, the p-value is .006. Using the applet with 14 degrees of freedom, the p-value is .0059. Because the p-value, .006, is less than .05, reject H0 at the .05 level. You have very strong statistical evidence the population mean difference in number of negative cat interactions before and after exposure is not 0. d. 90% CI for $\mu_d$: (-1.543, -0.457). You are 90% confident that, on average, the number of negative cat interactions after exposure to catnip is between .46 and 1.54 greater than the number of negative cat interactions before exposure to catnip.
Activity 23-14: Catnip Aggression a. The null hypothesis is that the population mean number of negative cat interactions is the same before and after exposure to catnip. In symbols: H0: $\mu_{before} = \mu_{after}$. The alternative hypothesis is that the population mean number of negative cat interactions is not the same before exposure to catnip as it is after. In symbols: Ha: $\mu_{before} \neq \mu_{after}$. The test statistic is $t = \dfrac{1.8 - 2.8}{\sqrt{\frac{2.37^2}{15} + \frac{1.66^2}{15}}} = -1.34$.
Using Table III with 14 degrees of freedom, 1.345 > 1.34 > 0.868, so 2 × .1 < p-value < 2 × .2. This means .20 < p-value < .40. Using Minitab, the p-value is .192. Using the applet, the p-value is .2021. With a p-value of approximately .2, do not reject H0 at any common significance level. You do not have sufficient statistical evidence to conclude there is a difference in the mean number of negative cat interactions before and after exposure to catnip. b. The p-value has increased to the point where you cannot reject H0 at any reasonable significance level. c. Yes, this analysis indicates the pairing was useful. The variability was significantly reduced by using paired data by accounting for the cat-to-cat variation in amount of negative interactions.
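The contrast between the paired analysis (Activity 23-13) and the independent-samples analysis (Activity 23-14) can be reproduced from the summary statistics alone. The sketch below is illustrative; the function names are hypothetical, and the inputs are the rounded values quoted in the two solutions (mean difference of -1 with SD 1.195 for the paired data, and group means 1.8 and 2.8 with SDs 2.37 and 1.66).

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """t statistic for a paired test, based on the n within-pair differences."""
    return mean_diff / (sd_diff / math.sqrt(n))

def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic, which ignores the pairing."""
    return (mean1 - mean2) / math.sqrt(sd1**2 / n1 + sd2**2 / n2)

t_paired = paired_t(-1, 1.195, 15)
t_indep = independent_t(1.8, 2.37, 15, 2.8, 1.66, 15)
print(round(t_paired, 2), round(t_indep, 2))
```

The numerator is the same in both cases; only the standard error changes. Pairing removes the cat-to-cat variation from the denominator, which is why the paired statistic is so much larger in magnitude.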
Activity 23-15: Muscle Fatigue a. The null hypothesis is that the population mean time to fatigue is the same for healthy young women and men. In symbols: H0: $\mu_W = \mu_M$. The alternative hypothesis is that there is a difference in the population mean time to fatigue between healthy young women and men. In symbols: Ha: $\mu_W \neq \mu_M$. The test statistic is $t = \dfrac{1408 - 513}{\sqrt{\frac{194^2}{10} + \frac{1133^2}{10}}} = 2.46$.
Using Table III with 9 degrees of freedom, 2.262 < 2.46 < 2.821, so 2 × .01 < p-value < 2 × .025. This means .02 < p-value < .05. Using technology, the p-value is .036.
Because a p-value of .036 is less than .05, reject H0 at the .05 level. You have moderate statistical evidence that there is a difference in the population mean time to fatigue between healthy young women and men. b. The p-value has not changed from its value in the paired test. c. No, this analysis indicates that pairing was not particularly useful in this study. You obtain exactly the same test statistic and p-value regardless of whether you pair the data. (A plot of the male times versus the female times reveals there is not much relationship between these variables.)
Activity 23-16: Maternal Oxygenation a. A paired test would be appropriate here because you have two measurements on the same fetus before and after the mother was administered 40% oxygen.
b. Technical conditions: You do not know whether the sample was randomly selected and the amount of oxygen was not randomized. The sample size is small (< 30), but a probability plot indicates that it is reasonable to believe the differences have come from a normally distributed population.
[Normal probability plot of the differences; vertical axis: Percentage, horizontal axis: Difference in Oxygen Level]
c. Let $\mu_d$ represent the population mean of the differences in fetus oxygen levels after and before the mother was administered 40% oxygen (after minus before). H0: $\mu_d = 0$ vs. Ha: $\mu_d > 0$. The test statistic is $t = \dfrac{0.0487}{0.0624/\sqrt{24}} = 3.83$. Using Table III with 23 degrees of freedom, 3.83 > 3.768, so the p-value is off the chart. This means p-value < .0005. Using Minitab and the applet, p-value ≈ .0000. With such a small p-value, reject H0 at any commonly used significance level. You have very strong statistical evidence the population fetus oxygen level is higher, on average, after the mother was administered 40% oxygen than at baseline.
d. A paired test is appropriate for these data as well. As before, you do not know whether the sample was randomly selected. The sample size is small (< 30), but a probability plot indicates that it is reasonable to believe the differences have come from a normally distributed population.
[Normal probability plot of the differences; vertical axis: Percentage, horizontal axis: Difference in Oxygen Level]
Let $\mu_d$ represent the population mean of the differences in fetus oxygen levels after the mother was administered 100% oxygen and after she was administered 40% oxygen. H0: $\mu_d = 0$ vs. Ha: $\mu_d > 0$. The test statistic is $t = \dfrac{0.0158}{0.0844/\sqrt{24}} = 0.92$. Using Table III with 23 degrees of freedom, 0.858 < 0.92 < 1.319, so .10 < p-value < .20. Using technology, the p-value is .184. Do not reject H0 at the .10 significance level (p-value .184 > .10). You do not have convincing statistical evidence that the population fetus oxygen level is higher, on average, after the mother was administered 100% oxygen than it was after she was administered 40% oxygen.
Activity 23-17: Mice Cooling a. These data call for a matched-pair analysis because each mouse’s body was used twice to measure the cooling constant—once when freshly killed and once when reheated. b. Here is the numerical summary and graphical display:

              n    x̄      s      Min.   QL    Median   QU   Max.
Differences   19   20.8   45.8   -45    -11   10       45   139

[Dotplot of the differences; horizontal axis: Cooling Constant Difference]
Twelve of the 19 mice had cooling constants that were greater when they were freshly killed than when they were reheated. The dotplot of differences appears to have at least one extreme outlier at 139. The mean difference in cooling constants was 20.8, the standard deviation was 45.8.
c. Let $\mu_d$ represent the mean of the population of differences in cooling constants of the mice when they are freshly killed and when they are reheated. H0: $\mu_d = 0$ vs. Ha: $\mu_d \neq 0$. Technical conditions: You have no indication that these mice were randomly selected from the population of all mice. The sample size is small (< 30), and the extreme outlier indicates that these data may not have come from a normal population of differences. You should stop the analysis at this point, but you will continue with the calculations for the sake of practice, remembering that you cannot take the results very seriously.
The test statistic is $t = \dfrac{20.8}{45.8/\sqrt{19}} = 1.98$. Using Table III with 18 degrees of freedom, 1.734 < 1.98 < 2.101, so 2 × .025 < p-value < 2 × .05. This means .05 < p-value < .10. Using technology, the p-value is .064. Do not reject H0 at the .05 level but do reject H0 at the .10 level. You have some statistical evidence to conclude that the cooling constants of freshly killed mice and reheated mice differ, on average (but remember that you did not meet the technical conditions). d. 95% CI for $\mu_d$: Using Table III with 18 degrees of freedom, t* = 2.101, so the confidence interval is (-1.28, 42.88). Using Minitab, the interval is (-1.3, 42.9). Using the applet, the interval is (-1.275, 42.875). If the technical conditions had been met, you would be 95% confident the cooling constant of freshly killed mice is between -1.3 and 42.9 more than that of reheated mice, on average. Note: A 90% confidence interval would not contain zero.
Activity 23-18: Clean Country Air
[Dotplot of the differences; horizontal axis: Difference]
This dotplot of the differences in the percentages of radioactivity remaining after one hour in the urban (city) twin minus the rural (country) twin shows that with one exception there was more radioactivity in the urban twins. The mean difference was 4.03 percentage points, the median 6 percentage points, and the standard deviation was 10.15 percentage points. Define parameter of interest: Let $\mu_d$ represent the population mean of the differences in radioactive percentages in the urban environments minus the rural environments. H0: $\mu_d = 0$ vs. Ha: $\mu_d \neq 0$. Check technical conditions: These twins could not have been randomly selected, as they were identical twins living in different environments. So you should be cautious about generalizing your results. Though the sample size is very small (< 30), the probability plot indicates that it is reasonable to believe the population distribution of differences is normal.
[Normal probability plot of the differences; vertical axis: Percentage, horizontal axis: Difference]
Test statistic: $t = \dfrac{4.03}{10.15/\sqrt{7}} = 1.05$. Using Table III with 6 degrees of freedom, 0.906 < 1.05 < 1.440, so 2 × .10 < p-value < 2 × .20. This means .20 < p-value < .40. Using technology, the p-value is .334. Test decision: Do not reject H0 at the .05 level (p-value .334 > .05). Conclusion in context: You do not have sufficient statistical evidence to conclude there is a difference in the population radioactive percentages in the two environments on average. So, you have no reason to think that a person’s lungs clear out unhealthy particles any slower in urban environments than in rural ones (as long as this sample is representative).
Activity 23-19: Exam Score Improvements a. The following dotplot shows the improvements in scores from exam 1 to exam 2:
[Dotplot of the improvements; horizontal axis: Improvement]
b. Student 19 had the greatest improvement in exam scores. He or she improved 11 points (from 84 to 95). c. Student 8 had the greatest decline in exam scores. He or she declined 30 points (from 94 to 64). d. Five of the 23 (.2174) students scored higher on exam 2 than on exam 1. e. The sample mean improvement is -8.7 points. The sample SD is 10.73 points. f. Let $\mu_d$ represent the population mean improvement in the exam scores from exam 1 to exam 2. The null hypothesis is that the population mean improvement from exam 1 to exam 2 is 0 (no improvement or decline in the mean scores). In symbols, H0: $\mu_d = 0$. The alternative hypothesis is that the population mean improvement from exam 1 to exam 2 differs from 0 (i.e., the mean score either improved or declined). In symbols, Ha: $\mu_d \neq 0$.
The test statistic is $t = \dfrac{-8.7}{10.73/\sqrt{23}} = -3.89$. Using Table III with 22 degrees of freedom, 3.89 > 3.792, so the p-value is off the chart. This means p-value < 2 × .0005 = .001. Using technology, the p-value is .001.
g. Yes, with this small p-value, reject the null hypothesis at the .10, .05, and .01 significance levels. h. 95% CI for $\mu_d$: (-13.34, -4.05). You are 95% confident the population mean score on exam 1 is between 4.05 and 13.34 points higher, on average, than the population mean score on exam 2. i. Technical conditions: The sample size is not large (n = 23 < 30), but a probability plot of the improvements indicates that they could have come from a normally distributed population.
[Normal probability plot of the improvements; vertical axis: Percentage, horizontal axis: Improvement]
However, the scores were not gathered from a simple random sample, so you should be cautious in generalizing your results beyond this particular statistics class. For this class, you had very strong statistical evidence that the exam scores had declined, on average (p-value ≈ .001), between 4 and 13.3 points from the first to the second exam (with 95% confidence). This is a decline of about a half to more than a full letter grade, so it would be a practically significant decline in the exam scores.
Activity 23-20: Exam Score Improvements a. The test statistic is $t = \dfrac{-12}{19.29/\sqrt{24}} = -3.05$. Using Table III with 23 degrees of freedom, 3.485 > 3.05 > 2.807, so 2 × .001 < p-value < 2 × .005. This means .002 < p-value < .01. Using technology, the p-value is .006. 95% CI for $\mu_d$: (-20.15, -3.85). b. In the significance test, it makes almost no difference whether you disregard the missing value or treat it as a 0. The results of the significance test are unchanged; the p-value increases when you treat the exam as a 0, but only slightly (from .001 to .006).
However, the 95% confidence interval becomes about 7 points wider, and virtually all these points are on the low end of the interval. This result means that the decline in scores from exam 1 to exam 2 could be as much as 7 points more when you treat the missing score as a 0. This is to be expected; if you treat this exam score as a 0, the difference in scores for this student is a –88, which is a fairly significant increase in the sample mean (more negative) and sample standard deviation.
Activity 23-21: Exam Score Improvements a. Let $\pi$ represent the proportion of test takers who improved from exam 1 to exam 2. Test: H0: $\pi = .5$ vs. Ha: $\pi < .5$. Use a one-proportion z-test. b. Technical conditions: $n\pi_0 = n(1 - \pi_0) = 11.5 \geq 10$, so this technical condition is satisfied. However, the data were not from a simple random sample of statistics students’ exam scores, so you should be cautious about generalizing these results beyond this particular statistics class. The test statistic is $z = \dfrac{.217 - .5}{\sqrt{\frac{(.5)(.5)}{23}}} = -2.71$.
p-value = Pr(Z < -2.71) = .0033. As .0033 < .01, reject H0 at the $\alpha$ = .01 level. You have strong statistical evidence that the proportion of test takers in the population you are willing to generalize to from this statistics class who improved from exam 1 to exam 2 is less than one-half.
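The one-proportion z-test above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, the sample proportion is taken as 5/23 (the five students who improved), and the lower-tail p-value is computed from the standard normal distribution with `math.erfc` in place of a table lookup.

```python
import math

def one_proportion_z(p_hat, pi0, n):
    """z statistic for H0: pi = pi0, using the null standard error."""
    return (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

def normal_cdf(z):
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

n, pi0 = 23, 0.5
expected_count = n * pi0            # technical condition: should be at least 10
z = one_proportion_z(5 / n, pi0, n)
p_value = normal_cdf(z)             # lower tail, because Ha is pi < .5
print(expected_count, round(z, 2), round(p_value, 4))
```

For a two-sided alternative, as in Activity 23-23, the tail probability would be doubled.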
Activity 23-22: Comparison Shopping a. You know that these data are from a matched-pairs design because each item was priced in both grocery stores. The primary advantage of pairing the data in this manner is that there is less variability in the price differences than in the prices among diverse products. The prices themselves range from a low of about $.50 to a high of about $7.00, but the price differences have a range of less than $3.00. b. The price differences are (Luckys – Vons): 0.10, 0.76, 0.15, 0.01, 0.23, 0.86, 0.01, 0, 0.20, 0, 0, 0.10, 0, 0.07, 0.04, 0, 0.31, 0.49, 0.10, 0.07, 0.20, 0.51, 0.13, 0.30, 0.08, .10, 1.82, 0.02, 0.02, 0.36, 0.10, 0.06, 0.10, 0.50. c. Here is the numerical summary and graphical display:

              n    x̄         s         Min.     QL      Median   QU     Max.
Differences   34   $0.0176   $0.4243   -$0.86   -$0.1   $0       $0.1   $1.82

[Dotplot of the price differences; horizontal axis: Difference]
It appears that neither store tends to have lower prices than the other. Most of the price differences are very close to 0, and the distribution looks fairly normal, except for the one high outlier of $1.82 (navel oranges).
d. The null hypothesis is that there is no difference in the prices of items in these two stores, on average. In symbols, H0: $\mu_d = 0$. The alternative hypothesis is that there is a difference in the prices of items in these two stores on average. In symbols, Ha: $\mu_d \neq 0$. With this null hypothesis, the sampling distribution of the sample mean would be approximately normal, centered at zero, with a standard deviation of roughly $0.4243/\sqrt{34} = 0.0728$. The test statistic is $t = \dfrac{0.0176}{0.4243/\sqrt{34}} = 0.24$.
Using Table III with 33 degrees of freedom, 0.24 < 0.852, so the p-value is off the chart. This means the p-value > 2 × .20 = .40. Using technology, the p-value is .810. As the p-value is not small, do not reject H0 at any commonly used significance level. You do not have convincing statistical evidence of a difference in the average prices of items in these two stores. e. 95% CI for $\mu_d$: (-.1304, .1657). You are 95% confident that prices at Luckys are, on average, anywhere from 13¢ lower to 16.6¢ higher than prices at Vons. f. If you had subtracted in the opposite order, the test statistic value would have been -.24, the p-value would be unchanged, and the confidence interval would be (-.1657, .1304).
g. If you remove the outlier (navel oranges), the sampling distribution of $\bar{x}$ would become approximately normal, centered at 0, with a standard deviation of roughly $.2846/\sqrt{33} = .0496$. The test statistic is $t = \dfrac{-.0370}{.2846/\sqrt{33}} = -0.74$. Using Table III with 32 degrees of freedom, 0.74 < 0.853, so the p-value is off the chart. This means p-value > 2 × .20 = .40. Using technology, the p-value is .461. As the p-value is not small, do not reject H0 at any commonly used significance level. You still do not have statistical evidence of a difference in the average prices of items in these two stores. 95% CI for $\mu_d$: (-.1379, .064). You are 95% confident you can pay anywhere from 13.8¢ less on average at Luckys to 6.4¢ less on average at Vons. Your conclusions in the significance test did not change, and the confidence interval did not change very much when you deleted this outlier. Perhaps navel oranges were on sale at Vons during the time these data were collected. This would explain the extreme price difference.
Activity 23-23: Comparison Shopping a. This would be a categorical variable; it would not have a numerical value. b. Thirteen of the items cost more at Luckys, 16 of the items cost more at Vons, and 5 of the items cost the same at both stores. c. If you ignore the items that cost the same at both stores, 29 items remain: 13/29 or .448 cost more at Luckys. d. Let $\pi$ represent the proportion of all grocery items that cost more at Luckys than at Vons. Test: H0: $\pi = .5$ vs. Ha: $\pi \neq .5$. Use a one-proportion z-test. e. Technical conditions: $n\pi_0 = n(1 - \pi_0) = 14.5 \geq 10$, so this technical condition is satisfied. You do not know, however, whether this is a simple random sample of items available at both stores, so you are unable to check this technical condition. You should be cautious about generalizing your conclusions to all items at all Luckys and Vons stores in this town because you do not know whether this was a random sample. The test statistic is $z = \dfrac{.448 - .5}{\sqrt{\frac{(.5)(.5)}{29}}} = -0.56$.
p-value = 2 Pr(Z 0.56) 2 .2912 .5824 Because the p-value is not small, do not reject H0 at any commonly used significance level. You have no real statistical evidence that the proportion of all grocery items that cost more at Luckys than Vons is different from one-half. f. 96% CI for : You calculate .448 (2.05)(.093) (.259, .638). You are 96% confident that between 25.9% and 63.8% of products cost more at Luckys than Vons (assuming this is a random sample).
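The one-proportion z-test in part e can be sketched in Python as follows. The function names are illustrative, and the normal probability comes from the error function rather than a table, so the p-value differs slightly from the table-based .5824.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_proportion_z_test(successes, n, pi0):
    """Return (z, two-sided p-value) for H0: pi = pi0."""
    p_hat = successes / n
    se0 = math.sqrt(pi0 * (1.0 - pi0) / n)   # standard error assuming H0 is true
    z = (p_hat - pi0) / se0
    p_value = 2.0 * normal_cdf(-abs(z))
    return z, p_value

# 13 of the 29 items with different prices cost more at Luckys.
z, p_value = one_proportion_z_test(13, 29, 0.5)
print(round(z, 2), round(p_value, 3))
```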
Assessment Sample Quiz 23A
•••
For each of the following situations, indicate whether it calls for: A. An independent-samples comparison of proportions B. An independent-samples comparison of means C. A matched-pairs comparison of means 1. A sample of 10 faculty members and a sample of 20 students are asked how much time they spent on schoolwork last weekend. The goal is to test whether one group tends to spend more weekend time on average than the other on schoolwork. 2. Ten diamond rings are shown to two jewelers and each jeweler is asked to determine the appraisal value for each ring. The goal is to test whether the two jewelers give the same appraisal values on average.
Technology calculates the p-value to be about .88, as seen in the following graph:
[Graph: chi-square distribution with the test statistic 3.742 marked; horizontal axis: Chi-Square Values (df = 8)]
e. This is a very large p-value, much greater than all common significance levels, so fail to reject the null hypothesis that Benford’s probabilities adequately model these data.
f. The large p-value (based on the small value of the chi-square test statistic) from this chi-square analysis reveals the sample data are extremely consistent with the Benford probabilities. If Benford’s model were correct, it would not be the least bit surprising to observe sample data as found with these nations’ populations. In other words, the sample data provide no reason to doubt Benford’s model describes the distribution of leading digits in the almanac.
g. An accountant or IRS agent could apply a chi-square test to see how well the numbers in a tax return follow Benford’s probabilities. If the p-value turns out to be very small, that suggests the leading digits of numbers on that tax return differ substantially from what Benford’s probabilities predict. This certainly does not prove the tax return is fraudulent, but it might suggest the numbers were made up and not legitimate. A tax return whose leading digits do not follow Benford’s probabilities might be worth a closer look.
•••
Homework Activities
Activity 24-6: Mendel’s Peas
Define parameters of interest: Let π_RY refer to the proportion of all peas that are round and yellow, and so on.
H0: π_RY = 9/16, π_WY = 3/16, π_RG = 3/16, π_WG = 1/16
Ha: At least one of these hypothesized values is incorrect.
Check technical conditions: The expected counts are all at least five. You do not know if these data are a random sample of peas, but they are most likely a representative sample.

| | Round, Yellow | Wrinkled, Yellow | Round, Green | Wrinkled, Green | Total |
|---|---|---|---|---|---|
| Observed Count | 315 | 101 | 108 | 32 | 556 |
| Expected Count | 312.75 | 104.25 | 104.25 | 34.75 | 556 |
| (O − E)²/E | .016187 | .101319 | .13489 | .217626 | .4700 |
Topic 24: Goodness-of-Fit Tests
Using Minitab, the test statistic is X² = 0.47 with 3 degrees of freedom, and the p-value is .925. (Using Table IV, .47 < 4.64, so p-value > .2.)
[Graph: chi-square distribution with the test statistic 0.47 and the p-value .925 marked]
Test decision: With such a large p-value, do not reject H0. Conclusion in context: You have no reason to doubt that these peas appear in the proportions predicted by Mendel’s theory. If the peas appear in these proportions, you would see a sample result this or more extreme by random chance alone about 92.5% of the time—which means this result is a very common occurrence when the null hypothesis is true.
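The goodness-of-fit calculation for Mendel’s peas can be retraced with a minimal Python sketch. The function names are illustrative; the upper-tail formula is a closed form that holds only for 3 degrees of freedom and stands in for the Minitab output.

```python
import math

def chi_square_gof(observed, probs):
    """Return (expected counts, cell contributions, X^2) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probs]
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, contributions, sum(contributions)

def chi_square_upper_tail_df3(x):
    """Upper-tail area of the chi-square distribution with 3 df (valid for df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

observed = [315, 101, 108, 32]                  # RY, WY, RG, WG
probs = [9 / 16, 3 / 16, 3 / 16, 1 / 16]        # Mendel's hypothesized proportions
expected, contributions, x2 = chi_square_gof(observed, probs)
p_value = chi_square_upper_tail_df3(x2)

print(expected)
print(round(x2, 4), round(p_value, 3))
```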
Activity 24-7: Mendel’s Peas
a. The expected counts should not be all the same because the hypothesized proportions are not the same.
b. When the p-value is large (greater than .2), you do not reject the null hypothesis.
c. The decision to reject or not reject the null hypothesis is based on the value of the p-value, not the test statistic.
d. The test statistic is calculated by dividing each (observed − expected)² entry by the expected count, not by the observed count.
e. The expected counts here have all been rounded to the nearest integer, which can create a significant round-off error in the value of the test statistic.
Activity 24-8: Poohsticks
Define parameter of interest: Let π_i refer to the probability of a poohstick landing at resting point i.
H0: π_A = π_B = π_C = π_D = π_E = π_F = 1/6
Ha: At least one of these hypothesized values is not 1/6.
Check technical conditions: The expected counts all equal 11.83 ≥ 5, so this condition is met (even though two observed counts are less than 5). You do not know whether the fir cones were randomly selected or whether the drops from the bridge were random, so you should be extremely cautious about extending these results to any larger population.

| | A | B | C | D | E | F | Total |
|---|---|---|---|---|---|---|---|
| Observed Count | 9 | 0 | 5 | 3 | 31 | 23 | 71 |
| Expected Count | 11.833 | 11.833 | 11.833 | 11.833 | 11.833 | 11.833 | 71 |
| (O − E)²/E | .678 | 11.833 | 3.946 | 6.594 | 31.047 | 10.538 | 64.64 |

Using Table IV with 5 degrees of freedom, 64.64 > 22.11, so p-value < .0005. Using Minitab, p-value ≈ 0.
Test decision: With such a small p-value, reject H0.
Conclusion in context: You have very strong statistical evidence that these six resting points are not equally likely.
Activity 24-9: Leading Digits
a. The following bar graph displays the data:
[Bar graph: Leading Digits of Populations for Counties in California; horizontal axis: Leading Digit; vertical axis: Proportion]
This distribution of leading digits is strongly skewed to the right with more than 30% of the leading digits being 1.
b. Here are the expected counts:

| Leading Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Expected Count | 17.458 | 10.208 | 7.25 | 5.626 | 4.582 | 3.886 | 3.364 | 2.958 | 2.668 | 58 |

c. No, not all the expected counts are at least five.
d. Here are the observed counts:

| Leading Digit | 1 | 2 | 3 | 4 | 5 or 6 | 7, 8, or 9 | Total |
|---|---|---|---|---|---|---|---|
| Number of Counties | 19 | 11 | 4 | 8 | 6 | 10 | 58 |

e. Let π_i represent the probability of digit i in this process.
H0: π_1 = .301, π_2 = .176, π_3 = .125, π_4 = .097, π_5,6 = .146, π_7–9 = .155
Ha: At least one of these probabilities is incorrect.
Technical conditions: The expected counts are all at least five (after collapsing the table), with the smallest equal to 5.626. These data are not a random sample of the populations in California counties because you have analyzed all California counties in 2006. However, you might consider the 2006 population counts as realizations of some overall process; you have shown that the distribution of leading digits is within random sampling variability of what you would expect based on Benford’s model.

| Leading Digit | 1 | 2 | 3 | 4 | 5 or 6 | 7, 8, or 9 | Total |
|---|---|---|---|---|---|---|---|
| Number of Counties | 19 | 11 | 4 | 8 | 6 | 10 | 58 |
| Expected Count | 17.458 | 10.208 | 7.25 | 5.626 | 8.468 | 8.99 | 58 |
| (O − E)²/E | .1362 | .0615 | 1.4569 | 1.0018 | .7193 | .1135 | 3.489 |

The test statistic is X² = 3.489 with 5 degrees of freedom. Using Table IV, 3.489 < 7.29, so p-value > .2. Using Minitab, the p-value is 0.625. With the large p-value, do not reject H0. The sample data provide no reason to doubt Benford’s adjusted model describes the distribution of the leading digits of the populations of the 58 counties in California.
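The collapsing step in this activity can be sketched in Python. The sketch assumes Benford’s probabilities are log10(1 + 1/d) rounded to three decimals, which reproduces the expected counts in part b; the variable names are illustrative.

```python
import math

n = 58  # California counties

# ASSUMPTION: Benford probability for leading digit d is log10(1 + 1/d), rounded to 3 places.
benford = {d: round(math.log10(1 + 1 / d), 3) for d in range(1, 10)}
expected = {d: n * p for d, p in benford.items()}

# Digits whose expected count falls below five violate the technical condition.
too_small = [d for d, e in expected.items() if e < 5]

# Collapse digits 5-6 and 7-9 so that every expected count is at least five.
groups = [(1,), (2,), (3,), (4,), (5, 6), (7, 8, 9)]
observed = [19, 11, 4, 8, 6, 10]
collapsed_expected = [sum(expected[d] for d in g) for g in groups]

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, collapsed_expected))
print(too_small)
print([round(e, 3) for e in collapsed_expected], round(x2, 3))
```

Collapsing reduces the number of categories from nine to six, which is why the test uses 5 degrees of freedom.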
Activity 24-10: Political Viewpoints
a. The following table shows the proportion of respondents in each of the three categories:

| Political Viewpoint | Liberal | Moderate | Conservative | Total |
|---|---|---|---|---|
| Number of Respondents | 319 | 497 | 493 | 1309 |
| Proportion of Respondents | .244 | .380 | .377 | 1.001 |

b. These values are statistics because they describe a sample, not the entire population.
c. The following bar graph displays the data:
[Bar graph: 2004 General Social Survey: Political Viewpoint; horizontal axis: Political Viewpoint (Liberal, Moderate, Conservative); vertical axis: Proportion of Respondents]
No, this population does not appear to be equally likely to classify themselves into these three political categories. The population appears to be somewhat less likely to classify themselves as liberal than as moderate or conservative.
d. Let π_liberal refer to the population proportion classifying themselves as liberal, and so on.
H0: π_liberal = 1/3, π_moderate = 1/3, π_conservative = 1/3
Ha: At least one of these population proportions is not 1/3.

| Political Viewpoint | Liberal | Moderate | Conservative | Total |
|---|---|---|---|---|
| Observed Count | 319 | 497 | 493 | 1309 |
| Expected Count | 436.333 | 436.333 | 436.333 | 1309 |
| (O − E)²/E | 31.55182 | 8.4349 | 7.3593 | 47.346 |

The test statistic is X² = 47.346 with 2 degrees of freedom. Using Table IV, 47.346 > 15.20, so p-value < .0005. Using Minitab, p-value ≈ .0000. With such a small p-value, reject H0. You have strong statistical evidence that the population of adult Americans is not equally likely to classify themselves into these three political categories.
Activity 24-11: Wayward Golf Balls
a. The following bar graph displays the results:
[Bar graph: Wayward Golf Balls; horizontal axis: Identification Number (1, 2, 3, 4, Other); vertical axis: Proportion]
The ID numbers 1 and 2 seem somewhat more common than 3 and 4. The proportion of balls in the “other” category is very small (less than 4%).
b. Let π_i represent the probability of a ball having identification number i.
H0: π_1 = π_2 = π_3 = π_4 = 1/4
Ha: At least one π_i differs from 1/4.
Each ID number would have an expected count of 121.5 (.25 × 486).
Technical conditions: All the expected counts are at least five, and the sample was randomly selected, so the technical conditions are met.
c. The test statistic is X² = 1.977 + 2.241 + 1.730 + 2.520 = 8.469.
d. Using Table IV with 3 degrees of freedom, you find 7.81 < 8.469 < 9.35, so .025 < p-value < .05. Using Minitab, the p-value is .0373.
e. Yes, you would reject the “equally likely” hypothesis at the .10 level and at the .05 level. You would not reject at the .01 level.
f. If there were ten times as many balls in his yard with the same proportional breakdown of ID numbers, the test statistic should be larger (ten times as large) and the p-value smaller. You would expect to reject the null hypothesis of “equally likely” ID numbers at virtually any significance level. This makes sense because the larger the sample size, the more statistically significant the result, all other things being equal.
g. Now the expected counts for each ID number are 1215 (ten times the previous expected counts). The test statistic is X² = 19.774 + 22.407 + 17.305 + 25.206 = 84.691. Using Table IV with 3 degrees of freedom, 84.691 > 17.73, so p-value < .0005.
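The effect of multiplying the sample size by ten (parts f and g) can be illustrated with a brief Python sketch. The observed counts below are an assumption: they are back-solved from the four contributions in part c and the description of the bar graph, not quoted from the activity.

```python
def chi_square_stat(observed, probs):
    """Chi-square goodness-of-fit statistic for the hypothesized probabilities."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))

# ASSUMPTION: counts inferred from the contributions 1.977, 2.241, 1.730, 2.520
# with expected count 121.5; they sum to 486.
observed = [137, 138, 107, 104]
probs = [0.25] * 4

stat = chi_square_stat(observed, probs)
stat_x10 = chi_square_stat([10 * o for o in observed], probs)

print(round(stat, 3), round(stat_x10, 2), round(stat_x10 / stat, 6))
```

Each term (O − E)²/E is multiplied by 10 when both O and E are multiplied by 10, so the whole statistic scales by the same factor.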
Activity 24-12: Wayward Golf Balls
a. Answers will vary. The following is one representative set of answers.

| ID Number | 1 | 2 | 3 | 4 | Total |
|---|---|---|---|---|---|
| Simulated Sample Count | 115 | 117 | 134 | 120 | 486 |
| Expected Count | 121.5 | 121.5 | 121.5 | 121.5 | 486 |
| (O − E)²/E | .3477 | .1667 | 1.286 | .0185 | 1.8189 |

b. The test statistic is X² = 1.8189.
c. This simulated test statistic is much smaller than the test statistic from the actual sample data in Activity 24-11.
d. The following histogram displays the 1000 simulated test statistic values.
[Histogram: horizontal axis: Simulated Chi-square Test Statistic Values (0 to 14); vertical axis: Frequency]
The histogram of simulated X² test statistics is strongly skewed to the right, with a minimum of .0182 and a maximum of 15.58. The mean is 3.03, and the standard deviation is 2.33.
e. In this simulation, 32 of the 1000 simulated test statistics are at least as great as the test statistic from the actual data, giving an empirical approximate p-value of .032.
f. Yes, this empirical p-value (.032) is close to the p-value you calculated in part d of Activity 24-11 (.0373).
g. You simulated the situation where the digits 1–4 are all equally likely and found that in 1000 trials, you obtained a test statistic at least as large as the one you saw in your sample data only 32 times by random chance alone. Because you did find this test statistic from your sample, you conclude that the digits 1–4 are not equally likely to be found on these lost golf balls.
Activity 24-13: Halloween Treats
a. Let π_candy represent the proportion of all trick-or-treaters in this population who would choose the candy and π_toys the population proportion who would choose the toy.
H0: π_candy = .5, π_toys = .5
Ha: π_candy ≠ .5, π_toys ≠ .5
The expected count for both groups is .5 × 283 = 141.5. The test statistic is X² = (148 − 141.5)²/141.5 + (135 − 141.5)²/141.5 = 0.2986 + 0.2986 = 0.5972. Using Table IV with 1 degree of freedom, .5972 < 1.64, so p-value > .2. Using Minitab, the p-value is .440. As the p-value is not small, do not reject H0 at the .2 significance level. You have no real statistical evidence that H0 is false. That is, you have no reason to doubt that the population of trick-or-treaters is equally likely to choose the toy and the candy.
Technical conditions: The expected counts are at least five in each category. However, this is not a random sample of trick-or-treaters. All the households were in one of five suburban Connecticut neighborhoods, so you should be very cautious in generalizing your results beyond this population.
b. You calculate 135/283 ± 1.28 √((.477)(.523)/283) = .477 ± .038 = (.439, .515). You are 80% confident the population percentage of trick-or-treaters who would choose the toy over the candy is between 43.9% and 51.5%.
c. Yes, this interval includes the value .5. This is consistent with your test decision in part a because the test decided that .5 was a plausible value for the proportion of trick-or-treaters who would choose the toy at the .2 significance level.
d. Because this was not a random sample of trick-or-treaters and all the households were in one of five suburban Connecticut neighborhoods, you probably should not generalize your results beyond suburban Connecticut neighborhoods.
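Returning to the simulation in Activity 24-12, the procedure can be sketched in Python as follows. This is one possible implementation, not the applet or software the activity uses; the seed and function names are illustrative, and a different seed gives a slightly different empirical p-value.

```python
import random

def chi_square_stat(counts, expected):
    """Chi-square statistic comparing simulated counts with expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))

def simulate_statistics(n_balls=486, n_categories=4, reps=1000, seed=1):
    """Simulate reps samples in which every ID number is equally likely."""
    rng = random.Random(seed)
    expected = [n_balls / n_categories] * n_categories
    stats = []
    for _ in range(reps):
        counts = [0] * n_categories
        for _ in range(n_balls):
            counts[rng.randrange(n_categories)] += 1   # each ID number has probability 1/4
        stats.append(chi_square_stat(counts, expected))
    return stats

stats = simulate_statistics()
observed_stat = 8.469   # test statistic from the actual data in Activity 24-11
empirical_p = sum(s >= observed_stat for s in stats) / len(stats)
print(empirical_p)
```

The empirical p-value is simply the fraction of simulated statistics at least as large as the observed one, so it should land near the chi-square p-value of .0373.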
Activity 24-14: Kissing Couples
a. Let π_right represent the proportion of all kissing couples who lean to the right and π_left represent the proportion of all kissing couples who lean to the left.
H0: π_right = .5, π_left = .5
Ha: π_right ≠ .5, π_left ≠ .5
The expected count for both groups is .5 × 124 or 62. The test statistic is X² = (80 − 62)²/62 + (44 − 62)²/62 = 5.2258 + 5.2258 = 10.4516. Using Table IV with 1 degree of freedom, 7.88 < 10.4516 < 10.83, so .001 < p-value < .005. Using Minitab, the p-value is .00123.
Technical conditions: The expected counts are at least five in each category. The couples selected for the sample were those who happened to be observed in public places while the researchers were watching. This is not a random sample, so you should be cautious about generalizing the results of this test to a larger population. If this sample is representative, you have very strong statistical evidence that H0 is false. That is, you are confident the population proportions of kissing couples who lean to the right and left are not both 1/2.
b. H0: π_right = 2/3, π_left = 1/3
Ha: π_right ≠ 2/3
The expected count for “leans right” is 2/3 × 124 or 82.6667. The expected count for “leans left” is 1/3 × 124 or 41.3333. The test statistic is X² = (80 − 82.6667)²/82.6667 + (44 − 41.3333)²/41.3333 = 0.08602 + 0.17204 = .2581. Using Table IV with 1 degree of freedom, .2581 < 1.64, so p-value > .2. Using Minitab, the p-value is .611. You have no statistical evidence that H0 is false. That is, you have no reason to doubt that the population proportions of kissing couples who lean to the right and left are 2/3 and 1/3, respectively.
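With only two categories, the chi-square statistic equals the square of the one-proportion z statistic, and the p-value for 1 degree of freedom has a closed form. The following Python sketch (function name illustrative) checks both parts of this activity under that identity.

```python
import math

def two_category_test(count_a, count_b, pi_a):
    """Return (X^2, z, p-value) for H0: probability of category A is pi_a."""
    n = count_a + count_b
    exp_a, exp_b = n * pi_a, n * (1 - pi_a)
    x2 = (count_a - exp_a) ** 2 / exp_a + (count_b - exp_b) ** 2 / exp_b
    z = (count_a / n - pi_a) / math.sqrt(pi_a * (1 - pi_a) / n)
    p_value = math.erfc(math.sqrt(x2 / 2))   # chi-square upper tail, valid for 1 df only
    return x2, z, p_value

x2_a, z_a, p_a = two_category_test(80, 44, 0.5)      # part a
x2_b, z_b, p_b = two_category_test(80, 44, 2 / 3)    # part b

print(round(x2_a, 4), round(z_a ** 2, 4), round(p_a, 5))
print(round(x2_b, 4), round(z_b ** 2, 4), round(p_b, 3))
```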
Activity 24-15: Candy Colors
a. The following bar graph displays the data:
[Bar graph: Reese’s Pieces Candy; horizontal axis: Candy Color (Orange, Brown, Yellow); vertical axis: Proportion of Pieces]
It appears that brown and yellow are equally likely colors, but orange is much more likely to occur than either of them. Almost half of the sample was orange, whereas between 21% and 28% was yellow or brown.
b. Yes, you should expect the p-value to be small because you should reject the null hypothesis that all three colors are equally likely.
c. Let π_orange represent the proportion of all Reese’s Pieces that are orange, and so on.
H0: π_orange = .5, π_brown = .25, π_yellow = .25
Ha: At least one of these proportions is incorrect.
Technical conditions: The expected counts in each category are at least five. You were not told whether this was a random sample of Reese’s Pieces, but it was very likely to be a representative sample even if it were not random.

| Color | Orange | Brown | Yellow | Total |
|---|---|---|---|---|
| Observed Count | 273 | 154 | 114 | 541 |
| Expected Count | 270.5 | 135.25 | 135.25 | 541 |
| (O − E)²/E | .0231 | 2.599 | 3.339 | 5.961 |

The test statistic is X² = 5.961. Using Table IV with 2 degrees of freedom, 4.61 < 5.961 < 5.99, so .05 < p-value < .10. Using Minitab, the p-value is .0508. Because the p-value is greater than .05, do not reject H0 at the .05 significance level. You do not have enough statistical evidence to conclude that the population proportions of colors differ from 50% orange, 25% brown, and 25% yellow at the 5% level of significance (though you would at the 10% level of significance).
d. Yes, there are probably many null hypotheses about the color distribution that would not be rejected at the .05 significance level. For example, H0: π_orange = .5, π_brown = .26, π_yellow = .24 would probably not be rejected.
Activity 24-16: Candy Colors
a. No, you are not guaranteed to reject the equally likely hypothesis. With a sample size as small as 25, you might get a sample in which the colors are fairly equally distributed.
b. If you do not reject the null hypothesis in this situation, you have made a Type II error.
c. If you sample 25 candies and your friend samples 200 candies, you are much more likely to make a Type II error than your friend; you are much more likely to find a sample of 25 candies with a reasonably equal color distribution than your friend is to find a sample of 200 such candies (if the population proportion of orange candies is really 50%).
Activity 24-17: Flat Tires
a. Answers will vary by class. The following is one representative set of answers.
[Bar graph: Flat Tires; horizontal axis: Tire Selected (Right Front, Left Front, Right Rear, Left Rear); vertical axis: Proportion]
The right-front tire is a much more frequent choice than the others in this class. It was chosen about 40% of the time, whereas the others were each chosen roughly 20% of the time.
b. Let π_RF represent the probability of a person in this population choosing the right-front tire, and similarly π_LF, π_RR, and π_LR for the other tires.
H0: π_RF = π_LF = π_RR = π_LR = .25
Ha: At least one of these proportions is not the same.
Technical conditions: The expected counts in each category were at least five. This is not a random sample of students at this school as the sample just consists of the students in one class, but it is likely to be a representative sample for this school (there may not be a reason to suspect students in this class would answer this question differently than students in other classes).

| Tire | Right-Front | Left-Front | Right-Rear | Left-Rear | Total |
|---|---|---|---|---|---|
| Observed Count | 20 | 10 | 11 | 9 | 50 |
| Expected Count | 12.5 | 12.5 | 12.5 | 12.5 | 50 |
| (O − E)²/E | 4.5 | .5 | .18 | .98 | 6.16 |

The test statistic is X² = 6.16. Using Table IV with 3 degrees of freedom, 4.64 < 6.16 < 6.25, so .1 < p-value < .2. Using Minitab, the p-value is .104. As the p-value is greater than .05, do not reject H0 at the .05 significance level. You do not have enough statistical evidence to conclude that, at this school, the population proportions of tire choices differ from .25 each.
c. H0: π_RF = .40; π_LF = π_RR = π_LR = .20
Ha: At least one of these proportions is incorrect.

| Tire | Right-Front | Left-Front | Right-Rear | Left-Rear | Total |
|---|---|---|---|---|---|
| Observed Count | 20 | 10 | 11 | 9 | 50 |
| Expected Count | 20 | 10 | 10 | 10 | 50 |
| (O − E)²/E | 0 | 0 | .1 | .1 | .2 |

The test statistic is X² = 0.2. Using Table IV with 3 degrees of freedom, 0.2 < 4.64, so p-value > .2. Using Minitab, the p-value is .978. With such a large p-value, do not reject H0 at the .05 significance level. You do not have enough statistical evidence to conclude that the population proportions of these tire choices differ from π_RF = .40; π_LF = π_RR = π_LR = .20 in this population (assuming a representative sample).
Activity 24-18: Lifetime Achievements
Answers will vary. Here is one representative set.
Define parameter of interest: Let π_Academy represent the probability of a student at this school picking the Academy Award, and so on.
H0: π_Academy = 1/3, π_Nobel = 1/3, π_Olympic = 1/3
Ha: At least one of these proportions is not 1/3.
[Bar graph: Classmate Preferences for Lifetime Achievement; horizontal axis: Achievement (Academy Award, Nobel Prize, Olympic Medal); vertical axis: Proportion]
Check technical conditions: The expected counts in each category are at least five. This is not a random sample of students at this school, but it is probably a representative sample as students in this class may not feel differently about lifetime achievements than the other students at this school.

| Achievement | Academy Award | Nobel Prize | Olympic Medal | Total |
|---|---|---|---|---|
| Observed Count | 15 | 36 | 27 | 78 |
| Expected Count | 26 | 26 | 26 | 78 |
| (O − E)²/E | 4.6538 | 3.846 | .0384 | 8.5385 |

Test statistic: X² = 8.5385. Using Table IV with 2 degrees of freedom, 7.82 < 8.5385 < 9.21, so .01 < p-value < .02. Using Minitab, the p-value is .014.
Test decision: As the p-value is less than .05, reject H0 at the .05 significance level.
Conclusion in context: There is moderately strong statistical evidence that these three achievements are not equally likely to be chosen in the population of all students at this school (assuming a representative sample).
Activity 24-19: Calling Heads or Tails
a. Answers will vary by class. Here is one representative set. Let π_heads represent the probability of a student at this school calling “heads.”
H0: π_heads = .7, π_tails = .3
Ha: π_heads ≠ .7
Technical conditions: The expected counts in each category are at least 5. This is not a random sample of students at this school, but it is probably a representative sample as there may not be reason to suspect students in this class will call “heads” at a different rate than students at this school in general.

| Prediction | Heads | Tails | Total |
|---|---|---|---|
| Observed Count | 16 | 4 | 20 |
| Expected Count | 14 | 6 | 20 |
| (O − E)²/E | .2857 | .6667 | .9524 |

The test statistic is X² = .9524. Using Table IV with 1 degree of freedom, .9524 < 1.64, so p-value > .2. Using Minitab, the p-value is .329.
b. With a p-value of .329 > .10, you would not reject the null hypothesis at the .10 significance level. There is no reason to doubt that 70% of the students at this school would call “heads” in this situation (assuming a representative sample).
c. Here are the expected counts:

| Prediction | Heads | Tails | Total |
|---|---|---|---|
| Observed Count | 32 | 8 | 40 |
| Expected Count | 28 | 12 | 40 |
| (O − E)²/E | .5714 | 1.333 | 1.905 |

The test statistic is X² = 1.905. Using Table IV with 1 degree of freedom, .1 < p-value < .2. Using Minitab, the p-value is .168. You would still not reject the null hypothesis at the .10 significance level. There is no reason to doubt that 70% of the students at this school would call “heads” in this situation.
d. The expected values and test statistic doubled and the p-value was cut roughly in half (from .329 to .168) when the sample size was doubled.
Activity 24-20: Leading Digits Answers will vary.
every day and fewer women than expected read the newspaper every day. The sample proportions of men and women who read the newspaper every day are 191/422 = .453 for men and 167/484 = .345 for women.
•••
Homework Activities
Activity 25-7: Government Spending
a. The following segmented bar graph displays the data:
[Segmented bar graph: Political Inclination and Spending on the Environment; horizontal axis: Political Inclination (Liberal, Moderate, Conservative); vertical axis: Percentage; segments: Too little, About right, Too much]
This graph reveals that a very high proportion (.819) of liberals in this sample feel the government is spending too little on the environment.
b. No; although the observed count in this category is only 1, it is the expected count that needs to be at least five in order for the chi-square test to be valid.
c. A chi-square test in this setting would be one of independence because the data were collected by one random sample from a population and then classified by both the independent and response variables.
d. H0: Political affiliation and feelings about government spending on the environment are independent variables in the population of adult Americans.
Ha: Political affiliation and feelings about government spending on the environment are related.
Technical conditions: The observations arose from a random sample of the population, and the expected counts are at least five for each cell in the table.
Using Minitab, the test statistic is X² = 52.117 with 4 degrees of freedom and p-value ≈ .000. Using Table IV, 52.117 > 20.00, so p-value < .0005. With a p-value less than .025, reject H0 at the .025 significance level. You can conclude that political affiliation and feelings about government spending on the environment are related in the population of adult Americans.
e. The largest contributions to the test statistic are made by the “liberal/too little” cell, in which the observed count (127) is greater than the expected count (95.5); the “conservative/too much” cell, in which the observed count (32) is also greater than the expected count (18.27); and the “liberal/too much” cell, in which the observed count (1) is less than the expected count (12).
Topic 25: Inference for Two-Way Tables
Activity 25-8: Government Spending
The following segmented bar graph displays the data:
[Segmented bar graph: Political Inclination and Spending on the Space Program; horizontal axis: Political Inclination (Liberal, Moderate, Conservative); vertical axis: Percentage; segments: Too much, About right, Too little]
This graph does not indicate any substantial relationship between political affiliation and feelings about government spending on the space program for these respondents. A chi-square test in this setting would be one of independence because the data were collected by one random sample from a population and then classified by both the independent and response variables.
H0: Political affiliation and feelings about government spending on the space program are independent variables in the population of adult Americans.
Ha: Political affiliation and feelings about government spending on the space program are related.
Check technical conditions: The observations arose from a random sample from the population of adult Americans and the expected counts are at least five for each cell in the table.
Using Minitab, the test statistic is X² = 7.008 with 4 degrees of freedom and the p-value is .135. Using Table IV, 5.99 < 7.008 < 7.78, so .1 < p-value < .2.
Test decision: As p-value = .135, which is greater than .025, do not reject H0 at the .025 significance level.
Conclusion in context: You have no evidence that political affiliation and feelings about government spending on the space program are related in this population.
Activity 25-9: Baldness and Heart Disease
a. This is an observational study because the researchers did not randomly assign men to the baldness groups.
b. The expected counts in each cell would not all be at least 5. In particular, in the extreme column, the expected counts would be 1.39 (heart disease) and 1.61 (control).
c. Here is the 2 × 4 table:

| | None | Little | Some | Much or More |
|---|---|---|---|---|
| Heart Disease | 251 | 165 | 195 | 52 |
| Control | 331 | 221 | 185 | 35 |

d. H0: Degree of baldness and heart disease are independent variables for males in this population.
Ha: Degree of baldness and heart disease are related.
Using Minitab, the test statistic is X² = 14.510 with 3 degrees of freedom and the p-value is .002. Using Table IV, 12.84 < 14.51 < 16.27, so .001 < p-value < .005. With such a small p-value, reject H0 at any commonly used significance level.
e. You have strong statistical evidence that degree of baldness and heart disease are related for males in this population. However, because this is an observational study, you cannot conclude that baldness causes heart disease, merely that there is an association between these variables in this population.
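The test statistic in part d can be reproduced from the 2 × 4 table with a short Python sketch. The function name is illustrative; expected counts use the usual (row total × column total)/grand total rule.

```python
def chi_square_two_way(table):
    """Return (expected counts, X^2, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df

# Rows: heart disease, control. Columns: none, little, some, much or more.
baldness = [[251, 165, 195, 52],
            [331, 221, 185, 35]]
expected, stat, df = chi_square_two_way(baldness)

print(round(stat, 3), df)
print(min(min(row) for row in expected) >= 5)   # technical condition on expected counts
```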
Activity 25-10: Removing Gingivitis
a. The data collection used was random assignment to treatment groups.
b. The observational units were the male patients’ teeth (more specifically, tooth surfaces with gingivitis).
c. Explanatory: amount of active ingredient in mouthwash
Response: whether gingivitis surface was cured in three months
d. Here is a two-way table of counts:

| | None | Low | Intermediate | High |
|---|---|---|---|---|
| Number of Gingivitis Surfaces Cured | 78 | 55 | 96 | 73 |
| Number of Gingivitis Surfaces Not Cured | 274 | 135 | 213 | 98 |
| Total | 352 | 190 | 309 | 171 |

e. Here is the graph:
[Segmented bar graph: Mouthwash and Removing Gingivitis; horizontal axis: Amount of Mouthwash Ingredient (None, Low, Intermediate, High); vertical axis: Percentage; segments: Cured, Not cured]
The graph indicates that in this sample, as the amount of active ingredient in the mouthwash increased, the proportion of gingivitis surfaces that are cured after six months increased too.
f. Let π_i represent the probability that a tooth with gingivitis is cured for active ingredient level i.
H0: π_N = π_L = π_I = π_H (Probability of cure does not vary with amount of active ingredient.)
Ha: At least one of the probabilities is not equal.
Using Minitab, the test statistic is X² = 23.794 with 3 degrees of freedom and p-value ≈ .000. Using Table IV, 23.794 > 17.73, so p-value < .0005. With such a small p-value, reject H0 at any commonly used significance level. There is strong evidence that the probability of curing gingivitis with the four different strengths of the active ingredient in the mouthwash is not the same. All the expected values are greater than five, and the patients were randomly assigned to the treatment groups, so the technical conditions for this procedure are met. However, you don’t know how well they represent the population of all Americans, so you should be careful about how far you generalize your results.
Activity 25-11: Asleep at the Wheel
a. This is an observational study because the researchers did not randomly assign the subjects to the crash/control groups.
b. Here is the graph:
[Segmented bar graph: Sleepiness in Car Crashes; horizontal axis: Crash Group, Control Group; vertical axis: Percentage; segments: Least alert, Moderately alert, Most alert]
This graph reveals that group (crash or non-crash) and sleepiness appear to be related. More than 54% of the non-crash group were classified as “most alert,” whereas only 34% of the crash group were classified this way. More than 10% of the crash group were classified “least alert,” but less than 2% of the control group were. c. H0: Crash group and sleepiness classification are independent variables among New Zealand drivers. Ha: Crash group and sleepiness classification are related in this population. Technical conditions: The expected counts are at least five for each cell of the table. You do not know whether this was a random sample of all New Zealand drivers.
Using Minitab, the test statistic is X² = 81.692 with 2 degrees of freedom and p-value ≈ .000. Using Table IV, 81.692 > 15.20, so p-value < .0005. With such a small p-value, reject H0 at any commonly used significance level and conclude that whether a driver crashes is related to his/her level of sleepiness.
d. You have very strong statistical evidence that there is a relationship between sleepiness and involvement in car crashes among New Zealand drivers. However, because this is an observational study, you cannot conclude from this study that sleepiness causes car crashes.
Activity 25-12: Asleep at the Wheel
Here is the graph:
[Segmented bar graph: Alcohol Level in Car Crashes; horizontal axis: Crash Group, Control Group; vertical axis: Percentage; segments by blood alcohol level: 0, 1–50, 50 or more]
The graph indicates a dependent relationship between alcohol level and car crashes for subjects in this sample. Almost no alcohol was measured in any of the control group, whereas 28.6% of the crash group had a blood alcohol level of 50 mg/100 ml or more.
H0: Car crashes and alcohol level are independent variables in the population of New Zealand drivers.
Ha: Car crashes and alcohol level are related in this population.
Check technical conditions: The expected counts are at least five for each cell of the table. You do not know whether this was a random sample of New Zealand drivers.
Using Minitab, the test statistic is X² = 177.853 with 2 degrees of freedom and p-value ≈ .000. Using Table IV, 177.853 > 15.20, so p-value < .0005.
Test decision: With such a small p-value, reject H0 at any commonly used significance level.
Conclusion in context: You have very strong statistical evidence that there is a relationship between alcohol level and involvement in car crashes among New Zealand drivers. However, because this is an observational study, you cannot conclude from this study that using alcohol causes car crashes.
Topic 25: Inference for Two-Way Tables
Activity 25-13: Asleep at the Wheel Here is the graph:
[Segmented bar graph: Educational Level in Car Crashes. Vertical axis: Percentage (0 to 100). Bars: Crash Group, Control Group. Legend: Beyond HS, At least 3 years HS, Less than 3 years HS.]
The graph indicates a dependent relationship between educational level and car crashes for subjects in this sample. Forty-four percent of those involved in crashes had less than 3 years of high school, whereas 47% of those in the control group had further study beyond high school. About 25% of both groups had at least 3 years of high school but no further study. H0: Car crashes and education level are independent variables in the population of New Zealand drivers. Ha: Car crashes and education level are related in this population. Check technical conditions: The expected counts are at least five for each cell of the table. You do not know whether this was a random sample of New Zealand drivers. Using Minitab, the test statistic is X² = 43.436 with 2 degrees of freedom and p-value ≈ .000. Using Table IV, 43.436 > 15.20, so p-value < .0005. Test decision: With such a small p-value, reject H0 at any commonly used significance level. Conclusion in context: You have very strong statistical evidence that there is a relationship between education level and involvement in car crashes among New Zealand drivers. However, because this is an observational study, you cannot conclude from this study that having less education causes car crashes.
Activity 25-14: Children’s Television Advertisements a. The expected counts for the Disney cells are all less than 5: (4, 3.44, .21, 1.5, and 1.85 reading down the column). b. “Snacks” is problematic for satisfying the technical conditions because the expected counts are 31.6 (BET), 1.62 (WB), and .21 (Disney). c. Here is the two-way table:
                  BET    WB    Total
Fast Food          61    32       93
Drinks             66     9       75
Cereal             15    16       31
Snacks & Candy     20    26       46
Total             162    83      245
d. H0: Network and food product advertised are independent variables for this population of advertisements during children’s television programs. Ha: Network and food product advertised are related in this population. Using Minitab, the test statistic is X² = 30.919 with 3 degrees of freedom and p-value ≈ .000. Using Table IV, 30.919 > 17.73, so p-value < .0005. With such a small p-value, reject H0 at any commonly used significance level. You have very strong statistical evidence that there is a relationship between the network and food product advertised during children’s television programs. Warner Brothers advertises far fewer (9) “drinks” than you would expect (25.41) if these variables were independent and many more “snacks & candy” (26) than you would expect (15.58).
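The Minitab result in part d can be checked by hand. The following is a minimal sketch, not the Minitab procedure itself: it computes each expected count as (row total × column total) / n and sums (observed − expected)² / expected over the cells of the BET/WB table from part c. The function name `chi_square_statistic` is a hypothetical helper introduced only for this illustration.

```python
def chi_square_statistic(observed):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Rows: Fast Food, Drinks, Cereal, Snacks & Candy; columns: BET, WB
observed = [[61, 32], [66, 9], [15, 16], [20, 26]]
statistic, df, expected = chi_square_statistic(observed)

print(f"X^2 = {statistic:.3f} with {df} degrees of freedom")
print(f"Expected WB drinks: {expected[1][1]:.2f}")
print(f"Expected WB snacks & candy: {expected[3][1]:.2f}")
```

The statistic and the two WB expected counts should agree, up to rounding, with the values quoted in part d.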
Activity 25-15: Weighty Feelings a. The data collection scenario used in this study was random sampling from one population. The researchers classified the respondents both by gender and their feelings about their weight after they were randomly selected. b. The following segmented bar graph compares the distribution of weight self-images between men and women:
[Segmented bar graph: Feelings About Current Weight. Vertical axis: Percentage (0 to 100). Horizontal axis: Gender (Female, Male). Legend: Overweight, About Right, Underweight.]
Males in this study were almost 3 times as likely as females to feel underweight (.096 vs. .038). More than half of the females (.573) felt overweight, whereas more than half of the males (.515) felt about right about their weight. c. H0: The distribution of weight self-images is the same between males and females (gender and weight self-image are independent variables in the population of adult Americans). Ha: The distribution of weight self-images is not the same between males and females. Technical conditions: The expected counts are at least five for each cell of the table, and this was a randomly selected sample. Using Minitab, the test statistic is X² = 226.579 with 2 degrees of freedom and p-value ≈ .000. Using Table IV, 226.579 > 15.20, so p-value < .0005. With such a small p-value, reject H0 at any commonly used significance level. You have very strong statistical evidence that the distribution of weight self-images is not the same between males and females in the population.
Activity 25-16: Hand Washing a. The explanatory variable is city. The response variable is whether the subject washed his or her hands. b. This is a 2 × 4 table. c. For the four locations combined, the proportion of restroom users who washed their hands is 5195/6336 or .8199. d. The expected count for each cell would be
                  Expected Count Washed Hands    Expected Count Did Not Wash
Atlanta           1588 × .8199 ≈ 1302.03         1588 × .1801 ≈ 285.97
Chicago           1509 × .8199 ≈ 1237.26         1509 × .1801 ≈ 271.74
New York City     1503 × .8199 ≈ 1232.34         1503 × .1801 ≈ 270.66
San Francisco     1736 × .8199 ≈ 1423.38         1736 × .1801 ≈ 312.62
e. The test statistic is X² = 12.393 + 6.803 + 3.153 + 6.695 + 56.427 + 30.974 + 14.357 + 30.485 = 161.287. With 3 degrees of freedom, p-value ≈ .000. Using Table IV, 161.287 > 17.73, so p-value < .0005. With such a small p-value, you can reject H0 at the .01 significance level. You have very strong statistical evidence that the population proportion of restroom users who wash their hands is not the same in all four locations. f. Atlanta and Chicago (did not wash hands) contribute the most to the test statistic. Atlanta has more people washing their hands than you would expect if the proportions in all four cities were equal, and Chicago has fewer people washing their hands than you would expect.
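The arithmetic in parts c through e can be reproduced with a short sketch. It assumes only the numbers quoted above: the four city totals, the 5195 hand washers, and the eight cell contributions listed in part e. The expected counts here use the unrounded pooled proportion 5195/6336, which is why they match the quoted values more closely than multiplying by the rounded .8199 would.

```python
# Number of restroom users observed in each city (from part d)
city_totals = {
    "Atlanta": 1588,
    "Chicago": 1509,
    "New York City": 1503,
    "San Francisco": 1736,
}
washed_total = 5195
n = sum(city_totals.values())

# Pooled proportion who washed their hands (part c)
p_washed = washed_total / n

# Expected (washed, did not wash) counts if all four cities share that proportion
expected = {
    city: (total * p_washed, total * (1 - p_washed))
    for city, total in city_totals.items()
}

# Cell contributions quoted in part e; their sum is the test statistic
contributions = [12.393, 6.803, 3.153, 6.695, 56.427, 30.974, 14.357, 30.485]
statistic = sum(contributions)

for city, (washed, not_washed) in expected.items():
    print(f"{city}: {washed:.2f} washed, {not_washed:.2f} did not wash")
print(f"X^2 = {statistic:.3f}")
```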
Activity 25-17: Newspaper Reading Here is the graph:
[Segmented bar graph: How Often American Adults Read the Newspaper. Vertical axis: Percentage (0 to 100). Bars: Liberal, Moderate, Conservative. Legend: Never, Less than once a week, Once a week, Few times a week, Every day.]
H0: Political inclination and frequency of adult Americans reading the newspaper are independent variables.
Ha: Political inclination and frequency of adult Americans reading the newspaper are related. Technical conditions: It is reasonable to consider the observations a random sample from the population of American adults and all expected cell counts are at least 5 (smallest = 16.19), so the technical conditions are met. Using Minitab, the test statistic is X² = 3.743 with 8 degrees of freedom and the p-value is .894. Using Table IV, 3.743 < 11.03, so p-value > .2. Test decision: As the p-value is not small, do not reject H0 at any commonly used significance level. Conclusion in context: You have no evidence that political inclination and frequency of adult Americans reading the newspaper are related.
Activity 25-18: Newspaper Reading
[Segmented bar graph: How Often American Adults Read the Newspaper. Vertical axis: Percentage (0 to 100). Horizontal axis: Decade (1980s, 1990s, 2000–2004). Legend: Never, Less than once a week, Once a week, Few times a week, Every day.]
The frequency of newspaper reading by adults in this sample decreased as the decades progressed.
H0: Decade and frequency of adult Americans reading the newspaper are independent variables. Ha: Decade and frequency of adult Americans reading the newspaper are related. Technical conditions: The expected counts all easily exceed 5 (smallest = 227.57), and you can consider these to be independent random samples of American adults each decade, so the chi-square procedure is valid. Using Minitab, the test statistic is X² = 306.41 with 8 degrees of freedom and p-value ≈ .000. Using Table IV, 306.41 > 27.87, so the p-value < .0005. Test decision: With such a small p-value, reject H0 at any commonly used significance level. Conclusion in context: You have strong statistical evidence that the frequency of adult Americans reading the newspaper has changed over the past few decades. The largest contribution to the test statistic is made by the “2000–2004/less than once a week” cell, and in this cell the observed count (519) is a great deal larger than the expected count (357.24). The next largest contribution is from the “1980s/less than once a week” cell, in which the observed count (790) is less than the expected count (993.63). There is one more cell that makes a fairly large contribution to the test statistic: the “2000–2004/never” cell, in which the observed count (323) is greater than the expected count (227.57).
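The ranking of cells in this conclusion follows from each cell's contribution, (observed − expected)² / expected. A small sketch using only the three observed/expected pairs quoted above (the helper name `contribution` is hypothetical):

```python
def contribution(observed, expected):
    """Contribution of a single cell to the chi-square test statistic."""
    return (observed - expected) ** 2 / expected


# (observed, expected) pairs quoted in the conclusion above
cells = {
    "2000-2004 / less than once a week": (519, 357.24),
    "1980s / less than once a week": (790, 993.63),
    "2000-2004 / never": (323, 227.57),
}

for label, (obs, exp) in cells.items():
    direction = "above" if obs > exp else "below"
    print(f"{label}: contribution {contribution(obs, exp):.2f} (observed {direction} expected)")
```

The contribution alone does not say whether the observed count is too high or too low, which is why the direction is reported separately.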
Activity 25-19: Suitability for Politics a. The appropriate chi-square test for these data is independence because the researchers took one random sample and then classified the subjects by both explanatory and response variables. b. H0: Political inclination and reaction to the statement are independent variables in the population of adult Americans. Ha: Political inclination and reaction to the statement are related variables in this population. Here is a graph:
[Segmented bar graph: Suitability for Politics and Political Inclination. Vertical axis: Percentage (0 to 100). Horizontal axis: Political Inclination (Liberal, Moderate, Conservative). Legend: Disagree, Agree.]
Technical conditions: The expected counts are at least five for each cell of the table, and this was a randomly selected sample of adult Americans. Using Minitab, the test statistic is X² = 9.867 with 2 degrees of freedom and the p-value is .007. Using Table IV, 9.21 < 9.867 < 10.60, so .005 < p-value < .010. With the small p-value (.007 < .05), reject H0 at any commonly used significance level. You have strong statistical evidence that political inclination and reaction to the statement are related in the population of adult Americans. The largest contribution to the test statistic is made by the “conservative/agree” cell, and in this cell the observed count (96) is greater than the expected count (78.13). The next largest contribution is made by the “liberal/agree” cell, in which the observed count (40) is less than the expected count (52.26). c. H0: Gender and reaction to the statement are independent variables in the population of adult Americans. Ha: Gender and reaction to the statement are related variables in this population. Technical conditions: The expected counts are at least five for each cell of the table, and this was a randomly selected sample.
Here is a graph:
[Segmented bar graph: Suitability for Politics and Gender. Vertical axis: Percentage (0 to 100). Horizontal axis: Gender (Male, Female). Legend: Disagree, Agree.]
Using Minitab, the test statistic is X² = 3.155 with 1 degree of freedom and the p-value is .076. Using Table IV, 2.71 < 3.155 < 3.84, so .05 < p-value < .10. As the p-value equals .076, which is greater than .05, do not reject H0 at the .05 significance level (though you would at the .10 significance level). You do not have strong statistical evidence to conclude that gender and reaction to the statement are related in this population.
d. The p-value is smaller for the political inclination test. In this case, this means there is strong evidence of a relationship between political inclination and reaction to the statement, but not enough evidence to conclude that there is a relationship between gender and reaction to the statement.
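The two p-values compared in part d can be reproduced without Minitab or Table IV, because the chi-square distribution has simple closed-form tail areas for 1 and 2 degrees of freedom: with 1 degree of freedom the tail area beyond x is erfc(√(x/2)), and with 2 degrees of freedom it is exp(−x/2). The sketch below covers only those two cases; the function name `chi_square_p_value` is a hypothetical helper, not a library call.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail area of a chi-square distribution, for df = 1 or df = 2 only."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # A chi-square variable with 2 df is exponential with mean 2
        return math.exp(-statistic / 2)
    raise ValueError("this sketch handles only df = 1 and df = 2")


p_inclination = chi_square_p_value(9.867, 2)  # political inclination test, part b
p_gender = chi_square_p_value(3.155, 1)       # gender test, part c

print(f"Political inclination: p-value = {p_inclination:.3f}")
print(f"Gender: p-value = {p_gender:.3f}")
```

For tables with more degrees of freedom, a statistics package or Table IV is still needed.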
Activity 25-20: Suitability for Politics H0: The population proportion who agrees with the statement is the same across all four decades (π_70s = π_80s = π_90s = π_00s). Ha: The population proportion who agrees with the statement has changed over these four decades. Check technical conditions: The expected counts are at least five for each cell of the table, and these were independently selected random samples from the four decades. Here is the graph:
[Segmented bar graph: Suitability for Politics Over Time. Vertical axis: Percentage (0 to 100). Horizontal axis: Decade (1970s, 1980s, 1990s, 2000s). Legend: Disagree, Agree.]
This bar graph clearly shows that the proportion of the sample who agreed with this statement decreased over time.
Using Minitab, the test statistic is X² = 1034.16 with 3 degrees of freedom and p-value ≈ .000. Using Table IV, 1034.16 > 17.73, so p-value < .0005. Test decision: With such a small p-value, reject H0 at any significance level. The largest contribution to the test statistic is made by the “1970s/agree” cell, and in this cell the observed count (2398) is greater than the expected count (1616.96). The next largest contribution is made by the “1990s/agree” cell, in which the observed count (1909) is less than the expected count (2669.64). Conclusion in context: You have convincing evidence that adult Americans are becoming less likely to agree with the assertion that “men are better suited emotionally for politics than women.”
Activity 25-21: Cold Attitudes a. This was an observational study because the researchers observed and categorized the subjects’ emotional states. b. Here is the two-way table (positive emotion):
                   High    Medium    Low
Got Cold             21        29     37
Didn’t Get Cold      90        82     75
Total               111       111    112
c. Let π_low represent the probability that a subject with low positive emotions develops a cold, and so on. H0: π_low = π_medium = π_high Ha: At least one of these population proportions differs. Technical conditions: The expected counts are all at least five. You have no indication that these subjects were randomly selected. In fact, these were all “healthy volunteers,” so they may not be representative of the population of adults. You should be cautious about generalizing your results beyond the population from which these volunteers were drawn. Using Minitab, the test statistic is X² = 5.76 with 2 degrees of freedom and the p-value is .056. Using Table IV, 4.61 < 5.76 < 5.99, so .05 < p-value < .10. As the p-value is .056, which is less than .075, reject H0 at the .075 significance level. You have reason to doubt that these population proportions are equal. However, you cannot conclude that the development of a cold is caused by one’s positive emotional state because this was an observational study, not an experiment, and there are many confounding variables that you could not control for. d. Here is the two-way table (negative emotion):
                   High    Medium    Low
Got Cold             29        28     31
Didn’t Get Cold      82        83     81
Total               111       111    112
H0: π_low = π_medium = π_high Ha: At least one of these population proportions differs. Technical conditions: The expected counts are all at least five. You have no indication that these subjects were randomly selected. In fact, these were all “healthy volunteers,” so they may not be representative of the population of adults. You should be cautious in generalizing your results beyond the population from which these volunteers were drawn. Using Minitab, the test statistic is X² = 0.177 with 2 degrees of freedom and the p-value is .915. Using Table IV, .177 < 3.22, so p-value > .2.
As the p-value is .915, which is greater than .075, do not reject H0 at the .075 significance level. You have no reason to doubt that these population proportions are equal. Even if you had enough reason to doubt that the proportions were equal, you could not conclude that the development of a cold is caused by one’s negative emotional state because this was an observational study, not an experiment. There are many confounding variables that you could not control for.
Activity 25-22: Preventing Breast Cancer a. Here is the 2 × 2 table:
                    Tamoxifen    Raloxifene     Total
Breast Cancer             163           167       330
No Breast Cancer         9563          9578    19,141
Total                    9726          9745    19,471
b. Here are the expected counts:
                                    Tamoxifen    Raloxifene
Breast Cancer Expected Counts          164.84        165.16
No Breast Cancer Expected Counts      9561.16       9579.84
c. H0: The population proportions of women who would develop breast cancer are the same with both drugs. Ha: The population proportions of women who would develop breast cancer are not the same with both drugs. Using Minitab, the test statistic is X² = 0.042 with 1 degree of freedom and the p-value is .838. Using Table IV, .042 < 1.64, so p-value > .2. With such a large p-value, do not reject H0 at any commonly used significance level. You have no reason to doubt the population proportions of women who would develop breast cancer are the same with both drugs. d. In Activity 21-3, the test statistic value was z = 0.20, which is roughly the square root of 0.042, your chi-square test statistic. The p-value was .838 (the same).
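The agreement noted in part d is not a coincidence: for a 2 × 2 table, the chi-square statistic equals the square of the two-proportion z statistic, provided the z statistic uses the pooled sample proportion in its standard error. The sketch below checks this with the counts from part a. It assumes Activity 21-3 used that pooled form, and it orders the difference as tamoxifen minus raloxifene, so the sign of z here may differ from the one reported there.

```python
import math

# Counts from the 2 x 2 table in part a
cancer = {"Tamoxifen": 163, "Raloxifene": 167}
group_size = {"Tamoxifen": 9726, "Raloxifene": 9745}

n = sum(group_size.values())
pooled = sum(cancer.values()) / n  # overall proportion who developed breast cancer

# Two-proportion z statistic with a pooled standard error
p_tam = cancer["Tamoxifen"] / group_size["Tamoxifen"]
p_ral = cancer["Raloxifene"] / group_size["Raloxifene"]
standard_error = math.sqrt(
    pooled * (1 - pooled) * (1 / group_size["Tamoxifen"] + 1 / group_size["Raloxifene"])
)
z = (p_tam - p_ral) / standard_error

# Chi-square statistic from observed and expected counts in all four cells
chi_square = 0.0
for drug, size in group_size.items():
    cells = (
        (cancer[drug], size * pooled),               # developed breast cancer
        (size - cancer[drug], size * (1 - pooled)),  # did not
    )
    for observed, expected in cells:
        chi_square += (observed - expected) ** 2 / expected

print(f"z = {z:.3f}, z^2 = {z ** 2:.4f}, X^2 = {chi_square:.4f}")
```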
Activity 25-23: A Nurse Accused a. H0: The population death rates between the Gilbert shifts and non-Gilbert shifts are the same (or the probability of a patient dying on Gilbert’s shift equals the probability of a patient dying when Gilbert is not working). Ha: The population death rates between the Gilbert shifts and non-Gilbert shifts are not the same.
                    Gilbert’s Shifts    Other Shifts    Total
Patient Died                      40              34       74
No Patient Died                  217            1350     1567
Total                            257            1384     1641

                                    Gilbert Shift    Other Shifts    Total
Patient Died Expected Counts                11.59           62.41       74
No Patient Died Expected Counts            245.41         1321.59
Total                                                                 1641
Using Minitab, the test statistic is X² = 86.48 with 1 degree of freedom and p-value ≈ .000. Using Table IV, 86.48 > 12.12, so p-value < .0005. With such a small p-value, you can reject H0 at the .001 significance level. The difference in death rates between the two groups is statistically significant at the .001 significance level. b. If there were no difference in the death rates on the two types of shifts (Gilbert and non-Gilbert), you would virtually never expect to see a result this extreme or more extreme by random chance alone. Because you did see this result and it is virtually impossible for it to have occurred by random chance, you conclude there is a statistically significant difference in the death rates between the two shifts. c. Yes, a defense attorney could reasonably argue that the higher death rate on Gilbert’s shifts is due to a confounding variable because this is an observational study, not a randomized experiment. d. No, a defense attorney could not reasonably argue that the higher death rate on Gilbert’s shifts is due to random chance. The probability of seeing a death rate this much higher by random chance alone is essentially 0. (See answer to part b.)
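For a 2 × 2 table such as this one, the chi-square statistic can also be obtained from a well-known shortcut that skips the expected counts: X² = n(ad − bc)² / (product of the two row totals and the two column totals), where a, b, c, d are the four observed counts. This is an alternative to the Minitab calculation above, shown here only as a check; the variable names are introduced for the illustration.

```python
# Observed counts from part a
a, b = 40, 34      # patient died: Gilbert's shifts, other shifts
c, d = 217, 1350   # no patient died: Gilbert's shifts, other shifts

n = a + b + c + d
died_total, survived_total = a + b, c + d
gilbert_total, other_total = a + c, b + d

# Expected number of Gilbert shifts with a death, if death rates were equal
expected_died_gilbert = died_total * gilbert_total / n

# Shortcut formula for the chi-square statistic of a 2 x 2 table
statistic = n * (a * d - b * c) ** 2 / (
    died_total * survived_total * gilbert_total * other_total
)

print(f"Expected deaths on Gilbert's shifts: {expected_died_gilbert:.2f}")
print(f"X^2 = {statistic:.2f}")
```

The observed 40 deaths against an expected count of about 11.59 is what drives the very large statistic.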
Activity 25-24: Government Spending a. Answers will vary by simulation. The following is one representative set. Conservative
Liberal
Moderate
Total
About Right
15
34
25
293
Too Little
71
114
105
74
Too Much
65
106
94
265
227
151
254
632
Total
b. The test statistic is X² = 5.5377. c. This simulated test statistic is less than the test statistic value (7.008) from the actual sample data in Activity 25-8.
d. The following histogram displays the resulting chi-square test statistics:
[Histogram: Frequency vs. Simulated Chi-Square Test Statistics.]
This distribution looks very much like a chi-square distribution with 4 degrees of freedom shown next. The mean is 6.252, and the standard deviation is 3.613. It has a minimum of 1.518 and a maximum (outlier) of 26.449.
[Density curve: Density vs. Chi-Square Values.]
e. You calculate 165/500 of the simulated chi-square test statistics are at least as great as 7.008. This gives an approximate p-value of .33.
Assessment Sample Quiz 25A
•••
Students in a statistics class were asked which of three lifetime achievements they would most like to win: a Nobel Prize, an Academy Award, or an Olympic Gold Medal. They were also asked to indicate their gender. Results are shown in the following table:
                      Men    Women    Total
Nobel Prize             4       13       17
Academy Award           3        7       10
Olympic Gold Medal      8       10       18
Total                  15       30       45
c. This is an absurd argument. Simply sending televisions to countries such as Haiti will not lead to an increase in the life expectancies of inhabitants. d. No; this is a very good example of a situation where two variables are strongly associated, but there is no cause-and-effect relationship between them. Having lots of televisions does not cause life expectancy to increase. There are several other variables that affect both life expectancy and number of televisions per 1000 people. e. Many answers are possible. One possibility is the wealth (GNP) of the nation. Nations with high GNPs will also tend to have a large number of televisions per capita and will have good medical care and thus long life expectancies. Similarly, countries with low GNPs will tend to have few televisions per capita, poor national medical care, and shorter life expectancies.
Activity 26-7: Airline Maintenance a. The observational units are the airlines.
b. The scatterplot follows:
[Scatterplot: Delays Caused by Airline (percentage) vs. Maintenance Outsourced (percentage).]
c. The scatterplot reveals a fairly strong positive association between these variables. Airlines that outsource a higher percentage of their maintenance tend to have a greater percentage of delays caused by the airline, as compared to airlines that outsource a smaller percentage of their maintenance. The form of the association is clearly nonlinear, as the relationship appears to follow a curved pattern. d. No, you cannot conclude from these data that outsourcing maintenance causes airlines to experience more delays. This is an observational study, not an experiment. Many other variables could explain the observed association between outsourcing and delays. For example, perhaps less financially successful airlines tend to outsource more of their maintenance and also to have more delays.
•••
Homework Activities Activity 26-8: Miscellany Answers will vary from student to student, but here are some examples: a. SAT score and college GPA: positive, strong (You can expect students who do well on the SAT to tend to also obtain above average college grades.)
Topic 26: Graphical Displays of Association
b. Distance from the equator and average January temperature for U.S. cities: negative, strong (In the United States, the farther north you travel, the colder it tends to be in the winter.) c. Lifetime and weekly cigarette consumption: moderate, negative (Those who smoke more tend to have lower life expectancy.) d. Serving weight and calories of fast-food sandwiches: strong, positive (Larger sandwiches tend to have more calories.) e. Airfare and distance to destination: moderate, positive (You expect to pay more to travel farther.) f. Number of letters in a person’s last name and combined number of Scrabble points those letters are worth: moderate, positive (Longer names will have more opportunities for points, but not exclusively because the letters in the name will greatly affect the score.) g. Average driving distance and average score per round among professional golfers: moderate, negative (Those who have longer drives tend to be able to get the ball in the hole with fewer strokes.) h. Number of miles driven and asking price for used Honda Civics listed for sale on the Internet: moderate, negative (Cars with more miles will tend to cost less when resold.) i. Distance from the sun and diameter of planets in our solar system: none (As you move away from the sun, the sizes of the planets both increase and decrease with no consistent trend.) j. Distance from sun and number of (earth) days to revolve around sun among planets in our solar system: strong, positive (As you move away from the sun, it takes longer for the planet to complete a revolution.) k. Price of a textbook and number of pages in text: moderate, positive (Larger books tend to cost more but cost also depends on the academic discipline.) l. Number of classes missed and overall course score in a class of statistics students: strong, negative (You might expect that those students who attend fewer classes will not perform as well on course assessments.) m. Expected lifetime and gestation period among different kinds of mammals: moderate, positive (Animals with short lifetimes need to reproduce more quickly than those with the advantage of a longer lifetime.)
Activity 26-9: Challenger Disaster a. The scatterplot of O-ring failures vs. outside temperature follows:
[Scatterplot: O-ring Failures vs. Outside Temperature (in °F).]
b. This scatterplot reveals a negative association between the number of O-ring failures and the outside temperature. As the outside temperature rises, the number of O-ring failures drops. All of the flights with no O-ring failures occurred when the outside temperature was 65°F or warmer. c. With a forecasted low temperature of 31°F, it seems very likely that there would be at least one O-ring failure. d. The scatterplot of O-ring failures vs. outside temperature for the remaining seven flights follows:
[Scatterplot: O-ring Failures vs. Outside Temperature (in °F), for the seven flights with at least one O-ring failure.]
This scatterplot does not seem to reveal either a strictly positive or a strictly negative association between the number of O-ring failures and the outside temperature. The relationship appears more parabolic. This scatterplot definitely does not make the case for a negative association as strongly as the previous scatterplot did. e. The flights with zero O-ring failures gave a lot of information about how failures were related to temperature. They should not have been excluded from the scatterplot.
Activity 26-10: Broadway Shows a. The association between the variables in all these scatterplots is positive. b. From weakest to strongest association, you have B, E, D, A, F, C.
Activity 26-11: Broadway Shows a. The labeled scatterplot of gross receipts vs. percentage capacity follows:
[Labeled scatterplot: Gross Receipts (in thousands of dollars) vs. Percentage Capacity, with points labeled by Type (Musical, Play).]
b. There is a moderate, positive, curved association between gross receipts and percentage capacity. c. Almost without exception, among shows with similar percentage capacities, musicals tend to take in more money than do plays (although there are only seven plays in this list).
Activity 26-12: Monthly Temperatures a. In six months, Raleigh’s monthly temperature is higher than San Francisco’s. You can tell this from graph A because there are six pairs of points with the Raleigh point above the San Francisco point. You can tell this from graph B because there are six points above the y = x line. b. Raleigh’s monthly temperature is higher than San Francisco’s during the months of April–September. You can tell this from graph A because each pair of points is associated with a month in order from January–December from left to right. (This ordering is not true of graph B.) c. There is a moderate, positive, linear association between the two cities’ monthly temperatures. You can judge this most easily from graph B, which is a scatterplot of one monthly temperature versus the other.
Activity 26-13: Weighty Feelings a. About right (weight) is represented by a diamond; overweight is represented by a circle; and underweight is represented by a square. Those respondents who feel underweight would be those who are below the weights of others at the same heights. Those who feel overweight should be heavier than others at the same heights, and those who feel about right will be in the middle. b. The scatterplot displays a positive, linear association between height and weight. You also notice that very few respondents considered themselves underweight. There is a surprising consistency of opinion about the weights. At most heights, there are intervals of weights about which all respondents felt the same.
Activity 26-14: Fast-Food Sandwiches a. The scatterplot of calories vs. serving weight follows:
[Scatterplot: Calories vs. Serving Weight (in grams).]
There is a strong, positive, linear association between number of calories and the serving weight.
b. The labeled scatterplot of calories vs. serving weight follows:
[Labeled scatterplot: Calories vs. Serving Weight (in grams), with points labeled by Meat (Beef, Chicken, Ham, Other, Turkey).]
Activity 26-15: Fast-Food Sandwiches Many answers are possible. The possible labeled scatterplots are shown below. Students should display any one of these, along with a discussion of what the graph reveals.
[Matrix of labeled scatterplots, with points labeled by Meat (Beef, Chicken, Ham, Other, Turkey). Panels: Serving Weight (g)*Fat – Total (g); Serving Weight (g)*Calories From; Serving Weight (g)*Cholesterol (m); Serving Weight (g)*Sodium (mg); Serving Weight (g)*Protein (g); Calories*Calories From Fat; Calories*Fat – Total (g); Calories*Cholesterol (mg); Calories*Sodium (mg); Calories*Protein (g); Calories From Fat*Fat – Total (g); Calories From Fat*Sodium (mg); Calories From Fat*Cholesterol (mg); Calories From Fat*Protein (g); Fat – Total (g)*Sodium (mg); Fat – Total (g)*Protein (g); Fat – Total (g)*Cholesterol (mg); Cholesterol (mg)*Protein (g); Cholesterol (mg)*Sodium (mg); Sodium (mg)*Protein (g).]
Chicken and “other” tend to have the most calories per serving size. In general, ham has the fewest number of calories of all meats with similar serving sizes. Although the turkey sandwiches all have high calorie counts, they also have very large serving weights, so relative to the other meats, these sandwiches appear similar to the ham sandwiches.
Activity 26-16: College Alumni Donations a. The following dotplot displays the distribution of donor percentages (i.e., giving percentages):
[Dotplot: Giving Percentage.]
This distribution is roughly symmetric with one high outlier at 81%. The donor percentages range from a minimum of 30% to a non-outlier maximum of 67%. The mean donor percentage is 50.63% and the standard deviation is 11.07%. b. The following dotplot displays the distribution of average gifts:
[Dotplot: Average Gift (in dollars).]
The majority of the average gifts are clumped in a large peak below $775. There is a small cluster of four average gifts between about $2450 and $3200, an outlier at $4945, and an extreme outlier at $12,187. The median of the average gifts is $297, the minimum is $58, and a quarter of the average gifts are less than $157. c. The scatterplot of giving percentage vs. class year follows:
[Scatterplot: Giving Percentage vs. Class Year.]
There appears to be a moderately strong, negative, linear association between giving percentage and the class year. There is one unusual observation (1962), circled in the plot, in which the donor percentage is unusually large. d. The scatterplot of average gift vs. class year follows:
[Scatterplot: Average Gift (in dollars) vs. Class Year.]
This association is not as strong as the previous one. There are a few classes in which the average gift was large, but in general, it is hard to predict the average gift from the class year. There is one extreme outlier of an average gift, $12,187, from the class of 1962.
e. The year 1962 stands out as an unusual observation on both plots, particularly the average gift vs. class year scatterplot. What makes it unusual is the extremely high average gift ($12,187) this year and the percentage of alumni who donated money (81%).
Activity 26-17: Peanut Butter a. The observational units are the brands of peanut butter. b. The variables are classified as follows: cost (quantitative), sodium (quantitative), quality (quantitative), crunchy/creamy (binary categorical), regular/natural (binary categorical), salted/unsalted (binary categorical) c. Many answers are possible. The possible scatterplots are shown here. Students should create one of these and discuss what the graph reveals:
[Three scatterplots. Cost*Sodium: Cost (in cents) vs. Sodium (in mg). Cost*Quality: Cost (in cents) vs. Quality. Sodium*Quality: Sodium (in mg) vs. Quality.]
For example, the association between cost and quality is fairly strong, positive, and linear. The relationship between cost and sodium is not as strong; overall, the association appears negative, but there is a cluster of salt-free brands that should perhaps be investigated separately. This is especially true in examining the sodium and quality relationship. Without the low-sodium brands, the relationship could be negative. d. Many answers are possible. The possible scatterplots are shown here. Students should create one of these and discuss it:
20
240
30
Sodium (in mg)
30
Cost (in cents)
Cr/ch Chunky Creamy
Cost (in cents)
Topic 26: Graphical Displays of Association
[Scatterplots: pairings of Cost (in cents), Sodium (in mg), and Quality, with points coded by S/U (Salted or Unsalted) and by R/N (Natural or Regular)]
For example, the scatterplots shown here reveal that the chunky vs. creamy brands don’t appear to behave very differently.
Activity 26-18: Digital Cameras a. No, the scatterplot does not suggest that more expensive cameras tend to have higher ratings. There is not really an association between price and rating score. The association is (at best) a very weak, curvilinear association, with midpriced cameras tending to have higher ratings. b. There appears to be a moderate, negative, curved association between rating score and price of the advanced compact cameras, which tend to be more expensive. There is a very weak association between rating score and price of the compact cameras, and an equally weak, positive, linear association for the subcompact cameras, which tend to have lower prices. There is a moderate, curvilinear association between rating score and price of the super-zoom cameras, which tend to have higher rating scores.
Activity 26-19: Maternal Oxygenation
a. The scatterplot with the y = x line follows:
[Scatterplot: Percentage Oxygen (40% level) vs. Percentage Oxygen (baseline), with the y = x line]
Yes; because the majority of these data fall above the y = x line, you know that fetuses tend to have a higher percentage of oxygen after the mother is administered 40% oxygen than at the baseline measurement.
b. Yes, the scatterplot supports the conclusion that fetuses with the lowest initial oxygen levels appeared to increase their oxygen percentages the most because these fetuses (with the lowest oxygen levels) tend to be the highest above the y = x line. The data with the larger initial oxygen levels are the only ones in which the oxygen levels decreased after the mother was administered 40% oxygen.
Activity 26-20: Chip Melting a. The scatterplot of chocolate-chip melting time vs. peanut-butter-chip melting time follows:
[Scatterplot: Chocolate-Chip Melting Time (in seconds) vs. Peanut-Butter-Chip Melting Time (in seconds), with the y = x line]
The data suggest a moderate, positive, and linear association. b. Most of the points fall above the y = x line. This indicates that the chocolate chips tend to take longer to melt than the peanut butter chips.
Activity 26-21: Comparison Shopping a. The observational units are the products available at the two grocery stores. The two quantitative variables recorded are the Luckys’ price and the Vons’ price. b. The scatterplot of the prices at Luckys vs. prices at Vons follows:
[Scatterplot: Price at Luckys (in dollars) vs. Price at Vons (in dollars), with navel oranges and granulated sugar labeled]
There is a very strong, positive, linear association between the prices of products in these two stores. The prices of the products are very similar, as indicated by so many of the prices being close to or on the y = x line. It appears that Vons’ prices may tend to be slightly less than Luckys’ because there are more points below the line than above it (excluding two outliers), but it is difficult to tell. c. Yes; the suspicious products are navel oranges ($6.18, $4.36) and granulated sugar ($4.75, $3.99). These are suspicious because their Luckys’ price is so much more than Vons’ price. Perhaps they were on sale at Vons that week, or perhaps a mistake was made in recording the prices.
Activity 26-22: Muscle Fatigue a. The scatterplot of men’s time until fatigue vs. women’s time until fatigue follows:
[Scatterplot: Men’s Time Until Fatigue (in seconds) vs. Women’s Time Until Fatigue (in seconds), with the y = x line]
b. Because the vast majority of the data are below the y = x line, you know that women tend to last longer than men before muscle fatigue. c. No, this scatterplot does not reveal much of an association between these variables. No, this association does not indicate that men and women of similar strength tend to have similar times until fatigue because there are very few points on or near the y = x line.
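The y = x comparisons in Activities 26-19 through 26-22 come down to counting how many points fall above or below the line. The following is a minimal sketch of that count; the paired values are made up for illustration (they are not the activity data), and `compare_to_identity` is a hypothetical helper name.

```python
# Hypothetical paired measurements (x, y); not the data from the activities.
pairs = [(40, 55), (60, 58), (75, 90), (90, 120), (110, 105), (130, 160)]


def compare_to_identity(pairs):
    """Count the points above, on, and below the y = x line."""
    above = sum(1 for x, y in pairs if y > x)
    on = sum(1 for x, y in pairs if y == x)
    below = sum(1 for x, y in pairs if y < x)
    return above, on, below


above, on, below = compare_to_identity(pairs)
print(f"above: {above}, on: {on}, below: {below}")
```

When most points are above the line, the y-variable tends to be the larger of the two measurements; when most are below, the x-variable tends to be larger.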
Activity 26-23: Your Choice Answers will vary by student.
Assessment Sample Quiz 26A
•••
The following scatterplots display the price vs. age for a sample of Hanoverian-bred dressage horses listed for sale on the Internet. The graph on the left displays these data for female horses, the graph on the right for male horses:
[Scatterplots: Price (in dollars) vs. Age (in years), for Female Horses (left) and Male Horses (right)]
1. Is the oldest horse in this sample male or female? 2. Is the most expensive horse in this sample male or female? 3. Which horse would you predict to cost more: a 10-year-old male horse or a 10-year-old female horse? 4. Which gender has a positive association between price and age? 5. Which gender has the stronger association between price and age?
Solution to Sample Quiz 26A
•••
1. The oldest horse in this sample is female.
2. The most expensive horse in this sample is male. 3. Based on these scatterplots, a 10-year-old male horse is likely to cost more than a 10-year-old female horse. 4. The male gender has a positive association between price and age. 5. The female gender has the stronger association between price and age.
Sample Quiz 26B
•••
The following scatterplot displays lung capacity (forced expiratory volume, measured in liters) vs. age (in years) for a sample of children:
The correlation coefficient is .014, which is very close to 0. This value indicates there is no evidence of association between draft number and sequential date, suggesting the lottery process was fair and random in 1971. The mixing mechanism was greatly improved after the anomaly with the 1970 results was spotted.
Homework Activities Activity 27-8: Hypothetical Exam Scores a. Yes, most of the exam scores follow a linear pattern in class A, with one exception. b. Yes, most of the exam scores in class B are scattered haphazardly with no particular pattern except for the exam in the lower left of the plot, which gives a linear appearance to the overall plot. c. For class A, the correlation coefficient is .037. For class B, the correlation coefficient is .705. Both values are surprising. In class A, the value of r is surprisingly small, and in class B, the value of r is surprisingly large. Clearly a single outlier can have a huge effect on the value of r. d. It only takes one exception to a linear pattern to throw off what could be a strong correlation. Similarly, a seemingly haphazard pattern could have a fairly strong correlation if one or more extreme points line up linearly. Because correlation coefficients are so easily affected by these outliers, you say that they are not resistant. e. In class C, students who scored less than 50 on exam 1 also scored less than 50 on exam 2, although some scores increased and some decreased. Students who scored 70 and greater on exam 1 also performed this well on exam 2, but some scores increased and some decreased. There were no students who scored between 50 and 70 on either exam. f. The correlation coefficient is .954. This value is deceptively high. The two clusters of scores, which have no particular pattern within each cluster, form a line (any two points will form a line) that causes an artificially inflated overall correlation coefficient.
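The lack of resistance described in parts c and d can be checked numerically. The sketch below uses made-up exam scores (not the class A or class B data) and a hand-written `pearson_r` helper, and shows how a single unusual student can wipe out an otherwise strong correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical exam scores that follow a nearly linear pattern.
exam1 = [60, 65, 70, 75, 80, 85, 90]
exam2 = [62, 66, 71, 74, 81, 86, 89]

r_linear = pearson_r(exam1, exam2)
# Add one student who scored 95 on exam 1 but only 20 on exam 2.
r_with_outlier = pearson_r(exam1 + [95], exam2 + [20])

print(f"r without the outlier: {r_linear:.3f}")
print(f"r with the outlier:    {r_with_outlier:.3f}")
```

The same helper can be used in the other direction: a haphazard cloud plus one extreme point that lines up with it can produce a deceptively large r.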
Activity 27-9: Proximity to the Teacher a. The correlation coefficient cannot be less than −1.0. b. The order in which you state the variables will not change the correlation coefficient. The correlation coefficient of x and y is always the same as the correlation coefficient of y and x. c. A correlation coefficient of −.8 indicates a strong negative association. d. You cannot find a correlation between one quantitative and one categorical variable. e. If the correlation coefficient is −.8, then the association is negative, so students who sit farther away tend to score lower. f. You can never conclude cause-and-effect based solely on the value of the correlation coefficient.
•••
Topic 27: Correlation Coefficient
Activity 27-10: Monopoly a. The scatterplot of rent vs. price follows:
[Scatterplot: Rent (in dollars) vs. Price (in dollars)]
Answers will vary by student guess, but the correlation is clearly strong and positive, so their guesses should be close to 1. b. The correlation coefficient is .994. c. The actual values of the correlation coefficient are recorded here:
Trial   1      2      3      4      5      6      7
r       .994   .794   .670   .490   .707   .538   .019
d. The correlation coefficient is not a resistant measure of association. A single change in the data can have a drastic effect on r.
Activity 27-11: Monopoly a. The observational units are the properties in the Monopoly board game. b. Five variables are listed for each observational unit. c. Students should have one of the following scatterplots:
[Scatterplots: each pairing of Position on Board, Price (in dollars), Rent (in dollars), House Cost (in dollars), and Hotel Cost (in dollars)]
d. Position and price: r = .995; Position and rent: r = .983; Position and house: r = .984; Position and hotel: r = .964; Price and house: r = .994; Price and hotel: r = .978; Rent and house: r = .999; Rent and hotel: r = .982; House and hotel: r = .981; Price and rent: r = .994
e. Answers will vary by student expectation. The correlation coefficient increases except in four cases: It is unchanged with rent/house and rent/price, and it decreases for rent/hotel and house/hotel.
f. Position and price: r = .997; Position and rent: r = .991; Position and house: r = .991; Position and hotel: r = .985; Price and house: r = .994; Price and hotel: r = .984; Rent and house: r = .999; Rent and hotel: r = .978; House and hotel: r = .978; Price and rent: r = .994
Activity 27-12: Monthly Temperatures a. Yes, there appears to be a very strong, parabolic relationship between Raleigh’s average temperature and the month number.
b. The correlation coefficient is .257. This correlation value indicates a weak relationship. c. The correlation is so close to zero in spite of the very strong relationship between temperature and month because the relationship is not linear. The correlation coefficient measures the strength of a linear relationship; it does not tell you whether there is some other type of relationship.
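To see why a strong curved relationship can have a correlation near zero, here is a minimal sketch with made-up monthly temperatures generated from a perfectly symmetric parabola. These are not the Raleigh data (whose pattern is not perfectly symmetric, which is why its r is .257 rather than 0), and `pearson_r` is a hand-written helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


months = list(range(1, 13))
# Hypothetical temperatures: an exact parabola peaking between June and July.
temps = [80 - 1.5 * (m - 6.5) ** 2 for m in months]

r = pearson_r(months, temps)
print(f"r for a perfectly parabolic pattern: {r:.3f}")
```

Temperature is completely determined by month in this sketch, yet r is essentially zero because the rising and falling halves of the curve cancel in the linear measure.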
Activity 27-13: Planetary Measurements a. There is a strong positive (curved) relationship between planet period of revolution and distance from the sun. b. No, a straight line would not be a reasonable summary of the relationship between revolution and distance. A curve would be a much better fit. c. No, a correlation coefficient of .989 does not mean a straight line is the best model for a reasonable summary of the relationship between these variables. You can see from the scatterplot that some curved model would be a better fit. You should always look at a scatterplot, in conjunction with the correlation coefficient, to assess the form of the association.
Activity 27-14: Ice Cream, Drownings, and Fire Damage a. Answers will vary by student expectation, but they should expect to find a positive correlation. During warm (summer) months, ice cream sales will increase as will outdoor swimming/boating and therefore the number of drownings. Similarly, in winter months both ice cream sales and number of drownings will decrease. b. No, association does not equal causation. Eating ice cream does not cause drowning. A confounding variable would be the outside temperature/time of year (see explanation in part a). c. More fire engines respond to more damaging fires because such fires are larger and more fire engines are required to extinguish them. This does not mean that the damage would be less extensive if fewer fire engines were dispatched (just the opposite, in fact). A third variable, severity of the fire, is confounded with the other two variables.
Activity 27-15: Broadway Shows a. A: r = .721; B: r = .316; C: r = .947; D: r = .502; E: r = .407; F: r = .804
b. From smallest to largest correlation, you have B E D A F C. Yes, this ordering agrees with the ordering based on the scatterplots alone in Activity 26-10.
Activity 27-16: College Alumni Donations a. The following scatterplots display average gift vs. percentage giving per class and also average gift vs. previous year’s average gift per class.
[Scatterplots: Average Gift (in dollars) vs. Giving Percentage; Average Gift (in dollars) vs. Previous Average Gift (in dollars); the class of 1962 is labeled in each]
The class of 1962 is an outlier in its average gift. b. For average gift/giving percentage, the correlation coefficient is .553. For average gift/previous average gift, the correlation coefficient is .390. c. Answers will vary by student expectation, but students should expect the correlation to decrease because the outlier appears to be creating a significant part of the linearity in the plot (the remaining values do not seem to have an extremely strong association).
d. Answers will vary by student expectation, but students should expect the correlation to increase because without the outlier there appears to be a linear pattern.
e. For average gift/giving percentage, the correlation coefficient is .352. For average gift/previous average gift, the correlation coefficient is .849.
Activity 27-17: Challenger Disaster
The correlation coefficient between temperature and number of O-ring failures is .561. The correlation coefficient without zero O-ring failure flights is .263. The association between temperature and number of O-ring failures is weakened when you exclude the flights with no O-ring failures because those flights were all on days when the temperature was relatively high. The data are no longer present to show that hot days tend to have no O-ring failures.
Activity 27-18: Solitaire The scatterplots follow:
[Scatterplots: Losses vs. Time (in seconds); Losses vs. Points; Time (in seconds) vs. Points]
For losses vs. time, the correlation coefficient is −.053. For losses vs. points, the correlation coefficient is .061. For time vs. points, the correlation coefficient is .979. There is a strong, fairly linear (though slightly curved) relationship between the length of the games (time) and the points, indicated in both the scatterplot and the correlation coefficient of .979. However, there is no apparent relationship, linear or otherwise, between the number of losing games that precedes a win and either the time to complete a game or the game score (in points). The scatterplot for losses vs. time shows a negative association, but it is extremely weak. Similarly, the scatterplot for losses vs. points shows an equally weak positive association.
Activity 27-19: Climatic Conditions The completed table with correlation values follows:
              Jan High  Jan Low  July High  July Low  Precip  Days Precip  Snow  Sun
Jan High      xxx       .965     .152       .554      .073    .572         .807  .643
Jan Low       xxx       xxx      .072       .473      .002    .460         .825  .512
July High     xxx       xxx      xxx        .712      .114    .130         .080  .377
July Low      xxx       xxx      xxx        xxx       .243    .345         .613  .521
Precip        xxx       xxx      xxx        xxx       xxx     .695         .157  .506
Days Precip   xxx       xxx      xxx        xxx       xxx     xxx          .444  .826
Snow          xxx       xxx      xxx        xxx       xxx     xxx          xxx   .363
Sun           xxx       xxx      xxx        xxx       xxx     xxx          xxx   xxx
a. The strongest association is between January high temperature and January low temperature (r = .965). b. The weakest association is between January low temperature and annual precipitation (r = .002). c. The most useful variable for predicting annual snowfall would be January low temperature (r = .825). The least useful variable would be July high temperature (r = .080). d. The most useful variable for predicting annual July high temperature would be July low temperature (r = .712). The least useful variable would be January low temperature (r = .022). e. The scatterplot of annual snowfall vs. annual precipitation follows:
[Scatterplot: Annual Snowfall (in inches) vs. Annual Precipitation (in inches)]
The correlation coefficient indicates a very weak positive linear relationship between annual snowfall and annual precipitation. A look at the scatterplot, however, indicates four western cities (Phoenix, San Diego, San Francisco, and Los Angeles) with no annual snowfall that may be artificially deflating the correlation coefficient. If you delete these cities and recompute, you find a much stronger negative correlation coefficient of −.771.
Activity 27-20: Muscle Fatigue a. The correlation coefficient is .010. b. This correlation coefficient indicates that there is virtually no relationship between the times until muscle fatigue of men and women of similar strength.
Activity 27-21: Digital Cameras a. The scatterplot of rating score vs. price follows:
[Scatterplot: Rating Score vs. Price (in dollars)]
The correlation coefficient is .071, which indicates there is virtually no linear association between rating score and price of a digital camera. This makes sense when you see the scatterplot; a line would not fit these data well. A curve might fit moderately well. b. The four scatterplots follow:
[Scatterplots: Rating Score vs. Price (in dollars), in four panels: Advanced Compact, Compact, Subcompact, Super-zoom]
Advanced compact: r = −.509; Compact: r = −.169; Subcompact: r = .141; Super-zoom: r = −.273
There is a moderate negative linear association between the rating score and price of the advanced compact cameras. The compact and subcompact cameras have very little linear association though subcompact cameras show a positive association and compact cameras show a negative association. The super-zoom cameras have a weak negative linear association according to the correlation coefficient, but the scatterplot indicates a curve would be a much better model for these data than a line.
Activity 27-22: Your Choice Answers will vary by student.
Assessment Sample Quiz 27A
•••
The following four scatterplots pertain to a study that investigated predicting the price of a wine based on its age and three variables associated with the weather during the year in which it was produced: summer temperature, winter rain, and harvest rain:
A: [Scatterplot: Price Index vs. Age (in years)]
B: [Scatterplot: Price Index vs. Summer Temperature (in °C)]
Topic 28: Least Squares Regression
•••
Homework Activities Activity 28-5: Textbook Prices a. The regression line only predicts/estimates the price of textbooks; it will not, in general, give the actual price of the textbook. b. This statement does not consider the intercept in computing the predicted price of a textbook. c. This statement does not consider the intercept in computing the predicted price of a one-page book. The predicted price would be $3.42 $0.147 $1.95. This is an unreasonable prediction because you are extrapolating; one page is not a reasonable number of pages for a textbook (and is not similar to the values observed in the dataset). d. The last phrase of this sentence should be “meaning the predicted price increases by $0.147 for each additional page in the textbook.” e. This is a completely invalid interpretation of r². You have no measure of the percentage of textbook prices that are correctly predicted (the percentage of observations actually falling on the regression line). The coefficient of determination measures the variability in textbook prices explained by the linear relationship with the number of pages. f. The coefficient of determination does not measure the percentage of points that fall close to the regression line. g. All of the textbooks contain pages! The coefficient of determination does not measure the percentage of textbooks that contain pages. h. There are some words missing in this statement. It should read: The coefficient of determination is r² = .667, meaning 67.7% of the variability in textbook prices is explained by the least squares line with number of pages. i. The word “prices” (the response variable) is missing. It should read: The coefficient of determination is r² = .667, meaning 67.7% of the variability in textbook prices is explained by the least squares line with number of pages.
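The interpretation of the coefficient of determination as "variability explained" can be illustrated directly. In the sketch below, the page counts and prices are made up (they are not the textbook dataset); it fits the least squares line and checks that the square of r equals 1 − SSE/SST, where SSE is the sum of squared residuals and SST is the total variability in the response.

```python
# Hypothetical (pages, price) data; not the values from the activity.
pages = [200, 350, 400, 520, 610, 750, 900]
prices = [28.0, 45.5, 60.0, 72.0, 95.0, 99.5, 140.0]

n = len(pages)
mean_x = sum(pages) / n
mean_y = sum(prices) / n

sxx = sum((x - mean_x) ** 2 for x in pages)
syy = sum((y - mean_y) ** 2 for y in prices)  # SST: total variability in price
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pages, prices))

# Least squares slope and intercept.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# SSE: variability left over after using the line to predict price.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(pages, prices))

r_squared_from_r = sxy ** 2 / (sxx * syy)
r_squared_from_sums = 1 - sse / syy

print(f"r^2 from r:         {r_squared_from_r:.4f}")
print(f"r^2 from 1-SSE/SST: {r_squared_from_sums:.4f}")
```

Neither quantity counts how many points fall on or near the line, which is why the interpretations in parts e through g are invalid.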
Activity 28-6: Airfares a. The observational units are the destination cities. b. The scatterplot of airfare vs. distance follows:
[Scatterplot: Airfare (in dollars) vs. Distance (in miles)]
There is a moderate positive linear association between distance and airfare for these cities. c. Slope b = r (s_y / s_x) = .795 (59.454 / 402.686) = $0.1174/mi. Intercept a = ȳ − b x̄ = 166.917 − ($0.1174/mi)(712.667 mi) = $83.25. Least squares line equation: predicted airfare = $83.25 + $0.1174 (distance) d. Predicted airfare = 83.25 + 0.1174 (750) = $171.30 e. Predicted airfare = 83.25 + 0.1174 (7500) = $963.75 f. The slope is $0.1174/mi, which means the predicted price of an airline ticket rises by $0.1174 for each additional mile between a city and Baltimore. g. 100 × 0.1174 = $11.74 h. The percentage of variability in these airfares explained by the regression line with distance is r² = (.795)² = 63.2%.
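The slope and intercept in part c come from the summary statistics alone. As a rough check, the sketch below recomputes them with b = r (s_y / s_x) and a = ȳ − b x̄, using the values quoted in the solution; the variable and function names are illustrative only.

```python
# Summary statistics quoted in part c of the solution.
r = 0.795          # correlation between distance and airfare
s_x = 402.686      # standard deviation of distance (miles)
s_y = 59.454       # standard deviation of airfare (dollars)
mean_x = 712.667   # mean distance (miles)
mean_y = 166.917   # mean airfare (dollars)

slope = r * s_y / s_x                 # dollars per mile
intercept = mean_y - slope * mean_x   # dollars


def predicted_airfare(distance):
    """Predicted airfare (dollars) for a given distance (miles)."""
    return intercept + slope * distance


print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"predicted airfare at 750 miles: {predicted_airfare(750):.2f}")
print(f"r^2 = {r ** 2:.3f}")
```

Because the code keeps full precision for the slope, the intercept can differ from the hand calculation in the second decimal place.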
Activity 28-7: Airfares a. For the fitted value (Atlanta), you calculate 83.25 + 0.1174 (576) = $150.87. For the residual, you calculate 178 − 150.87 = $27.13.
b. St. Louis has the largest (absolute) residual. Its residual value is 98 − 169.77, or −$71.77.
c. The residual plot of residual vs. distance follows:
[Residual plot: Residual vs. Distance (in miles)]
No, this residual plot does not reveal an obvious pattern that questions the adequacy of this model.
Activity 28-8: Airfares a. Denver has the most potential to be influential because it is the city with the greatest distance from Baltimore. b. Without Denver, the regression equation is predicted airfare = 82.63 + 0.1186 (distance).
The value of r² is 51.2%; these values have not changed a great deal.
c. No, Denver does not appear to be a terribly influential point. d. With $5 Denver airfare, the regression equation is predicted airfare = 142 + 0.00542 (distance).
The value of r² is .1%; all of these values have changed drastically. e. With a $398 Orlando airfare, the regression equation is predicted airfare = 92.87 + 0.1248 (distance).
The value of r² is 38.3%. The r² value has changed drastically, both from part d and from the original regression. The slope has changed substantially from part d, but is similar to the original slope.
Activity 28-9: Airfares a. If $500 is added to each airfare, the regression equation becomes predicted airfare = 583.27 + 0.1174 (distance).
The slope would not change because r and s_x and s_y would not change. The intercept would become 666.917 − (0.1174)(712.667). b. If each airfare is doubled, the regression equation becomes predicted airfare = 166.5 + 0.2348 (distance).
The slope would double because s_y would be twice as large. The intercept would also be twice as large because ȳ would double, as would the product of b and x̄. c. If each distance is cut in half, the regression equation becomes predicted airfare = 83.27 + 0.2348 (distance).
The slope would double because s_x would be half as large. The intercept would not change because x̄ would also be half as large, so the product of b and x̄ would be unchanged. d. If 1000 miles is added to each distance, the regression equation becomes predicted airfare = −34.37 + 0.1174 (distance).
The slope would not change because r and s_x and s_y would not change. The intercept would become 166.917 − (0.1174)(1000 + 712.667).
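The four transformations above can be checked from the summary statistics, since each one only rescales or shifts the means and standard deviations. The following sketch assumes the airfare summary values from Activity 28-6; `fit_line` is a hypothetical helper that applies b = r (s_y / s_x) and a = ȳ − b x̄.

```python
def fit_line(r, s_x, s_y, mean_x, mean_y):
    """Return (intercept, slope) of the least squares line from summary statistics."""
    slope = r * s_y / s_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Summary statistics from Activity 28-6 (distance in miles, airfare in dollars).
r, s_x, s_y, mean_x, mean_y = 0.795, 402.686, 59.454, 712.667, 166.917

original = fit_line(r, s_x, s_y, mean_x, mean_y)
# a. Add $500 to every airfare: only the mean airfare changes.
plus_500 = fit_line(r, s_x, s_y, mean_x, mean_y + 500)
# b. Double every airfare: mean and standard deviation of airfare double.
doubled = fit_line(r, s_x, 2 * s_y, mean_x, 2 * mean_y)
# c. Halve every distance: mean and standard deviation of distance halve.
halved = fit_line(r, s_x / 2, s_y, mean_x / 2, mean_y)
# d. Add 1000 miles to every distance: only the mean distance changes.
shifted = fit_line(r, s_x, s_y, mean_x + 1000, mean_y)

for label, (a, b) in [("original", original), ("+$500", plus_500),
                      ("doubled fares", doubled), ("halved distances", halved),
                      ("+1000 miles", shifted)]:
    print(f"{label:>17}: predicted airfare = {a:.2f} + {b:.4f} (distance)")
```

The printed intercepts may differ from the solution's values in the second decimal place because the code does not round the slope before computing the intercept.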
Activity 28-10: Car Data a. Slope b = r (s_y / s_x) = −.907 (3.165 / 494.5) = −0.00581 mpg/lb. Intercept a = ȳ − b x̄ = 20.962 mpg + (0.00581 mpg/lb)(3185.5 lb) = 39.47 mpg. Least squares regression line equation: predicted city mpg = 39.47 − 0.00581 (weight) b. Predicted Audi TT city mpg = 39.47 mpg − (0.00581 mpg/lb)(2655 lb) = 24.04 mpg
c. 100 (−.00581) = −0.581 mpg d. r² = (−.907)² ⇒ 82.3%
Activity 28-11: Electricity Bills a. The following dotplot displays the electric bill charges:
[Dotplot: Electric Bill (in dollars)]
The distribution of electric bills has a slight skew to the right. The bills range from a minimum of about $36 to a maximum of about $55.5. The mean electric bill for these 28 months is $43.18 and the standard deviation is $4.99 (just about five dollars). b. The scatterplot of electric bill vs. average temperature follows:
[Scatterplot: Electric Bill (in dollars) vs. Average Temperature (in °F)]
The scatterplot shows a moderate negative linear association between the electric bill and the average temperature. c. Slope b = r (s_y / s_x) = −.695 (4.99 / 16.21) = −$0.21395/°F. Intercept a = ȳ − b x̄ = 43.18 + 0.21395 (55.88) = $55.14. Least squares regression line equation: predicted electric bill = 55.14 − 0.21395 (avg. temp) d. The slope is −0.21395. This means that, on average, the predicted electric bill falls $.21 for each degree that the average temperature rises. e. For the March 1992 fitted value, you calculate $55.14 − ($0.21395/°F)(41°F) = $46.37. For the residual, you calculate $44.43 − $46.37 = −$1.94. f. The month with the greatest fitted value would be the month with the lowest average temperature. According to the scatterplot, this is March 1993. g. r² = (−.695)² ⇒ 48.3% of the variability in electric bills is explained by the regression line with average temperature.
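The fitted value and residual in part e follow the pattern residual = observed − fitted. A small sketch, assuming the regression line reads predicted electric bill = 55.14 − 0.21395 (avg. temp) and using the March 1992 values quoted in the solution (the function names are illustrative):

```python
INTERCEPT = 55.14    # dollars
SLOPE = -0.21395     # dollars per degree Fahrenheit


def fitted_bill(avg_temp):
    """Fitted electric bill (dollars) for an average temperature (°F)."""
    return INTERCEPT + SLOPE * avg_temp


def residual(observed_bill, avg_temp):
    """Residual = observed bill minus fitted bill."""
    return observed_bill - fitted_bill(avg_temp)


# March 1992: average temperature 41°F, observed bill $44.43.
march_fitted = fitted_bill(41)
march_residual = residual(44.43, 41)

print(f"fitted value: {march_fitted:.2f}")
print(f"residual:     {march_residual:.2f}")
```

A negative residual means the observed bill was lower than the line predicts for that temperature.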
Activity 28-12: House Prices a. The house with the most unusual size is 2130 Beach St. It is unusually small (460 sq ft). b. Predicted price = 286,641 + 153.8 (size); r² = 49.0% c. Predicted price = $286,641 + ($153.8/ft²)(1242 ft²) = $477,660.60. The predicted price before you removed the house at Beach St. was $475,120. d. These values have changed substantially. This house does seem to have a substantial influence on the regression line.
Activity 28-13: House Prices a. The following dotplot displays the distribution of number of bedrooms:
[Dotplot: Number of Bedrooms]
This is a perfectly symmetric distribution, centered at 3 bedrooms, with a minimum of 2 bedrooms and a maximum of 4 bedrooms. All but 6 of the 19 houses in this sample have 3 bedrooms. The standard deviation is 0.577 bedrooms. b. The correlation coefficient is .499. The scatterplot of house price vs. number of bedrooms follows:
[Scatterplot: House Price (in thousands of dollars) vs. Number of Bedrooms]
The scatterplot shows a positive association between number of bedrooms and house price, but because of the granularity in the number of bedrooms, a standard regression line is not likely to be a good model for this relationship. You also don’t have very many houses with four bedrooms to confirm whether the average price with four bedrooms is larger or smaller than with three bedrooms. c. The regression equation is predicted price = 308,156 + 61,083 (number of bedrooms); r² = 24.9%.
d. The residual plot follows:
[Residual plot: Residual vs. Number of Bedrooms]
The residual plot indicates that a linear model is not appropriate here. The residuals for three bedroom houses are almost all positive or are near zero, whereas the two and four bedroom houses tend to have large negative residuals. e. The following dotplot displays the distribution of the number of bathrooms:
[Dotplot: Number of Bathrooms]
The distribution of bathrooms is skewed left with a minimum of 1 bathroom and a maximum of 2.5 bathrooms. Only one house had 2.5 baths, whereas 13 of the 19 houses had 2 bathrooms. The mean number of bathrooms per house is 1.76 and the standard deviation is .482 bathrooms. The correlation coefficient is .711. The scatterplot of house price vs. number of bathrooms follows:
[Scatterplot: House Price (in thousands of dollars) vs. Number of Bathrooms]
This scatterplot shows a much stronger positive linear association than did the house price vs. number of bedrooms plot. There is still a problem with the granularity of the number of bathrooms, and the single house with 2.5 bathrooms may be very influential. The regression equation is predicted price = 307,427 + 104,347 (number of bathrooms); r² = 50.6%.
The residual plot follows:
[Residual plot: Residual vs. Fitted Value (in dollars)]
The residual plot does not show an obvious pattern to indicate the linear model is inappropriate, but due to the granularity of the data, it is difficult to tell much of anything here. f. Based on the residual plots and values of r 2, the size of the house seems to be the best of these three explanatory variables for predicting the price of a house.
Activity 28-14: Honda Prices a. The scatterplot of price vs. mileage follows:
[Scatterplot: Price (in dollars) vs. Mileage, with the fitted line predicted price = $17,918 − $0.07449 (mileage) and the point (302,000, $1200) labeled]
The plot indicates a strong negative and somewhat curved association between price and mileage for these used Hondas. There is one outlier that is potentially influential as it has unusually high mileage (302,000 miles), but the price is higher than you would predict based on the cars with much lower mileage. b. The regression equation is predicted price = 17,918 − 0.07449 (mileage).
c. The slope is −0.07449, which means the predicted price decreases by $0.07449 for each additional mile on the car’s odometer. d. You calculate r² = (−.78)² ⇒ 60.8%. So 60.8% of the variability in the car prices is explained by the regression with car mileage. e. Predicted price = $17,918 − 0.07449 (50,000) = $14,193.50; predicted price = $17,918 − 0.07449 (150,000) = $6744.50 f. The outlier (in mileage) is the 1992 car (302,000, $1200). With this car removed, the regression equation is predicted price = 18,641 − 0.09026 (mileage).
The value of r² is 60.1%. Because the slope and coefficient of determination did not change much, you conclude this point is not very influential. Note, as expected, the slope became steeper as this observation is no longer pulling the line up compared to the rest of the cars. g. The residual plot follows:
[Residual plot: Residual vs. Mileage]
For high-mileage vehicles, the residuals tend to be positive, which indicates there may be a better model for describing this relationship between price and mileage, but overall it is difficult to see a clear pattern.
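As a quick check of the predictions in part e and of the comment about high-mileage residuals, the sketch below assumes the fitted line from part b reads predicted price = 17,918 − 0.07449 (mileage). The function name is illustrative, and the only observed data point used is the labeled outlier (302,000 miles, $1200).

```python
INTERCEPT = 17918.0   # dollars
SLOPE = -0.07449      # dollars per mile


def predicted_price(mileage):
    """Predicted used-Honda price (dollars) for a given odometer reading."""
    return INTERCEPT + SLOPE * mileage


price_50k = predicted_price(50_000)
price_150k = predicted_price(150_000)

# The labeled outlier: 302,000 miles, observed price $1200.
outlier_fitted = predicted_price(302_000)
outlier_residual = 1200 - outlier_fitted

print(f"predicted price at 50,000 miles:  {price_50k:.2f}")
print(f"predicted price at 150,000 miles: {price_150k:.2f}")
print(f"fitted value for the outlier:     {outlier_fitted:.2f}")
print(f"residual for the outlier:         {outlier_residual:.2f}")
```

The line predicts a negative price at 302,000 miles, so the observed $1200 sits well above it; that large positive residual is consistent with the note that the line fits high-mileage cars less well.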
Activity 28-15: Honda Prices The scatterplot of price vs. year of manufacture follows:
[Scatterplot: Price (in dollars) vs. Year, with the fitted line predicted price = −2,840,147 + 1425 (year)]
The regression equation is predicted price = −2,840,147 + 1425 (year).
The slope is $1425/year, which means the predicted price increases by $1425 for each additional year in the year of manufacture. Equivalently, the predicted price decreases by $1425 for each additional year of the car's age.
You calculate r² = (.875)² ⇒ 76.6%. The percentage of variability in the car prices explained by the regression line with year of manufacture is 76.6%.
Predicted price of a 1998 Honda Civic = −$2,840,147 + 1425 (1998) = $7003.
Predicted price of a 2003 Honda Civic = −$2,840,147 + 1425 (2003) = $14,128.
The outlier (in year) is the 1992 car (1992, $1200). With this car removed, the regression equation is predicted price = −2,911,781 + 1461 (year). The value of r² is 75.1%. Because the slope and coefficient of determination did not change much, you conclude this point is not very influential.
The residual plot follows:
Topic 28: Least Squares Regression
[Residual plot: residual vs. year]
The residual plot does not show any strong patterns to indicate that the linear regression is not the best model for describing this relationship.
Activity 28-16: Fast-Food Sandwiches
a. The scatterplot of calories from fat vs. total calories follows:
[Scatterplot: calories from fat vs. calories, with the least squares line predicted calories from fat = −77.49 + 0.5622 (calories)]
The scatterplot shows a strong positive linear association between calories from fat and calories in the Arby's sandwiches.
b. The regression equation is predicted calories from fat = −77.49 + 0.5622 (calories). The value of r² is 93.8%.
c. The residual plot follows:
[Residual plot: residual vs. calories]
The residuals seem to be randomly scattered, so you conclude the least squares line is a reasonable model for these data.
d. The slope is 0.5622, which means the predicted number of calories from fat increases by 0.5622 for each additional calorie in the fast-food sandwich.
e. The percentage of variability in the calories from fat explained by the linear regression with calories is 93.8%.
f. Predicted calories from fat = −77.49 + 0.5622 (750) = 344.16 calories.
Activity 28-17: Box Office Blockbusters
The scatterplot of overall gross income vs. opening weekend gross income follows:
[Scatterplot: gross (in millions of dollars) vs. opening gross (in millions of dollars), with the least squares line predicted gross = 9.513 + 3.182 (opening gross)]
The scatterplot shows a moderately strong positive relationship between overall gross income and opening weekend gross for these blockbusters. The relationship does not appear completely linear, however, as there is a large cluster of movies with small opening values lying below the regression line.
The regression equation is predicted overall gross = 9.513 + 3.182 (opening weekend gross). The value of r² is 82.9%.
The residual plot follows:
[Residual plot: residual vs. opening gross (in millions of dollars)]
This residual plot indicates that a linear model may not be the best fit for this relationship. Movies that grossed less than $40 million in the opening weekend tend to have negative residuals, whereas movies that grossed more tend to have positive residuals.
Activity 28-18: Box Office Blockbusters
The scatterplot of overall gross income vs. total number of screens on which the movie appeared follows:
[Scatterplot: gross (in millions of dollars) vs. number of screens, with the least squares line predicted gross = −68.24 + 0.04991 (number of screens)]
The scatterplot shows a moderate positive curved relationship between overall gross income and the total number of screens on which the movie appeared for these blockbusters.
The regression equation is predicted overall gross = −68.24 + 0.04991 (number of screens). The value of r² is 27.3%.
The residual plot follows:
[Residual plot: residual vs. number of screens]
This residual plot indicates that a linear model is not the best fit for this relationship. Movies shown on 2100–3500 screens tend to have negative residuals, whereas movies shown on fewer than 2100 or more than 4000 screens tend to have positive residuals.
The scatterplot of overall gross income vs. number of screens for the opening weekend follows:
[Scatterplot: gross (in millions of dollars) vs. opening screens, with the least squares line predicted gross = 9.74 + 0.02484 (opening screens)]
The scatterplot shows a moderately strong positive curved relationship between overall gross income and number of screens for the opening weekend for these blockbusters.
The regression equation is predicted overall gross = 9.74 + 0.02484 (opening screens). The value of r² is 15.5%.
The residual plot follows:
[Residual plot: residual vs. opening screens]
This residual plot indicates that a linear model is not the best fit for this relationship. Movies shown on fewer than 600 or more than 3500 opening screens tend to have positive residuals, whereas movies shown on 1300–3500 opening screens tend to have negative residuals.
Among these three variables, opening gross income is the best predictor for overall gross income using a linear model. Total number of screens might be considered a better linear predictor than number of screens for opening weekend, but neither of these variables should be used in a linear model.
Activity 28-19: Televisions and Life Expectancy
a. The scatterplot of life expectancy vs. number of televisions per 1000 people follows:
[Scatterplot: life expectancy (in years) vs. number of televisions per 1000 people, with the least squares line predicted life expectancy = 57.34 + 0.03244 (TVs per 1000)]
The regression equation is predicted life expectancy = 57.34 + 0.03244 (TVs per 1000). The value of r² is 55.2%.
The residual plot follows:
[Residual plot]
The least squares line does not provide a reasonable summary of the relationship between these variables because the relationship is curved, as seen by the pattern of negative-positive-negative residuals.
b. The scatterplot of life expectancy vs. log(TVs per 1000) follows:
[Scatterplot: life expectancy (in years) vs. log(TVs per 1000), with the least squares line predicted life expectancy = 35.79 + 14.46 log(TVs per 1000)]
The regression equation is predicted life expectancy = 35.79 + 14.46 log10(TVs per 1000). The value of r² is 60.5%.
The residual plot follows:
[Residual plot: residual vs. log(TVs per 1000)]
This residual plot does not reveal any definite patterns, so a linear model between life expectancy and log(TVs per 1000) seems appropriate.
c. The scatterplot of life expectancy vs. √(TVs per 1000) follows:
[Scatterplot: life expectancy (in years) vs. √(TVs per 1000), with the least squares line predicted life expectancy = 50.32 + 1.111 √(TVs per 1000)]
The regression equation is predicted life expectancy = 50.32 + 1.111 √(TVs per 1000). The value of r² is 62.2%.
The residual plot follows:
[Residual plot: residual vs. √(TVs per 1000)]
d. The least squares model is more appropriate with both of these transformed variables than with the original data because the transformed scatterplots are much more linear and the residual plots display fairly random scatter. The square root transformation is slightly better than the log transformation because it explains a slightly greater proportion of the variability in life expectancies.
Activity 28-20: Planetary Measurements
a. The scatterplot of distance from the sun vs. position number follows:
[Scatterplot: distance (in millions of miles) vs. position from sun]
A least squares line would not be a good fit for this relationship because it is so curved.
b. The regression equation is predicted distance = −1126 + 445.5 (position number). The value of r² is 82.8%.
The residual plot follows:
[Residual plot: residual vs. position from sun]
The residuals are not scattered randomly; they form a parabola. The first two and last two planets have positive residuals, whereas the middle five planets have negative residuals. This indicates that the least squares line is not a reasonable model for this relationship.
c. The scatterplots of √(distance from the sun) vs. position number and log(distance from the sun) vs. position number follow:
[Scatterplot: square root(distance) vs. position from sun]
[Scatterplot: log(distance) vs. position from sun]
The log transformation seems to produce a more linear relationship.
d. The regression equation is predicted log10(distance) = 1.246 + 0.2706 (position number). It would also be reasonable to report predicted ln(distance) = 2.87 + 0.623 (position number).
e. The value of r² is 98.2%.
f. The residual plot follows:
[Residual plot: residual vs. position from sun]
Given that there are only nine points, these residuals seem to be fairly randomly scattered.
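The effect of the log transformation can be sketched in code. The distances below are approximate mean distances from the sun in millions of miles and are an assumption made for illustration (the activity's data file is not reproduced here), so the fitted coefficients should only be expected to land near the values reported above. The helper `least_squares` is a hypothetical name.

```python
import math

# Assumed approximate mean distances (millions of miles), Mercury to Pluto.
distances = [36, 67, 93, 142, 484, 887, 1784, 2796, 3666]
positions = list(range(1, 10))


def least_squares(x, y):
    """Return (intercept, slope, r_squared) of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope, sxy ** 2 / (sxx * syy)


# Fit on the original scale, then on the log10 scale.
raw_fit = least_squares(positions, distances)
log_fit = least_squares(positions, [math.log10(d) for d in distances])

print("distance:        a=%.1f  b=%.1f  r^2=%.3f" % raw_fit)
print("log10(distance): a=%.3f  b=%.4f  r^2=%.3f" % log_fit)
```

Comparing the two r² values shows why the transformed model is preferred, although a residual plot is still needed to judge linearity.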
Activity 28-21: Planetary Measurements
a. The scatterplot of period of revolution vs. distance follows:
[Scatterplot: period of revolution vs. distance (in millions of miles)]
There is a very strong positive relationship between these variables, but it does not appear to be linear.
b. The scatterplot of period of revolution vs. distance squared follows:
[Scatterplot: period of revolution vs. distance squared]
No, this relationship is not linear.
c. The scatterplot of period of revolution vs. distance^1.5 follows:
[Scatterplot: period of revolution vs. distance^(3/2)]
The power that appears to produce the most linear relationship is 1.5 (or 3/2).
d. The regression equation is predicted period of revolution = 11.19 + 0.4094 (distance)^1.5. The value of r² is 100%.
e. The residual plot follows:
[Residual plot: residual vs. distance^(3/2)]
The residuals are all (relatively) very small. Seven of the nine residuals are negative (including the first six), but because there are only nine data values, it’s difficult to know whether this behavior indicates a pattern.
Activity 28-22: Gestation and Longevity
a. The scatterplot of gestation vs. longevity follows:
[Scatterplot: gestation (in days) vs. longevity (in years)]
The correlation coefficient is .589. The scatterplot and correlation coefficient indicate there is a moderate positive linear relationship between gestation and longevity.
b. The regression equation is predicted gestation = 54.53 + 10.34 (longevity). The value of r² is 34.7%.
c. For the baboon fitted value, you calculate 54.53 + 10.34 (20) = 261.33 days. For the residual, you calculate 187 − 261.33 = −74.33 days.
d. The residual plot follows:
[Residual plot: residual vs. longevity (in years)]
Animals with short lifetimes tend to have negative residuals, which means that the regression line tends to overestimate their gestation period. Some animals with moderate longevities (15–20 years) have very large positive residuals, which means that the regression line is greatly underestimating their gestation period.
e. The elephant is clearly an outlier in both longevity and gestation period. The residual for the elephant is 645 − [54.53 + 10.34 (40)] = 176.87 days. This is not the greatest (absolute) residual. The giraffe, hippopotamus, camel, rhinoceros, etc., all have greater (absolute) residuals.
f. The giraffe has the greatest residual. Its gestation period is longer than expected for an animal with its longevity.
g. Without the giraffe, the scatterplot of gestation vs. longevity follows:
[Scatterplot: gestation (in days) vs. longevity (in years), with the least squares line predicted gestation = 41.29 + 10.75 (longevity)]
The regression equation is predicted gestation = 41.29 + 10.75 (longevity). You calculate r² = 40.8%. These values did not change dramatically, so you would not consider the giraffe influential.
h. Without the elephant, the scatterplot of gestation vs. longevity follows:
[Scatterplot: gestation (in days) vs. longevity (in years), with the least squares line predicted gestation = 81.52 + 7.894 (longevity)]
The regression equation is predicted gestation = 81.52 + 7.894 (longevity). The value of r² is 19.8%. These values did change substantially.
i. The elephant is much more influential than the giraffe.
Activity 28-23: Gestation and Longevity
a. The regression equation gives predicted gestation (human being) = 54.53 + 10.34 (75) = 830 days ≈ 2.27 years.
b. No, this is not a reasonable prediction because you are extrapolating. The longevities of the mammals in the data range from about 1 to 40 years. The lifespan of a human being (75 years) is very far outside this range, and you cannot expect the linear relationship to hold for such long lifetimes.
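The fitted values and residuals from Activity 28-22 and the extrapolation warning from Activity 28-23 can be sketched together. The coefficients and observed gestation periods are the ones quoted in those answers; the function names and the range check are hypothetical helpers, and the 1 to 40 year range is the approximate range stated in part b.

```python
GEST_INTERCEPT = 54.53          # days
GEST_SLOPE = 10.34              # days per year of longevity
LONGEVITY_RANGE = (1.0, 40.0)   # approximate range of the data, in years


def fitted_gestation(longevity: float) -> float:
    """Fitted gestation (days) from the least squares line."""
    return GEST_INTERCEPT + GEST_SLOPE * longevity


def residual(observed_gestation: float, longevity: float) -> float:
    """Residual = observed value minus fitted value."""
    return observed_gestation - fitted_gestation(longevity)


def is_extrapolation(longevity: float) -> bool:
    """True when the longevity lies outside the range of the data."""
    low, high = LONGEVITY_RANGE
    return not (low <= longevity <= high)


print(f"baboon residual:   {residual(187, 20):.2f} days")
print(f"elephant residual: {residual(645, 40):.2f} days")
print(f"human prediction:  {fitted_gestation(75):.0f} days, "
      f"extrapolation: {is_extrapolation(75)}")
```

A negative residual means the line overestimates the observed gestation period; a positive residual means it underestimates it.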
Activity 28-24: Residual Plots
a. A: 4; B: 1; C: 2; D: 3
b. A: The residuals are randomly scattered. B: The residuals appear in negatively sloping bands. C: The residuals show a distinct curved pattern. D: The residuals show a clear linear pattern with three clear outliers.
c. The regression lines summarize the relationship in the data as well as possible in plots 1 and 4. The lines fail to capture important aspects of the relationship in plots 2 and 3. Plot 2 is clearly a curve, and plot 3 has three points that are exceptions to the linear model; if these points were removed, the least squares line would be quite different from the one shown.
d. No, the greatest values of r² do not necessarily correspond to the scatterplots for which the regression line summarizes the data as well as possible. In one case, such a plot has the smallest r² value.
Activity 28-25: Wrongful Conclusions
a. Yes; because the mean is the sum of the residuals divided by the number of residuals, if the sum of the residuals is zero, the mean of the residuals must be zero.
b. No, the median does not necessarily have to be zero in this case. For example, suppose the residuals are {2, 2, 2, 2, 2, −10}. The sum is zero, but the median is 2.
c. No, the observation with the greatest value of the predictor variable has the greatest fitted value only if there is a positive slope in the regression line.
d. No, there is no way of predicting how large or small the residual of the observation with the greatest value of the response variable will be, relative to that of the other observations.
e. If the value of the predictor variable is the mean (x̄), then the fitted value for that observation will equal the mean of the response variable (ȳ). However, you have no way of knowing the observed value of the response variable for that observation. It could be close to ȳ, or far from it. So, yes, it is possible for such an observation to have the greatest (absolute) residual.
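The counterexample in part b of Activity 28-25 is easy to verify. The sketch below uses Python's standard `statistics` module on a set of six residuals whose last value is negative, so that the sum is zero.

```python
from statistics import mean, median

# Residuals that sum to zero but whose median is not zero.
residuals = [2, 2, 2, 2, 2, -10]

print("sum    =", sum(residuals))
print("mean   =", mean(residuals))
print("median =", median(residuals))
```

Because least squares residuals always sum to zero, their mean is forced to be zero, but the median is free to differ whenever the residuals are skewed.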
Topic 29: Inference for Correlation and Regression
[Normal probability plot of the residuals: percentage vs. residual]
Both plots indicate the distribution of the residuals is approximately normal, so that condition is satisfied. For the conditions regarding linearity and equal variability, consider a plot of residual vs. pages:
[Residual plot: residual vs. pages]
There is no obvious pattern to the residuals in this graph, so the linearity condition is met. The variability in residuals appears to be similar across all values of number of pages, although there might be a bit more variability in the residuals for larger numbers of pages. All technical conditions are met, so the significance test and confidence interval are valid.
•••
Homework Activities
Activity 29-6: Proximity to Teacher
a. H0: There is no correlation between distance from teacher and quiz average. In symbols, ρ = 0. Ha: There is a negative correlation between distance from teacher and quiz average. In symbols, ρ < 0.
b. The value r = −.3 is a statistic because it is calculated from a sample. It is not a population value.
c. You need to know the sample size.
d. The test statistic is t = r√(n − 2) / √(1 − r²) = −.3√18 / √(1 − (−.3)²) = −1.2728 / √(1 − .09) = −1.33. With 18 degrees of freedom, the p-value is ≈ .10.
Because the p-value ≈ .10 > .05, do not reject H0 at the .05 level. There is not enough evidence to conclude there is a correlation between distance from teacher and quiz average in this population.
e. With n = 50, the test statistic is t = −.3√48 / √(1 − (−.3)²) = −2.08 / √(1 − .09) = −2.18. With 40 degrees of freedom, .01 < p-value < .025. Because the p-value < .025 < .05, reject H0 at the .05 level. There is moderate statistical evidence of a negative correlation between distance from teacher and quiz average in this population.
f. With n = 150, the test statistic is t = −.3√148 / √(1 − (−.3)²) = −3.65 / √(1 − .09) = −3.83. With 40 degrees of freedom, the p-value < .0005. Because the p-value < .0005 < .05, reject H0 at the .05 level. There is extremely strong statistical evidence (p-value < .0005) of a negative correlation between distance from teacher and quiz average in this population.
g. As the sample size increases, the p-value decreases. As you have seen before, larger samples have less sampling variability, so the same observed sample result is more surprising, that is, less likely to occur by chance alone. You expect the sample correlation coefficients to cluster more closely around zero when the null hypothesis is true. Mathematically, the sample size appears in the numerator of the test statistic (through √(n − 2)), so the test statistic increases (in absolute value) as the sample size increases, which lowers the p-value.
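The three test statistics in parts d through f come from the same formula, t = r√(n − 2) / √(1 − r²), so they can be reproduced in a loop. This is a minimal sketch; the sample sizes 20, 50, and 150 are inferred from the degrees of freedom used in the calculations, and `correlation_t` is a hypothetical helper name.

```python
import math


def correlation_t(r: float, n: int) -> float:
    """t statistic for testing rho = 0 from a sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# The same sample correlation with three different sample sizes.
for n in (20, 50, 150):
    print(f"n = {n:>3}: t = {correlation_t(-0.3, n):.2f}")
```

Holding r fixed, |t| grows roughly like √n, which is why the p-value shrinks as the sample size increases.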
Activity 29-7: Studying and Grades
a. H0: There is no correlation between hours studied and GPA in the population of all students at UOP. In symbols, ρ = 0.
Ha: There is a correlation between hours studied and GPA in the population of all students at UOP. In symbols, ρ ≠ 0.
The test statistic is t = .343√78 / √(1 − (.343)²) = 3.029 / √(1 − .1176) = 3.22.
Using Table III with 60 degrees of freedom, 2(.001) < p-value < 2(.005), or .002 < p-value < .010.
b. Yes, this is the same value of the test statistic. However, the value of the p-value you found in parts o and p of Activity 29-1 is doubled because this is a two-sided test (notice, however, the one-sided p-values match).
c. Answers will vary by student expectation, but students should expect the p-value to be greater because the sample size is smaller.
d. The test statistic is t = .343√18 / √(1 − (.343)²) = 1.46 / √(1 − .1176) = 1.55. Using Table III with 18 degrees of freedom, 2(.05) < p-value < 2(.1). Yes; the p-value is greater.
e. Answers will vary by student expectation, but students should expect the p-value to be smaller because the sample size is larger.
The test statistic is t = .343√198 / √(1 − (.343)²) = 4.83 / √(1 − .1176) = 5.14. Using Table III with 100 degrees of freedom, p-value < 2(.0005). Yes; the p-value is smaller.
f. In order to be statistically significant at the .05 level with n = 5000 and a two-sided test, you need |t| ≥ 1.96. So |r|√4998 / √(1 − r²) ≥ 1.96, which means 4998 r² ≥ 1.96²(1 − r²) ⇒ r² ≥ 1.96² / (4998 + 1.96²) ⇒ |r| ≥ .0277. With such a large sample size, the correlation coefficient does not need to be very large to be “statistically significant.”
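Solving the inequality in part f for r gives |r| ≥ √(t*² / (n − 2 + t*²)). The sketch below evaluates that bound, assuming (as the answer does) that 1.96 is an adequate critical value for such a large sample; `min_significant_r` is a hypothetical helper name.

```python
import math


def min_significant_r(n: int, t_crit: float = 1.96) -> float:
    """Smallest |r| whose t statistic reaches t_crit with n observations."""
    df = n - 2
    return math.sqrt(t_crit ** 2 / (df + t_crit ** 2))


r_min = min_significant_r(5000)
# Plug the bound back into the test statistic as a check.
t_check = r_min * math.sqrt(5000 - 2) / math.sqrt(1 - r_min ** 2)

print(f"minimum |r| for n = 5000: {r_min:.4f}")
print(f"t at that r: {t_check:.2f}")
```

Trying smaller values of n in the same function shows how quickly the required correlation grows when the sample is small.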
Activity 29-8: Studying and Grades
Answers will vary with each running of the applet. The following is one representative example:
a.
[Dotplot of simulated sample slope coefficients]
This distribution of sample slope coefficients is somewhat normally distributed, with mean 0.009 and standard deviation 0.065. This distribution is not as symmetric and displays greater variability (though it has roughly the same mean) than when the sample size was 80. b. The following graph displays the results:
[Dotplot of simulated sample slope coefficients]
This distribution of sample slope coefficients is normally distributed with mean 0.1 and standard deviation 0.025. This distribution has the same shape and roughly the same spread as the one when the population slope was 0, but the center has shifted from 0 to 0.1.
c. The following graph displays the results:
[Dotplot of simulated sample slope coefficients]
This distribution of sample slope coefficients is normally distributed, with mean roughly 0 and standard deviation 0.01. This distribution has the same shape and same center as the one when the standard deviation of the x-values was 1.84 (smaller), but the spread has been reduced considerably. d. The following graph displays the results:
[Dotplot of simulated sample slope coefficients]
This distribution of sample slope coefficients is normally distributed, with mean roughly 0 and standard deviation 0.083. This distribution has the same shape and same center as the one when σ was 0.45 (smaller), but the spread has increased significantly.
Activity 29-9: Draft Lottery
a. H0: There is no correlation between birth date and draft number in the 1970 draft. In symbols, ρ = 0.
Ha: There is a correlation between birth date and draft number in the 1970 draft. In symbols, ρ ≠ 0.
The test statistic is t = −.226√364 / √(1 − (−.226)²) = −4.31 / √(1 − .051) = −4.43. Using Table III with 100 degrees of freedom, p-value < .0005 × 2 = .0010. With such a small p-value, reject H0. Conclude there is a significant correlation between birth date and draft number in the 1970 draft (or the observed sample correlation didn't occur by chance alone).
b. Note: There were 365 days in 1971. H0: There is no correlation between birth date and draft number in the 1971 draft. In symbols, ρ = 0. Ha: There is a correlation between birth date and draft number in the 1971 draft. In symbols, ρ ≠ 0.
The test statistic is t = .014√363 / √(1 − (.014)²) = .267 / √(1 − .000196) = 0.27.
Using Table III with 100 degrees of freedom, p-value > .2 × 2 = .4. Do not reject H0. Conclude there is no evidence of a correlation between birth date and draft number in the 1971 draft process.
c. Based on these p-values, reject the null hypothesis with the 1970 lottery, but not with the 1971 lottery. So, there is strong statistical evidence that the 1970 lottery was not fair, but no reason to doubt that the 1971 lottery was fair.
Activity 29-10: Honda Prices
a. The equation is predicted price = 17,918 − 0.0745 (mileage). The slope coefficient is −0.0745, which means that the price of a used Honda is predicted to fall by $0.0745 for each additional mile on the odometer.
b. A 95% CI for β is b ± t*(60 df) × SE(b) = −0.0745 ± 2(0.00638) = (−0.087, −0.062).
c. This interval includes only negative values. Based on this interval, conclude that 0 is not a plausible value for the population slope coefficient (β), so therefore reject a two-sided hypothesis test that the population slope coefficient is equal to 0 at the .05 significance level.
d.
[Histogram of the residuals: frequency vs. residual]
[Normal probability plot of the residuals: percentage vs. residual]
[Residual plot: residual vs. mileage]
Technical conditions: The normality condition would be met if not for the large outliers (technical condition 3). The residual plot reveals no obvious pattern, and the variability in the residuals appears to be similar across all x-values (conditions 2 and 4), although there may be a little less variability for large mileage values.
e. The equation is predicted price = $18,641 − 0.0903 (mileage). The slope coefficient is −0.0903, which means the predicted price of a used Honda falls by $0.0903 for each additional mile on the odometer. A 95% CI for β is b ± t*(60 df) × SE(b) = −0.0903 ± 2(.007882) = (−0.106, −0.075). This interval includes only negative values. Based on this interval, conclude that 0 is not a plausible value for the population slope coefficient (β), so therefore reject a two-sided hypothesis test that the population slope coefficient is equal to 0 at the .05 significance level.
[Histogram of the residuals: frequency vs. residual]
[Normal probability plot of the residuals: percentage vs. residual]
[Residual plot: residual vs. mileage]
Technical conditions: Removing the mileage outlier made the distribution of residuals appear somewhat more normal. You might now consider technical conditions 2–4 satisfied. For condition 1, you are not told whether this is a random sample of Hondas for sale, but it is probably representative.
Activity 29-11: Honda Prices
e. Note: age = current year − year of manufacture.
The equation is predicted price = −2,840,147 + 1425 (year). The slope coefficient is 1425, which means the predicted price of a used Honda falls by $1425 for each additional year of age on the car. A 95% CI for β is b ± t*(60 df) × SE(b) = 1425 ± 2(78.86) = (1267.28, 1582.72). This interval includes only positive values. Based on this interval, conclude that zero is not a plausible value for the population slope coefficient (β), so therefore reject a two-sided hypothesis test that the population slope coefficient is equal to 0 at the .05 significance level.
[Histogram of the residuals: frequency vs. residual]
[Normal probability plot of the residuals: percentage vs. residual]
[Residual plot: residual vs. year of manufacture]
Technical conditions: The residuals do not appear to be normally distributed, and the variability of the y’s does not appear to be quite the same across all years. So, technical conditions 3 and 4 are violated, although technical condition 2 is met (no curved pattern in the residuals). It appears the mileage is the better explanatory variable because it comes closer to satisfying the technical conditions.
Activity 29-12: Heights, Handspans, and Foot Lengths
Answers will vary by class. The following is one representative set.
a. The scatterplot of height vs. handspan follows:
[Scatterplot: height (in inches) vs. handspan (in centimeters), with the least squares line predicted height = 27.70 + 1.947 (handspan)]
The scatterplot shows a moderate positive linear association between height and handspan for these students.
b. The equation is predicted height = 27.7 + 1.947 (handspan).
c. The value of r² is 46.2%, which means that 46.2% of the variability in the heights is explained by the linear regression with handspans.
d. The residual plots follow:
[Histogram of the residuals: frequency vs. residual]
[Normal probability plot of the residuals: percentage vs. residual]
[Residual plot: residual vs. handspan (in centimeters)]
Technical conditions: The residual plot reveals no obvious pattern. The variability in the residuals appears to be similar across all x-values, and the residuals appear to be normally distributed. So, technical conditions 2–4 are satisfied. This was not a random sample of students at this school as it was one statistics class, but it might be reasonable to consider it a representative sample with respect to these variables.
e. These coefficients are statistics because they are calculated from a sample.
f. The null hypothesis is that there is no relationship between the heights and handspans of the population of all students at the school, or the slope of the regression line between these two variables is 0. In symbols, H0: β = 0. The alternative hypothesis is that students at this school with longer handspans tend to be taller, or the slope of the regression line between these two variables is positive. In symbols, Ha: β > 0.
The test statistic is t = 1.947 / 0.4205 = 4.63. Using Table III with 25 degrees of freedom, p-value < .0005. Using Minitab, the p-value is .000. With the small p-value, reject H0 at the .05 level of significance. Conclude the slope coefficient between height and handspan is positive in the population of all students at the school (assuming the sample was representative).
g. A 95% CI for β is 1.947 ± (2.06)(0.4205) = (1.081, 2.813). You are 95% confident the average increase in height of students in this population is between 1.081 and 2.813 inches for each additional centimeter of handspan.
h. Handspan does seem to be a moderately useful predictor of height for students at this school, as long as this sample was representative. There is strong statistical evidence of a positive correlation between height and handspan, and you are 95% confident the average increase in height of a student in this population is between 1.081 and 2.813 inches for each additional centimeter of handspan.
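The slope test statistic and confidence interval used throughout Topic 29 follow the pattern t = b / SE(b) and b ± t* × SE(b). A minimal sketch, using the handspan values from this activity and the critical values quoted in the answers (2.06 here, and 2 for the Honda data); the function names are hypothetical.

```python
def slope_t(b: float, se_b: float) -> float:
    """t statistic for testing whether the population slope is 0."""
    return b / se_b


def slope_ci(b: float, se_b: float, t_star: float) -> tuple[float, float]:
    """Confidence interval b +/- t_star * SE(b) for the population slope."""
    margin = t_star * se_b
    return b - margin, b + margin


# Height vs. handspan (Activity 29-12).
print(f"t = {slope_t(1.947, 0.4205):.2f}")
low, high = slope_ci(1.947, 0.4205, 2.06)
print(f"95% CI: ({low:.3f}, {high:.3f})")

# Price vs. mileage (Activity 29-10, part b).
h_low, h_high = slope_ci(-0.0745, 0.00638, 2)
print(f"95% CI: ({h_low:.3f}, {h_high:.3f})")
```

An interval that excludes 0 leads to the same conclusion as rejecting the two-sided test at the matching significance level.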
Activity 29-13: Heights, Handspans, and Foot Lengths
Answers will vary by class. The following is one representative set.
a. The scatterplot of height vs. foot length follows:
[Scatterplot: height (in inches) vs. foot length (in centimeters), with the least squares line predicted height = 35.13 + 1.312 (foot length)]
The scatterplot shows a moderate positive linear association between height and foot length for these students.
b. The equation is predicted height = 35.13 + 1.312 (foot length).
c. The value of r² is 49.8%, which means that 49.8% of the variability in the heights is explained by the linear regression with foot lengths.
d. The residual plots follow:
[Histogram of the residuals: frequency vs. residual]
[Normal probability plot of the residuals: percentage vs. residual]
[Residual plot: residual vs. foot length (in centimeters)]
Technical conditions: The residual plot reveals no obvious pattern. The variability in the residuals appears to be similar across all x-values, and the residuals appear to be normally distributed. So, technical conditions 2–4 are satisfied. This was not a random sample of students as it was one statistics class, but it may be reasonable to consider this a representative sample with respect to these variables.
e. These coefficients are statistics because they are calculated from a sample.
f. The null hypothesis is that there is no relationship between the heights and foot lengths of the population of all students at the school, or the slope of the regression line between these two variables is 0. In symbols, H0: β = 0. The alternative hypothesis is that students in this population with longer feet tend to be taller, or the slope of the regression line between these two variables is positive. In symbols, Ha: β > 0.
The test statistic is t = 1.312 / 0.2636 = 4.98. Using Table III with 25 degrees of freedom, p-value < .0005. Using Minitab, the p-value is .000. Because the p-value is so small, reject H0 at the .05 level of significance. Conclude the slope coefficient between height and foot length is positive in the population of all students at the school (assuming the sample to be representative).
g. A 95% CI for β is 1.312 ± (2.06)(0.2636) = (0.769, 1.855). You are 95% confident the average increase in height of students in this population is between 0.769 and 1.855 inches for each additional centimeter of foot length.
h. Foot length does seem to be a moderately useful predictor of height in this population (assuming the sample is representative). There is strong statistical evidence of a positive correlation between height and foot length, and you are 95% confident the average increase in height of students in this population is between 0.769 and 1.855 inches for each additional centimeter of foot length.
i. Foot length seems to be a slightly better explanatory variable for predicting height with this population because the r² value is greater. But there is not much difference.
Activity 29-14: Heights, Handspans, and Foot Lengths
Answers will vary by class. The following is one representative set.
a. The scatterplot of male height vs. male handspan follows:
[Scatterplot: male height (in inches) vs. male handspan (in centimeters), with fitted line predicted male height = 58.81 + 0.5259 (male handspan)]
The regression equation is predicted male height = 58.81 + 0.5259 (male handspan). The value r² is 5.4%. In symbols, H0: β = 0 and Ha: β ≠ 0. The test statistic is t = 0.5259/0.8982 ≈ 0.59. Using Table III with 6 degrees of freedom, p-value > .2. Using Minitab, the p-value is .58. Because the p-value is not small, do not reject H0. You do not have convincing evidence of a correlation between height and handspan in this population of males. The residual plots follow:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. male handspan (in centimeters)]
Technical conditions: The residual plots reveal no obvious pattern (though the sample size of eight is rather small); the variability in the residuals appears to be similar across all x-values; and the residuals appear to be normally distributed. So technical conditions 2–4 are satisfied. This was not a random sample of students as it was one statistics class, but it may be reasonable to consider this a representative sample with respect to these variables.
The scatterplot of male height vs. male foot length follows:
[Scatterplot: male height (in inches) vs. male foot length (in centimeters), with fitted line predicted male height = 54.26 + 0.6111 (male foot length)]
The regression equation is predicted male height = 54.26 + 0.6111 (male foot length). The value r² is 18.7%. In symbols, H0: β = 0 and Ha: β ≠ 0. The test statistic is t = 0.611/0.5207 ≈ 1.17. Using Table III with 6 degrees of freedom, p-value > .2. Using Minitab, the p-value is .285. Because the p-value is not small, do not reject H0. There is no convincing evidence of a correlation between height and foot length in this population of males. The residual plots follow:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. male foot length (in centimeters)]
Technical conditions: The variability in the residuals appears to be similar across all x-values, and the residuals appear to be normally distributed and without an obvious pattern (though the sample size is rather small). So, technical conditions 3 and 4 are satisfied. This was not a random sample of students as it was one statistics class, but it may be reasonable to consider it a representative sample of males. In this case, neither handspan nor foot length appears to be a reasonable predictor of height for these males. There is no real evidence of a correlation in either case, although this may be because there were only eight males in the sample.
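The Minitab p-values quoted in part a (.58 for handspan and .285 for foot length) can be approximated without statistical software by integrating the t density numerically. The sketch below assumes those are two-sided p-values with 6 degrees of freedom; the function names are illustrative, and Simpson's rule is just one convenient way to approximate the area.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Approximate 2 * P(T >= |t|) using Simpson's rule on [0, |t|].

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # approximates P(0 <= T <= |t|)
    return 2 * (0.5 - central_area)


# Slopes and standard errors quoted in part a (n = 8 males, so 6 df).
p_handspan = two_sided_p_value(0.5259 / 0.8982, df=6)
p_foot = two_sided_p_value(0.611 / 0.5207, df=6)
print(f"handspan: p = {p_handspan:.3f}")
print(f"foot length: p = {p_foot:.3f}")
```

Both results should land close to the Minitab values, which supports reading neither slope as significantly different from 0.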
b. The scatterplot of female height vs. female handspan follows:
[Scatterplot: female height (in inches) vs. female handspan (in centimeters), with fitted line predicted female height = 27.77 + 1.925 (female handspan)]
The regression equation is predicted female height = 27.77 + 1.925 (female handspan). The value r² is 40.4%. In symbols, H0: β = 0 and Ha: β ≠ 0. The test statistic is t = 1.925/0.5669 ≈ 3.40. Using Table III with 17 degrees of freedom, .001 < p-value < .005. Using Minitab, the p-value is .003. With the small p-value, reject H0. Conclude there is strong statistical evidence of a correlation between female height and female handspan at this school, assuming the sample is representative.
A 95% CI for β is 1.925 ± (2.11)(0.5669) = (0.729, 3.121). You are 95% confident the average increase in height of a female student at this school is somewhere between 0.729 and 3.121 inches for each additional cm of handspan. The residual plots follow:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. female handspan (in centimeters)]
Technical conditions: The residuals appear to be normally distributed, and there is no obvious pattern (curvature) in the residual plot, but the variability in the residuals is not very similar across all x-values. This was not a random sample of students as it was one statistics class, but it should be a reasonably representative sample for this question. Thus technical conditions 1–3 are satisfied, but perhaps not condition 4.
The scatterplot of female height vs. female foot length follows:
[Scatterplot: female height (in inches) vs. female foot length (in centimeters), with fitted line predicted female height = 36.18 + 1.249 (female foot length)]
The regression equation is predicted female height = 36.18 + 1.249 (female foot length). The value r² is 43.9%. In symbols, H0: β = 0 and Ha: β ≠ 0. The test statistic is t = 1.249/0.3425 = 3.62. Using Table III with 17 degrees of freedom, .001 < p-value < .005. Using Minitab, the p-value is .002. With the small p-value, reject H0. Conclude there is strong statistical evidence of a correlation between female height and female foot length at this school, as long as the sample is representative.
A 95% CI for β is 1.249 ± (2.11)(0.3425) = (0.526, 1.972). You are 95% confident the average increase in height of a female student at this school is somewhere between 0.526 and 1.972 inches for each additional cm of foot length. The residual plots follow:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. female foot length (in centimeters)]
The residuals appear to be normally distributed, and there is no obvious pattern (curvature) in the residual plot, but the variability in the residuals is not very similar across all x-values. (There is more variation for foot lengths less than 25 centimeters than for foot lengths greater than 25 centimeters.) This was not a random sample of students as it was one statistics class, but it might be a representative sample. Thus, technical conditions 1–3 are satisfied, but perhaps not condition 4.
For these female students, foot length seems to be a slightly better predictor of height than does handspan. This is evident from comparing the respective values of r².
Activity 29-15: Chip Melting
Answers will vary by class. The following is one representative example.
[Scatterplot: chocolate-chip melting times (in seconds) vs. peanut-butter-chip melting times (in seconds), with fitted line predicted chocolate chip = 33.23 + 0.7260 (peanut butter)]
The regression equation is predicted chocolate-chip melting time = 33.23 + 0.7260 (peanut-butter-chip melting time); r² = 52%.
H0: There is no correlation between the population melting times of chocolate and peanut-butter chips, or the slope of the regression line between them is 0. In symbols, H0: β = 0.
Ha: There is a positive correlation between these melting times. In symbols, Ha: β > 0.
Test statistic: t = 0.726/0.1646 ≈ 4.41. Using Table III with 18 degrees of freedom, p-value < .0005. Using Minitab, the p-value is .000.
Test decision: The p-value is quite small, so reject H0.
Conclusion in context: Conclude there is strong statistical evidence of a positive correlation between population melting times of these two types of chips.
Technical conditions:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. peanut-butter-chip melting time (in seconds)]
The residuals appear to be normally distributed, and there is no obvious pattern (curvature) in the residual plot, but the variability in the residuals is not very similar across all x-values. (The variation increases as the peanut-butter-chip melting times increase.) This was not a random sample of students as it was one statistics class, but it could be a representative sample on these variables. Thus technical conditions 1–3 are satisfied, but not condition 4.
Activity 29-16: Airfares
a. H0: There is no correlation in the population between airfare and distance to Baltimore, or the slope of the population regression line between these variables is 0. In symbols, H0: β = 0. Ha: There is a positive correlation between these variables in the population. In symbols, Ha: β > 0. The scatterplot of airfare vs. distance to Baltimore follows:
[Scatterplot: airfare (in dollars) vs. distance to Baltimore (in miles), with fitted line predicted airfare = 83.27 + 0.1174 (distance)]
The regression equation is predicted airfare = 83.27 + 0.1174 (distance). The value r² is 63.2%. The test statistic is t = 0.117/0.02832 ≈ 4.13. Using Table III with 10 degrees of freedom, .001 < p-value < .005. Using Minitab, the two-sided p-value is .002, so the one-sided p-value is .001. With the small p-value, reject H0. Conclude there is strong statistical evidence that the population slope coefficient is positive.
b. A 99% CI for β is $0.117/mi ± (3.169)($0.02832/mi) = ($0.0273/mi, $0.2067/mi). You are 99% confident the average increase in airfare is somewhere between $0.0273 and $0.2067 for each additional mile between a city and Baltimore.
c. This confidence interval suggests it is plausible that the predicted price of flying to a destination increases by a nickel or a dime, but not by a quarter, for each additional mile flown.
d. The residual plots follow:
[Residual plots: normal probability plot of the residuals (percentage vs. residual); residuals vs. distance to Baltimore (in miles)]
Technical conditions: The residuals appear to be normally distributed; there is no obvious pattern (curvature) in the residual plot; and the variability in the residuals appears reasonably similar across all x-values, although you do have two large outliers (large negative residuals). You are not told whether this was a random sample of cities, but you can assume it is representative. Thus, all the technical conditions will be considered satisfied.
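The reasoning in parts b and c (build the 99% interval, then ask which per-mile increases fall inside it) can be sketched in Python. The function name and the `candidates` dictionary are illustrative assumptions; the slope, standard error, and critical value 3.169 are the ones quoted in part b.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Confidence interval b ± t* · SE(b) for a regression slope."""
    margin = t_star * se_b
    return b - margin, b + margin


# Part b: slope and SE in dollars per mile, t* for 99% confidence with 10 df.
lower, upper = slope_confidence_interval(0.117, 0.02832, 3.169)

# Part c: which per-mile increases are plausible values for the slope?
candidates = {"nickel": 0.05, "dime": 0.10, "quarter": 0.25}
plausible = {name: lower <= value <= upper for name, value in candidates.items()}

print(f"99% CI: (${lower:.4f}, ${upper:.4f}) per mile")
for name, inside in plausible.items():
    print(f"{name}: {'plausible' if inside else 'not plausible'}")
```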
Activity 29-17: Marriage Ages
a. The regression equation is predicted wife’s age = 2.446 + 0.8790 (husband’s age). The value r² is 89.1%. The scatterplot of wife’s age vs. husband’s age follows:
[Scatterplot: wife’s age (in years) vs. husband’s age (in years), with fitted line predicted wife’s age = 2.446 + 0.8790 (husband’s age)]
b. H0: The slope of the regression line between wife’s age and husband’s age in the population is 0. In symbols, H0: β = 0. Ha: The slope of the regression line between these two variables is not 0. In symbols, Ha: β ≠ 0. The test statistic is t = 0.879/0.06556 ≈ 13.41. Using Table III with 22 degrees of freedom, p-value < 2 × .0005 = .0010. Using Minitab, the p-value is .000. With the small p-value, reject H0. Conclude there is extremely strong statistical evidence that the population slope coefficient is not zero.
c. A 95% CI for β is 0.879 ± (2.074)(0.06556) = (0.743, 1.015). You are 95% confident the average increase in age for a wife is somewhere between 0.743 years and 1.015 years for each additional year her husband’s age increases in this population.
d. Yes, this confidence interval includes the value 1. This means 1 is a plausible value for the slope of the population regression line. If the slope of the population regression line is 1, that would imply husbands who differ in age by one year would have wives who are predicted to differ in age by exactly one year as well. In fact, husbands who differ in age by any number of years would have wives who are predicted to differ in age by the same number of years.
e. H0: The slope of the regression line between wife’s age and husband’s age in the population is 1. In symbols, H0: β = 1. Ha: The slope of the regression line between these two variables is not 1. In symbols, Ha: β ≠ 1. The test statistic is t = (0.879 − 1)/0.06556 ≈ −1.85. Using Table III with 22 degrees of freedom, 2 × .025 < p-value < 2 × .05, so .05 < p-value < .10. Because the p-value > .05, do not reject H0 at the .05 significance level. There is no reason to doubt the population slope coefficient is 1.0.
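The table lookup used for the test of slope 1 can be written out as a short Python sketch. The critical values below are the standard one-tail t values for 22 degrees of freedom; it is an assumption that Table III prints these same columns, and the function name `bracket_two_sided_p` is illustrative.

```python
# One-tail critical values for 22 df, (tail area, critical value),
# listed from largest to smallest tail area.
T_TABLE_22_DF = [
    (0.10, 1.321),
    (0.05, 1.717),
    (0.025, 2.074),
    (0.01, 2.508),
    (0.005, 2.819),
]


def bracket_two_sided_p(t, table):
    """Return (low, high) bounds on the two-sided p-value from tabled values."""
    a = abs(t)
    low, high = 0.0, 1.0
    for tail_area, critical in table:
        if a >= critical:
            high = 2 * tail_area  # |t| is beyond this critical value
        else:
            low = 2 * tail_area   # |t| falls short of this critical value
            break
    return low, high


# Test of H0: slope = 1 using the slope and SE quoted for these data.
t_stat = (0.879 - 1) / 0.06556
low, high = bracket_two_sided_p(t_stat, T_TABLE_22_DF)
print(f"t = {t_stat:.2f}, so {low:.2f} < p-value < {high:.2f}")
```

Because the lower bound is .05, the test does not reject at the .05 level, which agrees with the confidence interval containing 1.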
Activity 29-18: Marriage Ages
a. The husband with the most potential influence over the regression line by virtue of his extreme age is the husband in couple 16, who is 71 years old. In this setting, “influence” means that if you were to change or remove this couple from the analysis, the regression line and coefficient of determination would be likely to change substantially.
b. [Scatterplot: wife’s age (in years) vs. husband’s age (in years), with fitted line predicted wife’s age = 13.72 + 0.5027 (husband’s age)]
The equation is predicted wife’s age = 13.72 + 0.5027 (husband’s age). The value r² is 44.9%. In symbols, H0: β = 0 vs. Ha: β ≠ 0. The test statistic is t = 0.5027/0.1188 ≈ 4.23. Using Table III with 22 degrees of freedom, p-value < 2 × .0005 = .0010. Using Minitab, the p-value is .000. With such a small p-value, reject H0. Conclude there is strong statistical evidence the population slope coefficient is not zero.
A 95% CI for β is 0.5027 ± (2.074)(0.1188) = (0.256, 0.749). You are 95% confident the average increase in wife’s age in this population is somewhere between 0.256 years and 0.749 years for each additional year of her husband’s age. No, this confidence interval does not include the value 1. This means 1 is no longer a plausible value for the slope of the regression line.
H0: The slope of the regression line between wife’s age and husband’s age in the population is 1. In symbols, H0: β = 1. Ha: The slope of the regression line between these two variables is not 1. In symbols, Ha: β ≠ 1. The test statistic is t = (0.5027 − 1)/0.1188 ≈ −4.19. Using Table III with 22 degrees of freedom, p-value < 2 × .0005, so p-value < .001. Because the p-value < .001 < .05, reject H0 at the .05 significance level. Conclude there is very strong statistical evidence the population slope coefficient is not 1.0.
c. This was a very influential point. Making this change cut the coefficient of determination in half, decreased the slope, and increased SE(b) significantly. Although you still concluded the slope was not 0, you can no longer believe the slope could be 1.0.
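The comparison in part c can be made concrete by putting the two fits side by side. The sketch below only reuses the summary values quoted in Activities 29-17 and 29-18; the `SlopeFit` class and its method names are hypothetical helpers, not part of any statistics package.

```python
from dataclasses import dataclass


@dataclass
class SlopeFit:
    """Summary of one fitted regression line (values copied from the text)."""
    slope: float
    se: float
    r_squared: float

    def confidence_interval(self, t_star):
        margin = t_star * self.se
        return self.slope - margin, self.slope + margin

    def contains(self, value, t_star):
        lower, upper = self.confidence_interval(t_star)
        return lower <= value <= upper


T_STAR = 2.074  # 95% critical value for 22 degrees of freedom, as quoted

before = SlopeFit(slope=0.8790, se=0.06556, r_squared=0.891)  # Activity 29-17
after = SlopeFit(slope=0.5027, se=0.1188, r_squared=0.449)    # Activity 29-18

r_squared_ratio = after.r_squared / before.r_squared
se_ratio = after.se / before.se

print(f"r-squared ratio (after/before): {r_squared_ratio:.2f}")
print(f"SE(b) ratio (after/before): {se_ratio:.2f}")
print(f"slope 1 plausible before: {before.contains(1.0, T_STAR)}")
print(f"slope 1 plausible after: {after.contains(1.0, T_STAR)}")
```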
Assessment Sample Quiz 29A
•••
A sample of students at a university took a test that diagnosed their learning styles as active or reflective and also as visual or verbal. Each student received a numerical score on the active/reflective style and also a numerical score on the visual/verbal style. The sample size was 39, and the sample correlation coefficient turned out to equal .273.
1. State the hypotheses for testing whether there is a positive correlation between these variables in the population of all students at this university.
2. Calculate the value of the test statistic.
3. Determine the p-value as accurately as possible.
4. State the test decision for the .10 and .05 significance levels, and summarize your conclusion in context.
5. If the sample size were much larger, and the value of the sample correlation coefficient stayed the same, describe the impact on your test statistic, p-value, and conclusion.
Solution to Sample Quiz 29A
•••
1. H0: ρ = 0. There is no correlation between these variables in the population of all students at this university.
Ha: ρ > 0. There is a positive correlation between these variables in the population of all students at this university.
2. The test statistic is t = r√(n − 2) / √(1 − r²) = .273√37 / √(1 − (.273)²) ≈ 1.73.
3. Using Table III with 37 degrees of freedom, .025 < p-value < .05. Using Minitab, the p-value is .046.
4. For both the .10 and .05 significance levels, reject the null hypothesis and conclude there is moderate statistical evidence of a positive correlation between these learning style scores in the population of all students at this university.
5. If the sample size were much larger, and the value of the sample correlation coefficient stayed the same, the test statistic would be much greater (the numerator would be greater but the denominator would be unchanged); therefore, the p-value would be smaller, and the conclusion would be that the sample data provide much stronger statistical evidence of a positive relationship between these learning style scores.
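Answers 2 and 5 can be illustrated with a short Python sketch of the test statistic for a correlation coefficient. The function name is illustrative, and the sample size of 390 is a hypothetical "much larger" sample chosen only to show the direction of the change.

```python
import math


def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_quiz = correlation_t_statistic(0.273, 39)     # values given in the quiz
t_larger = correlation_t_statistic(0.273, 390)  # hypothetical larger sample, same r

print(f"n = 39:  t = {t_quiz:.2f}")
print(f"n = 390: t = {t_larger:.2f}")
```

Only the numerator depends on n, so holding r fixed while increasing n always pushes the test statistic further from 0.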