Section A: Multi-Choice (40 Marks)
For Section A questions, only record the letter corresponding to your answer. Do not
present any working to support your choice of answer.
Use the following information to answer Questions 1 to 6.
When MetService reports that there is a 30% chance of rain in Wellington for a given day,
it means that they estimate the probability of any rain in Wellington for that day to be
0.3. Consider 8 randomly selected days where MetService reported that there was a 30%
chance of rain in Wellington. Suppose that rain was recorded in Wellington for 5 of those
days.
1. Assuming that MetService’s reported probability of rain in Wellington for each of
those days is correct, what is the probability (to 4dp) that exactly 5 of the 8 days
had rain? (5 marks)
a. 0.0008
b. 0.0013
c. 0.0467
d. 0.625
e. 0.9887
2. For a random sample of 8 days where MetService reports that there is a 30% chance
of rain, what are the mean and variance (to 4dp) for the number of days Y that
have rain? (5 marks)
a. E(Y ) = 0.3, V(Y ) = 0.0262.
b. E(Y ) = 0.3, V(Y ) = 0.162.
c. E(Y ) = 2.4, V(Y ) = 1.2961.
d. E(Y ) = 2.4, V(Y ) = 1.68.
e. E(Y ) = 2.4, V(Y ) = 5.6.
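The quantities in Questions 1 and 2 can be checked numerically. The following is a minimal sketch, assuming the number of rainy days Y follows a binomial distribution with n = 8 and p = 0.3 (independent days with a common probability of rain). The helper name binomial_pmf is introduced here for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 8, 0.3  # 8 days, each with reported probability of rain 0.3

prob_exactly_5 = binomial_pmf(5, n, p)  # probability that exactly 5 days have rain
mean_y = n * p                          # E(Y) = np
var_y = n * p * (1 - p)                 # V(Y) = np(1 - p)

print(round(prob_exactly_5, 4), round(mean_y, 4), round(var_y, 4))
```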
3. Again assuming that MetService’s reported probability of rain in Wellington for each
of the 8 days is correct, use a normal approximation to find the probability (to 4dp)
that rain was recorded on fewer than 5 of the 8 days. (For your reference, a standard
normal probability table is presented on page 8.) (5 marks)
a. 0.7422
b. 0.8023
c. 0.8907
d. 0.8944
e. 0.9474
4. Was it appropriate to use the normal approximation in Question 3? (5 marks)
a. No. One of np and n(1 − p) is less than 5.
b. No. Both np and n(1 − p) are less than 5.
c. Yes. One of np and n(1 − p) is at least 5.
d. Yes. Both np and n(1 − p) are at least 5.
e. Yes. Both np and np(1 − p) are at least 5.
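The normal approximation in Question 3 and the rule of thumb in Question 4 can be sketched as follows. This sketch assumes a continuity correction is applied, so "fewer than 5" is evaluated at 4.5; the paper does not state this explicitly. It uses the exact normal CDF rather than a table lookup, so the result may differ from a table-based answer in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

n, p = 8, 0.3
mu = n * p                     # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))  # standard deviation of the approximating normal

# P(Y < 5) = P(Y <= 4), evaluated at 4.5 (continuity correction, an assumption here)
approx_prob = NormalDist(mu, sigma).cdf(4.5)

# Rule of thumb: both np and n(1 - p) should be at least 5
expected_successes = n * p
expected_failures = n * (1 - p)
rule_satisfied = expected_successes >= 5 and expected_failures >= 5

print(round(approx_prob, 4), expected_successes, expected_failures, rule_satisfied)
```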
5. Now, suppose that the true probability of rain in Wellington for the 8 randomly
sampled days is unknown. Using the observed number of days with rain (5), produce
an Agresti-Coull adjusted 95% confidence interval (to 4dp) for the true probability
of rain p. (5 marks)
a. (0.2895, 0.9605)
b. (0.3044, 0.8623)
c. (0.441, 0.7257)
d. (0.4538, 0.7962)
e. None of the above.
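One common form of the adjusted interval in Question 5 can be sketched as follows. This sketch assumes the "add two successes and two failures" version of the Agresti-Coull adjustment with z = 1.96; other texts add z²/2 successes and z²/2 failures instead, which gives slightly different limits. The function name plus_four_interval is hypothetical.

```python
from math import sqrt


def plus_four_interval(x, n, z=1.96):
    """Adjusted interval: add 2 successes and 2 failures, then use the Wald formula."""
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width


lower, upper = plus_four_interval(5, 8)  # 5 rainy days out of 8
print(round(lower, 4), round(upper, 4))
```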
6. Finally, suppose that you want to estimate the true proportion of days with rain in
Wellington, and you plan to present your results using a 90% confidence interval.
Find the most conservative minimum sample size required if the interval is to have
an approximate margin of error of 0.03. (5 marks)
a. 632
b. 752
c. 897
d. 1068
e. None of the above.
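The sample size calculation in Question 6 can be sketched as follows, assuming "most conservative" means taking p = 0.5, which maximises p(1 - p), and rounding up to the next whole number. The function name conservative_sample_size is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def conservative_sample_size(margin, confidence):
    """Smallest n with z * sqrt(0.25 / n) <= margin, using p = 0.5."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return ceil(z**2 * 0.25 / margin**2)


n_required = conservative_sample_size(margin=0.03, confidence=0.90)
print(n_required)
```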
Use the following information to answer Questions 7 to 8.
For students who graduated from a particular university 5 years ago, the following data
show the numbers of students who have not changed their jobs since they graduated for a
random selection of 110 bachelor’s degree students and 70 master’s degree students:
Degree        Sample size    Number of students in the same job
Bachelor’s    110            58
Master’s      70             32
7. Let pB denote the proportion of bachelor’s degree students who have not changed
their jobs since they graduated and pM denote the proportion of master’s degree
students who have not changed their jobs since they graduated. Calculate the test
statistic z* (to 4dp) for a test of
H0 : pB = pM
H1 : pB ≠ pM.
(5 marks)
a. z* = 0.9174
b. z* = 0.5326
c. z* = −4.9566
d. z* = −5.4526
e. None of the above.
8. For the hypothesis test in Question 7, calculate the p-value (to 4dp). Use the
standard normal probability table on page 8 to calculate the p-value. (5 marks)
a. p-value = 0.0000
b. p-value = 0.1788
c. p-value = 0.2981
d. p-value = 0.3576
e. p-value = 0.5962
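The test in Questions 7 and 8 can be sketched as follows, assuming the usual two-sample z-test for proportions with a pooled estimate of the common proportion under H0. The p-value here uses the exact normal CDF; a table lookup with z rounded to two decimal places will give a slightly different value.

```python
from math import sqrt
from statistics import NormalDist

x_b, n_b = 58, 110  # bachelor's: in the same job, sample size
x_m, n_m = 32, 70   # master's: in the same job, sample size

p_b = x_b / n_b
p_m = x_m / n_m
p_pooled = (x_b + x_m) / (n_b + n_m)  # common proportion estimated under H0

standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_b + 1 / n_m))
z_stat = (p_b - p_m) / standard_error

# Two-sided p-value for H1: pB != pM
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(round(z_stat, 4), round(p_value, 4))
```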
Section B: Written Answers (60 Marks)
For Section B questions, you must write your response to the question. Page 7 includes
SAS output which may prove useful to answering parts of Question 9.
9. (40 marks)
Consider data published in the 1950s on a case-control study investigating the
relationship between smoking and lung cancer. A breakdown of lung cancer by
smoker status (where smokers are classified as those smoking at least 1 cigarette
per day for a year) and reported sex of the individual is presented in the partial
contingency tables below.
                              Smoker status
Sex       Has lung cancer?    Smoker    Non-smoker
Male      Yes                 647       2
          No                  622       27
Female    Yes                 41        19
          No                  28        32
a. Estimate the conditional associations between incidence of lung cancer and
smoker status, conditional on reported sex of the individual, using conditional
odds ratios (to 4dp).
b. Assuming that the conditional associations estimated in part (a) are indicative
of the true conditional odds ratios, do reported sex of the individual and smoker
status interact in their effect on incidence of lung cancer? Explain why or why
not.
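The conditional odds ratios in part (a) can be sketched as follows, assuming each partial table is laid out with lung cancer (Yes/No) in the rows and smoker status (Smoker/Non-smoker) in the columns, and using the cross-product ratio. The function name odds_ratio is introduced for illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


# Rows: lung cancer Yes / No; columns: Smoker / Non-smoker
or_male = odds_ratio(647, 2, 622, 27)
or_female = odds_ratio(41, 19, 28, 32)

print(round(or_male, 4), round(or_female, 4))
```

Comparing the two values is the starting point for part (b): conditional odds ratios that differ across the levels of sex point towards an interaction.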
Now consider the marginal table representing the relationship between lung cancer
and smoker status, as shown below.
                    Smoker status
Has lung cancer?    Smoker    Non-smoker
Yes                 688       21
No                  650       59
c. Using the marginal table, estimate the association between smoker status and
lung cancer using the odds ratio (to 4dp). Interpret the estimated odds ratio,
and present a corresponding 95% confidence interval (to 4dp).
d. Using the marginal table, carry out a chi-square test of independence for smoker
status and incidence of lung cancer. Relevant SAS output can be found on
page 7 and may be used to answer this question (i.e., hand calculations are not
required). Be sure to answer the following questions:
i. Is a chi-square test of independence appropriate for the data presented in
the marginal table? Why or why not?
ii. What are the hypotheses to be tested?
iii. What are the Pearson and likelihood ratio chi-square test statistics? What
are their distributions under the null hypothesis?
iv. What are the p-values corresponding to the Pearson and likelihood ratio
chi-square test statistics?
v. What is your conclusion at the α = 0.05 significance level?
e. Again using the marginal table, now carry out Fisher’s exact test to determine
if smokers are more likely to have lung cancer than non-smokers. Relevant SAS
output can be found on page 7 and may be used to answer this question (i.e.,
hand calculations are not required).
Be sure to answer the following questions:
i. What are the hypotheses to be tested?
ii. What is the p-value for the test? Clearly explain what row in the SAS
output provides relevant information for this p-value.
iii. Although it would be possible to calculate a mid-p-value in theory, explain
why a mid-p-value would be unnecessary in this case. (Note: You are being
asked a conceptual question, not to try to calculate a mid-p-value.)
iv. What conclusion would you make at the α = 0.05 significance level?
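For part (c) of Question 9, the marginal odds ratio and its confidence interval can be sketched as follows. This assumes the Wald interval on the log odds ratio scale with z = 1.96, which is one standard choice; the paper does not name the method. It does not reproduce the SAS output referred to in parts (d) and (e).

```python
from math import exp, log, sqrt

# Marginal table: rows lung cancer Yes / No; columns Smoker / Non-smoker
a, b = 688, 21
c, d = 650, 59

marginal_or = (a * d) / (b * c)

# Wald interval on the log scale (assumed method), then back-transformed
se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = 1.96
ci_lower = exp(log(marginal_or) - z * se_log_or)
ci_upper = exp(log(marginal_or) + z * se_log_or)

print(round(marginal_or, 4), round(ci_lower, 4), round(ci_upper, 4))
```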
10. (20 marks)
The Otago region has a rabbit problem, and farmers are interested to know whether
or not the distribution of rabbit holes in the region is random. A researcher took a
random sample of 30 areas (each 100 m²) and calculated an average of 1.6 rabbit
holes per area. The frequency distribution for the number of rabbit holes per
100 m² is given below.
Number of rabbit    Frequency
holes (r)           (fr)         P(Y = r)    f̂r
0                   6            0.2019      f̂0
1                   6            0.32303     9.6909
2                   12           0.25843     7.7529
≥ 3                 6            P(Y ≥ 3)    6.4992
It is of interest to test the hypotheses
H0 : The population distribution is Poisson.
H1 : The population distribution is not Poisson.
using a chi-square goodness-of-fit test.
a. What are the missing values for the probability (to 5dp) and expected frequency
(to 4dp) corresponding to the black cells in the above table?
b. Calculate the test statistic (to 4dp).
c. Name the probability distribution that the test statistic follows under the null
hypothesis. Explain why this distribution has two degrees of freedom.
d. The p-value for the test is 0.1517 (to 4dp). State what you would conclude at
the α = 0.05 significance level.
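The goodness-of-fit calculation in Question 10 can be sketched as follows, assuming the Poisson mean is estimated by the sample average of 1.6 and the last category pools all counts of 3 or more. The closed-form p-value exp(-x/2) is specific to a chi-square distribution with 2 degrees of freedom. Unrounded probabilities are used, so intermediate values may differ from the table in the last decimal place.

```python
from math import exp, factorial


def poisson_pmf(r, lam):
    """P(Y = r) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam**r / factorial(r)


lam = 1.6                 # sample mean, used as the estimated Poisson rate
n_areas = 30
observed = [6, 6, 12, 6]  # counts for r = 0, 1, 2 and r >= 3

probs = [poisson_pmf(r, lam) for r in range(3)]
probs.append(1 - sum(probs))  # P(Y >= 3) by complement
expected = [n_areas * q for q in probs]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: categories - 1 - number of estimated parameters
df = len(observed) - 1 - 1

# Upper-tail probability of a chi-square distribution; this form holds only for df = 2
p_value = exp(-chi_sq / 2)

print([round(q, 5) for q in probs], [round(e, 4) for e in expected])
print(round(chi_sq, 4), df, round(p_value, 4))
```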
Standard Normal Probabilities P(0 ≤ Z ≤ z) for Z ∼ N(0, 1)
z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
3.5 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998
3.6 0.4998 0.4998 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.7 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.8 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.9 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000