## Description

P1. Let
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2.$$

Show that:

(a) (2pt) $\sum_{i=1}^{n} X_i^2 = (n-1)S^2 + n\bar{X}^2$

(b) (2pt) If $X_1, X_2, \ldots, X_n$ are independent and identically distributed (i.i.d.), then $S^2$ is an unbiased estimator of $\sigma^2$, i.e., $\mathbb{E}S^2 = \sigma^2$.

In the following, in addition to the above, assume that the $X_i$ have a normal/Gaussian distribution $N(\mu, \sigma^2)$.

(c) (3pt) Show (prove) that $\bar{X}$ is independent of $X_i - \bar{X}$, $i = 1, 2, \ldots, n$.
(Hint: Both $\bar{X}$ and $X_i - \bar{X}$ are normal.)

(d) (3pt) Show (prove) that the sample mean, $\bar{X}$, is independent of the sample variance, $S^2$.
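A sketch of the algebra behind part (a), using only the definitions above (expand the square and use $\sum_{i=1}^{n} X_i = n\bar{X}$):

```latex
\sum_{i=1}^{n} (X_i - \bar{X})^2
  = \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2
  = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2,
\qquad\text{so}\qquad
(n-1)S^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 .
```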

P2. (10pt) Show that in the case of simple linear regression of $Y$ onto $X$, the $R^2$ statistic is equal to the square of the correlation coefficient between $X$ and $Y$ ($r^2$). For simplicity, you may assume that $\bar{y} = \bar{x} = 0$. Recall that
$$R^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} \qquad \text{and} \qquad r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}.$$
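One possible route: with $\bar{x} = \bar{y} = 0$, least squares gives $\hat{\beta}_1 = \sum_i x_i y_i / \sum_i x_i^2$ and $\hat{y}_i = \hat{\beta}_1 x_i$, after which the two quantities can be compared term by term:

```latex
R^2 = \frac{\sum_i \hat{y}_i^2}{\sum_i y_i^2}
    = \frac{\hat{\beta}_1^2 \sum_i x_i^2}{\sum_i y_i^2}
    = \frac{\left(\sum_i x_i y_i\right)^2}{\sum_i x_i^2 \sum_i y_i^2}
    = r^2 .
```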

P3. (20pt; each bullet 2pt) Create some simulated data and fit simple linear regression models to it. Make sure to use set.seed(1) prior to starting part (a) to ensure consistent results.

(a) Using the rnorm() function, create a vector, x, containing 100 observations drawn from a $N(0, 1)$ distribution. This represents a feature, $X$.

(b) Using the rnorm() function, create a vector, eps, containing 100 observations drawn from a $N(0, 0.25)$ distribution.

(c) Using x and eps, generate a vector y according to the model
$$Y = -1 + 0.5X + \varepsilon.$$
What is the length of the vector y? What are the values of $\beta_0$ and $\beta_1$ in this linear model?

(d) Create a scatterplot displaying the relationship between x and y. Comment on what you observe.

(e) Fit a least squares linear model to predict y using x. Comment on the model obtained. How do $\hat{\beta}_0$ and $\hat{\beta}_1$ compare to $\beta_0$ and $\beta_1$?

(f) Display the least squares line on the scatterplot obtained in (d). Draw the population regression line on

the plot, in a different color. Use the legend() command to create an appropriate legend.

(g) Now fit a polynomial regression model that predicts y using x and $x^2$. Is there evidence that the quadratic term improves the model fit? Explain your answer.


(h) Repeat (a)-(f) after modifying the data generation process in such a way that there is less noise in the

data. The model in (c) should remain the same. You can do this by decreasing the variance of the normal

distribution used to generate the error term in (b). Describe your results.

(i) Repeat (a)-(f) after modifying the data generation process in such a way that there is more noise in the data. The model in (c) should remain the same. You can do this by increasing the variance of the normal distribution used to generate the error term in (b). Describe your results.

(j) What are the confidence intervals for $\beta_0$ and $\beta_1$ based on the original data set, the noisier data set, and the less noisy data set? Comment on your results.
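A minimal R sketch for parts (a)-(f). Note that rnorm() takes a standard deviation, not a variance, so a $N(0, 0.25)$ error term uses sd = 0.5 (here the 0.25 is read as a variance, which is an assumption about the intended parameterization):

```r
set.seed(1)
x   <- rnorm(100)                    # (a) feature X ~ N(0, 1)
eps <- rnorm(100, sd = sqrt(0.25))   # (b) error ~ N(0, 0.25); rnorm takes sd, not variance
y   <- -1 + 0.5 * x + eps            # (c) beta0 = -1, beta1 = 0.5; length(y) is 100

plot(x, y)                           # (d) scatterplot of x against y
fit <- lm(y ~ x)                     # (e) least squares fit
coef(fit)                            # estimates should be close to -1 and 0.5

abline(fit, col = "red")             # (f) least squares line
abline(-1, 0.5, col = "blue")        # population regression line
legend("topleft", c("least squares", "population"),
       col = c("red", "blue"), lty = 1)
```

For (g), `lm(y ~ x + I(x^2))` adds the quadratic term; the p-value of its coefficient in `summary()` indicates whether there is evidence it improves the fit.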

P4. (10pt) Using R and the Advertising data set, find 92% confidence intervals for $\beta_0$ and $\beta_1$ for three linear regressions of Sales onto Newspaper, TV, and Radio; and create a scatterplot for each of them with the 92% confidence intervals. The answer should include the R code and graphs.
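A sketch of the workflow for one of the three regressions, using `confint()` for the coefficient intervals and `predict()` for the plotted band. The Advertising data would normally be loaded from a CSV file; the small simulated data frame below is a stand-in (an assumption, not the real data set) so the sketch is self-contained:

```r
set.seed(1)
# Stand-in for the Advertising data (hypothetical values; replace with
# something like read.csv("Advertising.csv") in the actual solution).
adv <- data.frame(TV = runif(200, 0, 300))
adv$Sales <- 7 + 0.05 * adv$TV + rnorm(200, sd = 2)

fit <- lm(Sales ~ TV, data = adv)
ci  <- confint(fit, level = 0.92)    # 92% CIs for beta0 and beta1
ci

# Scatterplot with the fitted line and 92% confidence band for the mean response.
plot(adv$TV, adv$Sales, xlab = "TV", ylab = "Sales")
newx <- data.frame(TV = seq(min(adv$TV), max(adv$TV), length.out = 100))
band <- predict(fit, newx, interval = "confidence", level = 0.92)
lines(newx$TV, band[, "fit"])
lines(newx$TV, band[, "lwr"], lty = 2)
lines(newx$TV, band[, "upr"], lty = 2)
```

The same pattern repeats for Sales ~ Radio and Sales ~ Newspaper.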

P5. Consider the Auto data set:

(a) (5pt) Produce a scatterplot matrix which includes all of the pairs of variables in the data set.

(b) (5pt) Compute the matrix of correlations between the variables using the function cor(). You will need

to exclude the name variable, which is qualitative.

(c) (5pt) Use the lm() function to perform a multiple linear regression with mpg as the response and all other

variables except name as the predictors. Use the summary() function to print the results. Comment on

the output. For instance:

i. Is there a relationship between the predictors and the response?

ii. Which predictors appear to have a statistically significant relationship to the response?

iii. What does the coefficient for the year variable suggest?

(d) (5pt) Try a few different transformations of the variables, such as $\log(X)$, $\sqrt{X}$, $X^2$. Comment on your findings.

P6. (10pt) A data set has $n = 20$,
$$\sum_{i=1}^{20} x_i = 8.552, \quad \sum_{i=1}^{20} y_i = 398.2, \quad \sum_{i=1}^{20} x_i^2 = 5.196, \quad \sum_{i=1}^{20} y_i^2 = 9356, \quad \text{and} \quad \sum_{i=1}^{20} x_i y_i = 216.6.$$
Calculate $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}^2$. What is the fitted value when $x = 0.5$? Compute $R^2$.
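The estimates follow from the standard closed-form least-squares formulas applied to these summary statistics; a quick R check of the arithmetic (a sketch — the hand calculation is what the problem asks for):

```r
# Summary statistics given in P6.
n   <- 20
Sx  <- 8.552;  Sy  <- 398.2
Sxx <- 5.196;  Syy <- 9356
Sxy <- 216.6

b1 <- (Sxy - Sx * Sy / n) / (Sxx - Sx^2 / n)   # slope estimate
b0 <- Sy / n - b1 * Sx / n                     # intercept estimate
yhat_05 <- b0 + b1 * 0.5                       # fitted value at x = 0.5

TSS    <- Syy - Sy^2 / n                       # total sum of squares
SSreg  <- b1 * (Sxy - Sx * Sy / n)             # regression sum of squares
R2     <- SSreg / TSS
RSS    <- TSS - SSreg
sigma2 <- RSS / (n - 2)                        # sigma-hat squared

c(b0 = b0, b1 = b1, yhat_05 = yhat_05, sigma2 = sigma2, R2 = R2)
```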

P7. (10pt) The multiple linear regression model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6$$
is fitted to a data set of $n = 45$ observations. The total sum of squares is TSS = 11.62, and the residual sum of squares is RSS = 8.95. What is the p-value for the null hypothesis
$$H_0 : \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = 0?$$
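This is the overall F-test. A sketch of the computation in R, using `pf()` for the F-distribution tail ($n = 45$ observations, $p = 6$ predictors):

```r
n <- 45; p <- 6
TSS <- 11.62; RSS <- 8.95

# F = [(TSS - RSS)/p] / [RSS/(n - p - 1)], compared to F(p, n - p - 1).
Fstat <- ((TSS - RSS) / p) / (RSS / (n - p - 1))
pval  <- pf(Fstat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)
c(F = Fstat, p.value = pval)
```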


Extra Credit

Under normality assumptions, we can compute the distributions of many quantities explicitly.

E1. (5pt) Chi-squared distribution. Let $X_1, X_2, \ldots, X_n$ be independent standard normal random variables, and recall that a Chi-squared random variable with $n$ degrees of freedom is defined as $\chi^2_n = X_1^2 + X_2^2 + \cdots + X_n^2$. Prove that the density of $\chi^2_n$ is given by
$$g_n(x) = \frac{1}{\Gamma(n/2)\,2^{n/2}}\, x^{n/2 - 1}\, e^{-x/2},$$
where $\Gamma(x)$ is the gamma function. (Hint: Prove it first for $n = 1, 2$, and then use mathematical induction.)
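For the base case $n = 1$ of the induction suggested in the hint, one can differentiate the CDF of $Z^2$ for $Z \sim N(0,1)$:

```latex
P(Z^2 \le x) = P(-\sqrt{x} \le Z \le \sqrt{x}) = 2\Phi(\sqrt{x}) - 1,
\qquad
g_1(x) = \frac{d}{dx}\bigl(2\Phi(\sqrt{x}) - 1\bigr)
       = \frac{1}{\sqrt{2\pi x}}\, e^{-x/2},
```

which agrees with $g_n$ at $n = 1$ because $\Gamma(1/2) = \sqrt{\pi}$.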

E2. (5pt) Let $X_1, X_2, \ldots, X_n$ be independent normal random variables $N(\mu, \sigma^2)$. Prove that
$$\frac{(n-1)S^2}{\sigma^2} \overset{d}{=} \chi^2_{n-1},$$
where $\overset{d}{=}$ stands for equality in distribution.
(Hint: Derive the moment generating function of $\chi^2_n$ and use problem P1.(a) and (d).)
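The moment generating function mentioned in the hint follows directly from independence: for $t < 1/2$,

```latex
\mathbb{E}\, e^{t Z^2}
  = \int_{-\infty}^{\infty} e^{t z^2}\, \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz
  = (1 - 2t)^{-1/2},
\qquad\text{so}\qquad
\mathbb{E}\, e^{t \chi^2_n} = (1 - 2t)^{-n/2}.
```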

E3. (5pt) Student's t distribution. Let $t_n$ be Student's t variable, defined as
$$t_n = \frac{Z}{\sqrt{\chi^2_n / n}},$$
where $Z \sim N(0, 1)$ is independent of $\chi^2_n$. Prove that $t_n$ has the density
$$f_n(t) = \frac{\Gamma((n+1)/2)}{\sqrt{\pi n}\,\Gamma(n/2)} \cdot \frac{1}{(1 + t^2/n)^{(n+1)/2}},$$
where $\Gamma(x)$ is the gamma function. Show that for large values of $n$, $f_n(t)$ is approximately normal, $f_n(t) \approx e^{-t^2/2}/\sqrt{2\pi}$. (Hint: First show that the conditional density (distribution) of $t_n$ given $\chi^2_n = x$ is normal with mean 0 and variance $n/x$. Then use problem E1. to integrate this conditional density.)
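The integration suggested in the hint takes the form of the conditional normal density averaged against the $\chi^2_n$ density $g_n$ from E1:

```latex
f_n(t) = \int_0^{\infty} \sqrt{\frac{x}{2\pi n}}\; e^{-t^2 x/(2n)}\; g_n(x)\, dx ,
```

and the remaining integral is a gamma integral that yields the stated formula.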

E4. (5pt) F (Fisher) distribution. Let $U$ and $V$ be two independent Chi-squared random variables with degrees of freedom $n_1$ and $n_2$, and define the $F \equiv F(n_1, n_2)$ random variable as
$$F = \frac{U/n_1}{V/n_2}.$$
Show that the density of $F$ is given by
$$f_{n_1, n_2}(w) = \frac{(n_1/n_2)^{n_1/2}\,\Gamma[(n_1+n_2)/2]\, w^{(n_1/2)-1}}{\Gamma[n_1/2]\,\Gamma[n_2/2]\,[1 + (n_1 w/n_2)]^{(n_1+n_2)/2}}.$$
(Hint: Compute first the distribution of $F$ given $V$.)
