Description

1. In this exercise you'll create some simulated data and fit a simple linear regression model to it.

(a) [1 point] Perform the following commands in R:

    set.seed(1)
    x1 <- runif(100)
    x2 <- 0.5 * x1 + rnorm(100) / 10
    Y <- 2 + 2 * x1 + 0.3 * x2 + rnorm(100)

Write out the form of the linear model. What are the regression coefficients?

(b) [1 point] What is the correlation between x1 and x2? Create a scatterplot displaying the relationship between the variables.

(c) [2 points] Using this data, fit a least squares regression to predict Y using x1 and x2. Describe the results obtained. What are β̂0, β̂1, and β̂2? How do these relate to the true β0, β1, and β2? Can you reject the null hypothesis H0: β1 = 0? How about H0: β2 = 0?

(d) [1 point] Now fit a least squares regression to predict Y using only x1. Comment on your results. Can you reject the null hypothesis H0: β1 = 0?

(e) [1 point] Now fit a least squares regression to predict Y using only x2. Comment on your results. Can you reject the null hypothesis H0: β1 = 0?

(f) [2 points] Do the results obtained in (c)-(e) contradict each other? Explain your answer.

(g) [3 points] Now suppose we obtain one additional observation, which was unfortunately mismeasured:

    x1 <- c(x1, 0.1)
    x2 <- c(x2, 0.8)
    Y <- c(Y, 6)

Re-fit the linear models from (c) to (e) using this new data. What effect does this new observation have on each of the models? In each model, is this observation an outlier? A high-leverage point? Both? Explain your answers and make suitable plots.

2. [6 points] This problem relates to the QDA model, in which the observations within each class are drawn from a normal distribution with a class-specific mean vector and a class-specific covariance matrix. We consider the simple case where p = 1; i.e., there is only one feature. Suppose that we have K classes, and that if an observation belongs to the kth class then X comes from a one-dimensional normal distribution, X ∼ N(µ_k, σ_k²). Recall that the density function for the one-dimensional normal distribution is given in Eq. 4.11 in the textbook. Prove that in this case, the Bayes classifier is not linear. Argue that it is in fact quadratic.

3. [6 points] Suppose that we wish to predict whether a given stock will issue a dividend this year ("Yes" or "No") based on X, last year's percent profit. We examine a large number of companies and discover that the mean value of X for companies that issued a dividend was X̄ = 10, while the mean for those that didn't was X̄ = 0. In addition, the variance of X for these two sets of companies was σ² = 36. Finally, 80% of companies issued dividends. Assuming that X follows a normal distribution, predict the probability that a company will issue a dividend this year given that its percentage profit was X = 4 last year.

4. This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter's lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.

(a) [2 points] Produce some numerical and graphical summaries of the Weekly data. Do there appear to be any patterns?

(b) [2 points] Use the full data set to perform a logistic regression with Direction as the response and the five lag variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones?

(c) [2 points] Compute the confusion matrix and overall fraction of correct predictions. Explain what the confusion matrix is telling you about the types of mistakes made by logistic regression.

(d) [2 points] Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data (that is, the data from 2009 and 2010).

(e) [2 points] Repeat (d) using LDA.

(f) [2 points] Repeat (d) using QDA.

(g) [1 point] Is it justified to use QDA? Use appropriate hypothesis test(s) we've seen in class.

(h) [2 points] Repeat (d) using KNN with K = 1.

(i) [1 point] Which of these methods appears to provide the best results on this data?

(j) [1 point] Could you create a better classifier? How would you do this?
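
As a pointer for problem 3, the posterior probability can be set up with Bayes' theorem applied to two normal class densities. The sketch below is one possible way to write it out, not the official solution. The symbols π_Yes, π_No (class priors), µ_Yes, µ_No (class means), and σ² (shared variance) are notation assumed here for illustration; the problem statement gives only their values.

```latex
% Assumed notation: \pi_k = prior probability of class k, \mu_k = class mean,
% \sigma^2 = variance shared by both classes. Because the variance is shared,
% the normalizing factor 1/(\sqrt{2\pi}\,\sigma) cancels between numerator
% and denominator.
\[
\Pr(Y = \text{Yes} \mid X = x)
  = \frac{\pi_{\text{Yes}} \exp\!\left(-\dfrac{(x - \mu_{\text{Yes}})^2}{2\sigma^2}\right)}
         {\pi_{\text{Yes}} \exp\!\left(-\dfrac{(x - \mu_{\text{Yes}})^2}{2\sigma^2}\right)
          + \pi_{\text{No}} \exp\!\left(-\dfrac{(x - \mu_{\text{No}})^2}{2\sigma^2}\right)}
\]
% Substituting the stated values: \pi_Yes = 0.8, \pi_No = 0.2,
% \mu_Yes = 10, \mu_No = 0, \sigma^2 = 36, x = 4.
\[
\Pr(Y = \text{Yes} \mid X = 4)
  = \frac{0.8\, e^{-36/72}}{0.8\, e^{-36/72} + 0.2\, e^{-16/72}}
  \approx 0.752
\]
```

The exponents come from (4 − 10)² = 36 and (4 − 0)² = 16, each divided by 2σ² = 72.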

COM S 573: Home work 2 solution

