Description
Problem 1 [10 points]
Background: Consider the attached dataset assign1_WineData.csv, which has 12 variables:
“FixedAcidity” “VolatileAcidity” “CitricAcid” “ResidualSugar”
“Chlorides” “FreeSulphurDioxide” “TotalSulphurDioxide” “Density”
“pH” “Sulphates” “Alcohol” “Quality”
The goal is to fit an optimal linear regression model to predict the “Quality” of the wine.
Task: Import the dataset assign1_WineData.csv, perform exploratory data analysis on the
variables, and construct what is, in your opinion, the best model to predict the response
variable, that is, the “Quality” of the wine. Briefly comment (within the code) on your
observations and on the choices you make in the process of building the best model.
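The import, explore, and fit steps named in the task could be organised roughly as follows. This is only a minimal sketch, not a model answer: the helper names explore and fit_ols are hypothetical, pandas and NumPy are assumed to be installed, and all columns are assumed to be numeric. Which predictors to keep remains your own decision.

```python
import numpy as np
import pandas as pd


def explore(df: pd.DataFrame, target: str) -> pd.Series:
    """Print basic summaries; return predictor correlations with the target."""
    print(df.describe().T)   # location and spread of every numeric variable
    print(df.isna().sum())   # number of missing values per column
    corr = df.select_dtypes("number").corr()[target].drop(target)
    # Order predictors by the strength (absolute value) of their correlation.
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


def fit_ols(df: pd.DataFrame, predictors: list, target: str):
    """Fit ordinary least squares with an intercept; return coefficients and R^2."""
    X = np.column_stack([np.ones(len(df)), df[predictors].to_numpy(dtype=float)])
    y = df[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return pd.Series(beta, index=["Intercept", *predictors]), r2
```

For example, after wine = pd.read_csv("assign1_WineData.csv"), calling explore(wine, "Quality") ranks the eleven predictors by correlation with the response, and fit_ols(wine, ["Alcohol", "VolatileAcidity"], "Quality") fits one candidate model. That pair of predictors is chosen purely for illustration.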
Problem 2 [10 points]
Background: Consider the attached dataset assign1_CarData.csv, which has 6 variables:
“cylinders” “displacement” “horsepower” “weight” “acceleration” “mpg”
The goal is to fit an optimal linear regression model to predict the fuel efficiency “mpg” of the car.
Task: Import the dataset assign1_CarData.csv, perform exploratory data analysis on the
variables, and construct what is, in your opinion, the best model to predict the response
variable, that is, the “mpg” of the car. Briefly comment (within the code) on your
observations and on the choices you make in the process of building the best model.
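One possible way to justify a choice of "best" model is to compare candidate predictor subsets on rows held out from fitting. The sketch below illustrates that idea under assumptions: the function names holdout_rmse and compare_models are hypothetical, the 80/20 split and fixed seed are arbitrary, and held-out RMSE is only one of several reasonable criteria (adjusted R-squared or residual diagnostics are others).

```python
import numpy as np
import pandas as pd


def holdout_rmse(df, predictors, target, test_frac=0.2, seed=0):
    """Fit OLS on a random training split; return RMSE on the held-out rows."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(df))
    n_test = max(1, int(round(test_frac * len(df))))
    test, train = idx[:n_test], idx[n_test:]

    def design(rows):
        # Design matrix: a column of ones (intercept) plus the chosen predictors.
        block = df.iloc[rows][predictors].to_numpy(dtype=float)
        return np.column_stack([np.ones(len(rows)), block])

    y = df[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(design(train), y[train], rcond=None)
    err = y[test] - design(test) @ beta
    return float(np.sqrt(np.mean(err ** 2)))


def compare_models(df, candidates, target):
    """Return the held-out RMSE of each named predictor subset, lowest first."""
    # The same seed is used for every candidate, so all share one split.
    scores = {name: holdout_rmse(df, cols, target) for name, cols in candidates.items()}
    return pd.Series(scores).sort_values()
```

With cars = pd.read_csv("assign1_CarData.csv"), a call such as compare_models(cars, {"weight": ["weight"], "weight + horsepower": ["weight", "horsepower"]}, "mpg") would list the candidates by held-out error. The two candidate subsets here are illustrative only, and the sketch assumes every listed column is numeric with no missing entries.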
This is an individual assignment. Properly acknowledge every source of information that you
refer to, including discussions with your fellow students, if any. Verbatim copying from any
source is strongly discouraged, and plagiarism will be heavily penalized. It is strongly
recommended that you write the code completely on your own. Feel free to write the code in
Python if you want.