## Description

## 1. Visualize classifier decision boundaries.

### 1a. Complete the function in the next cell that plots a classifier’s decision boundary.

Hint: My solution used 9 lines:

- Make linspaces of grid_resolution points in xlim and grid_resolution points in ylim. E.g., for xlim=(-1, 1), ylim=(0, 2), and grid_resolution=3, make the linspace (-1, 0, 1) of x coordinates and the linspace (0, 1, 2) of y coordinates.
- Use np.tile() to repeat the x grid points grid_resolution times (e.g. (-1, 0, 1, -1, 0, 1, -1, 0, 1)) and np.repeat() to repeat each of the y grid points grid_resolution times (e.g. (0, 0, 0, 1, 1, 1, 2, 2, 2)).
- Use np.stack() to combine the x grid points and y grid points into a 2D array of size grid_resolution² x 2 (e.g. [[-1, 0], [0, 0], [1, 0], [-1, 1], [0, 1], [1, 1], [-1, 2], [0, 2], [1, 2]]).
- Make a dictionary keyed by -1 and 1 with values 'pink' and 'lightskyblue'.
- Use clf.predict() on the 2D array of points to get predicted y values.
- For each y in {-1, 1}, use plt.plot() to plot those points in your 2D array with that predicted y value in the color specified by your dictionary.
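To see the tile/repeat/stack steps in isolation, here is a sketch of just the grid construction from the hint's 3 x 3 example (not the full solution; the variable names are my own):

```python
import numpy as np

# A 3 x 3 grid over xlim=(-1, 1), ylim=(0, 2), as in the hint's example.
grid_resolution = 3
xs = np.linspace(-1, 1, grid_resolution)   # x coordinates: -1, 0, 1
ys = np.linspace(0, 2, grid_resolution)    # y coordinates: 0, 1, 2

x_pts = np.tile(xs, grid_resolution)       # [-1, 0, 1, -1, 0, 1, -1, 0, 1]
y_pts = np.repeat(ys, grid_resolution)     # [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Combine into a (grid_resolution**2, 2) array of (x, y) points.
points = np.stack([x_pts, y_pts], axis=1)
print(points.shape)  # (9, 2)
```

Each row of `points` is one grid point, ready to pass to clf.predict().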

```
def plot_decision_boundary(clf, xlim, ylim, grid_resolution):
    """Display how clf classifies each point in the space specified by xlim and ylim.

    - clf is a classifier.
    - xlim and ylim are each 2-tuples of the form (low, high).
    - grid_resolution specifies the number of points into which the xlim interval
      is divided and the number into which the ylim interval is divided. The
      function plots grid_resolution * grid_resolution points.
    """
    # ... your code here ...
```

### Visualize the decision boundary for an SVM.

Here I have provided test code for your function: it visualizes the decision boundary for the SVM under the header “Now try 2D toy data” in https://pages.stat.wisc.edu/~jgillett/451/burkov/01/01separatingHyperplane.html.

Recall: That SVM’s decision boundary was 𝑦 = −𝑥 + 1/2, so your function should make a plot with lightskyblue above that line and pink below that line. Then my code adds the data points in blue and red.

There is nothing for you to do in this step, provided you implemented the required function above.

Note: It is ok if you get a warning about calling clf.fit() on input that does not have feature names. (I haven’t figured out a satisfactory way to design the function to exclude this warning easily.)

```
from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd
from sklearn import svm

data_string = """
x0, x1, y
0, 0, -1
-1, 1, -1
1, -1, -1
0, 1, 1
1, 1, 1
1, 0, 1
"""
df = pd.read_csv(StringIO(data_string), sep=r'\s*,\s*', engine='python')
clf = svm.SVC(kernel="linear", C=1000)
clf.fit(df[['x0', 'x1']], df['y'])
# Call student's function.
plot_decision_boundary(clf=clf, xlim=(-4, 4), ylim=(-4, 4), grid_resolution=100)
# Add training examples to plot.
colors = {-1: 'red', 1: 'blue'}
for y in (-1, 1):
    plt.plot(df.x0[df.y == y], df.x1[df.y == y], '.', color=colors[y])
```

### 1b. Visualize the decision boundary for a decision tree.

- Make a decision tree classifier on the same df used above. (Use criterion='entropy', max_depth=None, random_state=0.)
- Use print(export_text(clf)) to print a text version of your tree. (export_text() is in sklearn.tree.)
- Copy the last few lines of the cell above to make the plot.
- Study the tree and plot until you understand how the plot represents the decisions in the tree.

```
# ... your code here ...
```

### 1c. Visualize the decision boundary for kNN with 𝑘=3.

- Make a kNN classifier on the same df used above. (Use n_neighbors=3 and metric='euclidean'.)
- Copy the plotting code again.

(Experiment with 𝑘=1 and 𝑘=2 to see how the decision boundary varies with 𝑘 before setting 𝑘=3.)

```
# ... your code here ...
```

### 1d. Visualize the decision boundary for an SVM with a nonlinear boundary.

Use the example under the header “Nonlinear boundary: use kernel trick” in https://pages.stat.wisc.edu/~jgillett/451/burkov/03/03SVM.html.

- Read the data from http://www.stat.wisc.edu/~jgillett/451/data/circles.csv. This “.csv” file has y in {0, 1}, so change the 0 values to -1.
- Fit an SVM with kernel='rbf', C=1, gamma=1/2.
- Copy the last few lines of my plotting code, above, again to make the boundary plot.

(Experiment with 𝛾=2, 𝛾=10, and 𝛾=30 to see how the decision boundary varies with gamma before setting gamma to 1/2.)

```
# ... your code here ...
```

## 2. Run gradient descent by hand.

Run gradient descent with 𝛼 = 0.1 to minimize 𝑧 = 𝑓(𝑥, 𝑦) = (𝑥 + 1)² + (𝑦 + 2)². Start at (0, 0) and find the next two points on the descent path.

Hint: The minimum is at (-1, -2), so your answer should be approaching this point.
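After you have worked the two updates by hand, you can sanity-check your arithmetic with a few lines like these (a sketch; the variable names are my own):

```python
# Gradient descent on f(x, y) = (x + 1)**2 + (y + 2)**2 with alpha = 0.1.
# The gradient is (df/dx, df/dy) = (2*(x + 1), 2*(y + 2)).
alpha = 0.1
x, y = 0.0, 0.0           # starting point
path = [(x, y)]
for _ in range(2):        # the next two points on the descent path
    grad_x, grad_y = 2 * (x + 1), 2 * (y + 2)
    x, y = x - alpha * grad_x, y - alpha * grad_y
    path.append((x, y))
print([(round(px, 4), round(py, 4)) for px, py in path])
# [(0.0, 0.0), (-0.2, -0.4), (-0.36, -0.72)]
```

Note how each point moves toward the minimum at (-1, -2).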

## … your answer in a Markdown cell here …

## 3. Practice feature engineering

Explore the fact that rescaling may be necessary for kNN but not for a decision tree.

### 3a. Read and plot a toy concentric ellipses data set.

- Read the data from http://www.stat.wisc.edu/~jgillett/451/data/ellipses.csv into a DataFrame.
- Display the first five rows.
- Plot the data.
  - Put x0 on the 𝑥 axis and x1 on the 𝑦 axis.
  - Plot points with these colors:
    - 𝑦 = 0: red
    - 𝑦 = 1: blue
  - Use 𝑥 and 𝑦 axis limits of (−6, 6).
  - Include a legend.

```
# ... your code here ...
```

### 3b. Train a 𝑘NN classifier and report its accuracy.

- Use 𝑘=3 and the (default) euclidean metric.
- Report the accuracy on the training data by writing a line like
  `Training accuracy is 0.500`
  (0.500 may not be the correct value).
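One way to format that line, assuming you have stored the accuracy (e.g. from the classifier's score() method) in a variable; `accuracy` is my own name, and 0.5 is only a stand-in value:

```python
# Format an accuracy value to three decimal places.
accuracy = 0.5  # stand-in; use your classifier's actual training accuracy
line = f'Training accuracy is {accuracy:.3f}'
print(line)  # Training accuracy is 0.500
```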

```
# ... your code here ...
```

### 3c. Now rescale the features using standardization; plot, train, and report accuracy again.

- Fit the scaler to the training features.
- Transform the training features.
- Plot the rescaled data.
- Train kNN again and report its accuracy as before. (Notice that rescaling helped.)
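For intuition: standardization subtracts each feature's mean and divides by its standard deviation, which is what StandardScaler's fit/transform pair computes for you. A minimal NumPy sketch on toy numbers (not the ellipses data):

```python
import numpy as np

# Toy feature matrix: two columns on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])

# Standardize each column: subtract its mean, divide by its standard
# deviation. This matches StandardScaler's default (ddof=0) behavior.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_scaled)
```

After scaling, both columns have mean 0 and standard deviation 1, so neither dominates the euclidean distances that kNN uses.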

```
# ... your code here ...
```

### 3d. Train a decision tree classifier on the original (unscaled) data and report its accuracy.

- Train on the training data.
- Report the accuracy as before.

```
# ... your code here ...
```