## Description

1. (30 points) Find the Maximum Likelihood Estimate (MLE) of θ for each of the following probability density functions. In each case, consider a random sample of size n. Show your calculations:

(a) f(x|θ) = (x/θ^2) exp{−x^2/(2θ^2)}, x ≥ 0

(b) f(x|α, β, θ) = αθ^(−αβ) x^β exp{−(x/θ)^β}, x ≥ 0, α > 0, β > 0, θ > 0

(c) f(x|θ) = 1/θ, 0 ≤ x ≤ θ, θ > 0 (Hint: you can draw the likelihood function)
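Once you have a closed-form estimate, a numerical sanity check can catch algebra mistakes. The sketch below (an illustration, not part of the required solution) simulates data from the density in (a), which is the Rayleigh distribution, and compares a candidate closed-form estimate against a grid search over the log-likelihood; the candidate formula shown is an assumption you should verify by your own differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 2.0
# Density (a) is the Rayleigh distribution with scale parameter theta.
x = rng.rayleigh(scale=theta_true, size=10_000)

def log_lik(theta, x):
    # log f(x|theta) = log x - 2 log theta - x^2 / (2 theta^2), summed over the sample
    return np.sum(np.log(x) - 2 * np.log(theta) - x**2 / (2 * theta**2))

# Candidate closed-form estimate (check this against your own derivation):
theta_hat = np.sqrt(np.sum(x**2) / (2 * len(x)))

# Grid search over theta; its maximizer should agree with theta_hat.
grid = np.linspace(0.5, 4.0, 2000)
theta_grid = grid[np.argmax([log_lik(t, x) for t in grid])]
# theta_hat and theta_grid should agree to within the grid spacing.
```

If the analytic maximizer and the numeric one disagree, recheck the sign of each derivative term.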

2. (30 points) We want to build a pattern classifier with a continuous attribute using
Bayes' Theorem. The object to be classified has one feature, x, in the range 0 ≤ x < 6.
The conditional probability density functions for each class are listed below:
P(x|C1) = 1/6 for 0 ≤ x < 6, and 0 otherwise.

P(x|C2) = (1/4)(x − 1) for 1 ≤ x < 3, (1/4)(5 − x) for 3 ≤ x < 5, and 0 otherwise.
[Figure: P(x|C1) (uniform at 1/6) and P(x|C2) (triangular, peaking at 0.5 at x = 3) plotted over 0 ≤ x < 6.]
(a) Assuming equal priors, P(C1) = P(C2) = 0.5, classify an object with the attribute
value x = 2.5.
(b) Assuming unequal priors, P(C1) = 0.7, P(C2) = 0.3, classify an object with the
attribute value x = 4.
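For parts (a) and (b), a quick way to sanity-check a hand calculation is to code the class-conditional densities directly and compare the unnormalized posteriors (the evidence P(x) cancels in the comparison). This is an illustrative sketch, not part of the required submission.

```python
def p_x_given_c1(x):
    # Uniform density on [0, 6)
    return 1 / 6 if 0 <= x < 6 else 0.0

def p_x_given_c2(x):
    # Triangular density peaking at x = 3
    if 1 <= x < 3:
        return (x - 1) / 4
    if 3 <= x < 5:
        return (5 - x) / 4
    return 0.0

def classify(x, prior1, prior2):
    # Compare unnormalized posteriors P(x|Ci) * P(Ci); P(x) cancels.
    return "C1" if p_x_given_c1(x) * prior1 >= p_x_given_c2(x) * prior2 else "C2"

classify(2.5, 0.5, 0.5)  # part (a)
classify(4.0, 0.7, 0.3)  # part (b)
```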
Instructor: Catherine Qi Zhao. TA: Prithvi Raj Botcha, Shi Chen, Suzie Hoops, James Yang, Yifeng
Zhang. Email: csci5521.s2022@gmail.com
(c) Consider a decision function ϕ(x) of the form ϕ(x) = |x − 3| − α with one free
parameter α in the range 0 ≤ α ≤ 2. You classify a given input x as class 2 if and
only if ϕ(x) < 0, or equivalently 3 − α < x < 3 + α; otherwise you classify x as
class 1. Assuming equal priors, P(C1) = P(C2) = 0.5, what is the optimal decision
boundary, that is, the value of α that minimizes the probability of
misclassification? What is the resulting probability of misclassification with this
optimal value of α? (Hint: take advantage of the symmetry around x = 3.)
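One way to check the symmetry argument in (c) is to evaluate the misclassification probability over a grid of α values by numerical integration. This is an illustrative sketch (a Riemann-sum approximation with hypothetical variable names), not a substitute for the required analytic derivation.

```python
import numpy as np

xs = np.linspace(0.0, 6.0, 6001)
dx = xs[1] - xs[0]

p1 = np.full_like(xs, 1 / 6)  # uniform class-conditional density for C1
# Triangular class-conditional density for C2:
p2 = np.where((xs >= 1) & (xs < 3), (xs - 1) / 4,
     np.where((xs >= 3) & (xs < 5), (5 - xs) / 4, 0.0))

def p_error(alpha):
    # Decide C2 inside (3 - alpha, 3 + alpha), C1 outside; equal priors of 0.5.
    inside = np.abs(xs - 3) < alpha
    err_c1 = p1[inside].sum() * dx    # C1 mass mistakenly decided as C2
    err_c2 = p2[~inside].sum() * dx   # C2 mass mistakenly decided as C1
    return 0.5 * (err_c1 + err_c2)

alphas = np.linspace(0.0, 2.0, 401)
errors = np.array([p_error(a) for a in alphas])
best_alpha = alphas[np.argmin(errors)]
# best_alpha approximates the analytic minimizer; compare against your derivation.
```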
3. (40 points) In this programming exercise you will implement three multivariate Gaussian classifiers, with different assumptions as follows:
• Assume S1 and S2 are learned independently (learned from the data from each
class).
• Assume S1 = S2 (learned from the data from both classes).
• Assume S1 = S2 (learned from the data from both classes), and the covariance is
a diagonal matrix.
What is the discriminant function in each case? Show it in your report and
briefly explain.
For each assumption, your program should fit two Gaussian distributions to the 2-class
training data in training_data.txt to learn m1, m2, S1 and S2 (S1 and S2 refer to the
same variable under the second and third assumptions). Then use this model to classify the test
data in test_data.txt by comparing log P(Ci|x) for each class Ci, with P(C1) = 0.3
and P(C2) = 0.7. Each data file contains a matrix M ∈ R^(N×9) with N samples;
the first 8 columns contain the features (i.e., x ∈ R^8) used to classify the samples,
and the last column stores the corresponding class labels (i.e., r ∈ {1, 2}).
Report the confusion matrix on the test set for each assumption. Briefly
explain the results.
We have provided the skeleton code MyDiscriminant.py for implementing the classifiers. It follows the scikit-learn convention: each class has a fit function for model training and a predict function for generating predictions on given samples.
Use the Python class GaussianDiscriminant to implement the multivariate Gaussian
classifiers under the first two assumptions, and GaussianDiscriminant_Diagonal for
the third. To verify your implementation, run the main script hw1.py, which
automatically generates the confusion matrix for each classifier. Note that you do not
need to modify this file.
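As a rough sketch of the quantities involved under the first assumption (independent per-class covariances), the log-posterior comparison might look like the following. The function and variable names here are hypothetical illustrations, not the skeleton's actual API, and the discriminant drops the shared −(d/2) log(2π) constant since it cancels between classes.

```python
import numpy as np

def fit_gaussians(X, r):
    # Per-class mean and covariance, estimated independently (assumption 1).
    # X: (N, d) feature matrix; r: (N,) labels in {1, 2}.
    params = {}
    for c in (1, 2):
        Xc = X[r == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def log_discriminant(x, mean, cov, prior):
    # g_i(x) = -0.5 log|S_i| - 0.5 (x - m_i)^T S_i^{-1} (x - m_i) + log P(C_i)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet - 0.5 * diff @ np.linalg.solve(cov, diff) + np.log(prior)

def predict(X, params, priors):
    # Assign each sample to the class with the larger discriminant value.
    preds = []
    for x in X:
        g = {c: log_discriminant(x, *params[c], priors[c]) for c in (1, 2)}
        preds.append(max(g, key=g.get))
    return np.array(preds)
```

For the shared-covariance assumptions, the pooled covariance replaces the per-class ones, and the log|S| term then also cancels between classes.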
## Submission
• Things to submit:
1. hw1_sol.pdf: a document containing all your answers to the written questions
(including those in Problem 3).
2. MyDiscriminant.py: a Python source file containing the two Python classes for Problem 3, i.e., GaussianDiscriminant
and GaussianDiscriminant_Diagonal. Use the skeleton file MyDiscriminant.py
found with the data on the class web site, and fill in the missing parts. For each
class object, the fit function should take the training features and labels as inputs and update the model parameters. The predict function should take the
test features as inputs and return the predictions.
• Submit: All material must be submitted electronically via Gradescope. Note that
there are two entries for the assignment, i.e., Hw1-Written (for hw1_sol.pdf)
and Hw1-Programming (for a zipped file containing the Python code);
please submit your files accordingly. We will grade the assignment with vanilla
Python; code submitted as IPython notebooks will not be graded.