Description
1. [10 points] Describe Gaussian Mixture Model clustering. Why is it an instance of the
Expectation Maximization method? What are its advantages over the K-Means
clustering algorithm?
2. [10 points] Describe a multivariate Gaussian along with its parameters (μ and Σ).
What is the geometric interpretation of these two parameters? List some interesting
properties of the eigenvalues and eigenvectors of any covariance matrix.
3. [10 points] Describe the Principal Component Analysis algorithm for dimensionality
reduction along with the time complexity of each of its steps. How does it compare
against FastMap representation-wise, efficiency-wise, and quality-wise?
4. [10 points] What are the steps in the Perceptron Learning algorithm? What do we do
when a constraint is violated in any iteration? Should the learning rate for updating
the weights be high or low, and why? What happens if we run the Perceptron
Learning algorithm on data whose positive and negative examples are not linearly
separable?
5. [10 points] What is a Constraint Satisfaction Problem (CSP)? Pick a problem of
interest in Data Science that can be solved efficiently using CSP search techniques.
Describe the problem and how CSP techniques apply to it. Elaborate on one
of these techniques.