CS6375 Homework VI solution

Description


1. Consult the AdaBoost algorithm given in Bishop, Chapter 14. Suppose there are two weak learners
$h_1$ and $h_2$, and a set of 17 points.
a) Say $h_1$ makes one mistake and $h_2$ makes four mistakes on the dataset. Which learner
will AdaBoost choose in the first iteration (namely $m = 1$)?
b) What is $\alpha_1$?
c) Calculate the data weighting coefficients $w_n^{(2)}$ for the following two cases: (1) the points on
which the chosen learner made a mistake, and (2) the points on which the chosen learner did
not make a mistake. [10 Points]
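A minimal sketch of the bookkeeping these parts ask for, assuming Bishop's conventions, where the weighted error is $\epsilon_m = \sum_n w_n^{(m)} I(h_m(x_n) \neq t_n) / \sum_n w_n^{(m)}$, $\alpha_m = \ln\{(1-\epsilon_m)/\epsilon_m\}$, and misclassified points are up-weighted by $e^{\alpha_m}$; the dataset enters only through the mistake counts stated above:

```python
import numpy as np

# First-iteration AdaBoost bookkeeping for problem 1, assuming Bishop's
# conventions: eps_m is the weighted error rate, alpha_m = ln((1 - eps_m)/eps_m),
# and each misclassified point is up-weighted by exp(alpha_m).
n = 17
w = np.full(n, 1.0 / n)                 # uniform initial weights w_n^(1)
mistakes = {"h1": 1, "h2": 4}           # error counts from the problem

# Weighted errors; with uniform weights these reduce to mistakes / 17.
eps = {h: k * w[0] for h, k in mistakes.items()}
chosen = min(eps, key=eps.get)          # AdaBoost picks the smaller weighted error
alpha_1 = np.log((1 - eps[chosen]) / eps[chosen])

# Updated weights w_n^(2), before any renormalization: misclassified points
# are scaled by exp(alpha_1); correctly classified points are unchanged.
w_mistake = w[0] * np.exp(alpha_1)      # = (1/17) * 16 = 16/17
w_correct = w[0]                        # = 1/17
print(chosen, alpha_1, w_mistake, w_correct)
```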
2. The diagram shows training data for a binary concept, where a heart denotes a positive example.
Also shown are three decision stumps (A, B, and C), each of which consists of a linear decision
boundary. Suppose that AdaBoost chooses A as the first stump in an ensemble and must decide
between B and C as the next stump. Which will it choose? Explain. What will the $\epsilon$ and $\alpha$ values
be for the first iteration? [5 Points]
3. Consider clustering 1D data from a mixture of 2 Gaussians using the EM algorithm. You are given the
1D data points $x = [1\ 10\ 20]$. Suppose the output of the E step is the following matrix
$$R = \begin{bmatrix} 1 & 0 \\ 0.4 & 0.6 \\ 0 & 1 \end{bmatrix}$$
where entry $r_{i,c}$ is the probability of observation $x_i$ belonging to cluster $c$ (the responsibility of cluster $c$
for data point $i$). You have to compute the M step. You may state the equations for the maximum likelihood
estimates of these quantities (which you should know) without proof; you just have to apply the
equations to this dataset. You may leave your answer in fractional form.
a. Write down the likelihood function you are trying to optimize.
b. After performing the M step for the mixing weights $\pi_1, \pi_2$, what are the new values?
c. After performing the M step for the means $\mu_1, \mu_2$, what are the new values?
[10 Points]
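A minimal sketch of the standard M-step updates applied to this $R$ and $x$, assuming the usual maximum likelihood estimates $\pi_c = \frac{1}{N}\sum_i r_{i,c}$ and $\mu_c = \sum_i r_{i,c}\, x_i / \sum_i r_{i,c}$:

```python
import numpy as np

# M-step for a 2-component 1D Gaussian mixture, given the responsibilities R
# from the E step. Uses the standard ML updates:
#   pi_c = sum_i r_ic / N,   mu_c = sum_i r_ic * x_i / sum_i r_ic
x = np.array([1.0, 10.0, 20.0])
R = np.array([[1.0, 0.0],
              [0.4, 0.6],
              [0.0, 1.0]])

N_c = R.sum(axis=0)          # effective counts per cluster: [1.4, 1.6]
pi = N_c / len(x)            # mixing weights: [1.4/3, 1.6/3]
mu = (R.T @ x) / N_c         # means: [5/1.4, 26/1.6] = [25/7, 16.25]
print(pi, mu)
```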
4. In the following figure, some data points are shown which lie on an integer grid. Suppose we apply
the K-means algorithm to this data, using $K = 2$ and with the centers initialized at the two circled
data points. Draw the final clusters obtained after K-means converges. [5 Points]
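Since the figure is not reproduced here, the sketch below is a generic implementation of Lloyd's iteration; the point list is a hypothetical stand-in, and the actual grid points and the two circled initial centers would need to be transcribed from the figure:

```python
import numpy as np

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm: assign each point to its nearest center,
    recompute centers as cluster means, stop when assignments settle."""
    points, centers = np.asarray(points, float), np.asarray(centers, float)
    assign = None
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_assign = d2.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break
        assign = new_assign
        centers = np.array([points[assign == k].mean(axis=0)
                            for k in range(len(centers))])
    return assign, centers

# Hypothetical stand-in data; replace with the grid points from the figure,
# initializing the centers at the two circled points.
pts = [(0, 0), (0, 1), (1, 0), (4, 4), (4, 5), (5, 4)]
labels, ctrs = kmeans(pts, centers=[(0, 0), (4, 4)])
print(labels, ctrs)
```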
5. Consider training a two-input perceptron. Give an upper bound on the number of training
examples sufficient to assure, with 90% confidence, that the learned perceptron will have true
error of at most 5%. [5 Points]
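One standard route to such a bound is the VC-based sample-complexity result (as in Mitchell, Ch. 7), $m \ge \frac{1}{\epsilon}\left(4\log_2\frac{2}{\delta} + 8\,\mathrm{VC}(H)\log_2\frac{13}{\epsilon}\right)$, together with the fact that a perceptron over $n$ inputs has VC dimension $n + 1$, i.e. 3 here. A sketch of the arithmetic, assuming that bound:

```python
from math import log2, ceil

# Sample-complexity bound m >= (1/eps)*(4*log2(2/delta) + 8*VC*log2(13/eps)),
# with VC dimension n + 1 = 3 for a two-input perceptron.
eps, delta, vc = 0.05, 0.10, 3
m = (1 / eps) * (4 * log2(2 / delta) + 8 * vc * log2(13 / eps))
print(ceil(m))   # roughly 4200 examples
```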
6. The VC dimension is always less than the size of the hypothesis space. True/False?
[5 Points]
7. Computational Learning Theory
[10 Points]
(a) Consider the class $C$ of concepts of the form $(a \le x_1 \le b) \wedge (c \le x_2 \le d)$. Note that each
concept in this class corresponds to a rectangle in 2 dimensions. Let $a, b$ be integers in the range
$[0, 199]$ and $c, d$ be integers in the range $[0, 99]$. Give an upper bound on the number of training
examples sufficient to assure that for any target concept $c \in C$, any consistent learner using $H = C$
will, with probability 0.99, output a hypothesis with error at most 0.05.
(b) Consider the class $C$ of concepts of the form $(a \le x_1 \le b) \wedge (c \le x_2 \le d) \wedge (e \le x_3 \le f)$.
Note that each concept in this class corresponds to a hyper-rectangle in 3-d. Now suppose that $a,
b, c, d, e, f$ take on real values instead of integers. Give an upper bound on the number of training
examples sufficient to assure that for any target concept $c \in C$, a learner will, with probability 0.95,
output a hypothesis with error at most 0.01.
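A sketch of the two standard bounds these parts call for, under the following assumptions: part (a) uses the finite-hypothesis-space bound $m \ge \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)$ with $|H|$ counted as the number of integer rectangles with $a \le b$ and $c \le d$ (a modeling choice; some treatments also add a degenerate empty concept), and part (b) uses the VC bound from problem 5 with the standard result that axis-aligned hyper-rectangles in $d$ dimensions have VC dimension $2d$, i.e. 6 here:

```python
from math import log, log2, ceil

# (a) Finite |H|: count integer rectangles (a <= b over 200 values,
# c <= d over 100 values), then apply m >= (1/eps)*(ln|H| + ln(1/delta)).
H = (200 * 201 // 2) * (100 * 101 // 2)
eps_a, delta_a = 0.05, 0.01
m_a = (1 / eps_a) * (log(H) + log(1 / delta_a))
print(ceil(m_a))    # roughly 461

# (b) Real-valued bounds give an infinite H, so use the VC bound with
# VC(axis-aligned hyper-rectangles in 3-d) = 2*3 = 6.
eps_b, delta_b, vc = 0.01, 0.05, 6
m_b = (1 / eps_b) * (4 * log2(2 / delta_b) + 8 * vc * log2(13 / eps_b))
print(ceil(m_b))    # roughly 51,800
```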