CS6375 Homework I solution

$29.99

Original Work ?
Category: You will Instantly receive a download link for .ZIP solution file upon Payment

Description

5/5 - (8 votes)

1. Consider the following set of points: {(-2, -1), (1, 1), (3, 2)}
a) Find the least square regression line for the given data points.
b) Plot the given points and the regression line in the same rectangular system of axes.
[5 Points]
2. The values of y and their corresponding values of y are shown in the table below
x 0 1 2 3 4
y 2 3 5 4 6
a) Find the least square regression line y = a x + b.
b) Estimate the value of y when x = 10.
[5 Points]
3. Sufficient statistics for online linear regression [15 Points]
Consider fitting the model 𝑦𝑦� = π‘€π‘€π‘œπ‘œ + 𝑀𝑀1π‘₯π‘₯ using least squares. Unfortunately we did not
keep the original data π‘₯π‘₯𝑖𝑖 , 𝑦𝑦𝑖𝑖, but we do have the following functions (statistics) of the data:
π‘₯π‘₯Μ…
(𝑛𝑛) =
1
𝑛𝑛 βˆ‘ π‘₯π‘₯𝑖𝑖
𝑛𝑛
𝑖𝑖=1 , 𝑦𝑦�(𝑛𝑛) =
1
𝑛𝑛 βˆ‘ 𝑦𝑦𝑖𝑖
𝑛𝑛
𝑖𝑖=1
𝐢𝐢π‘₯π‘₯π‘₯π‘₯
(𝑛𝑛) =
1
𝑛𝑛� (π‘₯π‘₯𝑖𝑖 βˆ’ π‘₯π‘₯Μ…)2 𝑛𝑛
𝑖𝑖=1 𝐢𝐢𝑦𝑦𝑦𝑦
(𝑛𝑛) =
1
𝑛𝑛� (𝑦𝑦𝑖𝑖 βˆ’ 𝑦𝑦�)2 𝑛𝑛
𝑖𝑖=1
𝐢𝐢π‘₯π‘₯π‘₯π‘₯
(𝑛𝑛) =
1
𝑛𝑛 βˆ‘ (π‘₯π‘₯𝑖𝑖 βˆ’ π‘₯π‘₯Μ…)(𝑦𝑦𝑖𝑖 βˆ’ 𝑦𝑦�) 𝑛𝑛
𝑖𝑖=1
a. What are the minimal set of statistic that we need to estimate 𝑀𝑀1?
b. What are the minimal set of statistic that we need to estimate 𝑀𝑀0?
c. Suppose a new data point π‘₯π‘₯𝑛𝑛+1 , 𝑦𝑦𝑛𝑛+1 arrives, and we want to update our sufficient
statistics without looking at the old data which we have not stored. (This is useful for
online learning.) Show that we can do this for π‘₯π‘₯Μ…as follows.
π‘₯π‘₯Μ…
(𝑛𝑛+1) β‰œ 1
𝑛𝑛+1 βˆ‘ π‘₯π‘₯𝑖𝑖 =
1
𝑛𝑛+1 �𝑛𝑛π‘₯π‘₯Μ…
(𝑛𝑛) + π‘₯π‘₯𝑛𝑛+1οΏ½ 𝑛𝑛+1
𝑖𝑖=1
= π‘₯π‘₯Μ…
(𝑛𝑛) +
1
𝑛𝑛 + 1 οΏ½π‘₯π‘₯𝑛𝑛+1 βˆ’ π‘₯π‘₯Μ…
(𝑛𝑛)
οΏ½
This has the form: new estimate is old estimate plus correction. We see that the size of correction
diminishes over time( i.e. we get more samples). Drive a similar expression to update 𝑦𝑦�
4. Explain why the size of the hypothesis space in the EnjoySport learning task is 973. How
would the number of possible instances and possible hypotheses increase with the
addition of the attribute WaterCurrent, which can take on the values Light, Moderate or
Strong? More generally, how does the number of possible instances and hypotheses
grow with the addition of a new attribute A that takes on k possible values?
[5 Points]
5. Consider the following sequence of positive and negative training examples describing
the concept β€œpairs of people who live in the same house”. Each training example
describes an ordered pair of people, with each person described by their sex, hair color
(black, brown or blonde), height (tall medium or short), and nationality (US, French,
German, Irish, Indian, Japanese or Portuguese). [15 Points]
+ <
+ < >
– < >
+ < >
Consider a hypothesis space defined over these instances, in which each hypothesis is
represented by a pair of 4-tuples, and where each attribute constraints may be a specific value ,
β€œ?” or β€œΞΈβ€, just as in the EnjoySport hypothesis representation. For example, the hypothesis
< < female ? ? Japanese>>
represents the set of all pairs of people where the first is a tall male (of any nationality and hair
color), and the second is s Japanese female ( of any hair color and height).
a. Provide a hand trace of the Candidate-Elimination algorithm learning from the above
training examples and hypothesis language. In particular, show the specific and general
boundaries of the version space after it has processed the first training example, then
the second training example, etc.
b. How many distinct hypotheses from the given hypothesis space are consistent with the
following single positive training example?
< >
c. Assume the learner has encountered only the positive example from part (b), and that it
is now allowed to query the trainer by generating any instance and asking the trainer to
classify it. Give a specific sequence of queries that assures the learner will converge to
the single correct hypothesis, whatever it may be (assuming that the target concept is
describable within the given hypothesis language). Give the shortest sequence of
queries you can find. How does the length of this sequence relate to your answer to
question (b)?
d. Note that this hypothesis language cannot express all concepts that can be defined over
instances ( i.e., we can define sets of positive and negative examples for which there is
no corresponding describable hypothesis). If we were to enrich the language so that it
could express all concepts that can be defined over the instance language, then how
would your answer to Β© change?
6. Consider learning a Boolean valued function : 𝑋𝑋 β†’ π‘Œπ‘Œ , where 𝑋𝑋 = βŒ©π‘‹π‘‹1,𝑋𝑋2, … . , 𝑋𝑋𝑁𝑁βŒͺ,
where Y and the 𝑋𝑋𝑖𝑖 are all Boolean valued variables. You decide to consider a hypothesis
space H where each hypothesis is of the form
a. 𝑖𝑖𝑖𝑖�(𝑋𝑋𝑖𝑖 = π‘Žπ‘Ž) ᴧ �𝑋𝑋𝑗𝑗 = 𝑏𝑏�� π‘‘π‘‘β„Žπ‘’π‘’π‘’π‘’ π‘Œπ‘Œ = 1 𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒 π‘Œπ‘Œ = 0
where 𝑖𝑖 β‰  𝑗𝑗, and where a and b can be either 0 or 1. Notice each hypothesis constrains
exactly two of the features of X.
How many distinct hypotheses are there in H?
[5 Points]