## Description

Problem 1 (2 points): A typical machine learning task is supervised learning of a classification function (aka a concept): give example x the right class label l. Assume there is an unknown target concept c(x), whose domain is a set of unique examples X, called the instance space, and whose range is a set of unique labels L. The task is to learn that function. A learner does this by testing hypotheses. A hypothesis h(x) is also function whose domain is X and whose range is L. If the set of hypotheses a learner is able to consider is the same as the set of possible concepts, then the set of unique hypotheses (the hypothesis space H) is identical to the set of unique concepts (the concept space C). For this question, assume H = C. In supervised learning, exploration of the hypothesis space is guided by the use of a finite data set, D that contains examples from X with known labels from the target concept c(x). Since we can only measure success with respect to the data, we reframe the task of learning c(x) to that of finding a hypothesis consistent with the data: ∀x∈D, c(x)=h(x)

Definition: Two functions f1 and f2 are distinguishable, given D, if they differ in their labeling of at least one of the examples in D.

Definition: A set of hypotheses is distinguishable, given D, iff ALL pairs of hypotheses in the set are distinguishable given D. Call HD a largest set of distinguishable hypotheses, given D.

A) (1/2 point) Assume that X, L and D are all finite and given (i.e. they are all fixed). Is there one unique HD? Explain your reasoning.

B) (1/2 point) Let the size of the data and the label set be drawn from the counting numbers (i.e. |D|,|L|∈{1,2,3…}). Formulate the size of HD, as a function of |L| and |D|. Explain your reasoning.

For parts C and D assume the following. The label set L is {0,1}. The size of the data set |D| = 100. The size of the instance space |X| = 200. Assume a learner able to consider a maximal set of distinguishable hypotheses HD.

C) (1/2 point) Assume the learner is able to consider 10^9 (one billion) hypotheses per second. In the worst case, how long would it have to work to find a hypothesis h that is indistinguishable from the target function c, given D? Would this be a reasonable time to wait? Explain your reasoning. State your assumptions.

D) (1/2 point) Assume the learner HAS found a hypothesis h that is indistinguishable from the target concept c , given D. What is the probability that hypothesis h is also indistinguishable from the target concept c , given X? Explain your reasoning. State your assumptions.

EECS 349 (Machine Learning) Homework 1

Problem 2 (1 point): Given the following: Instances X: cars described by following attributes x is a tuple: