Homework 1 ISyE 6420 solution



1. Carpal Tunnel Syndrome Tests.

Carpal tunnel syndrome is the most common
entrapment neuropathy. The cause of this syndrome is hard to determine, but it can include
trauma, repetitive maneuvers, certain diseases, and pregnancy.

Three commonly used tests for carpal tunnel syndrome are Tinel’s sign, Phalen’s test, and
the nerve conduction velocity test. Tinel’s sign and Phalen’s test are both highly sensitive
(0.97 and 0.92, respectively) and specific (0.91 and 0.88, respectively). The sensitivity and
specificity of the nerve conduction velocity test are 0.93 and 0.87, respectively.1

Assume that the tests are conditionally independent.
Calculate the sensitivity and specificity of a combined test if combining is done
(a) in a serial manner;2
(b) in a parallel manner.3
(c) Find the positive predictive value (PPV) of the combined tests in (a) and (b) if the prevalence of carpal
tunnel syndrome in the general population is approximately 50 cases per 1000 subjects.
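The combination rules can be sketched numerically. The snippet below is a minimal illustration, assuming conditional independence of the tests given disease status and using the serial and parallel definitions from the footnotes; the function names are hypothetical and not part of the assignment.

```python
from math import prod

def combine_serial(sens, spec):
    """Serial rule: the combination is positive only if every test is positive."""
    combined_sens = prod(sens)                      # all tests must be positive on a diseased subject
    combined_spec = 1 - prod(1 - s for s in spec)   # a false positive needs every test to err
    return combined_sens, combined_spec

def combine_parallel(sens, spec):
    """Parallel rule: the combination is positive if at least one test is positive."""
    combined_sens = 1 - prod(1 - s for s in sens)   # a miss needs every test to miss
    combined_spec = prod(spec)                      # all tests must be negative on a healthy subject
    return combined_sens, combined_spec

def ppv(sens, spec, prevalence):
    """Positive predictive value by Bayes' rule."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Values quoted in the problem: Tinel's sign, Phalen's test, nerve conduction velocity
sens = [0.97, 0.92, 0.93]
spec = [0.91, 0.88, 0.87]
prevalence = 50 / 1000

serial = combine_serial(sens, spec)
parallel = combine_parallel(sens, spec)
print("serial   (sens, spec):", serial, "PPV:", ppv(*serial, prevalence))
print("parallel (sens, spec):", parallel, "PPV:", ppv(*parallel, prevalence))
```

The serial rule trades sensitivity for specificity, and the parallel rule does the opposite, which is why the two PPVs differ so much at low prevalence.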

2. A Simple Naïve Bayes Classifier: 6420 Students Going to Beach. Assume that
for each instance, covariates $x_1, \dots, x_p$ are given, and that the instance belongs to one of
$I$ different classes $\{1, 2, \dots, I\}$.

The Bayes classifier assigns the class with the maximum posterior probability,
$$\mathbb{P}(\text{Class } i \mid x_1, \dots, x_p) \propto \mathbb{P}((x_1, \dots, x_p) \mid \text{Class } i) \times \mathbb{P}(\text{Class } i), \quad i = 1, \dots, I,$$
provided that the probabilities on the right-hand side can be assessed or elicited. The symbol
$\propto$ stands for the proportionality relation; the exact probabilities $\mathbb{P}(\text{Class } i \mid x_1, \dots, x_p)$ satisfy
$$\sum_{i=1}^{I} \mathbb{P}(\text{Class } i \mid x_1, \dots, x_p) = 1.$$

1 For definitions of Sensitivity, Specificity, and PPV, consult Chapter 4: Sensitivity, Specificity, and
Relatives, in the textbook at http://statbook.gatech.edu.

2 Tests are combined in a serial manner if the combination is declared positive when all tests are positive.

3 Tests are combined in a parallel manner if the combination is declared positive when at least one test is positive.

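The probability $\mathbb{P}((x_1, \dots, x_p) \mid \text{Class } i)$ could be quite difficult to assess due to possible
interdependencies among the predictors. The naïve Bayes classifier assumes conditional
independence, that is, given Class $i$, all $x_j$ are independent:
$$\mathbb{P}((x_1, \dots, x_p) \mid \text{Class } i) = \mathbb{P}(x_1 \mid \text{Class } i) \times \mathbb{P}(x_2 \mid \text{Class } i) \times \dots \times \mathbb{P}(x_p \mid \text{Class } i) = \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } i).$$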

The conditional probabilities $\mathbb{P}(x_j \mid \text{Class } i)$ are usually easier to assess. If we have a training
sample, these probabilities can be taken as relative frequencies of items with covariate $x_j$
among the items in class $i$.
Thus, the class $i$ is selected for which
$$\mathbb{P}(\text{Class } i) \times \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } i), \quad i = 1, \dots, I,$$
is maximum.
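For reference, the proportionality can be turned into an exact probability by dividing each product by the sum of the products over all classes. Under the conditional independence assumption, and with a summation index $k$ introduced here only for illustration, the normalized form reads:

```latex
\mathbb{P}(\text{Class } i \mid x_1, \dots, x_p)
  = \frac{\mathbb{P}(\text{Class } i) \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } i)}
         {\sum_{k=1}^{I} \mathbb{P}(\text{Class } k) \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } k)},
  \qquad i = 1, \dots, I.
```

The denominator does not depend on $i$, so maximizing the numerator alone selects the same class.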

To illustrate a very simple naïve Bayes classifier (covariates take values true/false, i.e., 1/0),
assume the following scenario:

We are interested in predicting whether a student from the Bayesian class ISyE6420 will
go to the beach (Class No (0)/Class Yes (1)) during Spring Break. To train our classifier,
pretend that we have information from the Spring 2019 enrollment in ISyE6420 and assume that
students from the two enrollments are homogeneous, that is, the information from last year
is relevant to this year's students.

Imaginary data for 100 students are available in the file naive.csv|txt. The following is recorded: Satisfied with ISyE6420 Midterm results (0 no/1 yes), Personal finances
good (0 no/1 yes), Friends joined (0 no/1 yes), Weather forecast good (0 no/1 yes), Gender
(0 male/1 female), Went to Beach (0 no/1 yes).

The conditional probabilities $\mathbb{P}(x_j \mid \text{Class } i)$ can be estimated by the relative frequencies
of items with covariate $x_j$ among the items in class $i$ (yellow in the image).
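The relative-frequency estimate can be sketched in a few lines. This is a minimal illustration on a hypothetical toy data set (it is NOT the contents of naive.csv), and the function names and the optional `alpha` pseudo-count are assumptions introduced here; with `alpha = 0` the estimates are plain relative frequencies, while a positive `alpha` smooths the counts in the spirit of the document's footnote on zero counts.

```python
def estimate_conditionals(rows, labels, alpha=0.0):
    """Estimate P(x_j = 1 | class) for binary covariates.

    rows   : list of 0/1 covariate tuples, one per training item
    labels : list of class labels, one per training item
    alpha  : optional pseudo-count; the estimate is (count + alpha) / (n + 2 * alpha)
    """
    tables = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        n = len(members)            # number of training items in class c
        p = len(members[0])         # number of covariates
        tables[c] = [
            (sum(r[j] for r in members) + alpha) / (n + 2 * alpha)
            for j in range(p)
        ]
    return tables

def estimate_priors(labels):
    """Estimate P(class) as the share of training items in each class."""
    return {c: labels.count(c) / len(labels) for c in sorted(set(labels))}

# Hypothetical toy data: two binary covariates, six items, classes 0 and 1
rows = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 0), (0, 0)]
labels = [1, 1, 1, 0, 0, 0]

tables = estimate_conditionals(rows, labels)
priors = estimate_priors(labels)
print(tables)
print(priors)
```

In the toy data the second covariate never equals 1 in class 0, so its unsmoothed estimate is exactly zero; calling `estimate_conditionals(rows, labels, alpha=1.0)` moves it away from zero.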

For example,
Jane is happy with her 6420 Midterm results and is doing well financially; however, her friends
will not go to the beach and the weather forecast does not look good. The required conditional
probabilities are read off as relative frequencies, for example,
$\mathbb{P}(\text{Financially doing well} \mid \text{Went to Beach}) = 29/40 = 0.725$, etc.

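% Unnormalized posterior for 'beach': conditional probabilities times prior 0.4
pbpropto = 0.875 * 0.725 * (1-0.775) * (1-0.825) * 0.225 * 0.4 % 0.002248066406250
% Unnormalized posterior for 'no beach': conditional probabilities times prior 0.6
pnbpropto = 0.45 * 0.416667 * (1-0.116667) * (1-0.716667) * 0.383333 * 0.6
% Normalize so that the two probabilities sum to 1
pbeach = pbpropto/(pbpropto + pnbpropto) % 0.172380835483743
pnbeach = pnbpropto/(pbpropto + pnbpropto) % 0.827619164516257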

Thus, after normalizing the products, the naïve Bayes classifier assigns Jane a probability of 0.17238
of belonging to the class 'Going to Beach'.4
4 What happens if any of the counts is 0, which may happen in the case of small sample size and large

Classify the following two students:
(a) Michael did poorly on his Midterm, he already owes some money, his friends will go
to the beach, and the weather forecast looks fine.

(b) Melissa did well on the Midterm, her finances look good, the weather prognosis looks
good, but her friends will not go to the beach.
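Jane's worked example can be wrapped into a small reusable function. The sketch below is only an illustration: the conditional probabilities and priors are copied from the factors in Jane's calculation, while the names `COND`, `PRIOR`, and `posterior`, the covariate ordering, and the gender codes for Michael (0) and Melissa (1) are assumptions made here.

```python
from math import prod

# P(x_j = 1 | class), in the order: midterm, finances, friends, weather, gender (1 = female).
# Values are taken from the factors used in Jane's worked example.
COND = {
    "beach":    [0.875, 0.725, 0.775, 0.825, 0.225],
    "no beach": [0.45, 0.416667, 0.116667, 0.716667, 0.383333],
}
PRIOR = {"beach": 0.4, "no beach": 0.6}

def posterior(x, cond=COND, prior=PRIOR):
    """Return normalized naive Bayes class probabilities for a 0/1 covariate vector x."""
    scores = {
        # use p when the covariate is 1 and (1 - p) when it is 0
        c: prior[c] * prod(p if xj == 1 else 1 - p for p, xj in zip(cond[c], x))
        for c in cond
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Jane: happy with midterm, finances good, friends not going, weather bad, female
jane = (1, 1, 0, 0, 1)
print("Jane", posterior(jane))

# Assumed encodings of the two students described above
michael = (0, 0, 1, 1, 0)
melissa = (1, 1, 0, 1, 1)
for name, x in [("Michael", michael), ("Melissa", melissa)]:
    post = posterior(x)
    print(name, "->", max(post, key=post.get), post)
```

For Jane the function reproduces the normalized value 0.17238 obtained above; the predicted class for each student is the one with the larger normalized probability.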

3. Multiple Choice Exam. A student answers a multiple choice examination with two
questions that have four possible answers each. Suppose that the probability that the student
knows the answer to a question is 0.80 and the probability that the student guesses is 0.20.

If the student guesses, the probability of guessing the correct answer is 0.25. The questions
are independent, that is, knowing the answer on one question is not influenced by the other
question.

(a) What is the probability that both questions will be answered correctly?

(b) If both questions are answered correctly, what is the probability that the student really knew the correct
answer to both questions?

(c) [EXTRA CREDIT] How would you generalize the above from 2 to n questions, that
is, what are the answers to (a) and (b) if the test has n independent questions? What happens
to the probabilities in (a) and (b) as n → ∞?
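The structure of this problem can be sketched with the law of total probability and Bayes' rule. The snippet is a minimal illustration, assuming that a student who knows the answer always answers correctly (an assumption not stated explicitly in the problem); the function names are hypothetical.

```python
def p_correct(p_know=0.80, p_guess_right=0.25):
    """P(a single question is answered correctly), by the law of total probability."""
    # assumption: knowing the answer implies answering correctly with probability 1
    return p_know * 1.0 + (1 - p_know) * p_guess_right

def p_all_correct(n, p_know=0.80, p_guess_right=0.25):
    """P(all n independent questions are answered correctly)."""
    return p_correct(p_know, p_guess_right) ** n

def p_knew_all_given_all_correct(n, p_know=0.80, p_guess_right=0.25):
    """P(student knew all n answers | all n answered correctly), by Bayes' rule."""
    return (p_know / p_correct(p_know, p_guess_right)) ** n

for n in (1, 2, 10, 100):
    print(n, p_all_correct(n), p_knew_all_given_all_correct(n))
```

Both quantities are n-th powers of a number strictly between 0 and 1 under these values, so printing them for growing n shows how they behave in the limit.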

number of predictors/features? In this case the product is zero, leading to the (unlikely) conclusion that the
probability of the corresponding class label is 0. This is a case of overfitting, known in the statistical learning
literature as the “Black Swan Paradox.”

In such cases one has to use a Bayesian naïve Bayes approach and smooth the counts
by imposing priors on the class labels and binary predictors, usually Dirichlet and beta. But about priors,