Assignment #2 STA410H1F/2102H1F rejection sampling solution



1. An interesting variation of rejection sampling is the ratio of uniforms method. We start by taking a bounded function h with h(x) ≥ 0 for all x and ∫_{−∞}^{∞} h(x) dx < ∞. We then define the region

Ch = {(u, v) : 0 ≤ u ≤ √h(v/u)}

and generate (U, V) uniformly distributed on Ch. We then define the random variable X = V/U.
(a) The joint density of (U, V) is

f(u, v) = 1/|Ch| for (u, v) ∈ Ch

where |Ch| is the area of Ch. Show that the joint density of (U, X) is

g(u, x) = u/|Ch| for 0 ≤ u ≤ √h(x)

and that the density of X is γ h(x) for some γ > 0.
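A sketch of the change-of-variables computation behind part (a), using x = v/u so that v = ux with Jacobian u:

```latex
g(u, x) = f(u, ux)\,\left|\frac{\partial v}{\partial x}\right|
        = \frac{u}{|C_h|}, \qquad 0 \le u \le \sqrt{h(x)},
\qquad
f_X(x) = \int_0^{\sqrt{h(x)}} \frac{u}{|C_h|}\,du
       = \frac{h(x)}{2\,|C_h|} = \gamma\, h(x),
\quad \gamma = \frac{1}{2\,|C_h|}.
```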
(b) The implementation of this method is somewhat complicated by the fact that it is typically difficult to sample (U, V) from a uniform distribution on Ch. However, it is usually possible to find a rectangle of the form Dh = {(u, v) : 0 ≤ u ≤ u+, v− ≤ v ≤ v+} such that Ch is contained within Dh. Thus to draw (U, V) from a uniform distribution on Ch, we can use rejection sampling where we draw proposals (U∗, V∗) from a uniform distribution on the rectangle Dh; note that the proposals U∗ and V∗ are independent random variables with Unif(0, u+) and Unif(v−, v+) distributions, respectively. Show that we can define u+, v−, and v+ as follows:

u+ = max_x √h(x), v− = min_x x√h(x), v+ = max_x x√h(x).

(Hint: It suffices to show that if (u, v) ∈ Ch then (u, v) ∈ Dh where Dh is defined using u+, v−, and v+ above.)
(c) Implement (in R) the method above for the standard normal distribution taking h(x) = exp(−x²/2). In this case, u+ = 1, v− = −√(2/e) = −0.8577639, and v+ = √(2/e) = 0.8577639.
What is the probability that proposals are accepted?
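One possible R implementation of part (c), as a sketch (the function name ratio_unif_normal is mine; the bounds u+, v−, v+ are those given above):

```r
# Ratio-of-uniforms sampler for the standard normal, h(x) = exp(-x^2/2).
ratio_unif_normal <- function(n) {
  u_plus  <- 1
  v_plus  <- sqrt(2 / exp(1))   # 0.8577639
  v_minus <- -v_plus
  x <- numeric(0)
  tries <- 0
  while (length(x) < n) {
    # propose in batches of n from the rectangle D_h
    u <- runif(n, 0, u_plus)
    v <- runif(n, v_minus, v_plus)
    tries <- tries + n
    # accept if (u, v) lies in C_h: u <= sqrt(h(v/u)) <=> u^2 <= exp(-(v/u)^2/2)
    keep <- u^2 <= exp(-(v / u)^2 / 2)
    x <- c(x, (v / u)[keep])
  }
  list(x = x[1:n], accept_rate = length(x) / tries)
}
```

The empirical acceptance rate should be close to |Ch|/|Dh| = (√(2π)/2)/(2√(2/e)) = √(πe)/4 ≈ 0.7306, since |Ch| = (1/2)∫h(x) dx.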
2. Suppose we observe y1, · · · , yn where
yi = θi + εi (i = 1, · · · , n)
where {εi} is a sequence of random variables with mean 0 and finite variance representing
noise. We will assume that θ1, · · · , θn are dependent in the sense that |θi − θi−1| is small
for most values of i. The least squares estimates of θ1, · · · , θn are trivial (θ̂i = yi for all i), but we can modify least squares in a number of ways to accommodate the “smoothness” assumption on {θi}. In this problem, we will consider estimating {θi} by minimizing

∑_{i=1}^{n} (yi − θi)² + λ ∑_{i=2}^{n} (θi − θi−1)²

where λ > 0 is a tuning parameter. (The term λ ∑_{i=2}^{n} (θi − θi−1)² is a “roughness” penalty. In practice, it is more common to use λ ∑_{i=2}^{n} |θi − θi−1| as it better estimates jumps in {θi}.)
(a) Show the estimates θ̂1, · · · , θ̂n satisfy the equations

y1 = (1 + λ)θ̂1 − λθ̂2
yj = −λθ̂{j−1} + (1 + 2λ)θ̂j − λθ̂{j+1} (j = 2, · · · , n − 1)
yn = −λθ̂{n−1} + (1 + λ)θ̂n

(Note that if we write this in matrix form y = Aλθ̂, the matrix Aλ is sparse.)
(b) Show (using results from class) that both the Jacobi and Gauss-Seidel algorithms can be
used to compute the estimates defined in part (a).
(c) Write a function in R to implement the Gauss-Seidel algorithm above. The inputs for this
function are a vector of responses y and the tuning parameter lambda. Test your function
(for various tuning parameters) on data generated by the following command:
> y <- c(rep(0,250),rep(1,250),rep(0,50),rep(1,450)) + rnorm(1000,0,0.1)
How does varying λ change the estimates, particularly the estimates of the transitions from
0 to 1 in the step function?
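A sketch of such a function (the names gauss_seidel, tol, and max_iter are mine; the updates come from solving each equation in part (a) for its own θ̂j):

```r
# Gauss-Seidel solver for the tridiagonal system in part (a).
# Iterates until successive estimates change by less than tol.
gauss_seidel <- function(y, lambda, tol = 1e-8, max_iter = 10000) {
  n <- length(y)
  theta <- y  # least-squares solution as the starting value
  for (iter in 1:max_iter) {
    theta_old <- theta
    theta[1] <- (y[1] + lambda * theta[2]) / (1 + lambda)
    for (j in 2:(n - 1)) {
      theta[j] <- (y[j] + lambda * (theta[j - 1] + theta[j + 1])) / (1 + 2 * lambda)
    }
    theta[n] <- (y[n] + lambda * theta[n - 1]) / (1 + lambda)
    if (max(abs(theta - theta_old)) < tol) break
  }
  theta
}
```

Running this on the data generated above for several values of lambda (say 1, 10, 100) should show the estimates becoming smoother as λ grows, with the sharp 0-to-1 transitions in the step function increasingly rounded off.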
Supplemental problems (not to hand in):
3. To generate random variables from some distributions, we need to generate two independent random variables Y and V where Y has a uniform distribution over
some finite set and V has a uniform distribution on [0, 1]. It turns out that Y and V can be
generated from a single Unif(0, 1) random variable U.
(a) Suppose for simplicity that the finite set is {0, 1, · · · , n − 1} for some integer n ≥ 2. For
U ∼ Unif(0, 1), define
Y = ⌊nU⌋ and V = nU − Y

where ⌊x⌋ is the integer part of x. Show that Y has a uniform distribution on the set
{0, 1, · · · , n − 1}, V has a uniform distribution on [0, 1], and Y and V are independent.
(b) What happens to the precision of V defined in part (a) as n increases? (For example, if
U has 16 decimal digits and n = 10^6, how many decimal digits will V have?) Is the method
in part (a) particularly feasible if n is very large?
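The construction in part (a) can be sketched in a few lines of R (the function name split_uniform is mine):

```r
# Split a single Unif(0,1) draw u into a discrete uniform Y on {0, ..., n-1}
# and a continuous Unif(0,1) variable V, as in problem 3(a).
split_uniform <- function(u, n) {
  y <- floor(n * u)  # integer part of nU
  v <- n * u - y     # fractional part of nU
  list(y = y, v = v)
}
```

Note that each draw of Y effectively consumes about log10(n) of U's decimal digits, which is the point of part (b): with 16-digit U and n = 10^6, only about 10 digits remain for V.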
4. One issue with rejection sampling is a lack of efficiency due to the rejection of random
variables generated from the proposal density. An alternative is the acceptance-complement
(A-C) method described below.
Suppose we want to generate a continuous random variable from a density f(x) and that
f(x) = f1(x) + f2(x) (where both f1 and f2 are non-negative) where f1(x) ≤ g(x) for some
density function g. Then the A-C method works as follows:
1. Generate two independent random variables U ∼ Unif(0, 1) and X with density g.
2. If U ≤ f1(X)/g(X) then return X.
3. Otherwise (that is, if U > f1(X)/g(X)) generate X from the density

f∗(x) = f2(x) / ∫_{−∞}^{∞} f2(t) dt.

Note that we must be able to easily sample from g and f∗ in order for the A-C method to be efficient; in some cases, they can both be taken to be uniform distributions.
(a) Show that the A-C method generates a random variable with a density f. What is the
probability that the X generated in step 1 (from g) is “rejected” in step 2?
(b) Suppose we want to sample from the truncated Cauchy density

f(x) = 2/(π(1 + x²)) (−1 ≤ x ≤ 1)

using the A-C method with f2(x) = k, a constant, for −1 ≤ x ≤ 1 (so that f∗(x) = 1/2 is a uniform density on [−1, 1]) with

f1(x) = f(x) − f2(x) = f(x) − k (−1 ≤ x ≤ 1).
If g(x) is also a uniform density on [−1, 1], for what range of values of k can the A-C method
be applied?
(c) Defining f1, f2, and g as in part (b), what value of k minimizes the probability that X
generated in step 1 of the A-C algorithm is rejected?
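The three steps of the A-C method for this example can be sketched as follows (the function name ac_truncated_cauchy is mine, and k = 0.2 is only an illustrative constant lying in the valid range from part (b), not necessarily the optimizer asked for in part (c)):

```r
# Acceptance-complement sampler for the truncated Cauchy density
# f(x) = 2/(pi*(1 + x^2)) on [-1, 1], with f2(x) = k constant.
f <- function(x) 2 / (pi * (1 + x^2))

ac_truncated_cauchy <- function(n, k = 0.2) {
  x <- runif(n, -1, 1)            # step 1: X ~ g, uniform on [-1, 1]
  u <- runif(n)                   # step 1: U ~ Unif(0, 1)
  reject <- u > (f(x) - k) / 0.5  # step 2: keep X when U <= f1(X)/g(X)
  # step 3: rejected draws are replaced by draws from f* = Unif(-1, 1)
  x[reject] <- runif(sum(reject), -1, 1)
  x
}
```

Note that, unlike plain rejection sampling, no draw is wasted: a rejection in step 2 is repaired by a single draw from f∗ in step 3.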
5. Suppose we want to generate a random variable X from the tail of a standard normal
distribution, that is, a normal distribution conditioned to be greater than some constant b.
The density in question is

f(x) = exp(−x²/2) / (√(2π)(1 − Φ(b))) for x ≥ b
with f(x) = 0 for x < b where Φ(x) is the standard normal distribution function. Consider
rejection sampling using the shifted exponential proposal density
g(x) = b exp(−b(x − b)) for x ≥ b.
(a) Define Y to be an exponential random variable with mean 1 and U to be a uniform random variable on [0, 1] independent of Y. Show that the rejection sampling scheme defines X = b + Y/b if

−2 ln(U) ≥ (Y/b)².

(Hint: Note that b + Y/b has density g.)
(b) Show the probability of acceptance is given by

√(2π) b (1 − Φ(b)) exp(b²/2).

What happens to this probability for large values of b? (Hint: You need to evaluate M = max_{x ≥ b} f(x)/g(x).)
(c) Suppose we replace the proposal density g defined above by
gλ(x) = λ exp(−λ(x − b)) for x ≥ b.
(Note that gλ is also a shifted exponential density.) What value of λ maximizes the probability of acceptance? (Hint: Note that you are trying to solve the minimax problem

min_λ max_{x ≥ b} f(x)/gλ(x)

for λ. Because the density gλ(x) has heavier tails, the minimax problem above will have the same solution as the maximin problem

max_{x ≥ b} min_λ f(x)/gλ(x)

which may be easier to solve.)
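The scheme in part (a) translates directly into R (the function name rnorm_tail is mine):

```r
# Rejection sampler for the standard normal tail beyond b, problem 5(a):
# propose X = b + Y/b with Y ~ Exp(1), accept when -2*log(U) >= (Y/b)^2.
rnorm_tail <- function(n, b) {
  x <- numeric(n)
  got <- 0
  while (got < n) {
    y <- rexp(1)  # exponential with mean 1
    u <- runif(1)
    if (-2 * log(u) >= (y / b)^2) {
      got <- got + 1
      x[got] <- b + y / b
    }
  }
  x
}
```

The empirical acceptance rate should match the formula in part (b), √(2π) b (1 − Φ(b)) exp(b²/2), which approaches 1 as b grows; this is why the shifted exponential proposal works well precisely in the far tail.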