## Description

File usparkvisits.csv contains annual numbers of recreational visits to 30 different parks

managed by the US National Park Service for the years 2006 through 2018.1 You will build and

compare two different varying-coefficient hierarchical normal regression models for the log-scale

numbers, using JAGS and rjags.

(a) (i) [2 pts] On the same set of axes, plot segmented lines, one for each park, representing

the number of visits versus the year (2006 through 2018). Distinguish the lines for

different parks by using different colors or line types. (You should label the axes, but no

legend is needed.)

(ii) [2 pts] Repeat the previous part using the natural logarithm of the number of visits.

Let yij be the natural logarithm of the number of visits to park j in year i (i = 1, . . . , 13,

j = 1, . . . , 30). For each park, model the log-number as a simple linear regression on the

centered year number:

yij | β

(j)

, σ2

y

, X ∼ indep. N

β

(j)

1 + β

(j)

2

(xi − x¯), σ2

y

where

β

(j) =

β

(j)

1

β

(j)

2

!

j = 1, . . . , 30 xi = i i = 1, . . . , 13

Note that the coefficients are allowed to depend on the park, but the variance is not.

(b) Let βˆ

(j)

1

and βˆ

(j)

2

be the ordinary least squares estimates of β

(j)

1

and β

(j)

2

, estimated for

park j. You may use the lm function in R to compute these estimates. (For this part, the

coefficient pairs are estimated completely separately for each park.)

(i) [1 pt] Produce a scatterplot of the pairs

βˆ

(j)

1

, βˆ

(j)

2

, j = 1, . . . , 30.

(ii) [1 pt] Give the average (sample mean) of βˆ

(j)

1

and also of βˆ

(j)

2

.

(iii) [1 pt] Give the sample variance of βˆ

(j)

1

and also of βˆ

(j)

2

.

(iv) [1 pt] Give the sample correlation between βˆ

(j)

1

and βˆ

(j)

2

.

(c) Consider the bivariate prior

β

(j)

| µβ, Σβ ∼ iid N(µβ, Σβ)

µβ =

µβ1

µβ2

!

Σβ =

σ

2

β1

ρ σβ1 σβ2

ρ σβ1 σβ2 σ

2

β2

!

with hyperpriors

µβ ∼ N

0, 10002

I

Σ

−1

β ∼ Wishart2

Σ

−1

0

/2

1Data from https://irma.nps.gov/STAT

in the notation used in the lecture videos. For your analysis, use

Σ0 =

10 0

0 0.01!

based on preliminary analyses. Let the prior on σ

2

y be

σ

2

y ∼ Inv-gamma(0.0001, 0.0001)

(i) [2 pts] List an appropriate JAGS model. Make sure to create nodes for Σβ, ρ, and σ

2

y

.

Remember that the numbers of visits are to be analyzed on the log scale.

Now run your model using rjags. Make sure to use multiple chains with overdispersed

starting points, check convergence, and monitor µβ, Σβ, σ

2

y

, and ρ (after convergence)

long enough to obtain effective sample sizes of at least 4000 for each parameter.

(ii) [2 pts] Display the coda summary of the results for the monitored parameters.

(iii) [2 pts] Give an approximate 95% central posterior interval for the correlation

parameter ρ, and also produce a graph of its (estimated) posterior density.

(iv) [2 pts] Approximate the posterior probability that ρ < 0. Also, compute the Bayes

factor favoring ρ < 0 versus ρ > 0. (You may use the fact that ρ < 0 and ρ > 0 have

equal prior probability.) Describe the level of data evidence for ρ < 0.

(v) [1 pt] Your model implies that, over the 12 years from 2006 to 2018, the (population)

median number of visits should have changed by a factor of

e

12µβ2

Form an approximate 95% central posterior interval for this quantity.

(vi) [2 pts] Use the rjags function dic.samples to compute the effective number of

parameters (“penalty”) and Plummer’s DIC (“Penalized deviance”). Use at least

100,000 iterations.

(d) Now consider a different model with “univariate” hyperpriors for the model coefficients, which

do not allow for a coefficient correlation parameter:

β

(j)

1

| µβ1

, σβ1 ∼ iid N

µβ1

, σ2

β1

β

(j)

2

| µβ2

, σβ2 ∼ iid N

µβ2

, σ2

β2

with hyperpriors

µβ1

, µβ2 ∼ iid N

0, 10002

σβ1

, σβ2 ∼ iid U(0, 1000)

Let the prior on σ

2

y be the same as in the previous model.

(i) [4 pts] Draw a complete DAG for this new model.

(ii) [2 pts] List an appropriate JAGS model. Make sure that there are nodes for σ

2

β1

, σ

2

β2

,

and σ

2

y

.

Remember that the numbers of visits are to be analyzed on the log scale.

Now run your model using rjags. Make sure to use multiple chains with overdispersed

starting points, check convergence, and monitor µβ1

, µβ2

, σ

2

β1

, σ

2

β2

, σ

2

y

(after

convergence) long enough to obtain effective sample sizes of at least 4000 for each

parameter.

(iii) [2 pts] Display the coda summary of the results for the monitored parameters.

(iv) [2 pts] Recall the (population) median change factor for the number of visits,

e

12µβ2

as considered in the previous analysis. Form an approximate 95% central posterior

interval for this quantity, and compare it with the previous results.

(v) [2 pts] Use the rjags function dic.samples to compute the effective number of

parameters (“penalty”) and Plummer’s DIC (“Penalized deviance”). Use at least

100,000 iterations.

(vi) [1 pt] Compare the (Plummer’s) DIC values for this model and the previous one.

Which is preferred?

Total: 32 pts

3