## Description

## Local Smoothing

The goal of this homework is to help you better understand the statistical properties and computational challenges of local smoothing methods such as loess, Nadaraya-Watson (NW) kernel smoothing, and spline smoothing.

For this purpose, we will compute the empirical bias, empirical variance, and empirical mean squared error (MSE) based on m = 1000 Monte Carlo runs, where in each run we simulate a data set of n = 101 observations from the additive noise model

$$Y_i = f(x_i) + \epsilon_i, \qquad (1)$$

with the famous Mexican hat function

$$f(x) = (1 - x^2)\,\exp(-0.5x^2), \qquad -2\pi \le x \le 2\pi, \qquad (2)$$

where $\epsilon_1, \cdots, \epsilon_n$ are independent and identically distributed (iid) $N(0, 0.2^2)$.

This function is known to pose a variety of estimation challenges, and below we explore the difficulties inherent in estimating it.

(1) Let us first consider the deterministic (fixed) design with equidistant points in [−2π, 2π].

(a) For each of m = 1000 Monte Carlo runs, simulate a data set of the form (x_i, Y_i) with

$$x_i = 2\pi\left(-1 + 2\,\frac{i-1}{n-1}\right),$$

and Y_i generated from the model in (1). Denote this data set by D_j at the j-th Monte Carlo run, for j = 1, · · · , m = 1000.
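As a quick sanity check (purely illustrative, not part of the required solution), the design formula above reproduces the equidistant grid that `seq` generates in the Appendix sample code:

```r
## equidistant design points x_i = 2*pi*(-1 + 2*(i-1)/(n-1)), i = 1, ..., n
n <- 101
i <- 1:n
x_formula <- 2*pi*(-1 + 2*(i - 1)/(n - 1))
## the same grid via seq(), as in the Appendix sample code
x_seq <- 2*pi*seq(-1, 1, length.out = n)
## the two constructions agree up to floating-point error
max(abs(x_formula - x_seq))
```

Note that x_1 = −2π and x_n = 2π, so the design covers the full interval in (2).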

(b) For each data set D_j (i.e., each Monte Carlo run), compute the three different local smoothing estimates at every point in D_j: loess (with span = 0.75), Nadaraya-Watson (NW) kernel smoothing with a Gaussian kernel and bandwidth = 0.2, and spline smoothing with the default tuning parameter.

(c) At each point x_i, for each local smoothing method, based on the m = 1000 Monte Carlo runs, compute the empirical bias, empirical variance, and empirical mean squared error (MSE), which are defined as

$$\mathrm{Bias}\{f(x_i)\} = \bar{f}_m(x_i) - f(x_i), \quad \text{with } \bar{f}_m(x_i) = \frac{1}{m}\sum_{j=1}^{m} \hat{f}^{(j)}(x_i),$$

$$\mathrm{Var}\{f(x_i)\} = \frac{1}{m}\sum_{j=1}^{m} \left( \hat{f}^{(j)}(x_i) - \bar{f}_m(x_i) \right)^2,$$

$$\mathrm{MSE}\{f(x_i)\} = \frac{1}{m}\sum_{j=1}^{m} \left( \hat{f}^{(j)}(x_i) - f(x_i) \right)^2.$$

Here we use the true function value f(x_i) in (2) in the definitions of the empirical bias and empirical MSE, which helps us better understand the performance of the different local smoothing methods when estimating the true function f. Moreover, we purposely use the coefficient 1/m (instead of the standard coefficient 1/(m−1)) in the definition of the empirical variance, so that the well-known relation MSE = Bias^2 + Var holds exactly for the empirical versions.
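These definitions vectorize naturally in R. Below is a minimal sketch (the names `fv` and `ftrue` are illustrative, not prescribed by the assignment): `fv` is an n × m matrix of fitted values with one column per Monte Carlo run, as in the Appendix sample code, and `ftrue` holds the true values f(x_i).

```r
## empirical bias/variance/MSE at every design point
## fv[i, j] = fitted value at x_i in the j-th run; ftrue[i] = f(x_i)
fbar <- rowMeans(fv)               # bar f_m(x_i), averaged over the m runs
bias <- fbar - ftrue               # empirical bias
vars <- rowMeans((fv - fbar)^2)    # empirical variance, with coefficient 1/m
mse  <- rowMeans((fv - ftrue)^2)   # empirical MSE
## with the 1/m convention, MSE = Bias^2 + Var holds exactly
max(abs(mse - (bias^2 + vars)))
```

The recycling of `fbar` and `ftrue` down the columns of `fv` works because both have length n = nrow(fv).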

(d) Plot these quantities against x_i for all three local smoothing estimators: loess, NW kernel, and spline smoothing.

(e) Provide a thorough analysis of what the plots suggest, e.g., which method is better/worse in terms of bias, variance, and MSE? Do you think the comparison between these three methods is fair? Why or why not?

(2) Repeat part (1) with another (deterministic) design that has non-equidistant points in the interval [−2π, 2π]. The following R code was used to generate the design points x_i's on my laptop, denoted by x2 below (you can keep these x_i's fixed across the m = 1000 Monte Carlo runs):

```r
set.seed(79)
x2 <- round(2*pi*sort(c(0.5, -1 + rbeta(50,2,2), rbeta(50,2,2))), 8)
```

For those students who use Python or other software, below are the values of x2, which are also available in the dataset “HW04part2.x.csv”:

```
  [1] -6.15270162 -5.96542722 -5.34857373 -5.25045325 -4.93408232
  [6] -4.82399219 -4.79756399 -4.72091648 -4.56066454 -4.52796823
 [11] -4.37012804 -4.35883971 -4.35650125 -4.14444058 -4.08772246
 [16] -3.99254444 -3.85521732 -3.79979095 -3.68856521 -3.66394780
 [21] -3.61158538 -3.49345211 -3.43019447 -3.19560754 -3.18518010
 [26] -3.12475790 -2.97880845 -2.89425664 -2.88764749 -2.70117864
 [31] -2.65330733 -2.63339792 -2.43171710 -2.23984616 -2.20285070
 [36] -1.95028199 -1.89814133 -1.70630816 -1.66101539 -1.41528948
 [41] -1.35654375 -1.32557550 -1.31997953 -1.28553107 -1.13468911
 [46] -1.06343202 -0.96219478 -0.79705203 -0.61365764 -0.61076290
 [51]  0.06614811  0.16881684  0.78634836  0.92627384  0.98082320
 [56]  1.23222276  1.26040912  1.35389378  1.38890407  1.46203893
 [61]  1.64117907  1.64643080  1.72974830  1.77209389  1.83964734
 [66]  1.91725385  2.03169983  2.07038412  2.28388853  2.28508649
 [71]  2.44458357  2.54947996  2.68015883  2.72649568  2.73596466
 [76]  2.90513928  2.99685314  3.10296534  3.14159265  3.43473453
 [81]  3.46738955  3.80582747  3.89114580  3.96880166  3.97037737
 [86]  4.36755363  4.41302445  4.46414213  4.49439898  4.49756067
 [91]  4.59615407  5.01854928  5.04809437  5.07324581  5.12980909
 [96]  5.13371469  5.14677578  5.14849898  5.55918177  5.62584723
[101]  5.95526169
```

Here, for simplicity and a reasonable comparison, when estimating and predicting with the local smoothing methods, please use span = 0.3365 for loess, bandwidth = 0.2 for NW kernel smoothing, and spar = 0.7163 for spline smoothing.

Discuss the statistical and computational challenges of using these three local smoothing methods to estimate the Mexican hat function under this non-equidistant design, including the choice of suitable tuning parameters.
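As one concrete illustration of these challenges (a sketch only, not part of the required solution): the x2 design has a wide gap around x = 0, and `ksmooth` with a small bandwidth may return NA at estimation points whose kernel window contains no data. Evaluating the NW fit on a fine grid that crosses this gap can expose such undefined regions:

```r
set.seed(79)
x2 <- round(2*pi*sort(c(0.5, -1 + rbeta(50,2,2), rbeta(50,2,2))), 8)
y  <- (1 - x2^2) * exp(-0.5 * x2^2) + rnorm(length(x2), sd = 0.2)
## evaluate the NW fit on a fine grid that crosses the data gap near x = 0
xg  <- seq(-2*pi, 2*pi, length.out = 401)
fit <- ksmooth(x2, y, kernel = "normal", bandwidth = 0.2, x.points = xg)
sum(is.na(fit$y))   # number of grid points where the NW estimate is undefined
```

Loess and smoothing splines do not fail in this way, but their estimates inside the gap rely heavily on extrapolation from distant points, which is part of the statistical challenge here.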

Appendix: below is some sample R code that may be useful for this homework; please note that you might need to modify/revise it!

```r
## Part #1: deterministic equidistant design
## Generate n = 101 equidistant points in [-2*pi, 2*pi]
m <- 1000
n <- 101
x <- 2*pi*seq(-1, 1, length = n)

## Initialize the matrices of fitted values for the three methods
fvlp <- fvnw <- fvss <- matrix(0, nrow = n, ncol = m)

## Generate data, fit the data, and store the fitted values
for (j in 1:m){
  ## simulate y-values
  ## Note: you need to replace f(x) below by its mathematical definition in eq. (2)
  y <- f(x) + rnorm(length(x), sd = 0.2)
  ## Get the estimates and store them
  fvlp[, j] <- predict(loess(y ~ x, span = 0.75), newdata = x)
  fvnw[, j] <- ksmooth(x, y, kernel = "normal", bandwidth = 0.2, x.points = x)$y
  fvss[, j] <- predict(smooth.spline(x, y), x = x)$y
}
```

Below is sample R code to plot the means of the three estimators in a single plot:

```r
meanlp <- apply(fvlp, 1, mean)
meannw <- apply(fvnw, 1, mean)
meanss <- apply(fvss, 1, mean)
dmin <- min(meanlp, meannw, meanss)
dmax <- max(meanlp, meannw, meanss)
matplot(x, meanlp, type = "l", ylim = c(dmin, dmax), ylab = "Response")
matlines(x, meannw, col = "red")
matlines(x, meanss, col = "blue")
## You might add the raw observations to compare with the fitted curves
# points(x, y)
## Can you adapt the above code to plot the empirical bias/variance/MSE?
```

```r
## Part #2: non-equidistant design
## Assume the file "HW04part2.x.csv" is saved in the local folder "C:/temp";
## extract the single numeric column of design points
x2 <- read.table(file = "C:/temp/HW04part2.x.csv", header = TRUE)[, 1]

## Within each loop, you can consider the three local smoothing methods;
## remember that you need to first simulate the Y values within each loop
y <- (1 - x2^2) * exp(-0.5 * x2^2) + rnorm(length(x2), sd = 0.2)
predict(loess(y ~ x2, span = 0.3365), newdata = x2)
ksmooth(x2, y, kernel = "normal", bandwidth = 0.2, x.points = x2)$y
predict(smooth.spline(x2, y, spar = 0.7163), x = x2)$y
```