# CH 5440 Multivariate Data Analysis Assignment 1 solution


## Description


1. (a) Let $u_1, u_2, \ldots, u_N$ and $y_1, y_2, \ldots, y_N$ be a set of N measurements of two variables u and
y which are linearly related. We are interested in determining the linear regression parameter
a where y = au + b.

Assume that the measurements of u and y contain errors, with standard
deviations $\sigma_\delta$ and $\sigma_\varepsilon$, respectively. (a) If the ratio of the error variances $\rho = \sigma_\varepsilon^2 / \sigma_\delta^2$ is known,
derive the weighted TLS (WTLS) estimates of a and b in terms of $s_{uu}$, $s_{yy}$, $\bar{u}$, $\bar{y}$, and $\rho$. (b) How
will the solution for a change if it is given that the constant b is 0?

Note: The WTLS regression problem when the error variances are known is the solution of
the following minimization problem. Multiply the objective function by $\sigma_\varepsilon^2$ and replace the
ratio of the error variances by $\rho$. Differentiate the objective function with respect to the decision
variables and solve the resulting set of nonlinear algebraic equations to obtain the parameters
a and b.

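$$
\min_{a,\,b,\,\hat{u}_i} \; \sum_{i=1}^{N} \left[ \frac{(y_i - a\hat{u}_i - b)^2}{\sigma_\varepsilon^2} + \frac{(u_i - \hat{u}_i)^2}{\sigma_\delta^2} \right]
$$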
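The scaling step described in the Note can be written out as follows. This is only a sketch of the first step, assuming that $\rho$ denotes $\sigma_\varepsilon^2 / \sigma_\delta^2$ and that the objective is summed over all N measurements; the symbol $J$ is introduced here for convenience and is not part of the assignment text.

$$
J(a, b, \hat{u}_1, \ldots, \hat{u}_N) = \sum_{i=1}^{N} \left[ (y_i - a\hat{u}_i - b)^2 + \rho\,(u_i - \hat{u}_i)^2 \right]
$$

Setting $\partial J / \partial \hat{u}_i = 0$ gives, for each measurement,

$$
-2a\,(y_i - a\hat{u}_i - b) - 2\rho\,(u_i - \hat{u}_i) = 0
\quad\Longrightarrow\quad
\hat{u}_i = \frac{\rho\,u_i + a\,(y_i - b)}{\rho + a^2}.
$$

The conditions for a and b follow in the same way and are left as the derivation the problem asks for.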

(b) From the solution of WTLS, obtain the solutions for the regression parameters for IOLS
and OLS in the limits $\rho \to 0$ and $\rho \to \infty$, respectively. Also obtain the estimates of $u_i$ and
$y_i$ for each case (OLS, IOLS, WTLS) in terms of the regression parameters and measurements.
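A hand derivation for Problem 1 can be checked numerically. The sketch below uses the standard errors-in-variables (Deming) closed form for the slope, assuming $\rho$ is the ratio of the y-error variance to the u-error variance. The function name `wtls_fit`, the sample covariance `s_uy` (which the problem statement does not list), and the small data set are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def wtls_fit(u, y, rho):
    """Slope a and intercept b of y = a*u + b for a known variance ratio.

    Assumes rho = var(error in y) / var(error in u), i.e. the scaled
    objective sum (y - a*u_hat - b)^2 + rho * (u - u_hat)^2.
    Requires a nonzero sample covariance s_uy.
    """
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    du, dy = u - u.mean(), y - y.mean()
    s_uu = np.mean(du * du)
    s_yy = np.mean(dy * dy)
    s_uy = np.mean(du * dy)
    d = s_yy - rho * s_uu
    a = (d + np.sqrt(d * d + 4.0 * rho * s_uy ** 2)) / (2.0 * s_uy)
    b = y.mean() - a * u.mean()
    return a, b

# Hypothetical measurements, used only to exercise the limiting cases.
u_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

a_small_rho, _ = wtls_fit(u_demo, y_demo, 1e-8)   # approaches IOLS
a_unit_rho, _ = wtls_fit(u_demo, y_demo, 1.0)     # unscaled TLS
a_large_rho, _ = wtls_fit(u_demo, y_demo, 1e8)    # approaches OLS
print(a_small_rho, a_unit_rho, a_large_rho)
```

For positively correlated data the three slopes should be ordered from largest (small $\rho$) to smallest (large $\rho$), which is a quick sanity check on the limits asked for in part (b).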

2. Carbon dioxide (CO2) is one of the major greenhouse gases implicated in the gradual
warming of the earth. Measured concentrations of CO2 (in ppm) and
atmospheric temperature (spatially and temporally averaged over a year), available from
USEPA’s Climate Change Indicators website (www.epa.gov/climate-indicators) for the years
1984 to 2014, are given in Table 1.

The temperatures are deviations, in °F, from the average
temperature over the period 1901-2000. Climate models recommend that the global temperature
increase be kept below 2 °C (3.6 °F) by cutting down on CO2 emissions. Using
OLS and TLS regression, estimate the maximum permissible level of CO2 in the
atmosphere that can meet this goal.

Assuming that the level of CO2 increases linearly with time,
estimate from the given data how many years it will take for CO2 to reach the maximum
permissible level. Note that this is a simplified analysis because other greenhouse gases such as
methane, nitrous oxide, water vapour, etc. have not been considered. In order to improve your
model you are encouraged to use other reliable data sources you can find (cite the sources from

3. The level of phytic acid in urine samples was determined by a catalytic fluorimetric (CF)
method and the results were compared with those obtained using an established extraction
photometric (EP) technique. The results, in mg/L, are the means of triplicate measurements, as
shown in Table 2.

(a) Is the new method (CF) a good substitute for the established method (EP) for measuring
the level of phytic acid in urine? Justify your conclusion using linear regression between
the two methods for different modelling assumptions regarding the accuracy of the
respective measurement techniques.

(b) Estimate the level of phytic acid in urine if the EP measurement is 2.31 mg/L and the CF
measurement is 2.20 mg/L, under the different modelling assumptions.
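How the modelling assumption changes the estimate in part (b) can be illustrated with a short sketch. It treats EP as u and CF as y, and projects the measured pair onto the line y = au + b with error-variance ratio `rho` (y-error variance over u-error variance). The values a = 1 and b = 0 are hypothetical placeholders for the case of perfect agreement between the methods; in an actual solution they would come from the regression on Table 2. The function name `reconcile` is also an assumption.

```python
def reconcile(u, y, a, b, rho):
    """Estimates (u_hat, y_hat) lying on y = a*u + b.

    Minimizes (y - a*u_hat - b)^2 + rho * (u - u_hat)^2 over u_hat,
    where rho = var(error in y) / var(error in u) is assumed known.
    """
    u_hat = (rho * u + a * (y - b)) / (rho + a * a)
    y_hat = a * u_hat + b
    return u_hat, y_hat

ep, cf = 2.31, 2.20      # measurements quoted in part (b), in mg/L
a, b = 1.0, 0.0          # hypothetical line: replace with the fitted values

# rho -> infinity: EP taken as error-free (OLS of CF on EP).
# rho = 1:         equal error variances (unscaled TLS).
# rho = 0:         CF taken as error-free (IOLS).
for label, rho in [("OLS", 1e12), ("TLS", 1.0), ("IOLS", 0.0)]:
    u_hat, y_hat = reconcile(ep, cf, a, b, rho)
    print(label, round(u_hat, 4), round(y_hat, 4))
```

Under the OLS assumption the estimate stays at the EP reading, under IOLS it moves to the value implied by the CF reading, and under TLS it falls in between.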

4. Image analysis is used to identify defects in infrastructure such as bridges and roads, or in
manufactured products such as glass sheets and rolled steel sheets. One of the first steps in
image analysis is annotation of the defect using an annotation tool such as CVAT, where each
defect is marked using a polygon enclosing the defect.

The corners of the polygon are pixels
which are indicated by the x and y coordinates of the pixel in the image. Table 3 gives the x
and y coordinates of the corner points of the polygons for three different defects found from a
drone image of a concrete pillar of a bridge. It is required to estimate the orientation of the
defect and check if it is aligned with the horizontal or vertical axis (indicating perhaps it is due
to corrosion of vertical or horizontal steel reinforcement bars buried within the concrete).
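One possible way to estimate the orientation is to take the direction of largest spread of the corner points, which is the same direction as the orthogonal (TLS) line fitted through them. The sketch below does this with an SVD of the centered coordinates. The function names, the 10 degree tolerance, and the demo polygon are all hypothetical and are not taken from Table 3.

```python
import numpy as np

def principal_angle_deg(points):
    """Angle between the principal axis of the points and the x-axis.

    The angle is folded into the range 0 to 180 degrees, since a line
    has no preferred direction.
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    vx, vy = vt[0]                       # direction of largest spread
    return float(np.degrees(np.arctan2(vy, vx)) % 180.0)

def axis_alignment(angle_deg, tol_deg=10.0):
    """Classify an angle as horizontal, vertical, or oblique.

    tol_deg is an assumed tolerance; the assignment does not specify one.
    """
    off_horizontal = min(angle_deg, 180.0 - angle_deg)
    off_vertical = abs(angle_deg - 90.0)
    if off_horizontal <= tol_deg:
        return "horizontal"
    if off_vertical <= tol_deg:
        return "vertical"
    return "oblique"

# Hypothetical corner pixels of a tall, thin defect (NOT from Table 3).
demo_polygon = [(10, 0), (12, 0), (12, 50), (10, 50)]
demo_angle = principal_angle_deg(demo_polygon)
print(demo_angle, axis_alignment(demo_angle))
```

Image coordinates usually have y increasing downward, which flips the sign of the angle but does not change whether a defect is classified as horizontal or vertical. Using only the corner points (rather than the filled polygon area) weights each corner equally, which is a simplification.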

Identify which of the three defects could be due to corrosion of steel reinforcement bars.

Table 1. Measured average atmospheric CO2 concentration and temperature

| Year | CO2 (ppm) | Temp (°F) |
|------|-----------|-----------|
| 1984 | 344.58 | 0.27 |
| 1985 | 346.04 | 0.234 |
| 1986 | 347.39 | 0.414 |
| 1987 | 349.16 | 0.666 |
| 1988 | 351.56 | 0.666 |
| 1989 | 353.07 | 0.522 |
| 1990 | 354.35 | 0.774 |
| 1991 | 355.57 | 0.72 |
| 1992 | 356.38 | 0.45 |
| 1993 | 357.07 | 0.504 |
| 1994 | 358.82 | 0.612 |
| 1995 | 360.8 | 0.81 |
| 1996 | 362.59 | 0.576 |
| 1997 | 363.71 | 0.918 |
| 1998 | 366.65 | 1.134 |
| 1999 | 368.33 | 0.792 |
| 2000 | 369.52 | 0.756 |
| 2001 | 371.13 | 0.972 |
| 2002 | 373.22 | 1.08 |
| 2003 | 375.77 | 1.098 |
| 2004 | 377.49 | 1.026 |
| 2005 | 379.8 | 1.17 |
| 2006 | 381.9 | 1.098 |
| 2007 | 383.76 | 1.098 |
| 2008 | 385.59 | 0.972 |
| 2009 | 387.37 | 1.134 |
| 2010 | 389.85 | 1.26 |
| 2011 | 391.63 | 1.026 |
| 2012 | 393.82 | 1.116 |
| 2013 | 396.48 | 1.188 |
| 2014 | 398.61 | 1.332 |
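The Table 1 values can be used for Problem 2 roughly as follows. This is a sketch under stated assumptions, not a model solution: the TLS fit is the unscaled orthogonal fit (it treats a 1 ppm error and a 1 °F error as equally likely, so it depends on the units chosen), the CO2 trend in time is fitted by OLS, and the helper names `ols_fit` and `tls_fit` are hypothetical.

```python
import numpy as np

# Table 1 values: year, CO2 (ppm), temperature deviation (deg F).
years = np.arange(1984, 2015, dtype=float)
co2 = np.array([
    344.58, 346.04, 347.39, 349.16, 351.56, 353.07, 354.35, 355.57,
    356.38, 357.07, 358.82, 360.80, 362.59, 363.71, 366.65, 368.33,
    369.52, 371.13, 373.22, 375.77, 377.49, 379.80, 381.90, 383.76,
    385.59, 387.37, 389.85, 391.63, 393.82, 396.48, 398.61,
])
temp = np.array([
    0.270, 0.234, 0.414, 0.666, 0.666, 0.522, 0.774, 0.720,
    0.450, 0.504, 0.612, 0.810, 0.576, 0.918, 1.134, 0.792,
    0.756, 0.972, 1.080, 1.098, 1.026, 1.170, 1.098, 1.098,
    0.972, 1.134, 1.260, 1.026, 1.116, 1.188, 1.332,
])

def ols_fit(u, y):
    """Slope and intercept of y = a*u + b, assuming only y is noisy."""
    a = np.mean((u - u.mean()) * (y - y.mean())) / np.var(u)
    return a, y.mean() - a * u.mean()

def tls_fit(u, y):
    """Unscaled orthogonal fit via SVD of the centered data matrix."""
    z = np.column_stack([u - u.mean(), y - y.mean()])
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    n_u, n_y = vt[-1]                # normal vector of the fitted line
    a = -n_u / n_y
    return a, y.mean() - a * u.mean()

TARGET_F = 3.6                       # 2 deg C limit in deg F, from the text

a_ols, b_ols = ols_fit(co2, temp)
a_tls, b_tls = tls_fit(co2, temp)
co2_limit_ols = (TARGET_F - b_ols) / a_ols
co2_limit_tls = (TARGET_F - b_tls) / a_tls

# Linear CO2 trend in time: co2 = rate * year + offset.
rate, offset = ols_fit(years, co2)
years_to_limit_ols = (co2_limit_ols - offset) / rate - years[-1]
years_to_limit_tls = (co2_limit_tls - offset) / rate - years[-1]

print("CO2 limit (ppm):", co2_limit_ols, co2_limit_tls)
print("Years after 2014:", years_to_limit_ols, years_to_limit_tls)
```

Because the estimate extrapolates well beyond the measured temperature range, small changes in the fitted slope move the CO2 limit noticeably, which is one reason to compare the OLS and TLS results rather than report a single number.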