I am analyzing some data where the variables of interest are circular (angles). I use R and the `circular` package.

In my dataset, every observation consists of a 2D Euclidean vector representing a movement on a plane (the length is constant, only the angle varies), together with a measure of the response to that movement, expressed as a displacement in Cartesian coordinates (i.e. the displacement along the x and y axes). I would like to fit a model where the initial movement (either its x-y Cartesian representation or only its angle, since the length is constant) is the response variable. Ultimately, I would like to use the fitted model to estimate the initial movement vector in a related dataset where I only know the response to the movement.

My measurements of the response to the movement are affected by noise, independent over the x and y dimensions, with possibly different variances and different additive biases.

To represent the problem graphically: what I want to do is estimate the direction of the original vectors (the thick gray arrows in the figure below) starting from the noisy measurements (the thin black arrows).

I have tried fitting different models:

- one model with both the predictor and the dependent variable being circular (the angle of the initial movement vector, and the angle computed from the (x,y) displacement);
- one multivariate linear model, with the Cartesian (x,y) components of the initial movement as dependent variables, and the measured x,y displacement as linear predictors;
- one model with a circular dependent variable and linear predictors (this last one with no success).

First, I wasn't able to fit the third model, for a reason that I don't understand. Here is a reproducible example:

```r
require(circular)
theta <- circular(runif(500, 0, 2*pi), units="radians", type="angles",
                  zero=0, rotation="counter")
rho <- rep(8, 500)
# add measurement noise (different for x and y)
mX <- as.numeric(rnorm(500, 0, 2) + rho * cos(theta))
mY <- as.numeric(rnorm(500, -1, 1) + rho * sin(theta))
linearPred <- cbind(mX, mY)
# fit model
mcl <- lm.circular(y=theta, x=linearPred, init=c(1,1), type="c-l", verbose=T)
```

Here is the output:

```r
> mcl <- lm.circular(y=theta, x=linearPred, init=c(1,1), type="c-l", verbose=T)
Iteration 1 : Log-Likelihood = 2.392981
Iteration 2 : Log-Likelihood = 1.082084
Iteration 3 : Log-Likelihood = 0.8013503
Iteration 4 : Log-Likelihood = NA
Error in while (diff > tol) { : valore mancante dove è richiesto TRUE/FALSE
```

(the last line says: missing value where a TRUE/FALSE was required).

Can anyone shed light on this? I don't understand where this error comes from.
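One thing I noticed while poking at this (this is an assumption on my part, based on the Fisher–Lee formulation that I believe `lm.circular(..., type="c-l")` implements): the c-l model links the mean direction to the predictors through g(t) = 2·atan(t). With `init=c(1,1)` and predictors of magnitude around 8, the linear predictor starts deep in the saturated region of the link, which could plausibly flatten the likelihood and produce the `NA`. A base-R sketch of the saturation (no `circular` objects needed):

```r
# Sketch (assumption): the Fisher-Lee c-l model sets the mean direction to
# mu0 + 2 * atan(x_i %*% beta), so large |x %*% beta| pins the offset near +/- pi.
set.seed(1)
theta <- runif(500, 0, 2 * pi)
mX <- rnorm(500, 0, 2) + 8 * cos(theta)
mY <- rnorm(500, -1, 1) + 8 * sin(theta)
X <- cbind(mX, mY)

beta_init <- c(1, 1)                # the init used in the failing call
eta <- as.numeric(X %*% beta_init)  # linear predictor at the starting values
g   <- 2 * atan(eta)                # angular offsets implied by the link

# Most starting offsets sit near the asymptotes, where the link's
# gradient is almost zero:
mean(abs(eta) > 5)
range(g)
```

If that is really the cause, rescaling the predictors (e.g. `scale(linearPred)`) or starting from a much smaller `init` might keep the iterations in the informative part of the link, but I have not checked this against the package internals.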

Second question: which model would you suggest among the first two? Here is the code that I used for the two models:

```r
# fit sin(theta) and cos(theta) with a multivariate linear model
mmv <- lm(cbind(sin(theta), cos(theta)) ~ mX + mY)
# "circular-circular" model
angularPred <- circular(atan2(mY, mX), units="radians", type="angles",
                        zero=0, rotation="counter")
mcc <- lm.circular(y=theta, x=angularPred, type="c-c")
```

I can compute the angle from the multivariate fitted values as `atan2(mmv$fitted[,1], mmv$fitted[,2])`. Both models seem to perform similarly in terms of mean angular error. To compare the two, I computed the correlation between the predicted angles and the initial angle `theta` (the Jammalamadaka–Sarma correlation coefficient), and the multivariate model seems to perform slightly better:

```r
fitted.mmv <- circular(atan2(mmv$fitted[,1], mmv$fitted[,2]),
                       units="radians", type="angles", zero=0, rotation="counter")
> cor.circular(fitted.mmv, theta)  # multivariate
[1] 0.9851422
> cor.circular(mcc$fitted, theta)  # "circular-circular"
[1] 0.7262862
```
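For reference, the "mean angular error" I mention above can be computed in base R by wrapping the signed difference into (-π, π] via `atan2` (a sketch; `mean_angular_error` is a helper name I made up, and `pred`/`true` stand for any pair of predicted and true angle vectors):

```r
# Mean absolute angular error between two angle vectors, in radians.
# atan2(sin(d), cos(d)) wraps the raw difference d into (-pi, pi].
mean_angular_error <- function(pred, true) {
  d <- atan2(sin(pred - true), cos(pred - true))
  mean(abs(d))
}

# Example with a known constant offset:
theta <- runif(1000, 0, 2 * pi)
pred  <- theta + 0.1               # predictions off by 0.1 rad everywhere
mean_angular_error(pred, theta)    # -> 0.1
```

The wrapping step matters: a naive `mean(abs(pred - true))` would badly overstate the error for angle pairs that straddle the 0/2π boundary.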

However, the distribution of residuals of the multivariate model shows a strange pattern (figure below).

Is this pattern in the residuals a problem? Can anyone give some advice on this? Which model would you use in this case? Is there any other approach that you would recommend?

Any advice is appreciated, thanks!


#### Best Answer

The pattern in the residuals is not necessarily a problem. One way to check this is to simulate a set of responses from the model that you just fitted (that is, under the assumption that the model is correct), fit a new model to the results, and plot *its* residuals. This gives you a measure of how weird you would expect the plot to look even if nothing were wrong.

If you can reliably pick out the original model from several plots produced in this way, *then* you should start worrying about the residual pattern.
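This check can be sketched in base R for the multivariate model. Below is a minimal sketch, with simulated data standing in for your `theta`, `mX`, `mY`; resimulating the responses as fitted values plus Gaussian noise at the residual standard deviation is the "assume the model is correct" step:

```r
set.seed(42)
theta <- runif(500, 0, 2 * pi)
mX <- rnorm(500, 0, 2) + 8 * cos(theta)
mY <- rnorm(500, -1, 1) + 8 * sin(theta)

# Original multivariate fit: (sin(theta), cos(theta)) on the noisy displacements
m0 <- lm(cbind(sin(theta), cos(theta)) ~ mX + mY)

# Simulate new responses under the fitted model: fitted values plus Gaussian
# noise with each column's residual standard deviation.
sds   <- apply(resid(m0), 2, sd)
y_sim <- fitted(m0) + cbind(rnorm(500, 0, sds[1]), rnorm(500, 0, sds[2]))

# Refit on the simulated responses and inspect *its* residuals; if this plot
# looks as "strange" as the original one, the pattern is expected under the model.
m_sim <- lm(y_sim ~ mX + mY)
plot(fitted(m_sim)[, 1], resid(m_sim)[, 1],
     xlab = "fitted sin component", ylab = "residual")
```

Repeating this a handful of times and shuffling the simulated plots in with the real one (a "lineup") makes the comparison honest: if you cannot tell which plot came from the real data, the pattern carries no evidence against the model.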