Why you shouldn’t use the offset function in MCMCglmm

The problem: I have feeding rate data from nest watches of different durations. I want to control for these duration differences in a Poisson model in R.

What you shouldn’t do: I used the offset function to control for nest watch duration in MCMCglmm, and the model ran without error. Great! Well, not so great –  using the dummy data and running the 7 models below shows that offset doesn’t work in MCMCglmm.

The solution: Model 7, below:

The dummy data: Load the dummy data of feeding rates during nest watches with and without helpers. Per hour the watches don’t differ in mean feeds according to helper presence, but watches are longer when there is a helper:

data_offset <- read.table("OffsetTest.txt", header=TRUE)
data_offset$Helper <- as.factor(data_offset$Helper)
str(data_offset)
## 'data.frame':    52 obs. of  4 variables:
##  $ NoFeeds     : int  9 7 10 11 1000 900 800 900 700 1100 ...
##  $ ObsDuration : int  1 1 1 1 100 100 100 100 100 100 ...
##  $ FeedsperHour: int  9 7 10 11 10 9 8 9 7 11 ...
##  $ Helper      : Factor w/ 2 levels "0","1": 1 1 1 2 2 1 2 2 2 2 ...
aggregate(data_offset[, 1:3], list(data_offset$Helper), function(x) c(Mean = mean(x), SE=(sd(x)/sqrt(length(x)))))
##   Group.1 NoFeeds.Mean NoFeeds.SE ObsDuration.Mean ObsDuration.SE
## 1       0    104.67857   53.02221        11.607143       5.892857
## 2       1    813.66667   69.95060        87.625000       6.827006
##   FeedsperHour.Mean FeedsperHour.SE
## 1         9.2142857       0.2880198
## 2         9.2916667       0.3097402
# Test degree of collinearity between observation duration and helper presence - it's bound to be collinear:
# You can download corvif from here: http://www.highstat.com/book2.htm
Zoffset<-na.omit(cbind(data_offset$ObsDuration, data_offset$Helper)) 
corvif(Zoffset)
## 
## 
## Variance inflation factors
## 
##        GVIF
## V1 2.436674
## V2 2.436674

 

The models: run the following 7 different models to see that offset doesn’t work in MCMCglmm:

1) glm with offset:

offsetTest1 <- glm(NoFeeds ~  offset(log(ObsDuration)) + Helper, family="poisson", data=data_offset)
summary(mod5_offsetTest4)
## 
## Call:
## glm(formula = NoFeeds ~ offset(log(ObsDuration)) + Helper, family = "poisson", 
##     data = data_offset)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -7.8455  -0.9427  -0.0061   0.6376   8.5188  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  2.19927    0.01847 119.066   <2e-16 ***
## Helper1      0.02921    0.01981   1.475     0.14    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 534.09  on 51  degrees of freedom
## Residual deviance: 531.90  on 50  degrees of freedom
## AIC: 857.47
## 
## Number of Fisher Scoring iterations: 4

Helper is not significant (as expected)

 

2) glm without offset:

offsetTest2 <- glm(NoFeeds ~  Helper, family="poisson", data=data_offset)
summary(mod5_offsetTest4a)
## 
## Call:
## glm(formula = NoFeeds ~ Helper, family = "poisson", data = data_offset)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -39.209  -12.132  -11.738    3.808   47.771  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  4.65089    0.01847   251.8   <2e-16 ***
## Helper1      2.05066    0.01981   103.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 32358  on 51  degrees of freedom
## Residual deviance: 15930  on 50  degrees of freedom
## AIC: 16256
## 
## Number of Fisher Scoring iterations: 7

Helper is significant (as expected, as helper watches are longer)

 

3) glm, response = feeds per hour (no offset needed)

offsetTest3 <- glm(FeedsperHour ~ Helper, family="poisson", data=data_offset)
summary(mod5_offsetTest5a)
## 
## Call:
## glm(formula = FeedsperHour ~ Helper, family = "poisson", data = data_offset)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.13165  -0.41555  -0.07087   0.32758   0.84986  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 2.220755   0.062257  35.671   <2e-16 ***
## Helper1     0.008363   0.091435   0.091    0.927    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 12.956  on 51  degrees of freedom
## Residual deviance: 12.948  on 50  degrees of freedom
## AIC: 228.43
## 
## Number of Fisher Scoring iterations: 4

Helper is not significant (as expected)

 

4) MCMCglmm, with offset:

library(MCMCglmm)
offsetTest4 <- MCMCglmm(NoFeeds ~  offset(log(ObsDuration)) + Helper, family="poisson", data=data_offset, verbose=FALSE)
HPDinterval(offsetTest4$Sol)
##                lower    upper
## (Intercept) 2.122980 3.204901
## Helper1     2.773338 4.393849
## attr(,"Probability")
## [1] 0.95

Helper IS significant – SHOULDN’T BE!

This shows that offset doesn’t work in MCMCglmm (contrary to p.320 of this textbook https://books.google.nl/books?id=tFy5BgAAQBAJ&pg=PA320&lpg=PA320&dq=MCMCglmm+offset%28&source=bl&ots=P6ZJj0BE_8&sig=fDCOOPOvnr5Qwby7_ds56S1n8WE&hl=en&sa=X&ved=0ahUKEwiGj9a-663JAhWFDA8KHUrFDX04ChDoAQhNMAc#v=onepage&q=MCMCglmm%20offset%28&f=false)

 

5) MCMCglmm, no offset:

offsetTest5 <- MCMCglmm(NoFeeds ~ Helper, family="poisson", data=data_offset, verbose=FALSE) 
HPDinterval(offsetTest5$Sol)
##                lower    upper
## (Intercept) 1.994062 3.180698
## Helper1     2.845787 4.496370
## attr(,"Probability")
## [1] 0.95

Helper is significant (as expected)

 

6) MCMCglmm, response = feeds per hour (no offset needed):

offsetTest6 <- MCMCglmm(FeedsperHour ~ Helper, family="poisson", data=data_offset, verbose=FALSE) 
HPDinterval(offsetTest6)
##                   lower     upper
## (Intercept)  2.06585426 2.2719087
## Helper1     -0.02350546 0.1479625
## attr(,"Probability")
## [1] 0.95

Helper is not significant (as expected)

 

7) MCMCglmm, no offset, but specify a prior to set the ObsDuration regression coefficient to 1 (as suggested by Jarrod Hadfield: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q3/016817.html; NB: you need to log ObsDuration)

B.prior2 <- list(B=list(V=diag(3)*1e7, mu=c(0,1,0)))
B.prior2$B$V[2,2] <- 1e-7  # replace ObsDuration (offset) variance
B.prior2
## $B
## $B$V
##       [,1]  [,2]  [,3]
## [1,] 1e+07 0e+00 0e+00
## [2,] 0e+00 1e-07 0e+00
## [3,] 0e+00 0e+00 1e+07
## 
## $B$mu
## [1] 0 1 0
offsetTest7 <- MCMCglmm(NoFeeds ~ log(ObsDuration) + Helper, family="poisson", data=data_offset, prior=B.prior2, verbose=FALSE)
HPDinterval(offsetTest7$Sol)
##                       lower     upper
## (Intercept)       2.1046352 2.3162748
## log(ObsDuration)  0.9993451 1.0005766
## Helper1          -0.1118268 0.1429807
## attr(,"Probability")
## [1] 0.95

helper is not significant (as expected)