Error running MLE2 function (BBMLE)
mle2()
I get the following error when running a function from a package bbmle
in R :
Some parameters are on the boundary: Hessian-based variance-covariance calculations may be unreliable
I'm trying to understand if this is due to a problem with my data or if the function is being called correctly. Unfortunately I can't post real data, so I'm using a similar working example with the same sample size.
dAction
The custom function I'm using is the softmax function. Optimization must have upper and lower bounds, so I am using the L-BFGS-B method.
library(bbmle)
set.seed(3939)
### Reproducible data
dat1 <- rnorm(30, mean = 3, sd = 1)
dat2 <- rnorm(30, mean = 3, sd = 1)
dat1[c(1:3, 5:14, 19)] <- 0
dat2[c(4, 15:18, 20:22, 24:30)] <- 0
### Data variables
x <- sample(1:12, 30, replace = TRUE)
pe <- dat1
ne <- dat2
### Likelihood
dAction <- function(x, a, b, t, pe, ne, log = FALSE) {
u <- exp(((x - (a * ne) - (b * pe)) / t))
prob <- u / (1 + u)
if(log) return(prob) else return(-sum(log(prob)))
}
### Fit
fit <- mle2(dAction,
start = list(a = 0.1, b = 0.1, t = 0.1),
data = list(x = x, pe = pe, ne = ne),
method = "L-BFGS-B",
lower = c(a = 0.1, b = 0.1, t = 0.1),
upper = c(a = 10, b = 1, t = 10))
Warning message:
In mle2(dAction, start = list(a = 0.1, b = 0.1, t = 0.1), data = list(x = x, :
some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable
Here are the results summary()
:
summary(fit)
Maximum likelihood estimation
Call:
mle2(minuslogl = dAction, start = list(a = 0.1, b = 0.1, t = 0.1),
method = "L-BFGS-B", data = list(x = x, pe = pe, ne = ne),
lower = c(a = 0.1, b = 0.1, t = 0.1), upper = c(a = 10, b = 1,
t = 10))
Coefficients:
Estimate Std. Error z value Pr(z)
a 0.1 NA NA NA
b 0.1 NA NA NA
t 0.1 NA NA NA
-2 log L: 0.002048047
Warning message:
In sqrt(diag(object@vcov)) : NaNs produced
and the results with confidence intervals
confint(fit)
Profiling...
2.5 % 97.5 %
a NA 1.0465358
b NA 0.5258828
t NA 1.1013322
Warning messages:
1: In sqrt(diag(object@vcov)) : NaNs produced
2: In .local(fitted, ...) :
Non-positive-definite Hessian, attempting initial std err estimate from diagonals
I don't fully understand the problem you're having, but:
The problem (whether it's the real one or not depends a lot on the above context which I don't understand) is related to your constraints. If we fit without constraints:
### Fit
fit <- mle2(dAction,
start = list(a = 0.1, b = 0.1, t = 0.1),
data = list(x = x, pe = pe, ne = ne))
## method = "L-BFGS-B",
## lower = c(a = 0.1, b = 0.1, t = 0.1),
## upper = c(a = 10, b = 1, t = 10))
The coefficients we get are below your range.
coef(fit)
a b t
0.09629301 0.07724332 0.02405173
If this is correct, at least one constraint will be active (i.e. when we meet the lower bound, at least one parameter will reach the bound - in fact, all parameters have reached the bound). The simplest mechanism for computing confidence intervals (Wald intervals) does not work when the fit is on the bounds. However, this does not affect the profile confidence interval estimates you reported above. These are correct - the NA
lower bound is reported because the lower bound confidence is on the boundary (you can replace it with 0.1 if you want).
If you don't expect the best fit to be on the bounds, then I don't know what's going on, maybe a data problem.
There's nothing wrong with your log-likelihood function, but it's a bit confusing because you have a log
parameter that returns the negative log-likelihood when log=FALSE
(the default) and the likelihood when it's not log=TRUE
. Before realizing this, I rewrote the function (I also made it more numerically stable by doing the calculations on a logarithmic scale as much as possible).
dAction <- function(x, a, b, t, pe, ne) {
logu <- (x - (a * ne) - (b * pe)) / t
lprob <- logu - log1p(exp(logu))
return(-sum(lprob))
}