Simple numerical estimation problem for MLE of polynomials in R


John N

I am trying to build a simple numerical MLE estimate for a multinomial distribution.

The polynomial has a constraint - all cell probabilities need to add up to one.

In general, the way to have this constraint is to reformulate one probability as (1 - sum of other probabilities)

However, when I run this command, I run into a problem during optimization, the logarithm can be negative.

Any ideas on how to fix this? I tried using another optimization package (Rsolnp) and it worked, but I'm trying to get it to work with the simple default R optim to avoid constrained/non-linear optimization.

Here is my code (I know I can parse to get the result in this case, but this is a toy example, my actual problem is bigger than here).

set.seed(1234)
test_data <- rmultinom(n = 1, size = 1000, prob = rep(1/4, 4))
N <- test_data
loglik_function <- function(theta){
  output <- -1*(N[1]*log(theta[1]) + N[2]*log(theta[2]) + N[3]*log(theta[3]) + N[4]*log(1- sum(theta)))
  return(output)
}

startval <- rep(0.1, 3)

my_optim <- optim(startval, loglik_function, lower = 0.0001, upper = 0.9999, method = "L-BFGS-B")

Any ideas or help would be greatly appreciated. thanks

Maurits Evers

Fully Armed: I know you've asked about (constrained) ML estimation, but how to use the Bayesianà la Stan/method rstan. If it's not useful/missing a point, I'll delete it.

  1. The model is just a few lines of code.

    library(rstan)
    
    model_code <- "
    data {
        int<lower=1> K;           // number of choices
        int<lower=0> y[K];        // observed choices
    }
    
    parameters {
        simplex[K] theta;         // simplex of probabilities, one for every choice
    }
    
    model {
        // Priors
        theta ~ cauchy(0, 2.5);   // weakly informative
        // Likelihood
        y ~ multinomial(theta);
    }
    
    generated quantities {
        real ratio;
        ratio = theta[1] / theta[2];
    }
    "
    

    You can see how easy it is to implement a simplex constraint on s thetausing the Stan data type simplex. Using the Stan language, you can easily implement a probabilistic (unit) simplexsimplex

    enter image description here

    where K represents the number of parameters (here options).

    Also notice how we use the generated quantitiescode block to calculate the number of derivations (here ) ratiobased on the parameters (here theta[1]and theta[2]) . Since we have access to the posterior distribution for all parameters, computing the distribution of the number of derivations is straightforward.

  2. Then, we fit the model to yourtest_data

    fit <- stan(model_code = model_code, data = list(K = 4, y = test_data[, 1]))
    

    and displays a summary of the parameter estimates

    summary(fit)$summary
    #                 mean      se_mean         sd          2.5%           25%
    #theta[1]     0.2379866 0.0002066858 0.01352791     0.2116417     0.2288498
    #theta[2]     0.26 20013 0.0002208638 0.01365478     0.2358731     0.2526111
    #theta[3]     0.2452539 0.0002101333 0.01344665     0.2196868     0.2361817
    #theta[4]     0.2547582 0.0002110441 0.01375618     0.2277589     0.2458899
    #ratio        0.9116350 0.0012555320 0.08050852     0.7639551     0.8545142
    #lp__     -1392.6941655 0.0261794859 1.19050097 -1395.8297494 -1393.2406198
    #                   50%           75%         97.5%    n_eff      Rhat
    #theta[1]     0.2381541     0.2472830     0.2645305 4283.904 0.9999816
    #theta[2]     0.2615782     0.2710044     0.2898404 3822.257 1.0001742
    #theta[3]     0.2448304     0.2543389     0.2722152 4094.852 1.0007501
    #theta[4]     0.2545946     0.2638733     0.2822803 4248.632 0.9994449
    #ratio        0.9078901     0.9648312     1.0764747 4111.764 0.9998184
    #lp__     -1392.3914998 -1391.8199477 -1391.3274885 2067.937 1.0013440
    

    and a plot showing point estimates and CIs of the thetaparameters

    plot(fit, pars = "theta")
    

    enter image description here


Update: Constrained ML estimation usingmaxLik

In fact, you can use the methods provided by the maxLiklibrary to implement constrained ML estimation . I find it a bit "troubling" as the convergence seems to be very sensitive to changes in the starting values ​​and the optimization method used.

For what it's worth, here's a reproducible example:

library(maxLik)

x <- test_data[, 1]

Defines the log-likelihood function for a multinomial distribution; I 've included a statement ifhere to prevent the theta < 0case from throwing an error.

loglik <- function(theta, x)
    if (all(theta > 0)) sum(dmultinom(x, prob = theta, log = TRUE)) else 0

I am using the Nelder-Mead optimization method here to find the maximum value of the log-likelihood function. The important point here is the parameter that constraintsimplements the constraint in equivalent form A theta + B = 0, see for details and examples ?maxNM.

res <- maxNM(
    loglik,
    start = rep(0.25, length(x)),
    constraints = list(
        eqA = matrix(rep(1, length(x)), ncol = length(x)),
        eqB = -1),
    x = x)

we can check the result

summary(res)
--------------------------------------------
Nelder-Mead maximization
Number of iterations: 111
Return code: 0
successful convergence
Function value: -10.34576
Estimates:
      estimate     gradient
[1,] 0.2380216 -0.014219040
[2,] 0.2620168  0.012664714
[3,] 0.2450181  0.002736670
[4,] 0.2550201 -0.002369234

Constrained optimization based on SUMT
Return code: 1
penalty close to zero
1  outer iterations, barrier value 5.868967e-09
--------------------------------------------

and confirm that the estimated sum is indeed equal to 1 (within accuracy)

sum(res$estimate)
#[1] 1.000077

sample

set.seed(1234)
test_data <- rmultinom(n = 1, size = 1000, prob = rep(1/4, 4))

Related


Simple numerical estimation problem for MLE of polynomials in R

John N I am trying to build a simple numerical MLE estimate for a multinomial distribution. The polynomial has a constraint - all cell probabilities need to add up to one. In general, the way to have this constraint is to reformulate one probability as (1 - su

Simple numerical estimation problem for MLE of polynomials in R

John N I am trying to build a simple numerical MLE estimate for a multinomial distribution. The polynomial has a constraint - all cell probabilities need to add up to one. In general, the way to have this constraint is to reformulate one probability as (1 - su

Problems implementing MLE estimation in the bbmle package (for R)

will I am trying to validate the MLEs for $\alpha$ , $\beta$ and $\lambda$ obtained for the Logistic-Lomax distribution by Zubair et al. in their paper titled A Study of Logistic-Lomax Distribution when using Dataset 1 . The paper uses the following code to do

Maximum Likelihood Estimation - MLE - in R using the plm package

Lucca coding I have unbalanced panel data and I want to fit this type of regression: Pr(y=1|xB) = G(xB+a) where "y" is a binary variable, "x" is a vector of explanatory variables, and "B" is my coefficient. I want to implement a random effects model with maxi

Numerical Evaluation of Orthogonal Polynomials

Roberto I've written some Matlab procedures to evaluate orthogonal polynomials, and as a sanity check, I'm trying to make sure their dot products are zero. However, while I'm fairly sure nothing can go wrong, I'm finding myself behaving a bit oddly. My test is

Numerical Evaluation of Orthogonal Polynomials

Roberto I've written some Matlab procedures to evaluate orthogonal polynomials, and as a sanity check, I'm trying to make sure their dot products are zero. However, while I'm fairly sure nothing can go wrong, I'm finding myself behaving a bit oddly. My test is

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Understanding Manual MLE logit estimation and optimization

John Stud I'm looking for insight as to why a slight change in the gradient function is necessary to achieve the same result of a manual logit MLE calculation. They get the same result, but it's not clear to me why the changes specified below are necessary. Co

Simple R function problem

Mario Cappellano I've written a long R function and I'm sure it's possible to recreate the R function in 2 lines or less, but so far I can't get my head around it. I passed the function a dataframe of greyhound results data and an integer representing the dist

Simple histogram problem in R

Universiade I have a simple problem when creating a histogram: a <- c(1,2,3,4,5) ; hist(a,freq = FALSE) I was expecting 5 bars of the same height, but got the following: Can you guys explain why this histogram looks like this? again Behind the scenes, histthe

Simple histogram problem in R

Universiade I have a simple problem when creating a histogram: a <- c(1,2,3,4,5) ; hist(a,freq = FALSE) I was expecting 5 bars of the same height, but got the following: Can you guys explain why this histogram looks like this? again Behind the scenes, histthe

Simple R function problem

Mario Cappellano I've written a long R function and I'm sure it's possible to recreate the R function in 2 lines or less, but so far I can't get my head around it. I passed the function a dataframe of greyhound results data and an integer representing the dist

Simple histogram problem in R

Universiade I have a simple problem when creating a histogram: a <- c(1,2,3,4,5) ; hist(a,freq = FALSE) I was expecting 5 bars of the same height, but got the following: Can you guys explain why this histogram looks like this? again Behind the scenes, histthe

R: plotting polynomials in R

noob I am using the R programming language. I am trying to plot the following function: y = x^4 + x^3 - x^2 - x +1 I try to recreate the same plot using R: x = seq(from = -5, to = 5, by = 0.1) y <- x^4 + x^3 - x^2 - x +1 plot(x,y) However, the graph I made

Find roots of simple polynomials with mpmath

hooked I'm trying to find the roots of a simple polynomial using integer coefficients, which when expanded is a vector of coefficients . The following code fails:mpmath.polyrootsx*(x-4)**3[1, -12, 48, 64, 0] import mpmath p = [ 1, -12, 48, -64, 0] print mp

Find roots of simple polynomials with mpmath

hooked I'm trying to find the roots of a simple polynomial using integer coefficients, which when expanded is a vector of coefficients . The following code fails:mpmath.polyrootsx*(x-4)**3[1, -12, 48, 64, 0] import mpmath p = [ 1, -12, 48, -64, 0] print mp

Find roots of simple polynomials with mpmath

hooked I'm trying to find the roots of a simple polynomial using integer coefficients, which when expanded is a vector of coefficients . The following code fails:mpmath.polyrootsx*(x-4)**3[1, -12, 48, 64, 0] import mpmath p = [ 1, -12, 48, -64, 0] print mp

GSL problem using Legendre polynomials

Simone Bolognini I'm trying to update an old version of code that uses a deprecated version of GSL, but I'm having trouble finding how to use the newer version of the standardized Legendre polynomial function. Here is the code snippet that summarizes the probl

GSL problem using Legendre polynomials

Simone Bolognini I'm trying to update an old version of code that uses a deprecated version of GSL, but I'm having trouble finding how to use the newer version of the standardized Legendre polynomial function. Here is the code snippet that summarizes the probl

GSL problem using Legendre polynomials

Simone Bolognini I'm trying to update an old version of code that uses a deprecated version of GSL, but I'm having trouble finding how to use the newer version of the standardized Legendre polynomial function. Here is the code snippet that summarizes the probl

Numerical blow-up problem for fractional functions in R

Clarkmayo goodbye, I am using this function in R: betaFun = function(x){ if(x == 0){ return(0.5) } return( ( 1+exp(x)*(x-1) )/( x*(exp(x)-1) ) ) } The function is smooth and well defined (at least in theory) for every x, and pushes the limit towards