Je souhaite effectuer une régression logistique avec la réponse binomiale suivante et avec et comme variables prédites. X 2
Je peux présenter les mêmes données que les réponses de Bernoulli dans le format suivant.
Les résultats de la régression logistique pour ces 2 ensembles de données sont essentiellement les mêmes. Les résidus de déviance et AIC sont différents. (La différence entre la déviance nulle et la déviance résiduelle est la même dans les deux cas - 0,228.)
Vous trouverez ci-dessous les résultats de régression issus de R. Les ensembles de données sont appelés binom.data et bern.data.
Voici la sortie binomiale.
Call:
glm(formula = cbind(Successes, Trials - Successes) ~ X1 + X2,
family = binomial, data = binom.data)
Deviance Residuals:
[1] 0 0 0
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.9649 21.6072 -0.137 0.891
X1Yes -0.1897 2.5290 -0.075 0.940
X2 0.3596 1.9094 0.188 0.851
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 2.2846e-01 on 2 degrees of freedom
Residual deviance: -4.9328e-32 on 0 degrees of freedom
AIC: 11.473
Number of Fisher Scoring iterations: 4
Voici la sortie de Bernoulli.
Call:
glm(formula = Success ~ X1 + X2, family = binomial, data = bern.data)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.6651 -1.3537 0.7585 0.9281 1.0108
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.9649 21.6072 -0.137 0.891
X1Yes -0.1897 2.5290 -0.075 0.940
X2 0.3596 1.9094 0.188 0.851
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 15.276 on 11 degrees of freedom
Residual deviance: 15.048 on 9 degrees of freedom
AIC: 21.048
Number of Fisher Scoring iterations: 4
Mes questions:
1) Je peux voir que les estimations ponctuelles et les erreurs types entre les 2 approches sont équivalentes dans ce cas particulier. Cette équivalence est-elle vraie en général?
2) Comment la réponse à la question n ° 1 peut-elle être justifiée mathématiquement?
3) Pourquoi les résidus de déviance et AIC sont-ils différents?
I just want make comments on the last paragraph, “The fact that the AIC is different (but the change in deviance is not) comes back to the constant term that was the difference between the log-likelihoods of the two models. When calculating the change in deviance, this is cancelled out because it is the same in all models based on the same data." Unfortunately, this is not correct for the change in deviance. The deviance does not include the constant term Ex (extra constant term in the log-likelihood for the binomial data). Therefore, the change in deviance does nothing to do with the constant term EX. The deviance compares a given model to the full model. The fact that the deviances are different from Bernoulli/binary and binomial modelling but change in deviance is not is due to the difference in the full model log-likelihood values. These values are cancelled out in calculating the deviance changes. Therefore, Bernoulli and binomial logistic regression models yield an identical deviance changes provided the predicted probabilities pij and pi are the same. In fact, that is true for the probit and other link functions.
Let lBm and lBf denote the log-likelihood values from fitting model m and full model f to Bernoulli data. The deviance is then
Although the lBf is zero for the binary data, we have not simplified the DB and kept it as is. The deviance from the binomial modelling with the same covariates is
where the lbf+Ex and lbm+Ex are the log-likelihood values by the full and m models fitted to the binomial data. The extra constant term (Ex) is disappeared from the right hand side of the Db. Now look at change in deviances from Model 1 to Model 2. From Bernoulli modelling, we have change in deviance of
Similarly, change in deviance from binomial fitting is
It is immediately follows that the deviance changes are free from the log-likelihood contributions from full models, lBf and lbf. Therefore, we will get the same change in deviance, DBC = DbC, if lBm1 = lbm1 and lBm2 = lbm2. We know that is the case here and that why we are getting the same deviance changes from Bernoulli and binomial modelling. The difference between lbf and lBf leads to the different deviances.
la source