Comparison of Different Estimation Approaches in Rare Events Data

Bacaksiz N. E., KOÇ S.

EGE ACADEMIC REVIEW, vol.21, no.3, pp.263-272, 2021 (Journal Indexed in ESCI)

  • Publication Type: Article
  • Volume: 21 Issue: 3
  • Publication Date: 2021
  • Doi Number: 10.21121/eab.960840
  • Title of Journal: EGE ACADEMIC REVIEW
  • Page Numbers: pp.263-272


In social science research, one category of the dependent variable may occur a hundred times less (or more) often than the other. For events such as wars, mass migrations, or coups, the event of interest in a binary variable may have very low prevalence, resulting in low or even zero counts in one or two cells of the 2×2 table of two factors. In this case, the independent variable predicts the dependent variable perfectly or almost perfectly, which leads to the problem known as complete or quasi-complete separation in statistical modelling. This study compares three methods suggested in the literature for handling quasi-complete separation, applied to a small real dataset: penalized maximum likelihood (Firth-type), exact logistic regression, and Bayesian logistic regression. The methods were compared in terms of odds ratios, their standard error estimates, confidence intervals, and statistical significance. Parameter estimates were obtained under three different models with binary and continuous variables. The results show that all methods converge in the presence of quasi-complete separation. In conclusion, Bayesian logistic regression estimates tend to be superior to those of the other methods in terms of standard error estimation.
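
As an illustration of the separation problem described in the abstract, the following is a minimal sketch of a Firth-type penalized logistic regression fitted to a 2×2 table with one empty cell. The data are hypothetical and are not taken from the study, and the function name `firth_logit` is an illustrative assumption, not the software the authors used. With an empty cell, the ordinary maximum likelihood estimate of the odds ratio is infinite, whereas the penalized estimate stays finite.

```python
import numpy as np

def firth_logit(X, y, max_iter=100, tol=1e-10):
    """Firth-type penalized logistic regression (illustrative sketch).

    Iterates on the modified score X^T (y - p + h * (0.5 - p)),
    where h holds the diagonal of the weighted hat matrix.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # fitted probabilities
        w = p * (1.0 - p)                          # logistic weights
        info = X.T @ (w[:, None] * X)              # Fisher information X^T W X
        info_inv = np.linalg.inv(info)
        # Diagonal of W^(1/2) X (X^T W X)^(-1) X^T W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - p + h * (0.5 - p))      # Firth-modified score
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical 2x2 table with a zero cell (quasi-complete separation):
#           y = 0   y = 1
#   x = 0     10       0
#   x = 1      5       5
x = np.repeat([0.0, 1.0], [10, 10])
y = np.concatenate([np.zeros(10), np.zeros(5), np.ones(5)])
X = np.column_stack([np.ones_like(x), x])          # intercept + binary factor

beta = firth_logit(X, y)
odds_ratio = np.exp(beta[1])
print(f"penalized log odds ratio: {beta[1]:.4f}, odds ratio: {odds_ratio:.2f}")
```

For a saturated model on a 2×2 table like this one, the Firth-type estimate coincides with adding 0.5 to every cell count, so the odds ratio here comes out at about 21. Exact and Bayesian logistic regression, the other two methods compared in the study, are not sketched here.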