Generic Count Distributions and Their Zero-Inflated Forms: A Simulation Study
Abstract
How large the percentage of zero observations must be before a zero-inflated distribution is needed remains a major issue in count data modelling. The challenge is determining when to shift from a parent distribution to its zero-inflated version. In most studies, the performance of parent distributions is compared with that of their zero-inflated forms. This study conducts simulation studies for the Poisson and negative binomial (NB) distributions and their respective zero-inflated forms, the zero-inflated Poisson (ZIP) and the zero-inflated negative binomial (ZINB). Count data in the range [0, 4] with different percentages of zero counts are simulated using different sample sizes. Both the negative log-likelihood and the Bayesian information criterion (BIC), which accounts for the number of estimated parameters, are used to assess performance. Results show that the ZIP distribution is best suited to modelling all forms of data when the negative log-likelihood is used to assess performance. When the BIC is used, the Poisson distribution gives the best performance for both 10% and 20% zeros, while the ZIP distribution is the best for both 50% and 90% zeros. The NB distribution outperforms the ZINB distribution in all situations. To further assess the distributions, four count data sets with varying percentages of zeros are examined. Both the ZIP and the NB distributions perform better than the others.
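The comparison described in the abstract can be illustrated with a minimal sketch. It covers only the Poisson and zero-inflated Poisson cases and rests on several assumptions that are not taken from the study: zeros are drawn with a fixed probability and the non-zero counts are drawn uniformly from 1 to 4, the zero-inflated model is fitted with a simple EM iteration, and all function names (`simulate_counts`, `fit_zip`, and so on) are hypothetical. The authors' actual simulation design and estimation method may differ.

```python
import math
import random


def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass at y for rate lam > 0."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def poisson_nll(data, lam):
    """Negative log-likelihood of the data under Poisson(lam)."""
    return -sum(poisson_logpmf(y, lam) for y in data)


def zip_nll(data, pi, lam):
    """Negative log-likelihood under a zero-inflated Poisson(pi, lam)."""
    total = 0.0
    for y in data:
        if y == 0:
            # A zero is either structural (prob. pi) or a Poisson zero.
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += math.log(1 - pi) + poisson_logpmf(y, lam)
    return -total


def fit_poisson(data):
    """Poisson maximum-likelihood estimate: the sample mean."""
    return sum(data) / len(data)


def fit_zip(data, iters=200):
    """Approximate ZIP estimates (pi, lam) via EM (an assumed fitting method)."""
    n = len(data)
    n0 = sum(1 for y in data if y == 0)
    total = sum(data)
    pi, lam = 0.5 * n0 / n, max(total / n, 1e-6)
    for _ in range(iters):
        # E-step: probability that an observed zero is structural.
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson rate.
        pi = n0 * w / n
        lam = total / (n - n0 * w)
    return pi, lam


def bic(nll, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) + 2 * nll


def simulate_counts(n, p_zero, rng):
    """Assumed design: zero with prob. p_zero, else uniform on {1, 2, 3, 4}."""
    return [0 if rng.random() < p_zero else rng.randint(1, 4) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    for p_zero in (0.1, 0.2, 0.5, 0.9):
        data = simulate_counts(200, p_zero, rng)
        n = len(data)
        lam_p = fit_poisson(data)
        pi_z, lam_z = fit_zip(data)
        nll_p, nll_z = poisson_nll(data, lam_p), zip_nll(data, pi_z, lam_z)
        print(p_zero, round(nll_p, 2), round(bic(nll_p, 1, n), 2),
              round(nll_z, 2), round(bic(nll_z, 2, n), 2))
```

Here the BIC is computed as k ln(n) + 2 × (negative log-likelihood), with k = 1 for the Poisson and k = 2 for the zero-inflated Poisson, so the extra parameter is rewarded only when it lowers the negative log-likelihood by more than the penalty. The negative binomial and its zero-inflated form are omitted to keep the sketch short.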

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.