Count time series arise frequently in biomedical and public health applications, and modeling them poses a particular problem. The number of events in a given period can only be a non-negative integer, so the commonly used ARMA model, which assumes stationarity, is of limited use: count data often exhibit outliers and overdispersion, which can violate the stationarity assumption of the ARMA model. Modeling such series therefore requires a model that deals explicitly with overdispersion, which can cause non-stationarity and/or non-normality. In this study, Autoregressive Integrated Moving Average (ARIMA) and Autoregressive Conditional Poisson (ACP) models were used to address overdispersion in count data. The model orders were studied with respect to the level of overdispersion and the sample size through Monte Carlo simulation. The findings show that, under overdispersion, ACP(2,1) performs best at smaller sample sizes and ACP(1,2) at larger sample sizes. The forecast performance of the ACP model improves as the number of steps ahead increases, while that of the ARIMA model declines.
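As an illustration of the kind of series such a Monte Carlo study generates, the following is a minimal sketch of simulating an ACP(p, q) process. It assumes the usual ACP recursion, in which the count at time t is Poisson with conditional mean mu_t = omega + sum_j alpha_j N_{t-j} + sum_j beta_j mu_{t-j}, with p lagged counts and q lagged means. The function name `simulate_acp`, the parameter values, and the ordering convention for (p, q) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def simulate_acp(omega, alpha, beta, n, burn_in=200, seed=0):
    """Simulate n counts from an assumed ACP(p, q) recursion.

    N_t | past ~ Poisson(mu_t)
    mu_t = omega + sum_j alpha[j] * N_{t-1-j} + sum_j beta[j] * mu_{t-1-j}
    where p = len(alpha) and q = len(beta).
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    persistence = alpha.sum() + beta.sum()
    if omega <= 0 or np.any(alpha < 0) or np.any(beta < 0) or persistence >= 1:
        raise ValueError("need omega > 0, non-negative coefficients, sum < 1")

    rng = np.random.default_rng(seed)
    p, q = len(alpha), len(beta)
    lag = max(p, q, 1)
    total = n + burn_in + lag

    # Start the recursion at the unconditional mean omega / (1 - persistence).
    uncond_mean = omega / (1.0 - persistence)
    mu = np.full(total, uncond_mean)
    counts = np.full(total, int(round(uncond_mean)), dtype=np.int64)

    for t in range(lag, total):
        past_counts = counts[t - p:t][::-1]  # N_{t-1}, ..., N_{t-p}
        past_means = mu[t - q:t][::-1]       # mu_{t-1}, ..., mu_{t-q}
        mu[t] = omega + alpha @ past_counts + beta @ past_means
        counts[t] = rng.poisson(mu[t])

    # Discard the burn-in so the start-up values do not affect the sample.
    return counts[-n:], mu[-n:]


if __name__ == "__main__":
    # Hypothetical ACP(1, 1) parameters chosen only for demonstration.
    y, _ = simulate_acp(omega=1.0, alpha=[0.5], beta=[0.3], n=20000)
    print("sample mean:", y.mean())
    print("variance-to-mean ratio:", y.var() / y.mean())
```

A variance-to-mean ratio above 1 in the simulated series indicates overdispersion relative to a plain Poisson model. Under the assumed recursion, making the coefficients on lagged counts larger increases that ratio, which is one way a simulation could vary the level of overdispersion.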