Theses and Dissertations

Comparison of Statistical Methods for Modeling Count Data with an Application to Length of Hospital Stay

Gustavo A. Fernandez, The University of Texas Rio Grande Valley

Date of Award

12-2021

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Applied Statistics and Data Science

First Advisor

Dr. Kristina Vatcheva

Second Advisor

Dr. Santanu Chakraborty

Third Advisor

Dr. Tamer Oraby

Abstract

Hospital length of stay (LOS) is a key indicator of hospital care management efficiency, cost of care, and hospital planning, so understanding its variability is an important focus in healthcare. Hospital LOS data are counts: discrete, nonnegative, typically right-skewed, and often containing excess zeros. Numerous studies have modeled hospital LOS to identify significant predictors of its variability. Many researchers have used linear regression, with or without a logarithmic transformation of LOS, or logistic regression on a dichotomized LOS. These methods usually violate model assumptions and have been criticized as inadequate for count data. Problems that may occur include biased parameter estimates, loss of precision in inference, meaningless negative predicted values, and loss of important information about the underlying counts. Common statistical methods for analyzing count data are Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) regression. Many studies have compared the performance of regression models for count data, but their results, whether based on empirical or simulated data, disagree considerably. In this study, we compared the performance of Poisson, NB, ZIP, and ZINB regression models using simulated data under scenarios with varying sample sizes, proportions of zeros, and levels of overdispersion. To illustrate these regression methods, we analyzed hospital LOS using empirical data from the MIMIC-III database.
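
The kind of simulation comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: it assumes intercept-only models (no covariates), covers only Poisson and ZIP rather than all four models, and uses hypothetical function names (`simulate_zip`, `fit_poisson`, `fit_zip`, `aic`) and illustrative parameter values. It simulates zero-inflated counts, fits both models by maximum likelihood (ZIP via a simple EM iteration), and compares them by AIC.

```python
import math
import random


def rpois(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method; suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_zip(n, lam, pi_zero, seed=0):
    """Simulate n counts: structural zero with prob. pi_zero, else Poisson(lam)."""
    rng = random.Random(seed)
    return [0 if rng.random() < pi_zero else rpois(rng, lam) for _ in range(n)]


def fit_poisson(y):
    """Intercept-only Poisson MLE: returns (lambda_hat, log-likelihood)."""
    lam = sum(y) / len(y)
    ll = sum(-lam + yi * math.log(lam) - math.lgamma(yi + 1) for yi in y)
    return lam, ll


def zip_loglik(y, lam, pi_zero):
    """Log-likelihood of an intercept-only zero-inflated Poisson model."""
    ll = 0.0
    for yi in y:
        if yi == 0:
            ll += math.log(pi_zero + (1 - pi_zero) * math.exp(-lam))
        else:
            ll += (math.log(1 - pi_zero) - lam
                   + yi * math.log(lam) - math.lgamma(yi + 1))
    return ll


def fit_zip(y, iters=200):
    """Intercept-only ZIP fit by EM: returns (lambda_hat, pi_hat, log-likelihood)."""
    n = len(y)
    n0 = sum(1 for yi in y if yi == 0)
    total = sum(y)
    lam = max(total / n, 1e-8)   # starting values (an arbitrary choice)
    pi_zero = 0.5 * n0 / n
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero
        w = pi_zero / (pi_zero + (1 - pi_zero) * math.exp(-lam))
        z = n0 * w               # expected number of structural zeros
        # M-step: update the mixing proportion and the Poisson mean
        pi_zero = z / n
        lam = total / (n - z)
    return lam, pi_zero, zip_loglik(y, lam, pi_zero)


def aic(loglik, n_params):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


if __name__ == "__main__":
    # One illustrative scenario: n = 2000, Poisson mean 3, 30% structural zeros
    y = simulate_zip(2000, lam=3.0, pi_zero=0.3, seed=1)
    _, ll_pois = fit_poisson(y)
    lam_hat, pi_hat, ll_zip = fit_zip(y)
    print("Poisson AIC:", round(aic(ll_pois, 1), 1))
    print("ZIP AIC:    ", round(aic(ll_zip, 2), 1))
    print("ZIP estimates: lambda =", round(lam_hat, 3), " pi =", round(pi_hat, 3))
```

Repeating such a run across a grid of sample sizes and zero proportions, and adding NB and ZINB fits, would move the sketch closer to the scenarios named in the abstract. A full regression version would normally rely on a statistics package rather than hand-written likelihoods.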

Comments

https://go.openathens.net/redirector/utrgv.edu?url=https://www.proquest.com/dissertations-theses/comparison-statistical-methods-modeling-count/docview/2640111771/se-2?accountid=7119

Recommended Citation

Fernandez, Gustavo A., "Comparison of Statistical Methods for Modeling Count Data with an Application to Length of Hospital Stay" (2021). Theses and Dissertations. 861.
https://scholarworks.utrgv.edu/etd/861

Included in

Health Information Technology Commons, Statistics and Probability Commons
