ORIGINAL ARTICLE
One-year mortality prediction of patients with hepatitis in Kazakhstan based on administrative health data: A machine learning approach
1 School of Engineering and Digital Sciences, Nazarbayev University, Astana, KAZAKHSTAN
2 Department of Medicine, School of Medicine, Nazarbayev University, Astana, KAZAKHSTAN
3 Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Publication date: 2024-12-24
Electron J Gen Med 2024;21(6):em618
ABSTRACT
Background and objective:
Hepatitis B virus (HBV) and hepatitis C virus (HCV) are major contributors to chronic viral hepatitis (CVH), which accounts for substantial mortality worldwide. This study aims to predict one-year mortality in patients with CVH from their demographics and health records.
Methods:
Clinical data from 82,700 CVH patients diagnosed with HBV or HCV between January 2014 and December 2019 were analyzed. We developed a machine learning (ML) platform covering six broad model categories (linear, nearest neighbors, discriminant analysis, support vector machine, naïve Bayes, and ensemble models, the last comprising gradient boosting, AdaBoost, and random forest) to predict one-year mortality. Feature importance was assessed by computing SHapley Additive exPlanations (SHAP).
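To illustrate what a Shapley-based feature attribution computes, the following is a minimal sketch and not the study's implementation. It enumerates every feature coalition exactly, which is only feasible for a handful of features, and it replaces absent features with fixed reference values, a simplification of the expectation that SHAP estimates. The function `toy_risk`, its coefficients, and the patient and reference vectors are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are set to their baseline value
    (an assumption made here for simplicity).
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take x's values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score over (age in decades, male sex, HCV infection)."""
    age, male, hcv = z
    return 0.02 * age + 0.05 * male + 0.03 * hcv + 0.01 * age * hcv


patient = [7.0, 1.0, 1.0]    # hypothetical patient
reference = [4.0, 0.0, 0.0]  # hypothetical reference profile
phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["age", "sex", "hepatitis type"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the patient's score and the reference score, which is the property that makes them usable as a per-feature decomposition of a prediction.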
Results:
The models achieved an area under the curve (AUC) between 0.74 and 0.80 on independent test sets. Key predictors of mortality were age, sex, hepatitis type, and ethnicity.
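As a reminder of what an area under the curve measures, the sketch below computes it, assuming the receiver operating characteristic curve is meant, as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The labels and risk scores are hypothetical and are not data from the study.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative cases."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = died within one year) and predicted risks.
y_test = [0, 0, 1, 0, 1, 0, 1, 0]
risk = [0.10, 0.35, 0.40, 0.20, 0.80, 0.55, 0.30, 0.05]
auc = roc_auc(y_test, risk)
print(f"AUC = {auc:.2f}")
```

The pairwise loop is quadratic in cohort size, so for a cohort of tens of thousands of patients a rank-based computation or a library routine such as scikit-learn's `roc_auc_score` would be the practical choice.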
Conclusion:
ML applied to administrative health data can be used to accurately predict one-year mortality in CVH patients. Future integration of detailed laboratory and medical history data could further improve model performance.