Abstract
Background and Objective: Patients with End-Stage Kidney Disease (ESKD) have a distinctive cardiovascular
risk profile. This study aims to predict death and cardiovascular disease in patients on
dialysis.
Methods: We used machine learning techniques on two datasets: an Italian dataset obtained from the
Istituto di Fisiologia Clinica of the Consiglio Nazionale delle Ricerche in Reggio Calabria, and an
American dataset provided by the National Institute of Diabetes and Digestive and Kidney Diseases
(NIDDK) repository. From each we derived 5 datasets, according to the outcome of interest. We tested
several types of algorithm (both linear and non-linear) and ultimately selected the Support Vector
Machine. In particular, the best performance was obtained with the non-linear SVC with an RBF kernel,
optimized with GridSearch. GridSearch is a procedure that searches for the best combination of
hyper-parameters (in our case, the best pair (C, γ)) in order to improve the accuracy of the
algorithm.
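The tuning step described in Methods can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it assumes scikit-learn's `SVC` and `GridSearchCV`, uses synthetic data in place of the clinical datasets, and the grid values, feature scaling step, train/test split and 5-fold cross-validation are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): the clinical datasets are not reproduced here.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a non-linear SVC with an RBF kernel.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical grid over the pair (C, gamma); the values are illustrative only.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}

# Exhaustive search: every (C, gamma) pair is scored by 5-fold cross-validated accuracy.
search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=5)
search.fit(X_train, y_train)

# The best pair is refit on the full training split and evaluated on held-out data.
best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Here C controls the penalty for misclassified training points and γ sets the width of the RBF kernel, so the grid trades off margin softness against decision-boundary smoothness. Accuracy figures from this sketch reflect the synthetic data only and are unrelated to the results reported below.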
Results: The non-linear SVC with an RBF kernel, optimized with GridSearch, achieved an accuracy of
95.25% on the Italian dataset and 92.15% on the American dataset in predicting Ischaemic Heart
Disease over a timeframe of 2.5 years. Performance was lower for the other
outcomes.
Conclusions: The machine learning-based approach applied in our study is able to predict, with high
accuracy, the onset of cardiovascular diseases in patients on dialysis.