Optimizing Data Splitting for Robust Performance of the EfficientNetV2-B0 Model in Pneumonia Detection
DOI:
https://doi.org/10.46880/methoda.Vol16No2.pp83-89
Keywords:
Data Splitting, EfficientNetV2-B0, Pneumonia Detection, Stratified Cross-Validation, Deep Learning
Abstract
The splitting of datasets constitutes a fundamental yet frequently overlooked methodological decision in deep learning research for medical image classification. This study investigates the impact of various data splitting scenarios on the robust performance of the EfficientNetV2-B0 model in pneumonia detection using chest X-ray images. Using the Kaggle Chest X-ray Pneumonia dataset, seven experimental scenarios were designed encompassing differences in train-validation-test allocation ratios (70/15/15, 70/10/20, 80/10/10, 85/15, 70/30), partition strategies (stratified vs. random), and validation methods (holdout vs. 5-fold stratified cross-validation). The results demonstrate that 5-fold stratified cross-validation produces the most stable performance estimates with the lowest variance (Accuracy: 97.4%±0.3%, AUC: 0.993±0.002), whereas random partition without stratification yields significantly inferior results (Accuracy: 95.1%, AUC: 0.973). Among the holdout scenarios, the 70/15/15 stratified ratio achieved the best performance (Accuracy: 97.2%, AUC: 0.991). Statistical analysis confirms significant differences between stratified and non-stratified scenarios (p < 0.05). These findings provide empirical guidance for researchers in designing more valid and replicable machine learning experiments in the medical domain.
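The two partitioning ideas compared in the abstract, a stratified 70/15/15 holdout and 5-fold stratified cross-validation, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `stratified_split` and `stratified_kfold`, the seed, and the toy label counts (800 pneumonia, 200 normal) are assumptions chosen only so the arithmetic is easy to follow. In practice a library routine such as scikit-learn's `StratifiedKFold` would likely be used instead.

```python
import random
from collections import defaultdict


def _indices_by_class(labels):
    """Group sample indices by their class label."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    return groups


def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=42):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label, idxs in sorted(_indices_by_class(labels).items()):
        rng.shuffle(idxs)
        n_train = round(ratios[0] * len(idxs))
        n_val = round(ratios[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def stratified_kfold(labels, k=5, seed=42):
    """Yield (train_idx, val_idx) pairs; every fold keeps the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label, idxs in sorted(_indices_by_class(labels).items()):
        rng.shuffle(idxs)
        # Deal each class round-robin so folds receive equal shares of it.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    for i in range(k):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, folds[i]


# Toy, imbalanced labels (hypothetical counts, not the Kaggle dataset's).
labels = ["PNEUMONIA"] * 800 + ["NORMAL"] * 200

train, val, test = stratified_split(labels)
folds = list(stratified_kfold(labels, k=5))
```

Because each class is shuffled and divided separately, every subset keeps the 80/20 class ratio of the toy labels. A purely random partition gives no such guarantee, which is the contrast the abstract draws between stratified and non-stratified scenarios.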
License
Copyright (c) 2026 Syanti Irviantina, M. Daffa Rizaldi Siregar

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







