Comprehensive Analysis of Data Augmentation Methods in Classification for an Imbalanced Epilepsy Dataset

dc.contributor.author Calis, A.G.
dc.contributor.author Ergezer, H.
dc.date.accessioned 2026-02-05T19:53:17Z
dc.date.available 2026-02-05T19:53:17Z
dc.date.issued 2026
dc.description.abstract Imbalanced class distribution reduces the generalizability of classifiers in EEG-based epilepsy detection. This study examines the impact of the synthetic minority oversampling technique (SMOTE) and its variants on imbalanced electroencephalography (EEG) data, using an end-to-end data processing pipeline. Band-limited filtering is applied as pre-processing, and the training data is then gradually oversampled in 20% increments across four scenes. Experiments are conducted with coarse k-nearest neighbor (Coarse-KNN), bagged trees, and artificial neural network (ANN) classifiers, and evaluation uses accuracy, precision, recall, F1 score, and the Matthews correlation coefficient (MCC). In Scene #4, where the inter-class imbalance is eliminated, Borderline-SMOTE yielded the highest and most consistent results (F1 score = 0.903–0.937, MCC = 0.830–0.894). Safe-Level-SMOTE (SL-SMOTE) and SMOTE/Geometric-SMOTE (G-SMOTE) produced the second-ranked results. The findings demonstrate that appropriate variant selection provides consistent gains across classifiers, making Borderline-SMOTE the recommended approach for imbalanced EEG classification. Furthermore, in the detailed analysis of ensemble sampling limits, SMOTE-based combined approaches (e.g., SL + G SMOTE) also produced consistent results. Basic descriptive statistics (mode, median, variance, and kurtosis) of the synthetic samples were found to be comparable to those of the real data, providing additional evidence of distributional consistency. © 2013 IEEE. en_US
dc.identifier.doi 10.1109/ACCESS.2026.3653695
dc.identifier.issn 2169-3536
dc.identifier.scopus 2-s2.0-105027992010
dc.identifier.uri https://doi.org/10.1109/ACCESS.2026.3653695
dc.identifier.uri https://hdl.handle.net/20.500.12416/15855
dc.language.iso en en_US
dc.publisher Institute of Electrical and Electronics Engineers Inc. en_US
dc.relation.ispartof IEEE Access en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Artificial Neural Networks en_US
dc.subject Bagged Trees en_US
dc.subject Data Augmentation en_US
dc.subject Machine Learning en_US
dc.subject SMOTE en_US
dc.subject Epilepsy
dc.subject Classification Algorithms
dc.subject Electroencephalography
dc.subject Vectors
dc.subject Synthetic Data
dc.subject Accuracy
dc.subject Trees (Botanical)
dc.subject Recording
dc.subject Training
dc.title Comprehensive Analysis of Data Augmentation Methods in Classification for an Imbalanced Epilepsy Dataset en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.scopusid 60093665300
gdc.author.scopusid 8375807400
gdc.author.wosid Ergezer, Halit/S-6502-2017
gdc.author.wosid Calis, AhmetGokay/PHF-0256-2026
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial true
gdc.description.department Çankaya University en_US
gdc.description.departmenttemp [Calis] Ahmet Gokay, Department of Mechanical Engineering, Çankaya Üniversitesi, Ankara, Turkey; [Ergezer] Halit, Department of Mechanical Engineering, Çankaya Üniversitesi, Ankara, Turkey en_US
gdc.description.endpage 8390 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality N/A
gdc.description.startpage 8375 en_US
gdc.description.volume 14 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality N/A
gdc.identifier.openalex W7123346447
gdc.identifier.wos WOS:001666956500024
gdc.index.type Scopus
gdc.index.type WoS
gdc.openalex.collaboration International
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.18
gdc.opencitations.count 0
gdc.plumx.mendeley 1
gdc.plumx.newscount 1
gdc.plumx.scopuscites 0
gdc.scopus.citedcount 0
gdc.wos.citedcount 0
relation.isAuthorOfPublication.latestForDiscovery e7c25403-d5d5-4ca7-b1c0-8e155d9a2310
relation.isOrgUnitOfPublication.latestForDiscovery 0b9123e4-4136-493b-9ffd-be856af2cdb1
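
The abstract above describes oversampling the minority class with SMOTE variants and scoring classifiers with F1 and MCC. As a rough illustration only, the following Python sketch shows plain SMOTE interpolation (not Borderline-SMOTE or the other variants studied) together with the two headline metrics computed from confusion-matrix counts. The function names `smote_oversample` and `f1_and_mcc`, the neighbour count `k`, and the toy data are assumptions made for this sketch; none of them come from the paper, and the authors' actual pipeline is not reproduced here.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1 score, Matthews correlation coefficient) from counts."""
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


if __name__ == "__main__":
    # Toy two-feature minority class (made-up values, not EEG data).
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(len(smote_oversample(minority, n_new=6, k=2, seed=1)))
    print(f1_and_mcc(tp=90, fp=10, fn=10, tn=90))
```

In a staged setup such as the four scenes mentioned in the abstract, `n_new` would be raised from one scene to the next; the exact rule for the 20% increments is not given in this record, so it is left to the caller here.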
