Small and Unbalanced Data Set Problem in Classification
| dc.contributor.author | Sezer, Ebru Akcapinar | |
| dc.contributor.author | Sever, Hayri | |
| dc.contributor.author | Par, Oznur Esra | |
| dc.date.accessioned | 2023-01-04T08:28:53Z | |
| dc.date.accessioned | 2025-09-18T14:10:26Z | |
| dc.date.available | 2023-01-04T08:28:53Z | |
| dc.date.available | 2025-09-18T14:10:26Z | |
| dc.date.issued | 2019 | |
| dc.description.abstract | Classifying data is difficult when the data set is small and unbalanced, and this problem directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining. Classification algorithms are developed on the assumption that data sets are balanced and large enough. Most algorithms focus on the majority class and ignore or misclassify examples of the minority class. The small and unbalanced data set problem is frequently encountered in medical data mining due to some limitations. Within the scope of this study, the publicly accessible hepatitis data set was divided into small and imbalanced data subsets, and each subset was oversampled using distance-based data generation methods. The oversampled data sets were classified using four different machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes and Decision Tree), and the classification scores were compared. | en_US |
| dc.identifier.citation | Par, Öznur Esra; Sezer, Ebru Akçapınar; Sever, Hayri (2019). "Small and Unbalanced Data Set Problem in Classification", 27th Signal Processing and Communications Applications Conference (SIU), Sivas Cumhuriyet Univ, Sivas, TURKEY, APR 24-26, 2019. | en_US |
| dc.identifier.doi | 10.1109/siu.2019.8806497 | |
| dc.identifier.isbn | 9781728119045 | |
| dc.identifier.issn | 2165-0608 | |
| dc.identifier.scopus | 2-s2.0-85071971537 | |
| dc.identifier.uri | https://doi.org/10.1109/siu.2019.8806497 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12416/13681 | |
| dc.language.iso | tr | en_US |
| dc.publisher | IEEE | en_US |
| dc.relation.ispartof | 27th Signal Processing and Communications Applications Conference (SIU) -- APR 24-26, 2019 -- Sivas Cumhuriyet Univ, Sivas, TURKEY | en_US |
| dc.relation.ispartofseries | Signal Processing and Communications Applications Conference | |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Machine Learning | en_US |
| dc.subject | Small Data Set | en_US |
| dc.subject | Imbalanced Data Set | en_US |
| dc.subject | Oversampling Methods | en_US |
| dc.title | Small and Unbalanced Data Set Problem in Classification | en_US |
| dc.title | Small and Unbalanced Data Set Problem in Classification | tr_TR |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 55605168600 | |
| gdc.author.scopusid | 36444813800 | |
| gdc.author.scopusid | 55902090100 | |
| gdc.author.yokid | 11916 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Çankaya University | en_US |
| gdc.description.departmenttemp | [Par, Oznur Esra; Sezer, Ebru Akcapinar] Hacettepe Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey; [Sever, Hayri] Cankaya Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey | en_US |
| gdc.description.endpage | 4 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.startpage | 1 | |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science | |
| gdc.identifier.openalex | W2969401632 | |
| gdc.identifier.wos | WOS:000518994300158 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 3.0 | |
| gdc.oaire.influence | 3.1274754E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 7.188356E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0202 electrical engineering, electronic engineering, information engineering | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.openalex.fwci | 0.8401 | |
| gdc.openalex.normalizedpercentile | 0.8 | |
| gdc.opencitations.count | 8 | |
| gdc.plumx.crossrefcites | 6 | |
| gdc.plumx.mendeley | 22 | |
| gdc.plumx.scopuscites | 13 | |
| gdc.scopus.citedcount | 14 | |
| gdc.virtual.author | Sever, Hayri | |
| gdc.wos.citedcount | 8 | |
| relation.isAuthorOfPublication | a26d16c1-fa24-4ceb-b2c8-8517c96e2534 | |
| relation.isAuthorOfPublication.latestForDiscovery | a26d16c1-fa24-4ceb-b2c8-8517c96e2534 | |
| relation.isOrgUnitOfPublication | 12489df3-847d-4936-8339-f3d38607992f | |
| relation.isOrgUnitOfPublication | 43797d4e-4177-4b74-bd9b-38623b8aeefa | |
| relation.isOrgUnitOfPublication | 0b9123e4-4136-493b-9ffd-be856af2cdb1 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 12489df3-847d-4936-8339-f3d38607992f |
