Topic-Aware Multi-Class Classification for Financial Complaints: Comparing BERTopic With Classical Machine Learning Algorithms
| dc.contributor.author | Uguz, Sezer | |
| dc.contributor.author | Kumbasar, Mert | |
| dc.contributor.author | Tokdemir, Gul | |
| dc.date.accessioned | 2025-07-06T00:51:45Z | |
| dc.date.available | 2025-07-06T00:51:45Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Customers today can submit complaints through a variety of channels, such as business emails, consumer forms, feedback platforms, and dedicated complaint websites. This study compares the performance of supervised Bidirectional Encoder Representations from Transformers for Topic Modeling (BERTopic) with traditional machine learning algorithms, including Random Forest (RF), Support Vector Machines (SVM), Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbors (KNN), and eXtreme Gradient Boosting (XGBoost), for multi-class classification of financial customer complaints. The dataset consists of 16,715 balanced training samples and 3,808 test samples of financial complaints across five categories. Experimental results show that the traditional machine learning models, particularly XGBoost, SVM, and LR, achieved the highest classification performance, with accuracy close to 88%. BERTopic was competitive, with an accuracy of 82.48%. The results suggest that while BERTopic offers interpretability advantages through topic modeling, traditional algorithms provide higher accuracy. The study highlights the potential of hybrid methods for future financial text analysis and customer complaint classification, which could lead to more detailed, topic-aware classification approaches. © 2025 IEEE. | en_US |
| dc.identifier.doi | 10.1109/ICHORA65333.2025.11017165 | |
| dc.identifier.isbn | 9798331510893 | |
| dc.identifier.isbn | 9798331510886 | |
| dc.identifier.issn | 2996-4385 | |
| dc.identifier.scopus | 2-s2.0-105008421607 | |
| dc.identifier.uri | https://doi.org/10.1109/ICHORA65333.2025.11017165 | |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.publisher | IEEE | |
| dc.relation.ispartof | ICHORA 2025 - 2025 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, Proceedings -- 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, ICHORA 2025 -- 23 May 2025 through 24 May 2025 -- Ankara -- 209351 | en_US |
| dc.relation.ispartof | 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications-ICHORA -- MAY 23-24, 2025 -- Ankara, Türkiye | |
| dc.relation.ispartofseries | International Congress on Human-Computer Interaction Optimization and Robotic Applications | |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | BERTopic | en_US |
| dc.subject | Customer Complaints | en_US |
| dc.subject | Machine Learning Algorithms | en_US |
| dc.subject | Multi-Class Classification | en_US |
| dc.subject | Natural Language Processing (NLP) | en_US |
| dc.subject | Text Classification | en_US |
| dc.subject | Topic-Aware | en_US |
| dc.title | Topic-Aware Multi-Class Classification for Financial Complaints: Comparing BERTopic With Classical Machine Learning Algorithms | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.wosid | Uguz, Sezer/Hqz-3529-2023 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Çankaya University | en_US |
| gdc.description.departmenttemp | [Uğuz S.] Çankaya University, Department of Computer Engineering, Ankara, Turkey; [Kumbasar M.] Çankaya University, Department of Computer Engineering, Ankara, Turkey; [Tokdemir G.] Çankaya University, Department of Computer Engineering, Ankara, Turkey | en_US |
| gdc.description.departmenttemp | [Uguz, Sezer; Kumbasar, Mert; Tokdemir, Gul] Cankaya Univ, Dept Comp Engn, Ankara, Turkiye | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | N/A | |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W4411205199 | |
| gdc.identifier.wos | WOS:001533792800157 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 0.0 | |
| gdc.oaire.influence | 2.4895952E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 2.7494755E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.openalex.fwci | 4.81974515 | |
| gdc.openalex.normalizedpercentile | 0.91 | |
| gdc.openalex.toppercent | TOP 10% | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.mendeley | 3 | |
| gdc.plumx.scopuscites | 0 | |
| gdc.scopus.citedcount | 0 | |
| gdc.virtual.author | Tokdemir, Gül | |
| gdc.wos.citedcount | 0 | |
| relation.isAuthorOfPublication | a10f79e3-acee-4bb2-82f2-548c5fb0d165 | |
| relation.isAuthorOfPublication.latestForDiscovery | a10f79e3-acee-4bb2-82f2-548c5fb0d165 | |
| relation.isOrgUnitOfPublication | 12489df3-847d-4936-8339-f3d38607992f | |
| relation.isOrgUnitOfPublication | 43797d4e-4177-4b74-bd9b-38623b8aeefa | |
| relation.isOrgUnitOfPublication | 0b9123e4-4136-493b-9ffd-be856af2cdb1 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 12489df3-847d-4936-8339-f3d38607992f |
