Makine Öğrenmesi Teknikleri Kullanılarak Sybil Botların Tespit Edilmesi (Detection of Sybil Bots Using Machine Learning Techniques)

Öcel, Cansu Betül

Date

2025

Abstract

This study comparatively evaluates the performance of several machine learning algorithms for network-based anomaly detection on the NSL-KDD dataset. NSL-KDD groups attack types into four main categories (DoS, Probe, R2L, U2R) and, with its labeled and balanced structure, was treated as a suitable dataset for supervised learning. Statistical and exploratory data analyses were carried out first, followed by data preprocessing: categorical variables were converted to numerical form, missing values were removed, and minority classes were balanced with SMOTE. For feature selection, the Mutual Information (MI) method was used to identify the 15 most informative variables, and the models were trained on these features. The models were then retrained on all variables and the two sets of results were compared. The algorithms used were Logistic Regression, Naive Bayes, Random Forest, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), AdaBoost, and an Artificial Neural Network (ANN). Hyperparameters for each model were tuned with GridSearchCV or RandomizedSearchCV. Model performance was assessed with metrics such as accuracy, precision, recall, and F1-score. The results show that some models achieved high accuracy on dominant classes such as DoS, while performance dropped on the minority attack types R2L and U2R. This indicates that methods applied to imbalanced datasets must be chosen carefully.
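The contrast the abstract draws between overall accuracy and per-class performance can be illustrated with a minimal sketch. It uses only the Python standard library and is not the thesis's own evaluation code; the function name `per_class_report` and the toy label counts are assumptions invented for illustration, not results from the study.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    accuracy = sum(tp.values()) / len(y_true)
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Undefined ratios (no predictions or no samples) are reported as 0.0.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Toy labels, invented for illustration only (not data from the thesis):
# 90 DoS, 6 Probe, 3 R2L, 1 U2R, with most minority errors predicted as DoS.
y_true = ["DoS"] * 90 + ["Probe"] * 6 + ["R2L"] * 3 + ["U2R"] * 1
y_pred = (
    ["DoS"] * 90
    + ["Probe"] * 5 + ["DoS"] * 1
    + ["R2L"] * 1 + ["DoS"] * 2
    + ["DoS"] * 1
)

accuracy, report = per_class_report(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

In this toy case the overall accuracy is high because DoS dominates the sample, while recall for R2L and U2R is low. This is the kind of gap that per-class precision, recall, and F1 expose and that accuracy alone hides.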

Keywords

Computer Engineering and Computer Science and Control

End Page

59

URI

https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=CtwiQkYvArAb95Ufpfs_vm11F88fOBFmFUDAW5qLi43410jpGKqnDGvYTE_q7BUT
https://hdl.handle.net/20.500.12416/15820

Collections

Master's Theses
