Computer Engineering Department Publication Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12416/253

Search Results

Now showing 1 - 6 of 6
  • Conference Object
    Citation - WoS: 40
    Citation - Scopus: 77
    Malware Classification Using Deep Learning Methods
    (Assoc Computing Machinery, 2018) Dogdu, Erdogan; Cakir, Bugra
    Malware, short for malicious software, is growing continuously in number and sophistication as our digital world continues to grow. It is a very serious problem, and many efforts are devoted to malware detection in today's cybersecurity world. Many machine learning algorithms have been used for the automatic detection of malware in recent years. Most recently, deep learning has been used with better performance. Deep learning models have been shown to work much better in the analysis of long sequences of system calls. In this paper, a shallow deep learning-based feature extraction method (word2vec) is used to represent any given malware sample based on its opcodes. The Gradient Boosting algorithm is used for the classification task. Then, k-fold cross-validation is used to validate the model performance without sacrificing a validation split. Evaluation results show up to 96% accuracy with limited sample data.
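    The k-fold cross-validation step mentioned in this abstract can be sketched roughly as follows. This is a minimal, library-free illustration and not the authors' code: the function name `k_fold_splits` and the contiguous, unshuffled fold layout are assumptions made here for clarity.

```python
from typing import Iterator, List, Tuple


def k_fold_splits(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper for illustration: every sample is used for testing
    exactly once, so no separate validation split has to be held out.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

    In practice the model would be retrained on each `train` part and scored on the matching `test` part, and the k scores averaged; shuffling or stratifying the indices first is a common refinement that this sketch omits.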
  • Article
    Citation - WoS: 39
    Citation - Scopus: 52
    Development of a Recurrent Neural Networks-Based Calving Prediction Model Using Activity and Behavioral Data
    (Elsevier Sci Ltd, 2020) Keceli, Ali Seydi; Catal, Cagatay; Kaya, Aydin; Tekinerdogan, Bedir
    Accurate prediction of calving time in dairy cattle is crucial for dairy herd management to reduce risks like dystocia and pain. Prediction of calving using traditional, manual observation, such as inspecting breeding records and visual cues, is however a complicated and error-prone task in which even experts can fail to provide a proper prediction. Moreover, manual prediction does not scale to larger farms and very soon becomes time-consuming, inefficient, and costly. In this context, automated solutions are considered promising for providing both better and more efficient predictions, thereby supporting the health of the dairy cows and reducing unnecessary overhead for farmers. Although the first automated solutions appear to have focused mainly on statistical methods, machine learning approaches are now increasingly being considered as a feasible and promising way to predict calving accurately. The objective of this study is to develop machine learning-based prediction models that provide higher performance than the existing tools, methods, and techniques. This study shows that the calving of cattle can be predicted using several cattle behaviors, behavioral monitoring sensors, and machine learning models. The Bi-directional Long Short-Term Memory (Bi-LSTM) method has been applied for the prediction of the calving day, and the RusBoosted Tree classifier has been used to predict the remaining 8 h before calving. The experimental results demonstrated that Bi-LSTM provides better performance than the LSTM algorithm in terms of classification accuracy, while the RusBoosted Tree algorithm accurately predicts the remaining 8 h before calving. Furthermore, Recurrent Neural Networks provide high performance for the prediction of the calving day.
  • Article
    Citation - WoS: 20
    Citation - Scopus: 29
    Sensor Failure Tolerable Machine Learning-Based Food Quality Prediction Model
    (Mdpi, 2020) Kaya, Aydin; Keceli, Ali Seydi; Catal, Cagatay; Tekinerdogan, Bedir
    For the agricultural food production sector, the control and assessment of food quality is an essential issue, which has a direct impact on both human health and the economic value of the product. One of the fundamental properties from which the quality of the food can be derived is the smell of the product. A significant trend in this context is machine olfaction, or the automated simulation of the sense of smell using a so-called electronic nose or e-nose. Hereby, many sensors are used to detect the compounds that define the odors and herewith the quality of the product. The proper assessment of food quality depends on the correct functioning of the adopted sensors. Unfortunately, sensors may fail to provide correct measurements due to, for example, physical aging or environmental factors. To tolerate this problem, various approaches have been applied, often focusing on correcting the input data from the failed sensor. In this study, we adopt an alternative approach and propose machine learning-based failure tolerance that ignores failed sensors. To tolerate the failed sensor and to keep the overall prediction accuracy acceptable, a Single Plurality Voting System (SPVS) classification approach is used. Hereby, a single classifier is trained on each feature, and a composed classifier is built from the outcomes of these classifiers. To build our SPVS-based technique, K-Nearest Neighbor (kNN), Decision Tree, and Linear Discriminant Analysis (LDA) classifiers are applied as the base classifiers. Our proposed approach has a clear advantage over traditional machine learning models since it can tolerate sensor failure or other types of failures by ignoring them and thus enhances the assessment of food quality. To illustrate our approach, we use the case study of beef cut quality assessment. The experiments showed promising results for beef cut quality prediction in particular, and food quality assessment in general.
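    The idea of training one classifier per sensor and letting only the working sensors vote can be sketched as below. This is an assumption-laden illustration rather than the paper's implementation: a 1-nearest-neighbour rule stands in for the kNN, Decision Tree, and LDA base classifiers, a failed sensor is represented by `None`, and all function names and the toy readings are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, Optional, Sequence

Classifier = Callable[[float], Hashable]


def train_single_feature_1nn(values: Sequence[float], labels: Sequence[Hashable]) -> Classifier:
    """Return a 1-NN classifier that only ever sees one sensor's readings."""
    pairs = list(zip(values, labels))

    def predict(x: float) -> Hashable:
        # Label of the training reading closest to x on this single feature.
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict


def train_per_sensor(X: Sequence[Sequence[float]], y: Sequence[Hashable]) -> List[Classifier]:
    """Train one single-feature classifier per sensor (column of X)."""
    return [train_single_feature_1nn([row[j] for row in X], y) for j in range(len(X[0]))]


def predict_plurality(classifiers: List[Classifier], sample: Sequence[Optional[float]]) -> Hashable:
    """Plurality vote over the sensors that delivered a reading (not None)."""
    votes = [clf(v) for clf, v in zip(classifiers, sample) if v is not None]
    if not votes:
        raise ValueError("all sensors failed; no vote possible")
    # On a tie, Counter.most_common keeps the label that was voted first.
    return Counter(votes).most_common(1)[0][0]


# Toy data: three hypothetical e-nose sensors, two quality classes.
X_train = [[1.0, 10.0, 0.1], [1.2, 11.0, 0.2], [3.0, 30.0, 0.9], [3.1, 29.0, 0.8]]
y_train = ["fresh", "fresh", "spoiled", "spoiled"]
sensors = train_per_sensor(X_train, y_train)
```

    With this layout, a call such as `predict_plurality(sensors, [1.1, None, 0.15])` simply skips the second sensor instead of trying to repair its reading, which is the failure-tolerance behaviour the abstract describes.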
  • Conference Object
    Citation - WoS: 7
    Phishing E-Mail Detection by Using Deep Learning Algorithms
    (Assoc Computing Machinery, 2018) Hassanpour, Reza; Dogdu, Erdogan; Choupani, Roya; Goker, Onur; Nazli, Nazli
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 2
    Clinical Decision Support Systems: From the Perspective of Small and Imbalanced Data Set
    (Ios Press, 2019) Akcapinar Sezer, Ebru; Sever, Hayri; Par, Oznur Esra
    Clinical decision support systems are data analysis software that supports health professionals' decision-making process in reaching their ultimate outcome, taking patient information into account. The need for decision support systems cannot be denied, because most activities in the field of health care involve decision making. Decision support systems used for diagnosis are designed per disease due to the complexity of diseases, symptoms, and disease-symptom relationships. In the design and implementation of clinical decision support systems, mathematical modeling, pattern recognition, statistical analysis of large databases, and data mining techniques such as classification are widely used. Classification is difficult when the data set is small and/or imbalanced, and this problem directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining because classification algorithms are developed on the assumption that data sets are balanced and large enough. Most algorithms ignore or misclassify examples of the minority class and focus on the majority class. Most health data are small and imbalanced by nature. Learning from imbalanced and small data sets is an important and unsettled problem. Within the scope of the study, the publicly accessible hepatitis data set was oversampled by distance-based data generation methods. The oversampled data sets were classified using four different machine learning algorithms. Considering the classification scores of these four algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes, and Decision Tree), an optimal synthetic data generation rate is recommended.
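    The abstract does not name the specific distance-based generation methods, so the sketch below assumes a SMOTE-style interpolation as one representative example: each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function name `oversample_minority`, its parameters, and the toy points are hypothetical.

```python
import random
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def oversample_minority(minority: Sequence[Point], n_new: int, k: int = 2, seed: int = 0) -> List[Point]:
    """Generate n_new synthetic minority points by nearest-neighbour interpolation.

    Assumed SMOTE-like scheme, not necessarily the method used in the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest other minority samples by Euclidean distance.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class with two features.
minority_points = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = oversample_minority(minority_points, n_new=6)
```

    The "generation rate" discussed in the abstract corresponds here to how large `n_new` is relative to the original minority class; comparing classifier scores across several rates is how an optimal value could be chosen.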
  • Conference Object
    Citation - WoS: 140
    Citation - Scopus: 214
    Intrusion Detection Using Big Data and Deep Learning Techniques
    (Assoc Computing Machinery, 2019) Dogdu, Erdogan; Faker, Osama
    In this paper, Big Data and Deep Learning techniques are integrated to improve the performance of intrusion detection systems. Three classifiers are used to classify network traffic datasets: a Deep Feed-Forward Neural Network (DNN) and two ensemble techniques, Random Forest and Gradient Boosting Tree (GBT). To select the most relevant attributes from the datasets, we use a homogeneity metric to evaluate features. Two recently published datasets, UNSW NB15 and CICIDS2017, are used to evaluate the proposed method. 5-fold cross-validation is used in this work to evaluate the machine learning models. We implemented the method using the distributed computing environment Apache Spark, integrated with the Keras Deep Learning Library to implement the deep learning technique, while the ensemble techniques are implemented using the Apache Spark Machine Learning Library. The results show high accuracy with DNN on the UNSW NB15 dataset, at 99.16% for binary classification and 97.01% for multiclass classification. While the GBT classifier achieved the best accuracy for binary classification on the CICIDS2017 dataset at 99.99%, for multiclass classification DNN has the highest accuracy at 99.56%.
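    The abstract does not define the homogeneity metric it uses for feature evaluation. A common entropy-based definition, assumed here, is h = 1 - H(C|K) / H(C), where C are the class labels and K is a grouping of samples (for example, clusters or bins of one feature's values). The sketch below implements that definition with the standard library only; the function names and the way a feature is turned into groups are illustrative assumptions.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy H(C) of a label sequence (natural log)."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(classes: Sequence[Hashable], groups: Sequence[Hashable]) -> float:
    """H(C|K): remaining class uncertainty once the group is known."""
    n = len(classes)
    joint = Counter(zip(classes, groups))
    group_sizes = Counter(groups)
    return -sum((c / n) * log(c / group_sizes[g]) for (_, g), c in joint.items())


def homogeneity(classes: Sequence[Hashable], groups: Sequence[Hashable]) -> float:
    """1.0 when every group contains a single class, 0.0 when groups tell nothing."""
    h_c = entropy(classes)
    if h_c == 0:
        return 1.0  # only one class present: trivially homogeneous
    return 1.0 - conditional_entropy(classes, groups) / h_c
```

    Under this assumption, a feature whose value groups separate attack traffic from normal traffic well scores close to 1 and would be kept, while a feature scoring near 0 would be a candidate for removal.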