WoS-Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12416/8653

Search Results

Now showing 1 - 10 of 13
  • Conference Object
    SpEnD Portal: Linked Data Discovery Using SPARQL Endpoints
    (Ieee, 2017) Yumusak, Semih; Aras, Riza Emre; Uysal, Elif; Dogdu, Erdogan; Kodaz, Halife; Oztoprak, Kasim
    We present the SpEnD project, a complete SPARQL endpoint discovery and analysis portal. In a previous study, the SPARQL endpoint discovery and analysis steps of the SpEnD system were explained in detail. In the SpEnD portal, SPARQL endpoints are extracted from the web using web crawling techniques, then monitored and analyzed by systematically querying the live endpoints. After many sustainability improvements in the SpEnD project, the SpEnD system is now online as a portal. The SpEnD portal currently serves 1487 SPARQL endpoints, of which 911 are found only by SpEnD when compared with the other existing SPARQL endpoint repositories. In this portal, the analytic results and the content information are shared for every SPARQL endpoint. The endpoints stored in the repository are monitored and updated continuously.
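    The live-querying step described above can be illustrated with a minimal sketch. The abstract does not say which queries SpEnD sends, so the probe query, the function name `build_probe_url`, and the endpoint address below are assumptions made for illustration. The sketch only builds the GET request URL, passing the query in the `query` parameter as the SPARQL 1.1 Protocol allows, and sends nothing over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical availability probe: ask whether the endpoint holds any triple.
PROBE_QUERY = "ASK { ?s ?p ?o }"


def build_probe_url(endpoint_url: str, query: str = PROBE_QUERY) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    # Append with "&" if the endpoint URL already has a query string.
    separator = "&" if urlsplit(endpoint_url).query else "?"
    return endpoint_url + separator + urlencode({"query": query})


# Placeholder endpoint address, not a real SpEnD entry.
url = build_probe_url("http://example.org/sparql")
print(url)
```

    A monitor could issue such a request on a schedule and record whether the endpoint answers; that scheduling logic is left out here.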
  • Conference Object
    Citation - WoS: 10
    Citation - Scopus: 21
    Multi-Label Classification of Text Documents Using Deep Learning
    (Ieee, 2020) Mohammed, Hamza Haruna; Dogdu, Erdogan; Gorur, Abdul Kadir; Choupani, Roya
    Studies in natural language processing and its applications continue to grow. Machine learning is predominantly data-driven, in the sense that generic model-building methods are used and then tailored to specific application domains. This has proven to be a very effective approach to modeling the complicated data dependencies frequently encountered in practice, making very few assumptions and letting the data speak for itself. Examples of these applications can be found in chemical process engineering, climate science, healthcare, and linguistic processing systems for natural languages, to name a few. Text classification is an important machine learning task used in many digital applications today, such as document filtering, search engines, and document management systems. Text classification is the process of categorizing text documents into a given set of labels; multi-label text classification is the task of assigning one or more labels to each document simultaneously. Over the years, many methods for classifying text documents have been proposed, including the popular bag of words (BoW) method, support vector machines (SVM), tree induction, and label-vector embedding, to mention a few. Lately, deep learning-based approaches have been getting more attention, especially in the extreme multi-label text classification case. Deep learning has proven to be one of the major solutions for many machine learning applications, especially those involving high-dimensional and unstructured data. However, in many applications it is of paramount importance to be able to reason accurately about the uncertainties associated with the models' predictions.
    In this paper, we explore and compare recent deep learning-based methods for multi-label text classification. We investigate two scenarios: first, a multi-label classification model with an ordinary embedding layer, and second, the same models with GloVe, word2vec, and FastText as pre-trained embeddings. We evaluate the performance of these neural network models in terms of multi-label evaluation metrics for the two approaches, and compare the results with previous studies.
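    The abstract does not name the multi-label evaluation metrics it uses. As an illustration only, the sketch below computes two commonly used ones, Hamming loss and micro-averaged F1, on binary label matrices. The function names and the toy data are assumptions, not taken from the paper.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: two documents, three labels each (1 = label assigned).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(hamming_loss(y_true, y_pred), micro_f1(y_true, y_pred))
```

    In the toy data, two of six label slots are wrong (one missed label, one spurious label), so the Hamming loss is 2/6 and the micro-averaged F1 is 4/6.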
  • Conference Object
    Citation - WoS: 8
    Citation - Scopus: 11
    Sentiment Analysis for the Social Media: a Case Study for Turkish General Elections
    (Assoc Computing Machinery, 2017) Yumusak, Semih; Oztoprak, Kasim; Dogdu, Erdogan; Uysal, Elif
    The ideas expressed in social media do not always comply with natural language rules, and mood and emotion are mostly signaled by emoticons and emotion-specific keywords. There are language-independent emotion keywords (e.g. love, hate, good, bad), and every language also has its own emotion-specific keywords. These keywords can be used for polarity analysis of a particular sentence. In this study, we first created a Turkish dictionary containing emotion-specific keywords. Then, we used this dictionary to detect the polarity of tweets that were collected by querying political keywords right before the Turkish general election in 2015. The tweets were collected based on their relatedness to three main categories: political leaders, ideologies, and political parties. The polarity of these tweets is analyzed in comparison with the election results.
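    Dictionary-based polarity detection of the kind described above can be sketched as follows. The lexicon is a hypothetical four-word stand-in built from the example keywords in the abstract, not the authors' Turkish dictionary, and the simple sum-of-scores rule is an assumption.

```python
# Hypothetical miniature lexicon: +1 for positive, -1 for negative keywords.
LEXICON = {"love": 1, "good": 1, "hate": -1, "bad": -1}


def polarity(text, lexicon=LEXICON):
    """Label a text by summing the scores of the lexicon words it contains."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("I love this good plan!"))
print(polarity("Bad idea, hate it."))
```

    A realistic system would also need emoticon handling and language-specific tokenization, which this sketch omits.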
  • Conference Object
    Citation - WoS: 1
    Topic Distribution Constant Diameter Overlay Design Algorithm (td-Cd
    (Ieee, 2017) Oztoprak, Kasim; Dogdu, Erdogan; Layazali, Sina
    Publish/subscribe communication systems, where nodes subscribe to many different topics of interest, are becoming increasingly common in application domains such as social networks and the Internet of Things. Designing overlay networks that connect the nodes subscribed to each distinct topic is hence a fundamental problem in these systems. For scalability and efficiency, it is important to keep the maximum node degree of the overlay in the publish/subscribe system low. Ideally, one would like not only to keep the maximum node degree of the overlay low, but also to ensure that the network has a low diameter. We address this problem by presenting the Topic Distribution Constant Diameter Overlay Design Algorithm (TD-CD-ODA), which achieves a minimal maximum node degree in a low-diameter setting. We have shown experimentally that the algorithm performs well on both targets in comparison with other overlay design algorithms.
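    The two quantities the abstract trades off, maximum node degree and diameter, can be measured on any overlay with the sketch below. It is not TD-CD-ODA itself; the adjacency-dictionary representation and the two toy overlays are assumptions for illustration, and the graph is assumed to be connected and undirected.

```python
from collections import deque


def max_degree(adj):
    """Largest number of neighbors held by any node."""
    return max(len(neighbors) for neighbors in adj.values())


def diameter(adj):
    """Longest shortest-path length, found by a BFS from every node."""

    def eccentricity(source):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())

    return max(eccentricity(node) for node in adj)


# Two toy overlays on four nodes: a star and a path.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(max_degree(star), diameter(star))
print(max_degree(path), diameter(path))
```

    The star has low diameter but concentrates degree on one node, while the path keeps degree low at the cost of a larger diameter, which is the tension an overlay design algorithm has to balance.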
  • Conference Object
    Citation - WoS: 40
    Citation - Scopus: 77
    Malware Classification Using Deep Learning Methods
    (Assoc Computing Machinery, 2018) Dogdu, Erdogan; Cakir, Bugra
    Malware, short for malicious software, is growing continuously in number and sophistication as our digital world continues to grow. It is a very serious problem, and many efforts are devoted to malware detection in today's cybersecurity world. Many machine learning algorithms have been used for the automatic detection of malware in recent years. Most recently, deep learning has been used with better performance. Deep learning models have been shown to work much better in the analysis of long sequences of system calls. In this paper, a shallow deep learning-based feature extraction method (word2vec) is used to represent any given malware based on its opcodes. The Gradient Boosting algorithm is used for the classification task. Then, k-fold cross-validation is used to validate the model performance without sacrificing a validation split. Evaluation results show up to 96% accuracy with limited sample data.
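    The k-fold cross-validation step mentioned above can be sketched without any library. The function name `k_fold_indices`, the lack of shuffling, and the sample counts are assumptions for illustration; the abstract does not give the value of k or how folds were drawn.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so every sample is used once.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

    Each sample serves as test data exactly once, so the whole data set contributes to both training and evaluation, which matters when sample data is limited.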
  • Conference Object
    Citation - WoS: 7
    Phishing E-Mail Detection by Using Deep Learning Algorithms
    (Assoc Computing Machinery, 2018) Hassanpour, Reza; Dogdu, Erdogan; Choupani, Roya; Goker, Onur; Nazli, Nazli
  • Article
    Citation - WoS: 292
    Citation - Scopus: 374
    Context-Aware Computing, Learning, and Big Data in Internet of Things: a Survey
    (Ieee-inst Electrical Electronics Engineers inc, 2018) Dogdu, Erdogan; Ozbayoglu, Ahmet Murat; Sezer, Omer Berat
    Internet of Things (IoT) has been growing rapidly due to recent advancements in communications and sensor technologies. Meanwhile, with this revolutionary transformation, researchers, implementers, deployers, and users are faced with many challenges. IoT is a complicated, crowded, and complex field; there are various types of devices, protocols, communication channels, architectures, middleware, and more. Standardization efforts are plenty, and this chaos will continue for quite some time. What is clear, on the other hand, is that IoT deployments are increasing with accelerating speed, and this trend will not stop in the near future. As the field grows in numbers and heterogeneity, "intelligence" becomes a focal point in IoT. Since data now becomes "big data," understanding, learning, and reasoning with big data is paramount for the future success of IoT. One of the major problems in the path to intelligent IoT is understanding "context," or making sense of the environment, situation, or status using data from sensors, and then acting accordingly in autonomous ways. This is called "context-aware computing," and it now requires both sensing and, increasingly, learning, as IoT systems get more data and better learning from this big data. In this survey, we review the field, first, from a historical perspective, covering ubiquitous and pervasive computing, ambient intelligence, and wireless sensor networks, and then, move to context-aware computing studies. Finally, we review learning and big data studies related to IoT. We also identify the open issues and provide an insight for future study areas for IoT researchers.
  • Conference Object
    Perceptions, Expectations and Implementations of Big Data in Public Sector
    (Ieee, 2018) Ozbayoglu, Murat; Yazici, Ali; Karakaya, Ziya; Dogdu, Erdogan
    Big Data is one of the most commonly encountered buzzwords among IT professionals nowadays. Technological advancements in data acquisition, storage, telecommunications, embedded systems and sensor technologies have resulted in huge inflows of streaming data coming from a variety of sources, ranging from financial streaming data to social media tweets, and from wearable health gadgets to drone flight logs. The processing and analysis of such data is a difficult task, but as pointed out by many IT experts, it is crucial to have a Big Data implementation plan under today's challenging industry standards. In this study, we performed a survey among IT professionals working in the public sector and tried to address some of their implementation issues, their perception of Big Data today, and their expectations about how the industry will evolve. The results indicate that most public sector professionals are aware of the current Big Data requirements, embrace the Big Data challenge, and are optimistic about the future.
  • Conference Object
    Citation - WoS: 4
    Citation - Scopus: 12
    Mis-Iot: Modular Intelligent Server Based Internet of Things Framework With Big Data and Machine Learning
    (Ieee, 2018) Sezer, Omer Berat; Ozbayoglu, Murat; Dogdu, Erdogan; Onal, Aras Can; Berat Sezer, Omer
    The Internet of Things world is getting bigger every day with new developments on all fronts. The new IoT world requires better handling of big data and better usage, with more intelligence integrated in all phases. Here we present the MIS-IoT (Modular Intelligent Server Based Internet of Things Framework with Big Data and Machine Learning) framework, which is "modular" and therefore open to new extensions, "intelligent" by providing machine learning and deep learning methods on "big data" coming from IoT objects, and "server-based" in a service-oriented way by offering services via standard Web protocols. We present an overview of the design and implementation details of MIS-IoT along with a case study evaluation of the system, showing the intelligence capabilities in anomaly detection over real-time weather data.
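    The abstract does not state which anomaly detection method MIS-IoT applies to the weather data. Purely as an illustration of the idea, the sketch below flags readings whose z-score exceeds a threshold; the z-score rule, the threshold of 2.0, the function name, and the temperature values are all assumptions.

```python
from statistics import mean, pstdev


def zscore_anomalies(readings, threshold=2.0):
    """Return indices of readings more than `threshold` std devs from the mean."""
    mu = mean(readings)
    sigma = pstdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [i for i, x in enumerate(readings) if abs(x - mu) / sigma > threshold]


# Made-up temperature readings with one obvious spike.
temperatures = [20.1, 19.8, 20.3, 20.0, 35.0, 19.9]
print(zscore_anomalies(temperatures))
```

    A streaming deployment would typically compute the statistics over a sliding window rather than the full series.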
  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 4
    Improvement of General Inquirer Features With Quantity Analysis
    (Ieee, 2018) Karadeniz, Talha; Dogdu, Erdogan
    General Inquirer is a word-affect association vocabulary with 11896 entries. Its categories range from rectitude to expressiveness. Despite the extensive content, a mapping from "To be or not to be." to "How much?" can be beneficial for word representation. In this work, we apply a window-based analysis method to obtain real-valued General Inquirer attributes. The Sentence Completion task is chosen to measure the effectiveness of the operation. After a whitening post-process, the total cosine similarity convention is followed to concentrate on embedding improvement. Results indicate that our quantity-focused variant is considerable.
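    One plausible reading of the "total cosine similarity" convention for sentence completion is to pick the candidate word whose vector has the highest summed cosine similarity with the context word vectors. The sketch below implements that reading; the interpretation, the function names, and the two-dimensional toy vectors are assumptions, and the whitening step is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_candidate(context_vecs, candidates):
    """Return the candidate word with the highest total similarity to the context."""
    return max(
        candidates,
        key=lambda word: sum(cosine(candidates[word], c) for c in context_vecs),
    )


# Toy vectors: the context points along the first axis.
context = [[1.0, 0.0], [0.9, 0.1]]
candidates = {"x": [1.0, 0.1], "y": [0.0, 1.0]}
print(best_candidate(context, candidates))
```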